Optimizing smp_affinity for interface IRQs under maximum load

I'm looking to tune the interface IRQs by assigning them to specific CPUs on my 4-core x86_64 machine.

What's the best way of doing this? I have a 400 Mbps connection on eth0, shaped by SQM using layer_cake, and a second connection of about 55 Mbps on eth1, also shaped by cake.

My LAN is unshaped.

The aim is to reduce CPU load and keep latency low when the link is fully utilized, which happens mainly on download because the connection is heavily asymmetric (400 Mbps download, 38 Mbps upload).

So what is the optimal strategy?

  • Try to balance the IRQs by assigning them across cpu1, cpu2 and cpu3 (not cpu0) so that on a maxed-out download each CPU gets more or less the same share of interrupts, without regard to whether I'm mixing rx and tx interrupts (or interrupts from different interfaces) on a given CPU?

  • Or assign all tx IRQs on the WAN interface to one CPU, all rx IRQs on the WAN interface to another CPU, and both tx and rx on the LAN interface to the third CPU?

  • Or some other way that I've not listed?

Each Ethernet interface has 4 tx and 4 rx queues, each serviced by its own interrupt.
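For anyone wanting to try either option, here is a rough Python sketch of how the affinity masks could be worked out. It is only an illustration under some assumptions: the queue interrupts are assumed to show up in /proc/interrupts with names like eth0-rx-0 and eth0-tx-0 (the naming depends on the driver, and drivers with combined TxRx queues would need different matching), and the function names (cpu_mask, find_irqs, round_robin_plan, by_direction_plan, apply_plan) are made up for this example. The value written to /proc/irq/<n>/smp_affinity is a hex bitmask with one bit per CPU, so cpu1 is 2, cpu2 is 4 and cpu3 is 8.

```python
def cpu_mask(cpus):
    """Return the hex bitmask string smp_affinity expects for a list of CPU indices."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def find_irqs(interrupts_text, iface):
    """Map queue name -> IRQ number for one interface.

    Assumes the last column of /proc/interrupts looks like 'eth0-rx-0';
    this naming is driver-dependent.
    """
    irqs = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        irq, name = fields[0][:-1], fields[-1]
        if irq.isdigit() and name.startswith(iface + "-"):
            irqs[name] = int(irq)
    return irqs


def round_robin_plan(irqs, cpus):
    """First option: spread IRQs evenly over the CPUs, ignoring rx/tx."""
    return {irq: cpu_mask([cpus[i % len(cpus)]])
            for i, irq in enumerate(sorted(irqs))}


def by_direction_plan(queue_irqs, rx_cpu, tx_cpu):
    """Second option: all rx queues on one CPU, all tx queues on another."""
    plan = {}
    for name, irq in queue_irqs.items():
        cpu = rx_cpu if "-rx-" in name.lower() else tx_cpu
        plan[irq] = cpu_mask([cpu])
    return plan


def apply_plan(plan):
    """Write each mask to /proc/irq/<n>/smp_affinity (needs root)."""
    for irq, mask in plan.items():
        with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
            f.write(mask)
```

Run as root, something like apply_plan(round_robin_plan(find_irqs(open("/proc/interrupts").read(), "eth0").values(), [1, 2, 3])) would spread the eth0 queue interrupts over cpu1 to cpu3. If irqbalance is running it may rewrite these masks, so manual pinning and irqbalance probably should not be mixed on the same IRQs.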

Did you ever find a resolution to this?

Hmm, wouldn't the irqbalance package do the first option automatically?

But if I remember correctly, SQM on a single connection will always run on one core, because of the latency penalty that appears once data has to be shared between cores. If you're being bottlenecked by single-threaded performance, I wonder whether running two separate instances on one connection (one for download, one for upload) would work better.
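One way to check how the interrupts are actually distributed, with or without irqbalance, is to compare two snapshots of /proc/interrupts taken a few seconds apart during a saturating download. The Python sketch below is only a rough illustration: the helper names (per_cpu_counts, share_between) are invented for this example, and it assumes the interface's interrupt lines end in a name starting with the interface name, which depends on the driver.

```python
def per_cpu_counts(interrupts_text, iface):
    """Sum the per-CPU interrupt counts of all lines belonging to iface."""
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * ncpus
    for line in lines[1:]:
        fields = line.split()
        # Need the IRQ label, one count per CPU, and at least a name column.
        if len(fields) < ncpus + 2 or not fields[-1].startswith(iface):
            continue
        for cpu in range(ncpus):
            totals[cpu] += int(fields[1 + cpu])
    return totals


def share_between(before, after, iface):
    """Fraction of iface interrupts each CPU handled between two snapshots."""
    a = per_cpu_counts(before, iface)
    b = per_cpu_counts(after, iface)
    delta = [y - x for x, y in zip(a, b)]
    total = sum(delta)
    return [d / total if total else 0.0 for d in delta]
```

Reading /proc/interrupts once, waiting a few seconds under load, reading it again and passing both strings to share_between(before, after, "eth0") gives a list with one fraction per CPU. A heavily lopsided result would suggest the interrupts are not being balanced; it does not by itself show where the shaper is running.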