Xrx200 IRQ balancing between VPEs

I installed the patches and my HomeHub 5A started behaving abnormally. USB started malfunctioning: it would neither mount nor read any data. WiFi would not work if I changed the smp_affinity to 2; values of 1 or 3 were fine. I did not make any change to the USB IRQ, though. So I had to revert to stock v19.07.2.

I hope I will get it into the mainline kernel :-).

There is a reg patch for AR9 in your link :-/. Are you using the 5.4 kernel? In older kernels the DTS structure differed between the vanilla kernel and the OpenWrt kernel. IIRC I was told there were some leftovers which will be deleted, but I don't own these devices, so I relied on hints from the kernel mailing list.

Do you have logs (dmesg, /proc/interrupts during traffic)? It almost looks as if the second VPE doesn't get interrupts at all (but it should), and as if 0901-add-icu-smp-support.patch is not applied.

No, I don't have the logs because I reverted to the unpatched kernel and everything went back to normal. Right now I am using the v5.4 kernel patches and they work fine. I am also getting up to 9 MB/s on 2.4 GHz and 23 MB/s on 5 GHz with all your patches applied.

2.4 GHz Test with Iperf3 on HomeHub 5A
root@AhmarRouter:~# iperf3 -c 192.168.1.238 -b 0 -t 60
Connecting to host 192.168.1.238, port 5201
[  5] local 192.168.1.1 port 54272 connected to 192.168.1.238 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  5.21 MBytes  43.7 Mbits/sec    0    245 KBytes
[  5]   1.00-2.00   sec  8.95 MBytes  75.0 Mbits/sec    0    519 KBytes
[  5]   2.00-3.00   sec  8.33 MBytes  69.8 Mbits/sec    0    608 KBytes
[  5]   3.00-4.00   sec  4.97 MBytes  41.7 Mbits/sec    0    645 KBytes
[  5]   4.00-5.00   sec  10.3 MBytes  86.3 Mbits/sec    0    677 KBytes
[  5]   5.00-6.00   sec  11.7 MBytes  98.5 Mbits/sec    0    677 KBytes
[  5]   6.00-7.00   sec  7.02 MBytes  58.9 Mbits/sec    0    713 KBytes
[  5]   7.00-8.00   sec  9.20 MBytes  77.1 Mbits/sec    0    748 KBytes
[  5]   8.00-9.00   sec  8.08 MBytes  67.8 Mbits/sec    0    748 KBytes
[  5]   9.00-10.00  sec  8.70 MBytes  72.8 Mbits/sec    0    748 KBytes
[  5]  10.00-11.00  sec  7.89 MBytes  66.1 Mbits/sec    0    792 KBytes
[  5]  11.00-12.00  sec  11.0 MBytes  92.6 Mbits/sec    0    792 KBytes
[  5]  12.00-13.00  sec  7.33 MBytes  61.5 Mbits/sec    0    792 KBytes
[  5]  13.00-14.00  sec  10.1 MBytes  84.2 Mbits/sec    0   1.16 MBytes
[  5]  14.00-15.00  sec  7.71 MBytes  64.8 Mbits/sec    0   1.16 MBytes
[  5]  15.00-16.00  sec  5.28 MBytes  44.3 Mbits/sec    0   1.16 MBytes
[  5]  16.00-17.00  sec  9.94 MBytes  83.4 Mbits/sec    0   1.16 MBytes
[  5]  17.00-18.00  sec  9.76 MBytes  81.8 Mbits/sec    0   1.16 MBytes
[  5]  18.00-19.00  sec  6.77 MBytes  56.7 Mbits/sec    0   1.16 MBytes
[  5]  19.00-20.00  sec  9.20 MBytes  77.4 Mbits/sec    0   1.16 MBytes
[  5]  20.00-21.00  sec  8.08 MBytes  67.7 Mbits/sec    0   1.16 MBytes
[  5]  21.00-22.01  sec  8.45 MBytes  70.6 Mbits/sec    0   1.16 MBytes
[  5]  22.01-23.00  sec  6.90 MBytes  58.2 Mbits/sec    0   1.16 MBytes
[  5]  23.00-24.00  sec  9.13 MBytes  76.6 Mbits/sec    0   1.16 MBytes
[  5]  24.00-25.00  sec  9.45 MBytes  79.2 Mbits/sec    0   1.16 MBytes
[  5]  25.00-26.00  sec  10.1 MBytes  85.0 Mbits/sec    0   1.16 MBytes
[  5]  26.00-27.00  sec  9.51 MBytes  79.8 Mbits/sec    0   1.16 MBytes
[  5]  27.00-28.01  sec  12.9 MBytes   108 Mbits/sec    0   1.16 MBytes
[  5]  28.01-29.00  sec  11.6 MBytes  97.6 Mbits/sec    0   1.16 MBytes
[  5]  29.00-30.01  sec  9.69 MBytes  80.5 Mbits/sec    0   1.16 MBytes
[  5]  30.01-31.00  sec  9.13 MBytes  77.4 Mbits/sec    0   1.16 MBytes
[  5]  31.00-32.00  sec  9.20 MBytes  77.1 Mbits/sec    0   1.16 MBytes
[  5]  32.00-33.00  sec  9.07 MBytes  76.1 Mbits/sec    0   1.16 MBytes
[  5]  33.00-34.01  sec  9.57 MBytes  79.6 Mbits/sec    0   1.16 MBytes
[  5]  34.01-35.00  sec  10.6 MBytes  89.5 Mbits/sec    0   1.16 MBytes
[  5]  35.00-36.00  sec  11.8 MBytes  99.0 Mbits/sec    0   1.16 MBytes
[  5]  36.00-37.00  sec  11.3 MBytes  94.9 Mbits/sec    0   1.16 MBytes
[  5]  37.00-38.00  sec  9.82 MBytes  82.4 Mbits/sec    0   1.16 MBytes
[  5]  38.00-39.00  sec  9.63 MBytes  80.8 Mbits/sec    0   1.16 MBytes
[  5]  39.00-40.01  sec  10.6 MBytes  88.4 Mbits/sec    0   1.16 MBytes
[  5]  40.01-41.00  sec  9.63 MBytes  81.4 Mbits/sec    0   1.16 MBytes
[  5]  41.00-42.01  sec  10.3 MBytes  85.5 Mbits/sec    0   1.16 MBytes
[  5]  42.01-43.00  sec  8.70 MBytes  73.4 Mbits/sec    0   1.16 MBytes
[  5]  43.00-44.00  sec  10.4 MBytes  87.1 Mbits/sec    0   1.16 MBytes
[  5]  44.00-45.00  sec  9.57 MBytes  80.3 Mbits/sec    0   1.16 MBytes
[  5]  45.00-46.00  sec  9.69 MBytes  81.3 Mbits/sec    0   1.16 MBytes
[  5]  46.00-47.00  sec  10.1 MBytes  84.4 Mbits/sec    0   1.16 MBytes
[  5]  47.00-48.00  sec  9.13 MBytes  76.6 Mbits/sec    0   1.16 MBytes
[  5]  48.00-49.00  sec  10.1 MBytes  85.0 Mbits/sec    0   1.16 MBytes
[  5]  49.00-50.00  sec  9.63 MBytes  80.8 Mbits/sec    0   1.16 MBytes
[  5]  50.00-51.00  sec  9.57 MBytes  80.2 Mbits/sec    0   1.16 MBytes
[  5]  51.00-52.00  sec  8.95 MBytes  75.1 Mbits/sec    0   1.16 MBytes
[  5]  52.00-53.00  sec  9.88 MBytes  82.8 Mbits/sec    0   1.16 MBytes
[  5]  53.00-54.00  sec  9.26 MBytes  77.7 Mbits/sec    0   1.16 MBytes
[  5]  54.00-55.00  sec  9.94 MBytes  83.4 Mbits/sec    0   1.16 MBytes
[  5]  55.00-56.00  sec  9.88 MBytes  82.6 Mbits/sec    0   1.16 MBytes
[  5]  56.00-57.00  sec  9.20 MBytes  77.4 Mbits/sec    0   1.16 MBytes
[  5]  57.00-58.00  sec  9.32 MBytes  78.2 Mbits/sec    0   1.16 MBytes
[  5]  58.00-59.00  sec  9.63 MBytes  80.8 Mbits/sec    0   1.16 MBytes
[  5]  59.00-60.00  sec  9.76 MBytes  81.8 Mbits/sec    0   1.16 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec   557 MBytes  77.8 Mbits/sec    0             sender
[  5]   0.00-60.00  sec   556 MBytes  77.7 Mbits/sec                  receiver

iperf Done.
root@AhmarRouter:~# iperf3 -c 192.168.1.238 -b 0 -t 60 -R
Connecting to host 192.168.1.238, port 5201
Reverse mode, remote host 192.168.1.238 is sending
[  5] local 192.168.1.1 port 54276 connected to 192.168.1.238 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  6.19 MBytes  51.6 Mbits/sec
[  5]   1.01-2.01   sec  6.78 MBytes  56.7 Mbits/sec
[  5]   2.01-3.01   sec  6.90 MBytes  57.8 Mbits/sec
[  5]   3.01-4.00   sec  6.66 MBytes  56.3 Mbits/sec
[  5]   4.00-5.02   sec  6.84 MBytes  56.6 Mbits/sec
[  5]   5.02-6.01   sec  6.76 MBytes  57.3 Mbits/sec
[  5]   6.01-7.00   sec  6.66 MBytes  56.0 Mbits/sec
[  5]   7.00-8.02   sec  6.76 MBytes  56.0 Mbits/sec
[  5]   8.02-9.00   sec  6.59 MBytes  56.0 Mbits/sec
[  5]   9.00-10.02  sec  7.01 MBytes  58.0 Mbits/sec
[  5]  10.02-11.01  sec  6.75 MBytes  56.9 Mbits/sec
[  5]  11.01-12.01  sec  6.74 MBytes  56.6 Mbits/sec
[  5]  12.01-13.02  sec  6.79 MBytes  56.6 Mbits/sec
[  5]  13.02-14.02  sec  6.75 MBytes  56.6 Mbits/sec
[  5]  14.02-15.01  sec  6.68 MBytes  56.2 Mbits/sec
[  5]  15.01-16.01  sec  6.70 MBytes  56.4 Mbits/sec
[  5]  16.01-17.00  sec  6.67 MBytes  56.3 Mbits/sec
[  5]  17.00-18.00  sec  6.64 MBytes  55.9 Mbits/sec
[  5]  18.00-19.01  sec  6.76 MBytes  56.1 Mbits/sec
[  5]  19.01-20.00  sec  6.67 MBytes  56.3 Mbits/sec
[  5]  20.00-21.00  sec  6.63 MBytes  55.8 Mbits/sec
[  5]  21.00-22.01  sec  6.65 MBytes  55.5 Mbits/sec
[  5]  22.01-23.01  sec  6.72 MBytes  56.0 Mbits/sec
[  5]  23.01-24.00  sec  6.68 MBytes  56.7 Mbits/sec
[  5]  24.00-25.00  sec  6.67 MBytes  56.0 Mbits/sec
[  5]  25.00-26.02  sec  6.93 MBytes  57.1 Mbits/sec
[  5]  26.02-27.01  sec  6.64 MBytes  56.1 Mbits/sec
[  5]  27.01-28.01  sec  7.21 MBytes  60.7 Mbits/sec
[  5]  28.01-29.01  sec  6.68 MBytes  55.8 Mbits/sec
[  5]  29.01-30.00  sec  6.60 MBytes  56.0 Mbits/sec
[  5]  30.00-31.01  sec  6.75 MBytes  56.3 Mbits/sec
[  5]  31.01-32.02  sec  6.74 MBytes  56.0 Mbits/sec
[  5]  32.02-33.00  sec  6.70 MBytes  56.9 Mbits/sec
[  5]  33.00-34.01  sec  6.84 MBytes  56.9 Mbits/sec
[  5]  34.01-35.01  sec  6.72 MBytes  56.7 Mbits/sec
[  5]  35.01-36.00  sec  6.70 MBytes  56.5 Mbits/sec
[  5]  36.00-37.02  sec  6.84 MBytes  56.5 Mbits/sec
[  5]  37.02-38.02  sec  6.72 MBytes  56.3 Mbits/sec
[  5]  38.02-39.00  sec  5.46 MBytes  46.4 Mbits/sec
[  5]  39.00-40.01  sec  2.98 MBytes  24.8 Mbits/sec
[  5]  40.01-41.00  sec  1.99 MBytes  16.9 Mbits/sec
[  5]  41.00-42.00  sec  2.30 MBytes  19.3 Mbits/sec
[  5]  42.00-43.00  sec  2.67 MBytes  22.4 Mbits/sec
[  5]  43.00-44.00  sec  2.98 MBytes  25.0 Mbits/sec
[  5]  44.00-45.01  sec  3.18 MBytes  26.3 Mbits/sec
[  5]  45.01-46.00  sec  4.79 MBytes  40.7 Mbits/sec
[  5]  46.00-47.01  sec  5.83 MBytes  48.6 Mbits/sec
[  5]  47.01-48.01  sec  5.22 MBytes  43.9 Mbits/sec
[  5]  48.01-49.01  sec  5.17 MBytes  43.2 Mbits/sec
[  5]  49.01-50.01  sec  5.26 MBytes  43.9 Mbits/sec
[  5]  50.01-51.00  sec  4.60 MBytes  39.2 Mbits/sec
[  5]  51.00-52.02  sec  4.94 MBytes  40.5 Mbits/sec
[  5]  52.02-53.05  sec  8.25 MBytes  67.6 Mbits/sec
[  5]  53.05-54.04  sec  4.83 MBytes  40.8 Mbits/sec
[  5]  54.04-55.01  sec  5.10 MBytes  44.3 Mbits/sec
[  5]  55.01-56.01  sec  4.79 MBytes  40.2 Mbits/sec
[  5]  56.01-57.01  sec  4.35 MBytes  36.1 Mbits/sec
[  5]  57.01-58.00  sec  4.78 MBytes  40.7 Mbits/sec
[  5]  58.00-59.00  sec  4.02 MBytes  33.7 Mbits/sec
[  5]  59.00-60.01  sec  5.11 MBytes  42.6 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.01  sec   356 MBytes  49.8 Mbits/sec    0             sender
[  5]   0.00-60.01  sec   354 MBytes  49.5 Mbits/sec                  receiver

iperf Done.

No, not on the Lantiq experimental 5.4 yet. I was just noting that the devicetrees were missing the second reg entries for icu1 on those devices compared to your original patches.

Has 0901-add-icu-smp-support.patch made it into kernel 5.4 yet? If not, does anyone have a patched version of this kernel for use on a BT HomeHub 5A running OpenWrt 21.02.0?

Hi,
I sent this patch upstream but it hasn't been accepted yet[1]. You can send a reply to the mailing list after you test it (e.g. add a Tested-by line)[2].

  1. https://patchwork.kernel.org/project/linux-mips/patch/20210606181525.761333-2-olek2@wp.pl/
  2. https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes

Thanks, I was looking for a pre-compiled kernel. Not sure my skills extend to building and patching my own.

Hello, I applied this patch (0901-add-icu-smp-support.patch, shared by pc2005 a few posts ago) on my BT HomeHub 5A, along with a few others (pull/3946, pull/4326, pull/4339 and pull/4353, just because I like to break stuff). How can I check whether this is really working?

cat /proc/interrupts gives:

           CPU0       CPU1
  7:      59273      53064      MIPS   7  timer
  8:       3233       2513      MIPS   0  IPI call
  9:       7742       9855      MIPS   1  IPI resched
 30:      85421          0       icu  30  ath9k
 63:      25188          0       icu  63  mei_cpe
 72:      14360          0       icu  72  xrx200_net_rx
 73:      25291          0       icu  73  xrx200_net_tx
 96:      80051          0       icu  96  atm_mailbox_isr
112:        185          0       icu 112  asc_tx
113:          0          0       icu 113  asc_rx
114:          0          0       icu 114  asc_err
126:          0          0       icu 126  gptu
127:          0          0       icu 127  gptu
128:          0          0       icu 128  gptu
129:          0          0       icu 129  gptu
130:          0          0       icu 130  gptu
131:          0          0       icu 131  gptu
144:         34          0       icu 144  ath10k_pci
161:          0          0       icu 161  ifx_pcie_rc0
ERR:          1

logread | grep err gives a few unrelated errors (but I get those even on a stock master build).

Unlike with x86/amd64, the interrupts are not automatically distributed, so you still need to set their CPU affinity. For example, to force all interrupts for ath9k wireless to CPU1, run:

echo 2 > /proc/irq/30/smp_affinity

Then check that its CPU1 column starts incrementing. The setting is lost on reboot, so you'll need to reapply it on every boot, for example by adding the following lines (which I use) to /etc/rc.local to move the ath9k wireless, ath10k wireless and network Rx (xrx200_net_rx) interrupts to CPU1:

echo 2 > /proc/irq/30/smp_affinity
echo 2 > /proc/irq/72/smp_affinity
echo 2 > /proc/irq/144/smp_affinity

Or, alternatively, use the irqbalance daemon.
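
As a variation on the rc.local lines above, the IRQ number can be looked up by name instead of being hard-coded. This is only a sketch: the helper names irq_for_name and pin_irq and the two override variables INTERRUPTS_FILE and IRQ_DIR are made up for illustration, and it assumes the name is the last column of /proc/interrupts, as in the listing earlier in the thread.

```shell
#!/bin/sh
# Hypothetical override variables so the helpers can be dry-run against
# copies; by default they point at the real procfs files.
: "${INTERRUPTS_FILE:=/proc/interrupts}"
: "${IRQ_DIR:=/proc/irq}"

# irq_for_name (hypothetical): print the number of the first IRQ whose
# last column in the interrupts table equals the name given as $1.
irq_for_name() {
    awk -v name="$1" '$NF == name { sub(":", "", $1); print $1; exit }' \
        "$INTERRUPTS_FILE"
}

# pin_irq (hypothetical): write affinity mask $2 for the IRQ named $1.
pin_irq() {
    irq=$(irq_for_name "$1")
    if [ -z "$irq" ]; then
        echo "no IRQ named $1 found" >&2
        return 1
    fi
    echo "$2" > "$IRQ_DIR/$irq/smp_affinity"
}

# In /etc/rc.local one could then call, for example:
#   pin_irq ath9k 2
#   pin_irq xrx200_net_rx 2
#   pin_irq ath10k_pci 2
```

Names that appear more than once (such as gptu in the listing) only match their first entry, so those would still need explicit IRQ numbers.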

Are those numbers a bitmask? That is, can I just run echo 3 > /proc/irq/.... and expect all IRQs to be balanced automatically between the two VPEs, or must I use just one VPE for each IRQ?

Yes, smp_affinity is a bitmask of the CPUs allowed to handle the IRQ (1 = CPU0, 2 = CPU1, 3 = either), but no, it will not be automatically balanced (hence using 2 to force it to the second CPU).
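
To make the bitmask concrete, here is a minimal sketch of how the value can be computed from a list of CPU numbers. The helper name cpu_mask is made up for illustration; it only does the arithmetic and does not touch /proc.

```shell
#!/bin/sh
# cpu_mask (hypothetical helper): print the hex smp_affinity mask for the
# CPU numbers given as arguments (CPU0 is bit 0, CPU1 is bit 1, ...).
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0      # prints 1: CPU0 only
cpu_mask 1      # prints 2: CPU1 only
cpu_mask 0 1    # prints 3: either CPU allowed
```

So echo "$(cpu_mask 1)" > /proc/irq/30/smp_affinity would be the same as the echo 2 used in this thread. A mask of 3 only permits both CPUs; as noted, it does not by itself spread the interrupts.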

I compiled the same image plus irqbalance, but it doesn't seem to be balancing anything.
I enabled it in /etc/config/irqbalance, but it looks like no balancing is taking place:

           CPU0       CPU1
  7:      58099      63385      MIPS   7  timer
  8:       3061       3498      MIPS   0  IPI call
  9:       7935      13953      MIPS   1  IPI resched
 30:      83348          0       icu  30  ath9k
 63:      20052          0       icu  63  mei_cpe
 72:      21565          0       icu  72  xrx200_net_rx
 73:      37525          0       icu  73  xrx200_net_tx
 96:     109119          0       icu  96  atm_mailbox_isr
112:        188          0       icu 112  asc_tx
113:          0          0       icu 113  asc_rx
114:          0          0       icu 114  asc_err
126:          0          0       icu 126  gptu
127:          0          0       icu 127  gptu
128:          0          0       icu 128  gptu
129:          0          0       icu 129  gptu
130:          0          0       icu 130  gptu
131:          0          0       icu 131  gptu
144:         28          0       icu 144  ath10k_pci
161:          0          0       icu 161  ifx_pcie_rc0
ERR:          1

Should I open a ticket, or is there some config I'm missing?

Running irqbalance --debug --oneshot shows:

This machine seems not NUMA capable.
Isolated CPUs: 00000000
Adaptive-ticks CPUs: 00000000
Banned CPUs: 00000000
Package 0:  numa_node -1 cpu mask is 00000003 (load 0)
        Cache domain 0:  numa_node is -1 cpu mask is 00000003  (load 0)
                CPU number 1  numa_node is -1 (load 0)
                CPU number 0  numa_node is -1 (load 0)
Adding IRQ 30 to database
Adding IRQ 144 to database
Adding IRQ 7 to database
Adding IRQ 8 to database
Adding IRQ 9 to database
Adding IRQ 63 to database
Adding IRQ 72 to database
Adding IRQ 73 to database
Adding IRQ 96 to database
Adding IRQ 112 to database
Adding IRQ 113 to database
Adding IRQ 114 to database
Adding IRQ 126 to database
Adding IRQ 127 to database
Adding IRQ 128 to database
Adding IRQ 129 to database
Adding IRQ 130 to database
Adding IRQ 131 to database
Adding IRQ 161 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff

Daemon couldn't be bound to the file-based socket.



-----------------------------------------------------------------------------
Package 0:  numa_node -1 cpu mask is 00000003 (load 0)
        Cache domain 0:  numa_node is -1 cpu mask is 00000003  (load 0)
                CPU number 1  numa_node is -1 (load 0)
                  Interrupt 144 node_num is -1 (ethernet/0:0)
                CPU number 0  numa_node is -1 (load 0)
                  Interrupt 30 node_num is -1 (ethernet/0:1622)
  Interrupt 96 node_num is -1 (other/0:50)
  Interrupt 73 node_num is -1 (other/0:20)
  Interrupt 72 node_num is -1 (other/0:19)
  Interrupt 63 node_num is -1 (other/0:456)
  Interrupt 9 node_num is -1 (other/0:94)
  Interrupt 8 node_num is -1 (other/0:11)
  Interrupt 7 node_num is -1 (other/0:2159)
  Interrupt 128 node_num is -1 (other/0:0)
  Interrupt 127 node_num is -1 (other/0:0)
  Interrupt 126 node_num is -1 (other/0:0)
  Interrupt 114 node_num is -1 (other/0:0)
  Interrupt 113 node_num is -1 (other/0:0)
  Interrupt 112 node_num is -1 (other/0:0)
  Interrupt 161 node_num is -1 (other/0:0)
  Interrupt 131 node_num is -1 (other/0:0)
  Interrupt 130 node_num is -1 (other/0:0)
  Interrupt 129 node_num is -1 (other/0:0)

I don't use irqbalance myself, so I can't help there. But did you actually test echoing 2 to an active IRQ's smp_affinity? If that doesn't work, then it's likely your patch failed to apply.
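
One way to run that test is to sample the counters for an IRQ twice and compare how much each CPU column grew. The sketch below assumes the two-column (CPU0/CPU1) layout of /proc/interrupts shown in this thread; the helper names irq_counts, irq_delta and watch_irq are made up for illustration.

```shell
#!/bin/sh
# irq_counts (hypothetical): print "CPU0-count CPU1-count" for IRQ number
# $1, read from /proc/interrupts or from the file given as $2.
irq_counts() {
    awk -v irq="$1:" '$1 == irq { print $2, $3 }' "${2:-/proc/interrupts}"
}

# irq_delta (hypothetical): print the growth of each CPU column between
# two samples; "before" is in $1 $2 and "after" is in $3 $4.
irq_delta() {
    echo "CPU0 +$(( $3 - $1 )) CPU1 +$(( $4 - $2 ))"
}

# watch_irq (hypothetical): sample IRQ $1 twice, $2 seconds apart
# (default 5), optionally reading from file $3, and report the growth.
watch_irq() {
    before=$(irq_counts "$1" "$3")
    if [ -z "$before" ]; then
        echo "IRQ $1 not found" >&2
        return 1
    fi
    sleep "${2:-5}"
    after=$(irq_counts "$1" "$3")
    # Unquoted on purpose: each sample splits into two arguments.
    irq_delta $before $after
}
```

For example, after echo 2 > /proc/irq/30/smp_affinity and with traffic flowing over the ath9k radio, watch_irq 30 should show growth only in the CPU1 column if the patch is in effect.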

That works. I tried irqbalance because I assumed a daemon could make a better adjustment than I could.

Just wondering whether the patches mentioned above were ever included in the master branch. @pc2005, would you be able to confirm?

Hi,

Can you tell me how (and where) to add this patch myself, or where to find sufficient documentation so I can try it?
The last time I patched the source and compiled a kernel myself was about 15 years ago, and back then I had the complete kernel source locally on my HDD.
But with this cross-compiling... I'm a bit lost.

Thanks in advance.

@elder_tinkerer This patch is already in the main OpenWrt branch. Tomorrow's snapshot should contain this fix.

@olek210
Thank you, I'll try it.
I (finally) managed to patch the kernel myself (phew), and it seems to work.
Now I can set TX and RX to different VPEs.
But the VDSL download speed is still much lower than with 21.02.3 (11.460 kb/s on 21.02.3 vs. 6970 kb/s now).
Maybe you've got some brilliant ideas on how to get back to the higher speed...
I also forgot to put LuCI on the package list, so I had to do everything via the CLI. Fortunately that's not a big problem...

Other users report a drop in the speed to 60 Mbps.

I'm interested in whether you see any difference with software flow offloading enabled.

@elder_tinkerer

Are you able to generate a graph using flamegraph and perf record?

Then we can see what consumes more CPU power.
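
A rough sketch of that workflow, under several assumptions: the perf package is installed on the router, Brendan Gregg's FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) are checked out on a PC, and the function names, file paths, the FLAMEGRAPH_DIR variable, the 99 Hz sampling rate and the 30 second duration are all placeholders. Whether call graphs unwind usefully on this MIPS target has not been verified here.

```shell
#!/bin/sh
# record_profile (hypothetical): on the router, sample all CPUs with call
# graphs while the speed test runs. $1 = output file, $2 = seconds.
record_profile() {
    perf record -F 99 -a -g -o "${1:-/tmp/perf.data}" -- sleep "${2:-30}"
}

# dump_profile (hypothetical): still on the router, turn the recording
# into text so it can be copied to a PC. $1 = perf data, $2 = text output.
dump_profile() {
    perf script -i "${1:-/tmp/perf.data}" > "${2:-/tmp/perf.script}"
}

# render_flamegraph (hypothetical): on the PC, fold the stacks and draw
# the SVG. $1 = text from dump_profile, $2 = SVG file to write.
# FLAMEGRAPH_DIR is an assumed location of the FlameGraph checkout.
render_flamegraph() {
    "${FLAMEGRAPH_DIR:-./FlameGraph}/stackcollapse-perf.pl" < "$1" \
        | "${FLAMEGRAPH_DIR:-./FlameGraph}/flamegraph.pl" > "$2"
}
```

Typical use would be record_profile and dump_profile on the router during a download test, copying /tmp/perf.script to the PC, and then render_flamegraph perf.script vdsl.svg there.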