No, not on the lantiq experimental 5.4 as yet. I was just noting that the device trees were missing the second reg entries for icu1 on those devices, compared to your original patches.
Has 0901-add-icu-smp-support.patch made it into kernel 5.4 yet? If not, does anyone have a patched version of this kernel for use on a BT Homehub 5a running OpenWrt 21.02.0?
Hi,
I sent this patch upstream but it hasn't been accepted yet[1]. You can send a reply to the mailing list after you test it (e.g. add a Tested-by line)[2].
Thanks, I was looking for a pre-compiled kernel. Not sure my skills extend to building and patching my own.
Hello, I applied this patch (0901-add-icu-smp-support.patch, shared by pc2005 a few posts ago) on my BT HomeHub 5A, along with a few others (pull/3946, pull/4326, pull/4339 and pull/4353, just because I like to break stuff). How can I check whether this is really working?
cat /proc/interrupts
gives:
           CPU0       CPU1
  7:      59273      53064  MIPS 7 timer
  8:       3233       2513  MIPS 0 IPI call
  9:       7742       9855  MIPS 1 IPI resched
 30:      85421          0  icu 30 ath9k
 63:      25188          0  icu 63 mei_cpe
 72:      14360          0  icu 72 xrx200_net_rx
 73:      25291          0  icu 73 xrx200_net_tx
 96:      80051          0  icu 96 atm_mailbox_isr
112:        185          0  icu 112 asc_tx
113:          0          0  icu 113 asc_rx
114:          0          0  icu 114 asc_err
126:          0          0  icu 126 gptu
127:          0          0  icu 127 gptu
128:          0          0  icu 128 gptu
129:          0          0  icu 129 gptu
130:          0          0  icu 130 gptu
131:          0          0  icu 131 gptu
144:         34          0  icu 144 ath10k_pci
161:          0          0  icu 161 ifx_pcie_rc0
ERR:          1
logread | grep err
gives a few unrelated errors (but I get those even on a stock master build).
Unlike on x86/amd64, the interrupts are not automatically distributed, so you still need to set their CPU affinity. For example, to force all interrupts for ath9k wireless to CPU1, run:
echo 2 > /proc/irq/30/smp_affinity
Then check that its CPU1 column starts incrementing. You'll need to reconfigure this on every reboot; for example, I add the following to /etc/rc.local to move the ath9k wireless, ath10k wireless and DSL Rx interrupts to CPU1:
echo 2 > /proc/irq/30/smp_affinity
echo 2 > /proc/irq/72/smp_affinity
echo 2 > /proc/irq/144/smp_affinity
Or, alternatively, use the irqbalance daemon.
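If you're wondering where the value 2 comes from: it's just the hex mask with the bit for the target CPU set. A minimal sketch of deriving it (IRQ 30 is the ath9k example from above; adjust for your device):

```shell
# Sketch: derive the smp_affinity hex mask for a given CPU number.
# Bit N of the mask corresponds to CPU N.
cpu=1
mask=$(printf '%x' $((1 << cpu)))
echo "CPU${cpu} -> mask ${mask}"
# On the router you would then apply it with (path from the posts above):
# echo "${mask}" > /proc/irq/30/smp_affinity
```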
Are those numbers a bitmask? Like, can I just run echo 3 > /proc/irq/....
and expect all IRQs to be balanced automatically between the two VPEs, or must I use just one VPE for each IRQ?
Yes, smp_affinity is a bitmask of available CPUs but, no, it will not be automatically balanced (hence using 2 to force it to the second CPU).
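A tiny sketch of what each mask value selects on a 2-CPU system (bit 0 is CPU0, bit 1 is CPU1; 3 merely *allows* either CPU, it does not round-robin by itself):

```shell
# Decode smp_affinity mask values for a 2-CPU system.
for mask in 1 2 3; do
  cpus=""
  [ $((mask & 1)) -ne 0 ] && cpus="CPU0"
  [ $((mask & 2)) -ne 0 ] && cpus="${cpus}${cpus:+ }CPU1"
  echo "smp_affinity=${mask} -> ${cpus}"
done
# -> smp_affinity=1 -> CPU0
# -> smp_affinity=2 -> CPU1
# -> smp_affinity=3 -> CPU0 CPU1
```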
I compiled the same image plus irqbalance, but it doesn't seem to be balancing anything.
I enabled it in /etc/config/irqbalance, but it looks like no balancing is taking place:
           CPU0       CPU1
  7:      58099      63385  MIPS 7 timer
  8:       3061       3498  MIPS 0 IPI call
  9:       7935      13953  MIPS 1 IPI resched
 30:      83348          0  icu 30 ath9k
 63:      20052          0  icu 63 mei_cpe
 72:      21565          0  icu 72 xrx200_net_rx
 73:      37525          0  icu 73 xrx200_net_tx
 96:     109119          0  icu 96 atm_mailbox_isr
112:        188          0  icu 112 asc_tx
113:          0          0  icu 113 asc_rx
114:          0          0  icu 114 asc_err
126:          0          0  icu 126 gptu
127:          0          0  icu 127 gptu
128:          0          0  icu 128 gptu
129:          0          0  icu 129 gptu
130:          0          0  icu 130 gptu
131:          0          0  icu 131 gptu
144:         28          0  icu 144 ath10k_pci
161:          0          0  icu 161 ifx_pcie_rc0
ERR:          1
Should I open a ticket, or is there some config I'm missing?
Running irqbalance --debug --oneshot
shows:
This machine seems not NUMA capable.
Isolated CPUs: 00000000
Adaptive-ticks CPUs: 00000000
Banned CPUs: 00000000
Package 0: numa_node -1 cpu mask is 00000003 (load 0)
Cache domain 0: numa_node is -1 cpu mask is 00000003 (load 0)
CPU number 1 numa_node is -1 (load 0)
CPU number 0 numa_node is -1 (load 0)
Adding IRQ 30 to database
Adding IRQ 144 to database
Adding IRQ 7 to database
Adding IRQ 8 to database
Adding IRQ 9 to database
Adding IRQ 63 to database
Adding IRQ 72 to database
Adding IRQ 73 to database
Adding IRQ 96 to database
Adding IRQ 112 to database
Adding IRQ 113 to database
Adding IRQ 114 to database
Adding IRQ 126 to database
Adding IRQ 127 to database
Adding IRQ 128 to database
Adding IRQ 129 to database
Adding IRQ 130 to database
Adding IRQ 131 to database
Adding IRQ 161 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
Daemon couldn't be bound to the file-based socket.
-----------------------------------------------------------------------------
Package 0: numa_node -1 cpu mask is 00000003 (load 0)
Cache domain 0: numa_node is -1 cpu mask is 00000003 (load 0)
CPU number 1 numa_node is -1 (load 0)
Interrupt 144 node_num is -1 (ethernet/0:0)
CPU number 0 numa_node is -1 (load 0)
Interrupt 30 node_num is -1 (ethernet/0:1622)
Interrupt 96 node_num is -1 (other/0:50)
Interrupt 73 node_num is -1 (other/0:20)
Interrupt 72 node_num is -1 (other/0:19)
Interrupt 63 node_num is -1 (other/0:456)
Interrupt 9 node_num is -1 (other/0:94)
Interrupt 8 node_num is -1 (other/0:11)
Interrupt 7 node_num is -1 (other/0:2159)
Interrupt 128 node_num is -1 (other/0:0)
Interrupt 127 node_num is -1 (other/0:0)
Interrupt 126 node_num is -1 (other/0:0)
Interrupt 114 node_num is -1 (other/0:0)
Interrupt 113 node_num is -1 (other/0:0)
Interrupt 112 node_num is -1 (other/0:0)
Interrupt 161 node_num is -1 (other/0:0)
Interrupt 131 node_num is -1 (other/0:0)
Interrupt 130 node_num is -1 (other/0:0)
Interrupt 129 node_num is -1 (other/0:0)
I don't use irqbalance myself, so I can't help there. But did you actually test echoing 2 to an active IRQ's smp_affinity? If that doesn't work, then the patch likely wasn't applied.
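A quick way to sanity-check that the SMP patch took effect at all, before debugging affinity (generic Linux commands, nothing device-specific):

```shell
# Sketch: confirm the kernel actually brought up both VPEs.
# On a patched xrx200 build this should report 2.
grep -c '^processor' /proc/cpuinfo
# The affinity test is only meaningful on an IRQ that is actively
# incrementing in /proc/interrupts (e.g. ath9k under wireless load).
```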
That works. I tried irqbalance because I assumed a daemon could make better adjustments than me.
Just wondering whether the patches mentioned above were ever included in the master branch. @pc2005 would you be able to confirm?
Hi,
can you tell me how to add this patch myself (and where), or where to find sufficient documentation so I can try it?
The last time I patched the source and compiled a kernel myself was about 15 years ago, and back then I had the complete kernel source locally on my HDD.
But with this cross-compiling... I'm a bit lost.
Thanks in advance.
@elder_tinkerer This patch is already in the main OpenWrt branch. Tomorrow's snapshot should contain this fix.
@olek210
Thank you, I'll try it.
I (finally) managed to patch the kernel myself (phew), and it seems to work.
Now I can set TX and RX to different VPEs.
But the VDSL download speed is still much lower than with 21.02.3 (11,460 kB/s vs. 6,970 kB/s).
Maybe you've got some brilliant ideas on how to get back to the higher speed...
I also forgot to put LuCI on the list, so I had to do everything via the CLI. Fortunately that's not a big problem...
Other users report the speed dropping to 60 Mbps.
I'm interested whether you see any difference with software flow offloading enabled.
Are you able to generate a graph using flamegraph and perf record?
Then we can see what is consuming the CPU.
Yes, there's a difference.
With software flow offloading enabled I get a download speed of approx. 9,350 kB/s.
Without it, it's approx. 7,500 kB/s.
I get the highest speed when both IRQ 72 and 73 are set to use the first VPE.
All other combinations (both on the 2nd VPE, or each on a different one) result in lower speed.
Btw: for this test I used the latest snapshot.
To be honest,
I haven't the slightest clue how to set this up on the router.
Maybe with some (or a lot of) help I might be able to do it...
You need to compile the image with perf and perf_event support.
Then run these commands on the router:
perf record -F 99 -a -g -- sleep 60
perf script > /tmp/perf.script
Then copy the perf.script file to your computer via scp. On the computer, run these commands:
cat perf.script | ./stackcollapse-perf.pl > out.perf-folded
./flamegraph.pl out.perf-folded > perf.svg
The output is a perf.svg file, which is best opened in a browser.
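In case the intermediate file looks alien: stackcollapse-perf.pl emits one semicolon-separated call stack per line followed by a sample count, and flamegraph.pl renders box widths from those counts. A tiny sketch of the format with made-up symbol names:

```shell
# Hypothetical folded-stack data (symbol names are invented) showing
# the format flamegraph.pl consumes: frame;frame;... count
cat <<'EOF' > out.perf-folded
swapper;cpu_idle 420
ksoftirqd/0;net_rx_action;xrx200_poll_rx 133
EOF
cat out.perf-folded
```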