[18.06.4] speed fix for BT HomeHub 5a

Yes, tested and running (or ran, since it's a hotplug script). It'd be hard to get this simple change wrong!

Although with git head, I'm getting an odd "No such file or directory" error when trying to read or write to the xps_cpus files (which are present). Looks to be a kernel or SELinux issue, perhaps.

I guess the real question is: why are the defaults in 20_smp_tune as they are, and why do some routers/SoCs have issues, like the BTHH5A or the Turris Omnia? I think there is a common property of the problematic SoCs that might be detectable and might allow 20_smp_tune to do the right thing...

Can't speak for the BTHH5A, but the patch below (which has shipped with mvebu devices since kernel 4.4) could be one of the reasons why devices using mvneta + mvebu (as the Turris Omnia does) have quirky reactions to 20_smp_tune:

It aims to fix this: https://bugs.openwrt.org/index.php?do=details&task_id=294

From Dave Täht 19.01.2017 22:57 in FS#294:

 Also, the current code is locked to the first core.

root@linksys-1200ac:/proc/irq# cd 37
root@linksys-1200ac:/proc/irq/37# echo 2 > smp_affinity
-ash: write error: I/O error
root@linksys-1200ac:/proc/irq/37# ls
affinity_hint      node               smp_affinity_list
mvneta             smp_affinity       spurious

I've created pull request PR2553 to change the packet steering to use all CPUs, but also to disable it by default. Hopefully the reasoning provided in the pull request will be enough for it to be accepted. Essentially:

  • The choice of non-IRQ CPUs for RPS seems to come from a single Red Hat document, without any actual performance testing.
  • The original packet steering patches advise that the optimal CPU mask depends on the architecture and cache hierarchy, so one size does not fit all.
  • The original packet steering patches also advise that the processing overhead can cause performance degradation on a lightly loaded server.
  • Proper IRQ balancing is a better option.

I have tried running irqbalance and seen a 20% improvement in iperf3 results, as well as greater consistency.
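For anyone who wants to try the same, a minimal sketch of enabling it on OpenWrt (assuming the packaged UCI config with its 'enabled' option; check /etc/config/irqbalance on your build):

# install the package and switch it on (it ships disabled by default)
opkg update && opkg install irqbalance
uci set irqbalance.irqbalance.enabled='1'
uci commit irqbalance
/etc/init.d/irqbalance enable
/etc/init.d/irqbalance start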

For anyone who is interested, I ran iperf3 against LEDE 17.01.6, 18.06.4 and 19.07.0-rc2, with and without software flow offloading and with fixes for the 20-smp-tune script (the 20-line untested fix), using Windows 10 laptops. The server was connected to the red WAN port (static IP) or a LAN port; the client to a LAN port or 5 GHz wifi.



Example iperf3 commands:
Single thread: iperf3 -c 192.168.111.2 -t 10 -R
Single thread: iperf3 -c 192.168.111.2 -t 10 
Multi thread: iperf3 -c 192.168.111.2 -t 10 -P 5 -R
Multi thread: iperf3 -c 192.168.111.2 -t 10 -P 5 
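For completeness, the other laptop just runs iperf3 in listen mode:
Server: iperf3 -s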

Thanks for taking the time to produce the above results with the 19.07.0-RC2 builds, @bill888. Did you make any changes to settings such as:

  1. WMM Mode (did you leave it enabled or disabled?)
  2. Did you force 40 MHz on the 2.4 GHz band?
  3. 802.11w Management Frame Protection

No extra packages installed. Using Ch. 36 and WPA2-PSK with CCMP (AES) were the only wifi changes.
1 - default. WMM Mode enabled.
2 - only 5 GHz 80 MHz tested. 2.4 GHz radio disabled.
3 - tbh, no idea what this 802.11w Management Frame Protection setting is or where it is located.


Thanks a lot for the testing. I'm not sure I get what exactly you mean by "(echo) & flow"...
Does "flow offload enabled + echo script" mean the two-line script from your post above?

echo 3 > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 3 > /sys/class/net/eth0/queues/tx-0/xps_cpus

or the long script version below "# Untested fix to restore to pre Feb 2018 values."?

I used the 'long' 20-line 'untested fix' echo command script.

Sorry for not being clear. '(echo) & flow' was my abbreviation for using both the above script and enabling software flow offloading.


Can you show your config? I wonder what the CPU affinity is on this 2-CPU router:
cat /proc/interrupts; grep . /proc/irq/*/smp_affinity_list

On a ZBT WG3526 with MT7621A (which has 2 physical cores but shows 4 CPUs in /proc/cpuinfo), the defaults are that the Ethernet IRQ is handled by all CPUs (0-3), but each radio is pinned to a specific CPU:

# cat /proc/interrupts|grep -E "mt7|ethernet"
 22:   11376422          0          0          0  MIPS GIC  10  1e100000.ethernet
 24:         23          0          0          0  MIPS GIC  11  mt7603e
 25:          2          0   98994760          0  MIPS GIC  31  mt76x2e

# grep . /proc/irq/{22,24,25}/smp_affinity_list
/proc/irq/22/smp_affinity_list:0-3
/proc/irq/24/smp_affinity_list:3
/proc/irq/25/smp_affinity_list:2

# grep . /sys/class/net/*/queues/*/rps_cpus
/sys/class/net/br-lan/queues/rx-0/rps_cpus:0
/sys/class/net/eth0.1/queues/rx-0/rps_cpus:0
/sys/class/net/eth0.2/queues/rx-0/rps_cpus:0
/sys/class/net/eth0/queues/rx-0/rps_cpus:e
/sys/class/net/lo/queues/rx-0/rps_cpus:0
/sys/class/net/wlan0/queues/rx-0/rps_cpus:e
/sys/class/net/wlan1/queues/rx-0/rps_cpus:e
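For reference, these files hold hexadecimal CPU bitmasks: e is binary 1110, i.e. steer to CPUs 1-3 and leave CPU 0 alone, while 0 means no steering for that queue. Setting one by hand looks like this:

# e = binary 1110: steer receive processing to CPUs 1, 2 and 3
echo e > /sys/class/net/eth0/queues/rx-0/rps_cpus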

It's come to my attention that @Reiver's commit has finally made it into master and will be included in OpenWrt 20.x:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=d3868f15f876507db54afacdef22a7059011a54e

Apparently, it will not be backported into 19.07 or 18.06.
https://github.com/openwrt/openwrt/pull/2553#issuecomment-594771114

@Reiver

Judging from this:

and this:

I can see that anyone who wants to manually edit the existing "20-smp-tune" file to take advantage of the performance fix needs to copy/paste the following instead of the code suggested back on 12 Nov '19:

for q in ${dev}/queues/rx-*; do
	set_hex_val "$q/rps_cpus" "$PROC_MASK"
done

for q in ${dev}/queues/tx-*; do
	set_hex_val "$q/xps_cpus" "$PROC_MASK"
done

Could you please confirm whether that's the case, or whether the 4 additional lines that I removed are still needed? I am talking about these:

ntxq="$(ls -d ${dev}/queues/tx-* | wc -l)"
idx=$(($irq_cpu + 1))
            
let "idx = idx + 1"
[ "$idx" -ge "$NPROCS" ] && idx=0

The quickest way on an existing device, since packet steering is now disabled by default, is just to:

$ rm /etc/hotplug.d/net/20-smp-tune

Of course, this won't affect currently-up interfaces until a restart, or until you manually echo 0 to their rps_cpus and xps_cpus entries.
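Something like this sketch clears the masks on every queue of every current interface (paths as in the sysfs output shown earlier in the thread):

# zero the steering masks wherever they exist and are writable
for q in /sys/class/net/*/queues/rx-*/rps_cpus /sys/class/net/*/queues/tx-*/xps_cpus; do
	[ -w "$q" ] && echo 0 > "$q"
done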

Those four additional lines are not needed if you want to keep packet steering enabled and use the $PROC_MASK change from before (i.e. all cores, not just the non-IRQ-handling ones). They're just noise now.
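For reference, $PROC_MASK is the all-CPUs bitmask. Roughly speaking (a sketch following the names used in 20-smp-tune; check your copy of the script), it is derived from the processor count:

# 2 CPUs -> mask 3 (binary 11); 4 CPUs -> mask f (binary 1111)
NPROCS="$(grep -c "^processor" /proc/cpuinfo)"
PROC_MASK="$(( (1 << NPROCS) - 1 ))"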

I've just tested latest snapshot r12470 on a spare HH5A.

Using iperf3 multi-threaded testing, the best WAN-to-LAN throughput I can get is 80 Mbps. This is no better than 18.06 with 'broken' packet steering settings, and far worse than 17.01.

I re-read the description of the commit and it states packet steering is now 'disabled by default', and:

The previous netifd implementation (by default but could be configured) simply used all CPUs and this patch essentially reverts to this behaviour.

But from my observations, the new commit clearly doesn't return the 130+ Mbps throughput the HH5a achieved with the SMP snapshots from 2017.

The new commit seems to have fixed one problem but created another?

Update: Following a tip from @mkresin:

To enable packet steering, SSH into the HH5A and execute these two commands:

uci set network.globals.packet_steering=1
uci commit network

This will also add a line to the /etc/config/network file:

config globals 'globals'
	option packet_steering '1'
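Since the setting is applied when interfaces come up (it's hotplug-driven), either reboot or restart networking for it to take effect:

/etc/init.d/network restart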

WAN-to-LAN throughput now seems to return to 130 Mbps in both directions in brief testing.


Just installed r12629 and enabled packet_steering. However, I don't see any change; my WAN-to-LAN speeds are still the same, ~50 Mbps in both directions. Any ideas on how to debug this?

I just tried r12629. I get 100-130 Mbps WAN to LAN in both directions using multi-threaded iperf3 with packet steering enabled, and only 70-80 Mbps with it disabled.

Single threaded iperf3 speed tests return 90 Mbps in both directions between WAN and LAN with packet steering enabled.

50 Mbps seems awfully low. How are you measuring the speed, and is it without installing any extra packages (e.g. SQM) other than LuCI?

Stating the obvious, but did you reboot router after enabling packet steering?
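A quick sanity check (a sketch, assuming eth0 is the upstream interface as on the HH5A): verify the masks are actually non-zero, and watch /proc/interrupts during an iperf3 run to see whether more than one CPU is taking load:

grep . /sys/class/net/eth0/queues/rx-*/rps_cpus /sys/class/net/eth0/queues/tx-*/xps_cpus
cat /proc/interrupts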

Hi all,
I have a BT HomeHub 5a and I am on 19.07.2. If I understood correctly, to fix this I would need to upgrade to the current snapshot version and then enable packet steering? Thanks

Not necessarily. You may simply edit /etc/hotplug.d/net/20-smp-tune as explained in @Reiver's post above.

I am on master and I have this setting in LuCI. Does this option conflict with SQM?