Okay, I'll check if irqbalance makes any difference. As for rpi-perftweaks.sh, I'll take a look, but it's been a while since I dabbled in hardcore bash (time to Google, I guess).
As per the other point, hard agree. Let me see if I can find a spare USB and I'll give it a whirl.
This strikes me as a little off... if sqm was enabled during this test... it's likely not tuned so well...
In addition to the above comments...
treat all tests separately... (sort out core bottlenecks without sqm first) then just try to sort out sqm as directly as possible (client directly wired to the pi or as best you can)
mixing the two of these things will make isolating stuff very difficult...
Now that you mention it, I tried running again from all previous devices and only the AP reported speeds lower than WAN. I find it quite likely SQM was the bottleneck then.
PC:
Speedtest by Ookla
Server: INFINITUM
ISP: Telmex
Latency: 2.09 ms (3.53 ms jitter)
Download: 754.75 Mbps (data used: 421.3 MB)
Upload: 204.47 Mbps (data used: 201.2 MB)
Packet Loss: 0.0%
Router:
Retrieving speedtest.net configuration...
Testing from Telmex (REDACTED)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by INFINITUM (REDACTED) [5.88 km]: 25.44 ms
Testing download speed................................................................................
Download: 752.06 Mbit/s
Testing upload speed......................................................................................................
Upload: 175.55 Mbit/s
strip everything back to the bare essentials then slowly change one thing at a time...
cat <<'TTT' > /bin/toaster.sh
echo -n 2 > /proc/irq/ETH0INTERRUPTNUM2/smp_affinity
TTT
chmod +x /bin/toaster.sh
echo 'PERFTWEAKS_SCRIPT="/bin/toaster.sh"' >> /root/wrt.ini
/etc/init.d/irqbalance disable
#reboot and wait 3 mins then test
the reboot is only needed to undo the previous settings... if you copy what the defaults are you can just run the script manually and reset each default each time...
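A minimal sketch of the "copy the defaults" idea, assuming the only settings being changed are the per-IRQ `smp_affinity` masks. The function names and the `IRQ_DIR` override are made up for illustration; they are not part of rpi-perftweaks.sh.

```shell
# Hypothetical helpers: snapshot and restore per-IRQ CPU affinity masks.
# IRQ_DIR can be pointed at a copy of /proc/irq to try the functions safely.
IRQ_DIR="${IRQ_DIR:-/proc/irq}"

save_irq_affinity() {
    # $1: file that receives one "<irq> <mask>" line per interrupt
    : > "$1"
    for f in "$IRQ_DIR"/*/smp_affinity; do
        [ -r "$f" ] || continue
        irq=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$irq" "$(cat "$f")" >> "$1"
    done
}

restore_irq_affinity() {
    # $1: file previously written by save_irq_affinity
    while read -r irq mask; do
        [ -w "$IRQ_DIR/$irq/smp_affinity" ] || continue
        # Some IRQs refuse affinity changes; ignore those write errors.
        echo "$mask" > "$IRQ_DIR/$irq/smp_affinity" 2>/dev/null || true
    done < "$1"
}
```

Run `save_irq_affinity /tmp/irq.defaults` once on a clean boot, then `restore_irq_affinity /tmp/irq.defaults` between tests instead of rebooting.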
for testing...
watch cat /proc/interrupts
something like this htoprc -RC htoprc-cpu-w-kernel-sortcpu
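To fill in a placeholder like ETH0INTERRUPTNUM2 in the script above, the IRQ numbers can be read out of /proc/interrupts, and the value written to `smp_affinity` is a hex bitmask with one bit per CPU. A rough sketch, with hypothetical helper names and assuming the NIC shows up as `eth0` in the last column:

```shell
# Hypothetical helpers for working out IRQ numbers and affinity masks.

irqs_for() {
    # $1: device name to look for, $2: interrupts file (defaults to the real one)
    awk -v pat="$1" '$0 ~ pat && $1 ~ /^[0-9]+:$/ { sub(/:/, "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

cpu_mask() {
    # $1: zero-based CPU index; prints the hex mask selecting only that CPU
    printf '%x\n' $((1 << $1))
}
```

For example, `cpu_mask 1` prints `2`, which is the value the toaster.sh line writes, so that interrupt ends up on the second core. `irqs_for eth0` lists the candidate IRQ numbers to substitute into the `/proc/irq/.../smp_affinity` path.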
Setting both min and max frequency to 2000000 and then performing a speedtest leads to an increase in speed of about 20Mbps with irqbalance enabled, and no difference with it disabled.
Also, htop reports both dnsmasq and ksoftirqd/1 hitting core 0 hard, and the latter core 1 as well. As you say, this might have more to do with interrupt handling than anything.
I'll read through RHEL's material on IRQs; it's my first time dealing with them, so if you have any pointers I'd be more than thankful.
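For reference, pinning the min and max frequency as described above can be done through the cpufreq sysfs files. This is only a sketch: it assumes the usual `policy*` directories exist on the board, the function names are invented, and `CPUFREQ_DIR` is overridable just so it can be tried on a scratch directory.

```shell
# Hypothetical helpers around the cpufreq sysfs interface (values are in kHz).
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpufreq}"

pin_cpu_freq() {
    # $1: frequency in kHz, e.g. 2000000 for 2.0 GHz
    khz="$1"
    for p in "$CPUFREQ_DIR"/policy*; do
        [ -d "$p" ] || continue
        # Raise the ceiling first so the new minimum never exceeds the maximum.
        echo "$khz" > "$p/scaling_max_freq"
        echo "$khz" > "$p/scaling_min_freq"
    done
}

show_cpu_freq() {
    # Print the current limits for every policy, one line each.
    for p in "$CPUFREQ_DIR"/policy*; do
        [ -d "$p" ] || continue
        printf '%s min=%s max=%s\n' "$(basename "$p")" \
            "$(cat "$p/scaling_min_freq")" "$(cat "$p/scaling_max_freq")"
    done
}
```

`pin_cpu_freq 2000000` followed by `show_cpu_freq` is a quick way to confirm the setting stuck before running the next speedtest.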
it's probably wise to swap that nic and/or swap wan<>lan before doing too much more... or just do some general web searching about that nic and what other people say about the linux driver...
sorry, I missed you were asking specifically re:sqm...
i'm probably not the best person to ask... I don't even use overheads... or any other fancy settings... just tweak the speeds down until it works (-2 to 5%)... then make sure that value is lower than mean actual obtainable bandwidth over time...
that said... on really high speeds you might experiment with trying 'ack-filter'... ( a separate thread would be good for your tuning once you get the baseline bandwidth to a predictable value)...
I think I may have also read that fq_codel (or no sqm), not cake, is preferred for super high bandwidth connections...
(that said... the low(er) results from the AP are likely application load on the AP itself or some constraint thereafter)
I personally wouldn't be too concerned with such a value if it's consistent... and nothing (any cpu core) is maxed out... ( but I suspect we are not there yet )
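As a worked example of the "tweak the speeds down" approach, the starting shaper rates can be computed from a measured speed and a percentage margin. The helper name is hypothetical, and it assumes the SQM rates are entered in kbit/s, as in the sqm-scripts config.

```shell
# Hypothetical helper: turn a measured speed into a starting SQM rate.
sqm_rate() {
    # $1: measured speed in Mbit/s, $2: margin to subtract, in percent
    # Prints the shaper rate in kbit/s, truncated to a whole number.
    awk -v m="$1" -v p="$2" 'BEGIN { printf "%d\n", m * (100 - p) * 10 }'
}

# Figures from the PC speedtest earlier in the thread, with a 5% margin:
sqm_rate 754.75 5    # download
sqm_rate 204.47 5    # upload
```

These are only starting points; the margin still has to be adjusted until the rate stays below what the link actually sustains over time.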
Thank you for the explanation. Since I need a dependency package that isn't available yet in OpenWrt, I need to make sure the services run well after upgrading.
All bundled up in /bin/toaster.sh as my custom perftweaks.sh you helped me create yesterday.
I also added the following to /boot/config.txt to overclock to 2.0GHz
# Overclocking to 2GHz
over_voltage=6
arm_freq=2000
After all that, it seems I'm able to run a speedtest reliably from any device in the network without maxing out any CPU core, topping out around 87% for core 1 on average, as long as SQM is disabled. With SQM enabled, using cake as the qdisc and simple.qos as the Queue Setup Script, the core maxes out at 100%, but speed doesn't drop and latency only improves a little.
Given that SQM aside the configuration seems stable, should I aim for further fine-tuning, or look somewhere else along the chain?
good news... sounds like you are on the limit / have enough wiggle room to tune things to run cake if you really want to...
what i've done in this case... is to ensure any services that may use cpu are pinned to core 4 (or 3 if you count from zero)... you can see examples with taskset maybe in rpi-perftweaks.sh if not... ask and I may be able to advise... but these are just for a 'little' wiggle room on a single core and can backfire...
i.e.
taskset-aarch64 -apc 2,3 $thispid
( good candidates are collectd, nlbwmon, uhttpd etc. )
other than that... i'd create that new thread regarding sqm for more professional advice re: tuning cake/fq_codel/other for on-the-limit-cpu/high bandwidth scenarios...
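A rough sketch of how the pinning could be looped over several services. It assumes a plain `taskset` binary and `pidof` are available (the `TASKSET` variable lets you swap in `taskset-aarch64` or whatever the build ships); the function names are made up for illustration.

```shell
# Hypothetical helpers: pin already-running services to a set of cores.
TASKSET="${TASKSET:-taskset}"

pin_pids() {
    # $1: CPU list such as 2,3; remaining args: PIDs to pin
    cpus="$1"; shift
    for pid in "$@"; do
        # -a: all threads of the PID, -p: act on an existing PID, -c: CPU list
        $TASKSET -apc "$cpus" "$pid"
    done
}

pin_services() {
    # $1: CPU list; remaining args: process names to look up with pidof
    cpus="$1"; shift
    for name in "$@"; do
        pids=$(pidof "$name" 2>/dev/null) || continue
        pin_pids "$cpus" $pids
    done
}
```

Something like `pin_services 2,3 collectd nlbwmon uhttpd` would then go at the end of the custom perftweaks script. Services that are not running are skipped, and anything restarted later loses its pinning.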