The AX210 is supposed to use fq_codel also, and probably (?) also has AQL in it. An even narrower channel (HT20?) would be revealing. And probably depressing.
I had NOTHING to do with the AX210 port to fq_codel; I just made the assumption that the 300-member engineering team at Intel would get it right.
It doesn't look right to me, sadly. However, in this test scenario where the AX210 connects as a client to the WiFi it shouldn't matter, correct? It should be relying on the fq_codel implementation in the R7800.
@pattagghiu @asvio, if you're still getting pppoe compile errors, can you open an issue on my nss-packages repo? I'd like to keep this thread more focused on issues related to the functioning/stability of the NSS drivers.
I'd like you to run: make target/clean
and then post the verbose output of the error from: make package/{qca-nss-drv,qca-nss-clients,qca-nss-ecm}/{clean,compile} V=sc -j$(nproc)
I saw that you applied patches under package/kernel/mac80211. They should be applied under target/linux instead: they must live inside build_dir/target-*/linux-*/linux-*/patches/generic, so add them there to your quilt series.
I had no build problem with qca-nss-drv-pppoe enabled, using your latest commits yesterday as well as your openwrt master merge to 5.15-qsdk11-new-krait-cc today.
Unfortunately, even with this new 5.15-based build, I have still encountered random reboots due to "RCU stalling", similar to what we observed in NSS 22.03 or NSS master builds. Not everyone reported it, but at least 4 of us (Tishipp, Mpilon, D43m0n and me) had a chance to see it in the console or remote syslog using NSS 22.03 or Master. The symptom of this RCU stalling is that the router seems to hang for about 30 seconds (WiFi is also down) before it spontaneously reboots. 30 seconds is the default watchdog timeout in OpenWrt. The spontaneous reboot was most likely triggered by the watchdog timeout, which is why there was no ramoops crash dump.
Mpilon also reported a spontaneous reboot with his router running your 5.15-based build after 7 hours. Even though he did not get any syslog, I'm very sure it was caused by the same RCU stalling issue. Mpilon and I were the two particularly unlucky creatures who have bumped into this RCU stalling issue quite a few times lately.
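For anyone who does manage to capture console or remote syslog output before the reboot, a quick way to check for the stall messages is a grep over the saved log. This is only a minimal sketch: the function name and the log path are hypothetical, and the pattern assumes the kernel prints "rcu" followed by "stall" on the same line, which may differ between kernel versions.

```shell
# Count log lines that look like RCU stall reports.
# Assumption: the kernel message contains "rcu" followed by "stall" on one line,
# e.g. "rcu: INFO: rcu_sched self-detected stall on CPU". Exact wording varies.
count_rcu_stalls() {
    # $1: path to a saved log file (hypothetical, e.g. a remote syslog capture)
    grep -ciE 'rcu.*stall' "$1"
}
```

Calling count_rcu_stalls on a saved log file prints the number of matching lines; 0 means nothing that looks like a stall report was captured.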
Anyway, I have disabled irqbalance and am running the same 5.15 build again. Disabling irqbalance seemed to mitigate the RCU stalling-caused reboots to a large extent, based on my previous encounters with recent NSS 22.03 or Master builds. As for clamping the Krait cores to a specific frequency, I intentionally did not use it.
After the strange data that was obtained, I have set up a native Linux machine (a bit old, a Q9550) connected by cable to serve as a test server. The laptop that I am using as a client is the same, but from now on I will do the tests in the native Kubuntu that I have installed.
Previous latency issues stemmed from using a virtual machine as a server even though it was running on an i7-12700k with 16 assigned threads.
Yes sir, much better! Can you redo it with -l 300 -s 0.5 added to the flent rrul_be test parameters? And let's see how it behaves after running for a little while.
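As a rough sketch, the full command line for that longer run could be assembled like this. The helper name, server address and title are placeholders, not anything from the posts; -H (netperf server host) and -t (title) are ordinary flent options, while -l 300 and -s 0.5 are the test length and sampling step in seconds requested above.

```shell
# Print the flent command line for a 300-second rrul_be run sampled every 0.5 s.
# -l: test length in seconds, -s: step size in seconds,
# -H: netperf server host, -t: title stored with the results.
flent_rrul_be_cmd() {
    # $1: netperf server address (placeholder), $2: run title (placeholder)
    printf 'flent rrul_be -l 300 -s 0.5 -H %s -t %s\n' "$1" "$2"
}
```

For example, flent_rrul_be_cmd 192.168.1.1 r7800-5.15 prints the command to run on the client; the address and title here are made up for illustration.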
Is this with aql_txq_limit set at 2000 2000 and our suggested patches?
@vochong, WiFi is not a full duplex connection. It is a half-duplex one, so you need to add both to get the total throughput in a simultaneous download and upload test. So it looks pretty decent.
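To put numbers on that, the arithmetic is just the sum of the two directions. A tiny sketch, with a hypothetical helper name and made-up example values (not taken from any test in this thread):

```shell
# Sum download and upload throughput (Mbit/s) from a simultaneous test.
# On a half-duplex link both directions share the same airtime, so the sum
# is the figure to compare against a one-directional result.
total_throughput() {
    # $1: download in Mbit/s, $2: upload in Mbit/s
    awk -v d="$1" -v u="$2" 'BEGIN { printf "%.1f\n", d + u }'
}
```

For instance, total_throughput 250 180.5 adds the two hypothetical figures and prints the combined rate in Mbit/s.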
I was writing the post below when I saw your post:
"In previous tests I noticed that the speed was too low, in my opinion. I have done additional tests and realized that the switch that the new server I prepared connects to had a problem and was working at 100 Mbit/s."
The R7800 is running a 5.15 image based on Qosmio's latest commits as of today (which may have the incorrectly applied patches that you mentioned). The netperf server is running on the same R7800.
The flent client is running on a very old Dell laptop (E7470) with 2x2 WiFi 5.
# cat /sys/kernel/debug/ieee80211/*/aql_txq_limit
AC    AQL limit low    AQL limit high
VO    2000             2000
VI    2000             2000
BE    2000             2000
BK    2000             2000
AC    AQL limit low    AQL limit high
VO    2000             2000
VI    2000             2000
BE    2000             2000
BK    2000             2000
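For reference, limits like the ones shown above can be written through the same debugfs file. The sketch below is hedged: the function name is hypothetical, and it assumes the file accepts one "<ac> <low> <high>" line per write, with access categories numbered 0 to 3 for VO, VI, BE and BK. Check this against your kernel before relying on it.

```shell
# Write the same low/high AQL limit for every access category on every phy.
# Assumption: each write takes "<ac> <low> <high>", where ac 0..3 = VO, VI, BE, BK.
set_aql_txq_limit() {
    # $1: ieee80211 debugfs directory, $2: low limit, $3: high limit
    for f in "$1"/*/aql_txq_limit; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        for ac in 0 1 2 3; do
            # One write per access category; debugfs applies each separately.
            echo "$ac $2 $3" > "$f"
        done
    done
}
```

On the router this would be called as set_aql_txq_limit /sys/kernel/debug/ieee80211 2000 2000, followed by the cat command above to confirm the values.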
You may want to first run these tests with 22.03 as a baseline for each of your test setups, and then run the same tests with 5.15, sundry patches ...
I think assumptions about what's ok/not are being made for performance in general, not how your test system is working.
B) I recall reading (for developers, latency thread) that most of the Intel WiFi adapters have unexplained stall or stuttering and the issue is in their firmware - we can't fix it.
I had one such M.2 adapter and saw my WiFi fall on its face sometimes. The suggestion was to go with a MediaTek MT7921, as the 7922AX is difficult to source.
As is the 7921 ... I found one removed from an hp laptop on ebay here in the US - the rest looked like knockoffs.
Anyway .... Suggest you test your setups with acceptable good released code, then compare with this latest stuff.
M.
EDIT: here's the Discussion about mediatek, Intel issues.