amteza
2963
All good. I'm on AEST. Enjoy dinner; your name looks very Spanish to me, so I guess it's dinner time there.
amteza
2964
It doesn't look right to me, sadly. However, in this test scenario where the AX210 connects as a client to the WiFi it shouldn't matter, correct? It should be relying on the fq_codel implementation in the R7800.
dtaht
2965
On the rrul test:
A) Total potential bandwidth is halved (but buffering not controlled by fq_codel stays the same). Both sides have buffering.
B) The solo download test from the R7800 was pretty good; perhaps it will get better with those patches.
qosmio
2966
Make sure you're using the 5.15-qsdk11-new-krait-cc branch
It incorporates @amteza's patch.
@pattagghiu @asvio, if you're still getting pppoe compile errors, can you open an issue on my nss-packages repo? I'd like to keep this thread more focused on issues related to the functioning/stability of NSS drivers.
I'd like you to run:
make target/clean
and a verbose output of the error with the following:
make package/{qca-nss-drv,qca-nss-clients,qca-nss-ecm}/{clean,compile} V=sc -j$(nproc)
My compilation has just failed with these errors
ERROR: modpost: "nss_crypto_pm_notify_register" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
ERROR: modpost: "nss_crypto_notify_register" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
ERROR: modpost: "nss_crypto_pm_notify_unregister" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
ERROR: modpost: "nss_crypto_tx_msg" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
ERROR: modpost: "nss_crypto_data_register" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
ERROR: modpost: "nss_crypto_tx_buf" [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/v1.0/src/qca-nss-crypto.ko] undefined!
make[5]: *** [scripts/Makefile.modpost:133: /home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/Module.symvers] Error 1
make[5]: *** Deleting file '/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/Module.symvers'
make[4]: *** [Makefile:1813: modules] Error 2
make[4]: Leaving directory '/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/linux-5.15.68'
make[3]: *** [Makefile:79: /home/R7800-qosmio-5.15-qsdk11-new-krait-cc/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-crypto-2021-03-20-2271a3a/.built] Error 2
make[3]: Leaving directory '/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/feeds/nss/qca-nss-crypto'
time: package/feeds/nss/qca-nss-crypto/compile#5.40#1.19#10.66
ERROR: package/feeds/nss/qca-nss-crypto failed to build.
make[2]: *** [package/Makefile:116: package/feeds/nss/qca-nss-crypto/compile] Error 1
make[2]: Leaving directory '/home/R7800-qosmio-5.15-qsdk11-new-krait-cc'
make[1]: *** [package/Makefile:110: /home/R7800-qosmio-5.15-qsdk11-new-krait-cc/staging_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/stamp/.package_compile] Error 2
make[1]: Leaving directory '/home/R7800-qosmio-5.15-qsdk11-new-krait-cc'
make: *** [/home/R7800-qosmio-5.15-qsdk11-new-krait-cc/include/toplevel.mk:231: world] Error 2
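To see at a glance which symbols modpost could not resolve, a log like the one above can be condensed. This is only a minimal sketch, assuming the verbose build output was saved to a file (build.log is a hypothetical name) and that the error lines look exactly like the ones shown here:

```shell
#!/bin/sh
# Print "<symbol> <module.ko>" once for every undefined symbol that modpost
# reported. Reads a build log on stdin; every other line is ignored.
undefined_symbols() {
    sed -n 's|^ERROR: modpost: "\([^"]*\)" \[.*/\([^/]*\.ko\)] undefined!$|\1 \2|p' | sort -u
}

# Usage (build.log is a hypothetical file name):
#   make package/feeds/nss/qca-nss-crypto/compile V=sc 2>&1 | tee build.log
#   undefined_symbols < build.log
```

The resulting list is short enough to compare against whatever the driver package is expected to export.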
amteza
2968
I saw that you applied the patches under package/kernel/mac80211. They should be applied under target/linux: they must live inside build_dir/target-*/linux-*/linux-*/patches/generic, so add them there to your quilt series.
vochong
2969
@qosmio
I had no build problem with qca-nss-drv-pppoe enabled, using your latest commits yesterday as well as your OpenWrt master merge to 5.15-qsdk11-new-krait-cc today.
Unfortunately, even with this new 5.15-based build, I have still encountered random reboots due to "RCU stalling", similar to what we observed in NSS 22.03 or NSS master builds. Not everyone reported it, but at least four of us (Tishipp, Mpilon, D43m0n and me) have seen it on the console or in the remote syslog using NSS 22.03 or master. The symptom is that the router seems to hang for about 30 seconds (WiFi was also down) before it spontaneously reboots. 30 seconds is the default watchdog timeout in OpenWrt, so the reboot was most likely triggered by the watchdog, which would explain why there was no ramoops crash dump.
Mpilon also reported a spontaneous reboot with his router running your 5.15-based build after 7 hours. Even though he did not get any syslog, I'm very sure it was caused by the same RCU stalling issue. Mpilon and I were the two particularly unlucky creatures who have bumped into this RCU stalling issue quite a few times lately.
Anyway, I have disabled irqbalance and am running the same 5.15 build again. Based on my previous encounters with recent NSS 22.03 or master builds, disabling irqbalance seemed to mitigate the reboots caused by RCU stalling to a large extent. I intentionally did not clamp the Krait cores to a specific frequency.
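For anyone who wants to try the same mitigation, here is a minimal sketch of stopping irqbalance and keeping it from starting at boot. It assumes irqbalance comes from the standard OpenWrt package with its init script at /etc/init.d/irqbalance; the optional path argument is a hypothetical convenience so the function can be pointed elsewhere.

```shell
#!/bin/sh
# Stop irqbalance now and keep it from starting at the next boot.
# $1 (optional): path to the init script, default /etc/init.d/irqbalance.
disable_irqbalance() {
    init="${1:-/etc/init.d/irqbalance}"
    if [ ! -x "$init" ]; then
        echo "init script not found: $init" >&2
        return 1
    fi
    "$init" stop      # stop the running daemon
    "$init" disable   # remove it from the boot sequence
}

# Usage, as root on the router:
#   disable_irqbalance
```

Running the init script with enable and start instead should restore the previous behaviour.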
vochong
2970
@sppmaster
make clean && make download && make -j1 V=s will fix the build errors most of the time for me, unless some feeds, patches or commits are actually bad.
xeonpj
2971
we are on the right track.
asvio
2972
After the strange data I obtained, I have set up a native Linux machine (a slightly old Q9550) connected by cable to serve as a test server. The client laptop is the same, but from now on I will run the tests from the native Kubuntu install on it.
The previous latency issues stemmed from using a virtual machine as the server, even though it was running on an i7-12700K with 16 assigned threads.
Here are the new tests.
WiFi 5, 160 MHz
WiFi 5, 80 MHz
WiFi 5, 20 MHz
amteza
2973
Yes sir, much better! Can you redo it, adding the -l 300 -s 0.5 parameters to the flent rrul_be test? And let's see how it behaves after running for a little while.
Is this with aql_txq_limit set at 2000 2000 and our suggested patches?
asvio
2974
Yes, it is. The last test I did yesterday already included it. I have not rebooted the router; it has been up for 22 h so far.
I would need the full command. I'm very much a rookie at these things.
amteza
2975
Apologies, here you go:
flent rrul_be --verbose -t 'rrul_be Kubuntu v22.03-ath10k-fixes WLAN ecn-off tx_burst-2 ts2time-8ms napi-pool-8 aql-2000ms' -H ubuntu-server -p all --figure-width=12.80 --figure-height=9.60 -l 300 -s .05 -o rrul_be-Kubuntu-v22.03-ath10k-fixes-WLAN-ECN-of_tx_burst-2_ts2time-8ms-napi_poll-8-aql-2000ms-noaql.png
Based on your graphs, I assume you are running netserver and irtt on your ubuntu-server.
vochong
2976
Why were your throughputs so low in all these tests? Only around 30 Mbps Upload / Download, even with WIFI 5 160 MHz?
amteza
2977
@vochong, WiFi is not a full duplex connection. It is a half-duplex one, so you need to add both to get the total throughput in a simultaneous download and upload test. So it looks pretty decent.
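As a quick illustration of that arithmetic, here is a minimal sketch that adds the two directions. The 30 Mbit/s figures are the rough numbers from the question above, not measured values.

```shell
#!/bin/sh
# Sum the download and upload rates (Mbit/s) that were measured at the same
# time, since both directions share the same half-duplex airtime.
aggregate_mbps() {
    LC_ALL=C awk -v down="$1" -v up="$2" 'BEGIN { printf "%.1f\n", down + up }'
}

aggregate_mbps 30 30   # prints 60.0
```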
asvio
2978
I was writing the post below when I saw your post:
"In previous tests I noticed that the speed was too low in my opinion. I have done additional tests and realized that the switch the new server is connected to had a problem and was running at 100 Mbit/s."
I need to run new tests.
vochong
2979
@amteza
@asvio
The R7800 is running a 5.15 image based on Qosmio's latest commits as of today (which may include the incorrectly applied patches you mentioned). The netperf server is running on the same R7800.
The flent client is running on a very old Dell laptop (E7470) with 2x2 WiFi 5.
# cat /sys/kernel/debug/ieee80211/*/aql_txq_limit
AC AQL limit low AQL limit high
VO 2000 2000
VI 2000 2000
BE 2000 2000
BK 2000 2000
AC AQL limit low AQL limit high
VO 2000 2000
VI 2000 2000
BE 2000 2000
BK 2000 2000
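For anyone who wants to reproduce these limits, the following is a minimal sketch of how they can be written. It assumes the mac80211 debugfs file accepts lines of the form "<ac> <low> <high>", with access categories numbered 0 to 3 for VO, VI, BE and BK, and limits given in microseconds of airtime. AQL_ROOT is a hypothetical override added here only so the loop can be tried against a scratch directory.

```shell
#!/bin/sh
# Apply the same AQL low/high limit to every access category on every radio.
# Needs root and a mounted debugfs when used on the router.
AQL_ROOT="${AQL_ROOT:-/sys/kernel/debug/ieee80211}"

set_aql_limits() {
    low="$1"
    high="$2"
    for f in "$AQL_ROOT"/*/aql_txq_limit; do
        [ -e "$f" ] || continue          # no radios matched the glob
        for ac in 0 1 2 3; do            # 0=VO 1=VI 2=BE 3=BK
            echo "$ac $low $high" > "$f"
        done
    done
}

# Usage, as root on the router:
#   set_aql_limits 2000 2000
#   cat /sys/kernel/debug/ieee80211/*/aql_txq_limit
```

The setting is not persistent, so it would need to be reapplied after a reboot, for example from a startup script.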
Mpilon
2980
Stray thoughts about some of these test results -
A) You may want to first run these tests with 22.03 as a baseline for each of your test setups, and then run the same tests with 5.15, sundry patches ...
I think assumptions about what's ok/not are being made for performance in general, not how your test system is working.
B) I recall reading (for developers, latency thread) that most of the Intel WiFi adapters have unexplained stall or stuttering and the issue is in their firmware - we can't fix it.
I had one such M.2 adapter and saw my WiFi fall on its face sometimes. The suggestion was to go with a MediaTek MT7921, as the 7922AX is difficult to source.
As is the 7921 ... I found one removed from an HP laptop on eBay here in the US; the rest looked like knockoffs.
Anyway .... Suggest you test your setups with acceptable good released code, then compare with this latest stuff.
M.
EDIT: here's the Discussion about mediatek, Intel issues.
asvio
2981
data
iperf after test
asvio@MSI-GS72-6QE:~$ date
vie 23 sep 2022 08:39:56 CEST
asvio@MSI-GS72-6QE:~$ iperf3 -c 192.168.1.208 -f M -i 60 -t 10 -P 1
Connecting to host 192.168.1.208, port 5201
[  5] local 192.168.1.10 port 37436 connected to 192.168.1.208 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec   733 MBytes  73.2 MBytes/sec    0   3.17 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   733 MBytes  73.2 MBytes/sec    0             sender
[  5]   0.00-10.00  sec   732 MBytes  73.2 MBytes/sec                  receiver
iperf Done.
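Because -f M reports MBytes/sec while the flent graphs use Mbit/s, a conversion helps when comparing the two. This is a minimal sketch, assuming iperf3 counts a MByte as 1024 x 1024 bytes and a Mbit as 1,000,000 bits; if that assumption is wrong for your build, the result shifts by roughly 5%.

```shell
#!/bin/sh
# Convert an iperf3 "MBytes/sec" figure to Mbit/s.
# Assumes 1 MByte = 1048576 bytes and 1 Mbit = 1000000 bits.
mbytes_to_mbits() {
    LC_ALL=C awk -v r="$1" 'BEGIN { printf "%.0f\n", r * 1048576 * 8 / 1000000 }'
}

mbytes_to_mbits 73.2   # prints 614
```

Running iperf3 with -f m instead of -f M should report the bitrate in Mbit/s directly.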
amteza
2982
Wow, I'm completely puzzled by your new graph. It makes no sense that after your previous one, you have this change. Was someone else using the network? What changed? It should be more in line with your previous one or @vochong's.