Qualcommax NSS Build

With this new build I'm getting quite good speeds - with nss-drv, nss-ecm and nss-crypto (& cryptoapi).

Very happy with the results.

The repo is synced with OpenWrt master and includes the latest firmware, WLAN.HK.2.9.0.1-01837-QCAHKSWPL_SILICONZ-1.


@rmandrad, what CPU usage do you see in the LuCI web interface? I wonder if this is a bug or maybe related to the dashboard package; I'm using it for the first time.
Another thing I noticed is relatively high CPU usage (first core at 100%, the other three at 40-60%) during high-speed WLAN transfers/speed tests. I get almost 900 Mbps with my Wi-Fi 6 smartphone.
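
If it helps to cross-check what LuCI reports, BusyBox top can take a one-shot snapshot from the CLI (a quick sketch, nothing build-specific):

# one non-interactive iteration of BusyBox top: load average plus per-process CPU
top -b -n 1 | head -n 10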

Finally done.

Thanks for your support.



I don't see high usage - here's a fast.com test taking about 25% across all CPUs.

And /proc/interrupts actually shows a bias towards CPU3.

Note that I have neither packet steering (under Global network options) nor software/hardware offloading (under Firewall) enabled.
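
For anyone wanting to confirm those settings on their own box, they live in UCI (a read-only sketch, assuming the default section layout):

# packet steering (Global network options)
uci -q get network.globals.packet_steering

# software / hardware flow offloading (Firewall defaults)
uci -q get firewall.@defaults[0].flow_offloading
uci -q get firewall.@defaults[0].flow_offloading_hw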

I did an iperf test and hit a small issue during a local speed test to my server:
AX6 lan2 connects to the PC
AX6 lan3 connects to the server
Software flow offloading: ON

From the PC directly to the AX6:

D:\iperf>iperf3 -c 192.168.3.1
Connecting to host 192.168.3.1, port 5201
[  4] local 192.168.3.80 port 60830 connected to 192.168.3.1 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   113 MBytes   948 Mbits/sec
[  4]   1.00-2.00   sec   113 MBytes   946 Mbits/sec
[  4]   2.00-3.00   sec   113 MBytes   947 Mbits/sec
[  4]   3.00-4.00   sec   113 MBytes   946 Mbits/sec
[  4]   4.00-5.00   sec   113 MBytes   946 Mbits/sec
[  4]   5.00-6.00   sec   113 MBytes   946 Mbits/sec
[  4]   6.00-7.00   sec   113 MBytes   946 Mbits/sec
[  4]   7.00-8.00   sec   113 MBytes   945 Mbits/sec
[  4]   8.00-9.00   sec   113 MBytes   947 Mbits/sec
[  4]   9.00-10.00  sec   113 MBytes   946 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.10 GBytes   946 Mbits/sec                  sender
[  4]   0.00-10.00  sec  1.10 GBytes   946 Mbits/sec                  receiver

iperf Done.

From the PC to the server (this shows the speed drop):

D:\iperf>iperf3 -c 192.168.3.71
Connecting to host 192.168.3.71, port 5201
[  4] local 192.168.3.80 port 60929 connected to 192.168.3.71 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  81.2 MBytes   682 Mbits/sec
[  4]   1.00-2.00   sec  82.2 MBytes   689 Mbits/sec
[  4]   2.00-3.00   sec  79.2 MBytes   665 Mbits/sec
[  4]   3.00-4.00   sec  81.1 MBytes   681 Mbits/sec
[  4]   4.00-5.00   sec  88.2 MBytes   740 Mbits/sec
[  4]   5.00-6.00   sec  86.5 MBytes   726 Mbits/sec
[  4]   6.00-7.00   sec  82.6 MBytes   692 Mbits/sec
[  4]   7.00-8.00   sec  86.1 MBytes   723 Mbits/sec
[  4]   8.00-9.00   sec  84.5 MBytes   709 Mbits/sec
[  4]   9.00-10.00  sec  75.6 MBytes   635 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   828 MBytes   694 Mbits/sec                  sender
[  4]   0.00-10.00  sec   828 MBytes   694 Mbits/sec                  receiver

iperf Done.

From the server to the AX6:

root@nas[~]# iperf3 -c 192.168.3.1
Connecting to host 192.168.3.1, port 5201
[  5] local 192.168.3.71 port 57846 connected to 192.168.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   114 MBytes   953 Mbits/sec    0    339 KBytes       
[  5]   1.00-2.00   sec   112 MBytes   936 Mbits/sec    0    404 KBytes       
[  5]   2.00-3.00   sec   113 MBytes   948 Mbits/sec    0    781 KBytes       
[  5]   3.00-4.00   sec   110 MBytes   923 Mbits/sec    0    781 KBytes       
[  5]   4.00-5.00   sec   111 MBytes   933 Mbits/sec    0    781 KBytes       
[  5]   5.00-6.00   sec   112 MBytes   944 Mbits/sec    0    820 KBytes       
[  5]   6.00-7.00   sec   111 MBytes   933 Mbits/sec    0    820 KBytes       
[  5]   7.00-8.00   sec   108 MBytes   902 Mbits/sec    0    820 KBytes       
[  5]   8.00-9.00   sec   111 MBytes   933 Mbits/sec    0    820 KBytes       
[  5]   9.00-10.00  sec   111 MBytes   933 Mbits/sec    0    863 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.08 GBytes   931 Mbits/sec                  receiver

iperf Done.

The whole point of having NSS, IMHO, is that it does HW offloading itself... so why are you using SW offloading? Please try with SW offload switched off, and with packet steering switched off as well.

Please reboot after switching off the above settings.
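
From the shell, something like this should do it (a sketch, assuming the default network/firewall section names):

# turn off software and hardware flow offloading
uci set firewall.@defaults[0].flow_offloading='0'
uci set firewall.@defaults[0].flow_offloading_hw='0'

# turn off packet steering
uci set network.globals.packet_steering='0'

uci commit
reboot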

I do see an issue, though, with upload speeds that are about 50% slower than on the base OpenWrt build...

As I said, there is still a lot to do, and I'm not really an expert - I just hacked around to get NSS working with the 6.1 OpenWrt build.

I have done a bit of searching above and no one has posted this:

I am running two QNAP 301Ws, using both 10GbE ports on both devices.

What kind of bandwidth are people getting with these builds in dumb AP mode?

On the stock build I can get close to 300 MB/s (so about 2500 Mbit/s) when just using the 10GbE ports as a switch.

Of course, once they hit those speeds, the load on core 0 goes to 100%.


So, do these NSS builds let people go over that mark?

The reason I turned on SW offload: on the snapshot version it can reach 980 Mbps local network speed (packet steering always off). If I turn SW offload off, the speed is around 700 Mbps.

I already turned SW offload off; still the same - can't reach 1 Gbps on the local network.
If you check the iperf tests:
PC -> router (lan2): ~1 Gbps
server -> router (lan3): ~1 Gbps
PC -> server via lan2-lan3: around 600 Mbps

Next, if I have more time, I will check what packages I'm missing.
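
A quick way to compare is to list what NSS bits a build actually ships (package and module names differ between these builds, so treat this as a sketch):

# NSS-related packages installed on the router
opkg list-installed | grep -i nss

# NSS kernel modules currently loaded
lsmod | grep -i nss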

I will use this version for the next couple of weeks.

I tested ISP speed with SQM on - no issue with upload speed.

Noted well, no need to worry; I know the risks of switching to OpenWrt.
Thanks for your work :+1:


I get close to 2500 Mbit/s using the QNAP as a server and the Dynalink as a client:

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   274 MBytes  2.29 Gbits/sec
[  5]   1.00-2.00   sec   281 MBytes  2.35 Gbits/sec
[  5]   2.00-3.00   sec   281 MBytes  2.35 Gbits/sec
[  5]   3.00-4.00   sec   279 MBytes  2.34 Gbits/sec
[  5]   4.00-5.00   sec   281 MBytes  2.35 Gbits/sec
[  5]   5.00-6.00   sec   279 MBytes  2.34 Gbits/sec
[  5]   6.00-7.00   sec   280 MBytes  2.35 Gbits/sec
[  5]   7.00-8.00   sec   280 MBytes  2.35 Gbits/sec
[  5]   8.00-9.00   sec   280 MBytes  2.35 Gbits/sec
[  5]   9.00-10.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  10.00-11.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  11.00-12.00  sec   280 MBytes  2.35 Gbits/sec
[  5]  12.00-13.00  sec   280 MBytes  2.35 Gbits/sec
[  5]  13.00-14.00  sec   280 MBytes  2.35 Gbits/sec
[  5]  14.00-15.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  15.00-16.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  16.00-17.00  sec   280 MBytes  2.35 Gbits/sec
[  5]  17.00-18.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  18.00-19.00  sec   281 MBytes  2.35 Gbits/sec
[  5]  19.00-20.00  sec   280 MBytes  2.35 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.00  sec  5.47 GBytes  2.35 Gbits/sec  692             sender
[  5]   0.00-20.00  sec  5.47 GBytes  2.35 Gbits/sec                  receiver

The CPUs hardly move, with most of the load on CPU1 and CPU0 (16%)... note that at the same time both the QNAP and the Dynalink are serving clients.
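
For reference, the commands behind a run like this are plain iperf3 (the 20-second duration matches the output above; <qnap-ip> is a placeholder for whatever address the server side has):

# on the QNAP (server side)
iperf3 -s

# on the Dynalink (client side)
iperf3 -c <qnap-ip> -t 20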


I have added a number of things to rc.local, some taken from tips I got from the NSS ipq806x community forum:

# advertise 2500 Mbit on the 10g-2 port
ethtool -s 10g-2 advertise 18000000E102C

sysctl -w net.ipv4.tcp_rmem='65536 262144 8388608'
sysctl -w net.ipv4.tcp_wmem='65536 262144 8388608'
sysctl -w net.ipv4.tcp_mem='65536 262144 8388608'
#sysctl -w net.ipv4.tcp_window_scaling=3
sysctl -w net.ipv4.tcp_low_latency=1
sysctl -w net.ipv4.tcp_sack=1
sysctl -w net.ipv4.tcp_dsack=1
sysctl -w net.netfilter.nf_conntrack_max=8192
sysctl -w net.core.somaxconn=8192
sysctl -w net.core.optmem_max=81920
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_max_tw_buckets=262144

# depends on patch being applied - https://github.com/cloudflare/linux/blob/master/patches/0014-add-a-sysctl-to-enable-disable-tcp_collapse-logic.patch
# (guarded so rc.local doesn't error on kernels without the patch)
[ -f /proc/sys/net/ipv4/tcp_collapse_max_bytes ] && sysctl -w net.ipv4.tcp_collapse_max_bytes=2048

echo 1689600000 > /proc/sys/dev/nss/clock/current_freq

echo 2048 > /proc/sys/dev/nss/n2hcfg/n2h_queue_limit_core0
echo 2048 > /proc/sys/dev/nss/n2hcfg/n2h_queue_limit_core1
echo 8704 > /proc/sys/dev/nss/n2hcfg/n2h_high_water_core0
echo 8704 > /proc/sys/dev/nss/n2hcfg/n2h_high_water_core1
echo 4352 > /proc/sys/dev/nss/n2hcfg/n2h_low_water_core0
echo 4352 > /proc/sys/dev/nss/n2hcfg/n2h_low_water_core1
echo 8704 > /proc/sys/dev/nss/n2hcfg/n2h_empty_pool_buf_core0
echo 8704 > /proc/sys/dev/nss/n2hcfg/n2h_empty_pool_buf_core1
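
To sanity-check that these took effect after a reboot (paths and the 10g-2 interface name as used above):

# confirm the negotiated link speed on the 2.5G-advertised port
ethtool 10g-2 | grep Speed

# confirm the NSS core clock
cat /proc/sys/dev/nss/clock/current_freq

# spot-check one of the sysctls
sysctl net.ipv4.tcp_rmem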


Edit: Oh, I see, you are at 2.5 Gbit... I wonder what people connecting at 10 Gbit are getting.

That's pretty much what I get using the standard OpenWrt build.

Here is a dump from my workstation (Windows 10) to the 301W running regular GitHub OpenWrt (not the NSS build), direct 10 Gbit connection:

[  4]   0.00-1.00   sec   268 MBytes  2.25 Gbits/sec
[  4]   1.00-2.00   sec   272 MBytes  2.28 Gbits/sec
[  4]   2.00-3.00   sec   272 MBytes  2.29 Gbits/sec
[  4]   3.00-4.00   sec   269 MBytes  2.25 Gbits/sec
[  4]   4.00-5.00   sec   272 MBytes  2.28 Gbits/sec
[  4]   5.00-6.00   sec   274 MBytes  2.30 Gbits/sec
[  4]   6.00-7.00   sec   266 MBytes  2.24 Gbits/sec
[  4]   7.00-8.00   sec   270 MBytes  2.27 Gbits/sec
[  4]   8.00-9.00   sec   270 MBytes  2.27 Gbits/sec
[  4]   9.00-10.00  sec   268 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  2.64 GBytes  2.27 Gbits/sec                  sender
[  4]   0.00-10.00  sec  2.64 GBytes  2.27 Gbits/sec                  receiver

Yeah, I also wonder about that... unfortunately I don't have any other devices (apart from the QNAP) with a 10 Gbit port.

I have done some speed tests with iperf.
NSS seems to work for TCP (941 Mbit/s) with low CPU usage.
But on UDP I only got 770 Mbit/s, with one core at 100% CPU usage.
Can someone reproduce this?

UDP seems to perform better in my case (a Windows box connected via Wi-Fi to the QNAP, which has NSS offloading)... the CPU hardly moved.

Do you have software offloading and packet steering enabled? Try disabling both.

iperf3 udp

[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   129 KBytes  1.06 Mbits/sec  93
[  5]   1.00-2.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   2.00-3.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   3.00-4.00   sec   129 KBytes  1.06 Mbits/sec  93
[  5]   4.00-5.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   5.00-6.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   6.00-7.00   sec   129 KBytes  1.06 Mbits/sec  93
[  5]   7.00-8.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   8.00-9.00   sec   128 KBytes  1.05 Mbits/sec  92
[  5]   9.00-10.00  sec   128 KBytes  1.05 Mbits/sec  92
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  0.000 ms  0/923 (0%)  sender
[  5]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  0.116 ms  0/923 (0%)  receiver

iperf3 tcp

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  75.8 MBytes   635 Mbits/sec    0   3.54 MBytes
[  5]   1.00-2.00   sec  93.8 MBytes   786 Mbits/sec    0   3.80 MBytes
[  5]   2.00-3.00   sec  92.5 MBytes   776 Mbits/sec    0   3.99 MBytes
[  5]   3.00-4.00   sec  87.5 MBytes   734 Mbits/sec    0   3.99 MBytes
[  5]   4.00-5.00   sec  81.2 MBytes   682 Mbits/sec    0   3.99 MBytes
[  5]   5.00-6.00   sec  90.0 MBytes   755 Mbits/sec    0   3.99 MBytes
[  5]   6.00-7.00   sec  92.5 MBytes   776 Mbits/sec    0   3.99 MBytes
[  5]   7.00-8.00   sec  82.5 MBytes   692 Mbits/sec    0   3.99 MBytes
[  5]   8.00-9.00   sec  80.0 MBytes   671 Mbits/sec    0   3.99 MBytes
[  5]   9.00-10.00  sec  80.0 MBytes   671 Mbits/sec    0   3.99 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   856 MBytes   718 Mbits/sec    0             sender
[  5]   0.00-10.08  sec   855 MBytes   712 Mbits/sec                  receiver
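
One thing to watch with the UDP run above: iperf3 caps UDP at 1 Mbit/s unless you pass -b, which is exactly the ~1.05 Mbit/s shown, so it says nothing about link throughput. To actually load the link, something like this (where <server-ip> is a placeholder for the box running iperf3 -s):

# UDP at unconstrained rate; -b 0 removes iperf3's default 1 Mbit/s UDP cap
iperf3 -c <server-ip> -u -b 0 -t 10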

I had a couple of hours to test:

With the @AgustinLorenzo build, on my 301W, using the 10GbE ports (connected at 10GbE):

[  6]   4.00-4.44   sec  24.2 MBytes   468 Mbits/sec
[  8]   4.00-4.44   sec  24.2 MBytes   468 Mbits/sec
[ 10]   4.00-4.44   sec  24.2 MBytes   468 Mbits/sec
[ 12]   4.00-4.44   sec  24.2 MBytes   468 Mbits/sec
[ 14]   4.00-4.44   sec  24.4 MBytes   470 Mbits/sec
[ 16]   4.00-4.44   sec  24.4 MBytes   470 Mbits/sec
[ 18]   4.00-4.44   sec  24.2 MBytes   468 Mbits/sec
[SUM]   4.00-4.44   sec   194 MBytes  3.75 Gbits/sec

That's substantially better than the stock snapshot.

I had one major issue though... configuration settings would not stick, and doing a sysupgrade would not replace the build either. It's probably because the menuconfig settings are not set correctly for the 301W? I was able to recover by forcing a boot from the 2nd partition (stock QNAP software) and essentially installing a regular snapshot as though I was doing it for the first time.

I am trying the @rmandrad build now. Just started the compile, so we'll see in about 30 minutes.


So the @rmandrad build seems to be working fine.

The NSS offloading is working as well... previously on this 10GbE-to-10GbE link I was getting about 2.5 Gbit:

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   308 MBytes  2.58 Gbits/sec
[  6]   0.00-1.00   sec   236 MBytes  1.98 Gbits/sec
[SUM]   0.00-1.00   sec   544 MBytes  4.56 Gbits/sec

It's possible I am reaching the limits of iperf3, and that I can now (finally) do 10GbE.
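
To rule out a single-stream iperf3 ceiling, more parallel streams usually tell the story (a sketch; <server-ip> is a placeholder for the other 301W or any 10GbE host):

# eight parallel TCP streams; the SUM line shows aggregate throughput
iperf3 -c <server-ip> -P 8 -t 20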

Without going into too much detail: this 301W is at the end of my network; the one the NAS is connected to (so I can test real-world speed) is one hop down.

That one is a 301W as well, but running the stock build for now.

I'll let this build run for a few days and see how things look before upgrading the other one.

And btw, yes, something is causing higher than usual load:

root@OPENWRT-UPSTAIRS:~# uptime
 22:46:41 up 8 min,  load average: 1.16, 0.92, 0.49

I'll know in a few hours if it's real or not, as I have temperature graphs to compare... if it's running hotter, something else is clearly going on.

Thank you @rmandrad.


Welp, shucks, I couldn't wait, so I just upgraded the other 301W up the chain... so now everything between my workstation and my NAS is running with NSS enabled...

The result... is impressive! I'm guessing that's about as fast as the drives in the NAS will go.

I disabled all the NSS modules except for ecm and drv... I have a 400/100 connection, so I can max that out without NSS enabled. This was all about getting as close as possible to 10GbE on the 301W's 10GbE ports.

Success!

...lets see how stable this ends up being.



Nice numbers; however, my understanding is that stock could do 8000+ Mbit/s, which would be close to 1 GB/s. I guess there is still some bottleneck out there...

Just curious how much speed you get using the snapshot version from
https://firmware-selector.openwrt.org/?version=SNAPSHOT&target=qualcommax%2Fipq807x&id=qnap_301w

On stock I get about 250-270 MB/s.

I had to roll it back, btw :frowning:

Crazy SSL errors... The one upstairs, which does no firewalling (dumb AP), was fine...

But the one downstairs is my main gateway (WAN in); as soon as I upgraded that one, I couldn't even access my locally hosted web apps :frowning:

I tried a couple of the fixes above, but they did not help.

Those are mechanical drives, btw... in RAID 5. They will max out around 500-600 MB/s, so the drives, not the network, are probably the bottleneck - the link itself can likely do full 10GbE (~1.25 GB/s). Too bad about the SSL errors... and thank god for Attended Sysupgrade... Both of us have to work tomorrow and need a working backbone here :wink:
