IPQ806x NSS Drivers

It seems that did the trick with esp6.c. However, now I'm getting this:

gcc -o build_dir/nsinstall.o -c -std=c99 -g -g -O2 -I/home/dj/GITHUB/openwrt/staging_dir/host/include -I/home/dj/GITHUB/openwrt/staging_dir/hostpkg/include -I/home/dj/GITHUB/openwrt/staging_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/host/include -Wall -Wshadow -DNSS_NO_GCC48 -DXP_UNIX -DXP_UNIX -DDEBUG -UNDEBUG -D_DEFAULT_SOURCE -D_BSD_SOURCE -D_POSIX_SOURCE -DSQL_MEASURE_USE_TEMP_DIR -D_REENTRANT -DDEBUG -UNDEBUG -D_DEFAULT_SOURCE -D_BSD_SOURCE -D_POSIX_SOURCE -DSQL_MEASURE_USE_TEMP_DIR -D_REENTRANT -DNSS_DISABLE_AVX2 -DNSS_NO_INIT_SUPPORT -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -I../../../dist/build_dir/include -I../../../dist/public/coreconf -I../../../dist/private/coreconf  nsinstall.c
nsinstall.c: In function 'main':
nsinstall.c:201:5: warning: implicit declaration of function 'getopt' [-Wimplicit-function-declaration]
     while ((opt = getopt(argc, argv, "C:DdlL:Rm:o:g:t")) != EOF) {
     ^
nsinstall.c:203:20: error: 'optarg' undeclared (first use in this function)
    case 'C': cwd = optarg; break;
                    ^
nsinstall.c:203:20: note: each undeclared identifier is reported only once for each function it appears in
nsinstall.c:225:13: error: 'optind' undeclared (first use in this function)
     argc -= optind;
             ^
make[4]: *** [build_dir/nsinstall.o] Error 1
make[4]: Leaving directory `/home/dj/GITHUB/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/nss-3.55/nss/coreconf/nsinstall'
make[3]: *** [/home/dj/GITHUB/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/nss-3.55/.prepared_2991e914492e5b4eb538f2028ed631e4_6664517399ebbbc92a37c5bb081b5c53] Error 2
make[3]: Leaving directory `/home/dj/GITHUB/openwrt/feeds/packages/libs/nss'
time: package/feeds/packages/nss/compile#1.46#0.55#1.78
make[2]: *** [package/feeds/packages/nss/compile] Error 2
make[2]: Leaving directory `/home/dj/GITHUB/openwrt'
make[1]: *** [/home/dj/GITHUB/openwrt/staging_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/stamp/.package_compile] Error 2
make[1]: Leaving directory `/home/dj/GITHUB/openwrt'
make: *** [world] Error 2

Any idea?

In the meantime, I'll try to go back to the vanilla build, just to get the IPsec/VTI-related stuff out of the way.

Some good news, some bad and some questions :)

I managed to narrow down the issue to the feeds. I used to pull in all the feeds and hope the build process would pick out what it needs. That led to the error above, which is an expanded version of the one below:

make[3] -C feeds/packages/libs/nss compile
make -r world: build failed. Please re-run make with -j1 V=s or V=sc for a higher verbosity level to see what's going on
make: *** [world] Error 1
$

It seems that this was caused by the libreswan package.
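
For anyone hitting the same generic "build failed" message: the verbose log above names the failing target (package/feeds/packages/nss/compile), and that single target can be rebuilt with -j1 V=s as the message suggests. Below is a minimal shell sketch that pulls the target out of a saved log. The function name failed_package_target and the file name build.log are placeholders, not part of OpenWrt, and the pattern only covers the "make[N]: *** [target] Error N" format shown in this post.

```sh
# failed_package_target: hypothetical helper, not part of OpenWrt.
# Reads a build log on stdin and prints the first package/.../compile
# target that make reported as failed. Assumes the log format shown
# in this thread: make[2]: *** [package/.../compile] Error 2
failed_package_target() {
    sed -n 's|^make\[[0-9]*]: \*\*\* \[\(package/[^]]*/compile\)] Error.*|\1|p' |
        head -n 1
}

# Possible usage from the top of the OpenWrt tree (build.log is a placeholder):
#   make -j1 V=s 2>&1 | tee build.log
#   make "$(failed_package_target < build.log)" -j1 V=s
```

Rebuilding only the failing target keeps the log short enough to read, which helps when the real error is buried in a host tool such as nsinstall.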

The bad news is that once I had pulled in this package from the feed, the only way I could get rid of the issue was to start from scratch (probably for the better).

The next good news is that strongswan (default setup) compiled without issues.
Unfortunately once I added quagga for BGP routing into the mix, I ended up again with an error:

make[3] -C package/qca/qca-nss-ecm compile
make -r world: build failed. Please re-run make with -j1 V=s or V=sc for a higher verbosity level to see what's going on
make: *** [world] Error 1
$ 

I'll play with this a bit more in the coming days, but if anyone has any suggestions, please let me know.

As for questions:

  • Has anyone used OpenWrt as an IPsec site-to-site solution? If yes, can you point me to any documentation? I'm not looking for high IPsec performance, just trying to avoid an extra device on my network.
  • Additionally, does anyone have good steps to build an image with extra packages (like the previously mentioned IPsec, LuCI, etc.)? Somehow I can't seem to figure this part out.

Thanks in advance

Starting from a clone or a merge of my git repo or Ansuel’s:

make dirclean
./scripts/feeds update -a
./scripts/feeds install -a
./scripts/diffconfig.sh > diffconfig

Copy the below into your new diffconfig file (the below compiles). It gets you LuCI, working USB, NSS qdisc/wifi/routing offload, and nice statistics graphs. The # lines are intentional (e.g. the "# ... is not set" lines for wolfssl make sure that only OpenSSL is loaded). You can add custom programs here.


# Use "make defconfig" to expand this to a full .config
CONFIG_TARGET_ipq806x=y
CONFIG_TARGET_ipq806x_generic=y
CONFIG_TARGET_ipq806x_generic_DEVICE_netgear_r7800=y

# exfat is patented
CONFIG_BUILD_PATENTED=y

# NSS Drivers
CONFIG_PACKAGE_kmod-qca-nss-drv=y
CONFIG_PACKAGE_kmod-qca-nss-drv-qdisc=y
CONFIG_PACKAGE_kmod-qca-nss-ecm-standard=y
CONFIG_PACKAGE_kmod-qca-nss-gmac=y
CONFIG_PACKAGE_kmod-nss-ifb=y
CONFIG_PACKAGE_iptables-mod-physdev=y
CONFIG_PACKAGE_kmod-ipt-physdev=y

# Longer waiting for failsafe button push
CONFIG_IMAGEOPT=y
CONFIG_PREINITOPT=y
CONFIG_TARGET_PREINIT_TIMEOUT=5

# Busybox tweaks
CONFIG_BUSYBOX_CUSTOM=y
CONFIG_BUSYBOX_CONFIG_FEATURE_EDITING_SAVEHISTORY=y
CONFIG_BUSYBOX_CONFIG_FEATURE_EDITING_SAVE_ON_EXIT=y
CONFIG_BUSYBOX_CONFIG_FEATURE_LESS_FLAGS=y
CONFIG_BUSYBOX_CONFIG_FEATURE_LESS_REGEXP=y
CONFIG_BUSYBOX_CONFIG_FEATURE_LESS_WINCH=y

# Add-on programs
CONFIG_PACKAGE_irqbalance=y
CONFIG_DROPBEAR_ECC=y

# USB device mount & file systems support
CONFIG_PACKAGE_block-mount=y
CONFIG_PACKAGE_kmod-usb-storage=y
CONFIG_PACKAGE_kmod-fs-cifs=y
CONFIG_PACKAGE_kmod-fs-exfat=y
CONFIG_PACKAGE_libblkid=y
CONFIG_PACKAGE_kmod-fs-ext4=y
CONFIG_PACKAGE_kmod-fs-hfsplus=y
CONFIG_PACKAGE_kmod-fs-msdos=y
CONFIG_PACKAGE_kmod-fs-vfat=y
CONFIG_PACKAGE_ntfs-3g=y
CONFIG_PACKAGE_kmod-nls-cp1250=y
CONFIG_PACKAGE_kmod-nls-cp437=y
CONFIG_PACKAGE_kmod-nls-cp850=y
CONFIG_PACKAGE_kmod-nls-iso8859-1=y
CONFIG_PACKAGE_kmod-nls-iso8859-15=y
CONFIG_PACKAGE_kmod-nls-utf8=y

# IPv6 support
CONFIG_PACKAGE_6in4=y
CONFIG_PACKAGE_6to4=y
CONFIG_PACKAGE_6rd=y

# IPv6 NAT support (ip6tables NAT extensions, ipt-nat6 and nf-nat6 kmods)
CONFIG_PACKAGE_ip6tables-mod-nat=y

# WLAN/WPS support
CONFIG_PACKAGE_hostapd-utils=y
CONFIG_WPA_MSG_MIN_PRIORITY=4
CONFIG_PACKAGE_wpad-openssl=y
# CONFIG_PACKAGE_wpad-basic-wolfssl is not set
# CONFIG_PACKAGE_libustream-wolfssl is not set

# SSL certificates
CONFIG_PACKAGE_ca-certificates=y

# Luci (SSL from OpenSSL)
CONFIG_PACKAGE_luci-ssl-openssl=y
CONFIG_PACKAGE_luci-app-commands=y
CONFIG_PACKAGE_luci-app-sqm=y
CONFIG_PACKAGE_luci-app-dawn=y

# Luci statistics
CONFIG_PACKAGE_luci-app-statistics=y
CONFIG_PACKAGE_collectd-mod-conntrack=y
CONFIG_PACKAGE_collectd-mod-cpufreq=y
CONFIG_PACKAGE_collectd-mod-entropy=y
CONFIG_PACKAGE_collectd-mod-ping=y
CONFIG_PACKAGE_collectd-mod-sqm=y
CONFIG_PACKAGE_collectd-mod-thermal=y
CONFIG_PACKAGE_collectd-mod-uptime=y

# nlbwmon app
CONFIG_PACKAGE_luci-app-nlbwmon=y

Lastly (feel free to run make with more parallel jobs, e.g. make -j5 on a 4-processor system):

cp diffconfig .config
make defconfig
make
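
The last three commands, together with the "processors + 1" job count from the note above, can be wrapped in a small shell function. This is only a sketch: make_jobs and build_image are made-up names, and nproc (from GNU coreutils) is assumed to be available on the build host.

```sh
# make_jobs: hypothetical helper. Prints "processors + 1", the rule of
# thumb used above (4 processors -> make -j5).
# $1 = processor count (defaults to what nproc reports)
make_jobs() {
    n=${1:-$(nproc)}
    echo $((n + 1))
}

# build_image: hypothetical wrapper around the three commands above.
# Run it from the top of the OpenWrt tree; it stops at the first failure.
build_image() {
    cp diffconfig .config &&
        make defconfig &&
        make -j"$(make_jobs)"
}
```

If the parallel build fails, re-running with make -j1 V=s (as the build's own error message suggests) gives a readable log.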

@Ansuel: I tested your wifi offload patch and it seems solid. I only tested it in my office with a standalone R7800 and a file server. Everything works all right and the speed test was better than before, in line with @ACwifidude's results.

I would like to test it in my house, where I have an R7800 as main AP, a C2600 as wired AP, and two C7s as WDS repeaters. I also use the 802.11r/k/v protocols and I want to test whether everything works.

As for the C2600, is there someone here that successfully compiled the NSS version? If so, would it be possible to share the dts with @Ansuel so that he could upload it to his repo?

Thanks to all!

I can try to add the needed node for the C2600. Also, for the wifi offload testing it would be good to run top while speed testing (to verify the CPU load under high traffic).

Thanks! I'll test the top command tomorrow.

CPU usage with max wifi bandwidth testing:

With wifi offload patch (on average ~5-10% lower usage):


CPU:   0% usr   0% sys   0% nic  80% idle   0% io   0% irq  18% sirq
Load average: 0.27 0.30 0.13 3/103 3389

Without wifi offload patch:


CPU:   0% usr   0% sys   0% nic  69% idle   0% io   0% irq  29% sirq
Load average: 0.20 0.38 0.19 2/99 3406

This is my top output, comparing against a regular non-NSS build:

Non-NSS version
Download
CPU:   0% usr   0% sys   0% nic  29% idle   0% io   0% irq  69% sirq
Load average: 0.16 0.10 0.03 1/91 19010
Upload
CPU:   0% usr   0% sys   0% nic  43% idle   0% io   0% irq  55% sirq
Load average: 0.27 0.13 0.04 3/91 19084

NSS version
Download
CPU:   0% usr   0% sys   0% nic  57% idle   0% io   0% irq  41% sirq
Load average: 0.14 0.08 0.01 1/99 4590
Upload
CPU:   0% usr   0% sys   0% nic  70% idle   0% io   0% irq  28% sirq
Load average: 0.10 0.07 0.01 1/99 4590
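
To compare runs like the ones above without eyeballing them, the idle and sirq percentages can be pulled out of the "CPU:" line that BusyBox top prints. A minimal sketch follows; cpu_summary is a hypothetical helper name, and it assumes the exact line layout shown in these pastes.

```sh
# cpu_summary: hypothetical helper. Reads top output on stdin and, for
# every "CPU:" line, prints the idle and sirq percentages.
# Assumes the layout shown above: "CPU:  0% usr ... 29% idle ... 69% sirq"
cpu_summary() {
    awk '/^CPU:/ {
        idle = ""; sirq = ""
        for (i = 2; i <= NF; i++) {
            # each label is preceded by its percentage field
            if ($i == "idle") idle = $(i - 1)
            if ($i == "sirq") sirq = $(i - 1)
        }
        sub(/%/, "", idle); sub(/%/, "", sirq)
        print "idle=" idle " sirq=" sirq
    }'
}

# Possible usage on the router while a speed test is running:
#   top -bn1 | cpu_summary
```

Logging one line per sample during a transfer makes it easier to compare builds on averages rather than on a single snapshot.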

Now I need to know whether this wifi offload method causes the packet loss or firewall problems that some people reported earlier.


Thanks,

I've checked out Ansuel's branch and will build my own version as well.

@quarky, in your opinion, can the tasklet implementation cause packet drops? In theory the tasklet implementation should give better performance... I was thinking of using it only for normal wifi operation and keeping fast_rx on the current implementation, which passes the packet directly to NSS. (My naive theory is that if something must be handled fast, it can't go through a tasklet and a queue.)

Also, in an old QSDK 8.0 patch I noticed that they align the packet when NSS offload is enabled. This was absent from your old patch. Could that cause problems and packet drops?

I did not notice any dropped packets during my tests. I have to admit, though, that my tests were simplistic, without much empirical measurement.

The reason I changed to a tasklet implementation in the mac80211 rx side is that for this path it looks like this:

WiFi client -> ath10k -> mac80211 -> NSS driver -> (DMA) -> NSS firmware

As the hand-off from the NSS driver to the NSS firmware is done via DMA and interrupts, I thought it might become a bottleneck. From what I remember, the tasklet queue grew to a backlog of more than 100 SKBs during my iperf3 tests, so I thought using the tasklet would help the rx side.

Packet alignment should already have been done when the SKBs are created by ath10k (for the rx path), at least that's what I remember, and by the NSS driver when it receives network packets from the firmware. I think the Linux kernel also creates page-aligned SKBs, as that makes DMA more efficient.

How did you check and confirm that packets are dropped? Which path sees dropped packets? rx or tx or both?

I have also only done very basic testing... Some months ago a user reported packet drops, and disabling wifi offload fixed the problem.

@ACwifidude, thanks for the detailed write-up about your build process! That is exactly what I was doing.
I changed the feed part to pull only individual packages from the feeds once I found out that libreswan (for IPsec) and quagga (for BGP routing on top of it) break the build.

To be more specific on the question about the image build process:
How do you get the R7800-20201012-MasterNSS-sysupgrade.bin file?

I'm just wondering how I get all the ipks into one file for the upgrade, so I don't need to pull stuff from the repo.

Thanks again!

@ACwifidude and @Ansuel, I measured similar numbers on my laptop with an Intel 9260 card.
The actual file transfer speed over wifi seems a bit faster as well.

Now, what is more impressive for me, is the fact that with the NSS build I get around 940-960Mb/s up/down (pretty much line speed) on my wired desktop for a 1Gb/s fiber connection with 2-3% CPU utilization.
I was unable to go over 850-880Mb/s on the standard build, even with all the tweaks I could think of.

That said, the last build from @ACwifidude is pretty stable; it was up for 2+ days. I just noticed today that the uptime was back down to a few hours, but I don't remember doing any reboot.

Whatever you have in your OpenWrt folder plus the .config file will be built into your bin file by the make command. I have a 3-processor x86 Linux computer, so once everything is set this creates the bin folder and all the files in ~30 minutes (never timed it):

make -j4

Definitely keep this thread posted on any patch issues or particular package issues so that the developers can troubleshoot. They have done great work to add features and keep abreast of changes to master.

This is where the .bin file is when make is done:

cd bin/targets/ipq806x/generic

I test that my firmware boots, and once everything appears normal I rename it from the default name and push it to GitHub. To rename the file you can manually use a command like mv *squashfs-sysupgrade.bin R7800-20201012-MasterNSS-sysupgrade.bin, or you can use a nice build script like hnyman's that builds the filename.
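
The following is not the build script mentioned above, just a minimal shell sketch of the manual rename with the date filled in automatically. The function names image_name and rename_image are hypothetical, and the R7800-YYYYMMDD-MasterNSS pattern is simply copied from the example file name.

```sh
# image_name: hypothetical helper. Prints a date-stamped file name in the
# R7800-YYYYMMDD-MasterNSS-sysupgrade.bin pattern used above.
# $1 = date as YYYYMMDD (defaults to today)
image_name() {
    printf 'R7800-%s-MasterNSS-sysupgrade.bin\n' "${1:-$(date +%Y%m%d)}"
}

# rename_image: hypothetical helper. Renames the *squashfs-sysupgrade.bin
# in directory $1, and refuses to act unless exactly one file matches.
rename_image() {
    dir=$1
    set -- "$dir"/*squashfs-sysupgrade.bin
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one sysupgrade image in $dir" >&2
        return 1
    fi
    mv -- "$1" "$dir/$(image_name)"
}

# Possible usage from the top of the OpenWrt tree:
#   rename_image bin/targets/ipq806x/generic
```

Checking for exactly one match avoids silently renaming the wrong file when images for several devices end up in the same output directory.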


@rog @ACwifidude I pushed some new patches... do you want to try them?


Anyway, https://www.right.com.cn/forum/thread-258118-1-1.html is a forum where I found a newer NSS firmware some time ago... for some reason they also have the QSDK 12 source (no NSS firmware).
It would be very good if someone who knows Chinese could ask whether they can give us some firmware :)


Very nice wifi performance (I mean, low CPU) with your files, @Ansuel!
Today I'll build with the new patches. What did you change? What should we try?
Is the firmware you found newer than the one we are using today?

It would be nice to have a "version number" for your development, just to be sure what we are working on :)

Is there a way to know what the various stats files in /sys/kernel/debug/qca-nss-drv/stats represent?

Is virt_if about wifi? If so, what is the wifi file for?

thanks

With my 2x2 client I could not see any relevant impact on performance or CPU usage.
With my new r14715+56 (lol for the +56) or the "old" driver, I get over 500Mbps and around 30% CPU usage by softirqd.

Is this of any help?

root@RUTTO:~# cat /sys/kernel/debug/qca-nss-drv/stats/virt_if
if_num 30 stats start:

rx_packets = 1308327
rx_bytes = 1900538942
rx_dropped = 1423
tx_packets = 294097
tx_bytes = 161306086
tx_enqueue_failed = 0
shaper_enqueue_failed = 0
ocm_alloc_failed = 0
if_num 30 stats end:

if_num 32 stats start:

rx_packets = 6784
rx_bytes = 664816
rx_dropped = 0
tx_packets = 8126
tx_bytes = 1083059
tx_enqueue_failed = 0
shaper_enqueue_failed = 0
ocm_alloc_failed = 0
if_num 32 stats end:

if_num 34 stats start:

rx_packets = 0
rx_bytes = 0
rx_dropped = 0
tx_packets = 0
tx_bytes = 0
tx_enqueue_failed = 0
shaper_enqueue_failed = 0
ocm_alloc_failed = 0
if_num 34 stats end:

if_num 36 stats start:

rx_packets = 0
rx_bytes = 0
rx_dropped = 0
tx_packets = 0
tx_bytes = 0
tx_enqueue_failed = 0
shaper_enqueue_failed = 0
ocm_alloc_failed = 0
if_num 36 stats end:

if_num 38 stats start:

rx_packets = 31211
rx_bytes = 37151783
rx_dropped = 0
tx_packets = 6287
tx_bytes = 1361423
tx_enqueue_failed = 0
shaper_enqueue_failed = 0
ocm_alloc_failed = 0
if_num 38 stats end:

base node stats begin (shown on if_num 38):

active_interfaces = 10
ocm_alloc_failed = 0
ddr_alloc_failed = 0
base node stats end.
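
Since the question in this thread is whether packets get dropped, the counters above can be turned into a drop percentage per interface. A minimal sketch, assuming the "name = value" layout shown in this paste; virt_if_drop_rate is a hypothetical helper name. For the numbers above, if_num 30 works out to roughly 0.11% of rx_packets dropped.

```sh
# virt_if_drop_rate: hypothetical helper. Reads the virt_if stats text on
# stdin and prints rx_dropped as a percentage of rx_packets per if_num.
# Assumes the "if_num N stats start/end" and "name = value" layout above.
virt_if_drop_rate() {
    awk '
        /^if_num [0-9]+ stats start/ { ifn = $2; pk = 0; dr = 0 }
        $1 == "rx_packets" { pk = $3 }
        $1 == "rx_dropped" { dr = $3 }
        /^if_num [0-9]+ stats end/ {
            # avoid dividing by zero on idle interfaces
            pct = (pk > 0) ? 100 * dr / pk : 0
            printf "if_num %s: rx_packets=%s rx_dropped=%s (%.3f%%)\n", ifn, pk, dr, pct
        }
    '
}

# Possible usage on the router:
#   virt_if_drop_rate < /sys/kernel/debug/qca-nss-drv/stats/virt_if
```

Taking two readings a known interval apart and comparing them would show whether rx_dropped is still growing under load, which a single snapshot cannot tell.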

From a wifi client I can ping both other wifi clients and wired clients, so it seems good.

I'm still on the non-CT drivers; is that a problem?