IPQ806x NSS Drivers

@Evgeniy1 the test was done using iperf3 with one computer connected to the WAN port and another to a LAN port. No QoS. Everything else was left at the lede-17.01 defaults.

In any case, the NSS drivers are unstable when used with the standard Linux kernel. I’m getting random reboots. I'm now trying different versions of the drivers.

Is it possible to compile the NSS driver with support for hardware encryption only?

Those are very low speeds for an IPQ8064/8065 CPU.
Today I tested a Linksys EA8500 (same hardware as your R7800, but with a 1.4 GHz CPU instead of your 1.7 GHz)
with OpenWrt 18.06.1,
default settings + light tuning (net buffers).
Static address, NAT, 2 PCs (1st on LAN, 2nd on WAN),
port forwarding for the iperf port (to test in both directions).
A few simple firewall rules (for ssh, ipsec).

Tests: iperf and ftp. (The ftp test used passive mode.)

iperf (TCP, 2 streams, 250K buffers),
WAN-LAN (about the same for LAN-WAN):

Without software offloading,
default settings for CPU governor & power management:
540-560 Mbit/s (~70-80 %sirq)

With software offloading,
default settings for CPU governor & power management:
635-650 Mbit/s (with lower %sirq)

Without any software offloading,
optimized settings for CPU governor & power management:
870-900 Mbit/s (~50-65 %sirq)

And 900 Mbit/s isn't 100% load: the router can probably go faster (for example, in duplex).

ftp, 1 stream, without any software offloading, WAN<->LAN (both directions):
default settings for CPU governor & power management:
65-70 MB/s
optimized settings for CPU governor & power management:
95-103 MB/s
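
To compare the ftp figures (megabytes per second) with the iperf figures (megabits per second), a quick conversion helps. This is only a sketch: the function name `mbytes_to_mbits` is made up for illustration, and it assumes decimal megabytes and ignores protocol overhead.

```shell
#!/bin/sh
# Convert a whole-number MB/s figure to Mbit/s (1 byte = 8 bits).
# Hypothetical helper; assumes decimal megabytes.
mbytes_to_mbits() {
    echo $(( $1 * 8 ))
}

# The optimized ftp range from the post above.
mbytes_to_mbits 95
mbytes_to_mbits 103
```

So the optimized ftp result works out to roughly 760-824 Mbit/s of payload, a little below the 870-900 Mbit/s iperf range.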

Next test: routing disabled, WiFi AP only; speed is limited by the WAN link to the other router (100/100 Mbit/s).
Even under this light load, download to the WiFi client was ~90 in both cases, but upload was worse with the default CPU governor settings.

Optimized settings for CPU governor & power management:

  1. Settings for the ondemand governor:
    35 for /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
    (With up_threshold = 30 or 40 I did not detect any difference.)
    10 for /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

  2. Or set the performance governor, with no additional settings.
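
The two options above could be applied with something like the following. This is a rough sketch, not a tested recipe for any particular build. The function names and the `SYSFS_ROOT` variable are made up for illustration (the variable only exists so the functions can be tried against a scratch directory). The per-CPU `scaling_governor` files are the standard Linux cpufreq interface and are not named in the post itself.

```shell
#!/bin/sh
# Sketch: apply the CPU governor settings described in the post.
# SYSFS_ROOT is a hypothetical override; on a router it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Write a value to a file only if the file exists; return 1 otherwise.
write_if_present() {
    # $1 = path, $2 = value
    if [ ! -e "$1" ]; then
        echo "skip: $1 not found" >&2
        return 1
    fi
    echo "$2" > "$1"
}

# Select a governor on every CPU that exposes a cpufreq directory.
set_governor() {
    # $1 = governor name, e.g. ondemand or performance
    for f in "$SYSFS_ROOT"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        write_if_present "$f" "$1"
    done
}

# Option 1: ondemand governor with the values quoted in the post.
tune_ondemand() {
    d="$SYSFS_ROOT/devices/system/cpu/cpufreq/ondemand"
    set_governor ondemand
    write_if_present "$d/up_threshold" 35
    write_if_present "$d/sampling_down_factor" 10
}

# Option 2: performance governor, no further settings.
tune_performance() {
    set_governor performance
}
```

Calling `tune_ondemand` or `tune_performance` applies one of the two options. Values written to sysfs are lost on reboot, so the call would have to be repeated at boot, for example from /etc/rc.local.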

Conclusion:
The CPU governor & power management settings are
very important for this CPU on high-speed links.
The default CPU governor & power management settings in OpenWrt 18.06.1 (and 17.01.xx too) are very poor for any router based on IPQ80xx or any other CPU with advanced power & frequency management.
Yes, I know the same values are also used in many other "stock kernels", but they are not suited to routers & firewalls! (Nor to many other specialized devices.)

OT, but a great find. Please send an email with your findings to the mailing list to discuss this and see whether the defaults can be changed. Or contact the ipq80xx maintainer, @blogic, directly.

Hi @quarky, thank you for all your incredible efforts and everyone else's.

Edit: I was able to install your version with the following procedure:

$ tftp 192.168.1.1
> bin
> put lede-R7800-factory-r4018-largeubi.img

(then sysupgrade to "lede-R7800-sysupgrade-r4048-qca-nss.bin")

I did not have success with tftp flashing "lede-R7800-factory-r4048-qca-nss.img" directly. I can confirm that r4048-qca-nss crashes. In my case within minutes of playing with it. Let me know what you want to have tested.

My use-case is fast 3x3 wifi. My benchmarks so far:
Stock various 100 MB/s (crashes), OpenWRT 18.06.1 50 MB/s, LEDE r4018 90 MB/s, NSS r4048 27.0 MB/s. I've repeated the tests between OpenWRT and LEDE. I'm confused by the huge difference.

The crashes are caused by the NSS driver, which is required by the NSS Crypto driver. So we have to identify why the NSS driver is causing random spontaneous reboots.

If we get the NSS Driver stable, the Krait CPU tuning becomes less important as all routing tasks will be delegated to the NSS cores. The Krait CPU can then focus on other tasks like increasing VPN performance. That’s my main objective for trying to make the NSS cores work.

Do Netgear & Linksys use the same version of the NSS drivers?

@cruiser the firmware images I’ve posted are unstable. No point testing those. I’m trying other versions of the NSS drivers and will upload a more stable firmware image when, or if, I manage to cobble one together.

If you’re referring to the EA8500, I think so. My firmware can’t run on the EA8500 yet. It’s targeted at the R7800 at the moment.

Does the stock R7800 firmware use the same version of the NSS drivers?
With the old 3.4 kernel?

I’m using a newer version of the drivers targeted at Linux 4.4. Not using the drivers from stock firmware. The NSS firmware is also different and I presume newer. I extracted the NSS firmware from the Synology RT2600ac firmware, which is using Linux 4.4.

I understand. I just got excited about it and thought I had to give it a go :wink:

Stock firmware has full routing acceleration (i.e. for both wired and wireless), so getting 100MB/s is expected. Stock firmware crashing the router is not normal tho.

I presume your test with LEDE r4018 is using my firmware build? If so, that version has the QCA shortcut-fe driver installed, which is a fast-path driver that helps accelerate routing performance on top of the stock Linux driver. With it the R7800 can achieve close to wireline speed, which is why you can get 90MB/s.

My NSS build currently only accelerates wired routing. Wireless routing is still not accelerated by the NSS drivers, as the ath10k driver has to be patched to hook into the NSS drivers. This will be tough, as currently there are no available sources that I can use. I'll likely have to figure it out by reading the NSS driver sources.

I presume your test with LEDE r4018 is using my firmware build?

Yes, they are amazing. I went ahead and tested all of them and all the other versions I could get my hands on, from LEDE through to OpenWrt.

Your r4017 version is so far the best in terms of WiFi. It reaches stock performance with around 102 MB/s with my 3x3 adapter and without tweaks.

Stock LEDE 17.01.2 and 17.01.5 reach 90 MB/s with the performance governor. I'm happy to share all my results if someone is interested. OpenWrt 18.06.1 won't go above 50 MB/s, no matter how much I tweak.

Stock various 100 MB/s (crashes)

To be exact, connections fail with a decryption error. This is very visible with large file transfers, which makes the AP useless. I've reproduced the behavior on two different R7800s. I've also tested various stock versions and with Voxel's modified stock.

Another nice bug in the Linux/OpenWrt kernel:
under some conditions the kernel sets the CPU frequency correctly but sets the L2 cache / RAM frequency incorrectly.
The bug is in the code for the IPQ806x CPU.

For pure math the difference is very small, but for some memory-hungry tasks it is very big, up to 30-45%.
See

Did you also tweak rps / xps?

Can we use SQM / traffic shaping with NSS? Afaik not, so CPU speed / tuning / fixing is still important for avoiding bufferbloat. Also, doesn't the CPU control the NSS? If so, a slowly responding CPU adds latency.

Offtopic:
In my opinion QUIC as a whole was invented to work around poorly implemented or poorly tuned TCP stacks and all the latency in the network path. They should have just fixed the issue, which will still partly affect even QUIC, rather than trying to bypass it.

From what I've read, yes. It supports traffic shaping. It also comes with a traffic shaping driver as part of the NSS suite of drivers.

In any case, my priority now is to find out what's causing the random spontaneous reboots when the qca-nss-drv driver is loaded. Until this problem is solved, we cannot proceed further. Unfortunately, I haven't the slightest clue where to begin to look. Have been trying many changes ... nothing seems to solve the reboot problem.

I have not yet heard of rps/xps. Do you know of a good reference on how to tweak them? Unfortunately I didn't find any obvious results.
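
For anyone else looking: RPS and XPS are set per queue through sysfs, by writing a hexadecimal CPU bitmask to `rps_cpus` (receive) or `xps_cpus` (transmit). Below is a minimal sketch of how that might look. The function names and the `SYSFS_ROOT` override are made up for illustration, the interface name in the usage note is only an example, and whether any given mask actually helps on this hardware is untested here. Some drivers or kernels may not accept `xps_cpus` writes on single-queue devices.

```shell
#!/bin/sh
# Sketch: set RPS/XPS CPU masks for one network interface.
# SYSFS_ROOT is a hypothetical override; on a router it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print a hex bitmask covering CPUs 0..n-1 (n = 2 gives 3).
cpu_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Write a mask to each listed file that exists.
apply_mask() {
    # $1 = hex mask, remaining arguments = files to write
    mask="$1"
    shift
    for q in "$@"; do
        if [ -e "$q" ]; then
            echo "$mask" > "$q"
        fi
    done
}

# Receive Packet Steering: one rps_cpus file per rx queue.
set_rps() {
    # $1 = interface, $2 = hex mask
    apply_mask "$2" "$SYSFS_ROOT"/class/net/"$1"/queues/rx-*/rps_cpus
}

# Transmit Packet Steering: one xps_cpus file per tx queue.
set_xps() {
    # $1 = interface, $2 = hex mask
    apply_mask "$2" "$SYSFS_ROOT"/class/net/"$1"/queues/tx-*/xps_cpus
}
```

On a dual-core router, `set_rps eth0 "$(cpu_mask 2)"` would let both cores process packets received on eth0. Like other sysfs settings, the masks do not survive a reboot.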

I wonder what changed between LEDE and the merge back into OpenWrt. I tested a host of LEDE versions, all of which performed better than OpenWrt, at least based on how I performed my tests: (A) WiFi - R7800 - (B) cable; forwarding only, no NAT, no routing.

It will be interesting to see how NSS behaves and whether it's even reasonably doable.

I suspect the worse routing performance is due to the Linux kernel, likely the Spectre fix.

@quarky
You may try the NSS firmware from PandoraBox https://downloads.pangubox.com/pandorabox/19.01/targets/qualcomm/ipq806x/