In very basic terms, it keeps bridge events in sync between the NSS firmware and the kernel. It's a crucial part of NSS offloading, especially when dealing with VLANs.
Yes, it's related to the threaded NAPI patch. It looks like there's a race condition when accessing shared data structures (desc_ring, h2n_desc_ring, etc.) without proper spinlocks. That was the initial reason I just reverted the patch.
That's great to hear! So, your connection is fully offloaded?
What is the output of:
nss_stats rmnet_rx
EDIT:
logread won't show anything related to NAPI in threaded mode. If you don't see the threads in htop or ps, then the patch isn't applied or isn't being built properly.
➤ ps|grep napi
1395 root 0 SW [napi/nss-6]
1396 root 0 SW [napi/nss-7]
1397 root 0 SW [napi/nss-8]
1398 root 0 SW [napi/nss-9]
1399 root 0 SW [napi/nss-10]
1400 root 0 SW [napi/nss-11]
1401 root 0 SW [napi/nss-12]
1402 root 0 SW [napi/nss-13]
1403 root 0 SW [napi/nss-14]
1404 root 0 SW [napi/nss-15]
1429 root 0 SW [napi/nss-16]
1430 root 0 SW [napi/nss-17]
1431 root 0 SW [napi/nss-18]
1432 root 0 SW [napi/nss-19]
1433 root 0 SW [napi/nss-20]
1434 root 0 SW [napi/nss-21]
1435 root 0 SW [napi/nss-22]
1436 root 0 SW [napi/nss-23]
1437 root 0 SW [napi/nss-24]
EDIT 2:
To clarify, the nss driver package already uses NAPI (New API) for managing interrupts. The patch enables threaded NAPI, an operating mode that uses dedicated kernel threads rather than softirq context for NAPI processing. It's not required for proper functioning, and is experimental in the context of the NSS driver.
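For reference, mainline kernels (5.12+) expose a per-device sysfs knob for threaded NAPI; the NSS patch enables it inside the driver instead, which is why it shows up as the napi/nss-* kernel threads above. A minimal sketch of the generic knob, with eth0 as a placeholder interface:

➤ cat /sys/class/net/eth0/threaded     # 0 = softirq mode, 1 = threaded
➤ echo 1 > /sys/class/net/eth0/threaded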
There are so many factors at play here... It could be the channels used, interference, and most importantly your client device. Client devices will always have lower upload than download, since for upload the client is the one transmitting, and its hardware is not as powerful as a dedicated router's.
Neither
since you don't have control over closed-source firmware. If your client device is connected at a rate of 1200/1200 Mbps, what you posted is about as good as you can get in a perfect setup.
If you're operating at 160 MHz channel width, with little to no interference, no DFS scanning, AND have a client capable of connecting at 2400/2400 Mbps, then 1600/1500 Mbps is what you could expect.
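If you want to verify what rate a client has actually negotiated, iw can show the per-station bitrates (the interface name here is just an example; substitute your AP interface):

➤ iw dev phy0-ap0 station dump | grep bitrate

The tx/rx bitrate lines are PHY link rates; real-world throughput typically tops out at roughly two thirds of those, which is consistent with the numbers above.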
Please don't use irqbalance; it's been discussed ad nauseam in this thread... By default it makes no distinction about which IRQs are safe to move; it only sees that one "looks busy" and moves it. For example, the ce* IRQs DO NOT like being moved off CPU0; doing so will cause instability and crashes.
It is best to pin the IRQs to their respective cores and leave them alone. A lot of tuning has already been done to ensure an optimal spread across the CPUs.
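A minimal sketch of manual pinning, assuming you've first looked up the IRQ numbers in /proc/interrupts (the numbers and masks below are placeholders, not the real assignments for any particular board):

➤ grep ce /proc/interrupts              # find the actual IRQ numbers first
➤ echo 1 > /proc/irq/50/smp_affinity    # hex CPU bitmask: 1 = CPU0
➤ echo 2 > /proc/irq/51/smp_affinity    # 2 = CPU1, 4 = CPU2, 8 = CPU3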