@robimarko I discovered the reason for my network packet corruption. It seems that without the tx path acceleration I cannot really access the internet over ethernet.
And of course CONFIG_NF_CONNTRACK_CHAIN_EVENTS=y has to be enabled for nss-ecm.
What is your opinion of these patches? I have no idea of their real source; I just got them from the Chinese fork along with a whole lot of other patches and then started bisecting.
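For reference, a quick way to double-check that the option actually ended up in the generated kernel config is to grep the buildroot's kernel .config (the exact build_dir path varies by target and kernel version, so treat this as a rough sketch):

grep NF_CONNTRACK_CHAIN_EVENTS build_dir/target-*/linux-*/linux-*/.config

If the symbol was picked up, it should print CONFIG_NF_CONNTRACK_CHAIN_EVENTS=y.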
Hi, do all devices connected to the gigabit ports share throughput like on the AX6000, or is the AX3600 switch connected in a different way that gives us more throughput?
Can you elaborate a bit more? In Robimarko's branch both tx and rx path acceleration work on the wired interfaces in both IPoE and PPPoE modes. I see 0% load when I run a speedtest in either the WAN or LAN domain.
I should have been clearer: I was talking about the IPv4 acceleration. In robi's branch only the rx packets are counted in the NSS stats (and I get my corruption issue); with the above patches both paths are counted.
You can check in the NSS debug stats:
root@OpenWrt:~# cat /sys/kernel/debug/qca-nss-drv/stats/ipv4
________________________________________________________________________________
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< IPV4 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
________________________________________________________________________________
ipv4_rx_pkts = 6612485 common
ipv4_rx_byts = 7565932223 common
ipv4_tx_pkts = 413882 common
ipv4_tx_byts = 517088193 common
ipv4_rx_queue[0]_drops = 0 drop
ipv4_rx_queue[1]_drops = 0 drop
ipv4_rx_queue[2]_drops = 0 drop
ipv4_rx_queue[3]_drops = 0 drop
This is how it looks for me when both directions are (what I assume to be) processed by the NSS.
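If anyone wants to confirm this live, a simple sketch (assuming busybox grep with -E support on the device) is to poll just the packet/byte counters while running a speed test and check that both rx and tx keep increasing:

root@OpenWrt:~# while true; do grep -E 'ipv4_(rx|tx)_(pkts|byts)' /sys/kernel/debug/qca-nss-drv/stats/ipv4; echo; sleep 1; done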
I only had this issue when I forgot to enable the ECM client in the config. I believe we tested this in pure IP as well as PPPoE mode, and in both cases it did work.
I built an image yesterday (latest commits in robimarko's AX3600-5.10-restart branch and no extra patches) which included miniupnpd, and I have the same problem:
# cat /sys/kernel/debug/qca-nss-drv/stats/ipv4
________________________________________________________________________________
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< IPV4 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
________________________________________________________________________________
ipv4_rx_pkts = 153215657 common
ipv4_rx_byts = 116206404624 common
ipv4_tx_pkts = 0 common
ipv4_tx_byts = 0 common
ipv4_rx_queue[0]_drops = 0 drop
ipv4_rx_queue[1]_drops = 0 drop
ipv4_rx_queue[2]_drops = 0 drop
ipv4_rx_queue[3]_drops = 0 drop
I also have CONFIG_PACKAGE_kmod-nf-conntrack-netlink=y in my build config.
I don't have the "Can't register nf notifier hook..." error in my kernel log, though.
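In case it helps anyone reproduce the check, something along these lines should show whether the message is present (adjust the grep pattern to the exact wording of the error):

# dmesg | grep -i 'nf notifier'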
Yesterday afternoon, with the IoT antenna not hidden, ath11 hidden, uptime 1 day 20h:
             total       used       free     shared  buff/cache  available
Mem:        375804     163476     153020      30780       59308     142708
Swap:            0          0          0
I then unhid the ath11, confirming it was visible again. No clients had connected to either of the AX's wifi networks since it was booted.
Fast forward 8 hours, uptime 2d 4h. During this period, free RAM had been stable, bouncing between 150 & 155 MB. I then connected my laptop to the ath11 & began running speed tests for a few minutes (the AX is acting just as an AP). I was getting just under 250 Mbps up & down, which is what I have provisioned, so no issues there. After that, for the rest of the hour, I was streaming YouTube & browsing.
Once connected, free memory began to drop and bounced between 110 & 130 MB. After an hour I disconnected from the AX and have left it idle since then. In the following 10 hours up to now, free RAM has been stable, bouncing between 125 & 130 MB, i.e. not once have I seen any hint of the idle memory grabbing.
Could it be as 'simple' as setting ath11 to hidden triggering something that stops the memory grabbing and then persists even when it is unhidden later?
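In case it helps to narrow this down, here is a minimal sketch of a logger one could leave running to track free memory over time (log path and interval are arbitrary):

# while true; do echo "$(date) $(free | grep Mem)" >> /tmp/memlog.txt; sleep 60; done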
@robimarko I applied only the 512MB profile patch to your branch, and it has been running for 2.5 days now. The leak is much slower; I am at 89 MB free mem (out of a total of 407 MB) now, no OOM yet. I don't think this solves the issue, but it improves it quite a bit. Maybe other patches also have to be applied to resolve this.