Adding OpenWrt support for Xiaomi AX3600 (Part 1)

@robimarko I found the cause of my network packet corruption. It seems that without tx path acceleration I cannot really access the internet over Ethernet.

After a bit of experimentation I noticed that these two patches are enough to enable acceleration of the nss tx path and fix my corruption issue: https://github.com/hgblob/openwrt/commit/2455c085eee1ecec43481cff490f7ec98fad512a

And, of course, CONFIG_NF_CONNTRACK_CHAIN_EVENTS=y has to be enabled in nss-ecm.

What is your opinion of these patches? I have no idea where they originally came from; I just pulled them from the Chinese fork along with a whole lot of other patches and then started bisecting.


Did you try a clean build from scratch (make clean && make all)?

Hi, do all devices connected to the gigabit ports share throughput like on the AX6000, or is the AX3600 switch connected differently, giving us more throughput?

Can you elaborate a bit more? In Robimarko's branch, both tx and rx path acceleration work on the wired interfaces in both IPoE and PPPoE modes. I see 0% load when I run a speed test in either the WAN or the LAN domain.

I should have been clearer: I was talking about IPv4 acceleration. In Robi's branch, the NSS stats count only the rx packets (and I get my corruption problem); with the above patches, both paths are counted.

You can check in nss debug:

root@OpenWrt:~# cat /sys/kernel/debug/qca-nss-drv/stats/ipv4

________________________________________________________________________________

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< IPV4 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
________________________________________________________________________________

        ipv4_rx_pkts           = 6612485         common
        ipv4_rx_byts           = 7565932223      common
        ipv4_tx_pkts           = 413882          common
        ipv4_tx_byts           = 517088193       common
        ipv4_rx_queue[0]_drops = 0               drop
        ipv4_rx_queue[1]_drops = 0               drop
        ipv4_rx_queue[2]_drops = 0               drop
        ipv4_rx_queue[3]_drops = 0               drop

This is how it looks for me when both directions are, as far as I can tell, being processed by the NSS.
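For anyone comparing builds, a small helper can pull a single counter out of that stats file. This is only a sketch: the function names `nss_counter` and `nss_tx_active` are made up for illustration, and it assumes the `name = value class` layout shown in the output above.

```shell
# Hypothetical helpers; they assume the "name = value class" layout
# shown in the stats output above.

# Print the value of one counter from a saved or live stats file.
# usage: nss_counter FILE COUNTER_NAME
nss_counter() {
    awk -v key="$2" '$1 == key && $2 == "=" { print $3; found = 1; exit }
                     END { exit !found }' "$1"
}

# Succeed only if ipv4_tx_pkts is present and non-zero.
# usage: nss_tx_active FILE
nss_tx_active() {
    tx=$(nss_counter "$1" ipv4_tx_pkts) || return 1
    [ "$tx" -gt 0 ]
}
```

On the router this could be called as `nss_tx_active /sys/kernel/debug/qca-nss-drv/stats/ipv4 && echo "tx counted"`. The counters are cumulative, so a non-zero value only shows that tx packets were counted at some point, not that they are being counted right now.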

This is Robi's branch, only the 512M patch is applied, nothing else:

root@XAX6:~# cat /sys/kernel/debug/qca-nss-drv/stats/ipv4

________________________________________________________________________________

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< IPV4 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
________________________________________________________________________________

        ipv4_rx_pkts           = 2718231         common
        ipv4_rx_byts           = 2919466237      common
        ipv4_tx_pkts           = 2425734         common
        ipv4_tx_byts           = 2664666702      common
        ipv4_rx_queue[0]_drops = 0               drop
        ipv4_rx_queue[1]_drops = 0               drop
        ipv4_rx_queue[2]_drops = 0               drop
        ipv4_rx_queue[3]_drops = 0               drop

root@XAX6:~# cat /sys/kernel/debug/qca-nss-drv/stats/pppoe

#pppoe base node stats start

        pppoe_rx_packets               = 1790011         common
        pppoe_rx_bytes                 = 2270994311      common
        pppoe_tx_packets               = 900723          common
        pppoe_tx_bytes                 = 658991810       common
        pppoe_rx_dropped[0]            = 0               drop
        pppoe_rx_dropped[1]            = 0               drop
        pppoe_rx_dropped[2]            = 0               drop
        pppoe_rx_dropped[3]            = 0               drop
        pppoe_short_pppoe_hdr_length   = 0               exception
        pppoe_short_packet_length      = 0               exception
        pppoe_wrong_version_or_type    = 0               exception
        pppoe_wrong_code               = 0               exception
        pppoe_unsupported_ppp_protocol = 0               exception
        pppoe_disabled_bridge_packet   = 0               exception

Acceleration works both on IP and on PPPoE, both TX and RX.

Dunno if PPPoE affects it or not; I just have a normal Ethernet connection upstream. I get an ECM error in the kernel log and only rx shows as accelerated.

I only had this issue when I forgot to enable the ECM client in the config. I believe we tested this with pure IP as well as PPPoE, and it worked in both cases.

For me the PPPoE client was always enabled. It might be some configuration issue, but this is the only way I can get wired Ethernet to work at all.

OK, just to be clear, for acceleration to work, you need to select:

  1. The nss-firmware-ipq8074 package under firmwares.
  2. The kmod-qca-nss-dp, kmod-qca-nss-drv, kmod-qca-nss-drv-pppoe and kmod-qca-ssdk-nohnat packages under Network Devices.
  3. And you need kmod-qca-nss-ecm under Network Support.

If this last one is missing, it can cause issues similar to what you described (only one direction is accelerated, or no acceleration at all).
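A quick way to double-check a build config against that list is sketched below. The function name `nss_missing_packages` is hypothetical, and it assumes each package name shows up in the build `.config` as `CONFIG_PACKAGE_<name>=y`; packages selected as `=m` would be reported as missing by this version.

```shell
# Hypothetical helper: print the packages from the list above that are
# not set to =y in an OpenWrt build .config.
# Assumes each package NAME appears as CONFIG_PACKAGE_NAME=y.
# usage: nss_missing_packages PATH_TO_DOT_CONFIG
nss_missing_packages() {
    for pkg in nss-firmware-ipq8074 kmod-qca-nss-dp kmod-qca-nss-drv \
               kmod-qca-nss-drv-pppoe kmod-qca-ssdk-nohnat kmod-qca-nss-ecm
    do
        # -x matches the whole line, -F treats the pattern as a fixed string
        grep -qxF "CONFIG_PACKAGE_${pkg}=y" "$1" || echo "$pkg"
    done
}
```

Run from the top of the build tree as `nss_missing_packages .config`; no output means all six entries were found.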


Yep, that's what I got.

My problem was that the last one (ECM) was throwing an error without the multi-chain conntrack patches I mentioned above.

@kirdes Thanks for the link, it's good to see that the 11.4 FW is kind of public.
It's still not clear whether we can simply redistribute it ourselves.

@hgblob NSS was for sure working in both TX and RX directions before.

That's what I remember too, but after a while my kernel log started looking like this:

[   14.332237] ECM init
[   14.332302] ECM database jhash random seed: 0x730fb2f5
[   14.334703] Can't register nf notifier hook....

and then it didn't work anymore.
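To spot this state without reading the whole log, something like the following could be used. It is a rough sketch: `ecm_init_status` is a made-up name, and the only strings it knows about are the two ECM messages quoted in this thread.

```shell
# Hypothetical helper: classify ECM start-up from kernel log text on stdin.
# It only matches the two messages quoted in this thread.
# Prints "failed", "ok" or "unknown".
ecm_init_status() {
    log=$(cat)
    case "$log" in
        *"Can't register nf notifier hook"*) echo failed ;;
        *"ECM init complete"*)               echo ok ;;
        *)                                   echo unknown ;;
    esac
}
```

Typical use would be `dmesg | ecm_init_status`. If the early boot messages have already rotated out of the kernel log buffer, the result is "unknown".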

Then it's got to be a patch or something you pulled in, as I don't have that error at all.


Just checking, do you have this enabled in your OpenWrt build config?

CONFIG_PACKAGE_kmod-nf-conntrack-netlink=y

I get the feeling this is the other module that registers the notifier.

No, I don't have that enabled.
BTW, is anybody having issues with the latest build?

I can't get RX traffic on the WAN port of the AX9000.

OK, I think this one also registers for the same notifier before the ECM does, which makes the ECM fail to start.

This dependency comes from miniupnpd, which I was running.
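To see whether that module is actually loaded on a running device, a check along these lines might help. It is a sketch only: `module_loaded` is a hypothetical name, and it assumes kmod-nf-conntrack-netlink shows up in `lsmod` as `nf_conntrack_netlink`.

```shell
# Hypothetical helper: succeed if a kernel module is listed in lsmod-style
# text on stdin (module name in the first column).
# usage: lsmod | module_loaded nf_conntrack_netlink
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}
```

Only the first column is compared, so a line for `nf_conntrack` that merely lists `nf_conntrack_netlink` among its users does not count as a match.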

I built an image yesterday (latest commits in robimarko's AX3600-5.10-restart branch and no extra patches) which included miniupnpd, and I have the same problem:

# cat /sys/kernel/debug/qca-nss-drv/stats/ipv4

________________________________________________________________________________

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< IPV4 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
________________________________________________________________________________

        ipv4_rx_pkts           = 153215657       common
        ipv4_rx_byts           = 116206404624    common
        ipv4_tx_pkts           = 0               common
        ipv4_tx_byts           = 0               common
        ipv4_rx_queue[0]_drops = 0               drop
        ipv4_rx_queue[1]_drops = 0               drop
        ipv4_rx_queue[2]_drops = 0               drop
        ipv4_rx_queue[3]_drops = 0               drop

I also have CONFIG_PACKAGE_kmod-nf-conntrack-netlink=y in my build config.

I don't have the Can't register nf notifier hook.... error in my kernel log though:

[   13.111973] ECM init
[   13.112038] ECM database jhash random seed: 0x8d85c0ed
[   13.114495] ECM init complete

Update:
I just did a new build without miniupnpd (and kmod-nf-conntrack-netlink) (but included some other packages) and it's still the same!
:man_shrugging:

Following up on this.

Yesterday afternoon, with the IOT antenna not hidden, ath11 hidden, uptime 1 day 20h

              total        used        free      shared  buff/cache   available
Mem:         375804      163476      153020       30780       59308      142708
Swap:             0           0           0

I then unhid the ath11, confirming it was visible again. No clients had connected to either of the AX's Wi-Fi networks since it was booted.

Fast forward 8 hours, uptime 2d 4h. During this period, free RAM had been stable, bouncing between 150 & 155 MB. I then connected my laptop to the ath11 & ran speed tests for a few minutes (the AX is acting just as an AP). I was getting just under 250 Mbps up & down, which is what I have provisioned, so no issues there. After that, for the rest of the hour, I was streaming YouTube & browsing.

Once connected, free memory began to drop and jumped between 110MB & 130MB. After an hour I disconnected from the AX and left it idle since then. In the following 10 hours to now, free RAM has been stable, jumping between 125 & 130 MB. i.e. not once have I seen any hint of the idle-memory grabbing.
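For anyone who wants to track this without watching `free` by hand, the numbers can be read straight from `/proc/meminfo`. The helper below is a sketch with a made-up name (`meminfo_kb`); it assumes the usual `Field:   value kB` layout of that file.

```shell
# Hypothetical helper: print one field (in kB) from a meminfo-style file.
# usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2; found = 1; exit }
                      END { exit !found }' "${2:-/proc/meminfo}"
}
```

A loop such as `while sleep 600; do echo "$(date) $(meminfo_kb MemFree) $(meminfo_kb MemAvailable)"; done` would print a sample every ten minutes, which makes a slow downward trend easier to see than occasional manual checks.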

Could it be as 'simple' as this: setting ath11 as hidden triggers something that persists even after it is unhidden later, and that stops the memory grabbing?


@robimarko I applied only the 512MB profile patch to your branch, and it has been running for 2.5 days now. The leak is much slower: I am at 89MB free memory (out of a total of 407MB) now, with no OOM yet. I don't think this solves the issue, but it improves things quite a bit. Maybe other patches also have to be applied to resolve this.
