Adding OpenWrt support for Xiaomi AX3600 (Part 1)

There's a simple patch to build nss-clients without disabling bonding kmod:

diff --git a/pppoe/Makefile b/pppoe/Makefile
index 3ee05b7..f3fa818 100644
--- a/pppoe/Makefile
+++ b/pppoe/Makefile
@@ -6,5 +6,5 @@ obj-m += qca-nss-pppoe.o
 qca-nss-pppoe-objs := nss_connmgr_pppoe.o
 
 ifneq (,$(filter $(CONFIG_BONDING),y m))
-ccflags-y += -DBONDING_SUPPORT
+#ccflags-y += -DBONDING_SUPPORT
 endif

Basically a hack to skip the problematic bonding code in pppoe.
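
For anyone wondering why commenting out a single ccflags line is enough: the bonding-specific code in the PPPoE client is presumably wrapped in `#ifdef BONDING_SUPPORT`, so without the `-D` flag the preprocessor drops it before the compiler ever sees it. A minimal sketch of that mechanism follows; the function names are made up for illustration and are not taken from the qca-nss-clients source.

```c
/* Hypothetical stand-in for the bonding-aware path in the PPPoE client.
 * Only one of the two definitions survives preprocessing. */
#ifdef BONDING_SUPPORT
int pppoe_uses_bonding(void)
{
	return 1;	/* built with -DBONDING_SUPPORT */
}
#else
int pppoe_uses_bonding(void)
{
	return 0;	/* flag commented out: bonding path never compiled */
}
#endif

/* Describes which variant ended up in the build. */
const char *pppoe_build_variant(void)
{
	return pppoe_uses_bonding() ? "bonding-aware" : "bonding code skipped";
}
```

Compiled without `-DBONDING_SUPPORT`, `pppoe_uses_bonding()` returns 0, which mirrors what the Makefile hack above does to the module.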

Yeah, but that fixes just bonding. I'm really not in the mood to go through each config check to see what's gonna get compiled in and probably break everything.

Hi,

This is the actual fix for bonding (I ported it from an older QSDK with kernel 4.x). I tested it with a 5.10 LEDE/Chinese build a few weeks ago and it worked fine. This is yet more proof that Qualcomm barely tests anything.

Just save it in patches-5.10 as 605-net-bonding-add-bond-get-id.patch:

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -247,6 +247,7 @@ static const struct flow_dissector_key f
 };

 static struct flow_dissector flow_keys_bonding __read_mostly;
+static unsigned long bond_id_mask = 0xFFFFFFF0;

 /*-------------------------- Forward declarations ---------------------------*/

@@ -301,6 +302,20 @@ netdev_tx_t bond_dev_queue_xmit(struct b
 	return dev_queue_xmit(skb);
 }

+int bond_get_id(struct net_device *bond_dev)
+{
+    struct bonding *bond;
+
+    if (!((bond_dev->priv_flags & IFF_BONDING) &&
+          (bond_dev->flags & IFF_MASTER)))
+        return -EINVAL;
+
+    bond = netdev_priv(bond_dev);
+
+    return bond->id;
+}
+EXPORT_SYMBOL(bond_get_id);
+
 /*---------------------------------- VLAN -----------------------------------*/

 /* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
@@ -4822,6 +4837,9 @@ static void bond_destructor(struct net_d
 	struct bonding *bond = netdev_priv(bond_dev);
 	if (bond->wq)
 		destroy_workqueue(bond->wq);
+
+    if (bond->id != (~0U))
+        clear_bit(bond->id, &bond_id_mask);
 }

 void bond_setup(struct net_device *bond_dev)
@@ -4936,7 +4954,7 @@ static int bond_check_params(struct bond
 	int bond_mode	= BOND_MODE_ROUNDROBIN;
 	int xmit_hashtype = BOND_XMIT_POLICY_LAYER2;
 	int lacp_fast = 0;
-	int tlb_dynamic_lb;
+    int tlb_dynamic_lb;

 	/* Convert string parameters. */
 	if (mode) {
@@ -5275,7 +5293,7 @@ static int bond_check_params(struct bond
 	params->peer_notif_delay = 0;
 	params->use_carrier = use_carrier;
 	params->lacp_fast = lacp_fast;
-	params->primary[0] = 0;
+    params->primary[0] = 0;
 	params->primary_reselect = primary_reselect_value;
 	params->fail_over_mac = fail_over_mac_value;
 	params->tx_queues = tx_queues;
@@ -5390,7 +5408,15 @@ int bond_create(struct net *net, const c
 	bond_work_init_all(bond);

 	rtnl_unlock();
-	return 0;
+
+    bond = netdev_priv(bond_dev);
+    bond->id = ~0U;
+    if (bond_id_mask != (~0UL)) {
+        bond->id = (u32)ffz(bond_id_mask);
+        set_bit(bond->id, &bond_id_mask);
+    }
+
+    return 0;
 }

 static int __net_init bond_net_init(struct net *net)
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -256,6 +256,7 @@ struct bonding {
 	/* protecting ipsec_list */
 	spinlock_t ipsec_lock;
 #endif /* CONFIG_XFRM_OFFLOAD */
+    u32      id;
 };

 #define bond_slave_get_rcu(dev) \
@@ -629,6 +629,7 @@ struct bond_net {
 
 int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond, struct slave *slave);
 netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev);
+int bond_get_id(struct net_device *bond_dev);
 int bond_create(struct net *net, const char *name);
 int bond_create_sysfs(struct bond_net *net);
 void bond_destroy_sysfs(struct bond_net *net);
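
To make the ID bookkeeping in the patch easier to follow, here is a small userspace sketch of the same idea: a bitmask where a set bit means "ID in use", the first zero bit is handed out on create, and the bit is cleared on destroy. It assumes a 32-bit word and uses a hand-written `first_zero_bit()` in place of the kernel's `ffz()`; the helper names are illustrative, not from the kernel. With the initial mask `0xFFFFFFF0`, only IDs 0 to 3 are available in this model. On a 64-bit kernel `unsigned long` has 64 bits, so the upper bits also start out clear, which this sketch ignores.

```c
#include <stdint.h>

/* Mirrors the patch's initial value: bits 4..31 start out "taken". */
static uint32_t bond_id_mask = 0xFFFFFFF0u;

/* Userspace stand-in for ffz(): index of the first zero bit.
 * Like ffz(), it must not be called when every bit is set. */
unsigned int first_zero_bit(uint32_t word)
{
	unsigned int bit = 0;

	while (word & 1u) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/* Returns the new ID, or ~0U when none is free (as in bond_create()). */
uint32_t bond_id_alloc(void)
{
	uint32_t id = ~0u;

	if (bond_id_mask != ~0u) {
		id = first_zero_bit(bond_id_mask);
		bond_id_mask |= (uint32_t)1 << id;
	}
	return id;
}

/* Releases an ID (as in bond_destructor()); ~0U means "never had one". */
void bond_id_free(uint32_t id)
{
	if (id != ~0u)
		bond_id_mask &= ~((uint32_t)1 << id);
}
```

Calling `bond_id_alloc()` four times yields 0, 1, 2, 3; a fifth call returns `~0U` until one of the IDs is released with `bond_id_free()`.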

I also needed to add the bonding kmod to the qca-nss-clients Makefile:

diff --git a/package/qca/nss/qca-nss-clients/Makefile b/package/qca/nss/qca-nss-clients/Makefile
index 11abb99a7..9c5168fb3 100644
--- a/package/qca/nss/qca-nss-clients/Makefile
+++ b/package/qca/nss/qca-nss-clients/Makefile
@@ -17,7 +17,7 @@ define KernelPackage/qca-nss-drv-pppoe
   CATEGORY:=Kernel modules
   SUBMENU:=Network Devices
   TITLE:=Kernel driver for NSS (connection manager) - PPPoE
-  DEPENDS:=@TARGET_ipq807x +kmod-qca-nss-drv +kmod-ppp +kmod-pppoe
+  DEPENDS:=@TARGET_ipq807x +kmod-qca-nss-drv +kmod-ppp +kmod-pppoe +kmod-bonding
   FILES:=$(PKG_BUILD_DIR)/pppoe/qca-nss-pppoe.ko
   AUTOLOAD:=$(call AutoLoad,51,qca-nss-pppoe)
 endef

Hi guys,

Just wanted to clarify a few things and share some potential new findings which might help us get closer to wrapping this up (I fantasize about that day tbh at this point, given all the fu*kery we have to deal with).

I built a LEDE/Chinese branch a few weeks ago, stripped all the Chinese crap and potentially nasty things, added all of @robimarko's incredible work with the patches, and after literally days of failing builds etc. (let's just say I included almost everything one would need on OpenWRT - the resulting image was around 48MB), I got a successful build.

I flashed it to mtd13 with ubiformat, because at some point (early January-February), I flashed the Chinese QSDK which messed up the partition table and made the second partition huge.

First thing to clarify: it is possible to dual-boot with stock Xiaomi INT on mtd12 and OpenWrt on mtd13. You can actually go back and forth between them with the nvram flags. sysupgrade will most likely destroy your router. I have not attempted flashing mtd12 or running sysupgrade, because I don't have serial connectivity to the router and it wasn't clear whether I could revert to the "stock" partition table without serial/tftp recovery. For now, until we get a stable build, I don't mind dual-booting and just ubiformat-ing and restoring the OpenWrt configuration manually after each "upgrade".

WireGuard worked: I was able to get around 390 Mbps max with Mullvad, but the CPU load is NOTICEABLE. All the cores were pinned at around 50-60% during a speedtest.
Everything else more or less worked (NAT, NAT6, mwan3, sqm/traffic shaping etc.).

This is where the fun begins though. In order to get NAT-ing and everything else operational, I had to enable ecm and/or nss. There are some weird issues with the Chinese forks and modprobe/insmod: I literally had to insmod each module by its full path, and I also had to create a symlink to the NSS firmware manually to match the .bin filename expected by the kernel.

I need IPv6 (my cheap ISP offers a /128 only, so I'm doing NAT6 and split-routing with mwan3). As soon as ECM is modprobe'd and some traffic flows across the network, in literally a few minutes, the router would crash and reboot.. in an eternal loop of hell. So there are other serious issues, not just the Wi-Fi leaks. As @robimarko mentioned multiple times in the past, this NSS junk is a nightmare and in an ideal world, we'd somehow need to get rid of all of it.

Just before one of the crashes, I was able to capture this..:

[  218.444578] wlan1-1: NSS TX failed with error[8]: NSS_TX_FAILURE_NOT_ENABLED
[  218.570220] wlan1-1: NSS TX failed with error[8]: NSS_TX_FAILURE_NOT_ENABLED
[  218.616791] wlan1-1: NSS TX failed with error[8]: NSS_TX_FAILURE_NOT_ENABLED
[  218.634884] wlan1-1: NSS TX failed with error[8]: NSS_TX_FAILURE_NOT_ENABLED
[  218.656867] wlan1-1: NSS TX failed with error[8]: NSS_TX_FAILURE_NOT_ENABLED
[  240.416755] detected buffer overflow in memcpy
[  240.416804] ------------[ cut here ]------------
[  240.420102] Kernel BUG at fortify_panic+0x20/0x24 [verbose debug info unavailable]
[  240.424872] Internal error: Oops - BUG: 0 [#1] SMP
[  240.432243] Modules linked in: shortcut_fe_drv ecm xt_FULLCONENAT nf_nat_amanda nf_conntrack_amanda cdc_mbim ath11k_ahb ath11k ath10k_pci ath10k_core ath wireguard rndis_host qmi_wwan nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet mac80211 lz4 libchacha20poly1305 libblake2s ipt_REJECT ebtable_nat ebtable_filter ebtable_broute chacha_neon cfg80211 cdc_subset cdc_ncm cdc_ether cdc_eem xt_u32 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_socket xt_recent xt_quota2 xt_quota xt_policy xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length2 xt_length xt_ipv4options xt_iprange xt_ipp2p xt_iface xt_hl xt_helper xt_hashlimit xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connlabel xt_connbytes xt_condition xt_comment xt_cluster xt_cgroup xt_bpf xt_addrtype xt_TRACE xt_TPROXY xt_TCPMSS xt_TARPIT xt_REDIRECT xt_PROTO xt_NFQUEUE xt_NFLOG xt_NETMAP xt_MASQUERADE xt_LOGMARK xt_LOG xt_LED xt_IPMARK xt_HL xt_FLOWOFFLOAD
[  240.432433]  xt_DSCP xt_DNETMAP xt_DHCPMAC xt_DELUDE xt_CT xt_CLASSIFY xt_CHECKSUM xt_CHAOS xt_ACCOUNT xfrm_interface vhci_hcd usbserial usbnet usbmon usbip_host usbip_core usbhid ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda ts_kmp ts_fsm ts_bm tcp_hybla tcp_bbr sch_mqprio sch_cake pptp ppp_mppe ppp_async poly1305_neon ntfs3 nlmon nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir nft_quota nft_queue nft_objref nft_numgen nft_nat nft_meta_bridge nft_masq nft_log nft_limit nft_hash nft_fwd_netdev nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_dup_netdev nft_ct nft_counter nft_chain_nat nfnetlink_queue nfnetlink_log nf_tproxy_ipv6 nf_tproxy_ipv4 nf_tables nf_socket_ipv6 nf_socket_ipv4 nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_rtsp nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_log_ipv4 nf_flow_table nf_dup_netdev nf_conntrack_tftp nf_conntrack_snmp
[  240.502371]  nf_conntrack_sip nf_conntrack_rtsp nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast nf_conncount mdio_gpio mdio_bitbang macvlan macremapper lz4_decompress lz4_compress libcurve25519_generic libchacha libblake2s_generic l2tp_ppp ipvlan iptable_raw iptable_nat iptable_mangle iptable_filter ipt_rpfilter ipt_ah ipt_ECN ipt_CLUSTERIP ip6table_raw ip6t_rpfilter ip_tables hid_generic ebtables ebt_vlan ebt_stp ebt_snat ebt_redirect ebt_pkttype ebt_nflog ebt_mark_m ebt_mark ebt_log ebt_limit ebt_ip6 ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebt_802_3 crc_ccitt compat_xtables compat cls_flower cdc_wdm cdc_acm asn1_decoder arptable_filter arpt_mangle arp_tables act_vlan fuse sch_teql sch_sfq sch_red sch_prio sch_pie sch_multiq sch_gred sch_fq sch_dsmark sch_codel em_text em_nbyte em_meta em_cmp act_simple act_police act_pedit act_ipt act_csum em_ipset cls_bpf act_bpf act_ctinfo act_connmark sch_tbf sch_ingress sch_htb
[  240.589182]  sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sg hid evdev gpio_fan hwmon i2c_gpio industrialio i2c_algo_bit qca_nss_pppoe pppoe pppox i2c_mux_gpio i2c_mux ledtrig_usbport trelay cryptodev xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_NPT ip6t_rt ip6t_mh ip6t_ipv6header ip6t_hbh ip6t_frag ip6t_eui64 ip6t_ah nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 nfsv4 pppoatm ppp_generic nfsd nfs nfs_ssc bonding ip6_gre ip_gre gre ifb nat46 l2tp_ip6 l2tp_ip l2tp_eth ip6_vti ip_vti sit sctp libcrc32c qca_nss_drv l2tp_netlink l2tp_core

I enabled SELinux, fortify, full ASLR / PIE and RELRO, and compiled with the latest toolchain. This is why the buffer overflow was caught. I wouldn't be surprised if the infamous "leak" was just memory overflowing..
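
For context on why a fortified build catches this: with fortify enabled, `memcpy` is wrapped in a check that compares the copy length against the destination size the compiler knows about, and the kernel calls `fortify_panic()` when the length is too large, which is the BUG in the log above. Below is a simplified userspace sketch of that check. `checked_memcpy` is a hypothetical helper that takes the destination size explicitly and returns an error instead of panicking; the real kernel wrapper derives the size at compile time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes only if dst (dst_size bytes) can hold them.
 * Returns 0 on success, -1 if the copy would overflow dst. */
int checked_memcpy(void *dst, size_t dst_size, const void *src, size_t n)
{
	if (n > dst_size)
		return -1;	/* a fortified kernel would call fortify_panic() here */

	memcpy(dst, src, n);
	return 0;
}
```

Without such a check the oversized copy would silently overwrite whatever sits after the destination buffer, which would be consistent with the guess that the "leak" is really memory corruption.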

I'm gonna try a build from @robimarko's latest branch tonight, with the same .config and will update this with my results.


I wish to somehow get the networking subsystem datasheet as this thing is really similar to the IPQ40xx, the 1G networking/switch is copy/paste from there.
What's different is the ethernet controller which now has 3 ports instead of the single 1G port in IPQ40xx.

People need to understand that the NSS is an entirely unmaintainable piece of junk, it will only get exponentially worse in 5.15 and later.
The offloading part will never be a part of mainline OpenWrt or upstream kernel, it's just not viable.
This is just a stop-gap measure until there is any kind of kernel networking driver and we can get rid of NSS.

Offloading is a great thing to have, but it's not something that we can maintain. There is a good reason why Qualcomm updates the kernel like once in 5 years or so: stuff breaks when everything is out of tree.


The issue is that the same platform and the same NSS will be used for 10G routers. We might be able to get away from NSS with 1gig ports, but it will be impossible on 10gig. The performance penalty will be so severe that no sane person would go with OpenWrt if we drop acceleration or offloading...
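
To put rough numbers on the 1G versus 10G concern, here is a back-of-the-envelope sketch of the worst-case packet rate at line rate and the CPU cycle budget per packet that follows from it. It assumes minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap on the wire. The CPU clock is a parameter; the 1 GHz used in the note below is a made-up round number, not the IPQ807x clock, and the function names are illustrative.

```c
/* Per-frame wire overhead: 8 bytes preamble/SFD + 12 bytes inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20.0

/* Frames per second a link can carry at line rate for a given frame size. */
double packets_per_second(double link_bps, double frame_bytes)
{
	return link_bps / ((frame_bytes + ETH_WIRE_OVERHEAD) * 8.0);
}

/* CPU cycles available per packet if a single core must keep up. */
double cycles_per_packet(double cpu_hz, double link_bps, double frame_bytes)
{
	return cpu_hz / packets_per_second(link_bps, frame_bytes);
}
```

With 64-byte frames, 1 Gbit/s works out to about 1.49 million packets per second and 10 Gbit/s to about 14.9 million. For a hypothetical 1 GHz core that is 672 cycles per packet at 1G but only about 67 at 10G, which gives a sense of why software-only forwarding gets much harder at the higher rate. Real traffic has larger average frames, so this is the pessimistic end.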

I first abort the boot process; you should then see this prompt in the serial console: "IPQ807x #"
Then I start the TFTP server and enter the commands according to the instructions from the link.

Trust me I would really like it if we can have all of the QSDK features, but the reality is that we are dealing with a black box without any kind of datasheet and expecting to somehow maintain it.

There is no viable way to do it: who in their right mind is gonna commit to keeping the POS that is NSS working in newer kernels?
I am gonna be blunt and say directly that unless NSS drastically changes how it works, it will never be a part of OpenWrt. It's always gonna be somebody's time hog, and they are gonna have to sink in tons of time to move from one stable OpenWrt release to the next.

I'm not even sure how 2.5 Gbit networking would behave on the AX9000 (without NSS offloading).
But I hear @robimarko saying that, in their existing form, the offloading drivers would never be merged into mainline OpenWrt nor upstreamed to the kernel.

I completely understand the development point of view. The question is: if we can't provide even remotely meaningful speeds on OpenWrt (without NSS), then what is the point? There will be no significantly faster CPUs anytime soon, and the ones we have are not able to do this without acceleration. So the question is: what can be done?

Having at least something that works without major hack from qcom would be a miracle for ipq807x honestly LOL

For example, look at how long ipq806x took. Look at ipq40xx, which still doesn't have a correct ethernet driver/switch.

The thing is, it doesn't make sense to add acceleration and such if even the most basic things don't work correctly.
And trying to support all these hacks would take time and priority away from the effort to correctly support basic features.

Also, for the new SoC and Wi-Fi, I agree that the main goal would be higher speed and performance, but lower power consumption and better Wi-Fi range and security are also important.

I'm just bluntly chipping in with an OT question...

Even though I own an AX3600, what SoC for an AC or AX router would you recommend if working mainline support is a desired feature?

QCA, it seems, definitely not. MTK? Anything else?

Currently, I don't think that there is any SoC with built-in AX radios except for the QCA products.
Everybody else is just reusing an older SoC and adding PCI-based cards (only MediaTek and QCA PCI cards have AP mode).

So not really the best time for all of us it seems.

Where does this offloading functionality live? Is it part of the Wi-Fi card, the Ethernet, or the actual SoC? So if I just combined any application processor that has a sufficient number of PCIe lanes with Wi-Fi and Ethernet, would it come with offloading or not?

Sorry for the annoying questions, but not happy with stock mi fw, looking for an alternative...

@robimarko Do you think the ax3600 can do gigabit Ethernet without NSS/ECM?
If yes it would still be a good choice!

The famous Belkin RT3200 router is basically fully supported, but the Wi-Fi signal and coverage are really, really bad, in particular if you compare it to the AX6/AX3600.

The offloading is done in the 2 Ubi32-based cores, which run custom firmware called NSS firmware; the whole mess of drivers interfaces with that firmware for offloading.
So it's part of the SoC.

It should do 1Gbit for sure without offloading.
The RT3200 is just reusing the old MT7622 with an MT7915 PCI card, which is why it was easily supported.


I thought that the MediaTek would be better for coverage and open source, because the 2.4 GHz radio is 4x4, which can compensate for the usual poor performance of open-source drivers. 2x2 performance is probably acceptable only with OEM firmware.

Nope, unfortunately coverage is really poor.

Must be the internal antennas... That would be a very good reason for choosing the new Redmi AX6S (MediaTek + external antennas).