Adding OpenWrt support for Xiaomi AX3600 (Part 1)

So after around two days of uptime on my AX3600 AP, I can say that the build r0-a10ca2b ipq807x-2022-10-17-1510 is working better than anything before.
Ping over wifi is much lower and also more consistent, and e.g. browsing the web seems snappier.
Free memory averages around 175MB.
Also no kernel errors or anything.

Any idea of open-source packages that can stress test a router, namely on the number of clients it can support? A quick cursory search on Google seems to indicate there are none. I know this is probably not the right place to ask, but bearing in mind the brain power in this domain here, I thought I'd ask.

Hello, the sysupgrade image from here bricked my AX3600. It does not boot anymore.

  • no dhcp request
  • the LED only shows orange

Soft-brick recovery like here does not seem possible.
What can I do?

Can you share a link?

I always test robimarko's builds.

I managed to capture the problem that is causing ath11k to crash (I don't know if this helps):

[55000.824234] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: fatal error received: 
[55000.824234] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.5.0.1-01208-QCAHKSWPL_SILICONZ-1
[55000.824234] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
[55000.824234] 
[55000.824234] ar_wal_peer.c:2462 Assertion is_graceful_to_handle failedparam0 :zero, param1 :zero, param2 :zero.
[55000.824234] Thread ID      : 0x00000060  Thread name    : WLAN RT1  Process ID     : 0
[55000.824234] Register:
[55000.824234] SP : 0x4c135128
[55000.824234] FP : 0x4c135130
[55000.824234] PC : 0x4b195a10
[55000.824234] SSR : 0x00000008
[55000.824234] BADVA : 0x00020000
[55000.824234] LR : 0x4b1951ac
[55000.824234] 
[55000.824234] Stack Dump
[55000.824234] from : 0x4c135128
[55000.824234] to   : 0x4c135980
[55000.824234] 
[55000.872903] remoteproc remoteproc0: crash detected in cd00000.q6v5_wcss: type fatal error
[55000.895144] remoteproc remoteproc0: handling crash #1 in cd00000.q6v5_wcss
[55000.903262] remoteproc remoteproc0: recovering cd00000.q6v5_wcss
[55000.935997] remoteproc remoteproc0: stopped remote processor cd00000.q6v5_wcss
[55001.228888] ath11k c000000.wifi: failed to find peer 10:5a:17:48:99:eb on vdev 0 after creation
[55001.228946] ath11k c000000.wifi: failed to find peer vdev_id 0 addr 10:5a:17:48:99:eb in delete
[55001.236411] ath11k c000000.wifi: failed peer 10:5a:17:48:99:eb delete vdev_id 0 fallback ret -22
[55001.245117] ath11k c000000.wifi: Failed to add peer: 10:5a:17:48:99:eb for VDEV: 0
[55001.254129] ath11k c000000.wifi: Failed to add station: 10:5a:17:48:99:eb for VDEV: 0
[55001.348860] ath11k c000000.wifi: failed to send WMI_PEER_DELETE cmd
[55001.348898] ath11k c000000.wifi: failed to delete peer vdev_id 2 addr 10:5a:17:48:99:eb ret -108
[55001.353942] ath11k c000000.wifi: Failed to delete peer: 10:5a:17:48:99:eb for VDEV: 2
[55001.362999] ath11k c000000.wifi: Found peer entry 9e:9d:7e:75:17:62 n vdev 2 after it was supposedly removed
[55001.370765] ------------[ cut here ]------------
[55001.380582] WARNING: CPU: 0 PID: 2653 at sta_set_sinfo+0xba4/0xbc0 [mac80211]
[55001.385186] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet ath11k_ahb ath11k ath10k_pci ath10k_core ath wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mac80211 libchacha20poly1305 chacha_neon cfg80211 slhc qrtr_smd qrtr qmi_helpers poly1305_neon ns nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c libchacha hwmon crc_ccitt compat ip6_gre ip_gre gre ip6_udp_tunnel udp_tunnel ip6_tunnel tunnel6 ip_tunnel tun seqiv jitterentropy_rng drbg michael_mic hmac cmac leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom qca_nss_dp qca_ssdk gpio_button_hotplug crc32c_generic
[55001.452793] CPU: 0 PID: 2653 Comm: hostapd Not tainted 5.15.72 #0
[55001.475029] Hardware name: Xiaomi AX3600 (DT)
[55001.481189] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[55001.485533] pc : sta_set_sinfo+0xba4/0xbc0 [mac80211]
[55001.492301] lr : sta_set_sinfo+0xba0/0xbc0 [mac80211]
[55001.497509] sp : ffffffc00d383840
[55001.502539] x29: ffffffc00d383840 x28: ffffff8002892340 x27: ffffffc00d383dd0
[55001.505846] x26: 0000000000000000 x25: ffffffc008ba0740 x24: ffffffc000ac2000
[55001.512965] x23: ffffffc00d383a38 x22: ffffff8009498c00 x21: ffffff80099508c0
[55001.520082] x20: ffffff8005f78880 x19: ffffff801022f000 x18: 0000000000000196
[55001.527201] x17: 322076656476206e x16: 2032363a37313a35 x15: ffffffc008aa79c8
[55001.534318] x14: 00000000000004c2 x13: 0000000000000196 x12: ffffffc00d383358
[55001.541437] x11: ffffffc008aff9c8 x10: 00000000fffff000 x9 : ffffffc008aff9c8
[55001.548554] x8 : 0000000000000000 x7 : ffffffc008aa79c8 x6 : 0000000000008e08
[55001.555672] x5 : 00000000d8d39b7c x4 : 0000000000000000 x3 : ffffff8002892340
[55001.562790] x2 : 0000000000000000 x1 : ffffff8002892340 x0 : 00000000ffffff94
[55001.569910] Call trace:
[55001.577018]  sta_set_sinfo+0xba4/0xbc0 [mac80211]
[55001.579279]  sta_info_destroy_addr_bss+0x50/0x74 [mac80211]
[55001.584142]  ieee80211_color_change_finish+0x1d68/0x2030 [mac80211]
[55001.589524]  cfg80211_check_station_change+0x1448/0x47d0 [cfg80211]
[55001.595775]  genl_family_rcv_msg_doit+0xb8/0x120
[55001.602023]  genl_rcv_msg+0xd4/0x1d0
[55001.606880]  netlink_rcv_skb+0x5c/0x130
[55001.610439]  genl_rcv+0x38/0x50
[55001.613997]  netlink_unicast+0x1ec/0x2e4
[55001.617124]  netlink_sendmsg+0x1a4/0x3dc
[55001.621292]  ____sys_sendmsg+0x280/0x2c4
[55001.625198]  ___sys_sendmsg+0x84/0xf0
[55001.629103]  __sys_sendmsg+0x48/0x90
[55001.632662]  __arm64_sys_sendmsg+0x24/0x30
[55001.636308]  invoke_syscall.constprop.0+0x5c/0x104
[55001.640216]  do_el0_svc+0x74/0x16c
[55001.644988]  el0_svc+0x18/0x54
[55001.648373]  el0t_64_sync_handler+0xa4/0x130
[55001.651413]  el0t_64_sync+0x184/0x188
[55001.655840] ---[ end trace ece96d9102f846dd ]---
[55004.297911] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
[55006.238871] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: start timed out
[55006.238925] remoteproc remoteproc0: can't start rproc cd00000.q6v5_wcss: -110
[55010.302758] ath11k_warn: 56 callbacks suppressed
[55010.302781] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
[55010.306473] ath11k c000000.wifi: failed to send pdev bss chan info request
[55010.314135] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
[55010.320727] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 0
[55010.327232] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
[55010.333739] ath11k c000000.wifi: failed to submit beacon template command: -108
[55010.339908] ath11k c000000.wifi: failed to update bcn template: -108
[55010.347098] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
[55010.353707] ath11k c000000.wifi: Failed to set dtim period for VDEV 0: -108
[55010.360469] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
[55016.370217] ath11k_warn: 47 callbacks suppressed

I don't have the firmware sources, so whatever the Q6 FW dumps in its panic trace is really worthless to me.

Ah ok. Sorry for the noise then

No worries, it's still useful for checking later whether QCA has maybe fixed it.
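
To make that kind of later comparison easier, the firmware version and the assertion line can be pulled out of a crash log. This is only a sketch: the function name `fw_crash_summary` is made up for illustration, and the patterns assume the log looks like the trace posted above.

```shell
# Hypothetical helper: reduce a Q6 crash dump to the two lines worth
# comparing across firmware versions. Reads dmesg-style text on stdin.
fw_crash_summary() {
    sed -n \
        -e 's/.*QC_IMAGE_VERSION_STRING=\(.*\)$/firmware: \1/p' \
        -e 's/.*\] \(.*Assertion .* failed\).*/assert: \1/p'
}
```

On the router it could be used as `dmesg | fw_crash_summary`.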

Is CONFIG_PACKAGE_irqbalance=y recommended?

I see that robimarko's default build does not enable it. What are the benefits, and are there any issues if it is enabled?

In theory irqbalance is a good idea/practice. But IMO it mostly makes sense only if you push a system/router hard to its limits. E.g. on VM hosting systems you can see/feel the effect of irqbalance being enabled or disabled. For a router I would enable it if you use e.g. SIP.

But I've enabled it anyway. :smiley:

For this specific device, install irqbalance and then run the following commands in a shell:

sed -i 's/option enabled '\''0'\''/option enabled '\''1'\''/g' /etc/config/irqbalance
sed -i '/banirq/d' /etc/config/irqbalance
sed -i '/ignore/a \
        list banirq '\''50'\'' \
        list banirq '\''51'\'' \
        list banirq '\''52'\'' \
        list banirq '\''53'\'' \
        list banirq '\''73'\'' \
        list banirq '\''74'\'' \
        list banirq '\''75'\'' ' /etc/config/irqbalance
sed -i '/exit 0/i \
sleep 20 \
#assign 4 rx interrupts to each core \
echo 8 > /proc/irq/50/smp_affinity \
echo 4 > /proc/irq/51/smp_affinity \
echo 2 > /proc/irq/52/smp_affinity \
echo 1 > /proc/irq/53/smp_affinity \
 \
#assign 3 tcl completions to 3 CPUs \
echo 4 > /proc/irq/73/smp_affinity \
echo 2 > /proc/irq/74/smp_affinity \
echo 1 > /proc/irq/75/smp_affinity \
 \
 \' /etc/rc.local

/etc/init.d/irqbalance enable
/etc/init.d/irqbalance start
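
The values written to smp_affinity above are hexadecimal CPU bitmasks (bit 0 is CPU0, so 8 means CPU3). A small sketch for checking what actually got applied after boot; the function names `mask_to_cpus` and `show_pinned_irqs` are made up here, and the IRQ list is simply the one from the commands above:

```shell
# Hypothetical helper: decode an smp_affinity hex mask into CPU numbers.
# Bit 0 = CPU0, bit 1 = CPU1, and so on.
mask_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="${out:+$out,}$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# Print the current affinity of the IRQs pinned by the rc.local snippet.
# IRQs that do not exist on the running system are skipped.
show_pinned_irqs() {
    for irq in 50 51 52 53 73 74 75; do
        f="/proc/irq/$irq/smp_affinity"
        [ -r "$f" ] || continue
        m=$(cat "$f")
        echo "IRQ $irq: mask $m -> CPU $(mask_to_cpus "$m")"
    done
}
```

Running `show_pinned_irqs` once the 20 second sleep in rc.local has passed should show one CPU per IRQ if the pinning took effect.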

Thank you for the detailed reply. I have installed your script and the collectd interrupt stats, and will be monitoring /proc/interrupts. Will see what happens :slight_smile:

EDIT: looks like on my gateway router the most used IRQs are 13, 78 and 80.
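
For anyone who wants to find their own busiest IRQs without collectd, a rough sketch that sums the per-CPU counters in /proc/interrupts and sorts them. The function name `top_irqs` is made up, and it assumes the usual layout with a CPU header row followed by one row per interrupt:

```shell
# Hypothetical helper: list the busiest IRQs as "total irq" pairs.
# $1: number of rows to show (default 5).
# Reads text in /proc/interrupts format on stdin.
top_irqs() {
    awk 'NR == 1 { ncpu = NF; next }   # header row: one column per CPU
         {
             irq = $1; sub(/:$/, "", irq)
             total = 0
             # sum only the per-CPU counter columns, skip the description
             for (i = 2; i <= ncpu + 1 && i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) total += $i
             print total, irq
         }' | sort -rn | head -n "${1:-5}"
}
```

Usage would be `top_irqs 3 < /proc/interrupts`. The counters are totals since boot, so comparing two runs a few minutes apart says more about the current load.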

Hahaha well, I could do that.

Well the good news is, since my latest pull and build the device has been pretty stable. The latest run was just shy of a week, with no oom issues and no other weirdness except strongswan which we know is an upstream problem which will have to wait for them. I only rebooted because internet went down which turned out to be an ISP problem.

Also, the qca-full-htt definitely appears to have resolved some of the roaming errors I was getting with ath11k.

Zram seems to have totally stopped any oom problems at present and memory usage looks stable on my config.

I have found the same, and not only with this device but also on the WRT1900ac. I think WPA3 is not very well implemented in many client drivers because it is not commonly used and therefore not well tested. I have several devices that act up whenever WPA3 is enabled, whether roaming or not, and I don't think it has to do with ath11k. It is notable that WPA3-SAE with 802.11r requires 802.11w, whereas with WPA2-PSK and 802.11r, 802.11w is optional.

The old WRT1900ac has a known driver bug with 802.11w, so I always thought that was the problem with WPA3, but the same devices act up when connected to the AX3600 with WPA3-SAE and 802.11r enabled.

When I enable WPA3 on AX3600 the WPA3 clients basically refuse to roam with some error noted in the logs. All works reliably on WPA2-PSK, 802.11r, and 802.11w. I have not tried again since switching to qca-full-htt which has some roaming fixes in it.

Quite a few 'smart' devices (including phones) don't like WPA2PSK/WPA3SAE mixed mode or 802.11r; they tend to be better with WPA3SAE-only and 802.11r disabled. This is a client issue, sadly not fixable on the AP side (apart from setting up distinct networks for WPA2PSK and WPA3SAE, at least with the latter switching 802.11r off).
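
A minimal sketch of what such a split could look like, written as commands for `uci batch`. Everything specific here is an assumption for illustration: the section names `wpa2_ap`/`wpa3_ap`, the radio `radio0`, the SSIDs and the key. The functions only print the commands, so they can be reviewed before anything is applied.

```shell
# Hypothetical helper: emit `uci batch` commands for one AP interface.
# $1 section name, $2 radio, $3 SSID, $4 encryption (psk2 or sae),
# $5 key, $6 ieee80211r (0 or 1)
emit_ap() {
    cat <<EOF
set wireless.$1=wifi-iface
set wireless.$1.device='$2'
set wireless.$1.mode='ap'
set wireless.$1.network='lan'
set wireless.$1.ssid='$3'
set wireless.$1.encryption='$4'
set wireless.$1.key='$5'
set wireless.$1.ieee80211r='$6'
EOF
}

# Two distinct networks: WPA2-PSK with 802.11r on, WPA3-SAE with 802.11r off.
# Radio name, SSIDs and key are placeholders, not values from this thread.
emit_split_networks() {
    emit_ap wpa2_ap radio0 'Example-WPA2' psk2 'change-me-please' 1
    emit_ap wpa3_ap radio0 'Example-WPA3' sae  'change-me-please' 0
    echo "commit wireless"
}
```

After checking the output of `emit_split_networks`, it could be applied with `emit_split_networks | uci batch` followed by `wifi reload`. Whether the problem clients behave better this way still has to be tested per device.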

This has been my experience as well. The sweet spot seems to be 27dBm on the AX3600. Up to 29dBm it will still connect, but at a lower data rate. At 30dBm I start getting weird errors and crashy drivers. I am in a very high noise environment: the AP, placed on the second floor of the building, can pick up over 100 networks on each band.

This is my kconfig from the robimarko tree that includes SQM. It is working. It also has strongswan, dawn, collectd, adblock, ddns, dashboard, led config, and some extra cli utils like iperf3, tcpdump, and iftop. All working except strongswan will crash if you roam off network and bounce the router.

Yes. You are probably missing the board-2.bin. You can find the working file and copy it to the appropriate directory on the device and reboot.

It's been a couple of weeks since I made a build, but the last time around it looked like it was supposed to pull it from some git repo and for whatever reason wasn't, even though the file is there. You can copy it into the build tree or copy it straight onto the device after installing a build.

Exactly the opposite here. I should say that I'm forcing WPA3 only. Roaming basically works, but most clients are lazy and do not go back to 5 GHz after they have switched to 2.4 GHz (though in rare cases this does happen). I've played around with usteer so far, but it is bugged (the init script needs a fix in OWT) and, from my reading of the configs (no docs are available), it does nothing because it treats both bands equally. Next I will try dawn.

I think it's all still in development. E.g. if I enable roaming on the Xiaomi AX3200 (MTK based), I have to disable "Generate PMK locally" and set R0 and R1 manually to make roaming work. For the AX3600 I don't need to do this.

On the other hand, if I enable roaming and WPA3 on the AX3600, my Arch Linux client with Intel AC wifi needs ages (30s+) to get a connection. I found out that DHCP plays ping-pong as soon as roaming and WPA3 are enabled (this does not happen on the AX3200 with WPA3 and roaming enabled). Without roaming and with WPA2 it's all fine.

I don't know the root cause, but I found out that if I change the DHCP client for NetworkManager, everything is fine. NM has a built-in DHCP client based on nettools. If I declare dhclient or dhcpcd in the NM config instead, everything works within seconds as usual. I don't know what the differences are here in terms of DHCP server/client; I thought this was a very strict protocol.
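
For reference, the DHCP client is selected with the `dhcp` key in the `[main]` section of the NetworkManager configuration. A sketch that writes a drop-in file for it; the function name and the file name `dhcp-client.conf` are made up, and which backends are actually available depends on how NetworkManager was built and on the client being installed:

```shell
# Hypothetical helper: write a NetworkManager drop-in selecting the DHCP client.
# $1 client name (internal, dhclient or dhcpcd)
# $2 conf.d directory (defaults to /etc/NetworkManager/conf.d)
write_nm_dhcp_conf() {
    client=$1
    dir=${2:-/etc/NetworkManager/conf.d}
    case "$client" in
        internal|dhclient|dhcpcd) ;;
        *) echo "unknown DHCP client: $client" >&2; return 1 ;;
    esac
    # dhcp-client.conf is an arbitrary file name chosen for this example
    printf '[main]\ndhcp=%s\n' "$client" > "$dir/dhcp-client.conf"
}
```

For example `write_nm_dhcp_conf dhclient` as root, then restart NetworkManager so the setting is picked up.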

This is the issue I meant: https://github.com/openwrt/openwrt/issues/7858

Getting roaming to work is a problem of its own, and the forum is full of it.
But as mentioned, there seems to be a specific issue when you combine roaming with sae-mixed, so that some clients can't connect. That is also nothing specific to the AX3600.