Hoping someone has some insight into why the 5 GHz radio dies with the following stack trace after being up for a while:
[18140.114720] ------------[ cut here ]------------
[18140.119384] WARNING: CPU: 1 PID: 5747 at lib/list_debug.c:62 0x80388fcc
[18140.126046] list_del corruption. prev->next should be 81f29a30, but was 80b13db4. (prev=8431fe88)
[18140.134990] Modules linked in: pppoe ppp_async wireguard pptp pppox ppp_mppe ppp_generic nft_redir nft_nat nft_masq nft_flow_offload nft_fib_inet nft_ct nft_chain_nat nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_amanda nf_nat nf_flow_table_inet nf_flow_table nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_sane nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_broadcast nf_conntrack_amanda nf_conntrack mt76x2e(O) mt76x2_common(O) mt76x02_lib(O) mt7603e(O) mt76(O) mac80211(O) libchacha20poly1305 ipt_REJECT ebtable_nat ebtable_filter ebtable_broute cfg80211(O) xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda ts_kmp ts_fsm ts_bm slhc sch_cake poly1305_mips nft_reject_ipv6
[18140.135763] nft_reject_ipv4 nft_reject_inet nft_reject nft_quota nft_numgen nft_log nft_limit nft_hash nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_compat nf_tables nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv4 libcurve25519_generic libcrc32c iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables ebtables ebt_vlan ebt_stp ebt_redirect ebt_pkttype ebt_mark_m ebt_mark ebt_limit ebt_among ebt_802_3 crc_ccitt compat(O) chacha_mips asn1_decoder sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact ledtrig_usbport xt_set x_tables ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ipmac ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink msdos ip6_gre ip_gre gre ifb nat46(O) nf_defrag_ipv6 ip6_udp_tunnel udp_tunnel sit ip6_tunnel
[18140.225256] tunnel6 tunnel4 ip_tunnel tun nls_utf8 nls_iso8859_1 nls_cp437 crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic sha256_generic sha1_generic seqiv sha3_generic drbg kpp hmac geniv rng ecb cmac arc4 mmc_block usb_storage mtk_sd mmc_core leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd ohci_platform ohci_hcd sd_mod scsi_mod scsi_common gpio_button_hotplug(O) vfat fat ext4 mbcache jbd2 exfat usbcore nls_base usb_common crc32c_generic
[18140.357935] CPU: 1 PID: 5747 Comm: kworker/u8:1 Tainted: G O 6.6.73 #0
[18140.365876] Workqueue: phy1 0x84575898 [mt76x02_lib@599506a6+0x9000]
[18140.372322] Stack : 00000000 00001673 80943440 80841920 00000000 00000000 00000000 00000000
[18140.380744] 00000000 00000000 00000000 00000000 00000000 00000001 850b5c58 81891f40
[18140.389179] 850b5cf0 00000000 00000000 850b5b30 00000038 80822a84 ffffffea 00000000
[18140.397611] 850b5b3c 00000184 8094b060 ffffffff 80841920 850b5c38 850b5d60 80388fcc
[18140.406025] 00000009 808605f8 80943440 86c1d210 00000018 80496db4 00000004 80b10004
[18140.414451] ...
[18140.416952] Call Trace:
[18140.417474] [<80822a84>] 0x80822a84
[18140.423845] [<80388fcc>] 0x80388fcc
[18140.427347] [<80496db4>] 0x80496db4
[18140.431314] [<800073b8>] 0x800073b8
[18140.434811] [<800073c0>] 0x800073c0
[18140.438312] [<80388fcc>] 0x80388fcc
[18140.441808] [<807f8c54>] 0x807f8c54
[18140.445497] [<8002d9b0>] 0x8002d9b0
[18140.449006] [<80388fcc>] 0x80388fcc
[18140.452522] [<8002db64>] 0x8002db64
[18140.457175] [<80388fcc>] 0x80388fcc
[18140.461670] [<84794d64>] 0x84794d64 [mt76@c1375e29+0xd000]
[18140.468330] [<84794e7c>] 0x84794e7c [mt76@c1375e29+0xd000]
[18140.474164] [<805b07ec>] 0x805b07ec
[18140.478695] [<84575ac4>] 0x84575ac4 [mt76x02_lib@599506a6+0x9000]
[18140.484869] [<847e127c>] 0x847e127c [mt76x2e@7d6adb34+0x3000]
[18140.491672] [<80048db8>] 0x80048db8
[18140.495512] [<800498fc>] 0x800498fc
[18140.499022] [<80053698>] 0x80053698
[18140.502515] [<80824940>] 0x80824940
[18140.506007] [<80049560>] 0x80049560
[18140.509497] [<80053698>] 0x80053698
[18140.512988] [<8005379c>] 0x8005379c
[18140.516484] [<80053698>] 0x80053698
[18140.519977] [<80053698>] 0x80053698
[18140.523462] [<80002a58>] 0x80002a58
[18140.526977]
[18140.528542] ---[ end trace 0000000000000000 ]---
[18140.676208] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
[18140.676284] mt76x2e 0000:01:00.0: Build: 1
[18140.676305] mt76x2e 0000:01:00.0: Build Time: 201607111443____
[18140.694491] mt76x2e 0000:01:00.0: Firmware running!
[18140.697907] ieee80211 phy1: Hardware restart was requested
The router is an old IQRouter, which was touted to be a Zbtlink ZBT-WE3526, but in order to use the 24.10 branch it is installed as a Zbtlink ZBT-WE1326. Restarting wifi or the network does not bring the radio back; it takes a full device reboot.
brada4
February 27, 2025, 5:55pm
2
What events leading up to the driver crash are seen in logread?
I may have to restart to see if I can catch it again. The beginning of logread only shows the last few lines of the trace from above.
brada4
February 27, 2025, 6:37pm
4
tomporter518:
ZBT-WE3526
You have lots of RAM, so you can set the log buffer to a megabyte instead of 64k.
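A sketch of how the larger log buffer could be set, assuming a stock OpenWrt system where logd takes its ring buffer size from the `log_size` option (in KiB) of the first `system` section in /etc/config/system. The value 1024 is only an illustration of "a megabyte".

```shell
# Assumption: stock OpenWrt with logd; log_size is given in KiB.
uci set 'system.@system[0].log_size=1024'
uci commit system

# Restart the log service so the new buffer size takes effect.
# Note: this clears whatever is currently held in the buffer.
/etc/init.d/log restart

# Follow the log live so the lines leading up to the next failure are captured.
logread -f
```

With the larger buffer, the events before the "Hardware restart was requested" message should still be available when the radio next stops working.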
Eh, the above stack trace may be unrelated. After the restart, 5G appeared to 'die' again, though it still shows as active. Devices end up disconnecting and cannot reconnect. The logread just shows this leading up to the 'Hardware restart was requested' message, after which there is no connectivity.
Thu Feb 27 13:10:44 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED a0:c9:a0:a8:b1:ee auth_alg=open
Thu Feb 27 13:10:44 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED a0:c9:a0:a8:b1:ee
Thu Feb 27 13:13:43 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 38:88:a4:b1:92:93 auth_alg=open
Thu Feb 27 13:13:43 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 38:88:a4:b1:92:93
Thu Feb 27 13:15:35 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED a0:92:08:7c:00:f9 auth_alg=open
Thu Feb 27 13:15:35 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED a0:92:08:7c:00:f9
Thu Feb 27 13:15:35 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED fc:67:1f:d8:ff:fc auth_alg=open
Thu Feb 27 13:15:35 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED fc:67:1f:d8:ff:fc
Thu Feb 27 13:15:36 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED fc:67:1f:d8:85:6e auth_alg=open
Thu Feb 27 13:15:36 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED fc:67:1f:d8:85:6e
Thu Feb 27 13:15:36 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED fc:67:1f:d7:53:5a auth_alg=open
Thu Feb 27 13:15:36 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED fc:67:1f:d7:53:5a
Thu Feb 27 13:15:37 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 20:c9:d0:44:50:73 auth_alg=open
Thu Feb 27 13:15:37 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 20:c9:d0:44:50:73
Thu Feb 27 13:15:38 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED ec:fa:bc:91:60:58 auth_alg=open
Thu Feb 27 13:15:38 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED ec:fa:bc:91:60:58
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 20:00:00:68:e0:39 auth_alg=open
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 20:00:00:68:e0:39
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED fc:a1:83:e0:b3:92 auth_alg=open
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED fc:a1:83:e0:b3:92
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 6c:56:97:28:2f:b5 auth_alg=open
Thu Feb 27 13:15:40 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 6c:56:97:28:2f:b5
Thu Feb 27 13:15:41 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 6c:56:97:03:df:5d auth_alg=open
Thu Feb 27 13:15:41 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 6c:56:97:03:df:5d
Thu Feb 27 13:15:41 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED fc:e9:d8:7f:9b:5f auth_alg=open
Thu Feb 27 13:15:41 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED fc:e9:d8:7f:9b:5f
Thu Feb 27 13:15:50 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED 44:39:c4:b6:ee:b0 auth_alg=open
Thu Feb 27 13:15:50 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED 44:39:c4:b6:ee:b0
Thu Feb 27 13:16:14 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED 70:89:76:4d:11:60 auth_alg=open
Thu Feb 27 13:16:14 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED 70:89:76:4d:11:60
Thu Feb 27 13:16:14 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED 10:5a:17:f6:42:25 auth_alg=open
Thu Feb 27 13:16:14 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED 10:5a:17:f6:42:25
Thu Feb 27 13:16:15 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED 10:5a:17:f7:34:24 auth_alg=open
Thu Feb 27 13:16:15 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED 10:5a:17:f7:34:24
Thu Feb 27 13:16:51 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED c0:21:0d:74:01:2d auth_alg=open
Thu Feb 27 13:16:51 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED c0:21:0d:74:01:2d
Thu Feb 27 13:16:57 2025 daemon.notice hostapd: phy0-ap0: AP-STA-CONNECTED 70:ee:50:29:21:54 auth_alg=open
Thu Feb 27 13:16:57 2025 daemon.notice hostapd: phy0-ap0: EAPOL-4WAY-HS-COMPLETED 70:ee:50:29:21:54
Thu Feb 27 13:23:21 2025 kern.info kernel: [ 1222.156515] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Feb 27 13:23:21 2025 kern.info kernel: [ 1222.156579] mt76x2e 0000:01:00.0: Build: 1
Thu Feb 27 13:23:21 2025 kern.info kernel: [ 1222.156592] mt76x2e 0000:01:00.0: Build Time: 201607111443____
Thu Feb 27 13:23:21 2025 kern.info kernel: [ 1222.175106] mt76x2e 0000:01:00.0: Firmware running!
Thu Feb 27 13:23:21 2025 kern.info kernel: [ 1222.178269] ieee80211 phy1: Hardware restart was requested
A bit more info. Some of the devices, such as my phone, stay connected with a 'full wifi indicator' but report that 'the internet is unreachable'. It's almost as if routing gets busted or the 5G is no longer 'part of the network'.
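One way to test the "no longer part of the network" idea is to check whether the 5 GHz AP interface is still a port of the LAN bridge after the hardware restart. The sketch below relies on the Linux sysfs layout, where each bridge port appears under /sys/class/net/<bridge>/brif/. The bridge name `br-lan` is an assumption (the usual OpenWrt default), `phy1-ap0` is taken from the hostapd lines above, and the helper name `in_bridge` is made up for illustration.

```shell
# in_bridge BRIDGE PORT [SYSFS_ROOT]
# Succeeds if PORT is currently enslaved to BRIDGE.
# SYSFS_ROOT defaults to /sys/class/net and is only overridable for testing.
in_bridge() {
    [ -e "${3:-/sys/class/net}/$1/brif/$2" ]
}
```

Usage, run once while 5G works and again after it stops passing traffic:

    in_bridge br-lan phy1-ap0 && echo "still bridged" || echo "not in bridge"

If the interface is still bridged after the failure, the problem is more likely in the radio or driver than in the network configuration.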
brada4
February 27, 2025, 7:16pm
7
How many devices are associated? It looks like a lot, and none disassociates. Maybe some over-eager MAC randomizer in the house?
You have mix of iptables and nftables modules installed. Please show
opkg list-installed | grep legacy
nft list ruleset | grep xt
(i.e. it should be on the nftables or the iptables side, not in between)
There are disassociations right after the restart request:
Thu Feb 27 13:26:48 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 6c:56:97:28:2f:b5
Thu Feb 27 13:26:51 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED fc:e9:d8:7f:9b:5f
Thu Feb 27 13:26:51 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 6c:56:97:03:df:5d
Thu Feb 27 13:26:53 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED fc:a1:83:e0:b3:92
Thu Feb 27 13:28:28 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 20:c9:d0:44:50:73
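To answer the "how many devices" question from the log rather than by eye, the hostapd events can be tallied per AP interface. This is a minimal sketch that assumes the logread line format shown above, where the interface name is the field immediately before the event name. The function name `count_sta_events` is made up for illustration.

```shell
# Read logread-style text on stdin and print one line per
# (interface, event) pair with the number of occurrences.
count_sta_events() {
    awk '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "AP-STA-CONNECTED" || $i == "AP-STA-DISCONNECTED") {
                    iface = $(i - 1)        # e.g. "phy1-ap0:"
                    sub(/:$/, "", iface)    # drop the trailing colon
                    count[iface " " $i]++
                }
            }
        }
        END {
            for (key in count) print key, count[key]
        }
    ' | LC_ALL=C sort
}
```

Usage:

    logread | count_sta_events

This only counts events still held in the log buffer, so it undercounts if older lines have already rotated out. For the live list of associated stations, `iw dev phy1-ap0 station dump` can be used instead.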
I have all 'mac randomizers' turned off as I use a pi and pi-hole for DNS/DHCP. And I hate that 'feature' on a home network.
Probably a mix of things on this as I've 'upgraded in place' from the IQRouter custom openwrt when they went EOL. That could be affecting the various configs too. I'm trying to hold off on a full reset as I've ordered two new routers to replace this one and the one running dd-wrt as a secondary AP. Getting two matching Cudy so that I can have a uniform network layout.
Both of those greps return nothing:
root@plexus:~# opkg list-installed | grep legacy
root@plexus:~# nft list ruleset | grep xt
root@plexus:~#
brada4
February 27, 2025, 7:43pm
9
OK, no problem with firewall kmods.
The wifi driver has no configurable parameters, but it is still strange that it kills itself long before reaching hundreds of clients.
I'm wondering if I should revert back to the 23.05 branch where I can actually switch back the ZBT-WE3526 functionality. There was some odd stuff from what I saw in bug reports.
brada4
February 27, 2025, 8:20pm
11
You can try the 24.10-SNAPSHOT branch first instead of 24.10.0.
Well, there were changes in the 24.10 branch to deal with PCI issues between those models, and I think they would still be in even the snapshot. 23.05 seemed to have similar behavior but much less frequently. I could give it a shot though.
After a few tries with various versions, I have given up and reverted to OpenWrt 22.03.7, which has provided solid 5G for a day+. The 22.03 branch seems to be the last one that works consistently with the former IQRouter touted to be a Zbtlink ZBT-WE3526. On later branches, the Zbtlink ZBT-WE1326 firmware must be used to get the 5G radio to work at all, but then the radio is not stable and eventually all of its clients stop working (usually after anywhere from a few minutes to a few hours). I found a few issue discussions about this but couldn't grok the implications of the changes made. I'll continue with this version until my new, different routers/APs arrive, at which point I will try the 24.10 version on those.