So, my router "crashed" again: all network access stopped and I cannot ping its internal IP, but devices still see the wifi as up and all the LEDs keep flashing.
I connected to the serial console, hit enter, and still had shell access. I looked at the dmesg
output and saw:
[ 9177.467282] ------------[ cut here ]------------
[ 9177.471906] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 4 timed out
[ 9177.478869] WARNING: CPU: 3 PID: 0 at dev_watchdog+0x238/0x240
[ 9177.484691] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e mt76_connac_lib mt76 mac80211 libchacha20poly1305 iptable_mangle iptable_filter ipt_REJECT ipt_ECN ip_tables chacha_neon cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY x_tables slhc sch_cake poly1305_neon nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c libchacha compat crypto_safexcel sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact ifb
[ 9177.484857] ip6_udp_tunnel
[ 9177.557177] mtk_soc_eth 15100000.ethernet eth1: transmit timed out
[ 9177.571174] udp_tunnel veth tun sha1_generic seqiv md5 des_generic libdes authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd gpio_button_hotplug usbcore usb_common aquantia
[ 9177.597562] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.1.79 #0
[ 9177.603463] Hardware name: ASUS TUF-AX6000 (DT)
[ 9177.607975] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 9177.614915] pc : dev_watchdog+0x238/0x240
[ 9177.618909] lr : dev_watchdog+0x238/0x240
[ 9177.622901] sp : ffffffc008cb3e40
[ 9177.626199] x29: ffffffc008cb3e40 x28: ffffffc008b769c0 x27: ffffffc008cb3f10
[ 9177.633315] x26: 00000000000000e0 x25: 0000000000000001 x24: dead000000000122
[ 9177.640430] x23: 0000000000000000 x22: ffffffc008b76000 x21: 0000000000000004
[ 9177.647546] x20: ffffff8000d8e000 x19: ffffff8000d8e4c0 x18: 0000000000000132
[ 9177.654662] x17: ffffffc01735c000 x16: ffffffc008cb0000 x15: ffffffc008b89ac0
[ 9177.661777] x14: 0000000000000396 x13: 0000000000000132 x12: 00000000ffffffea
[ 9177.668893] x11: 00000000ffffefff x10: ffffffc008be1ac0 x9 : ffffffc008b89a68
[ 9177.676009] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
[ 9177.683124] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 9177.690239] x2 : 0000000000000004 x1 : 0000000000000004 x0 : 000000000000003f
[ 9177.697356] Call trace:
[ 9177.699787] dev_watchdog+0x238/0x240
[ 9177.703434] call_timer_fn.constprop.0+0x20/0x80
[ 9177.706051] mtk_soc_eth 15100000.ethernet eth0: Link is Down
[ 9177.708040] __run_timers.part.0+0x208/0x284
[ 9177.708043] run_timer_softirq+0x38/0x70
[ 9177.708046] _stext+0x10c/0x278
[ 9177.708050] ____do_softirq+0xc/0x14
[ 9177.708055] call_on_irq_stack+0x24/0x40
[ 9177.732426] do_softirq_own_stack+0x18/0x2c
[ 9177.736593] __irq_exit_rcu+0xcc/0xd4
[ 9177.740242] irq_exit_rcu+0xc/0x14
[ 9177.743630] el1_interrupt+0x34/0x50
[ 9177.747192] el1h_64_irq_handler+0x14/0x20
[ 9177.751272] el1h_64_irq+0x68/0x6c
[ 9177.754657] arch_cpu_idle+0x14/0x20
[ 9177.758218] do_idle+0xc8/0x150
[ 9177.761347] cpu_startup_entry+0x34/0x40
[ 9177.765255] arch_show_interrupts+0x0/0x15c
[ 9177.769422] __secondary_switched+0x64/0x68
[ 9177.773591] ---[ end trace 0000000000000000 ]---
[ 9177.778262] mtk_soc_eth 15100000.ethernet eth1: Link is Down
[ 9178.307534] mtk_soc_eth 15100000.ethernet: warm reset failed
[ 9178.326509] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/2500base-x link mode
[ 9178.334910] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 2.5Gbps/Full - flow control rx/tx
[ 9178.335341] mtk_soc_eth 15100000.ethernet eth1: PHY [mdio-bus:06] driver [Maxlinear Ethernet GPY211C] (irq=POLL)
[ 9178.353662] mtk_soc_eth 15100000.ethernet eth1: configuring for phy/sgmii link mode
[ 9184.581261] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 9238.460207] page_pool_release_retry() stalled pool shutdown 30 inflight 60 sec
[ 9298.753761] page_pool_release_retry() stalled pool shutdown 30 inflight 120 sec
[ 9359.031185] page_pool_release_retry() stalled pool shutdown 30 inflight 180 sec
[ 9419.320104] page_pool_release_retry() stalled pool shutdown 30 inflight 241 sec
[ 9479.619586] page_pool_release_retry() stalled pool shutdown 30 inflight 301 sec
[ 9539.899311] page_pool_release_retry() stalled pool shutdown 30 inflight 361 sec
[ 9600.199094] page_pool_release_retry() stalled pool shutdown 30 inflight 422 sec
[ 9660.478937] page_pool_release_retry() stalled pool shutdown 30 inflight 482 sec
[ 9720.768799] page_pool_release_retry() stalled pool shutdown 30 inflight 542 sec
Rebooting the router brings everything back up. Does anyone have any idea what is going on?
I think I may need to go back to the EdgeRouter X. It cannot handle anywhere near 1 Gbps with traffic shaping, but at least it was reliable. I'll be away from home for a few weeks and won't be able to reboot the router remotely, so a crash like this would take down everything I host at home.
OpenWrt version: OpenWrt SNAPSHOT, r25366-7f13b9f8be
The crash has happened with WED both enabled and disabled (it seems to be something with the physical ports rather than the radio).
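
Since the shell stays alive when this happens, one possible stopgap while I'm away would be a small check that looks for the failure signature in dmesg. The following is only a rough sketch, assuming a Python 3 interpreter is installed on the router; the function names `find_tx_hang` and `read_dmesg` are made up for illustration, and the two patterns are simply copied from the log above, so they may not cover every variant of this failure.

```python
import re
import subprocess

# Patterns taken from the dmesg output in this post (assumption: the same
# messages appear each time the ports hang).
SIGNATURES = (
    re.compile(r"NETDEV WATCHDOG: \S+ \(mtk_soc_eth\): transmit queue \d+ timed out"),
    re.compile(r"mtk_soc_eth \S+: warm reset failed"),
)


def find_tx_hang(dmesg_text):
    """Return the log lines that match one of the known hang signatures."""
    hits = []
    for line in dmesg_text.splitlines():
        if any(pattern.search(line) for pattern in SIGNATURES):
            hits.append(line.strip())
    return hits


def read_dmesg():
    """Return the current kernel log as text by running the dmesg command."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout
```

Running `find_tx_hang(read_dmesg())` periodically (for example from cron) and rebooting when it returns a non-empty list would at least bring the network back automatically. It does nothing about the underlying driver problem.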