NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out

Hi,

I'm using this device:

Model: TP-Link TL-WR840N v4
Architecture: MediaTek MT7628AN ver:1 eco:2
Target Platform: ramips/mt76x8
Firmware Version: OpenWrt 23.05.0 r23497-6637af95aa / LuCI openwrt-23.05 branch git-23.236.53405-fc638c8

At random, while I am configuring the device through LuCI (adding a secondary wireless network or a WireGuard interface, installing software, etc.), the device goes offline and the kernel logs this:

[ 7124.339647] ------------[ cut here ]------------
[ 7124.344377] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:477 0x80453118
[ 7124.351555] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
[ 7124.358619] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7603e mt76 mac80211 lzo cfg80211 slhc nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 lzo_rle lzo_decompress lzo_compress libcrc32c crc_ccitt compat sha512_generic seqiv jitterentropy_rng drbg hmac cmac crypto_acompress leds_gpio gpio_button_hotplug crc32c_generic
[ 7124.421989] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.134 #0
[ 7124.428006] Stack : 805d64c8 00000004 00000000 00000000 00000000 00000000 00000000 00000000
[ 7124.436530]         00000000 00000000 00000000 00000000 00000000 00000001 80c0bdb0 138b6981
[ 7124.445064]         80c0be48 00000000 00000000 80c0bc50 00000038 802cf024 00000000 ffffffea
[ 7124.453594]         00000000 80c0bc5c 000000da 8069e2d8 805e61d4 80c0bd90 00000000 80453118
[ 7124.462125]         00000009 00000000 000a69c3 8070be34 00000018 8033ed10 00000000 80850000
[ 7124.470655]         ...
[ 7124.473143] Call Trace:
[ 7124.473158] [<802cf024>] 0x802cf024
[ 7124.479172] [<80453118>] 0x80453118
[ 7124.482724] [<8033ed10>] 0x8033ed10
[ 7124.486268] [<80006a80>] 0x80006a80
[ 7124.489817] [<80006a88>] 0x80006a88
[ 7124.493362] [<80025cc8>] 0x80025cc8
[ 7124.496901] [<80453118>] 0x80453118
[ 7124.500453] [<80025ddc>] 0x80025ddc
[ 7124.504000] [<80453118>] 0x80453118
[ 7124.507573] [<80452e40>] 0x80452e40
[ 7124.511127] [<80071a1c>] 0x80071a1c
[ 7124.514681] [<80071e20>] 0x80071e20
[ 7124.518228] [<805d0f94>] 0x805d0f94
[ 7124.521779] [<800685a4>] 0x800685a4
[ 7124.525322] [<800633fc>] 0x800633fc
[ 7124.528866] [<80002890>] 0x80002890
[ 7124.532417]
[ 7124.533930] ---[ end trace a9db53526fc6c102 ]---
[ 7124.538623] mtk_soc_eth 10100000.ethernet eth0: transmit timed out
[ 7124.544914] mtk_soc_eth 10100000.ethernet eth0: dma_cfg:00000057
[ 7124.551032] mtk_soc_eth 10100000.ethernet eth0: tx_ring=0, base=01e2c000, max=1024, ctx=158, dtx=139, fdx=139, next=158
[ 7124.561999] mtk_soc_eth 10100000.ethernet eth0: rx_ring=0, base=01e30000, max=1024, calc=746, drx=747

The serial console stays responsive, but the device has to be rebooted to restore networking.
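
In case it helps anyone compare ring state across several occurrences, here is a rough Python sketch that pulls the fields out of the tx_ring/rx_ring lines of a saved dmesg. It is meant to run on a PC, not on the router. The function name parse_ring_line is just something I made up, and I am assuming that base is printed in hex and the other fields in decimal, which is how they look in the log above. It only extracts the numbers; it does not try to interpret them.

```python
import re

# Matches the ring-state lines that follow "transmit timed out" in the log,
# e.g. "... eth0: tx_ring=0, base=01e2c000, max=1024, ctx=158, ..."
RING_RE = re.compile(r"(tx|rx)_ring=(\d+), (.*)$")


def parse_ring_line(line):
    """Return (kind, ring_index, fields) for a ring-state line, else None."""
    match = RING_RE.search(line.strip())
    if match is None:
        return None
    kind, index, rest = match.groups()
    fields = {}
    for item in rest.split(", "):
        key, _, value = item.partition("=")
        # Assumption: 'base' is hexadecimal, every other field is decimal.
        fields[key] = int(value, 16 if key == "base" else 10)
    return kind, int(index), fields


if __name__ == "__main__":
    sample = [
        "[ 7124.551032] mtk_soc_eth 10100000.ethernet eth0: tx_ring=0, "
        "base=01e2c000, max=1024, ctx=158, dtx=139, fdx=139, next=158",
        "[ 7124.561999] mtk_soc_eth 10100000.ethernet eth0: rx_ring=0, "
        "base=01e30000, max=1024, calc=746, drx=747",
    ]
    for entry in sample:
        print(parse_ring_line(entry))
```

Feeding it the lines from a few different crashes makes it easier to see whether the same counters are stuck each time.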

What is the root cause? Is there anything OpenWrt can do about it?

I have also noticed a similar issue after upgrading from 22 to 23.
Let's see how the bug report goes: https://github.com/openwrt/openwrt/issues/14167
