Issue with mtk_soc_eth on mt7621?

Hi,

All of a sudden I started getting strange Ethernet issues on my mt7621-based router.

Every couple of days I have to reboot it because the WAN Ethernet stops working. The LAN and WLAN sides continue to work fine.

Mon Nov  2 07:37:52 2020 daemon.info dnsmasq-dhcp[1935]: DHCPACK(br-lan) 192.168.1.100 8c:85:90:b0:21:80 Gregorys-MBP-2
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.255011] ------------[ cut here ]------------
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.259649] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:320 0x8038c0d0
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.266693] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.273619] Modules linked in: pppoe ppp_async pppox ppp_generic nf_conntrack_ipv6 mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 tun mmc_block mtk_sd mmc_core leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd gpio_button_hotplug usbcore nls_base
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.344656]  usb_common
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.347137] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.195 #0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.353201] Stack : 00000000 00000000 00000000 8fdec340 00000000 00000000 00000000 00000000
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.361534]         00000000 00000000 00000000 00000000 00000000 00000001 8fc09d60 ac07f5da
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.369868]         8fc09df8 00000000 00000000 000060f0 00000038 8049c858 00000007 00000000
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.378201]         00000000 80550000 00054c01 00000000 8fc09d40 00000000 00000000 8050aed8
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.386535]         8038c0d0 00000140 00000000 8fdec340 00000000 802ad210 00000000 806b0000
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.394875]         ...
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.397319] Call Trace:
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.397376] [<8049c858>] 0x8049c858
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.403284] [<8038c0d0>] 0x8038c0d0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.406759] [<802ad210>] 0x802ad210
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.410233] [<8000c1a0>] 0x8000c1a0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.413702] [<8000c1a8>] 0x8000c1a8
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.417174] [<804856b4>] 0x804856b4
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.420645] [<80071ab0>] 0x80071ab0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.424124] [<8002e608>] 0x8002e608
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.427594] [<8038c0d0>] 0x8038c0d0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.431071] [<8002e690>] 0x8002e690
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.434549] [<8038c0d0>] 0x8038c0d0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.438018] [<80099a00>] 0x80099a00
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.441493] [<8038bf24>] 0x8038bf24
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.444966] [<80088568>] 0x80088568
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.448441] [<80088824>] 0x80088824
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.451909] [<80079158>] 0x80079158
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.455381] [<804a3658>] 0x804a3658
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.458854] [<80032fb4>] 0x80032fb4
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.462324] [<8025a5f0>] 0x8025a5f0
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.465797] [<80007488>] 0x80007488
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.469265]
Mon Nov  2 07:38:09 2020 kern.warn kernel: [74270.470862] ---[ end trace 16f73d7314096e47 ]---
Mon Nov  2 07:38:09 2020 kern.err kernel: [74270.475506] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.481662] mtk_soc_eth 1e100000.ethernet eth0: dma_cfg:80000067
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.487671] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, base=0ec00000, max=0, ctx=2594, dtx=2296, fdx=2296, next=2594
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.498517] mtk_soc_eth 1e100000.ethernet eth0: rx_ring=0, base=0e1a0000, max=0, calc=3275, drx=3279
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.909029] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x5c60000c, 0x10c = 0x80818
Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.921904] mtk_soc_eth 1e100000.ethernet: PPE started
Mon Nov  2 07:38:15 2020 kern.err kernel: [74276.734768] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Mon Nov  2 07:38:15 2020 kern.info kernel: [74276.740954] mtk_soc_eth 1e100000.ethernet eth0: dma_cfg:80000067
Mon Nov  2 07:38:15 2020 kern.info kernel: [74276.746982] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, base=0e020000, max=0, ctx=3072, dtx=0, fdx=0, next=3072
Mon Nov  2 07:38:15 2020 kern.info kernel: [74276.757311] mtk_soc_eth 1e100000.ethernet eth0: rx_ring=0, base=0edc0000, max=0, calc=88, drx=89
Mon Nov  2 07:38:16 2020 kern.info kernel: [74277.175196] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x5a60000c, 0x10c = 0x80818
Mon Nov  2 07:38:16 2020 kern.info kernel: [74277.188506] mtk_soc_eth 1e100000.ethernet: PPE started
Mon Nov  2 07:38:58 2020 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED 8c:85:90:b0:21:80
Mon Nov  2 07:38:58 2020 daemon.info hostapd: wlan1: STA 8c:85:90:b0:21:80 IEEE 802.11: disassociated
Mon Nov  2 07:38:59 2020 daemon.info hostapd: wlan1: STA 8c:85:90:b0:21:80 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Mon Nov  2 07:40:12 2020 kern.err kernel: [74393.290329] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Mon Nov  2 07:40:12 2020 kern.info kernel: [74393.296514] mtk_soc_eth 1e100000.ethernet eth0: dma_cfg:80000067
Mon Nov  2 07:40:12 2020 kern.info kernel: [74393.302547] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, base=0edc0000, max=0, ctx=3072, dtx=0, fdx=0, next=3072
Mon Nov  2 07:40:12 2020 kern.info kernel: [74393.312904] mtk_soc_eth 1e100000.ethernet eth0: rx_ring=0, base=0eb40000, max=0, calc=2425, drx=2426
Mon Nov  2 07:40:12 2020 kern.info kernel: [74393.731242] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x5560000c, 0x10c = 0x80818
Mon Nov  2 07:40:12 2020 kern.info kernel: [74393.744558] mtk_soc_eth 1e100000.ethernet: PPE started
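
To see how often the driver hits the watchdog, the log above can be scanned for the "transmit timed out" lines. The following is only a minimal sketch: the function names `find_timeouts` and `gaps` are made up for illustration, and the sample lines are copied from the log above. It assumes the log has been saved as plain text (for example from `logread`) and is processed on any machine with Python 3.

```python
import re

# Matches the kernel uptime stamp and the driver's own timeout message, e.g.
# "[74270.475506] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out".
# The generic "NETDEV WATCHDOG" warning line is deliberately not matched.
TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+mtk_soc_eth\s+\S+\s+(\w+): transmit timed out"
)


def find_timeouts(lines):
    """Return (uptime_seconds, interface) for every driver timeout line."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def gaps(events):
    """Seconds elapsed between consecutive timeout events."""
    times = [uptime for uptime, _ in events]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


# Sample lines taken verbatim from the log in this post.
SAMPLE = [
    "Mon Nov  2 07:38:09 2020 kern.err kernel: [74270.475506] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out",
    "Mon Nov  2 07:38:09 2020 kern.info kernel: [74270.921904] mtk_soc_eth 1e100000.ethernet: PPE started",
    "Mon Nov  2 07:38:15 2020 kern.err kernel: [74276.734768] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out",
    "Mon Nov  2 07:40:12 2020 kern.err kernel: [74393.290329] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out",
]

if __name__ == "__main__":
    events = find_timeouts(SAMPLE)
    for uptime, iface in events:
        print(f"{iface}: timeout at {uptime:.3f}s uptime")
    print("gaps between timeouts (s):", gaps(events))
```

The regular expression also accepts the raw dmesg form (`[ 1148.266172] ...`), so the same sketch should work on either kind of paste.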

dmesg:

[    0.000000] Linux version 4.14.195 (builder@buildhost) (gcc version 7.5.0 (OpenWrt GCC 7.5.0 r11208-ce6496d796)) #0 SMP Sun Sep 6 16:19:39 2020
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is ZBT-WE1326
[    0.000000] Determined physical RAM map:
[    0.000000]  memory: 10000000 @ 00000000 (usable)
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] On node 0 totalpages: 65536
[    0.000000] free_area_init_node: node 0, pgdat 805727c0, node_mem_map 81003000
[    0.000000]   Normal zone: 512 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 65536 pages, LIFO batch:15
[    0.000000] random: get_random_bytes called from 0x80575744 with crng_init=0
[    0.000000] percpu: Embedded 14 pages/cpu s26224 r8192 d22928 u57344
[    0.000000] pcpu-alloc: s26224 r8192 d22928 u57344 alloc=14*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 65024
[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
[    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[    0.000000] Writing ErrCtl register=00024a20
[    0.000000] Readback ErrCtl register=00024a20
[    0.000000] Memory: 252468K/262144K available (4750K kernel code, 237K rwdata, 588K rodata, 1260K init, 255K bss, 9676K reserved, 0K cma-reserved, 0K highmem)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000] NR_IRQS: 256
[    0.000000] CPU Clock: 880MHz
[    0.000000] clocksource: GIC: mask: 0xffffffffffffffff max_cycles: 0xcaf478abb4, max_idle_ns: 440795247997 ns
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 4343773742 ns
[    0.000010] sched_clock: 32 bits at 440MHz, resolution 2ns, wraps every 4880645118ns
[    0.007831] Calibrating delay loop... 586.13 BogoMIPS (lpj=2930688)
[    0.073989] pid_max: default: 32768 minimum: 301
[    0.078766] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.085275] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.094189] Hierarchical SRCU implementation.
[    0.099373] smp: Bringing up secondary CPUs ...
[    0.105454] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.105463] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.105475] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.105625] CPU1 revision is: 0001992f (MIPS 1004Kc)
[    0.164129] Synchronize counters for CPU 1: done.
[    0.205454] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.205463] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.205471] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.205544] CPU2 revision is: 0001992f (MIPS 1004Kc)
[    0.255242] Synchronize counters for CPU 2: done.
[    0.286544] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.286552] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.286559] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.286636] CPU3 revision is: 0001992f (MIPS 1004Kc)
[    0.340432] Synchronize counters for CPU 3: done.
[    0.370284] smp: Brought up 1 node, 4 CPUs
[    0.377964] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.387764] futex hash table entries: 1024 (order: 3, 32768 bytes)
[    0.394078] pinctrl core: initialized pinctrl subsystem
[    0.400774] NET: Registered protocol family 16
[    0.411626] FPU Affinity set after 11720 emulations
[    0.412386] pull PCIe RST: RALINK_RSTCTRL = 0
[    0.717147] release PCIe RST: RALINK_RSTCTRL = 7000000
[    0.722193] ***** Xtal 40MHz *****
[    0.725551] release PCIe RST: RALINK_RSTCTRL = 7000000
[    0.730650] Port 0 N_FTS = 1b102800
[    0.734125] Port 1 N_FTS = 1b105000
[    0.737558] Port 2 N_FTS = 1b105000
[    1.892818] PCIE0 no card, disable it(RST&CLK)
[    1.897183]  -> 10207f2
[    1.899581] PCIE1 enabled
[    1.902175] PCIE2 enabled
[    1.904778] PCI host bridge /pcie@1e140000 ranges:
[    1.909559]  MEM 0x0000000060000000..0x000000006fffffff
[    1.914713]   IO 0x000000001e160000..0x000000001e16ffff
[    1.919909] PCI coherence region base: 0xbfbf8000, mask/settings: 0x60000000
[    1.936633] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    1.942486] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    1.948221] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    1.955434] PCI host bridge to bus 0000:00
[    1.959462] pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
[    1.966327] pci_bus 0000:00: root bus resource [io  0xffffffff]
[    1.972151] pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
[    1.978909] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
[    1.986804] pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
[    1.986847] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    1.986859] pci 0000:00:00.0: reg 0x14: [mem 0x60300000-0x6030ffff]
[    1.986927] pci 0000:00:00.0: supports D1
[    1.986936] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[    1.987183] pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
[    1.987209] pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    1.987228] pci 0000:00:01.0: reg 0x14: [mem 0x60310000-0x6031ffff]
[    1.987280] pci 0000:00:01.0: supports D1
[    1.987289] pci 0000:00:01.0: PME# supported from D0 D1 D3hot
[    1.987685] pci 0000:01:00.0: [14c3:7662] type 00 class 0x028000
[    1.987738] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
[    1.987783] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[    1.987874] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    1.988058] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.988248] pci 0000:02:00.0: [14c3:7603] type 00 class 0x028000
[    1.988301] pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x000fffff]
[    1.988427] pci 0000:02:00.0: PME# supported from D0 D3hot D3cold
[    1.988599] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.988618] pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 02
[    1.988683] pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
[    1.995242] pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
[    2.002109] pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
[    2.008700] pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
[    2.015594] pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
[    2.022346] pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
[    2.029495] pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
[    2.036256] pci 0000:00:00.0: BAR 1: assigned [mem 0x60300000-0x6030ffff]
[    2.042987] pci 0000:00:01.0: BAR 1: assigned [mem 0x60310000-0x6031ffff]
[    2.049748] pci 0000:01:00.0: BAR 0: assigned [mem 0x60000000-0x600fffff 64bit]
[    2.056985] pci 0000:01:00.0: BAR 6: assigned [mem 0x60100000-0x6010ffff pref]
[    2.064163] pci 0000:00:00.0: PCI bridge to [bus 01]
[    2.069069] pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
[    2.075825] pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
[    2.082990] pci 0000:02:00.0: BAR 0: assigned [mem 0x60200000-0x602fffff]
[    2.089742] pci 0000:00:01.0: PCI bridge to [bus 02]
[    2.094644] pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
[    2.102833] clocksource: Switched to clocksource GIC
[    2.109362] NET: Registered protocol family 2
[    2.114579] TCP established hash table entries: 2048 (order: 1, 8192 bytes)
[    2.121494] TCP bind hash table entries: 2048 (order: 2, 16384 bytes)
[    2.127915] TCP: Hash tables configured (established 2048 bind 2048)
[    2.134335] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    2.140102] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    2.146587] NET: Registered protocol family 1
[    2.150912] PCI: CLS 0 bytes, default 32
[    2.382778] 4 CPUs re-calibrate udelay(lpj = 2924544)
[    2.389344] Crashlog allocated RAM at address 0x3f00000
[    2.394764] workingset: timestamp_bits=30 max_order=16 bucket_order=0
[    2.403249] random: fast init done
[    2.411710] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    2.417494] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    2.431037] io scheduler noop registered
[    2.434927] io scheduler deadline registered (default)
[    2.441139] Serial: 8250/16550 driver, 16 ports, IRQ sharing enabled
[    2.451181] console [ttyS0] disabled
[    2.454826] 1e000c00.uartlite: ttyS0 at MMIO 0x1e000c00 (irq = 19, base_baud = 3125000) is a 16550A
[    2.463851] console [ttyS0] enabled
[    2.470706] bootconsole [early0] disabled
[    2.481149] MediaTek Nand driver init, version v2.1 Fix AHB virt2phys error
[    2.488603] spi-mt7621 1e000b00.spi: sys_freq: 220000000
[    2.503947] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    2.508693] 4 fixed-partitions partitions found on MTD device spi0.0
[    2.515047] Creating 4 MTD partitions on "spi0.0":
[    2.519826] 0x000000000000-0x000000030000 : "u-boot"
[    2.525989] 0x000000030000-0x000000040000 : "u-boot-env"
[    2.532305] 0x000000040000-0x000000050000 : "factory"
[    2.538496] 0x000000050000-0x000001000000 : "firmware"
[    2.544925] 2 uimage-fw partitions found on MTD device firmware
[    2.550829] Creating 2 MTD partitions on "firmware":
[    2.555846] 0x000000000000-0x0000001c9c2c : "kernel"
[    2.561927] 0x0000001c9c2c-0x000000fb0000 : "rootfs"
[    2.568044] mtd: device 5 (rootfs) set to be root filesystem
[    2.573880] 1 squashfs-split partitions found on MTD device rootfs
[    2.580055] 0x000000460000-0x000000fb0000 : "rootfs_data"
[    2.587176] libphy: Fixed MDIO Bus: probed
[    2.654859] libphy: mdio: probed
[    4.057771] mtk_soc_eth 1e100000.ethernet: loaded mt7530 driver
[    4.064534] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 22
[    4.075672] NET: Registered protocol family 10
[    4.081727] Segment Routing with IPv6
[    4.085557] NET: Registered protocol family 17
[    4.090075] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    4.102993] 8021q: 802.1Q VLAN Support v1.8
[    4.109920] hctosys: unable to open rtc device (rtc0)
[    4.124043] VFS: Mounted root (squashfs filesystem) readonly on device 31:5.
[    4.136008] Freeing unused kernel memory: 1260K
[    4.140539] This architecture does not have kernel memory protection.
[    4.328072] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
[    4.417252] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
[    4.956664] init: Console is alive
[    4.960320] init: - watchdog -
[    5.390266] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
[    5.669453] mtk_soc_eth 1e100000.ethernet eth0: port 4 link up
[    6.069894] kmodloader: loading kernel modules from /etc/modules-boot.d/*
[    6.250527] usbcore: registered new interface driver usbfs
[    6.256155] usbcore: registered new interface driver hub
[    6.261623] usbcore: registered new device driver usb
[    6.274390] xhci-mtk 1e1c0000.xhci: 1e1c0000.xhci supply vbus not found, using dummy regulator
[    6.283126] xhci-mtk 1e1c0000.xhci: 1e1c0000.xhci supply vusb33 not found, using dummy regulator
[    6.292071] xhci-mtk 1e1c0000.xhci: xHCI Host Controller
[    6.297433] xhci-mtk 1e1c0000.xhci: new USB bus registered, assigned bus number 1
[    6.312971] xhci-mtk 1e1c0000.xhci: hcc params 0x01401198 hci version 0x96 quirks 0x0000000000210010
[    6.322153] xhci-mtk 1e1c0000.xhci: irq 21, io mem 0x1e1c0000
[    6.328979] hub 1-0:1.0: USB hub found
[    6.332849] hub 1-0:1.0: 2 ports detected
[    6.337512] xhci-mtk 1e1c0000.xhci: xHCI Host Controller
[    6.342888] xhci-mtk 1e1c0000.xhci: new USB bus registered, assigned bus number 2
[    6.350353] xhci-mtk 1e1c0000.xhci: Host supports USB 3.0  SuperSpeed
[    6.356986] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[    6.365976] hub 2-0:1.0: USB hub found
[    6.369791] hub 2-0:1.0: 1 port detected
[    6.446346] kmodloader: done loading kernel modules from /etc/modules-boot.d/*
[    6.463197] init: - preinit -
[    7.418756] mtk_soc_eth 1e100000.ethernet: PPE started
[   10.794306] jffs2: notice: (474) jffs2_build_xattr_subsystem: complete building xattr subsystem, 7 of xdatum (3 unchecked, 4 orphan) and 41 of xref (4 dead, 0 orphan) found.
[   10.815158] mount_root: switching to jffs2 overlay
[   10.837680] overlayfs: upper fs does not support tmpfile.
[   10.971818] urandom-seed: Seeding with /etc/urandom.seed
[   11.173960] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x6060000c, 0x10c = 0x80818
[   11.189371] procd: - early -
[   11.192356] procd: - watchdog -
[   11.893032] procd: - watchdog -
[   11.896530] procd: - ubus -
[   12.016516] random: ubusd: uninitialized urandom read (4 bytes read)
[   12.030040] random: ubusd: uninitialized urandom read (4 bytes read)
[   12.036858] random: ubusd: uninitialized urandom read (4 bytes read)
[   12.044089] procd: - init -
[   12.673805] kmodloader: loading kernel modules from /etc/modules.d/*
[   12.698438] tun: Universal TUN/TAP device driver, 1.6
[   12.769019] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   12.781101] Loading modules backported from Linux version v4.19.137-0-gc076c79e03c6
[   12.788894] Backport generated by backports.git v4.19.137-1-0-g60c3a249
[   12.797465] ip_tables: (C) 2000-2006 Netfilter Core Team
[   12.810000] nf_conntrack version 0.5.0 (4096 buckets, 16384 max)
[   12.858521] xt_time: kernel timezone is -0000
[   12.878266] urngd: v1.0.2 started.
[   12.926042] bus=0x2, slot = 0x1, irq=0xff
[   12.930302] mt7603e 0000:02:00.0: ASIC revision: 76030010
[   13.035931] random: crng init done
[   13.039340] random: 7 urandom warning(s) missed due to ratelimiting
[   13.964227] mt7603e 0000:02:00.0: Firmware Version: ap_pcie
[   13.969802] mt7603e 0000:02:00.0: Build Time: 20160107100755
[   14.012831] mt7603e 0000:02:00.0: firmware init done
[   14.184793] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   14.195980] bus=0x1, slot = 0x0, irq=0xff
[   14.200317] mt76x2e 0000:01:00.0: ASIC revision: 76120044
[   14.917523] mt76x2e 0000:01:00.0: ROM patch build: 20141115060606a
[   14.927193] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
[   14.932675] mt76x2e 0000:01:00.0: Build: 1
[   14.936822] mt76x2e 0000:01:00.0: Build Time: 201507311614____
[   14.962822] mt76x2e 0000:01:00.0: Firmware running!
[   14.970663] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[   14.978439] PPP generic driver version 2.4.2
[   14.984264] NET: Registered protocol family 24
[   14.991703] kmodloader: done loading kernel modules from /etc/modules.d/*
[   21.464452] mtk_soc_eth 1e100000.ethernet: PPE started
[   21.479376] br-lan: port 1(eth0.1) entered blocking state
[   21.484904] br-lan: port 1(eth0.1) entered disabled state
[   21.491025] device eth0.1 entered promiscuous mode
[   21.496242] device eth0 entered promiscuous mode
[   21.505315] br-lan: port 1(eth0.1) entered blocking state
[   21.510763] br-lan: port 1(eth0.1) entered forwarding state
[   21.516945] IPv6: ADDRCONF(NETDEV_UP): br-lan: link is not ready
[   22.473642] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[   24.665385] IPv6: ADDRCONF(NETDEV_UP): wlan1: link is not ready
[   24.678142] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   24.684194] br-lan: port 2(wlan1) entered blocking state
[   24.689490] br-lan: port 2(wlan1) entered disabled state
[   24.695323] device wlan1 entered promiscuous mode
[   24.700441] br-lan: port 3(wlan0) entered blocking state
[   24.705809] br-lan: port 3(wlan0) entered disabled state
[   24.711565] device wlan0 entered promiscuous mode
[   24.716595] br-lan: port 3(wlan0) entered blocking state
[   24.721900] br-lan: port 3(wlan0) entered forwarding state
[   25.397303] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready


What is happening?

And it just happened again, only 30 minutes after rebooting:

[   25.397303] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   91.345155] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[   91.352000] br-lan: port 2(wlan1) entered blocking state
[   91.357330] br-lan: port 2(wlan1) entered forwarding state
[ 1148.038731] ------------[ cut here ]------------
[ 1148.043368] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:320 0x8038c0d0
[ 1148.050418] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
[ 1148.057372] Modules linked in: pppoe ppp_async pppox ppp_generic nf_conntrack_ipv6 mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 tun mmc_block mtk_sd mmc_core leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd gpio_button_hotplug usbcore nls_base
[ 1148.128378]  usb_common
[ 1148.130861] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.195 #0
[ 1148.136927] Stack : 00000000 00000000 00000000 8fe20240 00000000 00000000 00000000 00000000
[ 1148.145262]         00000000 00000000 00000000 00000000 00000000 00000001 8fc0bd60 ac07f5da
[ 1148.153599]         8fc0bdf8 00000000 00000000 000058f0 00000038 8049c858 00000007 00000000
[ 1148.161935]         00000000 80550000 0001ff2d 00000000 8fc0bd40 00000000 00000000 8050aed8
[ 1148.170273]         8038c0d0 00000140 00000001 8fe20240 00000000 802ad210 00000004 806b0004
[ 1148.178605]         ...
[ 1148.181041] Call Trace:
[ 1148.181093] [<8049c858>] 0x8049c858
[ 1148.186991] [<8038c0d0>] 0x8038c0d0
[ 1148.190468] [<802ad210>] 0x802ad210
[ 1148.193945] [<8000c1a0>] 0x8000c1a0
[ 1148.197413] [<8000c1a8>] 0x8000c1a8
[ 1148.200886] [<804856b4>] 0x804856b4
[ 1148.204356] [<80071ab0>] 0x80071ab0
[ 1148.207841] [<8002e608>] 0x8002e608
[ 1148.211320] [<8038c0d0>] 0x8038c0d0
[ 1148.214799] [<8002e690>] 0x8002e690
[ 1148.218274] [<800550e8>] 0x800550e8
[ 1148.221752] [<8038c0d0>] 0x8038c0d0
[ 1148.225223] [<80099a00>] 0x80099a00
[ 1148.228707] [<8038bf24>] 0x8038bf24
[ 1148.232182] [<80088568>] 0x80088568
[ 1148.235652] [<8005f214>] 0x8005f214
[ 1148.239130] [<80088824>] 0x80088824
[ 1148.242601] [<80079158>] 0x80079158
[ 1148.246075] [<804a3658>] 0x804a3658
[ 1148.249553] [<80032fb4>] 0x80032fb4
[ 1148.253028] [<8025a5f0>] 0x8025a5f0
[ 1148.256502] [<80007488>] 0x80007488
[ 1148.259974] 
[ 1148.261550] ---[ end trace 84f2d55f19c351b0 ]---
[ 1148.266172] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
[ 1148.272360] mtk_soc_eth 1e100000.ethernet eth0: dma_cfg:80000067
[ 1148.278361] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, base=0ec00000, max=0, ctx=2486, dtx=2090, fdx=2090, next=2486
[ 1148.289227] mtk_soc_eth 1e100000.ethernet eth0: rx_ring=0, base=0e1a0000, max=0, calc=3886, drx=3887
[ 1148.702904] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x5f60000c, 0x10c = 0x80818
[ 1148.716724] mtk_soc_eth 1e100000.ethernet: PPE started
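
For anyone comparing several of these dumps, the `tx_ring=` line can be pulled apart with a few lines of Python. This is a rough sketch, not something taken from the driver: the names `parse_tx_ring` and `backlog` are hypothetical, and reading `ctx` as the CPU-side index and `dtx` as the DMA-side index is my assumption, not something confirmed in this thread. The ring size is not printed in the log, so the difference below ignores wrap-around.

```python
import re

# Field names follow the driver's log line. Their meaning (ctx = CPU index,
# dtx = DMA index) is an assumption made for this sketch.
TX_RING_RE = re.compile(
    r"tx_ring=(\d+), base=([0-9a-f]+), max=(\d+), "
    r"ctx=(\d+), dtx=(\d+), fdx=(\d+), next=(\d+)"
)


def parse_tx_ring(line):
    """Return the tx_ring fields as a dict, or None if the line has none."""
    match = TX_RING_RE.search(line)
    if match is None:
        return None
    ring, base, max_cnt, ctx, dtx, fdx, nxt = match.groups()
    return {
        "ring": int(ring),
        "base": int(base, 16),  # base is printed in hex without a 0x prefix
        "max": int(max_cnt),
        "ctx": int(ctx),
        "dtx": int(dtx),
        "fdx": int(fdx),
        "next": int(nxt),
    }


def backlog(state):
    """ctx - dtx, ignoring ring wrap-around (ring size is not in the log)."""
    return state["ctx"] - state["dtx"]


# Line copied from the dmesg output above.
SAMPLE_LINE = (
    "[ 1148.278361] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, "
    "base=0ec00000, max=0, ctx=2486, dtx=2090, fdx=2090, next=2486"
)

if __name__ == "__main__":
    state = parse_tx_ring(SAMPLE_LINE)
    print("ctx - dtx =", backlog(state))
```

A growing gap between `ctx` and `dtx` across dumps would be consistent with the hardware no longer draining the ring, but that interpretation rests on the assumption stated above.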

This is a known issue with the mt7530 switch included in mt7621 devices. See this 332-post topic about the issue: Mtk_soc_eth watchdog timeout after r11573

See this technical topic about a potential solution to this issue: Mt7621 / mt7530 programming: Disabling Flow Control on all ports

I saw your post in the issue tracker of the mt76 repository. Please don't use that tracker for this issue: mt76 is a WiFi driver and has nothing to do with this problem, which is switch related.

However, I did notice you said that you were stable on 19.07.3 and the instability was introduced for you with 19.07.4. If that's the case, then this commit might be your cause of instability: https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=7ac454014a11347887323a131415ac7032d53546

It was later reverted, but this happened after 19.07.4 was already released: https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=34a96529041d4e9502c490c66f8af0154187c6d2

You will either have to wait for 19.07.5 for this change to be reverted OR you will have to compile from the 19.07 branch yourself. I don't recommend going back to 19.07.3, since 19.07.4 does have a few security related patches.

Actually, no need to compile yourself. The buildbot also has snapshot images from the 19.07 branch. You should be able to fetch one for your device here: https://downloads.openwrt.org/releases/19.07-SNAPSHOT/targets/

Let me know if that solves your issue :slight_smile:

I recall seeing over 100 days of uptime on 19.07.3. I upgraded to 19.07.4 a couple of weeks ago and seem to be hitting this issue every other day.

I'm more inclined to move back to 19.07.3, just to double-check that it goes back to normal. I don't want to be faced with yet another problem while on a snapshot.

Thanks for the quick response and advice Mushoz!

Having said that, when is 19.07.5 expected to be released?

Those snapshots are made from the 19.07 stable branch, and therefore should be stable as well. You can see the commits in that particular branch here: https://git.openwrt.org/?p=openwrt/openwrt.git;a=shortlog;h=refs/heads/openwrt-19.07

As you can see, there are very few commits after 19.07.4, and all changes are bug fixes, as is normal for a stable branch. While it's not impossible, bug fixes are much less likely to introduce regressions than new features are. You're much better off going with such a snapshot, which is not very likely to contain regressions, than going with the older 19.07.3, which contains known issues.

The only real issue with these snapshot builds is that once they get a kernel update, packages will be rebuilt for the newer kernel version. Once that happens, you are unable to install new kernel modules via opkg, since your router will still be on the older kernel. For tagged releases such as 19.07.3 and 19.07.4 this isn't an issue, since they have a separate repository with all the packages for that particular release.

However, this issue is easily avoided by simply installing the packages you need right after you install your snapshot. In case you do end up needing another package down the road, you simply flash a new snapshot image from the 19.07 branch and install the package as needed.
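
The kernel mismatch described above can be illustrated with a very small check. This is a simplified sketch under stated assumptions: `kernel_version` and `kmods_installable` are hypothetical helper names, the repository kernel version `4.14.209` is an invented example value, and comparing plain version strings is only an approximation of whatever check opkg actually performs. Matching versions should be read as necessary, not sufficient.

```python
import re


def kernel_version(banner):
    """Extract 'X.Y.Z' from a 'Linux version X.Y.Z ...' banner, or None."""
    match = re.search(r"Linux version (\d+\.\d+\.\d+)", banner)
    return match.group(1) if match else None


def kmods_installable(running, repo):
    """Approximation: kernel modules only fit if both versions are identical."""
    return running is not None and running == repo


# First line of the dmesg output posted earlier in this thread.
BANNER = (
    "[    0.000000] Linux version 4.14.195 (builder@buildhost) "
    "(gcc version 7.5.0 (OpenWrt GCC 7.5.0 r11208-ce6496d796)) "
    "#0 SMP Sun Sep 6 16:19:39 2020"
)

# Hypothetical kernel version the snapshot repository has moved on to.
REPO_KERNEL = "4.14.209"

if __name__ == "__main__":
    running = kernel_version(BANNER)
    if kmods_installable(running, REPO_KERNEL):
        print(f"kernel {running} matches the repository")
    else:
        print(f"running {running}, repository built for {REPO_KERNEL}: reflash first")
```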

There's no ETA for 19.07.5. Newer releases are usually tagged on a per-need basis. Usually after an important security fix. New releases also happen every now and then after there hasn't been a release for a while, just to get everything updated, but this has no hard rules/ETAs. But 19.07.4 isn't that old yet, so I wouldn't expect 19.07.5 anytime soon. Unless a security issue is found that needs patching of course.

Mushoz, I tried an older version of OpenWrt and the issue is still there. I don't understand it, because I previously had 100 days of uptime on 19.07.3, and now, after upgrading and then downgrading back to the same version, it hardly stays up for a day. The only difference seems to be that I’m now heavily using OpenVPN, which I wasn't before. I don’t think this is related, but I’m out of ideas here.

I saw your other thread with a patch to the problem. Are you still convinced that it works? And if yes, would you be able to provide an actual diff to be applied to 19.07.4 so I can try and recompile?

19.07.5 is out now and seems to include a fix for this issue.

I just tried 19.07.5 and it does not fix it for me. It took around 12 hours before it gave me a kernel panic and Ethernet stopped working. Then, after I rebooted it, it barely stayed up for 2-3 hours.

The only version that seems to be working for me is Snapshot. I've been using this one - OpenWrt SNAPSHOT r15165-66d12ce667, kernel: 5.4.81

You are right. This issue is not fixed in 19.07.5.

There is an open case for this at https://bugs.openwrt.org/index.php?do=details&task_id=2628 , but none of the developers have responded yet.

I suggest you try the snapshot; it works for me:

Yes, thanks for the advice, but that version uses the new DSA driver, possibly with other issues.
More importantly, DSA changes the configuration significantly. We use our own scripts connected to an information system, and redesigning them for the new driver would be painful.

:frowning: