Mtk_soc_eth watchdog timeout after r11573

So peeps, been looking into this issue for a while - based on feedback from users with this issue, various routers based on the mt7621 (im dealing mostly with Xiaomi R4AG model).

It appears the TX packet handling is overrunning itself in the DMA space, getting out of sync and causing the kernel to crash.

In my repo i've done this patch - https://gitlab.com/db260179/xiaomi-m4a/-/commit/cb74ad8e8d5e90a5a35b74c1a49e0b79871db947

I was able to reliably trigger a port disconnect, but not a kernel trigger. So it makes me believe there is two possible issues:

  1. Certain revisions of the mt7621 have a bug that causes a hard lock
  2. Later built routers have had voltage or silicon fixes based on this issue.

Also the oem driver ralink ethernet and possible 5.x version of the ethernet driver is managing the DMA better and thus not triggering this issue.

So if possible, you can just use my git repo and build the model of router you have, there is nothing specific really to the xiaomi R4AG as they all share the same mt7621 chip

If you could report back (gsw: chip rev) and do the following:

When testing can you do a:

Run iperf3 -s in the ssh terminal on the router

Then on your pc/laptop that has iperf3 tool:

iperf3 -c 192.168.1.1 -u -t 100 -b 1000M -P 2

Feedback packet loss and interruptions.

Also from the dmesg:

grep "gsw: chip rev"

This issue has been going for some years, so be interested in getting it fixed.

1 Like

To be honest, since I am on kernel 5.4 I have not seen this issue. Probably the only thing keeps people on 4.14 is the HW offloading, that is painfully missing from the 5.4 kernel. Question is, how much effort does it worth to fix this on 4.14.

It seems like this issue had risen after v19.07.3, apart from a few kernel updates and disabling flow control nothing else much changed. And it really bugs me that its causing problems.

Plus using the snapshot openwrt is not a stable way of running, and really we only have v19.07 to go by for the moment.

Well, I am running the snapshot (usually updating every couple weeks), and it is stable for me.

Non the less, if you think (or someone confirms) this issue is present on the latest kernel (5.4 atm), maybe it worth to commit a patch that can be applied to the latest snapshot as well. I am happy to try it, though thankfully after 3+ years I will soon leave the Mediatek boat. It was fun and all, but I also learned my lesson.

What makes you think that master branch/snapshots is not stable? I'm using it because in 19.07.x with the swconfig driver definitely there is no solution. This master branch is much more recent, with many improvements that were never introduced on 19.07. All working well.

I also thought that the only "stable" version were the stable branches. Later I learned that any branch can be unstable, that everything depends on the commits introduced, that they are not always for the best. For me, the only meaning of "stable" in the stable branches of OpenWRT, is that the kernel modules can always be installed from the official repository, unlike snapshots which must be the last.

I have built a master branch with its own repository in case someone wants to definitely try the new DSA driver, and does not dare because they do not know how to compile, do not include luci or because of the need for a functional repo without constantly updating the firmware.

The more we try the master branch, the sooner we will get rid of doubts as to whether the problem really ended with the new kernel, leaving behind a version that no longer contributes anything. Hardware Flow Offloading is not a good excuse to stick to 19.07, as it never worked properly (and it does not yet). DSA with Software Offloading and packet steering can reach 800Mbps or more.

Its more of a 'less moving target' stable. Yes your correct, over the years it has started to become too unstable. I found v18.06 to be the most stable.

It would be nice to have a luci gui for the DSA driver, but thats more of a usability than functionality aspect.

Well i've done my little bit for the mt7621 and openwrt.

There have been recent patches to fix the hardware offloading, and now seems to work for my devices?.

Unfortunately you and others have devices with some bug in the SOC, that gets triggered by the v19.07 mtk ethernet driver.

No luci DSA gui is needed. Tagged VLAN on ports is configured with only the label interface in dot notation, like 'lan1.4'. This for example tags VLAN 4 on the LAN1 port. You only need to configure the interface adapter name in uci/luci or network config file. It is no longer necessary to enter any command to use VLAN on eth ports.

@apocalypse this project does not have massive funding. Meaning it makes sense that old "releases" are unstable. Not only that, figuring out this issue requires understanding of the kernel's networking stack and driver infrastructure. I doubt anyone would have the patience for such a thing.

In my experience, LEDE 17.01 was the most stable for mt7621. kernel 4.4 did not change enough to cause problems with the out of tree drivers used for mt7621.

@db260179 hardware bug is probably correct. And yes, the reason people want to use 19.07 is because of HWNAT. Adding HWNAT support for the upstream driver is apparently trivial but no one has done it yet. swconfig also happens to faster than DSA but stability is more important than speed.

Out of tree driver has no future. I pushed to get the upstream driver in the tree because of issues like this. There are more developers on kernel.org than openwrt.org.

master is quite good actually. non upstream PCI, ethernet, swconfig, GPIO, and SPI drivers were removed. I'm still pushing to get the mmc driver removed but it seems some hardware doesn't work with the upstream one.

Off-topic: there's a crypto driver in development for mt7621: https://github.com/vschagen/mtk-eip93

Only really useful for people running dm-crypt.

1 Like

@neheb Even on LEDE 17.01 i had these problems, also on 18.06. Eth0: transmit timed out on ER-X with LEDE 17.01.7

For now in master it is running 5 days without any problem. What do you mean that swconfig driver is faster?

I am running my own build based on master with kernel patches to overclock the CPU and more magic that I will not discuss in detail here. However, I see that my WG3526 works very "stable" (in the "does not crash" sense of the word). In fact, while there are several devices randomly crashing since the kernel patch 82, this one does keep running uninterrupted for the whole week (my releases are weekly, so I can't say if it will work for a month, for example).

Take a look at my kernel log:

[    0.000000] Linux version 5.4.82 (notengobattery@gmail.com) (gcc version 10.2.0 (OpenWrt GCC 10.2.0 r0+15161-36449b38ae)) #0 SMP PREEMPT Sat Dec 12 22:18:12 2020
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is Zbtlink ZBT-WG3526 (Glucose)
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000]   HighMem  [mem 0x0000000010000000-0x0000000023ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000001bffffff]
[    0.000000]   node   0: [mem 0x0000000020000000-0x0000000023ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000023ffffff]
[    0.000000] On node 0 totalpages: 131072
[    0.000000]   Normal zone: 576 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 65536 pages, LIFO batch:15
[    0.000000]   HighMem zone: 65536 pages, LIFO batch:15
[    0.000000] percpu: Embedded 14 pages/cpu s26992 r8192 d22160 u57344
[    0.000000] pcpu-alloc: s26992 r8192 d22160 u57344 alloc=14*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 130496
[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2 rcutree.kthread_prio=49
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] Writing ErrCtl register=000213a0
[    0.000000] Readback ErrCtl register=000213a0
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 504684K/524288K available (6365K kernel code, 204K rwdata, 1280K rodata, 1228K init, 233K bss, 19604K reserved, 0K cma-reserved, 262144K highmem)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] 	Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] random: get_random_bytes called from start_kernel+0x34c/0x570 with crng_init=0
[    0.000000] CPU Overclock PLL: 0xc2005bb2
[    0.000000] CPU Clock: 1200MHz
[    0.000000] clocksource: GIC: mask: 0xffffffffffffffff max_cycles: 0x114c1bbbeec, max_idle_ns: 440795229572 ns
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 3185434077 ns
[    0.000008] sched_clock: 32 bits at 600MHz, resolution 1ns, wraps every 3579139583ns
[    0.007813] Calibrating delay loop... 798.72 BogoMIPS (lpj=1597440)
[    0.038001] pid_max: default: 32768 minimum: 301
[    0.042753] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.049969] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.060164] rcu: Hierarchical SRCU implementation.
[    0.065606] smp: Bringing up secondary CPUs ...
[    2.948034] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    2.948043] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    2.948052] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    2.948141] CPU1 revision is: 0001992f (MIPS 1004Kc)
[    0.094350] Synchronize counters for CPU 1: done.
[    3.009179] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    3.009188] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    3.009195] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    3.009242] CPU2 revision is: 0001992f (MIPS 1004Kc)
[    0.155032] Synchronize counters for CPU 2: done.
[    3.062594] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    3.062601] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    3.062608] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    3.062656] CPU3 revision is: 0001992f (MIPS 1004Kc)
[    0.208891] Synchronize counters for CPU 3: done.
[    0.238739] smp: Brought up 1 node, 4 CPUs
[    0.246512] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.256253] futex hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.263328] pinctrl core: initialized pinctrl subsystem
[    0.270131] NET: Registered protocol family 16
[    0.285533] FPU Affinity set after 6384 emulations
[    0.646613] printk: console [pstore-1] enabled
[    0.651344] pstore: Registered ramoops as persistent store backend
[    0.657631] ramoops: using 0x480000@0x7b80000, ecc: 16
[    0.677395] random: fast init done
[    0.683326] workqueue: max_active 576 requested for napi_workq is out of range, clamping between 1 and 512
[    0.694760] clocksource: Switched to clocksource GIC
[    0.701488] NET: Registered protocol family 2
[    0.706794] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 6144 bytes, linear)
[    0.715293] TCP established hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.722984] TCP bind hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.730225] TCP: Hash tables configured (established 2048 bind 2048)
[    0.736816] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.743388] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.750950] NET: Registered protocol family 1
[    0.755374] PCI: CLS 0 bytes, default 32
[    0.854714] 4 CPUs re-calibrate udelay(lpj = 1593344)
[    0.861361] workingset: timestamp_bits=14 max_order=17 bucket_order=3
[    0.878392] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.884271] squashfs: patched 4.0 (2020/11/10) Oever Gonzalez
[    0.890101] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.901361] bounce: pool size: 64 pages
[    0.907281] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.913205] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.919189] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.925583] Serial: 8250/16550 driver, 3 ports, IRQ sharing disabled
[    0.933281] printk: console [ttyS0] disabled
[    0.937677] 1e000c00.uartlite: ttyS0 at MMIO 0x1e000c00 (irq = 19, base_baud = 3125000) is a 16550A
[    0.946884] printk: console [ttyS0] enabled
[    0.955244] printk: bootconsole [early0] disabled
[    0.967349] spi-mt7621 1e000b00.spi: sys_freq: 300000000
[    0.983458] spi-nor spi0.0: w25q128 (16384 Kbytes)
[    0.988490] 4 fixed-partitions partitions found on MTD device spi0.0
[    0.994947] Creating 4 MTD partitions on "spi0.0":
[    0.999928] 0x000000000000-0x000000030000 : "u-boot"
[    1.006190] 0x000000030000-0x000000040000 : "u-boot-env"
[    1.012859] 0x000000040000-0x000000050000 : "factory"
[    1.019307] 0x000000050000-0x000001000000 : "firmware"
[    1.025945] 2 uimage-fw partitions found on MTD device firmware
[    1.032136] Creating 2 MTD partitions on "firmware":
[    1.037195] 0x000000000000-0x00000027aea9 : "kernel"
[    1.043511] 0x00000027aea9-0x000000fb0000 : "rootfs"
[    1.049800] mtd: device 5 (rootfs) set to be root filesystem
[    1.055854] 1 squashfs-split partitions found on MTD device rootfs
[    1.062127] 0x000000c30000-0x000000fb0000 : "rootfs_data"
[    1.070085] libphy: Fixed MDIO Bus: probed
[    1.101118] libphy: mdio: probed
[    1.104684] mt7530 mdio-bus:1f: MT7530 adapts as multi-chip module
[    1.114898] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 22
[    1.124629] i2c-mt7621 1e000900.i2c: clock 100 kHz
[    1.131069] mt7621-pci 1e140000.pcie: Parsing DT failed
[    1.139141] NET: Registered protocol family 10
[    1.145286] Segment Routing with IPv6
[    1.149168] NET: Registered protocol family 17
[    1.153922] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    1.167296] 8021q: 802.1Q VLAN Support v1.8
[    1.171911] registered taskstats version 1
[    1.176591] zswap: default zpool zbud not available
[    1.181567] zswap: pool creation failed
[    1.189206] pstore: Using crash dump compression: zstd
[    1.195348] mt7530 mdio-bus:1f: MT7530 adapts as multi-chip module
[    1.214333] libphy: dsa slave smi: probed
[    1.219041] mt7530 mdio-bus:1f lan1 (uninitialized): PHY [dsa-0.0:00] driver [Generic PHY]
[    1.228782] mt7530 mdio-bus:1f lan2 (uninitialized): PHY [dsa-0.0:01] driver [Generic PHY]
[    1.238668] mt7530 mdio-bus:1f lan3 (uninitialized): PHY [dsa-0.0:02] driver [Generic PHY]
[    1.248509] mt7530 mdio-bus:1f lan4 (uninitialized): PHY [dsa-0.0:03] driver [Generic PHY]
[    1.258451] mt7530 mdio-bus:1f wan (uninitialized): PHY [dsa-0.0:04] driver [Generic PHY]
[    1.268228] mt7530 mdio-bus:1f: configuring for fixed/rgmii link mode
[    1.279702] DSA: tree 0 setup
[    1.283141] rt2880-pinmux pinctrl: pcie is already enabled
[    1.288818] mt7621-pci 1e140000.pcie: Error applying setting, reverse things back
[    1.296559] mt7621-pci-phy 1e149000.pcie-phy: PHY for 0xbe149000 (dual port = 1)
[    1.304300] mt7621-pci-phy 1e14a000.pcie-phy: PHY for 0xbe14a000 (dual port = 0)
[    1.412000] mt7621-pci-phy 1e149000.pcie-phy: Xtal is 40MHz
[    1.417663] mt7621-pci-phy 1e14a000.pcie-phy: Xtal is 40MHz
[    1.523365] mt7621-pci 1e140000.pcie: pcie2 no card, disable it (RST & CLK)
[    1.530401] mt7621-pci 1e140000.pcie: PCIE0 enabled
[    1.535443] mt7621-pci 1e140000.pcie: PCIE1 enabled
[    1.540407] mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000, mask/settings: 0xf0000002
[    1.550022] mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
[    1.556512] pci_bus 0000:00: root bus resource [io  0x1e160000-0x1e16ffff]
[    1.563577] pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
[    1.570615] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.576251] pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
[    1.582450] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    1.588825] pci 0000:00:00.0: reg 0x14: [mem 0x60400000-0x6040ffff]
[    1.595332] pci 0000:00:00.0: supports D1
[    1.599431] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[    1.605707] pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
[    1.611883] pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    1.618318] pci 0000:00:01.0: reg 0x14: [mem 0x60410000-0x6041ffff]
[    1.624732] pci 0000:00:01.0: supports D1
[    1.628913] pci 0000:00:01.0: PME# supported from D0 D1 D3hot
[    1.635919] pci 0000:01:00.0: [14c3:7603] type 00 class 0x028000
[    1.642167] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff]
[    1.648689] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    1.655969] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
[    1.661308] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
[    1.667607] pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
[    1.674475] pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
[    1.681881] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.688871] pci 0000:02:00.0: [14c3:7662] type 00 class 0x028000
[    1.695060] pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
[    1.702046] pci 0000:02:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[    1.708941] pci 0000:02:00.0: PME# supported from D0 D3hot D3cold
[    1.716225] pci 0000:00:01.0: PCI bridge to [bus 02-ff]
[    1.721559] pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
[    1.727856] pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
[    1.734832] pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
[    1.742136] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.748978] pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
[    1.755692] pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
[    1.762840] pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
[    1.769613] pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
[    1.776676] pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
[    1.783647] pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
[    1.790948] pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
[    1.797895] pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
[    1.805277] pci 0000:00:00.0: BAR 1: assigned [mem 0x60400000-0x6040ffff]
[    1.812149] pci 0000:00:01.0: BAR 1: assigned [mem 0x60410000-0x6041ffff]
[    1.819101] pci 0000:00:00.0: BAR 7: assigned [io  0x1e160000-0x1e160fff]
[    1.825965] pci 0000:00:01.0: BAR 7: assigned [io  0x1e161000-0x1e161fff]
[    1.832920] pci 0000:01:00.0: BAR 0: assigned [mem 0x60000000-0x600fffff]
[    1.839872] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.844927] pci 0000:00:00.0:   bridge window [io  0x1e160000-0x1e160fff]
[    1.851876] pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
[    1.858745] pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
[    1.866130] pci 0000:02:00.0: BAR 0: assigned [mem 0x60200000-0x602fffff 64bit]
[    1.873608] pci 0000:02:00.0: BAR 6: assigned [mem 0x60300000-0x6030ffff pref]
[    1.880905] pci 0000:00:01.0: PCI bridge to [bus 02]
[    1.886036] pci 0000:00:01.0:   bridge window [io  0x1e161000-0x1e161fff]
[    1.892901] pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
[    1.899854] pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
[    1.907696] mt7530 mdio-bus:1f: Link is Up - 1Gbps/Full - flow control off
[    1.915237] hctosys: unable to open rtc device (rtc0)
[    1.939103] VFS: Mounted root (squashfs filesystem) readonly on device 31:5.
[    1.951085] Freeing unused kernel memory: 1228K
[    1.955913] This architecture does not have kernel memory protection.
[    1.962433] Run /sbin/init as init process
[    3.726884] init: Console is alive
[    3.730608] init: - watchdog -
[    6.367227] kmodloader: loading kernel modules from /etc/modules-boot.d/*
[    6.961009] io scheduler bfq registered
[    6.966040] io scheduler kyber registered
[    6.971110] io scheduler mq-deadline registered
[    6.986919] usbcore: registered new interface driver usbfs
[    6.992654] usbcore: registered new interface driver hub
[    6.998393] usbcore: registered new device driver usb
[    7.051999] SCSI subsystem initialized
[    7.067688] libata version 3.00 loaded.
[    7.078308] xhci-mtk 1e1c0000.xhci: 1e1c0000.xhci supply vbus not found, using dummy regulator
[    7.087396] xhci-mtk 1e1c0000.xhci: 1e1c0000.xhci supply vusb33 not found, using dummy regulator
[    7.096572] xhci-mtk 1e1c0000.xhci: xHCI Host Controller
[    7.102013] xhci-mtk 1e1c0000.xhci: new USB bus registered, assigned bus number 1
[    7.114911] xhci-mtk 1e1c0000.xhci: hcc params 0x01401198 hci version 0x96 quirks 0x0000000000210010
[    7.124302] xhci-mtk 1e1c0000.xhci: irq 21, io mem 0x1e1c0000
[    7.131100] hub 1-0:1.0: USB hub found
[    7.135161] hub 1-0:1.0: 2 ports detected
[    7.139730] xhci-mtk 1e1c0000.xhci: xHCI Host Controller
[    7.145311] xhci-mtk 1e1c0000.xhci: new USB bus registered, assigned bus number 2
[    7.152915] xhci-mtk 1e1c0000.xhci: Host supports USB 3.0 SuperSpeed
[    7.159554] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[    7.168375] hub 2-0:1.0: USB hub found
[    7.172338] hub 2-0:1.0: 1 port detected
[    7.218015] usbcore: registered new interface driver usb-storage
[    7.228473] usbcore: registered new interface driver uas
[    7.234291] kmodloader: done loading kernel modules from /etc/modules-boot.d/*
[    7.251101] init: - preinit -
[    7.265526] mmc0: new high speed SDXC card at address 0007
[    7.277556] mmcblk0: mmc0:0007 SD64G 57.9 GiB 
[    7.304366]  mmcblk0: p1
[    7.514920] usb 2-1: new SuperSpeed Gen 1 USB device number 2 using xhci-mtk
[    7.545400] hub 2-1:1.0: USB hub found
[    7.549900] hub 2-1:1.0: 4 ports detected
[    7.783566] random: procd: uninitialized urandom read (4 bytes read)
[    7.798800] usb 1-1: new high-speed USB device number 2 using xhci-mtk
[    7.960383] hub 1-1:1.0: USB hub found
[    7.964892] hub 1-1:1.0: 4 ports detected
[    8.095639] extroot-attach: running bootfs attach pre-init script...
[    8.246792] usb 1-2: new high-speed USB device number 3 using xhci-mtk
[    9.171137] usb 2-1.4: new SuperSpeed Gen 1 USB device number 3 using xhci-mtk
[    9.289717] jffs2: notice: (632) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[    9.307113] extroot-attach: successfully mounted the boot file system
[    9.855817] extroot-attach: successfully mounted pstore
[    9.897921] random: crng init done
[   10.396878] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
[   10.405446] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   10.414123] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   11.650879] mt7530 mdio-bus:1f lan1: configuring for phy/gmii link mode
[   11.658315] 8021q: adding VLAN 0 to HW filter on device lan1
[   16.054910] mount_root: loading kmods from internal overlay
[   16.092402] kmodloader: loading kernel modules from //etc/modules-boot.d/*
[   16.102713] kmodloader: done loading kernel modules from //etc/modules-boot.d/*
[   17.248117] jffs2: notice: (723) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   17.264450] block: attempting to load /tmp/jffs_cfg/upper/etc/config/fstab
[   17.275116] block: check_filesystem: /usr/sbin/fsck.f2fs not found
[   17.883121] F2FS-fs (mmcblk0p1): Mounted with checkpoint version = 531ea93c
[   17.903317] mount_root: switched to extroot
[   17.912420] urandom-seed: Seeding with /etc/urandom.seed
[   17.940651] urandom-seed: Seeding with /etc/urandom.log
[   17.974143] procd: - early -
[   17.977293] procd: - watchdog -
[   18.695006] procd: - watchdog -
[   18.701593] procd: - ubus -
[   19.060681] procd: - init -
[   21.296465] I/O scheduler: non-rotational device 'mtdblock4' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   21.361686] urngd: v1.0.2 started.
[   21.430397] kmodloader: loading kernel modules from /etc/modules.d/*
[   21.448866] zram: Added device: zram0
[   21.457166] zbud: loaded
[   21.542825] I/O scheduler: non-rotational device 'mtdblock2' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   22.123664] loop: module loaded
[   22.389883] Mirror/redirect action on
[   22.406626] u32 classifier
[   22.409565]     input device check on
[   22.413440]     Actions configured
[   22.414122] I/O scheduler: non-rotational device 'mtdblock0' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   22.434270] usbcore: registered new interface driver cdc_wdm
[   22.453423] Loading modules backported from Linux version v5.8-0-gbcf876870b95
[   22.460892] Backport generated by backports.git v5.8-1-0-g79400d9e
[   22.520328] usbcore: registered new interface driver ums-alauda
[   22.528261] usbcore: registered new interface driver ums-cypress
[   22.536255] usbcore: registered new interface driver ums-datafab
[   22.544109] usbcore: registered new interface driver ums-freecom
[   22.551983] usbcore: registered new interface driver ums-isd200
[   22.559812] usbcore: registered new interface driver ums-jumpshot
[   22.567572] usbcore: registered new interface driver ums-karma
[   22.575540] usbcore: registered new interface driver ums-sddr09
[   22.583283] usbcore: registered new interface driver ums-sddr55
[   22.591402] usbcore: registered new interface driver ums-usbat
[   22.605481] usbcore: registered new interface driver usbserial_generic
[   22.612511] usbserial: USB Serial support registered for generic
[   22.648742] xt_time: kernel timezone is -0000
[   22.654388] I/O scheduler: non-rotational device 'mtdblock5' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   22.657236] usbcore: registered new interface driver asix
[   22.811049] I/O scheduler: non-rotational device 'mtdblock3' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   22.983170] I/O scheduler: non-rotational device 'mtdblock1' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   23.007279] ax88179_178a 2-1.4:1.0 eth1: register 'ax88179_178a' at usb-1e1c0000.xhci-1.4, ASIX AX88179 USB 3.0 Gigabit Ethernet, 00:05:1b:d4:6a:b2
[   23.033939] usbcore: registered new interface driver ax88179_178a
[   23.129778] mt7621-pci 1e140000.pcie: bus=1 slot=0 irq=24
[   23.135602] pci 0000:00:00.0: enabling device (0006 -> 0007)
[   23.141545] mt7603e 0000:01:00.0: enabling device (0000 -> 0002)
[   23.148241] mt7603e 0000:01:00.0: ASIC revision: 76030010
[   23.276920] I/O scheduler: non-rotational device 'mmcblk0' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   23.380341] I/O scheduler: non-rotational device 'mtdblock6' was '[none] bfq kyber mq-deadline ', now 'bfq [kyber] mq-deadline none'
[   23.572387] I/O scheduler: non-rotational device 'loop0' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   23.688593] I/O scheduler: non-rotational device 'loop1' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   23.772003] mt7603e 0000:01:00.0: Firmware Version: ap_pcie
[   23.789453] mt7603e 0000:01:00.0: Build Time: 20160107100755
[   23.822857] mt7603e 0000:01:00.0: firmware init done
[   23.837244] I/O scheduler: non-rotational device 'loop2' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   23.940464] I/O scheduler: non-rotational device 'loop3' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   24.008348] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   24.033785] mt7621-pci 1e140000.pcie: bus=2 slot=1 irq=25
[   24.039362] pci 0000:00:01.0: enabling device (0006 -> 0007)
[   24.045191] mt76x2e 0000:02:00.0: enabling device (0000 -> 0002)
[   24.051514] mt76x2e 0000:02:00.0: ASIC revision: 76120044
[   24.104568] I/O scheduler: non-rotational device 'loop4' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   24.212568] I/O scheduler: non-rotational device 'loop5' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   24.344753] I/O scheduler: non-rotational device 'loop6' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   24.455609] mt76x2e 0000:02:00.0: ROM patch build: 20141115060606a
[   24.477617] mt76x2e 0000:02:00.0: Firmware Version: 0.0.00
[   24.483476] mt76x2e 0000:02:00.0: Build: 1
[   24.487755] mt76x2e 0000:02:00.0: Build Time: 201507311614____
[   24.499346] I/O scheduler: non-rotational device 'loop7' was 'bfq kyber [mq-deadline] none', now 'bfq [kyber] mq-deadline none'
[   24.510847] mt76x2e 0000:02:00.0: Firmware running!
[   24.529501] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[   24.580276] PPP generic driver version 2.4.2
[   24.588036] NET: Registered protocol family 24
[   24.598909] qmi_wwan 1-2:1.4: cdc-wdm0: USB WDM device
[   24.607267] qmi_wwan 1-2:1.4 wwan0: register 'qmi_wwan' at usb-1e1c0000.xhci-2, WWAN/QMI device, 06:76:4e:7c:d9:8f
[   24.618287] usbcore: registered new interface driver qmi_wwan
[   24.651190] usbcore: registered new interface driver option
[   24.657237] usbserial: USB Serial support registered for GSM modem (1-port)
[   24.665095] option 1-2:1.0: GSM modem (1-port) converter detected
[   24.672080] usb 1-2: GSM modem (1-port) converter now attached to ttyUSB0
[   24.679734] option 1-2:1.1: GSM modem (1-port) converter detected
[   24.686668] usb 1-2: GSM modem (1-port) converter now attached to ttyUSB1
[   24.694369] option 1-2:1.2: GSM modem (1-port) converter detected
[   24.701392] usb 1-2: GSM modem (1-port) converter now attached to ttyUSB2
[   24.709032] option 1-2:1.3: GSM modem (1-port) converter detected
[   24.715859] usb 1-2: GSM modem (1-port) converter now attached to ttyUSB3
[   24.728914] kmodloader: done loading kernel modules from /etc/modules.d/*
[   29.793376] zram: setup backing device /dev/loop0
[   29.841985] zram0: detected capacity change from 0 to 1036120064
[   30.070937] Adding 1011832k swap on /dev/zram0.  Priority:1000 extents:1 across:1011832k SSDscFS
[   37.694635] mtk_soc_eth 1e100000.ethernet eth0: Link is Down
[   37.711848] mtk_soc_eth 1e100000.ethernet: PPE table busy
[   37.733966] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
[   37.742460] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   37.747580] mt7530 mdio-bus:1f lan1: configuring for phy/gmii link mode
[   37.759244] 8021q: adding VLAN 0 to HW filter on device lan1
[   37.769074] br-lan: port 1(lan1) entered blocking state
[   37.775871] br-lan: port 1(lan1) entered disabled state
[   37.783564] device lan1 entered promiscuous mode
[   37.788557] device eth0 entered promiscuous mode
[   37.847606] mt7530 mdio-bus:1f lan2: configuring for phy/gmii link mode
[   37.857204] 8021q: adding VLAN 0 to HW filter on device lan2
[   37.868053] br-lan: port 2(lan2) entered blocking state
[   37.874444] br-lan: port 2(lan2) entered disabled state
[   37.882972] device lan2 entered promiscuous mode
[   37.900174] mt7530 mdio-bus:1f lan3: configuring for phy/gmii link mode
[   37.908007] 8021q: adding VLAN 0 to HW filter on device lan3
[   37.918552] br-lan: port 3(lan3) entered blocking state
[   37.925568] br-lan: port 3(lan3) entered disabled state
[   37.935256] device lan3 entered promiscuous mode
[   37.955427] mt7530 mdio-bus:1f lan4: configuring for phy/gmii link mode
[   37.966797] 8021q: adding VLAN 0 to HW filter on device lan4
[   37.979537] br-lan: port 4(lan4) entered blocking state
[   37.985532] br-lan: port 4(lan4) entered disabled state
[   37.995444] device lan4 entered promiscuous mode
[   38.014263] mt7530 mdio-bus:1f wan: configuring for phy/gmii link mode
[   38.025862] 8021q: adding VLAN 0 to HW filter on device wan
[   38.036855] br-lan: port 5(wan) entered blocking state
[   38.046283] br-lan: port 5(wan) entered disabled state
[   38.055798] device wan entered promiscuous mode
[   38.252957] jffs2: notice: (4437) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   40.311800] br-lan: port 6(wlan0) entered blocking state
[   40.320117] br-lan: port 6(wlan0) entered disabled state
[   40.326348] device wlan0 entered promiscuous mode
[   40.781014] br-lan: port 7(wlan1) entered blocking state
[   40.786939] br-lan: port 7(wlan1) entered disabled state
[   40.800551] device wlan1 entered promiscuous mode
[   40.808057] br-lan: port 7(wlan1) entered blocking state
[   40.813651] br-lan: port 7(wlan1) entered forwarding state
[   40.828966] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[   41.079701] ax88179_178a 2-1.4:1.0 eth1: ax88179 - Link status is: 1
[   41.098007] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   41.107462] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[   41.115514] br-lan: port 5(wan) entered blocking state
[   41.121093] br-lan: port 5(wan) entered forwarding state
[   41.406478] br-lan: port 7(wlan1) entered disabled state
[   41.734488] IPv6: br-lan: IPv6 duplicate address fd00:192:168::1 used by e4:95:6e:47:1f:55 detected!
[   45.973245] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[   45.981094] br-lan: port 7(wlan1) entered blocking state
[   45.986681] br-lan: port 7(wlan1) entered forwarding state
[   47.823120] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   47.830318] br-lan: port 6(wlan0) entered blocking state
[   47.836080] br-lan: port 6(wlan0) entered forwarding state
[  121.118105] pstore: collected an compressed console-ramoops-0 from last boot
[  121.135233] pstore: collected an compressed pmsg-ramoops-0 from last boot
[85719.163811] ax88179_178a 2-1.4:1.0 eth1: ax88179 - Link status is: 1
[85722.107765] ax88179_178a 2-1.4:1.0 eth1: ax88179 - Link status is: 1
[85722.121596] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[155843.369462] mt7530 mdio-bus:1f wan: Link is Down
[155843.466576] br-lan: port 5(wan) entered disabled state
[155875.336121] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[155875.343896] br-lan: port 5(wan) entered blocking state
[155875.349327] br-lan: port 5(wan) entered forwarding state
[155878.407819] mt7530 mdio-bus:1f wan: Link is Down
[155878.412766] br-lan: port 5(wan) entered disabled state
[155890.695417] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[155890.703061] br-lan: port 5(wan) entered blocking state
[155890.708493] br-lan: port 5(wan) entered forwarding state
[178367.985725] mt7530 mdio-bus:1f wan: Link is Down
[178367.990987] br-lan: port 5(wan) entered disabled state
[178399.728591] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[178399.736352] br-lan: port 5(wan) entered blocking state
[178399.741699] br-lan: port 5(wan) entered forwarding state
[178408.944057] mt7530 mdio-bus:1f wan: Link is Down
[178408.948998] br-lan: port 5(wan) entered disabled state
[178413.040057] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[178413.047702] br-lan: port 5(wan) entered blocking state
[178413.053126] br-lan: port 5(wan) entered forwarding state

I agree with you, snapshots are really good as far as I can report.

1 Like

Unfortunately you and others have devices with some bug in the SOC, that gets triggered by the v19.07 mtk ethernet driver.

In my case on zbt-wg3526-16M 19.07.3 was rock solid (months of uptime). I skipped .4 due to known issues, now .05 locks up. Same config and usage pattern.

Oh that's unfortunate then.

See https://github.com/openwrt/openwrt/pull/2798#issuecomment-599154521 and subsequent comments.

I'm sure there are hacks that can be used to speed certain aspects of the driver up. https://gitlab.com/db260179/xiaomi-m4a/-/commit/4f1a3c5223e253c58085d82f0f01fc58575512f5 caught my eye actually.

Hmm now that I'm looking at the comments, https://github.com/openwrt/openwrt/pull/2798#issuecomment-600006326 seems interesting...

@db260179 trying your iperf example doesn't trigger a crash or errors on zbt-wg3526-16M/19.07.5.

This is udp:

iperf3 -c 192.168.1.1 -u -t 100 -b 1000M -i10 -P2
[SUM]   0.00-100.00 sec  12.1 GBytes  1.04 Gbits/sec  0.000 ms  0/8992148 (0%)  sender
[SUM]   0.00-100.22 sec  3.40 GBytes   291 Mbits/sec  0.075 ms  6471825/8992143 (72%)  receiver

During udp run - see interrupts in 'in' column (router was unusable during the run)

# vmstat -t -w --one-header 10
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
 r  b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st                 PST
 0  0            0       460792         2524        14856    0    0     0     0  637  184   0   0  99   0   0 2020-12-19 23:03:34
 2  0            0       460728         2524        14864    0    0     0     0  698  345   3  17  81   0   0 2020-12-19 23:03:44
 2  0            0       460912         2524        14864    0    0     0     0  609  284   7  43  50   0   0 2020-12-19 23:03:54
 2  0            0       460920         2524        14864    0    0     0     0  599  273   7  43  50   0   0 2020-12-19 23:04:04
 2  0            0       460852         2524        14864    0    0     0     0  590  246   7  43  50   0   0 2020-12-19 23:04:14
 2  0            0       460728         2524        14864    0    0     0     0  636  285   6  40  54   0   0 2020-12-19 23:04:24
 2  0            0       460812         2524        14864    0    0     0     0  589  252   7  43  50   0   0 2020-12-19 23:04:34
 3  0            0       461064         2524        14864    0    0     0     0  641  356   7  44  50   0   0 2020-12-19 23:04:44
 2  0            0       460968         2524        14864    0    0     0     0  635  304   8  43  50   0   0 2020-12-19 23:04:54
 2  0            0       461000         2524        14864    0    0     0     0  583  225   8  38  54   0   0 2020-12-19 23:05:04
 2  0            0       461060         2524        14864    0    0     0     0  607  288   7  43  50   0   0 2020-12-19 23:05:14
 0  0            0       461244         2524        14856    0    0     0     0  627  269   5  27  68   0   0 2020-12-19 23:05:24
 0  0            0       461212         2524        14856    0    0     0     0  663  198   0   0  99   0   0 2020-12-19 23:05:34

With tcp (note a lot more interrupts in 'in' compared to udp)

iperf3 -c 192.168.1.1 -t 100 -b 1000M -i10 -Z -P2
[SUM]   0.00-100.00 sec  8.97 GBytes   771 Mbits/sec  91866             sender
[SUM]   0.00-100.00 sec  8.96 GBytes   770 Mbits/sec                  receiver

# vmstat -t -w --one-header 10
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
 r  b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st                 PST
 2  0            0       458604         2524        15112    0    0     0     0 5655  970   1  33  66   0   0 2020-12-19 23:01:44
 2  0            0       457220         2524        15112    0    0     0     0 9520  840   1  48  51   0   0 2020-12-19 23:01:54
 3  0            0       457252         2524        15112    0    0     0     0 8174 1013   1  48  51   0   0 2020-12-19 23:02:04
 2  0            0       459176         2524        15112    0    0     0     0 6285  933   1  46  53   0   0 2020-12-19 23:02:14
 2  0            0       456976         2524        15112    0    0     0     0 4562  686   1  48  51   0   0 2020-12-19 23:02:24
 3  0            0       460488         2524        15112    0    0     0     0 10176  779   1  49  50   0   0 2020-12-19 23:02:34
 2  0            0       457156         2524        15112    0    0     0     0 8067  921   1  48  51   0   0 2020-12-19 23:02:44
 2  0            0       457368         2524        15112    0    0     0     0 8250  946   1  49  50   0   0 2020-12-19 23:02:54
 2  0            0       457536         2524        15112    0    0     0     0 7704  893   1  48  52   0   0 2020-12-19 23:03:04
 2  0            0       459236         2524        15112    0    0     0     0 7804  952   1  48  52   0   0 2020-12-19 23:03:14
 0  0            0       460884         2524        14856    0    0     0     0 7073  856   1  41  58   0   0 2020-12-19 23:03:24
 0  0            0       460792         2524        14856    0    0     0     0  637  184   0   0  99   0   0 2020-12-19 23:03:34

only 'rev' in dmesg:

# dmesg|grep rev
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.105763] CPU1 revision is: 0001992f (MIPS 1004Kc)
[    0.206133] CPU2 revision is: 0001992f (MIPS 1004Kc)
[    0.287022] CPU3 revision is: 0001992f (MIPS 1004Kc)
[   14.024365] mt7603e 0000:01:00.0: ASIC revision: 76030010
[   15.299007] mt76x2e 0000:02:00.0: ASIC revision: 76120044

notably, i delete stock 20-smp-tune (, instead i manually assign ethernet interrupt to cpu 2 and only use 2.4 Ghz wifi on router:

# cat /proc/interrupts; grep . /proc/irq/{22,24,25}/smp_affinity_list; grep . /sys/class/net/*/queues/*/?ps_cpus
           CPU0       CPU1       CPU2       CPU3
  8:   13764565   13764547   13764539   13764532  MIPS GIC Local   1  timer
  9:      91742          0          0          0  MIPS GIC  63  IPI call
 10:          0      48303          0          0  MIPS GIC  64  IPI call
 11:          0          0     123406          0  MIPS GIC  65  IPI call
 12:          0          0          0      60066  MIPS GIC  66  IPI call
 13:     105351          0          0          0  MIPS GIC  67  IPI resched
 14:          0      87430          0          0  MIPS GIC  68  IPI resched
 15:          0          0     172782          0  MIPS GIC  69  IPI resched
 16:          0          0          0     128434  MIPS GIC  70  IPI resched
 19:         12          0          0          0  MIPS GIC  33  ttyS0
 20:          0          0          0          0  MIPS GIC  27  1e130000.sdhci
 21:          0          0          0          0  MIPS GIC  29  xhci-hcd:usb1
 22:         20          0   13099950          0  MIPS GIC  10  1e100000.ethernet
 23:          4          0          0          0  MIPS GIC  30  gsw
 24:         23          0          0   10341658  MIPS GIC  11  mt7603e
 25:          2          0          0          0  MIPS GIC  31  mt76x2e
ERR:          1

/proc/irq/22/smp_affinity_list:2
/proc/irq/24/smp_affinity_list:3
/proc/irq/25/smp_affinity_list:2
/sys/class/net/br-lan/queues/rx-0/rps_cpus:0
/sys/class/net/br-lan/queues/tx-0/xps_cpus:0
/sys/class/net/eth0.1/queues/rx-0/rps_cpus:0
/sys/class/net/eth0.1/queues/tx-0/xps_cpus:0
/sys/class/net/eth0.2/queues/rx-0/rps_cpus:0
/sys/class/net/eth0.2/queues/tx-0/xps_cpus:0
/sys/class/net/eth0/queues/rx-0/rps_cpus:0
/sys/class/net/eth0/queues/tx-0/xps_cpus:0
/sys/class/net/lo/queues/rx-0/rps_cpus:0
/sys/class/net/lo/queues/tx-0/xps_cpus:0
/sys/class/net/wlan0/queues/rx-0/rps_cpus:0
/sys/class/net/wlan0/queues/tx-0/xps_cpus:0
/sys/class/net/wlan0/queues/tx-1/xps_cpus:0
/sys/class/net/wlan0/queues/tx-2/xps_cpus:0
/sys/class/net/wlan0/queues/tx-3/xps_cpus:0
/sys/class/net/wlan1/queues/rx-0/rps_cpus:0
/sys/class/net/wlan1/queues/tx-0/xps_cpus:0
/sys/class/net/wlan1/queues/tx-1/xps_cpus:0
/sys/class/net/wlan1/queues/tx-2/xps_cpus:0
/sys/class/net/wlan1/queues/tx-3/xps_cpus:0

This patch i found, and added (seemed to have got lost in the mountain of patches, i guess?)

https://gitlab.com/db260179/xiaomi-m4a/-/commit/4156a8074271f28bbfbd9cbc9608b10f03971396

To allow 4.14 to have this NAPI threaded polling.

Yes i investigated this, and the mt7621 part of the old v19.07 ethernet driver doesn't use the port status feature, it appears its brute forced (speed,fc,duplex etc) and only a small part in the mtk_eth_soc.c to check port up or down.

This was on the original v19.07.5 source build and not my source build?

The gsw:rev, will only be in my source, as i added a patch to see what soc versions exist.

@neheb as much as I see, the threaded NAPI patch is part of 5.4 and 4.14 as well.

If someone take a look at Felix's git, there are a lot of new patches coming along with the 5.10 support. https://git.openwrt.org/?p=openwrt/staging/nbd.git;a=summary
I am not sure how "WIP" the 5.10 support is at the moment, but if @neheb or @nbd say its ready for beta testing, I would compile a version and load it.

That's here: https://github.com/openwrt/openwrt/pull/3693

1 Like

I am on snapshot (with kernel 5.4) and it is very stable. I have not seen any errors so far, no more mtk_soc_eth timeouts. While when on v19.07.X I experienced them daily.

Mind that kernel 5 introduced different architecture for switches - DSA (distributed switch architecture).