MT76 CPU Stall

I'm running a Newifi D2 on a development snapshot from June 13th. I run the snapshot because a number of updates had been made to improve the performance of mt76 chips, and I found it to be very stable compared to the latest service release (19.07.3). After about two weeks, the router finally crashed, and I was able to log in to the CLI to retrieve the logs. If anyone has ideas on a way to automatically restart the router when this happens, let me know. I suspect there is still some bugginess in the mt76 drivers that is causing this, but I'm no dev.

Tue Jun 23 08:23:05 2020 daemon.err dnsmasq[3640]: failed to send packet: Resource temporarily unavailable
Tue Jun 23 08:23:09 2020 daemon.err dnsmasq[3640]: failed to send packet: Resource temporarily unavailable
Tue Jun 23 08:23:11 2020 kern.err kernel: [401967.993483] rcu: INFO: rcu_sched self-detected stall on CPU
Tue Jun 23 08:23:11 2020 kern.err kernel: [401967.999159] rcu: 	2-....: (14999 ticks this GP) idle=2de/1/0x40000002 softirq=35130427/35130427 fqs=7500
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.008775] 	(t=15003 jiffies g=71879365 q=1075)
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.013471] NMI backtrace for cpu 2
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.017042] CPU: 2 PID: 7595 Comm: kworker/u8:2 Tainted: G        W         5.4.45 #0
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.024963] Workqueue: phy0 mt7603_mac_work [mt7603e]
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.030080] Stack : 8063fe48 8fc11d24 806c0000 80700000 8eaba900 80651348 00000002 80645684
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.038491]         807042b0 80700000 00000000 8007744c 00000002 00000001 8fc11ce0 28cf4e0c
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.046901]         00000000 00000000 00000000 00000000 5d000000 00000338 00000018 20307968
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.055309]         00000000 00004e1f 00000000 6d5f3330 00000000 80720000 00000000 00000002
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.063719]         80645684 807042b0 80700000 00000000 00000008 8033d6d4 00000008 80860008
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.072130]         ...
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.074654] Call Trace:
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.077202] [<8000b72c>] show_stack+0x30/0x100
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.081772] [<805942dc>] dump_stack+0xa4/0xdc
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.086221] [<8059a8dc>] nmi_cpu_backtrace+0xe4/0x134
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.091345] [<8059aa88>] nmi_trigger_cpumask_backtrace+0x15c/0x194
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.097603] [<80086ed4>] rcu_dump_cpu_stacks+0xe0/0x12c
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.102905] [<8008bb54>] rcu_sched_clock_irq+0x6e0/0x948
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.108290] [<80091058>] update_process_times+0x2c/0x78
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.113600] [<800a3144>] tick_handle_periodic+0x34/0xd0
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.118908] [<803d5d88>] gic_compare_interrupt+0x7c/0x9c
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.124311] [<8007e504>] handle_percpu_devid_irq+0xbc/0x19c
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.129949] [<8007848c>] generic_handle_irq+0x40/0x58
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.135079] [<802dd27c>] gic_handle_local_int+0x98/0x120
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.140459] [<802dd4c8>] gic_irq_dispatch+0x10/0x20
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.145405] [<8007848c>] generic_handle_irq+0x40/0x58
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.150551] [<805b4ccc>] do_IRQ+0x1c/0x2c
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.154632] [<802dcd24>] plat_irq_dispatch+0x64/0x104
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.159754] [<80006de8>] except_vec_vi_end+0xb8/0xc4
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.164817] [<8f370a70>] mt76_dma_attach+0x524/0x870 [mt76]
Tue Jun 23 08:23:11 2020 kern.warn kernel: [401968.170466] [<8f370a8c>] mt76_dma_attach+0x540/0x870 [mt76]
Tue Jun 23 08:23:14 2020 daemon.err dnsmasq[3640]: failed to send packet: Resource temporarily unavailable
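
On the auto-restart question, one possible building block is a small check that looks for the stall signature in log text. The following is only a minimal sketch, assuming Python is available (it is not installed by default on OpenWrt). The function name `find_stall` and both regular expressions are hypothetical, written to match the log lines above and not taken from any existing tool.

```python
import re

# Patterns modelled on the log lines in this post (assumptions, not a kernel API).
STALL_RE = re.compile(r"rcu: INFO: rcu_sched self-detected stall on CPU")
WORKQUEUE_RE = re.compile(r"Workqueue: (\S+) (\S+)(?: \[(\S+)\])?")


def find_stall(lines):
    """Return details of the first RCU stall found in lines, or None."""
    stall = None
    for line in lines:
        if stall is None:
            # Still looking for the line that announces the stall.
            if STALL_RE.search(line):
                stall = {
                    "line": line.rstrip("\n"),
                    "workqueue": None,
                    "function": None,
                    "module": None,
                }
            continue
        # After the stall line, pick up the first Workqueue line if there is one.
        match = WORKQUEUE_RE.search(line)
        if match:
            stall["workqueue"], stall["function"], stall["module"] = match.groups()
            break
    return stall
```

Fed the log above, this would report the workqueue `phy0`, the function `mt7603_mac_work` and the module `mt7603e`. What to do on a hit (for example, calling `reboot`) and how to feed it live log output are deliberately left out, since a stalled CPU may also keep a userspace watcher from running.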

Are you using hw flow offloading? The same thing happens with 19.07.4

I am testing with only software offloading enabled.

Yeah, I probably was using hw flow offloading. Curious to hear how your tests go. Since I had similar problems with 19.07.4, I've reverted to using my WNDR3800 for now. If it simply comes down to disabling HW flow offloading, I'll switch back to the mt76 device.