WiFi issues with 18.06.4 on MT76

The latest OpenWrt 18.06.4 release lists as one of the highlights:

  • MT76 wireless driver updates

WiFi performance and stability have indeed improved, but I'm still observing some significant issues. I'd like input from other MT76 users to see whether this is unique to my hardware (ZBT WE3526, which uses 7612e + 7603) or caused by the driver updates.

Specifically, with about seven connected devices (three on 5 GHz, the rest on 2.4 GHz), if significant load runs through the router, either WLAN->WAN (e.g. speedtest) or even locally WLAN->LAN (e.g. backup), the driver panics and restarts. This is from the system log:

Wed Jul 10 14:03:28 2019 daemon.info hostapd: wlan1: STA a4:5e:60:db:96:6f IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Wed Jul 10 15:06:05 2019 kern.info kernel: [ 6995.807516] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Wed Jul 10 15:06:05 2019 kern.info kernel: [ 6995.813015] mt76x2e 0000:01:00.0: Build: 1
Wed Jul 10 15:06:05 2019 kern.info kernel: [ 6995.817199] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Wed Jul 10 15:06:05 2019 kern.info kernel: [ 6995.845733] mt76x2e 0000:01:00.0: Firmware running!
Wed Jul 10 15:06:05 2019 kern.info kernel: [ 6995.855838] ieee80211 phy1: Hardware restart was requested
Wed Jul 10 15:06:06 2019 kern.info kernel: [ 6997.697481] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Wed Jul 10 15:06:06 2019 kern.info kernel: [ 6997.702966] mt76x2e 0000:01:00.0: Build: 1
Wed Jul 10 15:06:06 2019 kern.info kernel: [ 6997.707120] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Wed Jul 10 15:06:06 2019 kern.info kernel: [ 6997.735691] mt76x2e 0000:01:00.0: Firmware running!
Wed Jul 10 15:06:06 2019 kern.info kernel: [ 6997.745776] ieee80211 phy1: Hardware restart was requested
Wed Jul 10 15:06:09 2019 kern.info kernel: [ 7000.637392] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Wed Jul 10 15:06:09 2019 kern.info kernel: [ 7000.642879] mt76x2e 0000:01:00.0: Build: 1
Wed Jul 10 15:06:09 2019 kern.info kernel: [ 7000.647057] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Wed Jul 10 15:06:09 2019 kern.info kernel: [ 7000.675636] mt76x2e 0000:01:00.0: Firmware running!
Wed Jul 10 15:06:09 2019 kern.info kernel: [ 7000.685719] ieee80211 phy1: Hardware restart was requested

Edit: 5 minutes after posting this, I reran the speedtest load and immediately triggered this error, where the driver had to restart:

Thu Jul 11 08:31:23 2019 kern.info kernel: [ 6795.098234] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Jul 11 08:31:23 2019 kern.info kernel: [ 6795.103742] mt76x2e 0000:01:00.0: Build: 1
Thu Jul 11 08:31:23 2019 kern.info kernel: [ 6795.107909] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Jul 11 08:31:23 2019 kern.info kernel: [ 6795.136123] mt76x2e 0000:01:00.0: Firmware running!
Thu Jul 11 08:31:23 2019 kern.info kernel: [ 6795.146208] ieee80211 phy1: Hardware restart was requested
Thu Jul 11 08:31:24 2019 kern.info kernel: [ 6796.898182] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Jul 11 08:31:24 2019 kern.info kernel: [ 6796.903682] mt76x2e 0000:01:00.0: Build: 1
Thu Jul 11 08:31:24 2019 kern.info kernel: [ 6796.907851] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Jul 11 08:31:24 2019 kern.info kernel: [ 6796.936069] mt76x2e 0000:01:00.0: Firmware running!
Thu Jul 11 08:31:24 2019 kern.info kernel: [ 6796.946178] ieee80211 phy1: Hardware restart was requested
Thu Jul 11 08:31:32 2019 daemon.info hostapd: wlan1: STA f8:da:0c:73:3b:c1 IEEE 802.11: authenticated

When I'm just surfing the web, reading email, or doing other non-bulk transfers, it seems to stay up reasonably well. However, when I run PingPlotter for hours at a time, it detects short periods (<2 seconds) of total ping loss, with nothing in the logs to indicate an issue.
Where would one look for more details on what is going on during those periods, and during the larger driver restarts?

Thanks in advance.

Which radio is phy1? It seems that that one is the problem.

The firmware in these radio chips is in ROM but a "patching" system allows it to be modified with an upload.

I have some HooToo HT-ND001 routers with MT7602 + MT7612, and they seem quite stable, but I haven't truly stressed them.

Yes, it's the 5 GHz radio, driven by the 7612e chip.

Here is the full iw list output:

 iw list
Wiphy phy1
        max # scan SSIDs: 4
        max scan IEs length: 2247 bytes
        max # sched scan SSIDs: 0
        max # match sets: 0
        max # scan plans: 1
        max scan plan interval: -1
        max scan plan iterations: 0
        Retry short limit: 7
        Retry long limit: 4
        Coverage class: 0 (up to 0m)
        Available Antennas: TX 0x3 RX 0x3
        Configured Antennas: TX 0x3 RX 0x3
        Supported interface modes:
                 * IBSS
                 * managed
                 * AP
                 * AP/VLAN
                 * monitor
                 * mesh point
        Band 2:
                Capabilities: 0x1ff
                        RX LDPC
                        HT20/HT40
                        SM Power Save disabled
                        RX Greenfield
                        RX HT20 SGI
                        RX HT40 SGI
                        TX STBC
                        RX STBC 1-stream
                        Max AMSDU length: 3839 bytes
                        No DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 4 usec (0x05)
                HT TX/RX MCS rate indexes supported: 0-15
                VHT Capabilities (0x018001b0):
                        Max MPDU length: 3895
                        Supported Channel Width: neither 160 nor 80+80
                        RX LDPC
                        short GI (80 MHz)
                        TX STBC
                VHT RX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: not supported
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT RX highest supported: 0 Mbps
                VHT TX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: not supported
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT TX highest supported: 0 Mbps
                Frequencies:
                        * 5180 MHz [36] (23.0 dBm)
                        * 5200 MHz [40] (23.0 dBm)
                        * 5220 MHz [44] (23.0 dBm)
                        * 5240 MHz [48] (23.0 dBm)
                        * 5260 MHz [52] (23.0 dBm) (radar detection)
                        * 5280 MHz [56] (23.0 dBm) (radar detection)
                        * 5300 MHz [60] (23.0 dBm) (radar detection)
                        * 5320 MHz [64] (23.0 dBm) (radar detection)
                        * 5500 MHz [100] (23.0 dBm) (radar detection)
                        * 5520 MHz [104] (23.0 dBm) (radar detection)
                        * 5540 MHz [108] (23.0 dBm) (radar detection)
                        * 5560 MHz [112] (23.0 dBm) (radar detection)
                        * 5580 MHz [116] (23.0 dBm) (radar detection)
                        * 5600 MHz [120] (23.0 dBm) (radar detection)
                        * 5620 MHz [124] (23.0 dBm) (radar detection)
                        * 5640 MHz [128] (23.0 dBm) (radar detection)
                        * 5660 MHz [132] (23.0 dBm) (radar detection)
                        * 5680 MHz [136] (23.0 dBm) (radar detection)
                        * 5700 MHz [140] (23.0 dBm) (radar detection)
                        * 5745 MHz [149] (30.0 dBm)
                        * 5765 MHz [153] (30.0 dBm)
                        * 5785 MHz [157] (30.0 dBm)
                        * 5805 MHz [161] (30.0 dBm)
                        * 5825 MHz [165] (30.0 dBm)
        valid interface combinations:
                 * #{ IBSS } <= 1, #{ managed, AP, mesh point } <= 8,
                   total <= 8, #channels <= 1, STA/AP BI must match, radar detect widths: { 20 MHz (no HT), 20 MHz, 40 MHz, 80 MHz }

        HT Capability overrides:
                 * MCS: ff ff ff ff ff ff ff ff ff ff
                 * maximum A-MSDU length
                 * supported channel width
                 * short GI for 40 MHz
                 * max A-MPDU length exponent
                 * min MPDU start spacing
        Device supports VHT-IBSS.
Wiphy phy0
        max # scan SSIDs: 4
        max scan IEs length: 2257 bytes
        max # sched scan SSIDs: 0
        max # match sets: 0
        max # scan plans: 1
        max scan plan interval: -1
        max scan plan iterations: 0
        Retry short limit: 7
        Retry long limit: 4
        Coverage class: 0 (up to 0m)
        Available Antennas: TX 0x3 RX 0x3
        Supported interface modes:
                 * IBSS
                 * managed
                 * AP
                 * AP/VLAN
                 * monitor
                 * mesh point
        Band 1:
                Capabilities: 0x1fe
                        HT20/HT40
                        SM Power Save disabled
                        RX Greenfield
                        RX HT20 SGI
                        RX HT40 SGI
                        TX STBC
                        RX STBC 1-stream
                        Max AMSDU length: 3839 bytes
                        No DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 4 usec (0x05)
                HT TX/RX MCS rate indexes supported: 0-15
                Frequencies:
                        * 2412 MHz [1] (30.0 dBm)
                        * 2417 MHz [2] (30.0 dBm)
                        * 2422 MHz [3] (30.0 dBm)
                        * 2427 MHz [4] (30.0 dBm)
                        * 2432 MHz [5] (30.0 dBm)
                        * 2437 MHz [6] (30.0 dBm)
                        * 2442 MHz [7] (30.0 dBm)
                        * 2447 MHz [8] (30.0 dBm)
                        * 2452 MHz [9] (30.0 dBm)
                        * 2457 MHz [10] (30.0 dBm)
                        * 2462 MHz [11] (30.0 dBm)
                        * 2467 MHz [12] (disabled)
                        * 2472 MHz [13] (disabled)
                        * 2484 MHz [14] (disabled)
        valid interface combinations:
                 * #{ IBSS } <= 1, #{ managed, AP, mesh point } <= 4,
                   total <= 4, #channels <= 1, STA/AP BI must match
        HT Capability overrides:
                 * MCS: ff ff ff ff ff ff ff ff ff ff
                 * maximum A-MSDU length
                 * supported channel width
                 * short GI for 40 MHz
                 * max A-MPDU length exponent
                 * min MPDU start spacing

Thanks for the feedback on your devices. Could you please run some load through them by performing a couple of DSLreports.com/speedtest runs back-to-back?

These driver crashes keep happening, sometimes several in a row, like these from this morning:

Fri Jul 19 07:39:52 2019 kern.info kernel: [52440.298472] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Fri Jul 19 07:39:52 2019 kern.info kernel: [52440.303977] mt76x2e 0000:01:00.0: Build: 1
Fri Jul 19 07:39:52 2019 kern.info kernel: [52440.308164] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Fri Jul 19 07:39:52 2019 kern.info kernel: [52440.336332] mt76x2e 0000:01:00.0: Firmware running!
Fri Jul 19 07:39:52 2019 kern.info kernel: [52440.346412] ieee80211 phy1: Hardware restart was requested
Fri Jul 19 07:46:56 2019 kern.info kernel: [52865.095304] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Fri Jul 19 07:46:56 2019 kern.info kernel: [52865.100792] mt76x2e 0000:01:00.0: Build: 1
Fri Jul 19 07:46:56 2019 kern.info kernel: [52865.104967] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Fri Jul 19 07:46:56 2019 kern.info kernel: [52865.133510] mt76x2e 0000:01:00.0: Firmware running!
Fri Jul 19 07:46:56 2019 kern.info kernel: [52865.143602] ieee80211 phy1: Hardware restart was requested
Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.155162] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.160650] mt76x2e 0000:01:00.0: Build: 1
Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.164866] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.193330] mt76x2e 0000:01:00.0: Firmware running!
Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.203373] ieee80211 phy1: Hardware restart was requested
Fri Jul 19 07:47:04 2019 kern.info kernel: [52873.015096] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Fri Jul 19 07:47:04 2019 kern.info kernel: [52873.020583] mt76x2e 0000:01:00.0: Build: 1
Fri Jul 19 07:47:04 2019 kern.info kernel: [52873.024765] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Fri Jul 19 07:47:04 2019 kern.info kernel: [52873.053318] mt76x2e 0000:01:00.0: Firmware running!
Fri Jul 19 07:47:04 2019 kern.info kernel: [52873.063377] ieee80211 phy1: Hardware restart was requested
Fri Jul 19 07:47:08 2019 kern.info kernel: [52876.744944] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Fri Jul 19 07:47:08 2019 kern.info kernel: [52876.750431] mt76x2e 0000:01:00.0: Build: 1
Fri Jul 19 07:47:08 2019 kern.info kernel: [52876.754623] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Fri Jul 19 07:47:08 2019 kern.info kernel: [52876.783184] mt76x2e 0000:01:00.0: Firmware running!
Fri Jul 19 07:47:08 2019 kern.info kernel: [52876.793276] ieee80211 phy1: Hardware restart was requested

Both episodes correlated with bursts of substantial web request activity from an HP Envy laptop running Windows 10. The laptop is a few feet from the router and connected to the 5 GHz (7612e-based) radio.

This continues. Any ideas on which low-level stats to look at to get a clue as to what is driving this?
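
As a starting point for quantifying the problem, here is a minimal Python sketch that tallies the "Hardware restart was requested" lines and groups them into bursts. It assumes the log has been saved in the same format as the excerpts in this thread and is processed on a separate machine with Python installed. The function names and the 30-second burst gap are illustrative choices, not part of any OpenWrt tool.

```python
import re
from datetime import datetime, timedelta

# Month abbreviations as they appear in the log timestamps.
MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as:
# Fri Jul 19 07:47:02 2019 kern.info kernel: [52871.203373] ieee80211 phy1: Hardware restart was requested
RESTART_RE = re.compile(
    r"^\w{3} (\w{3}) +(\d+) (\d\d):(\d\d):(\d\d) (\d{4}) "
    r".*ieee80211 (phy\d+): Hardware restart was requested"
)


def parse_restarts(lines):
    """Return a list of (timestamp, phy) tuples, one per restart line."""
    events = []
    for line in lines:
        m = RESTART_RE.match(line.strip())
        if not m:
            continue
        mon, day, hh, mm, ss, year, phy = m.groups()
        if mon not in MONTHS:
            continue
        ts = datetime(int(year), MONTHS[mon], int(day),
                      int(hh), int(mm), int(ss))
        events.append((ts, phy))
    return events


def group_bursts(events, max_gap=timedelta(seconds=30)):
    """Group restarts into bursts: consecutive events at most max_gap apart.

    The 30-second default is an arbitrary assumption; adjust as needed.
    """
    bursts = []
    for event in sorted(events):
        if bursts and event[0] - bursts[-1][-1][0] <= max_gap:
            bursts[-1].append(event)
        else:
            bursts.append([event])
    return bursts
```

For example, with the log saved to a file (the name `logread.txt` is hypothetical), `group_bursts(parse_restarts(open("logread.txt")))` returns one list per burst, which makes it easier to line up each burst with what the clients were doing at the time.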

This sequence of three crashes within 2 minutes is new.

Thu Aug  1 21:39:51 2019 daemon.info dnsmasq-dhcp[5172]: DHCPACK(br-lan) 192.168.7.199 04:f1:28:2f:07:cc
Thu Aug  1 22:07:17 2019 kern.info kernel: [294226.774229] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Aug  1 22:07:17 2019 kern.info kernel: [294226.779900] mt76x2e 0000:01:00.0: Build: 1
Thu Aug  1 22:07:17 2019 kern.info kernel: [294226.784188] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Aug  1 22:07:17 2019 kern.info kernel: [294226.812280] mt76x2e 0000:01:00.0: Firmware running!
Thu Aug  1 22:07:17 2019 kern.info kernel: [294226.822448] ieee80211 phy1: Hardware restart was requested
Thu Aug  1 22:07:19 2019 kern.info kernel: [294228.653894] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Aug  1 22:07:19 2019 kern.info kernel: [294228.659467] mt76x2e 0000:01:00.0: Build: 1
Thu Aug  1 22:07:19 2019 kern.info kernel: [294228.663727] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Aug  1 22:07:19 2019 kern.info kernel: [294228.692089] mt76x2e 0000:01:00.0: Firmware running!
Thu Aug  1 22:07:19 2019 kern.info kernel: [294228.702199] ieee80211 phy1: Hardware restart was requested
Thu Aug  1 22:07:20 2019 kern.info kernel: [294230.483913] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Aug  1 22:07:20 2019 kern.info kernel: [294230.489487] mt76x2e 0000:01:00.0: Build: 1
Thu Aug  1 22:07:20 2019 kern.info kernel: [294230.493719] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Aug  1 22:07:20 2019 kern.info kernel: [294230.522034] mt76x2e 0000:01:00.0: Firmware running!
Thu Aug  1 22:07:20 2019 kern.info kernel: [294230.532093] ieee80211 phy1: Hardware restart was requested
Thu Aug  1 22:08:52 2019 kern.info kernel: [294321.661080] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Aug  1 22:08:52 2019 kern.info kernel: [294321.666664] mt76x2e 0000:01:00.0: Build: 1
Thu Aug  1 22:08:52 2019 kern.info kernel: [294321.670911] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Aug  1 22:08:52 2019 kern.info kernel: [294321.699342] mt76x2e 0000:01:00.0: Firmware running!
Thu Aug  1 22:08:52 2019 kern.info kernel: [294321.709473] ieee80211 phy1: Hardware restart was requested
Thu Aug  1 22:08:53 2019 kern.err kernel: [294322.779314] mt76x2e 0000:01:00.0: MCU message 2 (seq 9) timed out
Thu Aug  1 22:08:53 2019 kern.info kernel: [294322.841023] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
Thu Aug  1 22:08:53 2019 kern.info kernel: [294322.846667] mt76x2e 0000:01:00.0: Build: 1
Thu Aug  1 22:08:53 2019 kern.info kernel: [294322.850989] mt76x2e 0000:01:00.0: Build Time: 201507311614____
Thu Aug  1 22:08:53 2019 kern.info kernel: [294322.879345] mt76x2e 0000:01:00.0: Firmware running!
Thu Aug  1 22:08:53 2019 kern.info kernel: [294322.889432] ieee80211 phy1: Hardware restart was requested
Thu Aug  1 22:20:58 2019 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 7c:6d:62:da:2c:d2

I have some preliminary hints at what is going on: these crashes seem to be triggered when there is a mix of very close clients (about a foot or two away) and more distant clients (up to 30', through a wall or two).

I suspect that the Automatic Gain Control (AGC) handling in each radio is involved in this scenario.

Now that all clients are a minimum of 5' away from the router's WiFi, I have not seen a single WiFi driver crash or system reboot. Client count and usage are the same as on prior days, so it is looking positive.
I will let it run for a day or so, then move them close again.

In preparation for that, are there any low-level stats I should collect to document the working vs. crashing scenarios so I can write up a bug report?
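
One thing that might help document the near/far hypothesis is the per-client signal level as seen by the AP. The sketch below is a rough Python example, not a definitive tool: it assumes the output of `iw dev wlan1 station dump` (with `wlan1` being the 5 GHz interface from the logs above) has been captured to a text file in the usual `Station <mac> (on <iface>)` plus indented `signal:` layout, and it reports the signal per station and the spread between the strongest and weakest client. The function names are made up for illustration.

```python
import re


def signal_by_station(dump_text):
    """Map each station MAC to its reported signal (dBm) from a station dump.

    Only the plain "signal:" field is read; "signal avg:" and all other
    fields are ignored.
    """
    signals = {}
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.startswith("Station "):
            # Header line: "Station aa:bb:cc:dd:ee:ff (on wlan1)"
            current = line.split()[1].lower()
            continue
        key, sep, value = line.partition(":")
        if current and sep and key.strip() == "signal":
            m = re.match(r"\s*(-?\d+)", value)
            if m:
                signals[current] = int(m.group(1))
    return signals


def signal_spread(signals):
    """Difference in dB between the strongest and weakest client, or None."""
    if not signals:
        return None
    return max(signals.values()) - min(signals.values())
```

Capturing a dump during a stable period and another right after a restart, then comparing the spread, would give a bug report something concrete to show alongside the kernel log.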