Possible Kernel 5.10 regression issue with MT7621 and SW/HW offload enabled

I can give it a try with the 5.10 kernel and firewall4/nftables.
Snapshots currently ship kernel 5.10.92 and an update is coming soon, so I'll just wait for it, maybe until tomorrow.
I'll use a Netgear R6220 with SW/HW offload. I don't have IPv6 on the WAN side.

Well, I just tried a snapshot I built today and it bricked my device. I connected a UART cable; as you can see below, it boots kernel 5.10.92 and hangs at "Starting kernel ..." with no further output.

I will now try to recover this device, and I will definitely stay away from kernel 5.10 on Archer C6 v3 devices until it is included in a stable build... :slightly_frowning_face:

U-Boot 1.1.3 (May 13 2020 - 19:39:06)

Board: Ralink APSoC DRAM:  128 MB
relocate_code Pointer at: 87f58000

Config XHCI 40M PLL
flash manufacture id: c8, device id 40 18
find flash: GD25Q128C
*** Warning - bad CRC, using default environment

============================================
Ralink UBoot Version: 5.0.0.0
--------------------------------------------
ASIC MT7621A DualCore (MAC to MT7530 Mode)
DRAM_CONF_FROM: Auto-Detection
DRAM_TYPE: DDR3
DRAM bus: 16 bit
Xtal Mode=3 OCP Ratio=1/3
Flash component: SPI Flash
Date:May 13 2020  Time:19:39:06
============================================
THIS IS uboot
icache: sets:256, ways:4, linesz:32 ,total:32768
dcache: sets:256, ways:4, linesz:32 ,total:32768

 ##### The CPU freq = 880 MHZ ####
 estimate memory size =128 Mbytes

Press '4' or 't' to break the booting process

Press 'x' to enter recovery web server                                        0
nm_init:791
nm_initFwupPtnStruct:276
nm_lib_readPtnTable:738
[NM_Debug](nm_lib_readPtnTable) 00743: NM_PTN_TABLE_BASE = 0xfe0000
[NM_Debug](nm_lib_readPtnFromNvram) 00569: partition_used_len = 1054, requried len = 8192
[NM_Debug](nm_lib_readPtnTable) 00751: Reading Partition Table from NVRAM ... OK

[NM_Debug](nm_lib_readPtnTable) 00759: Parsing Partition Table ... OK

[NM_Debug](nm_lib_readPtnFromNvram) 00569: partition_used_len = 2, requried len = 2
factory boot check integer ok.


3: System Boot system code via Flash.
## Booting image at bc040000 ...
   Image Name:   MIPS OpenWrt Linux-5.10.92
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    2731273 Bytes =  2.6 MB
   Load Address: 82000000
   Entry Point:  82000000
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 82000000) ...
## Giving linux memsize in MB, 128

Starting kernel ...



OK, recovery was easier than I expected (opening the device and connecting the UART was the "hard" part, the rest was easy).

After I successfully recovered the device, I wanted to be sure the issue was not with my build, so I downloaded today's snapshot image from the OpenWrt website. The results are the same as above. Archer C6 v3 users (as well as A6 v3 users), be aware that as of January 30th, 2022 the snapshot builds will brick your device, and a UART connection will be required to recover it.

Lol (?) ...
OK, maybe I'll wait for the next snapshot. I flash a snapshot 2 or 3 times a week on the mt7621 and it has always worked. I know that snapshots are not guaranteed to work, but so far I have had no real issues. Today's snapshot is still based on 5.10.92. I'll wait for the next one, based on 5.10.95.

Note that this issue might affect only my device (Archer C6 v3 and A6 v3) and not all mt7621 devices.

So even with kernel 5.10.92, which is problematic in my experience, you may have better luck with a different device.

BTW, due to a complete lack of error messages I am assuming the kernel is the issue (since it hung while loading), but it might be something else in this build.

Please continue the debate here (and maybe help by trying the possible fix) as this problem is not related to SW/HW offload (which is the topic of this thread):

I bricked my A6v3 2 days ago. Can you tell me how to recover via UART connection?

@AashishAS, let's continue the discussion about unbricking Archer C6/A6 v3.x in the new topic below:

I just tested a new snapshot build today (r18710-dc2da6a233, kernel 5.10.96). Things got worse:

  1. With the now default firewall4, enabling HW flow offload breaks the firewall and all connected devices lose internet connectivity. Basically firewall4 is not able to recognize the flag option flow_offloading_hw '1' in the firewall config file. More details here.

  2. After redoing the build with firewall3 selected instead, enabling HW flow offload has no effect. I monitored the CPU usage during a heavy download: it is medium/high, and it is the same whether HW flow offload is enabled or disabled. With firewall3 and kernel 5.4, HW flow offload works perfectly and CPU usage is minimal.
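
When comparing builds like this, it can help to confirm first which offload options are actually set in the `defaults` section of the firewall config. Here is a minimal Python sketch, assuming the contents of `/etc/config/firewall` have been copied to a workstation. The sample text and the `offload_options` helper are hypothetical; only the option name `flow_offloading_hw` comes from the post above, and `flow_offloading` is the matching software option.

```python
import re

# Hypothetical sample of /etc/config/firewall. Only the offload options
# matter here; the remaining lines are illustrative filler.
SAMPLE_CONFIG = """
config defaults
	option input 'ACCEPT'
	option flow_offloading '1'
	option flow_offloading_hw '1'

config zone
	option name 'lan'
"""

# Matches lines such as: option flow_offloading_hw '1'
OPTION_RE = re.compile(r"^\s*option\s+(\S+)\s+'?([^']*)'?\s*$")


def offload_options(config_text):
    """Return the flow offloading options found in the 'defaults' section.

    Keys are option names, values are True when the option is set to '1'.
    """
    found = {}
    in_defaults = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config "):
            # A new section starts; remember whether it is 'defaults'.
            in_defaults = stripped.split()[1] == "defaults"
            continue
        match = OPTION_RE.match(line)
        if in_defaults and match and match.group(1).startswith("flow_offloading"):
            found[match.group(1)] = match.group(2) == "1"
    return found


print(offload_options(SAMPLE_CONFIG))
```

This only tells you what the config requests, not whether the firewall or the kernel honours it.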

Once again I rolled back to snapshot r18324-794e8123ce (kernel 5.4.162), the last snapshot build in which HW offload works and is stable.

Just tried on an R6220 with the Feb 3 OpenWrt SNAPSHOT r18717-0e32c6baf3 (kernel 5.10.96).
Firewall is firewall4 (nftables).

Basic settings: 610 Mbit/s
SW offloading: 630 Mbit/s
SW/HW offloading: 600 Mbit/s
All results are very close, so offloading doesn't seem to be active.
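
To put "very close" into numbers, here is a small Python sketch that expresses each result as a percentage change against the basic-settings run. The figures are the ones reported above; the `relative_change` helper is just for illustration.

```python
# Throughput figures in Mbit/s, as reported in the post above.
results = {
    "basic settings": 610,
    "SW offloading": 630,
    "SW/HW offloading": 600,
}


def relative_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


baseline = results["basic settings"]
for name, mbps in results.items():
    change = relative_change(baseline, mbps)
    print(f"{name}: {mbps} Mbit/s ({change:+.1f}%)")
```

Both offload modes land within a few percent of the baseline, which is probably within run-to-run noise for a single test.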

When I click on Status/Firewall I get no response, and there is apparently no firewall process under System/Processes.

BTW, I have a symmetric 1 Gbit/s fiber connection, which I normally use with an x86 router.

You need to reboot the router now.

Hardware offload has many incompatibilities:

VLAN, STP, stats...

For the future I recommend a powerful CPU without offload.

I'm waiting for something with 2.5GbE for now.

Are you sure that you followed the thread? :laughing:
HW offloading is implemented on mt7621. It was working with previous kernels (even early 5.10), so there is a regression in the current code. I'm just testing from time to time in order to help @dsouza by comparing with a device other than his.

My main router is an x86/64. I have this precisely for its CPU power and no offloading :wink:

Those were just general recommendations for everyone :laughing:

Current HW Offload in snapshot is broken here too

If you run grep OFFLOAD /proc/net/nf_conntrack do you see established connections that are supposedly offloaded at least?

Software offloaded conntrack entries should carry an [OFFLOAD] flag, hardware offloaded ones a [HW_OFFLOAD] flag.
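
Note that a plain `grep OFFLOAD` matches both flags. To get separate counts, something like the following Python sketch could be run on a workstation against saved output of `cat /proc/net/nf_conntrack`. The `count_offload_flags` helper and the shortened sample entries (using documentation addresses) are made up for illustration.

```python
from collections import Counter


def count_offload_flags(conntrack_text):
    """Count conntrack entries by offload state.

    Entries flagged [HW_OFFLOAD] count as hardware offloaded, entries
    flagged [OFFLOAD] as software offloaded, everything else as not
    offloaded.
    """
    counts = Counter()
    for line in conntrack_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if "[HW_OFFLOAD]" in line:
            counts["hw_offload"] += 1
        elif "[OFFLOAD]" in line:
            counts["sw_offload"] += 1
        else:
            counts["not_offloaded"] += 1
    return counts


# Shortened, made-up entries in the style of /proc/net/nf_conntrack.
sample = """\
ipv4     2 tcp      6 src=192.0.2.10 dst=198.51.100.1 sport=40000 dport=443 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2001:db8::1 dst=2001:db8::2 sport=40001 dport=443 [OFFLOAD] mark=0 zone=0 use=3
ipv4     2 udp      17 src=192.0.2.11 dst=198.51.100.2 sport=40002 dport=53 [UNREPLIED] mark=0 zone=0 use=2
"""

print(dict(count_offload_flags(sample)))
```

If the hardware count stays at zero during a long download with HW offload enabled, the flows are not reaching the hardware path.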

Besides moving kernel from 5.4 to 5.10, the current snapshot builds also moved from firewall3 (iptables) to firewall4 (nftables).

HW offload for mt7621 is implemented only in iptables. But iptables with hardware acceleration has compatibility issues with kernel 5.10 (the current kernel version in snapshots; see the original post of this thread).

Bottom line: mt7621 HW offload is currently broken in the snapshot builds, with no solution in the foreseeable future. Even in a build with kernel 5.10 and iptables, HW offload does not seem to work anymore.

For this reason I am using the last snapshot build with kernel 5.4 and iptables. With this configuration HW offload is running rock solid on an Archer C6 v3.2.

root@MI-R3P:~# grep OFFLOAD /proc/net/nf_conntrack
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a00:1450:4001:080e:0000:0000:0000:200e sport=45253 dport=443 packets=9 bytes=3850 src=2a00:1450:4001:080e:0000:0000:0000:200e dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=45253 packets=9 bytes=2190 [OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a03:b0c0:0003:00d0:0000:0000:168b:9001 sport=43944 dport=443 packets=25 bytes=3405 src=2a03:b0c0:0003:00d0:0000:0000:168b:9001 dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=43944 packets=29 bytes=20045 [OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.42 dst=142.251.39.14 sport=43619 dport=443 packets=2 bytes=459 src=142.251.39.14 dst=EXTERNAL_IP sport=443 dport=43619 packets=0 bytes=0 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a00:1450:4001:0810:0000:0000:0000:2004 sport=42176 dport=443 packets=18 bytes=4470 src=2a00:1450:4001:0810:0000:0000:0000:2004 dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=42176 packets=23 bytes=8309 [OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a00:1450:4001:0810:0000:0000:0000:2004 sport=42112 dport=443 packets=31 bytes=2969 src=2a00:1450:4001:0810:0000:0000:0000:2004 dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=42112 packets=34 bytes=64203 [OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.42 dst=142.250.186.97 sport=53099 dport=443 packets=14 bytes=3076 src=142.250.186.97 dst=EXTERNAL_IP sport=443 dport=53099 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.42 dst=108.177.127.113 sport=42794 dport=443 packets=21 bytes=9218 src=108.177.127.113 dst=EXTERNAL_IP sport=443 dport=42794 packets=6 bytes=449 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.42 dst=216.58.212.142 sport=47108 dport=443 packets=12 bytes=2091 [UNREPLIED] src=216.58.212.142 dst=EXTERNAL_IP sport=443 dport=47108 packets=0 bytes=0 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4     2 udp      17 src=192.168.1.51 dst=52.29.246.211 sport=60304 dport=8765 packets=0 bytes=0 src=52.29.246.211 dst=EXTERNAL_IP sport=8765 dport=60304 packets=0 bytes=0 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a00:1450:4001:0829:0000:0000:0000:2004 sport=44781 dport=443 packets=4 bytes=252 src=2a00:1450:4001:0829:0000:0000:0000:2004 dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=44781 packets=2 bytes=132 [OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.42 dst=142.251.39.14 sport=43763 dport=443 packets=2 bytes=104 src=142.251.39.14 dst=EXTERNAL_IP sport=443 dport=43763 packets=1 bytes=52 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4     2 udp      17 src=192.168.1.60 dst=52.8.35.72 sport=16385 dport=1812 packets=0 bytes=0 [UNREPLIED] src=52.8.35.72 dst=EXTERNAL_IP sport=1812 dport=16385 packets=0 bytes=0 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4     2 tcp      6 src=192.168.1.20 dst=50.7.248.218 sport=2325 dport=82 packets=2 bytes=104 src=50.7.248.218 dst=EXTERNAL_IP sport=82 dport=2325 packets=1 bytes=60 [OFFLOAD] mark=0 zone=0 use=3
ipv6     10 tcp      6 src=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 dst=2a00:1450:4001:080e:0000:0000:0000:200e sport=45249 dport=443 packets=4 bytes=252 src=2a00:1450:4001:080e:0000:0000:0000:200e dst=2a01:00d0:e6c6:0000:aceb:d180:6f87:3a33 sport=443 dport=45249 packets=2 bytes=132 [OFFLOAD] mark=0 zone=0 use=3

My own build with nftables/firewall4. It has been working without issues the whole day.

@Vladdrako, are you using an mt7621 device with HW offload enabled?

If so, have you verified CPU usage? I did this last weekend: enabling HW offload in the latest builds had no effect (CPU usage remained high even when HW offload was enabled).

So while enabling HW offload with nftables no longer breaks the firewall, it does not actually work.
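
For anyone who wants to repeat the CPU check with numbers rather than by eye, here is a rough Python sketch that computes the busy percentage between two samples of the aggregate `cpu` line in `/proc/stat`, taken a few seconds apart during a download. The `cpu_busy_percent` helper and the sample values are made up for illustration; the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the standard `/proc/stat` layout.

```python
def cpu_busy_percent(stat_line_before, stat_line_after):
    """Busy CPU percentage between two aggregate 'cpu' lines from /proc/stat."""

    def parse(line):
        fields = line.split()
        if fields[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        # Use the first eight counters only: user, nice, system, idle,
        # iowait, irq, softirq, steal. Guest time is already part of user.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        return sum(values), idle

    total_before, idle_before = parse(stat_line_before)
    total_after, idle_after = parse(stat_line_after)
    total_delta = total_after - total_before
    if total_delta <= 0:
        raise ValueError("samples must be taken in order, some time apart")
    busy_delta = total_delta - (idle_after - idle_before)
    return 100.0 * busy_delta / total_delta


# Made-up samples: most of the extra time lands in softirq, which is
# where packet forwarding work typically shows up.
before = "cpu  100 0 100 800 0 0 0 0 0 0"
after = "cpu  150 0 150 1000 0 0 200 0 0 0"
print(f"{cpu_busy_percent(before, after):.1f}% busy")
```

Comparing this figure with HW offload enabled and disabled, under the same download, gives a more objective version of the test described above.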

As you can see in the commit below, HW offload has been disabled in the current firewall4/nftables, so enabling it has no effect:

https://git.openwrt.org/?p=project/firewall4.git;a=commit;h=7cb10c809314261c20ddca069eacd469adf44be3

fw4: disable "flow_offloading_hw" option for now

Currently there does not appear to exist any kernel side nft flowtable
implementation that supports hardware flow offloading.

Attempting to upload a ruleset containing a flowtable declaration with
the hardware offloading flag set will fail with a generic EOPNOTSUPP
error.

Since there is neither a graceful recovery (e.g. continue without
hardware flow offloading) nor any possibility to probe kernel side
support from userspace, disable the facility entirely for now.
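
For readers unfamiliar with what "a flowtable declaration with the hardware offloading flag set" looks like, here is a small Python sketch that renders one as text. The `render_flowtable` helper, the flowtable name `ft` and the device names `lan1` and `wan` are assumptions for illustration only; this is not the code firewall4 uses to generate its ruleset.

```python
def render_flowtable(devices, hw_offload=False, table="fw4", name="ft"):
    """Render an nftables flowtable declaration as text.

    devices:    interface names to attach the flowtable to (assumed names)
    hw_offload: when True, add the 'flags offload;' statement
    """
    lines = [
        f"table inet {table} {{",
        f"\tflowtable {name} {{",
        "\t\thook ingress priority 0;",
        f"\t\tdevices = {{ {', '.join(devices)} }};",
    ]
    if hw_offload:
        # This is the flag the commit message refers to; per that message,
        # loading a ruleset with it set failed with EOPNOTSUPP.
        lines.append("\t\tflags offload;")
    lines += ["\t}", "}"]
    return "\n".join(lines)


print(render_flowtable(["lan1", "wan"], hw_offload=True))
```

With `hw_offload=False` the same declaration is emitted without the flag, which corresponds to software flow offloading only.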

Yes, hw nat is enabled. No, I didn't test it. Also, if I understand correctly, in commit hw nat was reenabled.

Sorry, I can't answer that: I only set up the router for a few minutes to check how it works and to run bandwidth tests. It is running as an AP right now. I'll run more tests with the next kernel release.

In the last stable release, changing HW offload took effect on the fly.

With the latest snapshot it needs a reboot.