Belkin RT3200/Linksys E8450 WiFi AX discussion

justinkb · January 29, 2022, 8:13am

Gonna reply to myself here with some more information. I've now put the OpenWrt router, with the br-lan interface's static IP set to 192.168.2.1, behind my regular router and connected one of its Ethernet ports to the WAN port of my RT3200. Now I get a "wan" IP in the 192.168.1.0/24 range from it, and routing to the outside world works fine. Hopefully this info is helpful.
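
A sketch of the LAN side of that setup from the shell, assuming the default `lan` interface section in /etc/config/network. The /24 netmask is an assumption, not something stated above; the point is only that the LAN subnet differs from the upstream router's 192.168.1.0/24.

```shell
# Move the OpenWrt LAN to its own subnet so it does not clash with the
# upstream 192.168.1.0/24 network handed out on the WAN side.
uci set network.lan.ipaddr='192.168.2.1' \
  && uci set network.lan.netmask='255.255.255.0' \
  && uci commit network \
  && /etc/init.d/network restart
```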

justinkb · January 29, 2022, 10:35am

So I've managed to find the cause after some confusion. HW offloading with firewall4 is the thing that's completely broken. I didn't realize this initially since saving and applying doesn't actually apply the change until a reboot.
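
For anyone who prefers the shell over LuCI, a sketch of switching hardware offloading off. It assumes the stock firewall config, where the `defaults` section carries the `flow_offloading_hw` option; the reboot at the end reflects the observation above that the change only took effect after one.

```shell
# Turn off hardware flow offloading only; software offloading
# (flow_offloading) is left untouched.
uci set firewall.@defaults[0].flow_offloading_hw='0' \
  && uci commit firewall \
  && reboot
```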

Jip-Hop · January 29, 2022, 11:13am

Thanks! I've updated my post. Still some more to clarify haha.

Does using the WiFi on this device hammer the CPU? Under all configurations (SW, HW, and no offloading)? And does that CPU overhead occur when communicating with the internet (WAN)? Or when communicating with LAN clients (connected via Ethernet)?

Has anyone tested the max speed on the WAN port when not using flow offloading?

daniel · January 29, 2022, 11:31am

Yes, if it is TTL level, that will work. RS-232 with a D9 plug will not work and will fry the device.
If you have a TTL serial adapter, set it to 3.3V level (usually there is a jumper or switch which allows you to set 5V, 3.3V and sometimes also 1.8V). See the Wiki for the pinout.

fda · January 29, 2022, 1:55pm

Lol, I installed recovery 0.6.2 to quickly test fw4. It's completely broken!
"Status" > "Firewall" shows only "Collecting data..."
And why are there no more "Custom Rules" with fw4??

Seems nftables is pre-alpha and should not be installed yet, even though it is the default now.

daniel · January 29, 2022, 2:37pm

nftables Firewall status in LuCI has not been merged yet, PR for it is here

The custom rules feature depends on the iptables command-line parser and as such will never be compatible with nftables. No idea if there is a plan to replace it, e.g. by allowing JSON-formatted nft rules to be added in a similar way.

nftables itself has become the default way to set up firewalls on Linux; all distributions have adopted it by now, and OpenWrt still using iptables (and even custom patches to allow modern features, such as flow offloading, to work despite that) is actually the exception. So what is "pre-alpha" is mostly the web interface, and rest assured, that will be fixed by the people who care about it.

fda · January 29, 2022, 3:26pm

Without some sort of custom "rules" it does not make sense to me. I use them to fix bugs in fw3 (e.g. firewall3 sometimes assigns the wrong zone when pppd is stopped) and for many things which are not possible via (L)UCI. Making fw4 the default even though it does not yet run well is a bad idea. Additionally, there is the lack of wildcards for interface names (e.g. wg+).
Additionally, it seems there are many fixes: https://git.openwrt.org/?p=project/firewall4.git;a=shortlog
Besides that, it does not matter whether the rules in the kernel are filled in by the iptables or the nft command.

Flow offloading: I never had a crash with it. The only downside is that IPv6 is broken (https://bugs.openwrt.org/index.php?do=details&task_id=3373). Why should this be fixed by the switch from iptables to nft?

daniel · January 29, 2022, 4:57pm

Because netfilter folks suggested that the crash we see here could be a race condition in our hacky out-of-tree flow-offloading integration into xtables:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/generic/hack-5.10/650-netfilter-add-xt_FLOWOFFLOAD-target.patch;h=bda8d06b7caf49584206bbf4f6a747309f481847;hb=HEAD

So once we don't use xtables at all any more but native nftables, this codepath would no longer be used.

That's not the whole truth: we could (with some limitations) use the iptables-nft wrapper to remain cmdline-compatible with iptables while using the nftables backend in the kernel. This is what all other distributions do if you use the iptables executable. However, I'm not sure that really covers all features of iptables (i.e. also ipsets, wildcards in interface names, ...), I guess not...

dietcoke73 · January 29, 2022, 5:28pm

Is it a known problem with this router that the reboot option in LuCI powers off the router and requires a cold boot (switched off for 20 seconds)? Strangely, however, reboot from the command line seemed to work OK.

daniel · January 29, 2022, 5:33pm

This is a known problem if you have changed the CPU frequency governor to ondemand or otherwise set the CPU supply voltage to 0.9V. That causes DDR RAM calibration in ARM Trusted Firmware bl2 to never complete and hang forever, which is why the device doesn't come up when you reboot.

If you have not changed the CPU frequency governor, or have otherwise made sure it will not reboot while at the lowest operating point, reboot should work and what you describe would be new to me.

dietcoke73 · January 29, 2022, 5:35pm

Ok, I had indeed set ondemand, so that is the cause. Does this also happen with schedutil? If yes, I'll just leave it defaulted.

daniel · January 29, 2022, 5:52pm

Users have been reporting that schedutil avoids the problem. This is coincidental as it just happens not to end up in the lowest operating point at the point the reset is asserted, so I wouldn't trust it too much and the behavior may change with future updates. Imho the best is to just

echo 437500 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq

for now until the root cause of the problem is solved.
Ideally we should do that by having ARM Trusted Firmware make sure CPU frequency and voltage are reset to default before jumping into DDR calibration. We build it from source and the datasheet for MT7622 is de facto public, so it just needs someone to do it. And it would require everyone to update bl2, which is a bit dangerous.

Alternatively it could be solved by patching the Linux kernel so it would make sure to reset cpufreq to default before reboot. That's kinda ugly and most likely unacceptable for upstream, but then it's solved via normal sysupgrade and (unlike updating bl2) there is no risk of bricking the device if the update goes wrong.

To be on the super safe side, we could even do both of the above, so new users (and new devices using MT7622 with ATF built from source) would not suffer from that problem and existing users would still not have to replace bl2. However, you guessed it, it's a bit of work.

So last but not least we could simply ignore what MediaTek engineer Sean Wang submitted to Linux and just raise the supply voltage on that lowest operating point to 1.0V, or completely remove the operating point from the device tree (it is obviously buggy anyway, as it is 30MHz in code but labeled with the more realistic-sounding 300MHz). But both would be a last resort if none of the above is implemented when we get close to 22.xx-rc1.

vw-owrt · January 29, 2022, 8:28pm

Yes, confirming firewall4 is the culprit. When I downgraded to the 0.6.1 release and then updated to the latest snapshot with "show advanced options", firewall was listed and I had internet access. When I tried updating today, removing firewall and selecting firewall4, internet was gone again.

Is there any way to upgrade to firewall4 at this point? Has anyone gotten 0.6.2 to work successfully?

I am ok reconfiguring everything from default if that's what it takes.

fda · January 29, 2022, 8:39pm

Reboot works well with ondemand, but I'm using at least 1.0V for all frequencies, including 125 and 300 MHz, as suggested in my unmerged PR from some months ago -.- (linked somewhere above).
So just merge the 125 MHz and 1V commit (governor optional).

I tested the nft wrapper and some rules I'm using did not work anymore; IPv6 addresses were no longer recognized as addresses (an '"f" is invalid' error). I think it was at least this rule:

ip6tables -t nat -A prerouting_lan_rule -m mac ! --mac-source <2ND_DNS> -p tcp --dport 53 -j DNAT --to-destination [fd0a::1]
or
ip6tables -t nat -A prerouting_wan_rule -d ::xxxx:xxff:fexx:xxxx/::ffff:ffff:ffff:ffff -p tcp --dport 123 -j DNAT --to-destination [fd0a::9999]

Offload crash: I never had this, maybe because I stopped using it since IPv6 is unusable (see ticket) and the bug was introduced later. So I assume IPv6 is still broken...

Lynx · January 29, 2022, 9:11pm

I've been using schedutil on all three of my RT3200's without issue.

But this is dangerous and we should disable it?

fda · January 29, 2022, 9:18pm

As a workaround, set a higher minimum frequency which uses enough voltage, e.g.
echo 437500 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
in rc.local or just before the reboot.
Or use non-UBI. It has a bootloader which can handle less than 1V correctly during reboot.

Lynx · January 29, 2022, 9:20pm

Would that go after or before the line that sets schedutil in rc.local?

fda · January 29, 2022, 10:07pm

I've set "ondemand" in the kernel config and am using 300 MHz, but all operating points are at 1V. 300 MHz just to reduce transitions...

daniel · January 29, 2022, 10:09pm

Technically it should go before, but in practice it doesn't make a difference when it comes to fixing the reboot issue.
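
A minimal sketch of that ordering for /etc/rc.local. The helper names `set_cpufreq_attr` and `apply_reboot_workaround` and the `POLICY_DIR` variable are made up for illustration; only the sysfs path and the 437500 value come from the posts above, and selecting schedutil is optional.

```shell
#!/bin/sh
# POLICY_DIR is overridable only so the snippet can be tried against a
# scratch directory; on the router the default path is the one used above.
POLICY_DIR="${POLICY_DIR:-/sys/devices/system/cpu/cpufreq/policy0}"

# Write a value into a cpufreq attribute, but only if it is writable.
set_cpufreq_attr() {
    attr="$POLICY_DIR/$1"
    if [ -w "$attr" ]; then
        echo "$2" > "$attr"
    else
        echo "skipping $1: $attr is not writable" >&2
        return 1
    fi
}

apply_reboot_workaround() {
    # 1. First keep the CPU off the lowest operating point.
    set_cpufreq_attr scaling_min_freq 437500
    # 2. Then (optionally) select the governor.
    set_cpufreq_attr scaling_governor schedutil
}

# In /etc/rc.local, call this before the final "exit 0":
# apply_reboot_workaround
```

As noted above, the order makes no practical difference for the reboot issue; setting the minimum frequency first just avoids a brief window in which the governor could still drop to the lowest operating point.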

ariznaf · January 29, 2022, 10:31pm

I am a bit lost with all the messages about problems with reboot and firewall4.

I have had it running for a long time with no issues, but have not upgraded in the last two weeks.
Should I hold off on upgrading for now until the problems are solved?

I use SQM but have not tried soft or hard offloading.

I have a 300 Mbit/s WAN and it seems to manage that speed without issues, with no need for offloading.