GL-MT3000 / Beryl AX Dual STA/AP configuration breaks down if AP of (secondary) STA is down

I just woke up to a total network breakdown, turns out:

I have one 2.4 GHz radio enabled with 2 networks defined.
One is the AP for my local net (lan).
The other is a fallback client/STA for another wifi (wan), which I then use with the mwan3 manager.

As soon as the AP of the second (wan) client network goes down, the OpenWrt wireless client seems to try to find and reconnect to that AP in an infinite loop. During that time all communication on the primary AP (lan) network is blocked and all clients lose connectivity.

So basically a wan failure (a backup wan failure) crashes the entire lan. That's not quite what I intended :slight_smile:

The problem is reproducible: turn off the AP of the secondary (wan) net and all clients of the primary net lose connectivity / get disassociated; turn the (wan) AP back on and they reconnect.

I'm not sure if that's a bug in the software stack, or a HW limitation of the MT3000.

Question: Is this a known problem? What's the solution/workaround?

Same router here, with 3 clients on 2.4 GHz (smart sockets) and 1 laptop with old Wi-Fi.
I don't observe the behavior you describe.

Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:

ubus call system board
cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall

As this is an L2-or-below problem, I don't think the other config files would add any information, just more confusion because of the mwan and fw stuff.
Problem is not L3 connectivity but wifi stations getting disconnected.

config wifi-device 'radio0'
	option type 'mac80211'
	option path 'platform/soc/18000000.wifi'
	option band '2g'
	option channel '6'
	option htmode 'HT20'
	option cell_density '0'
	option country 'DE'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option mode 'ap'
	option ssid 'XXXXXX'
	option encryption 'sae-mixed'
	option key 'XXXXXXXX'
	option network 'lan'
	option ocv '0'

config wifi-device 'radio1'
	option type 'mac80211'
	option path 'platform/soc/18000000.wifi+1'
	option band '5g'
	option channel 'auto'
	option htmode 'HT20'
	option cell_density '0'

config wifi-iface 'wifinet1'
	option device 'radio0'
	option mode 'sta'
	option network 'wwan0'
	option ssid 'yyyyyyy'
	option encryption 'sae'
	option key 'yyyyyyy'
	option ocv '0'

I can't guess which one is "secondary" when there is only one STA connection. If it is an L1 problem (bring down that wall), it is out of scope for OpenWrt.

This is expected behavior. Basically, when you use the radio for both AP and STA mode, it is the STA mode connection (and really the upstream AP) that determines the channel selection of that radio. The AP mode instance must use the same channel as the STA mode connection. This is fine until the STA mode connection cannot be established (i.e. no ability to connect to the upstream AP), in which case the channel cannot be selected. If the channel isn't selected, the AP cannot start.

There are two general solutions for this:

  1. Use each radio as an AP xor STA. By setting up one and only one mode on the radio, there will never be a conflict.
  2. Use Travelmate which is designed for this exact situation and sets up a scenario where the system will attempt upstream connections, but gracefully fail when that is not possible and bring up the AP mode instead.
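As a rough illustration of option 1 against the config posted above, the STA section could be moved to the otherwise unused 5 GHz radio so that radio0 carries only the AP. This is only a sketch: it assumes the upstream network 'yyyyyyy' is also reachable on 5 GHz, which the thread does not confirm. If the upstream is 2.4 GHz only, the reverse split (AP on radio1, STA on radio0) would be the alternative, at the cost of 2.4 GHz-only clients.

```
# /etc/config/wireless (sketch; assumes the upstream AP is reachable on 5 GHz)
config wifi-iface 'wifinet1'
	# moved from 'radio0' so that the 2.4 GHz radio runs the AP only
	option device 'radio1'
	option mode 'sta'
	option network 'wwan0'
	option ssid 'yyyyyyy'
	option encryption 'sae'
	option key 'yyyyyyy'
	option ocv '0'
```

After editing the file, `wifi reload` (or a reboot) should apply the change.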

Ty, I had already figured out that "same channel" restriction through trial and error.

It's nice when there is someone who can send you in the right direction, something an LLM can't do (yet) :wink:


Why does no one write about this very serious bug, which affects not only this router but all routers running OpenWrt?
If you receive the Internet via WiFi and also distribute it via WiFi, then when you lose the connection to the access point you receive the Internet from, you also lose the access point that distributes the Internet to your own devices.
In 99.9% of cases you lose WiFi access to your router, and if you have no devices connected to the router by wire, you lose access to it completely. Only a RESET to defaults restores it, and then only if you compiled the firmware with WiFi enabled.
THIS IS A VERY, VERY CRITICAL BUG OF THE ENTIRE OpenWrt PROJECT!!!

what's stopping you from going back to the stock firmware ... ?

The problem is that many routers that have long been out of support, such as the TP-Link 841, 1043..., only have OpenWrt. But this BUG, which has gone unfixed for years or even decades, makes that annoying.
It's like a road with oncoming traffic: if there is a traffic jam in one lane, the other lane automatically stops working too, even though it is empty and fully operational.
So your suggestion is not to buy a car, a motorcycle or a bicycle, and not to walk either?
An original solution, don't you think?

This is not a bug. It is a hardware limitation that applies to all of the wifi chipsets used in these types of devices. These chips cannot rapidly switch back and forth between two channels/frequencies, so they must choose one and stick with it. Therefore, as I explained before, it is the upstream connection that mandates the channel that is used since the local device cannot tell the upstream device to change channel.

When I first started using OpenWrt for a travel router (TP-Link TL-MR3020), I ran into the same issue. Once I understood the reason (hardware limitation, and it had just one radio), I made a workaround by using the mode switch to boot in a "known good configuration" (not a full reset to defaults, but reset to a specific known state for the wifi config). Shortly thereafter, the Travelmate package was developed to address this exact thing. And Travelmate does its job very well; way better than my workaround.

No, it's not. It affects only a small subset of users and is only 'critical' on devices with a single radio (since on dual radio devices, one can always be set up in AP-only mode while the other can be either STA or STA+AP).

Re the 841 -- only one version (the V13) is currently supported, but the next major revision of OpenWrt will no longer support it. Similar story with the 1043 (v1 is already unsupported, v2-v4 are supported, but likely ending at 24.10).

Both of these models are very old and quite limited.... newer hardware will be much more capable, including much faster routing, dual radios, etc.

Please read my post for the two general solutions. On a single radio device like the two you mentioned, Travelmate is the solution.

But to reiterate, this is a true hardware limitation so your complaints should be directed at Qualcomm and the other wifi chipset manufacturers, not at OpenWrt.

Smart home devices only support the 2.4 GHz band.
Because 2.4 GHz has the longer range, it is also the band mainly used to connect to the backup Internet uplink. If that uplink disappears, the 2.4 GHz access point goes down and all devices are disconnected along with it.
If the WiFi chip can receive a signal from one access point and simultaneously retransmit it through an access point configured on it, even on the same channel, frequency and width, why does the transmitted signal also disappear when the received signal becomes unavailable?
That is precisely the question, not whether the chip can broadcast on a channel whose parameters differ completely from those of the received channel.
If you remove a flash card from a computer, that does not disable all the USB ports, not even neighboring ones on the same bus.
Can the package you mentioned above solve this situation,
so that when the incoming WiFi signal disappears, the outgoing one does not?
By switching priorities, or in some other way?

As I understand it, there was an attempt to solve this on the Xiaomi AX3600, with a separate chip responsible for a separate 2.4 GHz radio just for such purposes?
Because a software solution was not found?

The physical radio hardware on the chip is a single radio.
If configured as an access point (AP), it defines the band and channel for the wireless network.
If configured as a station (STA), to connect to a remote AP, then it is the remote AP that defines the channel.

If the STA loses its connection to the remote AP, then the channel is no longer defined.

The HARDWARE limitation is that the chip can only run on a single channel at a time and if the STA has lost its connection to the remote AP it must sit waiting for the AP to reappear, scanning for it on all channels. It might be that the remote AP will come back on a different channel for one reason or another.
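To see whether the STA side is currently associated (and so whether the radio has a channel at all), the output of `iw dev <ifname> link` can be checked. A minimal sketch, assuming `iw` is installed and that the STA interface is called `phy0-sta0`; the real name varies by release and config, so check `iw dev` first. The helper name `is_associated` is made up for illustration.

```shell
# Succeeds if the `iw dev <ifname> link` output fed on stdin shows an association.
# iw prints "Connected to <bssid> ..." when associated and "Not connected." otherwise.
is_associated() {
    grep -q '^Connected to '
}

# Example on the router (the interface name is an assumption):
#   iw dev phy0-sta0 link | is_associated && echo "uplink up" || echo "uplink down"
```

While the uplink reports as down, the AP on the same radio can be expected to be unavailable as well, for the reasons given above.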

If the chip logic instead gave priority to a fixed channel so that the logical AP interface stayed up, then if the remote AP really did come back on a different channel, the connection would never be re-established.

The chip manufacturers all, without exception, give STA mode priority. This is how it should be.

This issue is NOT a bug, it is a correct design choice.

On OpenWrt there IS a software solution as @psherman has already noted.

This implies that, as you are at home with loads of IoT stuff, you want to leech off someone else's wireless network, probably without them knowing. The ethics of doing that is for another thread.

But if you want a wireless connection from your main router across your house to another router or access point, there are reliable ways to go about it that are not affected by the limitations of STA mode; for example, you can use WDS or mesh.


Between my explanation and that of @bluewavenet, is there still any gap in your understanding of why the transmitted signal disappears?

Because there is only one radio and it operates only on a single frequency/channel.

Yes and no. It solves the problem by setting a time-out parameter, then deactivating the STA mode part of the configuration if the connection cannot be established.

To be more specific, Travelmate will:

  • Maintain a list of SSIDs in a priority connection order (this is really useful for travel, but could just be a single entry). The priority is user-configured and is simply the order in which it will try to connect to a given SSID.
  • Load the parameters for the highest priority SSID and try to connect
  • If unsuccessful after some period of time, it will try the next SSID in the list
  • If still unsuccessful when it reaches the end of the list, it will disable the STA mode configuration entirely.
  • This allows the AP mode to start normally.
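The steps above can be sketched in a few lines of shell. This is not Travelmate's actual code, just an illustration of the idea: `first_reachable` and `try_uplink` are hypothetical names, and the probe command stands in for "load the parameters for this SSID, try to connect, and give up after a timeout".

```shell
# Print the first SSID for which the probe command succeeds.
# Returns 1 (and prints nothing) if no SSID in the list is reachable.
first_reachable() {
    probe="$1"
    shift
    for ssid in "$@"; do
        if "$probe" "$ssid"; then
            printf '%s\n' "$ssid"
            return 0
        fi
    done
    return 1
}

# On the router one might then do the following. The section name 'wifinet1'
# is taken from the config posted earlier; try_uplink is a hypothetical
# function that configures the STA for one SSID and waits for association.
#   first_reachable try_uplink "HotelWifi" "PhoneHotspot" || {
#       uci set wireless.wifinet1.disabled='1'
#       uci commit wireless
#       wifi reload
#   }
```

The point of the last step is that once the STA section is disabled, nothing on the radio is waiting for an upstream channel, so the AP can come up on its configured channel.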

What you have just defined is a hardware solution to a hardware problem... add an additional radio and then each radio only needs to be tasked with one mode (one in STA mode, one in AP mode).

Travelmate is the software solution. As I described earlier, this is critical for single radio devices. It is still useful for dual band devices, but as long as one of the radios operates in AP mode only, you'll always have an active AP anyway (the same applies for a device that has multiple radios on the same band (i.e. 2x 2.4G)).
