Automatic LAN subnet reassignment upon conflict with WAN

psherman · December 22, 2022, 2:46am

I'll use this new subsection to formally request a new feature that would automatically reassign the LAN subnet if a conflict is detected on the WAN. I think that this would make setup easier for those who are new to OpenWrt and attempt to set it up behind another router that is already using the current OpenWrt default of 192.168.1.0/24.

A while back, I wrote an RFC and it seemed to get a bit of traction. But, it was too late to make it into 22.03. So, maybe this can be included in 23.xx??

Here is the original RFC with high-level implementation details:

jow · December 22, 2022, 12:46pm

I would propose to at least set different metrics by default for wan and lan. This will not fix broken internet connectivity but ensures that lan remains reachable even if the wan side conflicts.

We could add detection for the conflict to LuCI then and guide the user through the renumbering
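
For illustration, the conflict check itself is small. This is a minimal sketch in Python, not LuCI code: the function name `subnets_conflict` is hypothetical, and it assumes the LAN and WAN addresses are available in CIDR notation.

```python
import ipaddress


def subnets_conflict(lan_cidr, wan_cidr):
    """Return True if the LAN and WAN subnets overlap.

    Both arguments are interface addresses in CIDR form, e.g. "192.168.1.1/24".
    The function name and calling convention are hypothetical.
    """
    lan = ipaddress.ip_interface(lan_cidr).network
    wan = ipaddress.ip_interface(wan_cidr).network
    return lan.overlaps(wan)


# OpenWrt default LAN behind an upstream router that also uses 192.168.1.0/24
print(subnets_conflict("192.168.1.1/24", "192.168.1.57/24"))
# Same LAN behind an upstream router using a different private range
print(subnets_conflict("192.168.1.1/24", "10.0.0.5/8"))
```

Using `overlaps` rather than strict equality also catches the case where the upstream hands out an address from a larger network (for example a /16) that contains the LAN subnet.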

psherman · December 23, 2022, 12:25am

I'm not sure why this would be necessary. The lan is not an upstream gateway (and never is in the default configuration state), and therefore doesn't need a different metric.

The OpenWrt lan should always be reachable, even if the wan conflicts.

I just tested it to be sure:

I set an OpenWrt LAN IP to be the same as my main router
Connected the OpenWrt WAN to my upstream LAN
Plugged a computer into the OpenWrt LAN.

OpenWrt remained accessible with no issues via the IP address and https://openwrt.lan even though there was a conflict on the WAN. Routing, of course, doesn't work.

I think that I personally prefer the idea of automatically fixing the scenario, but guiding the user would be an interesting option. Maybe when the user logs into the OpenWrt interface, it would present a message saying that this issue was detected... maybe it could then give three options:

Fix it for me (automatic, aside from a click to initiate it)
Guide me through fixing it (guided but manual)
Ignore (dismiss the error, don't change anything).

patrakov · December 23, 2022, 2:29pm

I think it is a coincidence, based entirely on interface indices. We should not rely on this.

psherman · December 23, 2022, 3:15pm

Why would it be a coincidence? What do the interface indices have to do with the reachability of the OpenWrt device (which will, in this scenario, always be a direct L2 connection to the host being used for configuration)?

patrakov · December 23, 2022, 4:00pm

We consider a case where there are conflicting subnets on two interfaces. In our case, 192.168.1.0/24. Let's suppose that a packet arrives over the br-lan interface, from 192.168.1.234 to 192.168.1.1. The router needs to reply. However, in the absence of any policy-based routing (and by default, OpenWrt does not use it), replies do not consider the interface over which the original packet came in. All that matters is the routing table.

The usually-documented routing algorithm is:

In the routing table, find entries with the longest prefix that still match the destination host.
Among them, select the entry with the lowest metric, and use it.

However, what we get here is two routes to 192.168.1.0/24, one on br-lan, and one on the wan. And, because the default OpenWrt setup does not use metrics, the end result is still ambiguous. I don't know where it is documented, but the Linux kernel seems to prefer the route that was added first. In your case, this is LAN, and therefore the replies can reach the client. However, there is nothing which guarantees that the LAN will be configured before the WAN. That's why the suggestion to use different metrics.
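
To make the ambiguity concrete, here is a toy model of the selection rule in Python. It is only a sketch: the `Route` class and `select_route` function are made up for illustration, the tie-break (earliest table entry wins) models the "first added" behaviour described above rather than anything documented, and the metric values 10 and 20 are arbitrary.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    prefix: ipaddress.IPv4Network
    dev: str
    metric: int = 0


def select_route(table, dst):
    """Pick a route: longest matching prefix first, then lowest metric.

    Remaining ties go to the earliest entry in the table. That tie-break is
    an assumption modelling the "first added wins" behaviour, not a
    documented guarantee.
    """
    dst = ipaddress.ip_address(dst)
    candidates = [r for r in table if dst in r.prefix]
    if not candidates:
        return None
    # min() returns the first minimal element, so table order breaks ties.
    return min(candidates, key=lambda r: (-r.prefix.prefixlen, r.metric))


net = ipaddress.ip_network("192.168.1.0/24")

# No metrics set: the result depends on which interface came up first.
lan_first = [Route(net, "br-lan"), Route(net, "wan")]
wan_first = [Route(net, "wan"), Route(net, "br-lan")]
print(select_route(lan_first, "192.168.1.234").dev)  # br-lan
print(select_route(wan_first, "192.168.1.234").dev)  # wan

# Distinct metrics (hypothetical values): br-lan wins regardless of order.
with_metrics = [Route(net, "wan", metric=20), Route(net, "br-lan", metric=10)]
print(select_route(with_metrics, "192.168.1.234").dev)  # br-lan
```

In the `wan_first` case the reply to 192.168.1.234 leaves via the WAN and never reaches the client on br-lan, which is the failure the metric change is meant to rule out.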

psherman · December 23, 2022, 5:34pm

I don't recall seeing any situations on the forums where a user has been unable to reach the OpenWrt router when directly connected, even if the WAN conflicts. In fact, wouldn't this happen at L2 anyway? But, yes, I can imagine how there could be a problem if it happens at L3 and sends the packet out the wrong interface because of the ambiguity of the two identical/overlapping subnets.

But... moving to the next logical leap: Ok, so let's make an assumption that the conflict could actually prevent the downstream host from reaching OpenWrt for administration purposes... wouldn't that be an even stronger reason to have the automatic renumbering happen? In this situation, the only other way that the user would be able to gain access to the OpenWrt router would be to disconnect the upstream... but, especially when considering novice users, that may not be obvious to them. So if the system automatically re-assigns the LAN IP so that it doesn't conflict, it fixes a potential access issue and fixes routing. And this always works as long as the user enters openwrt.lan as the admin address (rather than 192.168.1.1 which may have changed in this context).

Remember, the idea of this FR targets two situations: novice users who set up a double-NAT situation with conflicting subnets on OpenWrt's WAN/LAN, or the road warrior situation where the upstream network may be unknown (since it may be a different network each time the user connects in a different location).

Experienced/advanced users (in a non-road-warrior context) would likely not be affected here since they may pre-configure OpenWrt's LAN address to ensure that it avoids such conflicts.

clutch2sft · December 24, 2022, 11:19am

I wonder deeply why the WRT router would even look in the routing table in your scenario. In my understanding of this the client would arp for 192.168.1.1 and get a response only from the br-lan interface. Then the client would send the packets to the responding br-lan MAC. All L2 communication without a router involved.

What about the return traffic? Good question. So I am not so clear on how the WRT would decide which interface to arp on for the return IP to MAC mapping. That is an interesting exercise and it is my impression that the collision and decision to "route" could happen if 192.168.1.234 was also used on the WAN side ...again making a TON of assumptions about how WRT would behave in this situation.

At that point the default is also to masquerade on the flow to the WAN ... so how does NAT behave on WRT when you NAT to the overlapped subnet? Going back in time to places I had to solve for that it seems to me that we would NAT both ways. So there is some trickery that could be placed here for that situation...probably.

Mostly I wrote the above for clarity in understanding the problem in my own head ...not to dispute the knowledge imparted by @patrakov. Marking the WAN down (or changing metric to make it appear down) seems logical and easier. Then some sort of documentation or notification of the problem to the end user is also logical to point the user in the right direction. Thanks for listening to me ramble.

patrakov · December 24, 2022, 1:39pm

You are correct about the return traffic being the problem here. The existence or non-existence of the ARP entry about the client on the router does not affect the decision about the outgoing interface - it is purely a routing decision, based on the routing table only, and nothing else. And if the decision is made to send the reply over the WAN (because it came up first), then the replies will definitely not reach the client.

And yes, 192.168.1.1 is entrenched into documentation, and we can't change the LAN network automatically even if a conflict is detected. The best that we can try to achieve is to make sure that the client can communicate with the router (to reconfigure it) even when the WAN network conflicts - by changing the metric, and displaying a warning.

Lucky1 · December 24, 2022, 3:34pm

Hey Guys
Just a tip that can help when playing with openwrt
use the IPV6 Addresses in your browser
just put some [] around the ipv6 addresses (at least in firefox for what I use)
even if you cascade 2 openwrt routers that both have 192.168.1.x
you can still use their ipv6 addresses to access both at the same time for setting up
just get the gateway address off the status page for the upstream router
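
A small sketch of the bracket rule, in Python. The helper name `admin_url` and the example address `fd12:3456:789a::1` are made up for illustration; substitute the address shown on your own status page.

```python
import ipaddress


def admin_url(address, scheme="http"):
    """Build a browser URL for a router address.

    IPv6 literals must be wrapped in [] inside a URL; IPv4 literals are not.
    The helper name is hypothetical.
    """
    ip = ipaddress.ip_address(address)  # raises ValueError on a bad address
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"{scheme}://{host}/"


print(admin_url("fd12:3456:789a::1"))  # http://[fd12:3456:789a::1]/
print(admin_url("192.168.1.1"))        # http://192.168.1.1/
```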

psherman · December 25, 2022, 3:42am

So to be clear, I am not married to this idea. In fact, I originally thought it was an unnecessary and silly request (if you refer to the links in the RFC description). However, I came around because I realized we actually see this issue quite often in the forums from people who are network and/or OpenWrt novices. It felt like there could be some 'automatic' means of helping them.

But, unless this process easily leads the (novice) user to a solution (and helps them learn), it doesn't seem to actually solve the issue (i.e. "internet doesn't work from the OpenWrt lan, but OpenWrt's diagnostics work just fine").

Documentation of WAN/LAN subnet conflicts already exists, but that doesn't seem to help many novice users who don't really understand why it isn't working and don't know that there is ample information in the OpenWrt documentation.

Detecting the problem and providing a link from within LuCI may not be all that useful since the user may not have the ability to reach the internet due to the address conflict.

In all the years I've been helping on the forums here (and it's been quite some time), I cannot recall a single instance of a user saying that they couldn't reach the OpenWrt router. I do agree that there is a theoretical possibility that it could happen, but I think it is very much the exception rather than the rule. If you can point me to users who have actually been unable to reach their OpenWrt router from a directly connected computer when they have this subnet conflict, I will absolutely admit that this is real and that I was mistaken. But I really think the concern is theoretical only.

I held this opinion at first, too. But now I very much disagree with this philosophy. Here's why:

192.168.1.1 will still be the default.
The address/subnet will only change in the event of a conflict.
There would only need to be one alternate address by default, so the router would either be 192.168.1.1 or (just throwing it out there) 10.0.0.1.
We can clearly document the default alternate address/subnet.
Saying that "we can't change the LAN network" is a thing of dogma.
- Just because there is historical use of one network address doesn't mean that it can't be changed.
- Other things have changed over time that were 'always' a certain way, until they weren't.
  - swconfig is being deprecated for more platforms with every new release, but we always used swconfig until DSA came around.
    - now we have new documentation explaining how to use DSA.
  - There are syntax constructs that have changed over the years, even though some of them had been very well entrenched and used for many many versions of OpenWrt (for example: the bridge used to be defined inside the network interface stanza, but now a bridge must be defined as a separate device)
  - IIRC (and I may be wrong about this) wifi was enabled by default on early versions of OpenWrt, but obviously now it is off by default for good reasons.
- Nobody is proposing changing the actual default address, just adding an alternate that will be known and allow normal routing/internet to work in the event of a wan/lan conflict.
If we update the documentation to prefer using openwrt.lan to reach the router in the default state rather than an IP address, that makes the potentially different LAN IP address a non-issue.
- This can become 'entrenched' in the documentation moving forward (as can the default-alternate address).
- The openwrt.lan idea (proposed by @richb-hanover-priv here) was literally the moment I was convinced... it's so simple and will 'just work'!
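
The "default plus one documented alternate" rule from the list above could look roughly like this. It is a sketch in Python, not a proposed implementation: `pick_lan_address` is a hypothetical name, 10.0.0.1 is only the address floated above, and the /24 mask on the alternate is my assumption.

```python
import ipaddress

DEFAULT_LAN = "192.168.1.1/24"
ALTERNATE_LAN = "10.0.0.1/24"  # 10.0.0.1 as floated above; /24 is an assumption


def pick_lan_address(wan_cidr, default=DEFAULT_LAN, alternate=ALTERNATE_LAN):
    """Return the first LAN address whose subnet does not overlap the WAN.

    The default is kept whenever it does not conflict. Returns None if both
    candidates conflict, leaving the decision to the user.
    """
    wan = ipaddress.ip_interface(wan_cidr).network
    for candidate in (default, alternate):
        if not ipaddress.ip_interface(candidate).network.overlaps(wan):
            return candidate
    return None


print(pick_lan_address("203.0.113.7/24"))   # 192.168.1.1/24
print(pick_lan_address("192.168.1.50/24"))  # 10.0.0.1/24
```

Because the result is one of only two documented addresses, and openwrt.lan resolves to whichever is in use, the user never has to guess where the router went.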

I think that this would be preferable compared to the current situation (since novice users don't understand why the internet isn't working, and some warning or guidance would help point them in the right direction), but I don't think changing the metric really solves any real problems (again, I cannot recall ever seeing a thread where the downstream OpenWrt router itself had become unreachable), and it certainly doesn't resolve the problem of routing not working without user intervention. And novice users may still be really frustrated since this may be beyond their current skill set to resolve.

Also, and I cannot stress this enough: this feature is not for us (by us, I mean those of us discussing in this thread, with the possible exception of travel-router carrying road-warriors). This is for the person who is completely new to OpenWrt and may not have much, if any, real network configuration experience. OpenWrt is billed as a superior option compared to most vendor firmware (and it really is, in most cases), but the new and novice user just sees a router that doesn't work and is super frustrated. When you look at it from a beginner's perspective, this can really improve their initial experience with OpenWrt.

msnews · January 6, 2023, 7:02pm

Automatically changing the LAN IP should warn users who have a static IP address set on their computer and prompt them to change the computer's IP as well. Many tutorials on the internet ask users to set a static IP address to connect to the router.

psherman · January 6, 2023, 7:33pm

How would the system know, reliably, that the IP address was manually set/static on the computer? Yes, you could evaluate the DHCP leases, but that only works if the host is known to have come online after the most recent reboot of the router. If the host is connected through an external switch, the port wouldn't bounce during a router reboot, so the host would not request a new DHCP lease. This could lead to a false-positive assumption.
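
Here is roughly what that lease check would look like, and why it is unreliable. This is a sketch in Python that assumes dnsmasq-style lease lines (expiry, MAC, IP, hostname, client ID), as commonly found in /tmp/dhcp.leases on OpenWrt; the function names and sample data are made up.

```python
def leased_ips(lease_text):
    """Collect the IP addresses from dnsmasq-style lease lines.

    Assumed line format: "<expiry> <mac> <ip> <hostname> <client-id>".
    """
    ips = set()
    for line in lease_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            ips.add(fields[2])
    return ips


def looks_static(host_ip, lease_text):
    """Guess that a host is statically configured if it holds no lease.

    This is only a heuristic: a DHCP client that has not renewed since the
    router rebooted (for example, one behind a switch) is also missing from
    the lease table and would be misreported as static.
    """
    return host_ip not in leased_ips(lease_text)


sample = "1672000000 aa:bb:cc:dd:ee:ff 192.168.1.234 laptop 01:aa:bb:cc:dd:ee:ff\n"
print(looks_static("192.168.1.234", sample))  # False: the host holds a lease
print(looks_static("192.168.1.50", sample))   # True, but possibly a false positive
```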

Beyond that, the detection would likely happen rather early in the cycle... likely/hopefully before DHCP leases are issued, and almost certainly before the user attempts to login to the router for administration purposes.

I did specifically mention in my RFC:

IMO (I will admit: this is opinion -- I don't have data to back this up -- so it could be factually incorrect):

the most common scenario where the automatic reassignment would occur is likely to be a novice user who is using DHCP on their computer
Many novice users will be physically connecting directly to their OpenWrt router via ethernet (not through a switch) at least for initial setup/configuration.
I view static IPs as an edge case scenario, unlikely to be common for novice users (for whom this feature would be useful).

msnews · January 6, 2023, 8:40pm

It does not need detection; just show a general warning message in the popup window to let the user know that they have to update their IP address if they use a static IP.

psherman · January 6, 2023, 9:54pm

This would only work (and be useful) if the user 1) opens the router admin page, and 2) does it before the router performs the automatic reassignment.

That said, a notice that the subnet has changed could be useful, still, in case the user does have a network that includes devices with static IP assignments and/or switches/APs that are downstream of the router and may therefore not bounce the connection.