No traffic on clients with no DHCP lease

Hi All,

I am a long time OpenWrt user and have several devices running in various environments.
One of these devices is in a remote location at the top of a hill (amateur radio repeater site) and provides internet to this location over a 4G connection.
The device is a TP-Link WDR4300 and it was running version 18 until last week, when I drove to the site and upgraded it to 21.02.

So here is the issue.
Behavior with version 18: After rebooting the router, the wired clients would not show up in the DHCP leases, yet they were able to communicate with each other on the LAN and reach the internet. The clients would eventually show up in the lease list once their lease expired and they actively requested a new one.
Having them as Static leases did not affect this behavior.

Since version 21.02
After rebooting the router, the wired clients cannot communicate with each other over the LAN or reach the internet. The ARP table on the router shows 00:00:00:00:00:00 for the clients.
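For anyone wanting to check for this state from a saved copy of the router's ARP table, here is a minimal sketch. It assumes the usual Linux /proc/net/arp column layout (IP address, HW type, Flags, HW address, Mask, Device); the function name and the sample addresses are made up for illustration and are not taken from my router.

```python
ZERO_MAC = "00:00:00:00:00:00"


def unresolved_arp_entries(arp_text):
    """Return (ip, device) pairs whose hardware address is all zeros."""
    entries = []
    lines = arp_text.strip().splitlines()
    for line in lines[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed or truncated rows
        ip, _hw_type, _flags, mac, _mask, device = fields[:6]
        if mac.lower() == ZERO_MAC:
            entries.append((ip, device))
    return entries


# Hypothetical sample in /proc/net/arp format
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.10     0x1         0x0         00:00:00:00:00:00     *        br-lan
192.168.1.20     0x1         0x2         aa:bb:cc:dd:ee:ff     *        br-lan
"""

if __name__ == "__main__":
    for ip, dev in unresolved_arp_entries(SAMPLE):
        print(f"{ip} on {dev} has no resolved MAC")
```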

The workaround is to set a very short lease time, yet I think there might be a more serious underlying issue. Moreover, my DHCP lease time was initially 3d, so I had to wait 3 days for my devices to come back online. Going on site just to unplug and replug the network cable is not a viable solution, since it is a 1 hour drive and in winter I need to hike the last kilometers in the snow....
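To put numbers on why the lease time matters: per RFC 2131, a DHCP client by default tries to renew at 50% of the lease (T1) and to rebind at 87.5% (T2), unless the server hands out other values. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def lease_timers(lease_seconds):
    """Return (T1, T2, expiry) in seconds using the RFC 2131 default ratios."""
    t1 = 0.5 * lease_seconds      # client starts unicast renewal attempts
    t2 = 0.875 * lease_seconds    # client falls back to broadcast rebinding
    return t1, t2, float(lease_seconds)


if __name__ == "__main__":
    three_days = 3 * 24 * 3600
    for name, seconds in zip(("T1", "T2", "expiry"), lease_timers(three_days)):
        print(f"{name}: {seconds / 3600:.0f} h")
```

With a 3d lease this gives 36 h, 63 h and 72 h, so a client holding a stale lease may not ask again for a day and a half at the earliest, which is why a short lease shortens the outage.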

Thanks for reading and any help

Edit: Adding the clients in /etc/ethers also solves the issue, yet I do not want to have to maintain the hosts in 2 different places (i.e. /etc/ethers and static leases).

Edit 2: Well, in the end it all boils down to the clients requesting their DHCP leases before the switch is initialized, i.e. a WAN leak. Since the 4G connection is provided by another router in another local subnet, the clients get their DHCP leases from there...
As a matter of fact, the entry in /etc/ethers does not help; it was a coincidence. It's all about the classic dreaded LAN to WAN leakage. I did not have this issue in v18 though....

This sounds like DHCP snooping. Is there another switch on the site, or do the devices connect to the WDR4300 directly?

Thanks for your reply.
Indeed this is a DHCP issue ....

Here is some more info: port 4 is used as an alternate WAN. In case the main 4G WAN fails, we have a wifi bridge which still allows us to access the site. Failover is handled using MWAN3.
On the other side of the wifi bridge there is another DHCP server.

There is also a DHCP server on the "original" WAN port, but my clients never get an IP from this one, only from the DHCP server behind port 4.

Here is a quick drawing of the setup

When the WDR4300 boots, the LAN clients are assigned an IP from the DHCP highlighted in blue. If I turn off the wifi bridge (kind of unplugging the cable) all is fine and clients stick to the IPs provided by WDR4300.
I never had this issue with version 18.... As the saying goes "never touch a running system"....

Ok here is my best guess of what is going on after some more testing...

On v21, all wired clients receive a new IP when the WDR4300 is rebooted.
In v18, after a router reboot, the clients would stick with their IPs until renewal without the WDR4300 DHCP server knowing about them, yet they could use the network without any issue.

In v21 the wired clients get an IP when the WDR4300 reboots. IMHO OpenWrt toggles the switch off and on to force the clients to request a new IP (virtually unplugging and replugging the cable).
However, this is done before the switch configuration is applied and before the DHCP daemon is started on the WDR4300, hence the clients get an IP from the "blue" DHCP....

I think we have a bug here.... Devs forgot that some users might not stick with the default switch configuration.

As a workaround, I have to find a way to toggle the switch off and on after the DHCP daemon has been started on the WDR4300, or I could attach the wired clients to another switch so that they do not notice the WDR4300 rebooting...

This is a well known issue of all the cheap boards used in such routers and it is irrelevant of the OpenWrt version. The question is: how often does the router reboot?

I would expect such a behavior during a restart.
I think it would be better if you used static IPs, or an extra switch between the WDR4300 and the lan hosts to keep their links up while rebooting.

> This is a well known issue of all the cheap boards used in such routers and it is irrelevant of the OpenWrt version. The question is: how often does the router reboot?

I agree, yet I never had the clients getting an IP from the DHCP server hooked to the regular WAN port, and with v18 I never had an issue with my secondary WAN port. I see this as a regression.
The router can be up for weeks, but it is not uncommon to experience power outages during winter or during thunderstorms in summer.

> I would expect such a behavior during a restart.
> I think it would be better if you used static IPs, or an extra switch between the WDR4300 and the lan hosts to keep their links up while rebooting.

There are some clients that do not belong to me and I have no access to their configuration, so static IPs on the clients are a no go. Adding an additional switch would help, but I do not like that either. The more hardware I have, the more failure sources I have. The site is more or less an outdoor cabinet near a radio tower. Inside the cabinet, temperatures can range from -5 in winter to +45°C (-15° to +30° outside temperature). The WDR4300 has proved to be reliable and has not failed once in 5 years.

You could open a bug report.

I guess that the power outage will also affect the LTE router which would also boot and not provide addresses in the meantime.

You're lucky on this aspect, as the temperature range for this device is 0-40°C (bottom of the page). In such an environment it's better to look for industrial spec devices. For example Rambutan has 2 versions.

*** Industrial temperature range: -40 - 85°C, Commercial temperature range: 0 - 65°C

You could of course build a fan and heating resistor installation controlled by thermistors to provide some fresh air or heat according to the ambient temperature.

The Ethernet switch chip typically powers up in a dumb mode where all the ports are switched together like an unmanaged switch. This helps the chip maker sell the chip for unmanaged switch applications: manufacturers can build an unmanaged switch around the chip with little other hardware.

Very shortly after power on the bootloader takes control of the switch and usually (not always) sets it into a 4+1 mode with the WAN port isolated. The stock firmware continues in that mode. OpenWrt will reconfigure the switch to however it is set, but that takes 15 seconds or so after power up.
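The timing can be illustrated with a toy model. Everything here is an assumption for illustration: only the roughly 15 second figure comes from the paragraph above, the DHCP start time is made up, and the idea that port 4 stays bridged with the other LAN ports until OpenWrt applies its own layout is an inference from the default 4+1 mode, not something verified on this device.

```python
# Assumed timeline, in seconds after power-on
SWITCH_CONFIGURED_AT = 15.0  # OpenWrt applies the custom switch layout (approximate)
DHCP_STARTED_AT = 20.0       # hypothetical: local DHCP daemon starts answering


def servers_reachable(request_time):
    """Return which DHCP servers could answer a LAN client's broadcast."""
    reachable = []
    if request_time < SWITCH_CONFIGURED_AT:
        # Port 4 is assumed to still be in the LAN group (default 4+1 layout),
        # so the server behind the wifi bridge hears the broadcast.
        reachable.append("upstream server behind port 4")
    if request_time >= DHCP_STARTED_AT:
        reachable.append("WDR4300")
    return reachable


if __name__ == "__main__":
    for t in (5.0, 17.0, 30.0):
        print(f"t={t:>4} s -> {servers_reachable(t) or 'nobody answers'}")
```

In this model, a client that sends its request in the first seconds after link-up can only be answered by the server behind port 4, which would match the behavior reported in this thread.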

A potential solution then is to replace the bootloader with one that does set the switch to an isolated mode. The 'pepe2k' project may cover this model.

Other directions would be to configure some upstream device to block DHCP requests coming from this router, or not use DHCP at all on the upstream network. The WAN would operate with a static IP.

I don't think there is a userspace way in OpenWrt to completely shut down the switch at layer 1 (so the links drop to the endpoints, and they will re-DHCP when the switch is restarted).

@trendy
I am grateful for all the explanations, and yes, I am aware that my setup is not perfect in some aspects, but it has served my purpose well over the last 5 years. I made do with what I have, and have already invested lots of money into this site (outdoor cabinets are not cheap at all)...
The LTE router is a courtesy of someone who kindly allowed me to use some of their bandwidth; it has a battery backup and is therefore on almost 24/7. My setup does not (yet) have a battery backup.

@mk24
Thank you for the explanation. Indeed, I think the bootloader sets the switch in 4+1 mode, hence I never had an issue with getting an IP from the DHCP on the WAN. Yet this does not explain why I never had an issue with the secondary WAN port on previous versions.

I will open a bug report
