I have a Zbtlink ZBT-WR8305RT running OpenWrt 18.06.4 that I am trying to use as a WiFi-to-Ethernet bridge that does not need to be rebooted periodically. On the LAN I have a CentOS server providing the DNS and DHCP services. In the OpenWrt LAN settings I have ticked the "Ignore interface" option under "DHCP Server".
It works as expected when the OpenWrt router is first started, but after it has been running for a while the WiFi-connected devices cannot get or renew their DHCP leases. With Wireshark on the server I can see the "DHCP Discover" message from the client and the "DHCP Offer" reply from the server. Running Wireshark on the client, I can see it send the "DHCP Discover" but the reply never arrives. Rebooting the OpenWrt router resolves the problem for a while, but then it occurs again. Restarting the firewall does not help. I have also tried enabling the OpenWrt router as, effectively, a caching DHCP/DNS server, but that did not appear to change the behavior.
To me it appears that, after operating for some time, it stops forwarding the DHCP replies (UDP port 67 to port 68) from Ethernet to WiFi. Does anyone have any suggestions as to what the cause may be?
Just to confirm, the wireless segment is bridged to the wired segment, correct? brctl show should show the Ethernet interface (or sub-interface) and wireless virtual interface in the same bridge.
As a precaution enable STP.
One more thing to try, when this problem occurs, is to have a host with static IP and start pinging the DHCP server and the gateway.
Does it occur only to wifi connected hosts or wired ones too?
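For reference, the bridge membership check suggested above can be scripted. This is only a minimal sketch in Python, assuming the usual brctl show column layout (bridge name, bridge id, STP flag, first interface, with further interfaces on continuation lines). The function names parse_brctl_show and same_bridge are made up for illustration, and the sample text simply mirrors the interface names used in this thread.

```python
def parse_brctl_show(text):
    """Map each bridge name to the list of its member interfaces."""
    bridges = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[:2] == ["bridge", "name"]:
            continue  # skip blank lines and the header row
        if line[0].isspace() or len(fields) == 1:
            # Continuation row: it carries only one more interface name.
            if current is not None:
                bridges[current].append(fields[-1])
            continue
        # First row of a bridge: name, id, STP flag, optional first interface.
        current = fields[0]
        bridges[current] = fields[3:4]
    return bridges


def same_bridge(bridges, *ifaces):
    """Return True if all given interfaces are members of one bridge."""
    return any(all(i in members for i in ifaces) for members in bridges.values())


# Sample input shaped like `brctl show` output on the access point.
sample = """\
bridge name     bridge id               STP enabled     interfaces
br-lan          7fff.6460f55298da       yes             eth0.1
                                                        wlan0
"""

bridges = parse_brctl_show(sample)
print(bridges)
print(same_bridge(bridges, "eth0.1", "wlan0"))
```

If the wired and wireless interfaces do not end up in the same bridge, the wireless segment is not actually bridged and the rest of the troubleshooting changes.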
root@ap1:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br-lan          7fff.6460f55298da       yes             eth0.1
                                                        wlan0
I then restarted the LAN interface from LuCI, but it did not help.
Wired devices are unaffected.
I tried setting a static IP address on a WiFi-connected laptop and it did not work. I can see traffic from it on my server, so I assume there is actually an issue with all traffic going back to WiFi clients after a while, and I was being misled by my server logs, which only report dhcpd issues.
Thanks for suggesting that, as I now realise I am probably chasing a more general issue and my post subject probably needs to be updated.
When the issue occurs, can you ping the bridge IP 10.11.1.21?
Does the bridge have valid arp entries to know where to find each host?
Is there anything suspicious in the logs?
I can ping it OK from the Ethernet LAN but not from the WiFi network. The ARP table on the router shows entries for the Ethernet devices that have been connected to it, but for only one of the WiFi devices, and I assume that is a cached entry that has not timed out since the problem occurred (I'm rebooting often to keep WiFi users happy). On the WiFi client there are no ARP entries; I assume they have all expired.
The OpenWrt system log looks pretty much the same all the way through, and I think it is OK. Here is the tail of it:
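To make the ARP check repeatable, the router's table can be saved from /proc/net/arp and summarised. The following is a rough sketch in Python, assuming the standard Linux /proc/net/arp layout (IP address, HW type, Flags, HW address, Mask, Device) and that flag bit 0x2 marks a resolved entry. The helper names and the sample addresses and MACs are invented for illustration and are not taken from my network.

```python
ATF_COM = 0x2  # kernel ARP flag: entry is complete (MAC address resolved)


def parse_proc_net_arp(text):
    """Parse /proc/net/arp content into a list of dicts, skipping the header."""
    entries = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed lines
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        entries.append({
            "ip": ip,
            "mac": mac,
            "device": device,
            "complete": bool(int(flags, 16) & ATF_COM),
        })
    return entries


def unresolved(entries):
    """Return the IP addresses whose ARP entry is not resolved."""
    return [e["ip"] for e in entries if not e["complete"]]


# Hypothetical sample: one resolved host and one that never answered.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.11.1.1        0x1         0x2         02:00:00:00:00:01     *        br-lan
10.11.1.50       0x1         0x0         00:00:00:00:00:00     *        br-lan
"""

entries = parse_proc_net_arp(SAMPLE)
print(unresolved(entries))
```

Note that on a bridged setup every entry is listed against the bridge device, so this only shows which hosts have stopped answering ARP, not whether they are on the WiFi or the Ethernet side.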
These rt2x00queue errors have been a long standing issue with rt2x00 (so I wouldn't hold my breath for ever getting this fixed), but there has been activity on this particular issue recently - please try snapshots instead of 18.06.x.
Thanks. Searching on that kernel error message does back up that it is not actually a simple dropped frame, and it explains the things I have seen. I saw an earlier mention of disabling WMM, so I am trying that along with the latest snapshot as per slh's suggestion. I will post an update on how it goes after it has been running for a while.
I suspected that was the case but a reliable 54Mb connection is more useful to me than a faster one that takes everything off line periodically until rebooted.
It has run reliably overnight, so I have just turned WMM back on and will see if the problem returns.
It has been running now for a while without this issue. I am seeing other issues but they are minor by comparison.
For anyone reading this thread in the future, the summary is:
The issue was not related to DHCP; it was just that the LAN's server was generating log events about DHCP because those requests were affected.
The real issue showed up as "ieee80211 phy0: rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 2" messages in the kernel log and is a known issue with the current stable builds (18.06.4) for the MT7620 SoC.
The best fix is to flash the system with the current snapshot build. Hopefully the fixes in the snapshot build will find their way into a release build soon.
My thanks to all the people who took the time to reply and point me in the right direction.
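If you want to check whether a device is hitting the same problem, the kernel log can be scanned for that message. Below is a small sketch in Python, assuming the message text is exactly as quoted above. The function name count_tx_queue_errors is made up, and the timestamps in the sample lines are invented placeholders rather than real log output.

```python
import re
from collections import Counter

# Matches the rt2x00 error quoted in this thread; captures the phy and queue.
TX_QUEUE_FULL = re.compile(
    r"ieee80211 (phy\d+): rt2x00queue_write_tx_frame: "
    r"Error - Dropping frame due to full tx queue (\d+)"
)


def count_tx_queue_errors(log_text):
    """Count full-tx-queue errors per (phy, queue number) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TX_QUEUE_FULL.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Hypothetical log excerpt; only the message text comes from this thread.
SAMPLE_LOG = """\
[ 1000.000001] ieee80211 phy0: rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 2
[ 1000.000002] br-lan: port 2(wlan0) entered forwarding state
[ 1000.000003] ieee80211 phy0: rt2x00queue_write_tx_frame: Error - Dropping frame due to full tx queue 2
"""

counts = count_tx_queue_errors(SAMPLE_LOG)
for (phy, queue), n in sorted(counts.items()):
    print(f"{phy} queue {queue}: {n} dropped-frame errors")
```

The log text can be captured on the router with dmesg or logread and analysed on another machine, since Python is not necessarily installed on the router itself.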