The most common problem is that the client router cannot pass the DHCP message between the main router and the client connected to the client router.
if I am reading this correctly, I am not seeing this. broadcast responses DO make it back to the WiFi client on the OpenWRT bridge (see tcpdump on wlan1). the rest is software, isn't it? these responses don't seem to make it through relayd (see tcpdump on br-lan), either because I misconfigured it, or because it's buggy.
if I use ping from a PC connected to the OpenWRT bridge, I see ICMP requests and responses on both relayd's interfaces (wlan1 and br-lan). SSH, NFS etc all work through relayd. however if I use DHCP, the responses seem to vanish within relayd (on their way from wlan1 to br-lan).
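For anyone who wants to repeat this comparison, a capture along the following lines should show where the replies stop. This is only a sketch: the interface names wlan1 and br-lan are taken from the posts above, `DHCP_FILTER` and `watch_dhcp` are names made up for this example, and tcpdump has to be installed and run as root.

```shell
# Capture filter for DHCP traffic: the server side uses UDP port 67,
# the client side UDP port 68.
DHCP_FILTER='udp and (port 67 or port 68)'

# watch_dhcp IFACE: print DHCP packets seen on IFACE.
#   -n  no name resolution
#   -e  also print the Ethernet header (source and destination MAC)
watch_dhcp() {
    tcpdump -n -e -i "$1" "$DHCP_FILTER"
}

# Run one capture per SSH session, then request a lease on the PC:
#   watch_dhcp wlan1     # wireless side of relayd
#   watch_dhcp br-lan    # wired side of relayd
```

A reply that shows up on wlan1 but never on br-lan is being dropped somewhere between the two interfaces.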
Seems like the warning at the beginning of the Relayd wiki perhaps applies to your WNDR3800. No one can explain properly why DHCP doesn't work for some routers.
I've not had any DHCP issues with AR9344 based WD N750 router, Home Hub 5A (Qualcomm wifi), MT7628 based Archer C50 v4, and IPQ4018 based EA6350v3.
I can't see why it would make a difference, but is the relayd wireless link using 2.4 or 5 GHz radio? I only ever use 5 GHz.
and guess what? exact same problem! normal traffic works, DHCP doesn't!
OpenWrt 19.07.0
could you please share your working HH 5A configuration and the version of OpenWRT you are using.
I am also using 5GHz only.
BTW HH 5A is supposed to be a much more powerful router with 802.11ac and everything. but the bridged transfer rate is a mere 5 MBytes/s compared to WNDR's 13 MBytes/s. go figure.
I use LEDE 17.01.4 on HH5A. (Had wifi disconnect issues with 17.01.6). I have previously tested 18.06.0 briefly on HH5A to check it did work. Can't remember whether I tried relayd when 19.07 snapshots first appeared for HH5A.
I presume your firewall zone is correctly configured.
Your 'network' file contains gateway and DNS entries for LAN interface, which are not required.
My 'wireless' file contains 'option bssid' which is not required.
Update: I copied and pasted your 'network' and 'wireless' file onto a spare HH5A with 19.07.0. I edited some IPs, SSID/password. Puzzled by why I couldn't ping my main router until I realised you used 'wlan' as interface name. After correcting the firewall file, all is working including DHCP.
My main router and DHCP server is a HH5A running LEDE 17.01.6 btw
thank you for all your effort. I literally copied and pasted your config file onto a clean HH 5A, changed IP addresses and it still doesn't bloody work. I suspect what must be the difference is that my DHCP server is different from my access point. I have a router (192.168.100.128) and an AP I am bridging to (192.168.100.80), so the DHCP requests go via the AP then into LAN switch and into the router.
my AP (Mikrotik) supports DHCP relay which was disabled by default. I tried enabling it on the AP's 5GHz interface and it makes no difference. when I manually run the DHCP client on my PC connected to the HH 5A, the count of DHCP packets relayed by the AP is increasing accordingly (it has counters for requests and responses). I don't think I need the relay as I only have one subnet on the LAN so I turned it back off.
still, what I was seeing with tcpdump on OpenWRT confirms that the DHCP response makes its way back through the WiFi client.
ie. router/DHCP-server (HH5A LEDE 17.01.6, 192.168.1.254) which is ethernet wired to dumb AP (192.168.1.236) for past 3+ years. The AP was formerly another HH5A (LEDE) but I recently changed to EA6350v3 (Linksys stock FW) using 'bridge' (ie. AP) mode, keeping same 192.168.1.236 address. I only use one subnet 192.168.1.x for everything.
Can I perhaps suggest temporarily installing another AP using your spare WNDR3800, i.e. bypassing your Mikrotik AP? As another test, can you verify that relayd on the HH5A can connect to a standalone WNDR3800 configured as a regular OpenWrt wifi router with its own DHCP server?
Your network and wireless config files are indeed practically identical to my originals. You haven't shared the firewall file.
I leave the firewall enabled. I presume the 'lan' zone is identical to the one I shared otherwise relayd probably wouldn't work at all.
config zone
    option name 'lan'
    option input 'ACCEPT'
    option output 'ACCEPT'
    option forward 'ACCEPT'
    option network 'lan wwan'
Bill, would it be possible for you to add the -d option to relayd on your relay unit and run a few DHCP requests to see what it should look like?
I don't know how to add arbitrary options to daemons in OpenWRT using configs, so one way to do this would be to SSH to the WiFi client's address, do ps | grep [r]elayd, note that line, then stop relayd and run it manually by adding the -d option to the noted arguments.
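The procedure described above could look roughly like this on the relay unit. Treat it as a sketch: `with_debug` is a helper invented for this example, the sample arguments in the comment are hypothetical, and the real ones have to be copied from your own ps output.

```shell
# with_debug CMDLINE: print CMDLINE with relayd's -d (debug) option
# inserted right after the program name.
with_debug() {
    prog=${1%% *}      # first word, e.g. /usr/sbin/relayd
    args=${1#"$prog"}  # everything after it, leading space included
    printf '%s -d%s\n' "$prog" "$args"
}

# On the relay unit:
#   ps | grep '[r]elayd'     # note the full command line
#   stop the running relayd instance (how depends on how it was started)
#   then run the line printed by with_debug in the foreground, e.g.
#   with_debug '/usr/sbin/relayd -I br-lan -I wlan1 -B -D'   # hypothetical arguments
```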
that way it will show all requests and responses and hopefully that will be different from mine. I checked the source code and it seems to be validating quite a few things in DHCP packets before forwarding.
the problem seems to be in the Mikrotik AP to which the wireless client is connected. it seems to mangle the DHCP responses somehow that they are not seen or accepted by relayd.
I set up a different AP (OpenWRT) without a DHCP server and plugged into the same LAN Mikrotik is plugged in, and I don't have the DHCP problem.
I compared responses (DHCP ACKs) coming via Mikrotik and non-Mikrotik APs.
the only difference is that in the latter the response has a broadcast address in the Ethernet header (works) while in the former it's the MAC address of the WiFi client (doesn't work).
Initially I landed on this thread because I was not able to "see" a device (an IP cam) connected to the access point. Essentially it is the only device connected to this Access Point so far, since all the others are connected to the main modem/router (Zyxel VMG8825-T50K by Tiscali Italy).
When I say "see" I mean that the MAC of this IP cam was not appearing here:
so to me the IP cam MAC was not "passed" to the Zyxel router.
The strange thing was that sometimes (even if this is not stable and I don't know if this is another problem...) it was working (I was able to access the cam even via mobile data connection: so it was on the Internet).
So:
the IP cam is connected to the AP-WDS network I created:
(I'm allowed only for one figure per post)
and the IP is the one I "assigned" as static in the Zyxel router
(I'm allowed only for one figure per post)
in the ARP table the IP cam is present but with the MAC of the Access Point:
(I'm allowed only for one figure per post)
as written before, the IP cam connection is very unstable. It works for maybe one hour, then it drops and I'm not able to make it connect again, even after restarting the Zyxel modem and/or the Netgear Access Point and/or the cam itself.
Is this related to some configuration problem, and hence to the previous points?
Running a Mikrotik router/AP and an OpenWrt wireless bridge, after a RouterOS upgrade to 6.47 I found myself in the exact same situation as you describe. DHCP packets from the server are now coming with a unicast L2 address and are not relayed.
Judging from sources, relayd indeed only watches for DHCP packets with broadcast or multicast addresses. I don't feel up to devising a fix, maybe we could file a feature request.
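To make the distinction concrete: a destination MAC address is a group address (broadcast or multicast) when the lowest bit of its first octet is set, and unicast otherwise. The helper below only illustrates that rule in shell. It is not relayd's code, `is_group_mac` is a name made up for this sketch, and the unicast address is an arbitrary example.

```shell
# is_group_mac MAC: succeed if MAC is a broadcast or multicast address,
# i.e. the least significant bit of the first octet (the I/G bit) is 1.
is_group_mac() {
    first_octet=${1%%:*}
    [ $(( 0x$first_octet & 1 )) -eq 1 ]
}

# Destination MAC of a DHCP reply, as printed by "tcpdump -e":
is_group_mac ff:ff:ff:ff:ff:ff && echo "broadcast/multicast"
is_group_mac 00:11:22:33:44:55 || echo "unicast"
```

A reply addressed to ff:ff:ff:ff:ff:ff falls in the first class, while a reply addressed directly to the WiFi client's MAC falls in the second.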
Anyway, my current solution is to stop relying on relayd's DHCP capabilities and use a standalone DHCP relay. There are multiple implementations to choose from. I went for dnsmasq as it's already installed and well integrated (it comes with an init.d script etc.)
First step is to run relayd without the -D parameter. Can be done in /etc/config/network:
config interface 'sta_bridge'
    option proto 'relay'
    option network 'lan wwan'
    ...
    option forward_dhcp 0 # add this
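If you prefer the command line to editing the file, the same change can probably be made with uci. This is a sketch that assumes the relay interface section is named sta_bridge as in the snippet above; `disable_relayd_dhcp` is just a wrapper name chosen here.

```shell
# Turn off relayd's built-in DHCP forwarding for the given relay interface
# (default: sta_bridge, the section name used in the snippet above).
disable_relayd_dhcp() {
    iface=${1:-sta_bridge}
    uci set network."$iface".forward_dhcp='0'
    uci commit network
    /etc/init.d/network reload
}

# On the OpenWrt bridge, as root:
#   disable_relayd_dhcp
```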
Second, configure dnsmasq as a DHCP relay. This is what my /etc/config/dhcp looks like:
config dnsmasq
    option port '0' # optional, disable dnsmasq's DNS server, I don't need it

config relay
    option interface 'wwan'
    option local_addr '192.168.2.1' # lan interface address
    option server_addr '192.168.1.1' # the actual dhcp server address

# the rest is only needed for IPv6 (odhcpd)
config dhcp 'lan'
    option interface 'lan'
    option dhcpv4 'disabled'
    option ra 'relay'
    option ndp 'relay'
    option dhcpv6 'relay'

config dhcp 'wwan'
    option interface 'wwan'
    option dhcpv4 'disabled'
    option ra 'relay'
    option ndp 'relay'
    option dhcpv6 'relay'
    option master '1'
The final step is to make sure the real DHCP server accepts requests from relay agents. On Mikrotik I did: /ip dhcp-server set defconf relay=255.255.255.255
sadly it didn't help for me. i recently bought an archer C7 v5 and decided to flash openwrt on it to use it as a wireless bridge to my mikrotik, to extend the local network to the whole house. it works if you create a separate network in openwrt but doesn't work if you want to be on the same network, as the mikrotik gives warning messages about DHCP (dhcp1 offering lease 192.168.1.29 for xx:xx:60:x2 to Dx:xx:xx:03:x9:xF without success)
where the first mac is my phone and the second is the openwrt's wifi client.
tried your settings and a few in the routeros firewall but nothing helped.. it's just not compatible.
it sucks to have 2 different networks for no reason, yet i have to use it like that for now.
I wasn't able to make it work either, even after copying the conf files above. I should say that I'm not much of an expert, so I don't really understand all the steps I make...
What I really don't understand is why sometimes it's working...
Does it still happen if you stop firewall on openwrt? Just guessing.
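On OpenWrt the firewall is an init script, so it can be stopped for a quick test and started again afterwards. A minimal sketch, to be run as root on the bridge; the wrapper `firewall_test` is invented for this example, and the unit is unprotected while the firewall is stopped.

```shell
# firewall_test: stop the firewall, wait for the user to retry DHCP,
# then bring the firewall back up.
firewall_test() {
    /etc/init.d/firewall stop
    printf 'Firewall stopped. Retry DHCP on the client, then press Enter. '
    read -r _
    /etc/init.d/firewall start
}

# On the OpenWrt bridge, as root:
#   firewall_test
```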
The error message suggests that either the Offer packet won't make it to the end device (phone) or the subsequent Request packet is not sent or never reaches the dhcp server.
In the search for the point of failure, tcpdump (opkg install tcpdump-mini) is your friend. Try listening on udp ports 67-68, whether the offer is received on the wireless interface and is relayed to the other interface.
Edit: and yes, if wired connection is an option, go for it and forget all of this...
Can I ask you how to stop the firewall? I "disabled" for all the "interfaces" but I don't know if this is enough.
The strange thing (but again, I'm not an expert at all) is that the one client (an IP cam) sometimes connects to it and receives the right IP address (right in the sense that it is the one I configured statically in the main router and DHCP server). In this case the IP cam is working fine but the device (MAC and IP) is still not listed by the main router. After some time (10-30 minutes) the connection is lost...