Relayd not forwarding broadcast (BOOTP/DHCP) responses

hi everyone

I am trying to set up my Netgear WNDR3800 as a wireless (pseudo-) bridge. the device doesn't support WDS so I followed numerous guides to set up relayd. I don't need this device to repeat or act as an access point. I literally want to extend my LAN using WiFi.

bridging works (mostly) and performance is great. however I cannot get broadcast traffic to bridge between my wifi client and the wired lan.

I turned off firewall just in case. I used luci for configuration and then re-checked config files manually.

relayd is running with the following options:

/usr/sbin/relayd -I br-lan -I wlan1 -L 192.168.100.83 -B -D

wlan1 is my wifi client and br-lan is obvious. I can ping from a PC connected to br-lan, but DHCP doesn't get any responses.

I installed tcpdump on the WNDR3800 and captured BOOTP/DHCP first on wlan1 (traffic passes in both directions) and then on br-lan (I only see requests, no responses). so the broadcast traffic passes in both directions via the WiFi client. I think this shows that relayd is not forwarding broadcast responses back to the originator.

root@lrbridge:~# tcpdump -i wlan1 "port bootps"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan1, link-type EN10MB (Ethernet), capture size 262144 bytes
17:36:29.182861 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:23:ae:65:a9:d5 (oui Unknown), length 300
17:36:29.193569 IP gw.home.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300

however

root@lrbridge:~# tcpdump -i br-lan "port bootps"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-lan, link-type EN10MB (Ethernet), capture size 262144 bytes
17:36:38.598745 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:23:ae:65:a9:d5 (oui Unknown), length 300
^C
1 packet captured
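
The same comparison can be done mechanically. Below is a minimal Python sketch, assuming each capture has been saved as plain tcpdump text. The function names are hypothetical, and the sample lines are copied from the two captures above. For a strict comparison, both captures should cover the same DHCP exchange.

```python
import re

# Matches the message type in tcpdump's one-line BOOTP/DHCP summary.
_DHCP_LINE = re.compile(r"BOOTP/DHCP, (Request|Reply)")


def count_dhcp_messages(capture_text):
    """Count BOOTP/DHCP requests and replies in one tcpdump text capture."""
    counts = {"Request": 0, "Reply": 0}
    for line in capture_text.splitlines():
        match = _DHCP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def replies_lost(upstream_text, downstream_text):
    """True when replies were seen upstream but none reached downstream."""
    upstream = count_dhcp_messages(upstream_text)
    downstream = count_dhcp_messages(downstream_text)
    return upstream["Reply"] > 0 and downstream["Reply"] == 0


# Sample lines copied from the wlan1 and br-lan captures in this post.
WLAN1_CAPTURE = """\
17:36:29.182861 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:23:ae:65:a9:d5 (oui Unknown), length 300
17:36:29.193569 IP gw.home.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
"""
BR_LAN_CAPTURE = """\
17:36:38.598745 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:23:ae:65:a9:d5 (oui Unknown), length 300
"""

print("wlan1: ", count_dhcp_messages(WLAN1_CAPTURE))
print("br-lan:", count_dhcp_messages(BR_LAN_CAPTURE))
print("replies lost between wlan1 and br-lan:",
      replies_lost(WLAN1_CAPTURE, BR_LAN_CAPTURE))
```

This only counts what tcpdump printed. It says nothing about why a reply was dropped.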

am I right in my analysis?

below is my /etc/config/network:

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fddb:40a0:9d52::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.99.128'
	option gateway '192.168.100.128'
	list dns '192.168.100.128'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'
	option blinkrate '2'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 1 2 3 5t'

config switch_port
	option device 'switch0'
	option port '1'
	option led '6'

config switch_port
	option device 'switch0'
	option port '2'
	option led '9'

config switch_port
	option device 'switch0'
	option port '5'
	option led '2'

config interface 'wlan'
	option proto 'static'
	option netmask '255.255.255.0'
	list dns '192.168.100.128'
	option ipaddr '192.168.100.83'
	option gateway '192.168.100.128'

config interface 'stabridge'
	option proto 'relay'
	option ipaddr '192.168.100.83'
	list network 'lan'
	list network 'wlan'

I am running OpenWrt 19.07.0 r10860-a3ffeb413b.

I would be grateful for any suggestions.

thanks in advance.

Your network file seems to look okay at first glance.

Have you seen the warning at the top of the relay configuration wiki page?

https://openwrt.org/docs/guide-user/network/wifi/relay_configuration

The most common problem is that the client router cannot pass the DHCP message between the main router and the client connected to the client router. Currently it seems to be the hardware/SOC limitation (related to MAC cloning?)

fwiw, I configured relayd on an EA6350v3 (ipq4018) yesterday to test throughput to a desktop PC. (OpenWrt relayd throughput was not as good as the stock Linksys firmware in wireless bridge mode.) I used the instructions in section 9.10 of the Installation guide for HH5A:
https://openwrt.ebilan.co.uk/viewtopic.php?f=7&t=266

DHCP working fine with wired desktop PC.

OpenWrt SNAPSHOT r12110-4576a753f2 (I also tested 19.07-snapshot several months ago too)

Here are my config files for comparison

/etc/config/network

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fdde:0920:aae5::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0'
	option proto 'static'
	option ipaddr '192.168.111.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config device 'lan_eth0_dev'
	option name 'eth0'
	option macaddr '60:38:e0:89:25:ab'

config device 'wan_eth1_dev'
	option name 'eth1'
	option macaddr '60:38:e0:89:25:aa'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '1 2 3 4 0'

config interface 'wwan'
	option proto 'static'
	option netmask '255.255.255.0'
	list dns '8.8.8.8'
	option ipaddr '192.168.1.235'
	option gateway '192.168.1.254'

config interface 'stabridge'
	option proto 'relay'
	option ipaddr '192.168.1.235'
	list network 'lan'
	list network 'wwan'

/etc/config/firewall

config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option network 'lan wwan'

config include
	option path '/etc/firewall.user'

DHCP server is Disabled of course.

My only other thought: are you using the ar71xx or the ath79 build?

thank you for your response bill888.

the wireless chipset in use is Atheros AR9223, but I don't see how that can be relevant. I can see the DHCP responses when sniffing on the client interface. they don't appear on the lan bridge though, which IIUC can only be blamed on relayd, not hardware.

I'd be happy if someone pointed out the hole in my reasoning. ping requests and responses appear in both tcpdumps (wlan1 and br-lan) but with broadcast (BOOTP/DHCP) the responses are present on wlan1 but missing on br-lan.

thanks again!

The most common problem is that the client router cannot pass the DHCP message between the main router and the client connected to the client router

if I am reading this correctly, I am not seeing this. broadcast responses DO make it back to the WiFi client on the OpenWRT bridge (see tcpdump on wlan1). the rest is software, isn't it? these responses don't seem to make it through relayd (see tcpdump on br-lan), either because I misconfigured it, or because it's buggy.

if I use ping from a PC connected to the OpenWRT bridge, I see ICMP requests and responses on both relayd's interfaces (wlan1 and br-lan). SSH, NFS etc all work through relayd. however if I use DHCP, the responses seem to vanish within relayd (on their way from wlan1 to br-lan).

Are you using ar71xx or ath79 build ?
https://downloads.openwrt.org/releases/19.07.0/targets/ar71xx/generic/openwrt-19.07.0-ar71xx-generic-wndr3800-squashfs-sysupgrade.bin

https://downloads.openwrt.org/releases/19.07.0/targets/ath79/generic/openwrt-19.07.0-ath79-generic-netgear_wndr3800-squashfs-sysupgrade.bin

Have you tried 18.06.5 or 17.01.6 ?

thanks for highlighting the difference.

today I tried both ar71xx and ath79 builds of 19.07.0. I also tried 17.01.7 and 18.06.6 for ar71xx.
none of them work.

I am quite puzzled. DHCP must work for other people.

Seems like the warning at the beginning of the Relayd wiki perhaps applies to your WNDR3800. No one can explain properly why DHCP doesn't work for some routers.

I've not had any DHCP issues with AR9344 based WD N750 router, Home Hub 5A (Qualcomm wifi), MT7628 based Archer C50 v4, and IPQ4018 based EA6350v3.

I can't see why it would make a difference, but is the relayd wireless link using 2.4 or 5 GHz radio? I only ever use 5 GHz.

OK, luckily I also have a BT Home Hub 5A:

root@OpenWrt:~# cat /proc/cpuinfo 
system type		: xRX200 rev 1.2
machine			: BT Home Hub 5A

I manually created the same configuration I used on WNDR:

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd52:b7ae:2826::/48'

config atm-bridge 'atm'
	option vpi '1'
	option vci '32'
	option encaps 'llc'
	option payload 'bridged'
	option nameprefix 'dsl'

config dsl 'dsl'
	option annex 'a'
	option tone 'av'
	option ds_snr_offset '0'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option gateway '192.168.100.128'
	option ipaddr '192.168.99.128'
	list dns '192.168.100.128'

config device 'lan_eth0_1_dev'
	option name 'eth0.1'
	option macaddr '18:62:2c:3d:a1:8c'

config device 'wan_dsl0_dev'
	option name 'dsl0'
	option macaddr '18:62:2c:3d:a1:8d'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 1 2 4 6t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '5 6t'

config interface 'wlan'
	option proto 'static'
	option netmask '255.255.255.0'
	list dns '192.168.100.128'
	option ipaddr '192.168.100.83'
	option gateway '192.168.100.128'

config interface 'stabridge'
	option proto 'relay'
	list network 'lan'
	list network 'wlan'
	option ipaddr '192.168.100.83'

--

config wifi-device 'radio0'
	option type 'mac80211'
	option channel '36'
	option hwmode '11a'
	option path 'pci0000:01/0000:01:00.0/0000:02:00.0'
	option htmode 'VHT80'

config wifi-device 'radio1'
	option type 'mac80211'
	option channel '11'
	option hwmode '11g'
	option path 'pci0000:00/0000:00:0e.0'
	option htmode 'HT20'
	option disabled '1'

config wifi-iface 'wifinet0'
	option ssid '***'
	option device 'radio0'
	option mode 'sta'
	option key '***'
	option network 'wlan'
	option encryption 'psk2'

and guess what? exact same problem! normal traffic works, DHCP doesn't!

OpenWrt 19.07.0

could you please share your working HH 5A configuration and the version of OpenWRT you are using.

I am also using 5GHz only.

BTW HH 5A is supposed to be a much more powerful router with 802.11ac and everything. but the bridged transfer rate is a mere 5 MBytes/s compared to WNDR's 13 MBytes/s. go figure.

thanks again.

I use LEDE 17.01.4 on HH5A. (Had wifi disconnect issues with 17.01.6). I have previously tested 18.06.0 briefly on HH5A to check it did work. Can't remember whether I tried relayd when 19.07 snapshots first appeared for HH5A.

/etc/config/network


config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd17:adad:8579::/48'

config atm-bridge 'atm'
	option vpi '1'
	option vci '32'
	option encaps 'llc'
	option payload 'bridged'

config dsl 'dsl'
	option annex 'a'
	option tone 'av'
	option xfer_mode 'ptm'
	option ds_snr_offset '0'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.11.1'

config device 'lan_dev'
	option name 'eth0.1'
	option macaddr 'd8:7d:7f:ed:b9:a6'

config device 'wan_dev'
	option name 'ptm0'
	option macaddr 'd8:7d:7f:ed:b9:a7'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 1 2 4 6t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '5 6t'

config interface 'wwan'
	option _orig_ifname 'radio0.network1'
	option _orig_bridge 'false'
	option proto 'static'
	option netmask '255.255.255.0'
	option gateway '192.168.1.254'
	option dns '8.8.8.8'
	option ipaddr '192.168.1.236'

config interface 'stabridge'
	option proto 'relay'
	list network 'lan'
	list network 'wwan'
	option ipaddr '192.168.1.236'

/etc/config/firewall


config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option network 'lan wwan'

config include
	option path '/etc/firewall.user'

/etc/config/wireless

config wifi-device 'radio0'
	option type 'mac80211'
	option hwmode '11a'
	option path 'pci0000:01/0000:01:00.0/0000:02:00.0'
	option channel '48'
	option country 'GB'
	option htmode 'VHT40'

config wifi-device 'radio1'
	option type 'mac80211'
	option channel '11'
	option hwmode '11g'
	option path 'pci0000:00/0000:00:0e.0'
	option htmode 'HT20'
	option disabled '1'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option ssid 'DisabledWifi'
	option encryption 'none'

config wifi-iface
	option network 'wwan'
	option ssid 'WifiRouterSSID'
	option encryption 'psk2'
	option device 'radio0'
	option mode 'sta'
	option key 'wifipassword'
	option bssid '60:38:E0:11:11:D0'   #mac address of AP

DHCP server disabled.

I presume your firewall zone is correctly configured.

Your 'network' file contains gateway and DNS entries for LAN interface, which are not required.

My 'wireless' file contains 'option bssid' which is not required.

Update: I copied and pasted your 'network' and 'wireless' files onto a spare HH5A with 19.07.0 and edited some IPs and the SSID/password. I was puzzled why I couldn't ping my main router until I realised you used 'wlan' as the interface name. After correcting the firewall file, everything is working, including DHCP.

My main router and DHCP server is a HH5A running LEDE 17.01.6 btw

thank you for all your effort. I literally copied and pasted your config file onto a clean HH 5A, changed IP addresses and it still doesn't bloody work. I suspect the difference is that my DHCP server is not on my access point. I have a router (192.168.100.128) and an AP I am bridging to (192.168.100.80), so the DHCP requests go via the AP, then into the LAN switch and into the router.
my AP (Mikrotik) supports DHCP relay which was disabled by default. I tried enabling it on the AP's 5GHz interface and it makes no difference. when I manually run the DHCP client on my PC connected to the HH 5A, the count of DHCP packets relayed by the AP is increasing accordingly (it has counters for requests and responses). I don't think I need the relay as I only have one subnet on the LAN so I turned it back off.

still, what I was seeing with tcpdump on OpenWRT confirms that the DHCP response makes its way back through the WiFi client.

my configs are now:

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd17:adad:8579::/48'

config atm-bridge 'atm'
	option vpi '1'
	option vci '32'
	option encaps 'llc'
	option payload 'bridged'

config dsl 'dsl'
	option annex 'a'
	option tone 'av'
	option xfer_mode 'ptm'
	option ds_snr_offset '0'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.99.128'

config device 'lan_dev'
	option name 'eth0.1'
	option macaddr '18:62:2C:3D:A1:8C'

config device 'wan_dev'
	option name 'ptm0'
	option macaddr '18:62:2C:3D:A1:8D'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 1 2 4 6t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '5 6t'

config interface 'wwan'
	option _orig_ifname 'radio0.network1'
	option _orig_bridge 'false'
	option proto 'static'
	option netmask '255.255.255.0'
	option gateway '192.168.100.128'
	option dns '8.8.8.8'
	option ipaddr '192.168.100.83'

config interface 'stabridge'
	option proto 'relay'
	list network 'lan'
	list network 'wwan'
	option ipaddr '192.168.100.83'

---

config wifi-device 'radio0'
	option type 'mac80211'
	option hwmode '11a'
	option path 'pci0000:01/0000:01:00.0/0000:02:00.0'
	option channel '48'
	option country 'GB'
	option htmode 'VHT40'

config wifi-device 'radio1'
	option type 'mac80211'
	option channel '11'
	option hwmode '11g'
	option path 'pci0000:00/0000:00:0e.0'
	option htmode 'HT20'
	option disabled '1'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option ssid 'DisabledWifi'
	option encryption 'none'

config wifi-iface
	option network 'wwan'
	option ssid '***'
	option encryption 'psk2'
	option device 'radio0'
	option mode 'sta'
	option key '***'

my firewall is disabled.

basically I've run out of ideas. thanks again.

fwiw, I have a similar setup.

ie. router/DHCP-server (HH5A LEDE 17.01.6, 192.168.1.254) which is ethernet wired to dumb AP (192.168.1.236) for past 3+ years. The AP was formerly another HH5A (LEDE) but I recently changed to EA6350v3 (Linksys stock FW) using 'bridge' (ie. AP) mode, keeping same 192.168.1.236 address. I only use one subnet 192.168.1.x for everything.

Can I suggest temporarily installing another AP using your spare WNDR3800, i.e. bypassing your Mikrotik AP? As another test, verify that relayd on the HH5A can connect to a standalone WNDR3800 configured as a regular OpenWrt wifi router with its own DHCP server.

Your network and wireless config files are indeed practically identical to my originals. You haven't shared the firewall file.

I leave the firewall enabled. I presume the 'lan' zone is identical to the one I shared otherwise relayd probably wouldn't work at all.

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option network 'lan wwan'

I imported your firewall config unchanged. tried, then disabled and stopped firewall using /etc/init.d, tried again.

I will try another AP and another DHCP server. I was also thinking of enabling the DHCP server on the Mikrotik just for a test.

thanks again for your help.

I heard there was another relay implementation, some kernel module. information is scarce on this.

Bill, would it be possible for you to add the -d option to relayd on your relay unit and run a few DHCP requests to see what it should look like?

I don't know how to add arbitrary options to daemons in OpenWRT using configs, so one way to do this would be to SSH to the WiFi client's address, do ps | grep [r]elayd, note that line, then stop relayd and run it manually by adding the -d option to the noted arguments.

that way it will show all requests and responses and hopefully that will be different from mine. I checked the source code and it seems to be validating quite a few things in DHCP packets before forwarding.

the problem seems to be in the Mikrotik AP to which the wireless client is connected. it seems to mangle the DHCP responses in such a way that they are not seen or accepted by relayd.

I set up a different AP (OpenWRT) without a DHCP server and plugged into the same LAN Mikrotik is plugged in, and I don't have the DHCP problem.

I compared responses (DHCP ACKs) coming via Mikrotik and non-Mikrotik APs.
the only difference is that in the latter the response has a broadcast address in the Ethernet header (works) while in the former it's the MAC address of the WiFi client (doesn't work).
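
To make that difference concrete, here is a minimal Python sketch that classifies the destination address of an Ethernet frame. It is only an illustration and not relayd's code. The function names are hypothetical, the multicast address is a standard IPv4 multicast MAC used as a sample, and the unicast address is the PC's MAC from the captures earlier in the thread, standing in for the WiFi client's MAC.

```python
def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' notation to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def classify_destination(frame):
    """Classify the destination MAC (first 6 bytes of an Ethernet frame)."""
    if len(frame) < 6:
        raise ValueError("frame is shorter than an Ethernet destination address")
    dst = frame[:6]
    if dst == b"\xff" * 6:
        return "broadcast"
    # The least significant bit of the first octet is the group (I/G) bit.
    if dst[0] & 0x01:
        return "multicast"
    return "unicast"


def is_group_addressed(frame):
    """True for broadcast or multicast destinations, False for unicast."""
    return classify_destination(frame) != "unicast"


# Sample destination addresses; payloads are omitted because only the
# first 6 bytes matter for this check.
via_other_ap = mac_to_bytes("ff:ff:ff:ff:ff:ff")   # broadcast L2 destination
via_mikrotik = mac_to_bytes("00:23:ae:65:a9:d5")   # stand-in unicast MAC
sample_mcast = mac_to_bytes("01:00:5e:00:00:01")   # IPv4 multicast MAC

for label, frame in (("other AP", via_other_ap),
                     ("Mikrotik", via_mikrotik),
                     ("multicast sample", sample_mcast)):
    print(label, "->", classify_destination(frame))
```

The captures are consistent with relayd handling only group-addressed DHCP frames, but that is an inference from what I observed.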

Hi,

I'm replying to this thread even though it is not strictly the same problem.

I followed this guide: https://openwrt.org/docs/guide-user/network/wifi/relay_configuration (with some changes) to create a wireless extender using a Netgear Access Point (Netgear WN3000RPv3).

I originally landed on this thread because I was not able to "see" a device (an IP cam) connected to the access point. So far it is the only device connected to this Access Point, since all the others are connected to the main modem/router (Zyxel VMG8825-T50K by Tiscali Italy).
By "see" I mean that the MAC of this IP cam was not appearing here:

so it looks to me like the IP cam MAC was not "passed" to the Zyxel router.
The strange thing is that sometimes it worked (I was able to access the cam even over a mobile data connection, so it was on the Internet), although not reliably, and I don't know whether that is a separate problem.

So:

  • the IP cam is connected to the AP-WDS network I created:

(I'm only allowed one figure per post)

and the IP is the one I "assigned" as static in the Zyxel router

(I'm only allowed one figure per post)

  • in the ARP table the IP cam is present but with the MAC of the Access Point:

(I'm only allowed one figure per post)

  • as written before, the IP cam connection is very unstable. It works for maybe an hour, then it drops and I'm not able to get it to connect again, even after restarting the Zyxel modem and/or the Netgear Access Point and/or the cam itself.
    Is this related to some configuration problem, and so to the previous points?

Thanks,
Matteo

Can somebody help me?

Thanks @muxx for sharing your findings.

I run a Mikrotik router/AP and an OpenWrt wireless bridge, and after a RouterOS upgrade to 6.47 I found myself in the exact same situation you describe. DHCP packets from the server now arrive with a unicast L2 address and are not relayed.

Judging from the sources, relayd indeed only watches for DHCP packets with broadcast or multicast addresses. I don't feel up to devising a fix; maybe we could file a feature request.

Anyway, my current solution is to stop relying on relayd's DHCP capabilities and use a standalone DHCP relay. There are multiple implementations to choose from, I went for dnsmasq as it's already installed and well integrated (ready with init.d script etc.)

The first step is to run relayd without the -D parameter. This can be done in /etc/config/network:

config interface 'sta_bridge'
        option proto 'relay'
        option network 'lan wwan'
        ...
        option forward_dhcp 0 # add this

Second, configure dnsmasq as a DHCP relay. This is what my /etc/config/dhcp looks like:

config dnsmasq
        option port '0' # optional, disable dnsmasq's DNS server, I don't need it

config relay
        option interface 'wwan'
        option local_addr '192.168.2.1' # lan interface address
        option server_addr '192.168.1.1' # the actual dhcp server address

# the rest is only needed for IPv6 (odhcpd)

config dhcp 'lan'
        option interface 'lan'
        option dhcpv4 'disabled'
        option ra 'relay'
        option ndp 'relay'
        option dhcpv6 'relay'

config dhcp 'wwan'
        option interface 'wwan'
        option dhcpv4 'disabled'
        option ra 'relay'
        option ndp 'relay'
        option dhcpv6 'relay'
        option master '1'
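
For context on what a DHCP relay agent changes in a request, here is a minimal Python sketch based on the BOOTP header layout (RFC 951/1542). It is not dnsmasq's code. The function and constant names are hypothetical, and I am assuming the relay puts its `local_addr` into the `giaddr` field. The address is the `local_addr` from my config above.

```python
import socket

BOOTP_FIXED_LEN = 236  # fixed BOOTP header, before the options field
OP_BOOTREQUEST = 1     # value of the 'op' field for client requests
HOPS_OFFSET = 3        # op(1) htype(1) hlen(1) hops(1)
GIADDR_OFFSET = 24     # relay agent IP address field, 4 bytes
MAX_HOPS = 16          # assumed upper bound; real relays make this configurable


def relay_request(packet, local_addr):
    """Return a copy of a BOOTREQUEST as a relay agent would forward it."""
    if len(packet) < BOOTP_FIXED_LEN:
        raise ValueError("packet is shorter than the fixed BOOTP header")
    if packet[0] != OP_BOOTREQUEST:
        raise ValueError("not a BOOTREQUEST")
    if packet[HOPS_OFFSET] >= MAX_HOPS:
        raise ValueError("hop limit reached, request would be dropped")

    out = bytearray(packet)
    out[HOPS_OFFSET] += 1
    # Only the first relay on the path fills in giaddr.
    if out[GIADDR_OFFSET:GIADDR_OFFSET + 4] == b"\x00" * 4:
        out[GIADDR_OFFSET:GIADDR_OFFSET + 4] = socket.inet_aton(local_addr)
    return bytes(out)


# Synthetic request: op=1, htype=1 (Ethernet), hlen=6, hops=0, rest zeroed.
sample_request = bytes([OP_BOOTREQUEST, 1, 6, 0]) + bytes(BOOTP_FIXED_LEN - 4)

relayed = relay_request(sample_request, "192.168.2.1")
print("hops:  ", relayed[HOPS_OFFSET])
print("giaddr:", socket.inet_ntoa(relayed[GIADDR_OFFSET:GIADDR_OFFSET + 4]))
```

The server uses `giaddr` to pick the address pool and to know where to send the reply. The reply then goes to the relay as an ordinary unicast IP packet, so it no longer depends on L2 broadcast forwarding.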

The final step is to make sure the real DHCP server accepts requests from relay agents. On Mikrotik I did:
/ip dhcp-server set defconf relay=255.255.255.255

Hope this helps anyone.


sadly it didn't help for me. i recently bought an archer C7 v5 and decided to flash openwrt on it as a wireless bridge to my mikrotik, to extend the local network to the whole house. it works if you create a separate network in openwrt but doesn't work if you want to be on the same network, as the mikrotik gives warning messages about DHCP (dhcp1 offering lease 192.168.1.29 for xx:xx:60:x2 to Dx:xx:xx:03:x9:xF without success)

where the first mac is my phone and the second is the openwrt's wifi client.

tried your settings and a few in the routeros firewall but nothing helped.. it's just not compatible.

it sucks to have 2 different networks for no reason, yet i have to use it like that for now.

I wasn't able to make it work either, even after "copying" the conf files above. I should say that I'm not an expert, so I don't really understand all the steps I make...

What I really don't understand is why sometimes it's working...

Matteo