How to serve multiple subnets via a DHCP relay with dnsmasq?

Previous title, before I understood the expected behavior of a DHCP relay: Dnsmasq responding to dhcp relay gateway IP instead of source IP address (serving multiple subnets). More information about my current understanding of what needs to happen is in post 4: How to serve multiple subnets via a DHCP relay with dnsmasq? - #4 by robbins

My router’s br-lan interface has IP 10.17.0.1 netmask 255.255.255.0. It is connected to an L3 switch via an untagged VLAN10 port (with VE IP 10.17.0.2 netmask 255.255.255.0), and doesn’t know anything about VLANs. On the switch, I have another VLAN, VLAN33, configured and a virtual ethernet interface attached (Cisco SVI) with IP address 192.168.3.1 netmask 255.255.255.0. Since DHCP broadcasts are limited to this VLAN, I also set ip helper-address to 10.17.0.1 so that they can be sent unicast to the OpenWRT router.

This seems to be working correctly in this capture on the router:

23:22:08.948673 br-lan In  ifindex 9 74:8e:f8:ce:f7:e4 ethertype IPv4 (0x0800), length 341: (tos 0x1,ECT(1), ttl 64, id 7684, offset 0, flags [none], proto UDP (17), length 321)
    10.17.0.2.67 > 10.17.0.1.67: [udp sum ok] BOOTP/DHCP, Request from e8:ea:6a:8d:3d:80, length 293, hops 1, xid 0x38f6fe36, secs 17, Flags [none] (0x0000)
	 Gateway-IP 192.168.3.1
	 Client-Ethernet-Address aa:bb:cc:dd:ee:ff
	 Vendor-rfc1048 Extensions
	   Magic Cookie 0x63825363
	   DHCP-Message (53), length 1: Discover
	   Client-ID (61), length 19: hardware-type 255, ff:cc:f0:c4:00:02:00:00:ab:11:1d:9a:20:44:1a:f0:2e:a2
	   Parameter-Request (55), length 10:
	     Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6), Hostname (12)
	     Domain-Name (15), Static-Route (33), NTP (42), URL (114)
	     Unknown (120), Classless-Static-Route (121)
	   MSZ (57), length 2: 1472
	   SLP-NA (80), length 0""
	   Hostname (12), length 8: "hostname"

The DHCP Discover is being relayed by the switch and received by the router.

I want to configure as much in LuCI as possible, so to be able to set up a second DHCP range, I created an empty bridge guest_vlan33_br and then created an interface on that so I could enable DHCP:

config interface 'guest_vlan33'
	option proto 'static'
	option device 'guest_vlan33_br'
	option ipaddr '192.168.3.1'
	option netmask '255.255.255.0'

config device
	option type 'bridge'
	option name 'guest_vlan33_br'
	option bridge_empty '1'

Then I specify DHCP configuration:

config dhcp 'guest_vlan33'
	option interface 'guest_vlan33'
	option start '16'
	option limit '150'
	option leasetime '12h'
	list dhcp_option 'option:netmask,255.255.255.0'
	list dhcp_option 'option:router,192.168.3.1'
	list dhcp_option 'option:dns-server,0.0.0.0'

but when dnsmasq responds, it doesn’t send the reply to the source IP of the relayed packet (10.17.0.2), but instead to 192.168.3.1, which is set as the Relay Agent IP (Gateway-IP, i.e. GIADDR) in the DHCP packet:

23:30:20.485107 lo    In  ifindex 1 00:00:00:00:00:00 ethertype IPv4 (0x0800), length 351: (tos 0xc0, ttl 64, id 54437, offset 0, flags [none], proto UDP (17), length 331)
    192.168.3.1.67 > 192.168.3.1.67: [bad udp cksum 0x889b -> 0x7c43!] BOOTP/DHCP, Reply, length 303, hops 1, xid 0x2e778979, secs 4, Flags [none] (0x0000)
	 Your-IP 192.168.3.54
	 Server-IP 192.168.3.1
	 Gateway-IP 192.168.3.1
	 Client-Ethernet-Address aa:bb:cc:dd:ee:ff
	 Vendor-rfc1048 Extensions
	   Magic Cookie 0x63825363
	   DHCP-Message (53), length 1: Offer
	   Server-ID (54), length 4: 192.168.3.1
	   Lease-Time (51), length 4: 43200
	   RN (58), length 4: 21600
	   RB (59), length 4: 37800
	   BR (28), length 4: 192.168.3.255
	   Domain-Name (15), length 9: "home.arpa"
	   Domain-Name-Server (6), length 4: 192.168.3.1
	   Default-Gateway (3), length 4: 192.168.3.1
	   Subnet-Mask (1), length 4: 255.255.255.0

(It also replies on lo, I guess because the router’s own interface has 192.168.3.1, but I had to set that IP so the DHCP options would be created correctly.)

How can I get dnsmasq to respond to the source IP of the relay?

Here’s the dnsmasq configuration as well:

dhcp-range=set:lan,10.17.0.15,10.17.0.164,255.255.255.0,12h
dhcp-option=lan,option:dns-server,0.0.0.0
dhcp-range=set:guest_vlan33,192.168.3.16,192.168.3.165,255.255.255.0,12h
dhcp-option=guest_vlan33,option:netmask,255.255.255.0
dhcp-option=guest_vlan33,option:router,192.168.3.1
dhcp-option=guest_vlan33,option:dns-server,0.0.0.0

Inspired by this post and this post. Thanks

A relay is supposed to be bidirectional. You can see the UDP src and dst ports are both 67: the reply goes to the relay, not the client.

Yes, I know, since the OFFER has to go from the server to the relay first and these both happen on port 67.

I guess I don’t understand how the DHCP relay is supposed to work. Wikipedia’s explanation seems to confirm this behaviour, saying “When the DHCP server replies to the client, it sends the reply to the GIADDR-address”. But the GIADDR address is in a different subnet than the server, which is why we have a relay in the first place! I think the DHCP server should send the response to the IP of the relay on its own subnet.

So in my case it should respond to 10.17.0.2 which would then route it to the 192.168.3.0/24 subnet.

OK, so it turns out that I was wrong about how DHCP relays should work. It is indeed intended behaviour for the DHCP server to send the OFFER reply to the GIADDR rather than to the packet’s source address, and you’re expected to simply have a route to the client subnet. This makes sense because clients can apparently contact the DHCP server via unicast to renew leases without going through the relay, so there needs to be L3 connectivity between the subnets anyway; the relay only helps with L2 broadcast connectivity.

I’m still not sure how to implement this on OpenWrt, however, since the VLAN33 SVI on the switch already has 192.168.3.1, but I also gave the router an interface with 192.168.3.1 to get the correct dnsmasq configuration. I suppose I could use 192.168.3.2 on the router instead, but I’m still unsure how to go from there. Do I need a static route to 192.168.3.1 via 10.17.0.2? More configuration on the switch?
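
For reference, here is a minimal, untested sketch of the host-route idea in /etc/config/network. It assumes the router-side interface is renumbered to 192.168.3.2 so that 192.168.3.1 is no longer a local address, and that the logical interface on br-lan is named 'lan' (an assumption; it is not shown above). The /32 route is more specific than the connected /24 on the empty bridge, so replies addressed to the GIADDR should be forwarded to the switch at 10.17.0.2:

```
config interface 'guest_vlan33'
	option proto 'static'
	option device 'guest_vlan33_br'
	# assumption: moved off .1 so the switch SVI address is not local to the router
	option ipaddr '192.168.3.2'
	option netmask '255.255.255.0'

config route
	# assumption: 'lan' is the logical interface on br-lan
	option interface 'lan'
	# host route to the relay's GIADDR via the switch's VLAN10 address
	option target '192.168.3.1'
	option netmask '255.255.255.255'
	option gateway '10.17.0.2'
```

The explicit option:router,192.168.3.1 entry in the DHCP section should keep clients pointed at the switch as their gateway. This sketch only covers replies sent to the relay; traffic sent directly to clients in 192.168.3.0/24 (for example replies to unicast renewals) would probably still match the connected /24 on the empty bridge, so that part may need more work.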