OpenWrt not responding to DHCP Discover on VLAN Interface

Hi,

I have a Comset 4G router here that runs OpenWrt. It is working for the most part, except it's not responding to DHCP Discover requests from the client on a particular VLAN (4011). I can see with tcpdump that the requests are reaching the interface on the OpenWrt router.

VLAN 25 is untagged and VLAN 4011 is tagged. VLAN 4011 is the interface that isn't responding to DHCP requests from clients.

That VLAN interface is in its own firewall zone. There is a firewall rule that currently accepts all input from that zone. There is nothing specific for that zone in terms of output, but the default is to accept output. It accepts forwarding from that zone to wan, but that's not too relevant here.

I would have said the behaviour is akin to it silently discarding input packets on that VLAN, so the DHCP server never receives the request (even though I can see it hitting the interface). If that's not the problem, then the behaviour is akin to it silently ignoring the request and not responding.

Any ideas?

/etc/config/network:

config switch
        option name 'switch0'
        option reset '1'
        option enable_vlan '1'

config switch_vlan
        option device 'switch0'
        option vlan '1'
        option ports '0 1 2 3 6t'
        option vid '25'

config switch_vlan
        option device 'switch0'
        option vlan '4'
        option ports '0t 1t 2t 3t 6t'
        option vid '4011'

config interface 'DSFW_4011'
        option proto 'static'
        option ifname 'eth0.4011'
        option auto '1'
        option delegate '1'
        option ipaddr 'xxx.xxx.xxx.1'
        option netmask '255.255.255.0'
        option ip6assign '60'

/etc/config/dhcp:

config dnsmasq
        option domainneeded '1'
        option boguspriv '1'
        option localise_queries '1'
        option rebind_protection '1'
        option rebind_localhost '1'
        option expandhosts '1'
        option nonegcache '0'
        option readethers '1'
        option leasefile '/tmp/dhcp.leases'
        option resolvfile '/tmp/resolv.conf.auto'
        option localservice '1'
        option authoritative '0'
        option filterwin2k '1'
        option local 'domain.xxx.xxx.xxx'
        option domain 'domain.xxx.xxx.xxx'

config dhcp 'LAN_Untagged'
        option leasetime '12h'
        option interface 'LAN_Untagged'
        option ignore '0'
        option start '11'
        option limit '149'
        option dynamicdhcp '1'
        option force '1'
        option ra 'server'
        option dhcpv6 'server'
        option ra_management '1'
        option ra_default '0'

config dhcp 'DSFW_4011'
        option leasetime '12h'
        option interface 'DSFW_4011'
        option ignore '0'
        option start '11'
        option limit '149'
        option dynamicdhcp '1'
        option ra 'server'
        option dhcpv6 'server'
        option ra_management '1'
        option ra_default '0'
        option force '1'

/etc/config/firewall:


config defaults
        option enabled '1'
        option output 'ACCEPT'
        option forward 'DROP'
        option syn_flood '1'
        option synflood_burst '50'
        option synflood_protect '1'
        option drop_invalid '1'
        option tcp_ecn '1'
        option tcp_syncookies '1'
        option tcp_window_scaling '1'
        option input 'DROP'

config zone 'lan'
        option name 'lan'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        option network 'lan guestlan'

config zone                             
        option name 'lan_untagged'      
        option input 'ACCEPT'           
        option forward 'DROP'           
        option output 'ACCEPT'          
        option network 'LAN_Untagged'

config zone                             
        option name 'DSFW_4011'         
        option input 'ACCEPT'           
        option forward 'DROP'           
        option output 'ACCEPT'          
        option network 'DSFW_4011' 

config rule                                        
        option enabled '1'                         
        option target 'ACCEPT'                     
        option src 'DSFW_4011'                     
        option dest 'wan'                          
        option proto 'all'                         
        option name 'DSFW4011<>WAN'                
                                                   
config rule                                        
        option enabled '1'                         
        option target 'ACCEPT'                     
        option src 'DSFW_4011'                     
        option name 'DSFW4011-Input'               
        option proto 'all'  

Is the client configured to tag its interface?
Do the requests arrive tagged on eth0? Do you see them on eth0.4011 too?
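
To make that check concrete, here is a minimal sketch of the two captures. The `dhcp_filter` helper is a made-up name for illustration; it only builds a standard pcap filter string. On the parent device the frames should still carry the 802.1Q tag, while on the VLAN subinterface the tag has already been stripped. Note that on some drivers with VLAN offload the `vlan` keyword may not match even when the traffic is tagged, so an empty capture on eth0 is not conclusive on its own.

```shell
# Print a tcpdump filter for DHCP traffic.
# With a VLAN ID argument: match tagged frames (use on the parent device, e.g. eth0).
# Without an argument: match plain DHCP (use on the VLAN subinterface, e.g. eth0.4011).
dhcp_filter() {
    if [ -n "$1" ]; then
        echo "vlan $1 and (udp port 67 or udp port 68)"
    else
        echo "udp port 67 or udp port 68"
    fi
}

# Run on the router, for example:
#   tcpdump -e -n -i eth0 "$(dhcp_filter 4011)"
#   tcpdump -e -n -i eth0.4011 "$(dhcp_filter)"
```

If the Discover shows up on eth0 but not on eth0.4011, the problem is likely below the firewall; if it shows up on both, look at the firewall and dnsmasq.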

Yep, definitely tagged. It took a while to get my laptop's Ethernet adapter playing nicely with tagged frames in Wireshark, but I can now see the DHCP Discover requests from the client as they leave the switch interface heading to the connected OpenWrt box, and they're definitely tagged as 4011.

Can you even ping the 4011 network?

Tagged and untagged on the same port often does not work. At the least, tag all VLANs on the CPU port of the switch so lan = eth0.1, 4011 = eth0.4011 and you're not trying to interact with plain eth0, which generally catches all packets regardless of tag. There should be no references to plain eth0 in your config.

Network names must be kept short: the kernel limit for interface names is 15 characters, and that includes the "br-" that OpenWrt adds. Convention is to use all lowercase for network names.
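
As a quick sanity check of the length limit, something like the following could be run on the router. `ifname_fits` is a hypothetical helper name; it simply prepends "br-" to the network name and compares the length against the 15-character limit mentioned above.

```shell
# Succeed if "br-" + network name fits within the 15-character
# interface name limit, fail otherwise.
ifname_fits() {
    dev="br-$1"
    [ "${#dev}" -le 15 ]
}

# Example:
#   ifname_fits DSFW_4011 && echo "ok" || echo "too long"
```

By this count the names in the posted config are within the limit ("br-DSFW_4011" is 12 characters, "br-LAN_Untagged" is exactly 15), so shortening them is a precaution rather than a confirmed fix.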

According to the GUI, all VLANs are tagged on the CPU port. This issue is just getting frustrating. I'll try rebuilding the whole thing and keep the names short etc.

It's weird: to test, I added another VLAN (1000) where the OpenWrt router is a DHCP client rather than a server. That works fine. The untagged VLAN also works fine. I can't ping on 4011 because the client hasn't received an address from the DHCP server running on the OpenWrt router.

In case it wasn't mentioned already, a lot of equipment places arbitrary upper limits on the number (or numbering) of supported VLANs...

Limits of 63/64 up to 1023/1024 are fairly common...

Some switch chipsets don't cope well with mixing untagged and tagged VLANs on a single port, so usually a given port has to be either a trunk with everything tagged or a single untagged port.

I had the same case with a Xiaomi Router 4A 100M that I wanted to use as a VLAN-per-SSID access point. My configuration didn't work at all at first. After digging through some posts about VLANs here on the OpenWrt forum, I read that some switch chipsets don't work well with a mix of untagged and tagged VLANs on a single port. After reading that, I reconfigured my trunk port and eliminated the single untagged VLAN, and after a few seconds all three tagged VLANs I had set up started working.
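
To spot that kind of mix in a config, the `ports` strings of two `switch_vlan` sections can be compared. The sketch below uses a hypothetical helper, `mixed_ports`, that lists the ports which are untagged in the first VLAN and tagged in the second; it only parses the strings and does not touch the switch.

```shell
# Print the ports that are untagged in the first ports list
# and tagged (suffix "t") in the second.
mixed_ports() {
    out=""
    for p in $1; do
        case "$p" in
            *t) continue ;;   # already tagged in the first list, skip
        esac
        case " $2 " in
            *" ${p}t "*) out="${out:+$out }$p" ;;
        esac
    done
    echo "$out"
}

# With the two ports lines from the first post:
#   mixed_ports "0 1 2 3 6t" "0t 1t 2t 3t 6t"
```

For the config in the first post this reports ports 0 to 3, which carry VLAN 25 untagged and VLAN 4011 tagged at the same time. Whether the switch in that router actually has a problem with this is not established here.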

Thanks @remlei and @anon50098793, that's useful information. Before I nuke it, I will try cranking 4011 down to a lower VLAN number, just in case. VLAN 1000 is working fine as a tagged VLAN alongside the untagged VLAN, so perhaps I've fallen into the 'VLAN ID greater than 1024' trap.

The chipset in this case as pilfered from dmesg:
SoC Type: MediaTek MT7620A ver:2 eco:6

Okay, I'm narrowing in on this (in the most tedious fashion). The VLAN numbers (i.e. IDs greater than 1024) don't appear to be the issue. It could still be the total number of VLANs; more testing is needed to confirm that's not it (but I suspect it isn't in my case). I have moved everything to tagged, but that's mostly laziness whilst rebuilding this so often. I suspect mixing tagged and untagged is not the issue.

The issue appears to be with the firewall settings in the GUI. This company appears to have rolled their own version of OpenWrt, as some aspects of the GUI are older and others are custom. I think they have bugs in the GUI.

Adding firewall rules to accept input from certain zones is failing to override the default input policy (DROP). The aim was that input would only be accepted from zones with an explicit allow-input rule, which sit at the top of the list, and everything else would hit the implicit default input drop. Worse than that, though, the input rule half works: it does allow input to the web GUI from the specific zones that have an explicit rule, whilst rejecting input from zones without one. It also allows outbound DHCP from zones with an explicit allow-input rule (i.e. DHCP requests from OpenWrt to another server). Whether that's because the reply is allowed by default as a response to outbound-originated traffic, or because of the actual allow-input rule, doesn't really matter.

The problem is here:
Even for zones where input to the web GUI works, courtesy of its explicit allow-all-input rule (and I mean all protocols, from all hosts in that zone), it manages to break the processing of DHCP Discover from those same zones in some way. It just seems to silently discard or ignore them.
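
One way to take the GUI out of the picture is to write an explicit DHCP rule for the zone straight into /etc/config/firewall. The sketch below only prints such a stanza; `dhcp_rule_stanza` and the rule name are made up for illustration, and it assumes the vendor firmware honours the stock OpenWrt rule options (`src`, `proto`, `dest_port`, `family`, `target`), which is not confirmed for this build.

```shell
# Print a firewall rule stanza that accepts DHCP requests (UDP port 67)
# arriving from the given zone. The rule name is illustrative only.
dhcp_rule_stanza() {
    cat <<EOF
config rule
        option name 'Allow-DHCP-$1'
        option src '$1'
        option proto 'udp'
        option dest_port '67'
        option family 'ipv4'
        option target 'ACCEPT'
EOF
}

# Example, on the router:
#   dhcp_rule_stanza DSFW_4011 >> /etc/config/firewall
#   /etc/init.d/firewall restart
```

If DHCP starts working with this rule in place, that points at how the GUI writes or orders its rules rather than at the VLAN setup.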

TL;DR - I'm going to rebuild the lot from the ground up yet again, but only via the CLI and the /etc/config files, to see if I can at least determine whether the problem lies deeper than the GUI or not.