DHCP and strongSwan kernel IPsec blocked by firewall?

Bonus points for anyone who can figure this out. I've been trying for two weeks and it's driving me crazy.

Issue

When running a vanilla OpenWrt install with a strongSwan IKEv2 server, roadwarrior VPN clients cannot get DHCP leases from dnsmasq on the LAN interface. The same issue blocks VPN clients from making DNS queries to dnsmasq running on the device. DHCP used to work with the older version that used TUN interfaces via libipsec and firewall zones; however, the kernel-based policy routing breaks everything.

The roadwarriors need to VPN back to the office to access on-site apps, as well as cloud servers that are connected to the office by another site-to-site VPN with custom firewall rules. This means I can't use the usual setup of virtual IPs outside of the LAN subnet without breaking all the firewall rules.

How the problem manifests

DHCP and DNS packets are never received by dnsmasq, probably due to an iptables rule, but I can't find it. For DHCP specifically, the DHCP DISCOVER is retried until it times out.
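
One way to hunt for a rule like that is to watch the per-rule packet counters. This is a minimal sketch, assuming the iptables-based firewall used in this post. `show_hit_drops` is a made-up helper name; it only filters `iptables -nvL` output for drop/reject targets whose packet counter is not zero.

```shell
#!/bin/sh
# Hypothetical helper: keep only rules whose target looks like a drop/reject
# and whose packet counter (column 1 of `iptables -nvL`) is not zero.
show_hit_drops() {
    awk '$3 ~ /DROP|REJECT|reject/ && $1 != "0" { print }'
}

# Intended use on the router, as root (not run here):
#   iptables -Z                      # reset the counters in the filter table
#   ...reconnect the VPN client so the DHCP DISCOVER is retried...
#   iptables -nvL | show_hit_drops   # repeat with -t nat, -t mangle, -t raw
```

If nothing shows up in any table after a failed connection attempt, the packets are probably not being dropped by a netfilter rule at all.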

logread output

note: the WAN IP is 192.168.2.45; the roadwarrior client is 192.168.2.25

daemon.info : 07[ENC] parsed IKE_AUTH request 3 [ EAP/RES/MSCHAPV2 ]
daemon.info : 07[ENC] generating IKE_AUTH response 3 [ EAP/REQ/MSCHAPV2 ]
daemon.info : 07[NET] sending packet: from 192.168.2.45[4500] to 192.168.2.25[4500] (144 bytes)
daemon.info : 04[NET] sending packet: from 192.168.2.45[4500] to 192.168.2.25[4500]
daemon.info : 03[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500]
daemon.info : 03[NET] waiting for data on sockets
daemon.info : 11[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500] (80 bytes)
daemon.info : 11[ENC] parsed IKE_AUTH request 4 [ EAP/RES/MSCHAPV2 ]
daemon.info : 11[IKE] EAP method EAP_MSCHAPV2 succeeded, MSK established
daemon.info : 11[ENC] generating IKE_AUTH response 4 [ EAP/SUCC ]
daemon.info : 11[NET] sending packet: from 192.168.2.45[4500] to 192.168.2.25[4500] (80 bytes)
daemon.info : 04[NET] sending packet: from 192.168.2.45[4500] to 192.168.2.25[4500]
daemon.info : 03[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500]
daemon.info : 03[NET] waiting for data on sockets
daemon.info : 09[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500] (112 bytes)
daemon.info : 09[ENC] parsed IKE_AUTH request 5 [ AUTH ]
daemon.info : 09[IKE] authentication of '192.168.2.25' with EAP successful
daemon.info : 09[IKE] IKE_SA rwEAPMSCHAPV2[1] state change: ESTABLISHED => DESTROYING
daemon.info : 09[IKE] IKE_SA rwEAPMSCHAPV2[2] established between 192.168.2.45[localtest.corp.link]...192.168.2.25[192.168.2.25]
authpriv.info : 09[IKE] IKE_SA rwEAPMSCHAPV2[2] established between 192.168.2.45[localtest.corp.link]...192.168.2.25[192.168.2.25]
daemon.info : 09[IKE] IKE_SA rwEAPMSCHAPV2[2] state change: CONNECTING => ESTABLISHED
daemon.info : 09[IKE] peer requested virtual IP %any
daemon.info : 09[KNL] using 192.168.1.1 as address to reach 192.168.1.255/32
daemon.info : 09[CFG] sending DHCP DISCOVER to 192.168.1.255
daemon.info : 09[KNL] using 192.168.1.1 as address to reach 192.168.1.255/32
daemon.info : 09[CFG] sending DHCP DISCOVER to 192.168.1.255
daemon.info : 03[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500]
daemon.info : 03[NET] waiting for data on sockets
daemon.info : 12[MGR] ignoring request with ID 5, already processing
daemon.info : 09[KNL] using 192.168.1.1 as address to reach 192.168.1.255/32
daemon.info : 09[CFG] sending DHCP DISCOVER to 192.168.1.255
daemon.info : 03[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500]
daemon.info : 03[NET] waiting for data on sockets
daemon.info : 14[MGR] ignoring request with ID 5, already processing
daemon.info : 09[KNL] using 192.168.1.1 as address to reach 192.168.1.255/32
daemon.info : 09[CFG] sending DHCP DISCOVER to 192.168.1.255
daemon.info : 03[NET] received packet: from 192.168.2.25[4500] to 192.168.2.45[4500]
daemon.info : 03[NET] waiting for data on sockets
daemon.info : 15[MGR] ignoring request with ID 5, already processing
daemon.info : 09[KNL] using 192.168.1.1 as address to reach 192.168.1.255/32
daemon.info : 09[CFG] sending DHCP DISCOVER to 192.168.1.255
daemon.info : 09[CFG] DHCP DISCOVER timed out
daemon.info : 09[IKE] no virtual IP found for %any requested by 'bryan'
daemon.info : 09[IKE] peer requested virtual IP %any6
daemon.info : 09[IKE] no virtual IP found for %any6 requested by 'bryan'
daemon.info : 09[IKE] no virtual IP found, sending INTERNAL_ADDRESS_FAILURE

Neither the DHCP nor the DNS packets ever show up on the br-lan interface in tcpdump.

Partial fix - but not really

I can place the virtual IP pool for clients inside the LAN subnet/DHCP pool and DNS works perfectly; however, this causes one of two issues: (1) DHCP collisions can happen if the pools overlap, or (2) reserving part of the subnet for VPN clients shrinks the LAN DHCP pool too much, and the subnet cannot be changed.
For example /etc/ipsec.conf can be modified to

rightsourceip=192.168.1.51/28
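
For reference, the /28 block that contains 192.168.1.51 can be worked out with plain shell arithmetic. This is only the CIDR math: whether strongSwan starts handing out addresses at .51 or at the start of the block is not something this sketch checks. `cidr_block` is a made-up helper name and it only handles prefixes from /24 to /32.

```shell
#!/bin/sh
# Hypothetical helper: print the first and last address of the CIDR block
# that contains the given address. Only valid for /24 .. /32.
cidr_block() {
    addr=${1%/*}                      # e.g. 192.168.1.51
    prefix=${1#*/}                    # e.g. 28
    net=${addr%.*}                    # first three octets
    host=${addr##*.}                  # last octet
    size=$((1 << (32 - prefix)))      # number of addresses in the block
    first=$((host - host % size))     # round down to the block boundary
    last=$((first + size - 1))
    echo "$net.$first $net.$last"
}

cidr_block 192.168.1.51/28   # prints: 192.168.1.48 192.168.1.63
```

So a pool like this sits somewhere in 192.168.1.48 to 192.168.1.63, and anything dnsmasq hands out in that range (dynamic or static) could collide with it.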

Setup

opkg install

opkg install dnsmasq strongswan-default \
    strongswan-mod-dhcp strongswan-mod-farp \
    strongswan-mod-eap-identity strongswan-mod-eap-mschapv2

/etc/ipsec.conf

config setup
    # charondebug="cfg 2, dmn 2, ike 1, knl 1, cfg 0, net 2"
    uniqueids=no
    strictcrlpolicy=no

conn %default
    auto=add
    keyexchange=ikev2
    compress=no
    type=tunnel
    fragmentation=yes
    forceencaps=yes
    dpdaction=clear
    dpddelay=300s
    rekey=no

    ### Supported Encryption ###
    ike=aes256-aes128-sha256-sha1-modp3072-modp2048
    esp=aes256-aes128-sha256-sha1-modp3072-modp2048

    ### Default Server ID ###
    left=%any
    leftid=%any
    leftsubnet=0.0.0.0/0

    ### Default Remote ID ###
    right=%any
    rightid=%any
    # rightsourceip=192.168.1.51/28
    eap_identity=%any

conn rwEAPMSCHAPV2
    eap_identity=%identity
    leftauth=pubkey
    leftcert=fullchain.pem
    leftid=office.example.com
    leftsendcert=always
    rightauth=eap-mschapv2
    rightsendcert=never
    rightsourceip=%dhcp

/etc/strongswan.conf

charon {
    load_modular = yes
    dns1 = 192.168.1.1
    plugins {
        include strongswan.d/charon/*.conf
        dhcp {
            force_server_address = yes
            # use_server_port = yes
            identity_lease = yes
            server = 192.168.1.255
            # interface = br-lan
        }
    }
}

/etc/firewall.user

iptables -I INPUT  -m policy --dir in --pol ipsec --proto esp -j ACCEPT
iptables -I FORWARD  -m policy --dir in --pol ipsec --proto esp -j ACCEPT
iptables -I FORWARD  -m policy --dir out --pol ipsec --proto esp -j ACCEPT
iptables -I OUTPUT   -m policy --dir out --pol ipsec --proto esp -j ACCEPT
iptables -t nat -I POSTROUTING -m policy --pol ipsec --dir out -j ACCEPT

I'll sponsor a thanksgiving dinner for anyone that can figure this one out.

Configs were based on https://openwrt.org/docs/guide-user/services/vpn/strongswan/roadwarrior and https://openwrt.org/inbox/strongswan_certificates

UCI config: /etc/config/dhcp

config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.auto'
	option localservice '1'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option leasetime '12h'
	option limit '150'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'

UCI config: /etc/config/network

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth1 eth2 eth3'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option stp '1'

config interface 'wan'
	option type 'bridge'
	option ifname 'eth0'
	option proto 'dhcp'

UCI config: /etc/config/firewall

config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'

config zone
	option name 'lan'
	list network 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'

config zone
	option name 'wan'
	list network 'wan'
	list network 'wan_6'
	option input 'REJECT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option masq '1'
	option mtu_fix '1'

config forwarding
	option src 'lan'
	option dest 'wan'

config rule
	option name 'Allow-DHCP-Renew'
	option src 'wan'
	option proto 'udp'
	option dest_port '68'
	option target 'ACCEPT'
	option family 'ipv4'

config rule
	option name 'Allow-Ping'
	option src 'wan'
	option proto 'icmp'
	option icmp_type 'echo-request'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-IGMP'
	option src 'wan'
	option proto 'igmp'
	option family 'ipv4'
	option target 'ACCEPT'

config include
	option path '/etc/firewall.user'

config rule 'ipsec_esp'
	option src 'wan'
	option name 'IPSec ESP'
	option proto 'esp'
	option target 'ACCEPT'

config rule 'ipsec_ike'
	option src 'wan'
	option name 'IPSec IKE'
	option proto 'udp'
	option dest_port '500'
	option target 'ACCEPT'

config rule 'ipsec_nat_traversal'
	option src 'wan'
	option name 'IPSec NAT-T'
	option proto 'udp'
	option dest_port '4500'
	option target 'ACCEPT'

config rule 'ipsec_auth_header'
	option src 'wan'
	option name 'Auth Header'
	option proto 'ah'
	option target 'ACCEPT'

Thanks,
Bryan

Are you trying to relay dhcp to a broadcast address?

As per the guide, lan and ipsec are in different subnets. You, on the other hand, are trying to combine them and force clients to get DHCP leases from the LAN.

I expect your private internal network to be 10.0.1.0/24. This means that your LAN network will still be 10.0.0.0/24 and your VPN clients will connect to your LAN zone using 10.0.1.0/24, so directions do not overlap.

Thanks for looking at it @trendy,

Yes that is a broadcast address. The strongswan documentation, and all the latest forum posts, indicate that the DHCP plugin needs to be pointed to the broadcast address of the LAN subnet (the old logic about pointing it at the IP of the DHCP server doesn't apply any more, I assume because the DHCP server is actually listening on the broadcast address.) The broadcast address shows up in the DHCP Plugin Docs and another OpenWRT example using LAN DHCP.
I tried the LAN IP 192.168.1.1 just to be sure, and that didn't work either.

I'm using a super vanilla setup right now to try to debug the problem (flashed two devices just to be sure):

  • Only 1 OpenWRT x86 device with a wan interface (br-wan, 192.168.2.45) and a lan interface (br-lan, 192.168.1.1) - the configurations at the bottom of the original post.
  • dnsmasq and strongswan are both running on that same device.
  • dnsmasq is bound to the LAN interface with a broadcast address of 192.168.1.255 and is serving DHCP leases from 192.168.1.100 - 192.168.1.249 (start 100, limit 150).
  • I also tried setting localservice=0 for dnsmasq but it still does not respond, which is why I'm sure the packets aren't reaching dnsmasq.
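
A note on the dnsmasq range: as far as I understand OpenWrt's /etc/config/dhcp, `limit` is the number of addresses in the pool, not the last host number. A small sketch of the arithmetic, using the `start` and `limit` values from the config in the original post (`pool_range` is a made-up helper name):

```shell
#!/bin/sh
# Hypothetical helper: $1 = first three octets of the LAN network,
# $2 = 'start' offset, $3 = 'limit' (number of addresses in the pool).
pool_range() {
    echo "$1.$2 - $1.$(($2 + $3 - 1))"
}

pool_range 192.168.1 100 150   # prints: 192.168.1.100 - 192.168.1.249
```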

As for the comment about basing the configs on the roadwarrior config, I am aware the original guide is based on non-overlapping subnets. What I was trying to say is that I used the roadwarrior guide as a starting point, which worked perfectly. Then I tried to integrate the dhcp plugin, detailed in the docs and other posts, so the roadwarriors would get an IP from the same pool as the LAN clients, however the DHCP DISCOVER packets appear to not be hitting the dnsmasq daemon on the same device.

I suspect once the DHCP and DNS packets emerge from the tunnel they are not being routed to the LAN interface, but only in this scenario. This is really strange, because:

  • if I use a virtual IP pool outside the LAN subnet (say, rightsourceip=192.168.99.0/24) I can ping 192.168.1.1, but DNS requests are not answered (the conn is set to rightdns=192.168.1.1 and the client correctly has 192.168.1.1 set up as its DNS server). Even if I run dig @192.168.1.1 google.com from my macOS VPN client, the DNS request goes 'unanswered'.
  • if I use a virtual IP inside the LAN subnet (rightsourceip=192.168.1.51/28 from the original post), DNS requests are answered perfectly fine by dnsmasq. I can even query static leases configured in dnsmasq and the dnsmasq log reports the DNS query being received.
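
To compare the two cases, the kernel's IPsec policies and routing decisions can be dumped while a client is connected. This is only a sketch of what one might look at, not a known fix. `dump_ipsec_routing` is a made-up helper name, table 220 is strongSwan's default routing table (it can be changed in strongswan.conf), and on OpenWrt the `xfrm` subcommands may need the full `ip` from the ip-full package rather than the busybox one.

```shell
#!/bin/sh
# Hypothetical helper; run as root on the router while a client is connected.
# $1 = the client's virtual IP (for example an address from 192.168.99.0/24).
dump_ipsec_routing() {
    ip xfrm policy              # kernel IPsec policies (traffic selectors)
    ip xfrm state               # installed SAs
    ip rule show                # policy-routing rules
    ip route show table 220     # strongSwan's default routing table
    ip route get "$1"           # route the kernel would pick towards the client
}

# Example (not run here): dump_ipsec_routing 192.168.99.1
```

Running it once with the pool outside the LAN subnet and once with the pool inside it should show whether the policies or routes differ between the working and non-working case.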

If we can figure this out, I'll write up an openwrt doc page on how to set this up as there seem to be quite a few questions around variations of this problem.

Cheers,
Bryan

Can you try to capture the interesting packets on all interfaces and see if they reach the OpenWrt properly?
tcpdump -i any -evn udp port 53 or udp port 67
I don't think they are blocked by the firewall, since you have the default input accept and there is no ipsec interface or zone.