Udhcpc chokes on full load on wan

Hello,

I'm running custom build of OpenWRT 19.07 on x86, i3-8100 CPU. I have a problem with udhcpc choking when there's "full load" (download ongoing at max speed) on wan/tun0 interface. I'm also using VPN which get disconnected in this case and only comes back up with /etc/init.d/openvpn restart

Here's log

Mon Apr  5 20:21:03 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 20:36:03 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 20:43:33 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 20:47:18 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 20:49:10 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 20:50:06 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:50:36 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:50:51 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:50:58 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:51:01 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:51:02 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 20:51:02 2021 daemon.notice netifd: wan (5210): udhcpc: lease lost, entering init state
Mon Apr  5 20:51:02 2021 daemon.notice netifd: Interface 'wan' has lost the connection
Mon Apr  5 20:51:02 2021 daemon.notice netifd: wan (5210): udhcpc: sending discover
Mon Apr  5 20:51:05 2021 daemon.notice netifd: wan (5210): udhcpc: sending discover
Mon Apr  5 20:51:05 2021 daemon.notice netifd: wan (5210): udhcpc: sending select for b.b.b.b
Mon Apr  5 20:51:05 2021 daemon.notice netifd: wan (5210): udhcpc: lease of b.b.b.b obtained, lease time 7200
Mon Apr  5 20:51:05 2021 daemon.notice netifd: Interface 'wan' is now up
Mon Apr  5 20:51:05 2021 user.notice firewall: Reloading firewall due to ifup of wan (eth1)
Mon Apr  5 20:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: TLS: tls_process: killed expiring key
Mon Apr  5 20:54:02 2021 daemon.notice openvpn(vpnclient)[5517]: [vpnprovider.com] Inactivity timeout (--ping-restart), restarting
Mon Apr  5 20:54:02 2021 daemon.notice openvpn(vpnclient)[5517]: SIGUSR1[soft,ping-restart] received, process restarting
Mon Apr  5 20:54:02 2021 daemon.notice openvpn(vpnclient)[5517]: Restart pause, 5 second(s)
Mon Apr  5 20:54:07 2021 daemon.notice openvpn(vpnclient)[5517]: TCP/UDP: Preserving recently used remote address: [AF_INET]z.z.z.z.132:1194
Mon Apr  5 20:54:07 2021 daemon.notice openvpn(vpnclient)[5517]: Socket Buffers: R=[212992->425984] S=[212992->425984]
Mon Apr  5 20:54:07 2021 daemon.notice openvpn(vpnclient)[5517]: UDP link local: (not bound)
Mon Apr  5 20:54:07 2021 daemon.notice openvpn(vpnclient)[5517]: UDP link remote: [AF_INET]z.z.z.z.132:1194
Mon Apr  5 20:55:07 2021 daemon.err openvpn(vpnclient)[5517]: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Mon Apr  5 20:55:07 2021 daemon.err openvpn(vpnclient)[5517]: TLS Error: TLS handshake failed
Mon Apr  5 20:55:07 2021 daemon.notice openvpn(vpnclient)[5517]: SIGUSR1[soft,tls-error] received, process restarting
... TLS Error: TLS handshake failed repeated until OpenVPN restarted manually ...

I've tried SQM and software flow offload without SQM which do not help in this case, I believe they have no effect on DHCP requests that go straight to VPN tunnel. I'm running out of ideas what to try next so maybe someone can give some ideas. Also I'm puzzled why OpenVPN requires restart in this case.

I have also tried to get rid of "sending renew to a.a.a.a" by using "Use broadcast flag: true" but for some reason this option does not change what happens in logs in any way.

When there is no load on wan/tun0 DHCP renew always succeeds

Mon Apr  5 16:22:56 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 16:37:56 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 16:45:26 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 16:49:11 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 16:51:03 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Mon Apr  5 16:51:59 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Mon Apr  5 16:51:59 2021 daemon.notice netifd: wan (5210): udhcpc: lease of b.b.b.b obtained, lease time 7200
Mon Apr  5 16:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: TLS: tls_process: killed expiring key
Mon Apr  5 16:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: VERIFY OK: depth=2, C=PA, O=VPNProvider, CN=VPNProvider Root CA
Mon Apr  5 16:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: VERIFY OK: depth=1, C=PA, O=VPNProvider, CN=VPNProvider CA5
Mon Apr  5 16:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: VERIFY KU OK
Mon Apr  5 16:53:44 2021 daemon.notice openvpn(vpnclient)[5517]: Validating certificate extended key usage

DHCP is (usually) udp, which (contrary to tcp) may be dropped under congestion, which understandably puts you in kind of a pickle…

The only real (close to-) solution I see here, is getting aggressive with SQM, diffserv and DSCP tagging.

Using WireGuard or a hotplug script should solve the VPN problem.

Thanks, I wonder which tcpdump spell I could use to check if these DHCP UDP packets are actually dropped in this scenario.

Also I would like to understand "sending renew to a.a.a.a" requests although those are not probably related to this problem. It feels that my ISP does not respond to these requests but udhcpc still sends those anyway. And should the configuration flag "Use broadcast flag" on wan interface change this behavior some way because it doesn't seem to do anything?

If I understand correct dhclient which I could try is an alternative to udhcpc but can I use it instead on OpenWRT or am I stuck with udhcpc?

	>>> -T SEC		Pause between packets (default 3)
	-A SEC		Wait if lease is not obtained (default 20)
1 Like

Probably none, if we assume the packets are dropped between you and the ISP - after they've passed tcpdump.
You can see they're going out, but there's no reply.

Not sure what you mean, should I be able to adjust these settings in OpenWRT somehow?

https://openwrt.org/docs/guide-user/network/protocol.dhcp#troubleshooting

ip -s link show

So I run tcpdump -evni any udp port 67 or 68 during

Tue Apr  6 16:41:47 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to a.a.a.a
Tue Apr  6 16:42:43 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:12 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:28 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:35 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:38 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:39 2021 daemon.notice netifd: wan (5210): udhcpc: sending renew to 0.0.0.0
Tue Apr  6 16:43:39 2021 daemon.notice netifd: wan (5210): udhcpc: lease lost, entering init state
Tue Apr  6 16:43:39 2021 daemon.notice netifd: Interface 'wan' has lost the connection
Tue Apr  6 16:43:39 2021 daemon.notice netifd: wan (5210): udhcpc: sending select for b.b.b.b
Tue Apr  6 16:43:39 2021 daemon.notice netifd: wan (5210): udhcpc: lease of b.b.b.b obtained, lease time 7200
Tue Apr  6 16:43:39 2021 daemon.notice netifd: Interface 'wan' is now up
Tue Apr  6 16:43:39 2021 user.notice firewall: Reloading firewall due to ifup of wan (eth1)

tcpdump shows both request and replies, also

16 packets captured
24 packets received by filter
0 packets dropped by kernel

so I'm not sure where the problem is if packet dump seems to be fine

Also in dump replies to unicast and broadcast DHCP requests are identical except unicast having additional "Client-IP "

Are you by any chance running banIP?
Also use the wan interface when you are running tcpdump to make sure that packets use the correct interface.

Renew request while using a VPN

1 Like

No banIP (don't even know what it is). Should I use wan interface for tcpdump or tun0 instead? But I'm pretty sure what I see in my current dump is correct.

Also I'm thinking could this be firewall problem since I don't have "default setup", but then again this does not happen when there's no full download.

Yes I read this yesterday and problem is almost identical however I don't have torrenting or anything downloading on OpenWRT host, only clients

1 Like

I mentioned the wan.

You selected any interface. On the other hand without seeing it's hard to say.

Better post them all together then
uci export network; uci export firewall; head -n -0 /etc/firewall.*

I tested tcpdump for wan interface but can't spot any differences or problems. What puzzles me is that why so many broadcasts even if each one gets reply properly

network & firewall

package network

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fd0b:1b7f:d057::/48'

config interface 'lan'
        option type 'bridge'
        option ifname 'eth0'
        option proto 'static'
        option ipaddr '10.30.30.1'
        option netmask '255.255.255.0'

config interface 'wan'
        option ifname 'eth1'
        option proto 'dhcp'
        option peerdns '0'
        option broadcast '1'

config interface 'vpn'
        option proto 'none'
        option ifname 'tun0'
        option auto '1'
        option delegate '0'
        option force_link '1'

config rule
        option mark '0x1'
        option lookup '200'
package firewall

config defaults
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option synflood_protect '1'
        option flow_offloading '1'

config zone
        option name 'vpn'
        option output 'ACCEPT'
        option network 'vpn'
        option input 'DROP'
        option forward 'DROP'
        option masq '1'
        option mtu_fix '1'

config zone
        option name 'lan'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'DROP'
        option network 'lan'

config zone
        option name 'wan'
        option input 'REJECT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option masq '1'
        option mtu_fix '1'
        option network 'wan'

config forwarding
        option src 'lan'
        option dest 'vpn'

config rule
        option name 'Allow-DHCP-Renew'
        option src 'wan'
        option proto 'udp'
        option dest_port '68'
        option target 'ACCEPT'
        option family 'ipv4'

config rule
        option name 'Allow-Ping'
        option src 'wan'
        option proto 'icmp'
        option icmp_type 'echo-request'
        option family 'ipv4'
        option target 'ACCEPT'

config rule
        option name 'Allow-IGMP'
        option src 'wan'
        option proto 'igmp'
        option family 'ipv4'
        option target 'ACCEPT'

config rule
        option name 'Allow-DHCPv6'
        option src 'wan'
        option proto 'udp'
        option src_ip 'fc00::/6'
        option dest_ip 'fc00::/6'
        option dest_port '546'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-MLD'
        option src 'wan'
        option proto 'icmp'
        option src_ip 'fe80::/10'
        list icmp_type '130/0'
        list icmp_type '131/0'
        list icmp_type '132/0'
        list icmp_type '143/0'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-ICMPv6-Input'
        option src 'wan'
        option proto 'icmp'
        list icmp_type 'echo-request'
        list icmp_type 'echo-reply'
        list icmp_type 'destination-unreachable'
        list icmp_type 'packet-too-big'
        list icmp_type 'time-exceeded'
        list icmp_type 'bad-header'
        list icmp_type 'unknown-header-type'
        list icmp_type 'router-solicitation'
        list icmp_type 'neighbour-solicitation'
        list icmp_type 'router-advertisement'
        list icmp_type 'neighbour-advertisement'
        option limit '1000/sec'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-ICMPv6-Forward'
        option src 'wan'
        option dest '*'
        option proto 'icmp'
        list icmp_type 'echo-request'
        list icmp_type 'echo-reply'
        list icmp_type 'destination-unreachable'
        list icmp_type 'packet-too-big'
        list icmp_type 'time-exceeded'
        list icmp_type 'bad-header'
        list icmp_type 'unknown-header-type'
        option limit '1000/sec'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-IPSec-ESP'
        option src 'wan'
        option dest 'lan'
        option proto 'esp'
        option target 'ACCEPT'

config rule
        option name 'Allow-ISAKMP'
        option src 'wan'
        option dest 'lan'
        option dest_port '500'
        option proto 'udp'
        option target 'ACCEPT'

config include
        option path '/etc/firewall.user'
        option reload '1'

Most likely it doesn't get the reply, hence the continuous retries. You'd have to disable one thing at a time, like vpn, firewall, downloads, until you find the culprit.

1 Like
/etc/init.d/openvpn restart; sleep 10; ip route get dhcp_server_ip
1 Like

This gives me (a.a.a.a = DHCP server actual IP)

a.a.a.a via 10.8.3.1 dev tun0 src 10.8.3.5 uid 0
    cache
1 Like

Check whether the issue persists if you disable the default gateway redirection.
Instead, route the LAN client traffic to the VPN tunnel using the PBR app.

Few more things if it makes any difference:

"udhcpc: sending renew to a.a.a.a" appears in tcpdump only for

tcpdump -evn -i tun0 udp port 67 or 68 and not for -i eth1 (wan interface)

10.8.0.4.68 > a.a.a.a.67: BOOTP/DHCP, Request from <mac>, length 300, xid 0xfeedf149, secs 3150, Flags [none]

But there is no reply visible for this request in tcpdump, maybe because of firewall rules? Should there be reply and should I make some rule to allow this somehow?

"udhcpc: sending renew to 0.0.0.0" is visible only for tcpdump -i eth1 not for -i tun0 and reply is visible, maybe because reply is allowed as firewall rule?

Also fo tcpdump -evn -i eth1 udp port 67 or 68 I get

2 packets captured
150 packets received by filter
44 packets dropped by kernel

so I wonder what is dropped and why