UDHCP renew failure

I am running openwrt. For some reason dhcp renewal is failing

2020-08-24T16:22:49.420 NOTICE daemon netifd: udhcpc: sending renew to 172.16.5.3
2020-08-24T16:24:04.496 NOTICE daemon netifd: udhcpc: sending renew to 172.16.5.3
2020-08-24T16:24:41.552 NOTICE daemon netifd: udhcpc: sending renew to 0.0.0.0
2020-08-24T16:24:41.588 NOTICE daemon netifd: udhcpc: lease of 172.16.5.2 obtained, lease time 300

Even though it looks like renew request is sent and response is not received. I tried capturing packet in the inbound interface I can see ACK packet coming back from my DHCP server. For some reason first 2 ACK`s not received by udhcp but the 3rd ACK is alone received by UDHCP. I confirmed the same by using strace. Can some one help me out whether this is a known issue.

In the syslog we can see that although a unicast renew is sent to .3 eventually it gets an answer after the broadcast from .2
In the capture there are completely different addresses.
Which one are we troubleshooting?

Please copy the output of the following commands and post it here using the "Preformatted text </> " button:
grafik
Remember to redact passwords, MAC addresses and any public IP addresses you may have

ubus call system board; \
uci export network; uci export wireless; \
uci export dhcp; uci export firewall; \
head -n -0 /etc/firewall.user; \
iptables-save -c; \
ip -4 addr ; ip -4 ro li tab all ; ip -4 ru
2 Likes

Thanks for the reply,
Please find the output in below pastebin link,

I don't see any interfaces from the list that has dhcp client protocol active with IP.
Which one has the problem?
Is there a DHCP server on every segment or are you relaying?

1 Like

I did few more analysis and found that UDHCP has opened sockets properly after sending DHCP renew request,
but somehow the replies are not getting hooked upto the socket. Below are some command outputs to confirm socket is open

edge:b2-edge1:/proc/6174/fd# netstat -ulpn | grep udhcp
udp 0 0 0.0.0.0:68 0.0.0.0:* 6174/udhcpc
edge:b2-edge1:/proc/6174/fd# lsof -i -P -n | grep udhcp
udhcpc 6174 root 5u inet 103311797 0t0 UDP *:68
edge:b2-edge1:/proc/6174/fd#

I have attached new logs from a different device where I am able to repro the issue.

Adding to that only eth4 is configured for DHCP

tcpdump -vni eth4 udp port 67

Is this some normal board or a virtual machine?
Also is this an official OpenWrt image or some customized?

  "kernel": "4.15.0-1057-aws",
    "version": "cc-remerge-618-g0acfaf8",
    "revision": "r0+2278-0acfaf8",
    "codename": "example_vc-xen-aws",
    "target": "x64/vc-xen-aws",
1 Like

Please find the requested output

edge:b2-edge1:~# tcpdump -vvv -n -i eth4 udp port 67 or udp port 68
tcpdump: listening on eth4, link-type EN10MB (Ethernet), capture size 262144 bytes
04:09:44.149887 IP (tos 0x0, ttl 64, id 18457, offset 0, flags [none], proto UDP (17), length 328)
    172.16.2.2.68 > 172.16.2.3.67: [bad udp cksum 0x5d6b -> 0xa28f!] BOOTP/DHCP, Request from 02:42:ac:10:02:02, length 300, xid 0x9b756947, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Request
	    MSZ Option 57, length 2: 576
	    Parameter-Request Option 55, length 7:
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
	      Domain-Name, BR, NTP
	    Vendor-Class Option 60, length 12: "udhcp 1.30.1"
	    Hostname Option 12, length 6: "vc-ge5"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 21
04:09:44.152274 IP (tos 0x0, ttl 64, id 46620, offset 0, flags [DF], proto UDP (17), length 328)
    172.16.2.3.67 > 172.16.2.2.68: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x9b756947, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Your-IP 172.16.2.2
	  Server-IP 172.16.2.3
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: ACK
	    Server-ID Option 54, length 4: 172.16.2.3
	    Lease-Time Option 51, length 4: 300
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    Default-Gateway Option 3, length 4: 172.16.2.3
	    Domain-Name Option 15, length 11: "example.org"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 19
04:10:59.221873 IP (tos 0x0, ttl 64, id 32681, offset 0, flags [none], proto UDP (17), length 328)
    172.16.2.2.68 > 172.16.2.3.67: [bad udp cksum 0x5d6b -> 0xa244!] BOOTP/DHCP, Request from 02:42:ac:10:02:02, length 300, xid 0x9b756947, secs 75, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Request
	    MSZ Option 57, length 2: 576
	    Parameter-Request Option 55, length 7:
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
	      Domain-Name, BR, NTP
	    Vendor-Class Option 60, length 12: "udhcp 1.30.1"
	    Hostname Option 12, length 6: "vc-ge5"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 21
04:10:59.222325 IP (tos 0x0, ttl 64, id 55984, offset 0, flags [DF], proto UDP (17), length 328)
    172.16.2.3.67 > 172.16.2.2.68: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x9b756947, secs 75, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Your-IP 172.16.2.2
	  Server-IP 172.16.2.3
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: ACK
	    Server-ID Option 54, length 4: 172.16.2.3
	    Lease-Time Option 51, length 4: 225
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    Default-Gateway Option 3, length 4: 172.16.2.3
	    Domain-Name Option 15, length 11: "example.org"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 19
04:11:36.297839 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 328)
    172.16.2.2.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from 02:42:ac:10:02:02, length 300, xid 0x9b756947, secs 112, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Request
	    MSZ Option 57, length 2: 576
	    Parameter-Request Option 55, length 7:
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
	      Domain-Name, BR, NTP
	    Vendor-Class Option 60, length 12: "udhcp 1.30.1"
	    Hostname Option 12, length 6: "vc-ge5"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 21
04:11:36.299974 IP (tos 0x0, ttl 64, id 59876, offset 0, flags [DF], proto UDP (17), length 328)
    172.16.2.3.67 > 172.16.2.2.68: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x9b756947, secs 112, Flags [none] (0x0000)
	  Client-IP 172.16.2.2
	  Your-IP 172.16.2.2
	  Server-IP 172.16.2.3
	  Client-Ethernet-Address 02:42:ac:10:02:02
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: ACK
	    Server-ID Option 54, length 4: 172.16.2.3
	    Lease-Time Option 51, length 4: 300
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    Default-Gateway Option 3, length 4: 172.16.2.3
	    Domain-Name Option 15, length 11: "example.org"
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 19
1 Like

No this is running in my virtual machine in AWS. I have taken the openwrt image and customized it with few extra logs and made few things less chatty.

[solved] DHCP on VLAN interfaces

Also, it's missing the broadcast address, aka BR Option 28.
I'm not sure if it's critical or not.

UDP checksum is wrong because of UDP checksum offload. I have already checked the same. 172.16.2.2 is the UDP client, I have a FRR based router on the other end and he accepts the packet so I dont see any issue there.
coming back to BR option, its not mandatory I have checked the same in RFC too.

1 Like