I recently flashed the current OpenWRT (18.06.2) on a GL-AR150, and it's my first attempt at the "real" OpenWRT (apart from all kinds of Freifunk routers I'm taking care of).
Unfortunately, I ran into a problem/bug that is severe for my use case, which I reported at https://bugs.openwrt.org/index.php?do=details&task_id=2145 : no IPv4 DHCP addresses are offered if no LAN cable is plugged in when the device boots (cf. the bug report).
I'm pretty sure this is an issue that should (and will) be fixed, but my problem is that, at the moment, I simply can't use the device at all, because I do need IPv4 addresses and the router is intended to be booted without a LAN cable attached.
I don't want to reflash the original firmware if I can avoid it, so here's my question: Can anybody tell me how to work around this for now, until a release that fixes the issue is available?
Thanks for all help in advance!
EDIT:
Here's the quintessence of the longish discussion below:
Apparently, this is an issue caused by the dnsmasq init script doing unnecessary checks that fail and prevent the DHCP range from being added to the dnsmasq config, which then results in dnsmasq not offering DHCPv4 addresses.
I already tried this (cf. the bug report): I put /etc/init.d/dnsmasq restart into /etc/rc.local, but it didn't help. Dnsmasq actually restarts after the boot sequence finishes, but still, no IPv4 DHCP range is set and no IPv4 addresses are offered.
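One way to verify the "no IPv4 DHCP range is set" observation after such a restart is to look for a dhcp-range line in the config file that the init script generates. The following is only a minimal sketch: the helper name has_dhcp_range is made up for illustration, and the file it is pointed at is assumed to be the generated config under /tmp/etc/ mentioned further down in this thread.

```shell
# Sketch: report whether a dnsmasq config file defines a DHCP range.
# has_dhcp_range is a hypothetical helper name, not part of OpenWRT.
has_dhcp_range() {
    conf="$1"
    # dnsmasq's DHCP server is enabled through dhcp-range= lines
    grep -q '^dhcp-range=' "$conf"
}

# Example use on the router (the file suffix is system-specific):
#   has_dhcp_range /tmp/etc/dnsmasq.conf.cfg01411c && echo "range set" || echo "no range"
```

Running this once after a "plugged" boot and once after an "unplugged" boot should show quickly whether the restart changed anything in the generated file.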
Dnsmasq is started in each case. It is even restarted once during the boot process, both with the LAN cable plugged and unplugged (if you have some spare time, you can have a look at the two bootlogs I posted).
Seems like both the LAN device and the bridge are up, no matter if a LAN cable is plugged when booting:
LAN cable plugged (IPv4 DHCP working):
root@OpenWrt:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br-lan state UP qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
4: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
5: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
LAN cable unplugged (no IPv4 DHCP):
root@OpenWrt:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br-lan state UP qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
4: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
5: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether e4:95:6e:44:a8:3a brd ff:ff:ff:ff:ff:ff
I'm quite inexperienced with OpenWRT, but it seems that dnsmasq does not use the "normal" configuration file directly, but the init script assembles one during startup.
So may I suppose that somewhere in this (quite complex) init script there's a check that fails when it should not, which prevents the DHCPv4 range from being added to the "work" config file and, in consequence, causes dnsmasq to run without offering DHCPv4 addresses?
I already found /tmp/etc/dnsmasq.conf.cfg01411c, which is apparently the config file actually used by dnsmasq, and compared the "plugged" and "unplugged" version.
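For anyone who wants to repeat this comparison, here is a small sketch. It assumes that a copy of the generated config was saved after each kind of boot; the helper name and the file names in the usage comment are hypothetical. It uses grep rather than diff on the assumption that diff may not be installed on a default image.

```shell
# Sketch: print the lines of the second file that do not appear
# verbatim in the first file.
# lines_only_in_second is a hypothetical helper name.
lines_only_in_second() {
    # -F fixed strings, -x whole-line match, -f patterns from file,
    # -v invert: keep only lines of $2 that match no line of $1
    grep -vxFf "$1" "$2"
}

# Example use (hypothetical file names, saved after each boot):
#   lines_only_in_second /root/unplugged.conf /root/plugged.conf
```

The copies need to live somewhere that survives a reboot, since /tmp is recreated at boot.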
The only difference is that the "plugged" version contains the line
dhcp_check() {
	local ifname="$1"
	local stamp="${BASEDHCPSTAMPFILE_CFG}.${ifname}.dhcp"
	local rv=0

	[ -s "$stamp" ] && return $(cat "$stamp")

	# If there's no carrier yet, skip this interface.
	# The init script will be called again once the link is up
	case "$(devstatus "$ifname" | jsonfilter -e @.carrier)" in
		false) return 1;;
	esac

	udhcpc -n -q -s /bin/true -t 1 -i "$ifname" >&- && rv=1 || rv=0

	[ $rv -eq 1 ] && \
		logger -t dnsmasq \
			"found already running DHCP-server on interface '$ifname'" \
			"refusing to start, use 'option force 1' to override"

	echo $rv > "$stamp"
	return $rv
}
(I have not followed the logic through on this other than seeing "If there's no carrier yet, skip this interface")
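The log message in the function above points to 'option force 1' as an override. Below is a hedged sketch of how that option could be set from the shell. It rests on two assumptions that the excerpt does not confirm: that the DHCP section is named "lan", and that setting force makes the init script skip dhcp_check (including the carrier test) for that section. The function name force_dhcp and the DNSMASQ_INIT variable are made up for this example.

```shell
# Sketch under assumptions: section name "lan" and the effect of
# "option force 1" are NOT confirmed by the excerpt above.
# DNSMASQ_INIT is a hypothetical variable so the init script path
# can be swapped out, e.g. for a dry run.
DNSMASQ_INIT="${DNSMASQ_INIT:-/etc/init.d/dnsmasq}"

force_dhcp() {
    section="${1:-lan}"
    # Write the option, persist it, then regenerate the dnsmasq config
    uci set "dhcp.${section}.force=1" &&
    uci commit dhcp &&
    "$DNSMASQ_INIT" restart
}

# Example use on the router:
#   force_dhcp lan
```

Note that this also disables the check for another DHCP server already running on that interface, so it is only appropriate when the router is meant to be the sole DHCP server on its LAN.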
I would have thought that the hotplug scripts would update this on interface up ("The init script will be called again once the link is up"). Understanding the logic in /etc/init.d/dnsmasq and the hotplug scripts would be how I would go about tracing this down.
logger can be used to trace the flow of these scripts. They can be edited on a "live" install like any other file.
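As a rough illustration of that kind of tracing, a small wrapper around logger could be pasted near the top of the init script and called at the points of interest. The helper name trace_step and the tag dnsmasq-trace are invented for this sketch.

```shell
# Sketch: tag trace messages so they are easy to find in the system log.
# trace_step and the "dnsmasq-trace" tag are hypothetical names.
trace_step() {
    where="$1"; ifname="$2"; shift 2
    logger -t dnsmasq-trace "${where}[${ifname}]: $*"
}

# Example call, e.g. inside dhcp_check just before the carrier test:
#   trace_step dhcp_check "$ifname" "carrier=$(devstatus "$ifname" | jsonfilter -e @.carrier)"
```

The messages can then be pulled out of the log with something like `logread | grep dnsmasq-trace`.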
It is still puzzling why the later call, assuming the carrier is detected, fails to bring up DHCP and, for that matter, why the presence of a bridged, up wireless adapter doesn't start DHCP.
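To see what the carrier test in dhcp_check would evaluate for each device at a given moment, the same devstatus and jsonfilter pipeline can be run by hand. This is a sketch only; the helper names carrier_of and report_carrier are hypothetical, and the device names in the usage comment are the ones from the ip link output above.

```shell
# Sketch: print the carrier value the init script would see per device.
# carrier_of and report_carrier are hypothetical helper names; the
# pipeline itself is the one used in dhcp_check.
carrier_of() {
    devstatus "$1" | jsonfilter -e @.carrier
}

report_carrier() {
    for dev in "$@"; do
        echo "$dev carrier=$(carrier_of "$dev")"
    done
}

# Example use on the router:
#   report_carrier eth1 br-lan
```

Running it right after boot and again a minute later might show whether the value changes after the init script has already made its decision.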
Maybe this really is some timing problem. Here's part of the bootlog with a LAN cable plugged in:
Mon Feb 25 11:10:42 2019 kern.info kernel: [ 27.288895] br-lan: port 1(eth1) entered disabled state
Mon Feb 25 11:10:43 2019 kern.info kernel: [ 27.990110] eth1: link up (1000Mbps/Full duplex)
Mon Feb 25 11:10:43 2019 kern.info kernel: [ 27.993339] br-lan: port 1(eth1) entered blocking state
Mon Feb 25 11:10:43 2019 kern.info kernel: [ 27.998492] br-lan: port 1(eth1) entered forwarding state
Mon Feb 25 11:10:43 2019 daemon.notice netifd: Network device 'eth1' link is up
Mon Feb 25 11:10:43 2019 daemon.notice netifd: bridge 'br-lan' link is up
Mon Feb 25 11:10:43 2019 daemon.notice netifd: Interface 'lan' has link connectivity
Mon Feb 25 11:10:43 2019 kern.info kernel: [ 28.048892] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
Mon Feb 25 11:10:43 2019 daemon.info procd: - init complete -
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[726]: exiting on receipt of SIGTERM
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: started, version 2.80 cachesize 150
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: DNS service limited to local subnets
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP no-DHCPv6 no-Lua TFTP no-conntrack no-ipset no-auth no-DNSSEC no-ID loop-detect inotify dumpfile
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq-dhcp[1210]: DHCP, IP range 192.168.1.100 -- 192.168.1.249, lease time 12h
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain test
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain onion
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain localhost
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain local
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain invalid
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain bind
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: using local addresses only for domain lan
Mon Feb 25 11:10:48 2019 daemon.warn dnsmasq[1210]: no servers found in /tmp/resolv.conf.auto, will retry
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: read /etc/hosts - 4 addresses
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: read /tmp/hosts/dhcp.cfg01411c - 2 addresses
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq-dhcp[1210]: read /etc/ethers - 0 addresses
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: read /etc/hosts - 4 addresses
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq[1210]: read /tmp/hosts/dhcp.cfg01411c - 2 addresses
Mon Feb 25 11:10:48 2019 daemon.info dnsmasq-dhcp[1210]: read /etc/ethers - 0 addresses
Mon Feb 25 11:11:26 2019 daemon.notice netifd: Network device 'eth1' link is down
Mon Feb 25 11:11:26 2019 kern.info kernel: [ 71.409043] eth1: link down
Mon Feb 25 11:11:26 2019 kern.info kernel: [ 71.410881] br-lan: port 1(eth1) entered disabled state
Mon Feb 25 11:11:28 2019 daemon.notice netifd: bridge 'br-lan' link is down
Mon Feb 25 11:11:28 2019 daemon.notice netifd: Interface 'lan' has link connectivity loss
Mon Feb 25 11:11:28 2019 kern.info kernel: [ 72.939935] eth1: link up (1000Mbps/Full duplex)
Mon Feb 25 11:11:28 2019 kern.info kernel: [ 72.943174] br-lan: port 1(eth1) entered blocking state
Mon Feb 25 11:11:28 2019 kern.info kernel: [ 72.948341] br-lan: port 1(eth1) entered forwarding state
And here is the same part without a LAN cable plugged in:
Mon Feb 25 11:10:42 2019 kern.info kernel: [ 27.288888] br-lan: port 1(eth1) entered disabled state
Mon Feb 25 11:10:43 2019 daemon.info procd: - init complete -
Mon Feb 25 11:10:45 2019 daemon.info dnsmasq[727]: read /etc/hosts - 4 addresses
Mon Feb 25 11:10:45 2019 daemon.info dnsmasq[727]: read /tmp/hosts/dhcp.cfg01411c - 0 addresses
Mon Feb 25 11:10:54 2019 kern.info kernel: [ 38.629933] eth1: link up (1000Mbps/Full duplex)
Mon Feb 25 11:10:54 2019 kern.info kernel: [ 38.633169] br-lan: port 1(eth1) entered blocking state
Mon Feb 25 11:10:54 2019 kern.info kernel: [ 38.638318] br-lan: port 1(eth1) entered forwarding state
Mon Feb 25 11:10:54 2019 daemon.notice netifd: Network device 'eth1' link is up
Mon Feb 25 11:10:54 2019 kern.info kernel: [ 38.646337] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
Mon Feb 25 11:10:54 2019 daemon.notice netifd: bridge 'br-lan' link is up
Mon Feb 25 11:10:54 2019 daemon.notice netifd: Interface 'lan' has link connectivity
Mon Feb 25 11:11:00 2019 daemon.notice netifd: Network device 'eth1' link is down
Mon Feb 25 11:11:00 2019 kern.info kernel: [ 44.749048] eth1: link down
Mon Feb 25 11:11:00 2019 kern.info kernel: [ 44.750885] br-lan: port 1(eth1) entered disabled state
Mon Feb 25 11:11:01 2019 daemon.notice netifd: bridge 'br-lan' link is down
Mon Feb 25 11:11:01 2019 daemon.notice netifd: Interface 'lan' has link connectivity loss
Mon Feb 25 11:11:01 2019 kern.info kernel: [ 46.279931] eth1: link up (1000Mbps/Full duplex)
Mon Feb 25 11:11:01 2019 kern.info kernel: [ 46.283167] br-lan: port 1(eth1) entered blocking state
Mon Feb 25 11:11:01 2019 kern.info kernel: [ 46.288315] br-lan: port 1(eth1) entered forwarding state
It seems like dnsmasq is not restarted in this case. But still: why does restarting it via /etc/rc.local, at the very end of the boot process, not fix it?!
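To make the "is dnsmasq restarted or not" comparison less error-prone than reading the logs by eye, the start messages can be counted. The sketch below matches the "started, version ..." line format seen in the plugged bootlog; the helper name count_dnsmasq_starts is made up.

```shell
# Sketch: count how often dnsmasq logged a start-up line.
# Reads log text on stdin; count_dnsmasq_starts is a hypothetical name.
count_dnsmasq_starts() {
    # Matches e.g. "dnsmasq[1210]: started, version 2.80 cachesize 150"
    grep -c 'dnsmasq\[[0-9]*\]: started'
}

# Example use on the router:
#   logread | count_dnsmasq_starts
```

A count of 2 or more after a boot would indicate that dnsmasq was started again after its initial launch.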