NTP issues on a dumb AP configuration

Does anybody else run this as an AP and have issues with the system time not getting synced with the main router / NTP servers? I've been pretty happy with stability and performance in general but have had a few wifi connection issues. When I log into the device, the system time always seems to be a few days behind and I need to sync it manually to get it back on track.

Have you checked your setup with:

Especially gateway and DNS settings?


I've followed the guide but set mine up as a DHCP client and gave it a fixed IP address on the main router. DNS and gateway use the defaults advertised via DHCP.

I guess the unused 'wan' port is also configured as dhcp client and maybe that causes confusion? I've deleted 'wan' for now and will keep an eye on things to see if it still causes issues.

No issues here on many devices.

Yeah, I assume this means you changed the LAN interface to DHCP client; that works.

As long as I didn't connect anything to WAN, I've never had to delete the interface to use the gateway on LAN.

Glad you got it working.

I've just checked the time on my AP and it had drifted by 3 days so the issue persists. I've now changed my router to advertise NTP via dhcp using 'advanced settings' DHCP-options:

 42,192.168.1.1

and then set my AP to use NTP as advertised by DHCP. I'll check in a few days to see if this helps.
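
For anyone who prefers the command line over LuCI, the two settings described above might look roughly like this. It is only a sketch: the function names are made up for illustration, the DHCP pool is assumed to be the default `lan` section, and `system.ntp.use_dhcp` is assumed to be the option behind the "use DHCP advertised servers" setting on your release.

```shell
#!/bin/sh
# Run on the main router: advertise an NTP server to DHCP clients of the
# "lan" pool via DHCP option 42. $1 is the NTP server's IP address.
# (Hypothetical helper name; assumes the pool section is called "lan".)
advertise_ntp_via_dhcp() {
    uci add_list dhcp.lan.dhcp_option="42,$1"
    uci commit dhcp
}

# Run on the AP: let sysntpd use the NTP servers learned from DHCP.
# Assumption: the option is named system.ntp.use_dhcp on your release.
use_dhcp_advertised_ntp() {
    uci set system.ntp.use_dhcp='1'
    uci commit system
}
```

On the router you would call `advertise_ntp_via_dhcp 192.168.1.1` and then restart dnsmasq; on the AP, call `use_dhcp_advertised_ntp` and restart sysntpd. Note that option 42 only tells clients where to ask; it does not make the router answer NTP queries.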

If you run the following from your AP, do you see any errors?

/usr/sbin/ntpd -dnN -p 192.168.1.1

It should execute four times and then pause. You can CTRL+C at that point.

So far with dhcp advertised time I've not seen any drift on my AP but I'll keep an eye on it over the next few days.

When I execute your command I just get a whole bunch of timed out queries.

ntpd: timed out waiting for 192.168.1.1, reach 0x00, next query in 2s

I have this issue as well on my access points, and when executing the ntpd command I also got

ntpd: sending query to 192.168.0.1
ntpd: timed out waiting for 192.168.0.1, reach 0x00, next query in 2s


My config is dumb AP, added main router IP (192.168.1.1) and all works fine:

ntpd: sending query to 192.168.1.1
ntpd: reply from 192.168.1.1: offset:-0.004850 delay:0.003834 status:0x24 strat:2 refid:0xc1cc72e9 rootdelay:0.023270 reach:0x01

Hope this helps


Is there an ntpd on 192.168.0.1 ?


I'm glad it seems to be working for you still. Though I still suspect something is not quite right with your NTP connectivity.

Option 42 in DHCP only says "Hey, here is the server to which your NTP client can speak 'NTP'..." But according to your ntpd query, 192.168.1.1 appears to be either 1) not running an NTP service or 2) is running an NTP service, but is unreachable due to, perhaps, a firewall block. (??)

May be worth doing more investigation on your side to get that sorted, especially if you truly intend for your main router to be serving up NTP.

A few things you could try to help isolate the issue would be to check the result of this command:
/usr/sbin/ntpd -dnN -p 0.openwrt.pool.ntp.org
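
When comparing the router against the public pool it can help to boil the debug output down to one line per peer. The helper below is a rough sketch (the name `summarise_ntpd` is made up); it only counts the "reply from" and "timed out waiting for" lines shown earlier in this thread. A peer that only ever shows timeouts, with `reach 0x00`, has not answered any recent query.

```shell
#!/bin/sh
# Summarise busybox ntpd debug output read from stdin: for every peer,
# count "reply from" lines and "timed out waiting for" lines.
summarise_ntpd() {
    awk '
        # Return the field that follows WORD, minus a trailing ":" or ",".
        function peer_after(word,    i, p) {
            for (i = 1; i < NF; i++)
                if ($i == word) {
                    p = $(i + 1)
                    sub(/[:,]$/, "", p)
                    return p
                }
            return ""
        }
        /reply from/            { p = peer_after("from"); if (p != "") { ok[p]++;  seen[p] = 1 } }
        /timed out waiting for/ { p = peer_after("for");  if (p != "") { bad[p]++; seen[p] = 1 } }
        END {
            for (p in seen)
                printf "%s replies=%d timeouts=%d\n", p, ok[p], bad[p]
        }
    '
}
```

Since ntpd keeps running until you press CTRL+C, the simplest way to use this is to redirect the ntpd output to a file first (for example `/tmp/ntpd.log`), stop it after a few queries, and then run `summarise_ntpd < /tmp/ntpd.log`. If the pool servers reply but the router does not, the problem is on the router side (no NTP service, or something blocking it).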

Also, confirm the "Provide NTP server" box is checked on your Time Synchronization settings on your router (192.168.1.1). Can be confirmed with:

# uci show system.ntp
system.ntp=timeserver
system.ntp.enabled='1'
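
One caveat, as far as I can tell: the `enabled` flag shown above controls the NTP client, while the "Provide NTP server" checkbox should be stored in a separate option. The sketch below assumes that option is `system.ntp.enable_server`; the function name is just for illustration.

```shell
#!/bin/sh
# Report whether this device is configured to serve NTP to the LAN.
# Assumption: "Provide NTP server" is stored as system.ntp.enable_server.
ntp_server_status() {
    if [ "$(uci -q get system.ntp.enable_server)" = "1" ]; then
        echo "NTP server: enabled"
    else
        echo "NTP server: not enabled"
    fi
}
```

Run `ntp_server_status` on the router (192.168.1.1). If it reports "not enabled", ticking the box in LuCI, or setting that option to '1', committing and restarting sysntpd, should make the router answer queries from the AP.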

I also had this issue. However, my Netgate 2100-MAX provides authenticated NTP services, so I just had to delete the auto-populated information and add the main router/firewall's IP address; that fixed it.

This way my firewall can still manage authenticated NTP and it syncs up.

I've the same problem on my recently flashed dumb AP (works correctly on the main router), so I created the following test service to debug it:

#!/bin/sh /etc/rc.common

START=97
USE_PROCD=1

start_service() {
        procd_open_instance
        procd_set_param command /usr/sbin/ntpd -d -n -N
        procd_append_param command -p 0.openwrt.pool.ntp.org
        procd_append_param command -p 1.openwrt.pool.ntp.org
        procd_append_param command -p 2.openwrt.pool.ntp.org
        procd_append_param command -p 3.openwrt.pool.ntp.org
        procd_set_param stdout 1
        procd_set_param stderr 1
        procd_set_param respawn
        procd_add_jail ntpd
        procd_set_param capabilities /etc/capabilities/ntpd.json
        procd_set_param user ntp
        procd_set_param group ntp
        procd_set_param no_new_privs 1
        procd_close_instance
}

service_triggers() {
        echo trigger
}

syslog:

Fri Jul 26 11:12:10 2024 daemon.err ntpd[1463]: ntpd: bad address '0.openwrt.pool.ntp.org'
Fri Jul 26 11:12:10 2024 daemon.err ntpd[1463]: ntpd: '1.openwrt.pool.ntp.org' is 162.159.200.1
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: '2.openwrt.pool.ntp.org' is 200.20.186.76
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: '3.openwrt.pool.ntp.org' is 168.181.126.28
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: sending query to 168.181.126.28
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: sending query to 200.20.186.76
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: sending query to 162.159.200.1
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: reply from 162.159.200.1: offset:+36.940552 delay:0.038940 status:0x24 strat:3 refid:0x0aa70804 rootdelay:0.129534 reach:0x01
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: reply from 200.20.186.76: offset:+36.927714 delay:0.044041 status:0x24 strat:1 refid:0x4f4e4252 rootdelay:0.000015 reach:0x01
Fri Jul 26 11:12:11 2024 daemon.err ntpd[1463]: ntpd: reply from 168.181.126.28: offset:+36.931466 delay:0.057421 status:0x24 strat:2 refid:0x427df641 rootdelay:0.015976 reach:0x01
Fri Jul 26 11:12:12 2024 daemon.err ntpd[1463]: ntpd: '0.openwrt.pool.ntp.org' is 200.160.7.197
Fri Jul 26 11:12:12 2024 daemon.err ntpd[1463]: ntpd: sending query to 200.160.7.197
Fri Jul 26 11:12:12 2024 daemon.err ntpd[1463]: ntpd: reply from 200.160.7.197: offset:+36.927946 delay:0.033424 status:0x24 strat:1 refid:0x47505300 rootdelay:0.000000 reach:0x01
Fri Jul 26 11:12:13 2024 daemon.err ntpd[1463]: ntpd: sending query to 168.181.126.28
Fri Jul 26 11:12:13 2024 daemon.err ntpd[1463]: ntpd: sending query to 200.20.186.76
Fri Jul 26 11:12:13 2024 daemon.err ntpd[1463]: ntpd: sending query to 162.159.200.1
Fri Jul 26 11:12:13 2024 daemon.err ntpd[1463]: ntpd: sending query to 200.160.7.197
Fri Jul 26 11:12:50 2024 daemon.err ntpd[1463]: ntpd: reply from 162.159.200.1: offset:+36.933680 delay:0.025311 status:0x24 strat:3 refid:0x0aa70804 rootdelay:0.129549 reach:0x03
Fri Jul 26 11:12:50 2024 daemon.err ntpd[1463]: ntpd: setting time to 2024-07-26 11:12:50.325553 (offset +36.933680s)

My best guess is that ntpd starts before /tmp/resolv.conf.d/resolv.conf.auto and its symlink /tmp/resolv.conf have been fully created and written, which makes this a race condition (notice the initial failure to resolve 0.openwrt.pool.ntp.org).
The reason why this works with -n (foreground) and -d (verbose) is likely because it takes longer for it to go through every address and print them.

Sigh, another day spent trying to solve an issue. Not sure what the best approach is here; maybe delay the execution of the sysntpd service?

Managed to get it to fail on all resolutions during boot:

Fri Jul 26 12:01:07 2024 daemon.err ntpd[1464]: ntpd: bad address '0.openwrt.pool.ntp.org'
Fri Jul 26 12:01:12 2024 daemon.err ntpd[1464]: ntpd: bad address '1.openwrt.pool.ntp.org'
Fri Jul 26 12:01:17 2024 daemon.err ntpd[1464]: ntpd: bad address '2.openwrt.pool.ntp.org'
Fri Jul 26 12:01:22 2024 daemon.err ntpd[1464]: ntpd: bad address '3.openwrt.pool.ntp.org'
Fri Jul 26 12:01:28 2024 daemon.err ntpd[1464]: ntpd: bad address '3.openwrt.pool.ntp.org'
Fri Jul 26 12:01:33 2024 daemon.err ntpd[1464]: ntpd: bad address '2.openwrt.pool.ntp.org'
Fri Jul 26 12:01:38 2024 daemon.err ntpd[1464]: ntpd: bad address '1.openwrt.pool.ntp.org'
Fri Jul 26 12:01:43 2024 daemon.err ntpd[1464]: ntpd: bad address '0.openwrt.pool.ntp.org'
Fri Jul 26 12:02:01 2024 daemon.err ntpd[1464]: ntpd: bad address '3.openwrt.pool.ntp.org'
Fri Jul 26 12:02:06 2024 daemon.err ntpd[1464]: ntpd: bad address '2.openwrt.pool.ntp.org'
Fri Jul 26 12:02:11 2024 daemon.err ntpd[1464]: ntpd: bad address '1.openwrt.pool.ntp.org'
Fri Jul 26 12:02:16 2024 daemon.err ntpd[1464]: ntpd: bad address '0.openwrt.pool.ntp.org'
Fri Jul 26 12:02:50 2024 daemon.err ntpd[1464]: ntpd: bad address '3.openwrt.pool.ntp.org'
Fri Jul 26 12:02:55 2024 daemon.err ntpd[1464]: ntpd: bad address '2.openwrt.pool.ntp.org'
Fri Jul 26 12:03:00 2024 daemon.err ntpd[1464]: ntpd: bad address '1.openwrt.pool.ntp.org'
Fri Jul 26 12:03:05 2024 daemon.err ntpd[1464]: ntpd: bad address '0.openwrt.pool.ntp.org'

Seems like the process is caching the result of what I believe to be getaddrinfo. As such, if all peers fail to resolve, the ntpd process is effectively deadlocked.
Assuming I am correct, simply removing DNS caching in ntpd will fix this, as it will succeed as soon as DNS resolution is available.

My shitty solution to the problem.

The "done" service executes shortly before sysntpd and that's where /etc/rc.local is executed; this is my rc.local to solve the issue:

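# Block until at least one IPv4 nameserver has been written to resolv.conf.
# [0-9] is used instead of \d because \d is not part of POSIX ERE.
until grep -Eqs '^nameserver ([0-9]{1,3}\.){3}[0-9]{1,3}$' /etc/resolv.conf; do
        logger -t rc.local -p daemon.warn "Waiting for DNS servers"
        sleep 1
done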

This keeps further services from being executed until netifd has written at least one IPv4 server to resolv.conf.
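
One risk with an unbounded loop is that boot stalls forever if no IPv4 nameserver ever shows up (for example, if the uplink is unplugged). A bounded variant could look like the sketch below; the function name and the default timeout of 60 seconds are arbitrary choices, not anything OpenWrt provides.

```shell
#!/bin/sh
# Wait until FILE contains at least one IPv4 "nameserver" line, or give up
# after TIMEOUT seconds. Returns 0 on success and 1 on timeout.
# (Hypothetical helper; the defaults are arbitrary.)
wait_for_nameserver() {
    file=${1:-/etc/resolv.conf}
    timeout=${2:-60}
    waited=0
    until grep -Eqs '^nameserver ([0-9]{1,3}\.){3}[0-9]{1,3}$' "$file"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In rc.local this could be called as `wait_for_nameserver /etc/resolv.conf 60 || logger -t rc.local -p daemon.warn "No DNS servers after 60s"`, so the remaining services still start if DNS never appears. Like the original loop, it works around the race rather than fixing ntpd.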

Proper date after boot is now working correctly, and I can see the entries in the syslog:

Fri Jul 26 15:14:07 2024 daemon.warn rc.local: Waiting for DNS servers
Fri Jul 26 15:14:08 2024 daemon.warn rc.local: Waiting for DNS servers

NOTE: This does not actually fix the deadlock within ntpd.

EDIT:

To conclude on this, this issue is present even if it's your main router (not a dumb AP). It's just less likely to happen in this case because netifd is likely to finish setting up your network before the sysntpd service is started, as more services are executed in between.
With a dumb AP, you run very few services at boot, so you are much more likely to hit the race condition.

Regardless, ntpd should be fixed, it should NOT cache bad info when all peer resolutions fail (why is it even doing this to begin with?).


I was going crazy because my new main router has no NTP server capability; fortunately I landed on your "shitty solution", which works for me. Thank you so much!

What I have done is force a time sync after the network comes up, as sometimes it still takes a while to do it on its own:

until ping -4 -c 1 -w 1 -W 1 dns.opendns.com > /dev/null 2>&1; do
    sleep 1
done

ntpd -q -p time.cloudflare.com
sleep 5

The issue happens because busybox's ntpd reads /etc/resolv.conf only once. If by that time /etc/resolv.conf has not been properly generated, the current ntpd process will forever be unable to resolve hostnames.

There are several obvious solutions to this problem; some are:

  • Use raw IP address(es) for your NTP server(s), like the one from the main router (running an NTP server).
  • Wait for a valid /etc/resolv.conf before starting the sysntpd service (as shown in my solution).
  • Use a hotplug script to start the sysntpd service.
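
The first option (raw IP) can be sketched as a small helper run on the AP. The function name is made up, and 192.168.1.1 in the usage line is just the main router address used elsewhere in this thread.

```shell
#!/bin/sh
# Replace the configured NTP server list with a single IP address so that
# sysntpd never needs DNS. $1 is the server IP, e.g. the main router.
# (Hypothetical helper name.)
set_ntp_server_ip() {
    uci -q delete system.ntp.server
    uci add_list system.ntp.server="$1"
    uci commit system
}
```

For example, `set_ntp_server_ip 192.168.1.1` followed by a restart of the sysntpd service. This only helps if the main router actually answers NTP queries.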

Hotplug script example /etc/hotplug.d/iface/99-sysntpd:

IFNAME=lan

[ "$INTERFACE" = "$IFNAME" ] || exit 0

case "$ACTION" in
    ifup)
        service sysntpd start
        ;;
    ifdown)
        service sysntpd stop
        ;;
esac

exit 0

Make sure to disable the sysntpd service and let the hotplug script handle it.

This should be fixed in busybox, though: /etc/resolv.conf should not be cached; it should be re-read whenever name resolution is required.

P.S. Currently I am using the raw IP address of the main router.
