I've set up NTP in LuCI and configured a computer to query the router for time. On the computer, "ntpq -p" returns a different refid address than the one configured in the router running OpenWrt. In /etc/config/system everything is as set in LuCI. Why does the router look like it's getting time from a different NTP server than the one configured?
Here's the chain; each one is configured to query the next:
A Computer
B Router
C NTP Server that the router is configured to query
D Unknown higher NTP servers
If I issue "ntpq -p" from A, shouldn't the refid be the server from C? You're saying I should get the server from D. A double hop like this would make it impossible to ever learn of C, which doesn't make sense.
Then, if it doesn't make sense, your understanding of NTP is completely inaccurate. There is no such thing as "hops" in NTP, only Strata. If you issue ntpq -p from A, you should see its programmed NTP servers and their Reference IDs. Those IDs refer to the Stratum 0 Server in its hierarchy. The only server possessing "real time" (my phrase, not Dr. Mills') is the Stratum 0 somewhere above D in your example.
A reference clock will generally (though not always) be a radio timecode receiver synchronized to standard time as provided by NIST and USNO in the US, NRC in Canada and their counterparts elsewhere in the world.
So you are correct in one respect: the RefID usually refers to the Stratum 0 server, not others in between.
If the refid address is that of a stratum 0 server, then how do you learn of intermediate servers? Also, what if it isn't a stratum 0 server? How can you know or check? ntptrace didn't work well.
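For comparing what "ntpq -p" reports against what is configured, it can help to pull the remote and refid columns out of the output. Below is a minimal Python sketch, assuming the usual ten-column peer listing. The sample text is made up for illustration (the addresses come from documentation ranges, not from a real router), and parse_ntpq_peers is a hypothetical helper name. The refid column is whatever reference ID each remote reports for itself.

```python
# Illustrative sample only: not captured from a real machine.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.1       198.51.100.7     3 u   33   64  377    0.412    0.118   0.052
"""


def parse_ntpq_peers(text):
    """Return one dict per peer line of `ntpq -p` style output."""
    peers = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header, and the ===== separator.
        if not stripped or stripped.startswith("remote") or set(stripped) == {"="}:
            continue
        # The first character is the tally code ('*' marks the selected peer).
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        peers.append({
            "selected": line[0] == "*",
            "remote": fields[0],
            "refid": fields[1],
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


if __name__ == "__main__":
    for peer in parse_ntpq_peers(SAMPLE):
        print(peer["remote"], "reports refid", peer["refid"], "at stratum", peer["stratum"])
```

On a real machine the text would come from running ntpq -p and capturing its output; that step is left out here so the sketch has no external dependencies.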
For ntptrace to work properly, each of these servers must implement the NTP Control and Monitoring Protocol specified in RFC 1305 and enable NTP Mode 6 packets.
NTP, at least in the reference implementation, connects to multiple servers and assesses the "sanity" and "quality" of all the upstream servers. It is dynamic and changes depending on the reported state of the upstream servers, as well as the impact of "Internet weather" on the transmission paths.
If you're that concerned about accurate, robust time, you should do two things:
Formally request access to cryptographically signed NTP servers (see NTP rules of engagement)
Run your own stratum 1 server locally
Note that to run your own stratum 1 server with any accuracy (at a reasonable cost) you'll need a "real" GPS unit intended for time keeping with a hard-wired PPS output and/or a GPSDO, rather than some chip/board you bought off eBay. Serial connections are not accurate enough, even when over a UART. Throwing USB into the mix makes things even worse.
Great answers! First I want to check from A that B queries C. Is it common for ntptrace to fail i.e. for servers not to handle it properly? Are there other tools for this job?
On page 22 it says refid is an IP address only for stratum 2 servers or lower. Stratum 0 and 1 give ASCII strings, so the IP I'm getting for refid can't be of a stratum 0 server.
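The stratum-dependent reading of the refid field described above can be sketched in a few lines of Python. This follows the convention in RFC 5905 as I understand it (ASCII for stratum 0 and 1, an IPv4 address above that); describe_refid is a hypothetical helper name, and the example values are made up.

```python
import ipaddress


def describe_refid(stratum, refid):
    """Interpret a 4-byte NTP reference ID according to the sender's stratum."""
    if len(refid) != 4:
        raise ValueError("refid must be exactly 4 bytes")
    if stratum <= 1:
        # Stratum 0 (kiss code) or stratum 1 (reference clock name):
        # up to four ASCII characters, NUL padded.
        return refid.rstrip(b"\x00").decode("ascii", errors="replace")
    # Stratum 2 and above: the IPv4 address of the sender's sync source.
    # If that source is reached over IPv6, the four bytes are a hash and
    # the dotted form below is not a real address.
    return str(ipaddress.IPv4Address(refid))


if __name__ == "__main__":
    print(describe_refid(1, b"GPS\x00"))
    print(describe_refid(2, bytes([198, 51, 100, 7])))
```

So a dotted-quad refid only tells you the field came from a server at stratum 2 or higher; whether it is a usable address depends on the upstream being IPv4.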
In the well over 20 years I've used NTP, I've never had the occasion to need to use ntptrace or the like. ntpq will give you the state variables of your machine, as well as the important ones of the sources that the algorithms are considering. Those are much more important than exactly which servers you are querying. Take a look at things like root dispersion and jitter and the like -- that is really what you should be caring about if you're that concerned about time accuracy.
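If ntpq is not available on a box, the root delay and root dispersion a server advertises can also be read straight out of the first 16 bytes of its reply. The following is a rough Python sketch, assuming the NTPv4 header layout from RFC 5905. parse_ntp_header and query_server are hypothetical names, and query_server is only defined here, not run.

```python
import socket
import struct


def parse_ntp_header(packet):
    """Decode the fixed header fields of an NTP packet (first 16 of 48 bytes)."""
    if len(packet) < 48:
        raise ValueError("NTP packets are at least 48 bytes")
    flags, stratum, poll, precision, root_delay, root_disp, refid = struct.unpack(
        "!BBbbII4s", packet[:16]
    )
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,
        "stratum": stratum,
        # Root delay and dispersion use 16.16 fixed point, in seconds.
        "root_delay_s": root_delay / 65536.0,
        "root_dispersion_s": root_disp / 65536.0,
        "refid": refid,
    }


def query_server(host, timeout=2.0):
    """Send one client-mode request to `host` and parse the reply header."""
    request = b"\x23" + bytes(47)  # LI=0, VN=4, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, 123))
        reply, _ = sock.recvfrom(512)
    return parse_ntp_header(reply)
```

Calling query_server with the router's address from the computer would show the stratum and refid the router itself advertises, independent of what any ntpq build prints.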
Yes, NTP servers become unreachable all the time, and come back again as well. Not only that, but some are "better" than others in various ways (which is time varying as well). That is why the reference selection and clock combining algorithms have gotten so sophisticated over the years. A well-locked, solidly connected Stratum 2 or 3 server can be a better reference than a Stratum 1 server on a poor link, for example.
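To put rough numbers on that last point, here is a simplified Python sketch of a root-distance style comparison. It is not the reference implementation's selection algorithm: the real thing also runs intersection and clustering steps, takes stratum into account, and includes a term that grows with time since the last update. The Candidate class, the formula as written, and all the figures are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    stratum: int
    delay_s: float            # measured round trip to this server
    dispersion_s: float       # measurement error bound to this server
    root_delay_s: float       # advertised by the server
    root_dispersion_s: float  # advertised by the server
    jitter_s: float


def root_distance(c):
    """Simplified error budget back to the primary reference, in seconds."""
    return (
        (c.root_delay_s + c.delay_s) / 2
        + c.root_dispersion_s
        + c.dispersion_s
        + c.jitter_s
    )


def pick_best(candidates):
    """Pick the candidate with the smallest simplified root distance."""
    return min(candidates, key=root_distance)


# Made-up figures: a stratum 1 server over a poor link vs. a nearby stratum 2.
FAR_STRATUM_1 = Candidate("far-s1", 1, 0.180, 0.020, 0.000, 0.001, 0.030)
NEAR_STRATUM_2 = Candidate("near-s2", 2, 0.002, 0.001, 0.010, 0.005, 0.0005)

if __name__ == "__main__":
    for c in (FAR_STRATUM_1, NEAR_STRATUM_2):
        print(c.name, round(root_distance(c), 4))
    print("picked:", pick_best([FAR_STRATUM_1, NEAR_STRATUM_2]).name)
```

With these numbers the nearby stratum 2 server comes out well ahead, because the long, jittery path to the stratum 1 server dominates its error budget.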
I really think you are using the word "hop" like it's used with traceroute. Your understanding of NTP is again inaccurate.
That's not necessarily the case:
A Stratum 1 server can have an IP
If it has an IP, it is likely hard-connected to a Stratum 0 server
Also, you are basing your opinions on the Reference Server, which isn't the only implementation of NTP in the world (e.g. OpenWrt uses busybox-ntpd by default).
There is no tool that does quite what you describe, because what you describe is not how NTP works. There could be hundreds of NTP servers on the same stratum, but you assume that it works more like traceroute. I think your confusion is rooted in a poor understanding of NTP. As @jeff noted:
Even if you got a successful "ntptrace," the results would be different later. If you want to control what NTP servers are queried in your hierarchy, you should:
Please use the widely available internet sources (e.g. https://stackexchange.com/search?q=ntp) to gain proper insight into how NTP works. Once you have done that, you can come back to the LEDE forum to discuss specific issues with NTP on LEDE/OpenWrt.