Busybox ntpd (sysntpd) client msg refid incorrect

Currently the busybox ntpd (sysntpd), in server mode, returns the refid of the currently selected upstream host in reply messages.

This is incorrect. It should return the currently selected reference server's own ID as the refid,
i.e. the reference server's own IP address converted into a 32-bit unsigned int
(or, for IPv6, the first four bytes of the binary MD5 digest of the IPv6 address).

Example:

my home router 192.168.1.254
ntp peer -> 212.159.13.49 (cdns01.plus.net) refid: C342F10A (-> 195.66.241.10 -> ntp2.linx.net)

my home laptop
ntp peer -> 192.168.1.254 SHOULD BE refid: C0A801FE ( -> 192.168.1.254 )

The busybox ntpd is sending C342F10A to my laptop; this is incorrect.
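For readers following the hex values: an IPv4 refid is just the four address bytes read as one 32-bit number, so C0A801FE is 192.168.1.254 and C342F10A is 195.66.241.10. A minimal sketch of that conversion follows; `ipv4_to_refid` is a hypothetical helper written for this post, not a function from busybox or ntpd, and it does not cover the IPv6/MD5 case.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper (not from busybox or ntpd): convert a dotted-quad
 * IPv4 string into the 32-bit value shown as a hex refid in this thread.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
static int ipv4_to_refid(const char *dotted, uint32_t *refid)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    /* s_addr is stored in network byte order; ntohl() yields the number
     * whose hex digits match the address bytes from left to right. */
    *refid = ntohl(addr.s_addr);
    return 0;
}
```

Printing the result as eight hex digits gives the values quoted above, for example C0A801FE for 192.168.1.254.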


The purpose of refid is to prevent timing loops:

https://tools.ietf.org/id/draft-stenn-ntp-not-you-refid-00.html
The purpose of the REFID is to prevent a one-degree "timing loop": where if A has several timing sources that include B, if B decides to get its time from A, then A should not then decide to get its time from B. The REFID is therefore a vital core-component of the base NTP packet.

I'm thinking I should raise a bug about this, then I can work on a fix...

Rick Frankland.

No, it should do what it's doing - show the IP of the upstream server.

See directly from Dr. Mills: https://www.eecis.udel.edu/~mills/ntp/html/assoc.html#symact

You posted an expired Draft RFC.

Hi.
I don't see any specific mention of refid in the description of "Symmetric Active/Passive Mode". Please explain why the current implementation of refid supports this NTP feature.

Major alternative implementations of ntp do exactly as I have described:

$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.168.1.254   212.159.6.10     3 u  282 1024  377    0.614   -1.524   0.864
+monitor-1.lan   192.168.1.254    4 u  187 1024  377    1.347   -1.594   1.097
+rick-OptiPlex-7 192.168.1.254    4 u  264 1024  377    5.962   -0.315   5.917
 hydra.lan       192.168.1.254    4 u   45 1024   11    3.993   55.699 160.868

The router at 192.168.1.254 and rick-optiplex-755 run chrony; monitor-1 and hydra.lan run the reference ntp.org implementation of ntpd. By your definition of correct, both implementations should report a refid of 212.159.6.10, but they do not. They DO report the address of their immediate peer, 192.168.1.254, as the refid. Please explain why they are incorrect; I may then bring this to their attention.

Also, regarding the purpose of refid, this is from https://tools.ietf.org/rfc/rfc5905.txt
"Above stratum 1 (secondary servers and clients): this is the
reference identifier of the server and can be used to detect timing
loops. "
How is this possible with the current implementation? How do the chrony and ntp.org implementations do this? Please explain.

This is not Busybox NTP.

That is correct. I installed the ntpd package (https://openwrt.org/packages/pkgdata/ntpd), which builds the full-blown "reference" implementation of NTP. The source downloads from:
PKG_SOURCE_URL:=http://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-4.2/
This software demonstrates what I believe is the correct behaviour, as does the chrony package. It is running on the OpenWrt device named 'hydra' above.

The OpenWrt firewall router at 192.168.1.254 is running the chrony package for NTP.

The ntpq -p output is from my Ubuntu laptop, which is peered to the other servers in my home network.

If a chain of busybox ntpd servers was set up, the same refid would be passed all the way to the end of the chain. This would prevent the correct checking for timing loops.
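A rough model of this chain argument, assuming that a loop check simply compares a candidate's advertised refid with the checker's own address. Everything here is hypothetical and written for this post (the `node` struct, the two policies, the laptop address 192.168.1.10); it is not code from busybox, chrony or ntpd.

```c
#include <stdint.h>

/* Hypothetical model of one server in a chain (IPv4 only). */
struct node {
    uint32_t addr;       /* this server's own address, as a 32-bit value */
    uint32_t sent_refid; /* refid this server puts in its replies */
};

/* Pass-through policy: forward the upstream's refid unchanged.
 * This is the behaviour this thread attributes to busybox ntpd. */
static void sync_passthrough(struct node *self, const struct node *upstream)
{
    self->sent_refid = upstream->sent_refid;
}

/* Address policy: advertise the upstream server's own address. */
static void sync_by_address(struct node *self, const struct node *upstream)
{
    self->sent_refid = upstream->addr;
}

/* Simplified loop check: does the candidate claim to be synced to us? */
static int candidate_is_synced_to(const struct node *candidate,
                                  const struct node *self)
{
    return candidate->sent_refid == self->addr;
}
```

With upstream 212.159.13.49 advertising C342F10A, router 192.168.1.254 and a laptop behind it, the pass-through policy leaves the laptop advertising C342F10A, so the router's check against its own address never matches. With the address policy the laptop advertises C0A801FE and the router can see that the laptop is synced to it.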

I believe I am providing evidence of why the busybox ntpd refid is incorrectly implemented, and of what the correct implementation should be, based on output from two other well-established and widely used software packages, my own research of the NTP documentation, and delving into the code.

Is there more evidence I should provide? please let me know.

And, thank you for reading and taking an interest.

From the "reference" ntpd implementation ntp-4.2.8p13/ntpd/ntp_proto.c

2781         if (   peer->stratum == STRATUM_REFCLOCK
2782             || peer->stratum == STRATUM_UNSPEC)
2783                 sys_refid = peer->refid;
2784         else
2785                 sys_refid = addr2refid(&peer->srcadr);

The busybox ntpd ONLY does something similar to line 2783, without performing any check of the peer's stratum.
What the busybox ntpd SHOULD be doing is the whole of the above lines. Then, when a peer
at a stratum between 1 and 15 is selected, some code similar to line 2785 would get executed.
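To make the stratum check concrete, here is a self-contained sketch of the same decision as lines 2781 to 2785. The struct and function names are stand-ins invented for this post (IPv4 only), not busybox or ntpd types, and the constant values 0 and 16 are assumed to match ntpd's STRATUM_REFCLOCK and STRATUM_UNSPEC.

```c
#include <stdint.h>

/* Stand-in constants; values assumed to match ntpd's ntp.h. */
enum { SKETCH_STRATUM_REFCLOCK = 0, SKETCH_STRATUM_UNSPEC = 16 };

/* Hypothetical, simplified view of the selected peer (IPv4 only). */
struct sketch_peer {
    uint8_t  stratum; /* stratum the peer reported */
    uint32_t refid;   /* refid the peer reported, i.e. the peer's own source */
    uint32_t addr;    /* the peer's own IPv4 address as a 32-bit value */
};

/* Pick the refid this server would advertise in its own replies. */
static uint32_t choose_sys_refid(const struct sketch_peer *p)
{
    /* Reference clocks and unsynchronised peers: pass their refid through. */
    if (p->stratum == SKETCH_STRATUM_REFCLOCK ||
        p->stratum == SKETCH_STRATUM_UNSPEC)
        return p->refid;

    /* Ordinary network peers (stratum 1 to 15): advertise the peer's address. */
    return p->addr;
}
```

For a peer at 212.159.13.49 that itself reports refid C342F10A, and assuming it sits at stratum 2, this returns D49F0D31 rather than C342F10A.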

And this is where the reply message to the peer is constructed (line 3953 of the same file):
        xpkt.refid = sys_refid;

And here the refid is used to check for a timing loop (starting at line 4754 of the same file). The busybox ntpd is currently missing an equivalent check.

/*
 * local_refid(peer) - check peer refid to avoid selecting peers
 *                     currently synced to this ntpd.
 */
static int
local_refid(
        struct peer *   p
        )
{
        endpt * unicast_ep;

        if (p->dstadr != NULL && !(INT_MCASTIF & p->dstadr->flags))
                unicast_ep = p->dstadr;
        else
                unicast_ep = findinterface(&p->srcadr);

        if (unicast_ep != NULL && p->refid == unicast_ep->addr_refid)
                return TRUE;
        else
                return FALSE;
}

Actually, I must apologise. I made a mistake in describing the correct behaviour, which made this very confusing. Here is the example, now corrected:

Example:

my home router 192.168.1.254
ntp peer -> 212.159.13.49 (cdns01.plus.net) refid: C342F10A (-> 195.66.241.10 -> ntp2.linx.net)

my home laptop
ntp peer -> 192.168.1.254 SHOULD BE refid: D49F0D31 ( -> 212.159.13.49 -> cdns01.plus.net )

The busybox ntpd is sending C342F10A to my laptop; this is incorrect.

You should report this issue at busybox.net - I doubt it will get much traction here.


Thank you. I will do that.