[Netgear R7800] RTC bug / issue

Not long ago I switched from stock firmware to OpenWRT (hnyman stable openwrt-19.07 build) on my Netgear R7800.

Unfortunately, I ran into a serious issue with the RTC (system clock) from the very first day on OpenWRT.
The time on the R7800 quickly falls out of sync (lags behind).

How to reproduce:

  1. Disable NTP sync or block Internet access on the router.
  2. Synchronize the router's time with the browser's time using the standard LuCI feature
    (System - General Settings - Sync with browser).
  3. Wait a couple of days (1 day is enough) and check the router's time.
    The router's clock will be a few minutes behind other devices (PC, phone, another router, etc.).
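
The check in step 3 can also be scripted for a reading in seconds instead of eyeballing the LuCI page. This is only a minimal sketch: `clock_offset` is a hypothetical helper name (not an OpenWRT tool), and the address 192.168.1.1 in the usage comment is an assumption based on the OpenWRT default.

```shell
# clock_offset REF_EPOCH
# Prints (local time - reference time) in whole seconds.
# REF_EPOCH is a Unix timestamp taken from the clock being checked.
clock_offset() {
    ref=$1
    now=$(date +%s)
    echo $((now - ref))
}

# Usage, run on a PC whose clock is trusted (router address is an assumption):
#   clock_offset "$(ssh root@192.168.1.1 date +%s)"
# A positive result means the router is that many seconds behind the PC.
# The SSH round trip adds up to about a second of error.
```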

I have performed the steps above several times in the last 3 weeks, and the RTC bug reproduces consistently every time: the clock falls behind by about 2-3 minutes a day.
The last time I synced my router's time was about 7-10 days ago.
As of today the clock on the R7800 is 17 minutes behind.
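
To put these numbers in perspective, the drift can be expressed in parts per million (ppm), the usual unit for oscillator error. The helper below is a hypothetical sketch, not an existing tool; it only does the arithmetic on the figures reported above.

```shell
# drift_ppm OFFSET_SECONDS ELAPSED_SECONDS
# Prints the clock error in parts per million, rounded to the nearest integer.
drift_ppm() {
    awk -v off="$1" -v el="$2" 'BEGIN { printf "%.0f\n", off / el * 1000000 }'
}

# 17 minutes lost over 10 days and over 7 days (the range reported above)
drift_ppm $((17 * 60)) $((10 * 86400))   # prints 1181
drift_ppm $((17 * 60)) $((7 * 86400))    # prints 1687
```

For comparison, a clock that loses 1 second a day is off by roughly 12 ppm, so a drift above 1000 ppm is far outside what plain crystal inaccuracy would normally explain.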

After googling for a while I found a very similar issue that occurred in the DD-WRT project, and only on the R7800.
I think this could be the same issue.

I need assistance to diagnose and resolve this RTC issue.
I am new to OpenWRT and I am not a developer.

Note:
The purpose of this post is to resolve the RTC bug/issue (there is no need to discuss NTP).
I am aware that the RTC on the R7800 is implemented in software and becomes inaccurate over time.
But drift caused by that kind of inaccuracy is many times slower than what I see in my case (10 times or more).

Please refer to the following thread:

https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=324284&postdays=0&postorder=asc&start=312

time on r7800 falls out of sync, after 2 days its 3 mins behind according to router time displayed on gui. compared to any other ddwrt router, set same time zone, ntp is on etc. only the r7800 is doing this, going back it too seems to be another thing that started since the latest ipq changes, the same ones that made qos worse (setting way over what i want to get, to get it).
something is wrong with internal syncing.

https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=324284&postdays=0&postorder=asc&start=313

i can confirm..uptime 3 days, time off 5mins slow

https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=324284&postdays=0&postorder=asc&start=314

I've created an appropriate ticket for this with possible solution.
http://svn.dd-wrt.com/ticket/5629

https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=324284&postdays=0&postorder=asc&start=315

thank u, that fixed it in r30853 time is in sync so is uptime between the wds link again. hfsc qos is normal again to what i set, so is htb for upload only, but htb remains lagging behind whats set on the download as its always done do u know what would happen if the clk freq was reduced more? would it cure htb downlink or cause the router to desync again?

There was also a rolled-back commit:

My system information:

root@R7800:~# date -u
Sat Feb 20 14:47:29 UTC 2021

root@R7800:~# uptime
 14:50:25 up 1 day, 19:04,  load average: 0.00, 0.02, 0.00

root@R7800:~# hwclock
hwclock: can't open '/dev/misc/rtc': No such file or directory

root@R7800:~# uname -a
Linux R7800 4.14.218 #0 SMP Wed Feb 3 18:15:29 2021 armv7l GNU/Linux

root@R7800:~# cat /proc/version
Linux version 4.14.218 (perus@ub2010) (gcc version 7.5.0 (OpenWrt GCC 7.5.0 r11273-c9388fa986)) #0 SMP Wed Feb 3 18:15:29 2021

root@R7800:~# ubus call system board
{
        "kernel": "4.14.218",
        "hostname": "R7800",
        "system": "ARMv7 Processor rev 0 (v7l)",
        "model": "Netgear Nighthawk X4S R7800",
        "board_name": "netgear,r7800",
        "release": {
                "distribution": "OpenWrt",
                "version": "19.07-SNAPSHOT",
                "revision": "r11293-312c05611b",
                "target": "ipq806x/generic",
                "description": "OpenWrt 19.07-SNAPSHOT r11293-312c05611b"
        }
}

It seems to be fixed on DD-WRT; perhaps @dissent1 could comment on this.

It really seems that the oscillator is misconfigured...

That was only my guess, based on searching through forums.
Besides, that DD-WRT oscillator issue dates back to 2016.
I haven't checked any bug trackers or source code repositories because I have no development skills.

Is it possible to check what oscillator value is currently used by OpenWRT for R7800?

From the dts it seems to be set to the correct value of 25 MHz.
We need to check whether this is also reproducible on another router, just to exclude a hardware defect of some kind...
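
The value the running kernel got from the dts can be read back on the device, because device-tree properties are exposed as files under /proc/device-tree, stored as raw big-endian 32-bit cells. Below is a minimal sketch: the helper names are hypothetical, the exact node path of the board oscillator is not known here (so the script simply lists every `clock-frequency` property it finds), and it assumes `od` is available. If `od` is missing on the router's BusyBox, the same four bytes can be read with `hexdump` instead.

```shell
# dt_u32 FILE
# Decodes the first 4 bytes of FILE as a big-endian unsigned 32-bit integer,
# which is how a single-cell device-tree property is stored.
dt_u32() {
    hex=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    printf '%d\n' "0x$hex"
}

# list_dt_clocks [DT_ROOT]
# Prints every clock-frequency property under DT_ROOT with its value in Hz.
# /proc/device-tree only exists on device-tree based kernels.
list_dt_clocks() {
    find "${1:-/proc/device-tree}/" -name clock-frequency 2>/dev/null |
    while read -r f; do
        echo "$f: $(dt_u32 "$f") Hz"
    done
}

# Usage on the router:
#   list_dt_clocks
```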

I use latest STABLE build from hnyman (based on 19.07 mid 2019 + updates up to 2021).
Is it possible for me to manually check the current oscillator value on my running system?
Maybe there is some command to issue or config file to check?
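
One thing that can be checked on any running Linux system is which clocksource the kernel uses for timekeeping. This does not print the oscillator frequency itself, but it shows which timer the clock is derived from. A small sketch using the standard sysfs files (the function name is hypothetical):

```shell
# show_clocksource [SYSFS_DIR]
# Prints the clocksource currently in use and the ones the kernel could switch to.
show_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    for f in current_clocksource available_clocksource; do
        if [ -r "$dir/$f" ]; then
            echo "$f: $(cat "$dir/$f")"
        else
            echo "$f: not available"
        fi
    done
}

show_clocksource
```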

It could be that the build I am running differs from the main branch in some RTC settings.

Does anybody know whether there is a way to view detailed system information like "Router details" or "Kernel Info" on OpenWRT?
As far as I know, DD-WRT shows such info in its GUI.

I am asking because sysinfo in DD-WRT contains information regarding RTC oscillator frequency the system is using.

Bootstrap clock 25MHz

I checked all the output of dmesg and logread, as well as the System Log and Kernel Log pages in the GUI on OpenWRT.
But none of it seems to contain this string, or anything similar, or any information about the RTC clock frequency (why, OpenWRT, why?).

Is there any other method to check oscillator on the running system?
Maybe there is a system command or any package for this?
Please advise.

PS: Maybe I am wrong and the RTC clock frequency is set by the bootloader? Is it possible to check the bootloader's log or reflash the bootloader with a different version? Sorry for the lame questions.

Playing with the bootloader is out of the question... (very easy to brick)
Anyway, there should be something in the kernel log... OpenWRT is a Linux system, so the same rules apply.
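
A quick way to narrow the kernel log down to timekeeping messages is to filter it by keyword. The keyword list below is only a guess at what might be relevant, not a list of strings the kernel is known to print on this board, and `clock_lines` is a hypothetical helper name.

```shell
# clock_lines
# Reads log text on stdin and keeps only lines that mention
# timekeeping-related keywords (case-insensitive).
clock_lines() {
    grep -i -E 'clocksource|sched_clock|timer|rtc'
}

# Usage on the router:
#   dmesg | clock_lines
#   logread | clock_lines
```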

I still wonder why you see this issue, as I have not noticed the time lag myself.

I use NTP, but I have set logging of ntp events, and the logs show no major time drift.

E.g. in the last day the drift noticed by ntp has been just ms magnitude:

root@router1:~# uptime; logread |  grep ntp
 20:28:26 up 1 day,  2:52,  load average: 0.00, 0.00, 0.00
Wed Feb 24 17:36:40 2021 user.notice ntpd: Time set, stratum=16 interval=32 offset=58.413796
Wed Feb 24 17:37:12 2021 user.notice ntpd: Stratum change, stratum=3 interval=32 offset=0.000068
Wed Feb 24 17:50:48 2021 user.notice ntpd: Stratum change, stratum=4 interval=64 offset=0.000576
Wed Feb 24 17:51:55 2021 user.notice ntpd: Stratum change, stratum=3 interval=64 offset=0.000684
Thu Feb 25 05:17:29 2021 user.notice ntpd: Stratum change, stratum=4 interval=4096 offset=0.000358
Thu Feb 25 06:24:32 2021 user.notice ntpd: Stratum change, stratum=3 interval=4096 offset=0.001232

I am wondering whether it could somehow be tied to the router load, or something like that. (My router has been pretty idle for the past day.)

I have checked both logs from top to bottom and did not find any records regarding the RTC oscillator frequency setup.
I searched for various keywords, but did not find anything related to setting the RTC parameters.
You can try examining your own logs; I may have missed something because I am new to OpenWRT.

Unfortunately I have no other identical router to try to reproduce the issue on it.
That is why I reverted to STOCK firmware today.
I'll set the correct time and wait a couple of days.

If the router's clock falls behind on STOCK firmware just as it did on OpenWRT, that will confirm a hardware malfunction of my device.
If the router's clock keeps the correct time, that will confirm a bug in the stable build of 19.07.

PS: All tests will be done in the same environment - no Internet access, no NTP sync.

Very good idea to test the RTC on the OEM firmware.

That was not the best idea as I know now ;)))
I'll post details a little later.

Spent a few days trying to reproduce the issue on other firmwares.

First of all I checked two stock firmwares: 1.0.2.68 (2019) and 1.0.2.74 (2020).
I tested each firmware for 24 hours with mixed load: 8 hours of running 'openssl speed' in an infinite loop, then 16 hours idle.
On both stock firmwares the clock was about 1 second behind after the 24-hour test.
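
For anyone who wants to repeat the load phase, it can be scripted with a time limit instead of an unbounded loop. This is only a sketch of the idea, not the exact loop used in the tests above, and `run_load` is a hypothetical helper name.

```shell
# run_load SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until SECONDS have elapsed. The deadline is only checked
# between runs, so the loop can overshoot by up to one run of COMMAND.
run_load() {
    duration=$1
    shift
    end=$(( $(date +%s) + duration ))
    while [ "$(date +%s)" -lt "$end" ]; do
        "$@" >/dev/null 2>&1
    done
}

# Usage, for an 8-hour load phase:
#   run_load $((8 * 3600)) openssl speed
```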

All I can say is: thank god, my router is OK and has no hardware malfunction!

Stock 1.0.2.68

# uname -a
Linux R7800 3.4.103 #1 SMP Thu Oct 17 15:17:32 CST 2019 armv7l unknown
# cat /proc/version
Linux version 3.4.103 (li.zhang@CNXMDNICP01) (gcc version 4.6.3 20120201 (prerelease) (Linaro GCC 4.6-2012.02) ) #1 SMP Thu Oct 17 15:17:32 CST 2019

Stock 1.0.2.74

# uname -a
Linux R7800 3.4.103 #1 SMP Tue Aug 25 19:42:52 CST 2020 armv7l unknown
# cat /proc/version
Linux version 3.4.103 (li.zhang@CNXMDNICP01) (gcc version 4.6.3 20120201 (prerelease) (Linaro GCC 4.6-2012.02) ) #1 SMP Tue Aug 25 19:42:52 CST 2020

As the next step I checked the official OpenWRT 19.07.7 firmware.
I got the same result: the clock was about 1 second behind after the 24-hour test.

OpenWRT 19.07.7

# uname -a
Linux OpenWrt 4.14.221 #0 SMP Mon Feb 15 15:22:37 2021 armv7l GNU/Linux
# cat /proc/version
Linux version 4.14.221 (builder@buildhost) (gcc version 7.5.0 (OpenWrt GCC 7.5.0 r11306-c4a6851c72)) #0 SMP Mon Feb 15 15:22:37 2021

[Screenshot: openwrt-19.07.7 time query, 2021-03-02]

I also tried to reproduce this RTC issue on the latest KONG 19.07 build.
On KONG I got the best result: the clock stayed in sync, meaning less than 1 second of shift in either direction!

KONG 19.07

# uname -a
Linux OpenWrt 4.14.216 #0 SMP Mon Jan 25 10:36:07 2021 armv7l GNU/Linux
# cat /proc/version
Linux version 4.14.216 (bluebat@helios) (gcc version 7.5.0 (OpenWrt GCC 7.5.0 2020-05)) #0 SMP Mon Jan 25 10:36:07 2021

[Screenshot: KONG 19.07 time query, 2021-03-02]

In the end I don't know what to think.

My next step will be to go back to @hnyman stable build to reproduce the issue again.

But now I think I made a mistake by reverting to Stock without first making a full backup of all flash partitions (I have no experience in doing things like that ;-).
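
For reference, here is a rough sketch of how such a backup could be prepared. It only reads the partition table from /proc/mtd and prints one `dd` command per partition; nothing is executed. The function name and the destination directory are assumptions. As far as I know the R7800 uses NAND flash, where a raw `dd` dump does not deal with bad blocks, so treat the output as a starting point and not as a verified backup procedure. Also note that /tmp lives in RAM, so the images must be copied off the router.

```shell
# mtd_backup_cmds [MTD_TABLE] [DEST_DIR]
# Reads a table in /proc/mtd format and PRINTS one dd command per partition.
# Nothing is executed; review the list before running any of it.
mtd_backup_cmds() {
    table=${1:-/proc/mtd}
    dest=${2:-/tmp/mtd-backup}
    # Data lines look like: mtd0: 00040000 00010000 "name"
    awk -v dest="$dest" '
        /^mtd[0-9]+:/ {
            dev = $1; sub(/:$/, "", dev)
            name = $4; gsub(/"/, "", name)
            printf "dd if=/dev/%sro of=%s/%s_%s.bin\n", dev, dest, dev, name
        }' "$table"
}

# Usage on the router:
#   mkdir -p /tmp/mtd-backup
#   mtd_backup_cmds
```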

You just lost your config... Nothing that bad...

After reverting from OEM back to the @hnyman stable build I was unable to reproduce the issue.
I have no idea what it was.

Maybe some NVRAM settings from the stock firmware could give such effect.

PS: I came across a mention of similar problems on the forum.
That time the issue was caused by the LuCI web interface being constantly opened in a browser.
I tried to reproduce, but I did not succeed.