Possible cause of R7800 latency issues

Adding some predictable and stable overhead is understood, and this router does indeed add ~1ms or so. It is the seemingly random and sizeable spikes that are a concern.


A web-based speed test does not run pings long enough to catch this issue. Are you suggesting pinging one of their servers?

Why are you under the impression that ping can clearly provide viable results to measure your issue?

My suggestion is to use some Layer 7 test, particularly on a server designed for it. The best suggestion would be to have a server plugged into the WAN port solely for this test...but in any case...

Google Public DNS was designed and optimized for udp/53, not ICMP Echo-Request. They could be limiting your pings...delaying them, etc...

http://netalyzr.icsi.berkeley.edu/

That's all.

This is true, but the exact same test with the same router running stock firmware, or with an Archer C7 running the same master LEDE build, introduces no spikes and almost unnoticeable jitter.


Take a look at @richb-hanover's betterspeedtest ...

A few things to think about:

  • The intermittent latency spikes every few minutes might be Wi-Fi channel scans. At least you can try to rule it out...

  • The charts shown in that blog posting are created by flent (Flent.org). It's a great way to make repeatable latency/traffic measurements.

  • I've never heard anyone say that Google limits/delays pings. I always ping gstatic.com or ping google.com with pretty uniform results.

  • Yes, betterspeedtest.sh is sort of like the DSLReports Speedtest except no GUI and run from the command line. It has parameters for how long to run, what device to ping and the number of simultaneous sessions (among others), then summarizes the latencies into min, 10th percentile, average, median, 90th percentile, and max.
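For anyone who wants the same kind of summary from their own ping log, here is a minimal Python sketch. It is not how betterspeedtest.sh computes its numbers; the function names and the nearest-rank percentile rule are my own assumptions for illustration.

```python
import statistics


def percentile(sorted_vals, pct):
    """Nearest-rank percentile of an already-sorted list (pct is an integer 1..100)."""
    if not sorted_vals:
        raise ValueError("no samples")
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(sorted_vals) + 99) // 100)
    return sorted_vals[rank - 1]


def summarize(rtts_ms):
    """Summarize round-trip times (ms) as min, 10th pct, average, median, 90th pct, max."""
    s = sorted(rtts_ms)
    if not s:
        raise ValueError("no samples")
    return {
        "min": s[0],
        "p10": percentile(s, 10),
        "avg": statistics.fmean(s),
        "median": statistics.median(s),
        "p90": percentile(s, 90),
        "max": s[-1],
    }
```

Comparing the 90th percentile and max against the median shows at a glance whether a run had isolated spikes or generally elevated latency.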

Ok, as a baseline I've plotted the ping I get from pinging 8.8.8.8 directly connected to the modem:

t_modem

The ping never goes above 15 ms and is very stable, apart from minor isolated spikes. With this in mind, I think it's reasonable to use 8.8.8.8 as a target.

Here's a plot from pinging 8.8.8.8 from my computer connected to the R7800 via ethernet:

t_r7800

The issue is definitely with the R7800.

Here's also a plot with both wifi interfaces disabled:

t_r7800_nowifi

Pretty much the same story.

My friend...you are in the tens of milliseconds...I stand corrected...

I'm sure you know your R7800 more than I...

What other devices are similar in:

  • CPU
  • Memory; and
  • Switch chip (if any)?

ZyXEL NBG6817 is identical and has no such issue. Archer C2600 is similar and does not seem to have the issue either. There is also anecdotal evidence that R7800 units sold more than about a year ago have no such issue. The "new" models have antenna numbers (1, 2, and 3) engraved on them and the older ones do not.

What do you mean by "antenna numbers"?

Is this the only way to tell the "new" and "old" models apart (aside from opening it and inspecting the board, chips, etc.)?

Does anyone following this thread have such a similar router?

  • lspci
  • lsusb
  • Flash chip info?
  • Memory chip info?
  • Is this a WiFi-only issue?

(and...a developer would be able to confirm there are no "significant" differences in code.)

We are only using a wired connection for testing.

I'm not sure there are any significant differences in hardware apart from cosmetic changes to the antennas and the power adapter. We need tests from people with the "old version" of the R7800 to determine if they experience this issue.

@hnyman are your antennas numbered with stickers or do they have engraved numbers?

Neither. I think that the antenna number texts like "Antenna 3" are printed on the antennas. No visible sticker, no engraving.

I don't much buy the theory that some magical hardware change would have caused this possible issue. In general, every additional network device adds latency and the possibility of resource conflicts causing delays.

So, it would be interesting to hear if you see similar latency spikes when you connect some other router instead of the R7800 between your PC and modem. Some specific hardware combo may increase/decrease the possibility of the issue.

Also, I am not sure if it really makes sense to ping Google's 8.8.8.8, as the traffic may get routed to different servers in their CDN. (I am pretty sure that there are lots of servers answering 8.8.8.8 DNS requests.) And there are several devices along the path, maybe 7 hops?

A better host to ping might be the "next hop", likely your ISP's device nearest to you.

I tested the ISP next hop from behind my own R7800, using the "hrping" tool in Windows. Short 500-ping tests with the same 0.2 sec interval as you.

  • on the first try I saw one 64ms spike and one 20ms spike. Otherwise pretty steady 2 ms roundtrip.
    Packets: sent=500, rcvd=500, error=0, lost=0 (0.0% loss) in 99.817061 sec
    RTTs in ms: min/avg/max/dev: 1.594 / 2.232 / 63.923 / 3.970
    Bandwidth in kbytes/sec: sent=0.300, rcvd=0.300

  • on the second try there were no spikes. Pretty steady at 2ms roundtrip.
    Packets: sent=500, rcvd=500, error=0, lost=0 (0.0% loss) in 99.806583 sec
    RTTs in ms: min/avg/max/dev: 1.576 / 1.941 / 3.647 / 0.117
    Bandwidth in kbytes/sec: sent=0.300, rcvd=0.300

  • on the third try there was one 22 ms spike. Otherwise steady 2ms roundtrip.
    Packets: sent=500, rcvd=500, error=0, lost=0 (0.0% loss) in 99.813805 sec
    RTTs in ms: min/avg/max/dev: 1.575 / 2.038 / 21.846 / 1.107
    Bandwidth in kbytes/sec: sent=0.300, rcvd=0.300

So, out of 1500 pings, I got 3 "spikes", 0.2% of pings. I find that rather normal for a router that has ongoing LuCI statistics monitoring, nlbwmon traffic monitoring etc. going on in the background. And of course on the PC itself, there is stuff ongoing (like me writing this message to the forum at the same time).

Ps.
There was somebody complaining that R7800 adds 1-2 ms to the latency. Well, at least I reach my ISP steadily at 2ms (at quickest 1.575ms), so I doubt R7800 really has added 1-2ms of latency :wink: as the latency could hardly be 0 even when directly connected.

Apples to oranges :slight_smile: You are using DHCP and I am using PPPoE (VDSL). It takes 10ms just to reach the first hop when connected directly to the modem...

Ok, I'll pull out my ancient DIR-655 A4 and see if it has any of these issues. :stuck_out_tongue:

So this means doing a traceroute and finding the first hop that's not my router, right?

My second plot in post #28 contains 9000 pings. I ran the numbers and 40 have a ping above 30ms - that's ~0.4 percent of the total number of pings. Not a large number, but it's still annoying. Additionally, I run a pretty basic build on my router that shouldn't really be taxing it much.
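For reference, 40 out of 9000 works out to about 0.44 percent. Below is a rough Python sketch of how such a count could be made from a saved ping log. It assumes Linux iputils-style lines containing `time=12.3 ms`; the function names and the 30 ms default threshold are only illustrative.

```python
import re

# Assumes iputils-style reply lines, e.g.
# "64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=12.3 ms"
TIME_RE = re.compile(r"time[=<]([0-9.]+)\s*ms")


def parse_rtts(ping_output):
    """Extract round-trip times in ms from ping output text."""
    return [float(m.group(1)) for m in TIME_RE.finditer(ping_output)]


def spike_share(rtts_ms, threshold_ms=30.0):
    """Return (number of spikes, percent of samples) strictly above threshold_ms."""
    if not rtts_ms:
        raise ValueError("no samples")
    spikes = sum(1 for r in rtts_ms if r > threshold_ms)
    return spikes, 100.0 * spikes / len(rtts_ms)


if __name__ == "__main__":
    # Synthetic data shaped like the numbers in this post, not a real capture.
    rtts = [10.0] * 8960 + [50.0] * 40
    count, pct = spike_share(rtts)
    print(f"{count} spikes, {pct:.2f}% of {len(rtts)} pings")
```

Running the same count on the modem-only baseline and on the router capture with the same threshold would make the two plots directly comparable.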

I have my original R7800 test unit and a new retail unit provided by Netgear for another test at the end of last year. I see no difference. In addition, the engraved numbers are just for cosmetic reasons, so people don’t put the back antennas on the side and vice versa, not that it would make a difference. Netgear confirmed the same (as in no other difference). Currently using the beta unit as a LEDE test unit with the retail unit as an AP, so I guess some load off the CPU.

Yep. That removes most of the subsequent hops from the equation and helps identify what happens on the network link that your router/modem selection really affects.

(The hops from ISP to Google are beyond your power, and it makes no sense to have the impact from variation in those hops in the analysis of router-induced spikes.)
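If you want to automate picking that ping target, something like the Python sketch below could work. It assumes numeric output in the style of Linux `traceroute -n`, and that the router's LAN is 192.168.1.0/24; both are assumptions to adjust for your own setup, and the function name is made up for this example.

```python
import ipaddress
import re

# Matches hop lines such as " 1  192.168.1.1  0.412 ms  0.380 ms  0.371 ms"
HOP_RE = re.compile(r"^\s*(\d+)\s+(\d{1,3}(?:\.\d{1,3}){3})\b")


def first_hop_beyond_lan(traceroute_output, lan_cidr="192.168.1.0/24"):
    """Return the first responding hop address outside the LAN subnet, or None."""
    lan = ipaddress.ip_network(lan_cidr)
    for line in traceroute_output.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header line, or a hop that timed out ("* * *")
        addr = ipaddress.ip_address(m.group(2))
        if addr not in lan:
            return str(addr)
    return None


if __name__ == "__main__":
    # Made-up sample using a documentation address, not a real trace.
    sample = (
        "traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets\n"
        " 1  192.168.1.1  0.412 ms  0.380 ms  0.371 ms\n"
        " 2  203.0.113.1  9.8 ms  10.1 ms  9.9 ms\n"
    )
    print(first_hop_beyond_lan(sample))
```

Note that hops which do not answer are skipped, so the result is the first responding hop beyond the LAN, which is not necessarily the very next device on the path.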

What happens if the router pings itself? Will it see the same latency spikes?

That's a very good test...