UPDATE: While troubleshooting the latency issue, I initially focused on the wrong value. The real problem turned out not to be the additional 1ms of latency but intermittent latency spikes from 11ms to 50..100ms a few times a minute. The thread evolved accordingly, so I updated the subject.
Hi, I have just tried the "ping 8.8.8.8" test, first with my computer connected directly to a VDSL modem (bridged) and then through a router (Netgear R7800 running LEDE 17.01.4). The ping times without the router are ~10.3ms and all stay within the 10.1 to 10.5ms range over a few minutes of monitoring.
Once I connect via the router (wired of course), the ping times jump by 1ms and hover between 11.2 and 11.5ms most of the time, with a spike to 50 to 100ms two or three times a minute.
Tried with and without SQM and did not see a big difference.
Is that normal for a router to add 1ms latency?
What can be causing those odd ping spikes to 50 or 100ms or more a few times a minute? They do not happen when connected to the modem directly.
The firewall is only turned on on the router; the modem is in bridged mode. This is not a double NAT setup. I just did not expect 1ms to be added by the router.
Right, there is no problem. I just was not expecting the router to consistently add 1ms of latency. Below are the contents of /etc/config/firewall, to which I have made very few modifications.
config defaults
option syn_flood '1'
option input 'ACCEPT'
option output 'ACCEPT'
option forward 'DROP'
option drop_invalid '1'
config zone
option name 'lan'
option input 'ACCEPT'
option output 'ACCEPT'
option forward 'ACCEPT'
option network 'lan'
config zone
option name 'wan'
option output 'ACCEPT'
option masq '1'
option mtu_fix '1'
option input 'DROP'
option forward 'DROP'
option network 'wan wan6'
config forwarding
option src 'lan'
option dest 'wan'
config rule
option target 'ACCEPT'
option proto 'tcp udp'
option name 'Guest DNS'
option dest_port '53'
option src 'guest'
config rule
option target 'ACCEPT'
option proto 'udp'
option name 'Guest DHCP'
option src 'guest'
option dest_port '67-68'
config include
option path '/etc/firewall.user'
config zone
option name 'guest'
option output 'ACCEPT'
option input 'DROP'
option forward 'DROP'
option network 'pluto'
config forwarding
option dest 'wan'
option src 'guest'
config redirect 'dns_override_lan'
option name 'DNS Override (lan)'
option src 'lan'
option proto 'tcp udp'
option src_dport '53'
option dest_port '53'
option target 'DNAT'
config redirect 'dns_override_guest'
option name 'DNS Override (guest)'
option src 'guest'
option proto 'tcp udp'
option src_dport '53'
option dest_port '53'
option target 'DNAT'
config rule
option target 'ACCEPT'
option src 'guest'
option name 'Printer'
option dest_ip '192.168.1.100'
option dest 'lan'
option proto 'tcp udp'
config include 'bcp38'
option type 'script'
option path '/usr/lib/bcp38/run.sh'
option family 'IPv4'
option reload '1'
Yes, they do. With SQM on or off, latency spikes look like below: frequent, significant, but isolated. Those spikes are not happening when I connect directly to the modem. My router is Netgear R7800 and one would expect it to have enough muscle to run smoothly. I am not using WiFi for these tests, it is all on wired.
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=60 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=60 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=60 time=68.8 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=10 ttl=60 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=12 ttl=60 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=60 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=60 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=60 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=60 time=50.2 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=60 time=20.0 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=29 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=30 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=31 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=32 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=33 ttl=60 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=34 ttl=60 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=35 ttl=60 time=11.2 ms
And I get this when connected directly to the modem, which is in the same bridged mode. I am using the same NIC on my PC, but a different cable. Not a single spike over 11ms in ~5 minutes of observation.
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=10 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=12 ttl=61 time=10.6 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=61 time=10.6 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=29 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=30 ttl=61 time=10.4 ms
64 bytes from 8.8.8.8: icmp_seq=31 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=32 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=33 ttl=61 time=10.6 ms
64 bytes from 8.8.8.8: icmp_seq=34 ttl=61 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=35 ttl=61 time=10.4 ms
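To compare captures like the two above without eyeballing every line, the ping output can be reduced to a baseline and a list of outliers. The following is only a rough sketch: the function names and the "more than twice the median" spike rule are my own assumptions, not anything ping or LEDE defines.

```python
import re
import statistics

# Matches the "time=11.3 ms" field printed by both iputils and BusyBox ping.
RTT_RE = re.compile(r"time=([0-9.]+)\s*ms")


def parse_rtts(ping_output):
    """Return every round-trip time (in ms) found in the ping output."""
    return [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]


def find_spikes(rtts, factor=2.0):
    """Return (position, rtt) pairs whose RTT exceeds factor * median.

    'factor' is an arbitrary illustrative threshold. 'position' is the
    index within the parsed list, not the icmp_seq value.
    """
    if not rtts:
        return []
    baseline = statistics.median(rtts)
    return [(i, rtt) for i, rtt in enumerate(rtts) if rtt > factor * baseline]


def summarize(ping_output, factor=2.0):
    """Collect count, median, maximum and spikes for one capture."""
    rtts = parse_rtts(ping_output)
    return {
        "count": len(rtts),
        "median_ms": statistics.median(rtts) if rtts else None,
        "max_ms": max(rtts) if rtts else None,
        "spikes": find_spikes(rtts, factor),
    }
```

Saving each run to a file (for example with ping 8.8.8.8 > router.txt) and passing the file contents to summarize() gives one line of numbers per setup, which makes the "via router" and "direct to modem" runs easier to compare over longer periods.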
And one last test from the router: all the additional latency and spikes seem to come from the router.
With SQM
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=61 time=10.983 ms
64 bytes from 8.8.8.8: seq=1 ttl=61 time=10.877 ms
64 bytes from 8.8.8.8: seq=2 ttl=61 time=11.055 ms
64 bytes from 8.8.8.8: seq=3 ttl=61 time=11.302 ms
64 bytes from 8.8.8.8: seq=4 ttl=61 time=11.187 ms
64 bytes from 8.8.8.8: seq=5 ttl=61 time=12.466 ms
64 bytes from 8.8.8.8: seq=6 ttl=61 time=10.777 ms
64 bytes from 8.8.8.8: seq=7 ttl=61 time=79.327 ms
64 bytes from 8.8.8.8: seq=8 ttl=61 time=11.278 ms
64 bytes from 8.8.8.8: seq=9 ttl=61 time=10.936 ms
64 bytes from 8.8.8.8: seq=10 ttl=61 time=11.018 ms
64 bytes from 8.8.8.8: seq=11 ttl=61 time=11.060 ms
64 bytes from 8.8.8.8: seq=12 ttl=61 time=10.963 ms
64 bytes from 8.8.8.8: seq=13 ttl=61 time=11.082 ms
64 bytes from 8.8.8.8: seq=14 ttl=61 time=11.068 ms
64 bytes from 8.8.8.8: seq=15 ttl=61 time=10.974 ms
64 bytes from 8.8.8.8: seq=16 ttl=61 time=10.868 ms
64 bytes from 8.8.8.8: seq=17 ttl=61 time=11.123 ms
64 bytes from 8.8.8.8: seq=18 ttl=61 time=11.128 ms
64 bytes from 8.8.8.8: seq=19 ttl=61 time=11.003 ms
64 bytes from 8.8.8.8: seq=20 ttl=61 time=10.973 ms
64 bytes from 8.8.8.8: seq=21 ttl=61 time=40.033 ms
64 bytes from 8.8.8.8: seq=22 ttl=61 time=11.011 ms
64 bytes from 8.8.8.8: seq=23 ttl=61 time=10.994 ms
64 bytes from 8.8.8.8: seq=24 ttl=61 time=11.063 ms
64 bytes from 8.8.8.8: seq=25 ttl=61 time=11.109 ms
64 bytes from 8.8.8.8: seq=26 ttl=61 time=10.793 ms
Without SQM
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=61 time=11.105 ms
64 bytes from 8.8.8.8: seq=1 ttl=61 time=10.959 ms
64 bytes from 8.8.8.8: seq=2 ttl=61 time=11.119 ms
64 bytes from 8.8.8.8: seq=3 ttl=61 time=10.479 ms
64 bytes from 8.8.8.8: seq=4 ttl=61 time=11.116 ms
64 bytes from 8.8.8.8: seq=5 ttl=61 time=10.998 ms
64 bytes from 8.8.8.8: seq=6 ttl=61 time=11.269 ms
64 bytes from 8.8.8.8: seq=7 ttl=61 time=11.059 ms
64 bytes from 8.8.8.8: seq=8 ttl=61 time=10.900 ms
64 bytes from 8.8.8.8: seq=9 ttl=61 time=34.492 ms
64 bytes from 8.8.8.8: seq=10 ttl=61 time=11.149 ms
64 bytes from 8.8.8.8: seq=11 ttl=61 time=10.997 ms
64 bytes from 8.8.8.8: seq=12 ttl=61 time=11.026 ms
64 bytes from 8.8.8.8: seq=13 ttl=61 time=11.061 ms
64 bytes from 8.8.8.8: seq=14 ttl=61 time=10.528 ms
64 bytes from 8.8.8.8: seq=15 ttl=61 time=10.724 ms
64 bytes from 8.8.8.8: seq=16 ttl=61 time=37.894 ms
64 bytes from 8.8.8.8: seq=17 ttl=61 time=10.975 ms
64 bytes from 8.8.8.8: seq=18 ttl=61 time=10.945 ms
64 bytes from 8.8.8.8: seq=19 ttl=61 time=43.298 ms
64 bytes from 8.8.8.8: seq=20 ttl=61 time=11.026 ms
64 bytes from 8.8.8.8: seq=21 ttl=61 time=11.046 ms
64 bytes from 8.8.8.8: seq=22 ttl=61 time=11.129 ms
64 bytes from 8.8.8.8: seq=23 ttl=61 time=11.125 ms
Can you run the ping test while also SSHed in to the R7800 running the command 'top', and let us know what the idle percentage is when these spikes occur and what it is on average when they are not occurring? Also, does your VDSL modem/router combo by chance have an Intel Puma 6 chip in it?
Good thought, but the Puma series are DOCSIS cable modem SoCs, so they will not be used in a VDSL modem (and the Puma issue should also show up when connecting the PC directly to the modem).
Thanks for the tests. Yes, I agree it is related to the router. Could you fully disable the wifi radios for a test (or wrap the router in aluminum foil to basically isolate the radios from outside influences)? I once had a case where a wifi router with no station attached had cyclic, repeated latency/bandwidth issues that went away once I powered down the 2.4 GHz radio in that router (in case you wonder, it was a Speedport W723V Type A).
Spikes up to 80ms, even if just occasional, will make using a number of applications awkward (VoIP, video chat, on-line games...), so getting rid of those looks like a good idea...
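One way to check whether the spikes are cyclic, as in the radio case described above, is to look at the spacing between them. This is a small sketch under my own assumptions: the 30 ms threshold and the function names are made up for illustration, and the gaps are in sequence numbers, which only equal seconds if ping runs at its default one-second interval.

```python
import re

# Captures the sequence number and RTT from lines such as
# "icmp_seq=7 ttl=60 time=68.8 ms" (iputils) or "seq=7 ttl=61 time=79.327 ms" (BusyBox).
REPLY_RE = re.compile(r"seq=(\d+)\s+ttl=\d+\s+time=([0-9.]+)\s*ms")


def spike_sequence_numbers(ping_output, threshold_ms=30.0):
    """Return the sequence numbers of replies slower than threshold_ms.

    threshold_ms is an arbitrary illustrative cut-off, not a standard value.
    """
    return [
        int(seq)
        for seq, rtt in REPLY_RE.findall(ping_output)
        if float(rtt) > threshold_ms
    ]


def gaps_between_spikes(spike_seqs):
    """Return the distance between each pair of consecutive spikes."""
    return [later - earlier for earlier, later in zip(spike_seqs, spike_seqs[1:])]
```

If a long capture yields gaps that cluster around one value, that points towards something periodic on the router (a scan, a timer, a housekeeping task); widely scattered gaps would suggest something less regular.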
Ok, so usually idle is at 95..96%, but during a latency spike it drops to 90..91% and the top process seems to always be [kworker/0:2], which consumes 4..5% CPU alone at that moment.
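Since the busy process is kworker/0:2 (the first number in a kworker name is the CPU it is bound to), it may be worth looking at idle time per core rather than the combined figure. The sketch below computes idle percentages from two snapshots of /proc/stat. It assumes a Python interpreter is available, which is not the case on a stock LEDE install, so treat it as an illustration of the arithmetic; the function names are hypothetical.

```python
import time


def parse_cpu_lines(stat_text):
    """Return {cpu_name: [jiffy counters]} for each 'cpu*' line of /proc/stat."""
    cpus = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            cpus[fields[0]] = [int(value) for value in fields[1:]]
    return cpus


def idle_percent(before, after):
    """Idle share between two samples of one CPU line, or None if no time passed.

    Counter order: user nice system idle iowait irq softirq steal [guest ...].
    Only the first eight are summed, since guest time is already part of user.
    """
    deltas = [new - old for old, new in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return None
    return 100.0 * deltas[3] / total


def idle_by_cpu(stat_before, stat_after):
    """Map each CPU name present in both snapshots to its idle percentage."""
    first = parse_cpu_lines(stat_before)
    second = parse_cpu_lines(stat_after)
    return {
        name: idle_percent(first[name], second[name])
        for name in first
        if name in second
    }


if __name__ == "__main__":
    # Take two snapshots one second apart and print per-CPU idle time.
    with open("/proc/stat") as f:
        snapshot_a = f.read()
    time.sleep(1)
    with open("/proc/stat") as f:
        snapshot_b = f.read()
    print(idle_by_cpu(snapshot_a, snapshot_b))
```

The "cpu" entry is the aggregate over all cores, while "cpu0", "cpu1" and so on are the individual cores. A small dip in the aggregate can hide a larger dip on a single core, so comparing the cpu0 value during a spike with the value between spikes may be more telling than the overall idle figure.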