SQM tweaks and best latency for connections over 650Mb/s

luci/uci packet steering will get you all `f` (all 4 cores)... I found that problematic for latency... ymmv

my build will set you up on boot with the following (if you disable the uci setting):

echo -n 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo -n 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-3/xps_cpus
echo -n 2 > /sys/class/net/eth0/queues/tx-4/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo -n 7 > /sys/class/net/eth1/queues/rx-0/rps_cpus
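
For reference (my gloss, not from the post): the values written to `xps_cpus`/`rps_cpus` are hexadecimal CPU bitmasks, where bit N set means core N may service the queue, so `1` = core0, `2` = core1, `4` = core2, `7` = cores 0-2, and `c` = cores 2+3. A minimal sketch of building such a mask:

```shell
#!/bin/sh
# Build a hex CPU mask from a list of core numbers
# (bit N set = core N may handle that queue's work).
cpu_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0 1 2   # -> 7 (the rps mask used above)
cpu_mask 2 3     # -> c
```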

I have also used a few others in the past... another one is maybe:

echo -n 7 > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/tx-3/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/tx-4/xps_cpus
echo -n c > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo -n c > /sys/class/net/eth1/queues/rx-0/rps_cpus

ymmv... still looking for my better notes... (essentially I move nlbwmon, luci-statistics and a few other bursty/non-essential tasks to core 4 (core 3 if counting from zero), then try to avoid that 4th core for networking stuff...)
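
A sketch of what that pinning can look like, assuming the `taskset` utility is installed on the router (mask `8` = core 3 counting from zero; the daemon names are just the examples mentioned above, collectd being what luci-statistics drives):

```shell
#!/bin/sh
# Hedged sketch: pin bursty/non-essential daemons to core 3 (mask 0x8)
# so cores 0-2 stay free for networking. Assumes taskset is available.
for daemon in nlbwmon collectd; do
    for pid in $(pidof "$daemon"); do
        taskset -p 8 "$pid"
    done
done
```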

I'd start with something like:

        option download '550000'
        option upload '26500'

/etc/init.d/sqm restart
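
For context, those options belong in the queue section of `/etc/config/sqm`; a hedged sketch of setting them via uci instead of editing the file directly (assuming the first/only queue section, rates in kbit/s):

```shell
# set shaper rates (kbit/s) on the first sqm queue and restart
uci set sqm.@queue[0].download='550000'
uci set sqm.@queue[0].upload='26500'
uci commit sqm
/etc/init.d/sqm restart
```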
still +12ms of extra latency


just to be clear, I should disable packet steering from Luci and try running:

echo -n 4 > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-3/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-4/xps_cpus
echo -n c > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo -n c > /sys/class/net/eth1/queues/rx-0/rps_cpus

from the command line, right?

meh... for now it won't matter much (if the interface restarts, the commands need to be re-run)... we'll look at making them more permanent etc. once we get a feel for how they are going...
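
One hedged way to make them more permanent later could be a hotplug script; a sketch under the assumption that an ifup trigger is good enough (the path and trigger are my choice, not from the post):

```shell
#!/bin/sh
# /etc/hotplug.d/iface/99-steering -- hedged sketch, untested.
# Re-apply the steering masks whenever an interface comes up,
# since the sysfs values are lost on interface restart.
[ "$ACTION" = "ifup" ] || exit 0

for q in 0 1 2 3 4; do
    echo -n 4 > "/sys/class/net/eth0/queues/tx-$q/xps_cpus"
done
echo -n c > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo -n c > /sys/class/net/eth1/queues/rx-0/rps_cpus
```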

(but yes... the intention on my build is not to rely on the uci setting, due to the weirdness of the rpi4 cores and `f` potentially being problematic ~ needing specific values for USB vs onboard ethernet, but you can test with all `f` to confirm)


Care to elaborate? What is weird: are some IRQ sources hard-mapped to some cores, or is power saving different for the different cores?


that is the 100 million dollar question... :money_mouth_face: !!!

wall-o-text-hypothesis

I don't really have proper technical words for it... but something about the onboard ethernet 'likes' to be tied to core 0 (interrupts)... (also... there are interactions with USB + USB interrupts)... really messy/confusing when you dig down into it...

with the steering, OTOH, I cannot recall how much that feeds into the above... perhaps I've gone a bit too freestyle... but as a general rule I found... try to keep as much (network) stuff as possible on core0 and maybe core1 (core1 probably doesn't matter)...

likely some funky scheduler/core0 interaction (some rpi system threads want to be on the same core as the bcmgenet driver or something)... that's the way it looks in layman's terms anyway...

both of those would also express what I'm thinking above very well... IRQ mappings are pretty pinned except for onboard ethernet... some mmc (wifi + hmmm something else)... it's definitely weird stuff... and power saving is a level unto its own on the rpi4... heavy bias to scale down at multiple levels... (even the regulator can tell the cpu to slow down if it's overworked) - but performance or scaling_up_thresh deals with most of the immediate 'problems'...

another knowledgeable member of the forum posted a mod to allow PCIe (USB) IRQ re-assignment... will get details if we get someone on fibre willing to do some low level testing...


dlakelan's pioneering work on the device and some steering values to test

woohoo... found it! Cheddoleum ( @Cheddoleum ) to the rescue (special mention: rhester and mint for low level info)

more discussion w interrupts

more juicy discussion (a user reports stock steering results in 1Gb/s, however... no reports or investigation re: the latency hit incurred from all `f`... AFAIR my tests indicated an approx 25ms hit from this... ymmv... adjusting steering for better/best? latency dropped me to somewhere in the 845~915Mb/s range, it was a bit variable)


although it's for the rpi3 i think (and probably not directly related)... I found the bottom two posts on this page rather insightful/telling re: these topics


@anon50098793 @moeller0 so folks, to set the facts straight: are there any things that can be done to improve the latency with my current ISP? Would another router, especially a gaming one like the DumaOS ones or the Asus RT-AX85U, help more? Or should I just look for another ISP?

short answer imho... not really... as discussed you can reduce load (drop the sqm values or use the suggested script to do similar) but it's likely these will be suboptimal given your expressed requirements...

not really... theoretically a beefier router can get you improved latency under high load... but given the current environment... we are not really in those realms... and even if we were, we are talking <10ms at 850Mb/s-ish figures... so bordering on a very small benefit indeed

possibly... (shave those 10ms off) but we'd need to see a few more (isolated) mtr stats, as suggested, to get a better picture of local vs provider level congestion/contention... if the other ISP is using a different medium (fibre)... then likely yes...


here are some results, and IMO it seems like my ISP is suffering:

root@rpi4-router /45# mtr -ezb4r 8.8.8.8
Start: 2021-12-13T19:56:33+0000
HOST: rpi4-router                 Loss%   Snt   Last   Avg  Best  Wrst StDev
@Not a TXT record
  1. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  2. AS3209   de-dus01a-cr12-eth-  0.0%    10    8.6  10.7   8.1  16.3   2.7
  3. AS6830   de-fra04d-rc1-ae-19  0.0%    10   15.0  18.0  14.8  24.9   3.8
  4. AS6830   84.116.190.94 (84.1 90.0%    10   77.0  77.0  77.0  77.0   0.0
  5. AS15169  74.125.48.122 (74.1  0.0%    10   20.0  28.3  17.8  39.4   7.5
  6. AS15169  142.251.65.73 (142.  0.0%    10   19.5  19.0  15.6  31.3   4.5
  7. AS15169  172.253.64.119 (172  0.0%    10   17.8  19.4  16.6  30.0   3.8
  8. AS15169  dns.google (8.8.8.8  0.0%    10   15.5  15.9  14.2  18.0   1.4
root@rpi4-router /45# mtr -ezb4r 8.8.8.8
Start: 2021-12-13T19:57:45+0000
HOST: rpi4-router                 Loss%   Snt   Last   Avg  Best  Wrst StDev
@Not a TXT record
  1. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  2. AS3209   de-dus01a-cr12-eth-  0.0%    10   10.3  11.5   8.5  17.3   3.1
  3. AS6830   de-fra04d-rc1-ae-19  0.0%    10   17.3  16.7  14.2  20.5   2.2
  4. AS6830   84.116.190.94 (84.1  0.0%    10  137.8  62.9  14.6 137.8  41.8
  5. AS15169  74.125.48.122 (74.1  0.0%    10   17.9  19.9  17.6  24.6   2.0
  6. AS15169  142.251.65.73 (142.  0.0%    10   16.0  18.0  15.8  24.7   2.6
  7. AS15169  172.253.64.119 (172  0.0%    10   16.1  20.5  16.1  26.8   3.3
  8. AS15169  dns.google (8.8.8.8  0.0%    10   15.0  17.0  14.2  25.9   3.4
root@rpi4-router /44# mtr -ezb4r 8.8.8.8
Start: 2021-12-13T19:58:58+0000
HOST: rpi4-router                 Loss%   Snt   Last   Avg  Best  Wrst StDev
@Not a TXT record
  1. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  2. AS3209   de-dus01a-cr12-eth-  0.0%    10   11.0  11.6   7.9  18.1   3.5
  3. AS6830   de-fra04d-rc1-ae-19  0.0%    10   15.4  16.2  13.8  20.3   2.3
  4. AS6830   84.116.190.94 (84.1 10.0%    10   23.1  19.0  14.4  25.7   4.2
  5. AS15169  74.125.48.122 (74.1  0.0%    10   20.1  21.0  17.8  27.5   2.8
  6. AS15169  142.251.65.73 (142.  0.0%    10   19.2  18.1  16.2  22.6   1.8
  7. AS15169  172.253.64.119 (172  0.0%    10   16.1  18.5  16.0  21.2   1.8
  8. AS15169  dns.google (8.8.8.8  0.0%    10   18.5  17.7  14.1  25.7   3.3
root@rpi4-router /44# mtr -ezb4r 8.8.8.8
Start: 2021-12-13T19:59:38+0000
HOST: rpi4-router                 Loss%   Snt   Last   Avg  Best  Wrst StDev
@Not a TXT record
  1. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  2. AS3209   de-dus01a-cr12-eth-  0.0%    10    9.8  12.6   8.5  27.2   5.9
  3. AS6830   de-fra04d-rc1-ae-19 10.0%    10   21.1  18.9  14.5  32.3   5.4
@Not a TXT record
  4. AS???    ???                 100.0    10    0.0   0.0   0.0   0.0   0.0
  5. AS15169  74.125.48.122 (74.1  0.0%    10   27.6  21.2  17.9  27.6   3.3
  6. AS15169  142.251.65.73 (142.  0.0%    10   15.3  18.3  15.3  25.1   3.0
  7. AS15169  172.253.64.119 (172  0.0%    10   26.6  20.9  17.6  26.6   3.4
  8. AS15169  dns.google (8.8.8.8  0.0%    10   18.4  17.9  14.6  25.8   3.9

So, this indicates the end result has on average a 17.2ms round trip ±3.2ms with worst case 23ms, so that's pretty tight. Is this something you're unhappy about, or is this just the baseline, and under load it's worse?

If you're unhappy about this:

don't be; that's just some intermediate device that doesn't care to respond to pings. It doesn't indicate a difficulty in delivery.

exactly; under load it goes way up (up to 60ms more), and for gaming that's a deal-breaker

The trick is to figure out how much latency you are comfortable with... DOCSIS is not terrible, but certainly more bursty than DSL or ethernet. Over a shared medium like the internet, a bit of variable delay is really unavoidable...

Fiber/GPON, while using a variable-delay request-grant mechanism for upstream data like DOCSIS does, has considerably smaller jitter, since DOCSIS request-grant timing is in the 2-4ms range while e.g. GPON's is in the 100s-of-microseconds range....

So how much variable latency are you willing to accept?


Can you show a regular ping output to 8.8.8.8 while doing a speedtest on a LAN device?


whoops... forgot to add this (hopefully not hugely relevant)

the following command is needed with `off` to fix a bug for
some people on ipv6... it may be having an impact on
your results (the line below is the undo):

ethtool -K eth0 rx on
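
For context (my gloss, not from the post): `ethtool -K eth0 rx off|on` toggles receive checksum offload on the interface, so the workaround and its undo look like:

```shell
# the ipv6 workaround: disable rx checksum offload on the wan port
ethtool -K eth0 rx off
# ...and the undo shown above re-enables it
ethtool -K eth0 rx on
```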

I think up to 10ms is ok, so 30ms in total isn't terrible

here you go:

 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=114 time=40.365 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=114 time=17.584 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=114 time=19.679 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=114 time=23.420 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=114 time=19.711 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=114 time=29.746 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=114 time=41.728 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=114 time=26.395 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=114 time=37.371 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=114 time=55.538 ms
64 bytes from 8.8.8.8: icmp_seq=10 ttl=114 time=22.051 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=114 time=20.234 ms
64 bytes from 8.8.8.8: icmp_seq=12 ttl=114 time=33.853 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=114 time=54.448 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=114 time=52.496 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=114 time=54.357 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=114 time=68.789 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=114 time=46.031 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=114 time=49.377 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=114 time=57.952 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=114 time=54.956 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=114 time=57.383 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=114 time=48.456 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=114 time=17.777 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=114 time=64.908 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=114 time=16.948 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=114 time=16.493 ms
^C
--- 8.8.8.8 ping statistics ---
27 packets transmitted, 27 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 16.493/38.817/68.789/16.665 ms

thanks, I ran this. good to have anyway


here is another bufferbloat test without SQM:

that was with your SQM settings turned on?
Are you able to make it stay constant by dropping SQM speeds in half? (just trying to figure out what is going on here)

Mh, so here is something to consider: cake's default latency target is 5ms, so under load your median latency will increase by 5ms per loaded direction, and empirically the relevant distribution of delays is roughly 2 targets per direction; so with the default 5ms you will easily see a variable latency in the 0-20ms range.... depending on the specific traffic patterns. If all your traffic is exceedingly well behaved you can maybe get away with ~1 target's worth per direction, but that still adds up to ~10ms.

Now, given DOCSIS' known upstream delay variability in the 4-6ms range, I am not sure you can do that much to improve that.
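
The arithmetic above, sketched with the numbers straight from this post:

```shell
#!/bin/sh
# cake's default latency target and the empirical ~2 targets of
# queueing delay per loaded direction (numbers from the post above)
target_ms=5

# both directions loaded, ~2 targets each: upper end of the range
echo "$(( target_ms * 2 * 2 )) ms"   # prints "20 ms"

# exceedingly well-behaved traffic, ~1 target per direction
echo "$(( target_ms * 1 * 2 )) ms"   # prints "10 ms"
```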
