R7800 and PS4 performance

oversubscription..

root@OpenWrt:~# speedtest-netperf.sh -H flent-newark.bufferbloat.net
2020-07-23 02:08:07 Starting speedtest for 60 seconds per transfer session.
Measure speed to flent-newark.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 441.97 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  35.148
    10pct:  39.203
   Median:  43.091
      Avg:  43.508
    90pct:  47.409
      Max:  52.148
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 57 samples]
     cpu0:  66.3 +/-  0.0  @ 1544 MHz
     cpu1:  78.1 +/-  7.5  @ 1565 MHz
 Overhead: [in % used of total CPU available]
  netperf:  41.9
.............................................................
   Upload:  18.95 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  37.007
    10pct:  38.277
   Median:  39.615
      Avg:  39.890
    90pct:  41.869
      Max:  43.527
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 57 samples]
     cpu0:  15.1 +/-  5.4  @ 1084 MHz
     cpu1:   7.7 +/-  2.9  @  692 MHz
 Overhead: [in % used of total CPU available]
  netperf:   3.4

Those latencies are looking a LOT better, but there's more to tweak. Can you make these changes to your SQM config and then test again?

root@OpenWrt:~# speedtest-netperf.sh -H flent-newark.bufferbloat.net
2020-07-23 02:14:22 Starting speedtest for 60 seconds per transfer session.
Measure speed to flent-newark.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 465.34 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  36.113
    10pct:  38.773
   Median:  43.437
      Avg:  43.082
    90pct:  46.422
      Max:  56.799
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 56 samples]
     cpu0:  70.0 +/-  0.0  @ 1529 MHz
     cpu1:  83.2 +/-  3.7  @ 1599 MHz
 Overhead: [in % used of total CPU available]
  netperf:  40.7
.............................................................
   Upload:  18.65 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  37.146
    10pct:  37.974
   Median:  39.571
      Avg:  39.915
    90pct:  41.909
      Max:  45.329
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 58 samples]
     cpu0:  18.2 +/-  6.5  @ 1137 MHz
     cpu1:   8.5 +/-  3.1  @  734 MHz
 Overhead: [in % used of total CPU available]
  netperf:   3.3

Okay, go ahead and take your Per Packet Overhead from 18 to 22 (where it should be for cable).

I would like to see what happens if we try to get you onto cake/piece_of_cake instead of fq_codel/simple. If you're willing to switch these settings and test again, it will be interesting to see if your router can handle this:

Same settings, but with per packet overhead changed to 22:

root@OpenWrt:~# speedtest-netperf.sh -H flent-newark.bufferbloat.net
2020-07-23 02:26:38 Starting speedtest for 60 seconds per transfer session.
Measure speed to flent-newark.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 449.15 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  37.405
    10pct:  39.500
   Median:  43.752
      Avg:  43.548
    90pct:  46.931
      Max:  50.438
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 57 samples]
     cpu0:  68.1 +/-  0.0  @ 1516 MHz
     cpu1:  78.5 +/-  8.5  @ 1548 MHz
 Overhead: [in % used of total CPU available]
  netperf:  39.7
.............................................................
   Upload:  18.50 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  37.022
    10pct:  37.609
   Median:  39.373
      Avg:  39.549
    90pct:  41.227
      Max:  45.602
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 58 samples]
     cpu0:  22.5 +/-  5.3  @ 1097 MHz
     cpu1:  10.7 +/-  3.8  @  793 MHz
 Overhead: [in % used of total CPU available]
  netperf:   3.3

cake settings test

root@OpenWrt:~#  speedtest-netperf.sh -H flent-newark.bufferbloat.net
2020-07-23 02:30:28 Starting speedtest for 60 seconds per transfer session.
Measure speed to flent-newark.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 282.69 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  36.775
    10pct:  38.533
   Median:  43.603
      Avg:  43.373
    90pct:  47.112
      Max:  49.690
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 55 samples]
     cpu0:  70.9 +/-  9.0  @ 1612 MHz
     cpu1:  95.6 +/-  0.0  @ 1701 MHz
 Overhead: [in % used of total CPU available]
  netperf:  40.9
.............................................................
   Upload:  18.75 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  37.956
    10pct:  38.359
   Median:  39.752
      Avg:  40.170
    90pct:  42.352
      Max:  45.335
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 57 samples]
     cpu0:  17.7 +/-  5.1  @ 1068 MHz
     cpu1:   8.9 +/-  4.4  @  820 MHz
 Overhead: [in % used of total CPU available]
  netperf:   3.3

Ouch, yeah that cake test was rough, like I figured it would be. While still on the cake configuration, copy and paste all of this into your SSH session for me and then run the speed test again:

# Allow packet steering on both cores (mask 3 = CPU0 + CPU1) for every interface
for file in /sys/class/net/*
do
	echo 3 > "$file/queues/rx-0/rps_cpus"
	echo 3 > "$file/queues/tx-0/xps_cpus"
done
# Use the performance governor on both cores
echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo performance > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
# Briefly lower the max frequency, then raise it again to 1.75 GHz
echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
sleep 1
echo 1750000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 1750000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
# Pin IRQs 30, 31 and 32 to CPU1 (mask 2)
echo 2 >/proc/irq/30/smp_affinity
echo 2 >/proc/irq/31/smp_affinity
echo 2 >/proc/irq/32/smp_affinity

(make sure you hit enter after the last line)
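
For reference, the values 3 and 2 written to rps_cpus, xps_cpus and smp_affinity above are hexadecimal CPU bitmasks: bit 0 stands for CPU0, bit 1 for CPU1. Here is a minimal sketch of how such a mask is built. The helper name cpu_mask is made up for illustration and is not part of OpenWrt:

```shell
#!/bin/sh
# cpu_mask CPU...: print the hex bitmask that selects the given CPU numbers.
# (cpu_mask is a hypothetical helper, only here to illustrate the values.)
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0 1   # both cores, the value written to rps_cpus/xps_cpus
cpu_mask 1     # CPU1 only, the value written to smp_affinity
```

On a dual-core router like this one, the first call prints 3 and the second prints 2.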

root@OpenWrt:~# for file in /sys/class/net/*
> do
> echo 3 > $file"/queues/rx-0/rps_cpus"
> echo 3 > $file"/queues/tx-0/xps_cpus"
> done
root@OpenWrt:~# echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
root@OpenWrt:~# echo performance > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
root@OpenWrt:~# echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
root@OpenWrt:~# echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
root@OpenWrt:~# sleep 1
root@OpenWrt:~# echo 1750000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
root@OpenWrt:~# echo 1750000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
root@OpenWrt:~# echo 2 >/proc/irq/30/smp_affinity
root@OpenWrt:~# echo 2 >/proc/irq/31/smp_affinity
root@OpenWrt:~# echo 2 >/proc/irq/32/smp_affinity
root@OpenWrt:~# speedtest-netperf.sh -H flent-newark.bufferbloat.net
2020-07-23 02:48:17 Starting speedtest for 60 seconds per transfer session.
Measure speed to flent-newark.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 283.03 Mbps
  Latency: [in msec, 60 pings, 0.00% packet loss]
      Min:  35.283
    10pct:  41.869
   Median:  45.275
      Avg:  45.574
    90pct:  49.140
      Max:  62.376
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 55 samples]
     cpu0:  90.5 +/-  0.0  @ 1725 MHz
     cpu1:  84.3 +/-  0.0  @ 1725 MHz
 Overhead: [in % used of total CPU available]
  netperf:  55.2
.............................................................
   Upload:  18.86 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  36.703
    10pct:  37.672
   Median:  38.929
      Avg:  39.239
    90pct:  41.282
      Max:  43.004
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 58 samples]
     cpu0:   8.4 +/-  5.3  @ 1725 MHz
     cpu1:   6.2 +/-  2.5  @ 1725 MHz
 Overhead: [in % used of total CPU available]
  netperf:   1.7

So at this point, we have your CPU governor set to performance mode and statically pinned at your CPU's top speed (1.725 GHz). Cake is CPU intensive, as I have mentioned before, so I was trying to throw all the CPU we possibly can into the mix.

Do you, by any chance, have a decently powerful desktop/notebook that is connected via Ethernet back to your router? If so, could you run a speed test (dslreports.com/speedtest, ideally) from there with all the settings as they are? Don't try the test over wireless at this point.

I don't... just this laptop.

Would you be able to connect it to the router via Ethernet and switch the wireless off for a test? If not, that's fine. The reason for testing this way, if possible, is that the speedtest-netperf.sh script itself creates a certain amount of CPU overhead just by running. I wanted to remove that overhead from your router's CPU and let your router concentrate only on routing/shaping, to see if the max download speed improves a bit while still on cake.

51ms 176.8d 18.8u
A+ A+ A+

Alright, so it's pretty clear that your R7800 isn't going to be able to keep up with your ingress speed on the cake qdisc. You still got decent improvements with fq_codel/simple for your Queue Discipline settings in SQM. You'll probably want to switch back over to those settings, but keep the Ethernet link layer and the 22 bytes of per packet overhead as they are.

I would be interested to see if dslreports.com gives you the same A+ buffer bloat rating when you switch back to fq_codel/simple.

So the next step would be to continue tweaking your download/upload speeds once back on fq_codel/simple until you find the sweet spot between throughput vs. latency.

Since you're a gamer, you might have to make a bit of a decision between trying to squeeze out as much throughput as you can to "get your money's worth" out of what you're paying, or be willing to sacrifice some throughput to gain a consistently low-latency connection.

Your thoughts?

Thinking about going down to 600mb instead of 1gb.
From there, whatever gets me the best latency, hopefully in the 400-600 download range too.

After going back to fq_codel/simple:

1st test
48ms 247d 19u
A A A

2nd test
51ms 205d 19u
A+ A+ A line quality

both had a few spikes in bufferbloat for download

test 3
54ms 249d 18u
A+ A+ A

no spikes

Cool. So your last fq_codel/simple test with 22 bytes of per packet overhead had you at about 450mbps, with the 90pct latency just under 10ms above the minimum and some CPU room to spare. I am pretty confident we can get you closer to 500-525mbps without adding too much more latency than that.
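
To put a number on that latency under load, one option is to subtract the Min figure from the 90pct figure in each section of the speedtest-netperf.sh output. This is only a rough sketch: the function name latency_delta is made up for illustration, and it assumes the output layout shown in the tests above.

```shell
#!/bin/sh
# latency_delta: read speedtest-netperf.sh output on stdin and print, per
# direction, how far the 90pct latency sits above the minimum (in ms).
# (latency_delta is a hypothetical helper, not part of the speedtest script.)
latency_delta() {
    awk '
        $1 == "Download:" || $1 == "Upload:" { dir = $1 }
        $1 == "Min:"   { min = $2 }
        $1 == "90pct:" { printf "%s %.3f\n", dir, $2 - min }
    '
}

# Excerpt from the 02:26:38 fq_codel/simple test with 22 bytes overhead
latency_delta <<'EOF'
 Download: 449.15 Mbps
      Min:  37.405
    90pct:  46.931
   Upload:  18.50 Mbps
      Min:  37.022
    90pct:  41.227
EOF
```

For that test it comes out at roughly 9.5 ms on the download and a bit over 4 ms on the upload.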

So here's what I typically would do... teaching moment...

Your latency is pretty great at 450mbps down. At 690mbps down, it was rough. So what I do when I'm tuning SQM (and others may do it differently) is take a known good number and a known bad number and split the difference.

Bad: 690mbps (706560kbps)
Good: 488mbps (500000kbps) <-- where you are right now
Diff: 201.7mbps (206560kbps) / 2 = 100.9mbps (103280kbps)

So add half the Diff (103280kbps) to the Good kbps and you have your next testing figure:
500000kbps + 103280kbps = 603280kbps

Plug 603280 (you could round down to 600000 at this point if you'd like) in for your download speed and test again.

At that point you have a binary decision--were the avg and 90pct latency significantly worse?

  • If yes, then the number you just tested is still too high. So I take the difference between the last known good number (500000 in your case) and the number you just tested at (600000 in this example), divide that difference by 2, add the result back to the last known good number, and try again (so 550000 in this case).
  • If no, then your new good number becomes 600000. So then you take the difference between 600000 and your last known bad number (706560 in your case), divide that difference by 2, add the result back to the new good number, and test again.

You basically keep going with that binary testing method until you reach a point where you start to see consistently higher latency test over test, then you back it down until you see the latency stay reasonably low in the avg and 90pct.
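
The split-the-difference steps above can be sketched in a few lines of shell. This is just an illustration of the arithmetic, not anything that ships with SQM: midpoint and narrow are made-up helper names, and the "did latency get worse?" answer still has to come from you looking at each test.

```shell
#!/bin/sh
# midpoint GOOD BAD: next rate to test, halfway between the two (integer kbps).
# (midpoint and narrow are hypothetical helpers for illustration only.)
midpoint() {
    echo $(( $1 + ($2 - $1) / 2 ))
}

# narrow GOOD BAD TESTED WORSE: print the new "GOOD BAD" pair after a test.
# WORSE is "yes" if avg/90pct latency got significantly worse, otherwise "no".
narrow() {
    if [ "$4" = "yes" ]; then
        echo "$1 $3"   # tested rate becomes the new known bad
    else
        echo "$3 $2"   # tested rate becomes the new known good
    fi
}

good=500000
bad=706560
next=$(midpoint "$good" "$bad")
echo "test next at: $next kbps"
```

With the numbers from this thread, the first figure to test is 603280. After each test, feed the result to narrow and take the midpoint of the new pair.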

Does this make sense?

It does, thank you so much for all the help! Very much appreciated!
I will play around with this tomorrow.
Is that pretty much it after I fine tune?

Also, I've got another post about my Epson printer not obtaining an IP from the router. Any take on that?

That will get you pretty close to tuned up--at least enough so that you are going to start seeing advantages in your browsing and gaming responsiveness. When you get your download where you think it's good, you'll repeat the same process on the upload. I would exercise more discretion on watching for latency increases on the upload because of your gaming needs. You want your upload latency to be as consistently low as possible. I would go more by the 90pct figure on upload as opposed to avg.

That said, what works great one day might not be so great the next day due to factors outside your control (general internet congestion, congestion on your shared cable medium, alignment of the moon and stars, etc.). The goal overall is not to get it perfect in one night, but to get it running as consistently "good" (no major lag spikes, pretty consistent latency at all times of the day) as possible over, say, a week or two. Think of this as more of a marathon and not a sprint. :)

Once you get a pretty good comfort level on the basics here, you might take a look at this: https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details It gets more in-depth and will take some time to digest.

Is your printer wired or wireless?

Wireless only

Everything works great on the PS4 when other devices aren't being used, such as Fire Sticks, TVs with WiFi, and iPhones. Is there a way to prioritize bandwidth per device, like QoS?

QoS requires CPU as well, so you have to make sure you have some left to handle the additional load. Having said that, I have the same router, a PS4, and many wireless clients, and have never had any issues.
In my setup I only move 5GHz to CPU1, while everything else stays on CPU0.
Software offload is enabled.
All *ps_cpus are set to 3.
scaling_min_freq is set to 800000 (plus a few more ondemand tweaks, but they are less important).
The R7800 cannot shape in both directions, so I only shape the upload, at 200Mbps max; anything more than that creates issues.

	option qdisc 'fq_codel'
	option script 'simple.qos'
	option qdisc_advanced '0'
	option linklayer 'none'
	option interface 'pppoe-wan'
	option debug_logging '0'
	option verbosity '5'
	option enabled '1'
	option upload '200000'
	option download '0'

UPDATE: And I am using non-ct firmware; not sure if that matters though.