Increase in download bufferbloat with SQM enabled

Hello all,
My main ISP went down, so I switched to Xfinity cable in the meantime. I didn't have a router besides the main provider's gateway/router combo, so I decided to take the opportunity to tinker and temporarily run OpenWrt on a spare PC while the main ISP repairs their infrastructure.

Heard good things about cake SQM and decided to try it out since I'm a serious gamer and have other household members using youtube/torrents etc.
But as the title says, I'm seeing an increase in download bufferbloat with cake sqm enabled.
Upload shaping works fine, and upload bufferbloat is reduced to near zero.

Here's some info:
SQM off


speedtest

With cake SQM on

Settings

root@OpenWrt:~# cat /etc/config/sqm
config queue 'eth1'
option debug_logging '0'
option verbosity '5'
option interface 'eth1'
option linklayer 'ethernet'
option enabled '1'
option download '200000'
option upload '10000'
option qdisc 'cake'
option script 'piece_of_cake.qos'
option overhead '44'

eth1 is the wan interface

hardware


Modem is an Arris SB6190
Modem -> Mobo nic (intel i219-v) -> OpenWRT -> Intel i210-T1

Also tried on a different system with an older i5 6500 but same results.

Any ideas? Thank you

Same thing for me. I think it's because my ISP is already using some type of SQM. I just quit using it.

If these are the rates your ISP advertises, remember that the shaper should be set to 80-95% of the real bandwidth. Try testing first with:

option download '160000'
option upload '8000'

Now retest.
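
The 80-95% rule of thumb above is plain arithmetic. Here is a minimal sketch, assuming rates in kbit/s as used in /etc/config/sqm; the helper name `shaper_rate` is made up for illustration and is not part of sqm-scripts.

```python
def shaper_rate(measured_kbps: int, fraction: float = 0.85) -> int:
    """Return a shaper rate in kbit/s as a fraction of a measured rate."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return round(measured_kbps * fraction)


# 80% of the rates currently configured in this thread (200000/10000 kbit/s)
print(shaper_rate(200000, 0.80), shaper_rate(10000, 0.80))
```

With a fraction of 0.80 this gives the 160000/8000 values suggested above; other fractions in the 0.80-0.95 range can be tried the same way.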

200/10 Mbps is already around 85% of the throughput I get with SQM/QoS disabled.


Some ideas:

  • Your connection is so asymmetric (a ratio of 20:1) that the upload bandwidth plays a role in protocol traffic during heavy downloads. You might test managing only upload with SQM and leaving download unmanaged. Most of the download data path is upstream of your control, so managing the last leg is not that important (as you have a powerful CPU in any case).

  • Test also the other qdisc options. Cake is not perfect. I prefer simple.qos / fq_codel (the original SQM qdisc), as I get better results with that.

  • Play with the cake parameters, especially the overhead. I get better results with 0 or some other overhead values than the one that the wiki suggests as right for my connection type. In general, I feel that cake tries to optimize too perfectly and depending on the connection and CPU profile and ... sometimes miscalculates.

With your setting you can expect at best:
200000/1000 * ((1500-20-20)/(1500+44)) = 189.12 Mbps
10000/1000 * ((1500-20-20)/(1500+44)) = 9.46 Mbps

of IPv4 throughput, so getting 185.8/9.37 in the speedtest is not all that bad and not a strong sign that you might run out of CPU cycles.
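
The estimate above can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the formula: a 1500-byte MTU, 20 bytes each of IPv4 and TCP header, and the configured overhead of 44 bytes. The function name is made up for illustration.

```python
def expected_goodput_mbps(shaper_kbps: float, mtu: int = 1500,
                          ip_header: int = 20, tcp_header: int = 20,
                          overhead: int = 44) -> float:
    """Estimate TCP/IPv4 goodput in Mbps for a given shaper rate in kbit/s."""
    payload = mtu - ip_header - tcp_header   # bytes of user data per packet
    on_wire = mtu + overhead                 # bytes the shaper accounts per packet
    return shaper_kbps / 1000 * payload / on_wire


for rate in (200000, 10000):
    print(rate, round(expected_goodput_mbps(rate), 2))
```

For 200000 and 10000 kbit/s this yields 189.12 and 9.46 Mbps, matching the figures above.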

However the noSQM Ookla speedtest reports 84ms delay for the download test (and this measure is not even the maximum, so something is going on here even without SQM).

However, the suggestion above to retest with lower shaper rates is not bad advice. Heck, I would go down to 100000/10000 for testing, to make it very unlikely that your download test is even marginally CPU bound...

Great point! Now, most speedtests test the upload and download directions sequentially, so the "hidden" throughput used by the reverse ACK traffic typically does not affect speedtests all that much. Essentially, they will see the capacity consumed by ACK traffic from data flowing in the direction opposite to the current test direction, but the speedtest does not even see traffic generated in the same direction as the test by other connections. So as long as one performs the speedtest on an otherwise quiescent network this should not matter all that much, and if SQM is working as expected such extra-speedtest traffic certainly should not result in bad latency under load.

Yepp, cake is quite demanding of the CPU, not only in how much CPU time it needs but also in how quickly it gets it; HTB+fq_codel and TBF+fq_codel are considerably more tolerant in that respect. One thing that seems especially problematic for cake is CPU frequency scaling (wild guess: it might introduce latency spikes in getting access to the CPU during the switch, so cake incurs unwanted delay during ramp-up and ramp-down).

I disagree gently here. Getting the overhead correct is important. However, it does not come for free: if cake were marginally CPU bound, adding more CPU demand (by adjusting each packet's size individually) would certainly make things even worse, but if CPU is plentiful this should not matter. So in other words, worth testing :wink:


Might well be the case here, as @fullrain is using an x86 CPU that certainly scales, likely a lot.

On bare metal or as VM hosted by a different OS/hypervisor?

Thanks for the input

It's running on bare metal booted via a usb drive.
As for the frequency scaling, I've locked the CPU frequency to 4.7 GHz in the BIOS.

I've tried fq_codel and it's giving me similar results

Also, I should have mentioned in the original post that I already tried values lower than 85% up and down, such as 100000/10000 and lower, with zero improvement.
Setting ingress to 0 gives me the best result, since it then gives the same performance as with SQM disabled.

ezgif.com-gif-maker
this test is with the limits set to 100000/9000
One thing I've noticed across all the tests is that it seems to be fine for 2-3 seconds and then completely goes to shit. Not sure if that's indicative of anything.

Quick question: is this over WiFi or Ethernet? I imagine it is the latter, as you said you are a serious gamer.

I've noticed the Waveform test either hanging or giving bad results lately, while my fast.com results haven't changed. Maybe try fast.com to see what that shows?

Similar results on fast.com

Yes wired of course


Tried the SQM on Arista NG firewall (Untangle) which uses fq_codel and get pretty good results
(I already tried fq_codel on OpenWrt and got the same results as with cake.)

However, I'm still interested in getting SQM on OpenWRT to work since the main ISP has really bad bufferbloat management by default, and will be getting a dedicated router to run OpenWRT on in the near future.
So if any more knowledgeable people have any ideas of what settings I could change or look into I'd appreciate the input

A bit late, but I realized you were talking about Linux's frequency scaling and not the settings in the BIOS.
I tried setting the scaling governor to performance on all cores and confirmed the frequency with cpuinfo, but I'm still getting the same results.

Seems you have tried almost everything except switching to a recent snapshot build.

Okay, maybe it is time to figure out what is happening with your CPUs.
Could you install the Ookla speedtest client on one of your endsystems (so not the router) and use it to run speed test outside of your browser?
Then, with SQM enabled using piece_of_cake.qos, do the following and post the results:
a) post the output of:
tc -s qdisc
cat /proc/interrupts
cat /proc/softirqs

b) run the speedtest and post the results

c) run and post (again) the output of:
tc -s qdisc
cat /proc/interrupts
cat /proc/softirqs

Also for your own information, log into your router via SSH and run htop while performing a few speedtests. Configure htop to show one usage bar per CPU and to show detailed information including softirq. If you should be CPU limited at least one of the CPUs will be pegged say >= 90% most of the duration of a speedtest.
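
To compare the before and after snapshots from steps a) and c), something like the following could help. It is only a sketch: it assumes the usual /proc/softirqs layout (a header row of CPU names followed by one `NAME: count count ...` row per softirq type), the function names are made up, and the sample numbers are invented for illustration rather than taken from any real router. It is meant to be run on an end system against saved output, since a stock OpenWrt install may not have Python.

```python
def parse_softirqs(text: str) -> dict:
    """Parse /proc/softirqs-style text into {softirq_name: {cpu: count}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()                  # header row: CPU0 CPU1 ...
    counters = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        values = [int(v) for v in rest.split()]
        counters[name.strip()] = dict(zip(cpus, values))
    return counters


def softirq_delta(before: str, after: str, name: str = "NET_RX") -> dict:
    """Return per-CPU growth of one softirq counter between two snapshots."""
    b = parse_softirqs(before)[name]
    a = parse_softirqs(after)[name]
    return {cpu: a[cpu] - b[cpu] for cpu in a}


# Invented two-CPU snapshots, for illustration only
BEFORE = """\
                    CPU0       CPU1
      NET_TX:        100        200
      NET_RX:       1000       5000
"""
AFTER = """\
                    CPU0       CPU1
      NET_TX:        150        260
      NET_RX:     901000       5200
"""

print(softirq_delta(BEFORE, AFTER))
```

If nearly all of the NET_RX growth lands on a single CPU, as in the invented sample, that is the core to watch in htop during the speedtest.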

Hey, finally replying as I've had a busy week.

In the end I think I was able to fix the problem by replacing the Arris SB6190 with a Netgear CM1000.

However, I'm back on ATT fiber and have some questions about the optimal setup to use a second router for SQM only.

I currently have it set up like this:


image
(192.168.2.254 is the ATT BGW320; the DHCP server is disabled on br-lan)

I would like to know the pros and cons of this LAN-type setup vs. double NAT-ing vs. setting the ATT gateway to IP passthrough mode (it doesn't support bridge mode), in terms of using the second router purely for SQM and not for any other firewall features.

I see a few options:
A) bump in the wire.
If you are willing to use the ATT router as primary router and firewall, and run all WiFi via additional access points (so the rest of the network only talks to the ATT router via ethernet), you could configure your cake device to act as a bridge between the ATT router and the AP or switch. Cake will only work well if it has control over most of your traffic, which is why you should not access the internet around the bump in the wire. However, it is easy to account for predictable traffic like VoIP by simply reducing the shaper rate by 100 Kbps for each concurrent voice call one might want to make via the ATT router (e.g. if the ATT router has ports to directly connect old POTS equipment or DECT phones).
The cons are obvious: you will only use a few of the ATT router's capabilities and none of OpenWrt's features beyond cake.
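
The VoIP allowance mentioned for option A is a simple subtraction. Here is a small sketch, assuming rates in kbit/s and the 100 Kbps per call figure from above; the function name is made up for illustration.

```python
def shaper_rate_with_voip(base_kbps: int, concurrent_calls: int,
                          per_call_kbps: int = 100) -> int:
    """Reserve capacity for voice calls that bypass the cake box."""
    if concurrent_calls < 0:
        raise ValueError("concurrent_calls must not be negative")
    # Never return a negative rate, even for absurd call counts
    return max(base_kbps - concurrent_calls * per_call_kbps, 0)


# Example: a 10000 kbit/s upload shaper with room for two concurrent calls
print(shaper_rate_with_voip(10000, 2))
```

The same reduction would apply to each direction that the voice traffic uses.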

B) Dual NAT
Configure both routers as firewall/NAT routers. The pro is that it is easy to set up; the cons are mainly that any port forwarding needs to be configured on both routers in lockstep and UPnP might not work (but then UPnP just tries to automate generating port forwarding rules, so it should be possible to achieve similar results via static forwards, though game consoles will likely still complain...).

C) Replace the ATT router completely
In theory one should be able to completely replace the ISP router on an FTTH link, by using an appropriate media converter/ONT. In the past I believe ATT played some certificate games for provisioning a link making it exceedingly hard to replace their router, @dlakelan knows more about this.... Over here in the EU there is a mandate that mostly forces ISPs to also provision compatible user-owned/-supplied routers/ONTs, but that is unlikely to help you :wink:


ATT does 802.1x / certificate based authentication on its fiber offerings. The cert is baked into their router, so there are goofy methods to bypass this but the easiest and best thing by far is to use IP passthrough mode. This puts your LAN behind your own router which you should ALWAYS do for various reasons not the least of which is security. Any ATT employee can potentially access your LAN if you don't.

This doesn't involve double NAT and you can use all the UPnP you want (I don't recommend it though).

It also works with IPv6, but there are tricks needed to handle multiple VLANs, since the ATT router doesn't delegate a /56 to you via prefix delegation (PD); it will just hand out several /64s. There's info on the forum about how to do this though.
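
For context on why a delegated /56 would be more convenient than separate /64s: a single /56 can be split locally into 256 /64 networks, one per VLAN. Here is a quick illustration with Python's standard `ipaddress` module, using the 2001:db8::/32 documentation range as a stand-in (it is not a prefix ATT actually hands out).

```python
import ipaddress

# Stand-in prefix from the documentation range, for illustration only
delegated = ipaddress.ip_network("2001:db8::/56")

# Split the /56 into /64s, one per VLAN
vlan_prefixes = list(delegated.subnets(new_prefix=64))

print(len(vlan_prefixes))
for prefix in vlan_prefixes[:3]:
    print(prefix)
```

With only individual /64s available, each VLAN has to obtain its own prefix from the upstream router, which is where the tricks mentioned above come in.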
