SQM only reduces bufferbloat in one direction [WRT3200ACM]

Mmmh, 15:1.5 or 10:1 is not that extreme, but certainly worth playing with, after trying to fix the root cause of the problem with the downstream shaper :wink:

I was kind of wondering if the downstream direction was suffering from ack issues. Thinking that not only is the ratio there starting to get large, but the serialization delay on the upstream is pretty big... 1500 bytes/1.5Mbps = 8ms so perhaps some acks are getting dropped or something? (Cake has a small packet / ack prioritization thingy though right?)
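For anyone who wants to redo that arithmetic for their own link, here is a minimal Python sketch of the serialization delay calculation. The function name `serialization_delay_ms` and the 64-byte ACK size are assumptions made up for illustration; the 1500-byte packet and the 1.5 Mbps upstream are the figures from the post.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time needed to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


if __name__ == "__main__":
    uplink_bps = 1.5e6  # 1.5 Mbps upstream, as in the post
    # A full-size 1500-byte packet occupies the uplink for about 8 ms.
    print(f"1500 B: {serialization_delay_ms(1500, uplink_bps):.2f} ms")
    # A bare ACK is much smaller; 64 bytes is an assumed, illustrative size.
    print(f"  64 B: {serialization_delay_ms(64, uplink_bps):.2f} ms")
```

The point is only that every full-size packet queued ahead of an ACK on the upstream adds roughly 8 ms before that ACK can leave.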

After years of 14400/9600 SQM ingress/egress, I gave up on ingress shaping because of the huge Steam game downloads, and set ingress to zero. Looking good at 0/9600 on a 15/10 DSL connection.

You mean you didn't want to give up a little bit of download bandwidth because you wanted the game to download as fast as possible, or is there some other thing you refer to here?

Yes. The downloads were taking many hours. The egress ping is 16msec, great for the in-house Steam servers. The ingress ping shot up to 53msec, but ATA voice quality is fine.

Yeah, as long as it doesn't vary a lot like between 20 and 150 with an average of 53, then jitter buffers will handle this level of delay pretty well.
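To illustrate the difference between a high but steady ping and one that swings around, here is a small Python sketch. Both sample series are made up for illustration (they are not measurements from this thread), and "mean absolute change between consecutive samples" is just one simple way to put a number on jitter, chosen here as an assumption.

```python
from statistics import mean


def latency_summary(samples_ms):
    """Return (mean, min, max, mean absolute change between consecutive samples)."""
    deltas = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    jitter = mean(deltas) if deltas else 0.0
    return mean(samples_ms), min(samples_ms), max(samples_ms), jitter


if __name__ == "__main__":
    # Both series are invented for illustration; both average 53 ms.
    steady = [51, 53, 55, 52, 54, 53]
    bursty = [20, 150, 20, 22, 86, 20]
    for name, samples in (("steady", steady), ("bursty", bursty)):
        avg, low, high, jitter = latency_summary(samples)
        print(f"{name}: avg={avg} min={low} max={high} jitter={jitter}")
```

The two series share the same average, but the second one changes far more from sample to sample, and that is the case a jitter buffer struggles with.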

I used to have a situation where if people loaded the Netflix main page while I was on a phone call, my call would go silent for 3 seconds at a time. :wink: That was on a 60/3 cable connection; with the page loading multiple preview movies, it would just saturate the 60 down for 3 seconds.

In any case, I really can't see why you should get such high download ping times @omnomberry, so I'm thinking maybe @moeller0's suggestion of testing for the Puma bug is the right way to go. If that is your problem, you could buy yourself a new cable modem without that chip and your problem would be solved.

Nothing looks obviously wrong in your config. The fact that it started happening when you upgraded to .02 version is a little suspicious. But we haven't heard tons of bug reports similar to this over the last few weeks on the forum.

You might want to try again with layer_cake and follow the instructions at https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details under "Making cake sing and dance, on a tight rope without a safety net (aka advanced features)", especially the point about the ingress keyword, which might allow you to set your shaper much closer to the 15Mbps of your link than in the past.

Thanks for all the replies.

My cable modem is a Cisco DPC3825 operating in bridge mode. It looks like it uses a Broadcom chipset which should not be affected by the Puma bug if I understand the bug correctly. The Puma test on DSLReports has a result that is mostly green and a little bit yellow.

Can you provide info on the "ACK filter" feature? I wasn't able to find a document on how to set that up.

I think you just add ack-filter to the advanced options. Same place you would use dual-srchost etc

Okay, that is actually a good thing as otherwise you would have needed a new modem. Unfortunately it does not explain your issue.

So, let me recap, upstream shaping works great and reduces bufferbloat like you expect. Downstream shaping however massively increases bufferbloat?

Could you try to set the Downstream to 7500 kbps for a test and then follow the recommendations in https://forum.openwrt.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803 to configure and run a dslreports speedtest; and please observe point 7:
"7. if you perform a test post a link to the detailed results page here in the forum (much nicer than just overview images), either copy and paste the "link" from the results page "Sharing" section, or better select "Linked BBCode" which will give the summary graphic that also acts as a link to the detailed results page (but make sure that those graphics actually display in the forum, otherwise post the link)."

Yes, you understand correctly.

I just did the speed tests with the settings you specified. I had to reduce the upload streams to 4 because higher numbers would cause the test to fail.

Downstream at 14250k:

Downstream at 7500k:

Surprisingly, the bufferbloat grade on both tests is better than the usual "D" that I get.

It works fine and then it doesn't... Makes me think something sometimes is using up the CPU... Can you watch the idle % during download using top -d 1 and see if it drops down into the single digits...
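If you would rather log the idle figure than eyeball it in top, here is a rough Python sketch that derives idle % from two readings of the aggregate `cpu` line in `/proc/stat`. It assumes a Linux system with Python 3 available (a stock OpenWrt image does not ship Python, so treat this as something to adapt, not something to paste onto the router), and the function names are made up for illustration.

```python
import os
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (idle, total) jiffies."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Columns: user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest columns are skipped because they are already counted in user/nice.
    values = [int(v) for v in fields[1:9]]
    return values[3], sum(values)


def idle_percent(before, after):
    """Idle share of CPU time between two 'cpu' lines, in percent."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 100.0
    return 100.0 * (idle1 - idle0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    if not os.path.exists("/proc/stat"):
        print("/proc/stat not found; this sketch needs Linux")
    else:
        previous = read_cpu_line()
        for _ in range(5):  # five one-second readings, similar to `top -d 1`
            time.sleep(1)
            current = read_cpu_line()
            print(f"idle: {idle_percent(previous, current):5.1f}%")
            previous = current
```

Note that this is the average over all cores, so on a multi-core router one saturated core can hide behind a fairly high overall idle number.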

The downloading bufferbloat detail plots reveal that there are periods of proper shaping interspersed with epochs of rather extreme latencies. As if there was something else competing either for the bandwidth or your router's CPU cycles.... Not sure what exactly though.
Do you run any kind of services on your router, like a NAS or VPN, that you could temporarily disable for testing? And while you are at it, please try with wifi disabled.

I just did a couple speed tests while watching top -d 1.

In the first test, Wi-Fi was enabled and the CPU idle never dropped below 96%.

After disabling Wi-Fi, the CPU idle percentage stayed lower. At one point, it dropped to 59% idle.

I think my OpenWRT setup is as simple as it gets. The only package I downloaded after installing 18.06.2 was luci-app-sqm. I'm not running a NAS or VPN.

How are the Ethernet cables? Perhaps you have a problem in one direction, some kind of bad connection.

This indicates that your CPU might actually scale back its frequency, believing there is nothing to do. We have seen several times that CPU frequency scaling does not work well with SQM (even though at your line rates that should not matter and certainly it should not cause delays in the 400ms range). It would be interesting if you could try to disable CPU scaling and see how/whether that affects the shaping?

Seems like something else is afoot; maybe try a different browser to test.

I don't have any spare cables, but I'll add this to the list of things to try.

I could be wrong, but I don't think CPU scaling is supported in this build. The /sys/devices/system/cpu/cpufreq/ folder shows nothing.
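For reference, a quick way to double-check this is to look for `scaling_governor` files under sysfs. The sketch below is Python and assumes a Linux system with Python 3 available (not something a stock OpenWrt install has); `find_governors` is a made-up helper name, and the two glob patterns cover the per-CPU and per-policy locations where these files normally live.

```python
import glob
import os


def find_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each scaling_governor file found under cpu_root to its current value."""
    patterns = [
        os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor"),
        os.path.join(cpu_root, "cpufreq", "policy*", "scaling_governor"),
    ]
    governors = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                governors[path] = f.read().strip()
    # The per-CPU directory is often a link to a policy directory, so the same
    # governor may be reported under both paths; that is harmless here.
    return governors


if __name__ == "__main__":
    found = find_governors()
    if not found:
        print("no scaling_governor files found; frequency scaling looks unavailable")
    for path, governor in found.items():
        print(f"{path}: {governor}")
```

If nothing is found, that is consistent with the build having no frequency scaling support, which matches the empty folder described above.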

I just tried the speed test in Chrome and it's giving the same result as Firefox.

In my original post, I said I didn't notice this problem in 18.06. I am tempted to go back to that version to see if it was my imagination. :thinking: Maybe I should also try one of the "Davidc502" builds that seem to be pretty popular here.

Try at least unplugging the cable and replugging it on both ends. Also potentially try swapping two cables that you have in use, just so maybe the "bad" one is in a different place. For example, if you swap a WAN and a LAN cable and the new WAN cable is good, WiFi should work, whereas the computer connected via LAN will have problems.