SQM cake: traffic prioritisation

moeller0 · December 11, 2020, 1:30pm

Mmmh, the easiest is to look at the output of tc -s qdisc and maybe use ping with the TOS option to generate packets targeting specific tins. For example, DSCP CS7 (decimal value 56) will be sorted into cake's highest priority tin. Using a proper iputils ping (you might need to install it with 'opkg update ; opkg install iputils-ping') you can then use:

tc -s qdisc
ping -c 10 -Q $(( 56 * 4 )) 1.1.1.1
tc -s qdisc

to generate traffic for the highest tin. If you compare the packet counters for that tin from the before and after tc -s qdisc invocations, you should see an increase equal to or greater than the number of pings (the number after the -c in the ping invocation above). The $(( N * 4 )) simply converts the 6-bit DSCP decimal value into an 8-bit number for the full TOS byte, since ping expects 8-bit values here. See https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus1000/sw/4_0/qos/configuration/guide/nexus1000v_qos/qos_6dscp_val.pdf for some typical decimal values for common DSCPs, and look into cake's code to figure out which DSCPs map into which priority tier.
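
The conversion and the before/after comparison can be sketched as two small shell helpers. This is only an illustration: the function names dscp_to_tos and cake_last_tin_pkts are made up here, and the awk parsing assumes the per-tin statistics table that cake prints under tc -s qdisc (a "pkts" row with one column per tin, the highest-priority tin last), which may differ between versions.

```shell
# Convert a 6-bit DSCP decimal value into the 8-bit TOS byte that ping -Q expects.
# The DSCP sits in the upper six bits, so shifting left by two equals multiplying by 4.
dscp_to_tos() {
    echo $(( $1 << 2 ))
}

# Print the packet counter of the last tin column from `tc -s qdisc` text on stdin.
# Assumption: cake's per-tin table has a row whose first field is "pkts";
# only the first such row is used.
cake_last_tin_pkts() {
    awk '$1 == "pkts" { print $NF; exit }'
}

# Hypothetical usage on the router (eth1 is assumed to carry the cake instance):
#   before=$(tc -s qdisc show dev eth1 | cake_last_tin_pkts)
#   ping -c 10 -Q "$(dscp_to_tos 56)" 1.1.1.1
#   after=$(tc -s qdisc show dev eth1 | cake_last_tin_pkts)
#   echo "highest tin grew by $(( after - before )) packets"
```

With the 10 pings from above, the printed difference should be 10 or more if CS7 really lands in that tin.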

Gigabit · December 11, 2020, 2:46pm

Hi @moeller0 are you able to help me with my SQM setup in terms of splitting the interfaces so I can use DSCP tagging on the ingress?

moeller0 · December 11, 2020, 2:52pm

You already instantiated SQM on a LAN facing ethernet/switch port, so you should be able to use iptables for incoming packets already.

Gigabit · December 11, 2020, 2:53pm

Hi @moeller0 many thanks. But do I need to add another for upload now?

moeller0 · December 11, 2020, 3:03pm

As far as I can see, no. But with a shaper on LAN you will not shape WiFi-to-WAN traffic, and you will shape LAN-to-WiFi traffic instead... For a purely wired router that will not matter much. So assuming no WiFi, I would try something like:

config queue 'eth0'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option linklayer 'ethernet'
        option overhead '44'
        option interface 'eth0'
        option download '0'
        option upload '40000'
        option qdisc 'cake'
        option script 'layer_cake.qos'
        option eqdisc_opts 'diffserv4 dual-dsthost ingress nat'
        option qdisc_advanced '1'
        option squash_dscp '1'
        option squash_ingress '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option qdisc_really_really_advanced '1'

config queue 'eth1'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option linklayer 'ethernet'
        option overhead '44'
        option interface 'eth1'
        option download '0'
        option upload '6800'
        option qdisc 'cake'
        option script 'layer_cake.qos'
        option eqdisc_opts 'diffserv4 dual-srchost nat'
        option qdisc_advanced '1'
        option squash_dscp '1'
        option squash_ingress '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'NOECN'
        option qdisc_really_really_advanced '1'

But that assumes that eth1 is your WAN interface (so make sure to select the correct WAN interface in the GUI for the second SQM instance). Note that both instances shape the interface egress. On the WAN interface, egress is directed into the internet and out of your home network (so equivalent to the internet upload direction), while on the LAN interface, egress is directed towards your internal network and away from the internet (equivalent to the internet download direction).
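
To check that both instances really ended up as egress shapers, something like the following might help. It is a rough sketch: has_cake_root is a made-up helper name, and the grep pattern assumes that tc qdisc show prints a line beginning with "qdisc cake" and containing "root" for an egress cake instance.

```shell
# Succeeds if `tc qdisc show` reports cake as the root (egress) qdisc on the given device.
# Assumption: the root qdisc line looks like "qdisc cake <handle>: root ...".
has_cake_root() {
    tc qdisc show dev "$1" | grep -q '^qdisc cake .* root'
}

# Hypothetical usage after restarting SQM (/etc/init.d/sqm restart),
# with eth0 as LAN and eth1 as WAN as assumed in the config above:
#   for dev in eth0 eth1; do
#       if has_cake_root "$dev"; then
#           echo "$dev: cake on egress"
#       else
#           echo "$dev: no cake root qdisc"
#       fi
#   done
```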

dlakelan · December 11, 2020, 3:30pm

Gigabit:

My traceroute  [v0.85]
OpenWrt (0.0.0.0)                                      Sun Oct 11 14:38:15 2020
@Not a TXT recordplay mode   Restart statistics   Order of fields   quit
                 @Not a TXT record     Packets               Pings
 Host                             @Not a TXT recordLast   Avg  Best  Wrst StDev
                                              17    6.2   9.6   5.8  52.2  11.0
    AS2856  31.55.186.177 (31.55.186 93.8%    17    7.5   7.5   7.5   7.5   0.0  6.2   6.2   0.0
 3. AS2856  31.55.186.176 (3117   11.1  12.2   1    48.7   9.7  7.9
    AS2856  host213-121-192-70.ukcor  0.0%    17    9.4  16.5   7.7 106.1  23.9
 5. AS2856  peer2-et0-1-3.slough.ukc  0.0%    17    9.4  22.6   8.7 128.6  32.0
 6. AS2856  109.159.253.219 (109.159  0.0%    17   10.2  17.6   8.5 127.9  28.8
 7. AS15169 216.239.62.75 (216.239.6  0.0%    16   11.8  24.2  10.2 211.8  50.0
 8. AS15169 64.233.175.107 (64.233.1  0.0%    16    9.7  18.9   8.2 139.8  32.6
 9. AS15169 dns.google (8.8.8.8)      0.0%    16   11.6  17.0   8.5 125.2  28.9

Way, way back up the thread was the answer to why you need to reduce your speeds so much. There is a bottleneck in your ISP infrastructure a few hops out from your link.

Gigabit · December 11, 2020, 3:42pm

@moeller0 Slightly cross purposes but I finally got flent to work.

I uploaded the entire output file to Google Drive: https://drive.google.com/file/d/1VHJPcBCXzUNIvKpL7oru960XC013K2zO.

Please can you review it and let me know if it confirms @dlakelan's post above? I looked into the congestion issue and sadly I don't think there is anything I can do about it. I did ask some other BT users and they did not experience it. I would be surprised if BT had congestion issues, as I don't see the issue anywhere else, e.g. with download speed.

Unfortunately I cannot change ISP for another year.

dlakelan · December 11, 2020, 4:02pm

try again with mtr... run it for 1 minute then stop and copy and paste that... this will give a good baseline.

Then run it for 1 minute followed by running a speed test and at the end of the speed test stop mtr and copy and paste that... we can compare with and without network load.
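
If the interactive display keeps getting garbled, the same two measurements could also be captured with mtr's report mode, which is a different approach from the interactive run described above. A minimal sketch, assuming mtr's -r (report), -c (cycles) and -n (no DNS) switches and the plain report layout of hop, host, Loss%, Snt, Last, Avg, Best, Wrst, StDev; the helper names are made up for illustration:

```shell
# Build an mtr report command: numeric output (-n), report mode (-r), N cycles (-c).
# At mtr's default of one probe per second, 60 cycles is roughly one minute.
mtr_report_cmd() {
    echo "mtr -n -r -c ${2:-60} $1"
}

# Print "hop host avg worst" for each hop line of an mtr report read from stdin.
# Assumes hop lines start with a field such as "1.|--" and that Avg and Wrst
# are the 6th and 8th fields.
mtr_hop_latency() {
    awk '$1 ~ /^[0-9]+\.[|]--$/ { print $1, $2, $6, $8 }'
}

# Hypothetical usage:
#   $(mtr_report_cmd 8.8.8.8) > baseline.txt    # idle link
#   $(mtr_report_cmd 8.8.8.8) > loaded.txt      # start this, then run the speed test
#   mtr_hop_latency < baseline.txt
#   mtr_hop_latency < loaded.txt
```

Comparing the average and worst columns per hop between the two files shows at which hop the latency starts to rise under load.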

Gigabit · December 11, 2020, 4:07pm

@moeller0 It seems like the CPU stats are still not going into the flent output.

I'm using a ZyXEL NBG6617, I've got the SSH keys saved within OpenWrt and I can login without password, please do you have any suggestions for how to get these stats working?

I've checked flent tools are enabled on the router as well.

dlakelan · December 11, 2020, 4:10pm

I've got my mother set up with the same router, it shapes a fiber to the home to 70/70 or so and doesn't have problems that I know of. I don't think CPU is your issue.

The way ISPs work is that they buy a certain amount of capacity at some point of presence, and then share it with a big group of people. If they do a good job, they size their capacity such that "most of the time" there is no bottleneck at their local point of presence. They'd often define this kind of "most of the time" as something like: 95% of 1 minute samples show net bandwidth usage over that minute below their capacity... But as they expand, and as people need more internet services due to things like COVID and so forth, it becomes more and more likely to get near that limit, especially in brief bursts. It's easily possible during a speed test for some switch or router to bottleneck for a few seconds and result in tens to hundreds of milliseconds of bufferbloat, and the stats over 1 minute would still show they were fine.
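
To put rough numbers on that averaging effect, here is a small sketch. The figures are invented purely for illustration (a 5 second fully saturated burst inside a 60 second sample, with 30% load the rest of the time), and avg_util is a made-up helper name:

```shell
# Average utilisation (percent) over a window in which the link is saturated (100%)
# for burst_s seconds and runs at base_pct percent for the remainder.
# Arguments: window_s burst_s base_pct
avg_util() {
    awk -v w="$1" -v b="$2" -v p="$3" \
        'BEGIN { printf "%.1f\n", (b * 100 + (w - b) * p) / w }'
}

# Hypothetical example: 5 s at 100% and 55 s at 30% within a 60 s sample.
avg_util 60 5 30    # prints 35.8
```

So a 1 minute sample would report about 36% utilisation even though the link was completely full, and queueing packets, for 5 of those seconds.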

Gigabit · December 11, 2020, 4:12pm

Deleted as hard to read.

Gigabit · December 11, 2020, 4:14pm

@dlakelan It would be useful to hear from other BT Internet users here and see if they can also run the same tests and see if they also experience the problems I am.

I am surprised, as BT is the largest ISP in the UK and has, as far as I know, lots of capacity. I am also surprised I am not seeing reports of this issue from other users here. There must be many using BT, no?

dlakelan · December 11, 2020, 4:18pm

Other BT users would only be subject to the same bottleneck if they are in the same local loop (basically your same town or even your same neighborhood depending on the situation).

Also it could be time-of-day dependent.

And, there might be some little old lady with an ancient TV in your town

Gigabit · December 11, 2020, 4:18pm

Deleted as hard to read.

dlakelan · December 11, 2020, 4:19pm

can you go in and reformat those so that each hop is on its own line, there are some line endings missing, otherwise it's extremely hard to read.
thanks!

I see there's some kind of DNS issue where it's spitting out "@Not a TXT record" and that's what's borking the formatting... I'm not sure what's up there, maybe that's an error message you could get rid of by

mtr .... 2>/dev/null to send the stderr output to the dustbin

you could also try typing Ctrl-l (that's a lowercase L) right before you stop mtr

Gigabit · December 11, 2020, 4:24pm

Yes I am trying again, just a few moments please

Gigabit · December 11, 2020, 4:27pm

Sadly neither of those fixes has got rid of the error, @dlakelan

dlakelan · December 11, 2020, 4:31pm

maybe there's a switch to tell it not to lookup the DNS record? yes

mtr ... --no-dns ...

see if that helps

Gigabit · December 11, 2020, 4:31pm

Tried that already, sadly it did not work either.

dlakelan · December 11, 2020, 4:33pm

try --curses and as it says under that "Ctr-L clears spurious error messages" maybe that will get something we can read?