SQM + PBR + Wireguard

Hello,

I've tried to follow this guide to set it all up, but as a beginner I'm very confused:

Situation is:
Router: NanoPi R6S
DL: 500mbit
UL: 30mbit

I have an OpenWrt router with WireGuard set up as a client, which works perfectly with PBR. I can put any device I want on the VPN connection and it works.

The problem:
SQM no longer works properly when I have a device on the VPN. I have an rpi-4 on the VPN that acts as my torrent box and seeds 24/7 at full bandwidth. When it is on the VPN, the rpi-4 hogs all the upload bandwidth, which makes my own experience quite laggy.

Without the VPN, SQM works perfectly: I don't experience any lag with Teams/work applications on other devices even though the rpi-4 is using all the bandwidth.

Can anyone help me set up SQM with PBR and VPN? Download is not the main concern, but upload is. I tried setting cake to "flows nonat", but that caps devices other than the RPI to 2mbit.

Thanks in advance.

Beyond running pbr in iptables mode to make it compatible with SQM I have no other suggestions.

The packet markings between SQM and pbr are not mutually exclusive and I believe both have been confirmed working on pre-nftables flavours of OpenWrt.

This is easy - see for example:

The first is the most straightforward.

Hello,

Thanks for the answer!

I followed the install instructions and disabled the default SQM in luci.

The first option seems to work, but I'm not sure. When I run a speedtest while the rpi-4 is seeding, I get a good share of the bandwidth (15/30mbit).

On upload I have 0ms bufferbloat, but on download I still have 50ms, compared with 0ms on the non-VPN setup with the default SQM.

I got something in the logs of your script:

/tmp/cake-wg-pbr.log

RTNETLINK answers: No such file or directory
RTNETLINK answers: No such file or directory
RTNETLINK answers: No such file or directory
Cannot find device "ifb-wg-pbr"
Cannot find device "ifb-wg-pbr"
Cannot find device "ifb-wg-pbr"

How can I get this working?
Thanks in advance.

Seems like a kernel module or dependency is missing. Did you install the required packages?

Actually I might have missed one:

Please confirm if the latter was needed and I'll add it to the list on GitHub.

Thanks for the fast response.

I installed the packages but I still have the exact same problem.
Packages:

Package tc-tiny (5.15.0-4) installed in root is up to date.
Package kmod-ifb (5.10.176-1) installed in root is up to date.
Package kmod-sched-core (5.10.176-1) installed in root is up to date.
Package kmod-sched-cake (5.10.176-1) installed in root is up to date.
Package kmod-netem (5.10.176-1) installed in root is up to date.

Log file:

RTNETLINK answers: No such file or directory
RTNETLINK answers: No such file or directory
RTNETLINK answers: No such file or directory
Cannot find device "ifb-wg-pbr"
Cannot find device "ifb-wg-pbr"
Cannot find device "ifb-wg-pbr"

I guess I'm still missing something, or is this some FriendlyWrt issue?

EDIT:
I got it to work by enabling the service in LuCI under System > Startup. Now, after a reboot, the script no longer logs those errors.

What I observe now is that the rpi-4 seeds at around 16-20mbit instead of the full 27mbit I set in the script.

The bufferbloat is around +5ms for download and +1ms for upload, which is very acceptable. But the throughput is around 20% lower than what I set.

Is there any way I can get that down to 0-1ms? Is there anything tweakable in your script, or are the settings required for this to work?

I presume the bandwidth discrepancy relates to the effect of overhead.

Can you give the output of 'tc qdisc ls'?

Are you just using the default cake settings from the script? If so, you should adjust the 'overhead' setting appropriately. It defaults to 92:

and that value is probably wrong for your connection.

@moeller0 could hopefully help you work out what your correct 'overhead' value should be.

+5ms seems perfectly fine to me. Perhaps with the right 'overhead' setting this might improve? If you are using the defaults then you are already using the 'ingress' keyword for download, which would help.
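
As a rough illustration of how the 'overhead' setting eats into measured throughput, here is a minimal Python sketch. It assumes only full-size packets, a WireGuard MTU of 1420 carried over IPv4 (60 bytes of encapsulation, so 1480-byte outer packets) and 40 bytes of inner TCP/IPv4 headers. Those figures, and the function name `goodput_mbit`, are assumptions for illustration and not values read from this setup. The model is simply rate × payload / (packet + overhead), which ignores how cake counts link-layer headers and the extra cost of small packets such as ACKs.

```python
def goodput_mbit(shaper_mbit, payload_bytes, packet_bytes, overhead_bytes):
    """Estimate payload throughput when the shaper charges
    packet_bytes + overhead_bytes for every packet sent."""
    return shaper_mbit * payload_bytes / (packet_bytes + overhead_bytes)


# Assumed figures (not measured on this router):
SHAPER_MBIT = 27          # upload rate set in the script
INNER_MTU = 1420          # default WireGuard MTU
INNER_HEADERS = 40        # TCP + IPv4 headers inside the tunnel
WG_IPV4_ENCAP = 60        # outer IPv4 + UDP + WireGuard framing

payload = INNER_MTU - INNER_HEADERS
packet = INNER_MTU + WG_IPV4_ENCAP

for overhead in (0, 44, 92):
    rate = goodput_mbit(SHAPER_MBIT, payload, packet, overhead)
    print(f"overhead {overhead:3d}: about {rate:.1f} Mbit/s of payload")
```

Under these assumptions the overhead value alone accounts for only part of the gap between 27mbit and the observed 16-20mbit, so other factors are probably involved as well.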


I should mention this script right now only handles IPv4. If you are using IPv6 I'd need to amend the script to handle that properly.

I also think that is happening.

'tc qdisc ls' output:

root@OpenWrt:~# tc qdisc ls
qdisc noqueue 0: dev lo root refcnt 2 
qdisc mq 0: dev eth0 root 
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64 
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64 
qdisc cake 800f: dev eth1 root refcnt 2 bandwidth 27Mbit besteffort flows nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 92 
qdisc ingress ffff: dev eth1 parent ffff:fff1 ---------------- 
qdisc fq_codel 0: dev eth2 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64 
qdisc noqueue 0: dev br-lan root refcnt 2 
qdisc fq_codel 0: dev tailscale0 root refcnt 2 limit 10240p flows 1024 quantum 1500 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64 
qdisc noqueue 0: dev docker0 root refcnt 2 
qdisc noqueue 0: dev vpn root refcnt 2 
qdisc ingress ffff: dev vpn parent ffff:fff1 ---------------- 
qdisc cake 8010: dev ifb-wg-pbr root refcnt 2 bandwidth 475Mbit besteffort triple-isolate nat wash ingress no-ack-filter split-gso rtt 100ms noatm overhead 92 

I set the upload to 27mbit and the download to 475mbit, the same values I used in the default SQM setup.
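
The two cake instances in that output are configured differently, which is easy to miss in the long lines. A minimal Python sketch that pulls out the relevant keywords from `tc qdisc ls` text is shown here. The helper name `cake_settings` is hypothetical, the sample lines are copied from the output above (the fq_codel line is shortened), and the parser assumes every cake line carries the `bandwidth` and `overhead` keywords, as these do.

```python
# Sample lines copied from the 'tc qdisc ls' output above.
TC_OUTPUT = """\
qdisc cake 800f: dev eth1 root refcnt 2 bandwidth 27Mbit besteffort flows nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 92
qdisc cake 8010: dev ifb-wg-pbr root refcnt 2 bandwidth 475Mbit besteffort triple-isolate nat wash ingress no-ack-filter split-gso rtt 100ms noatm overhead 92
qdisc fq_codel 0: dev eth2 root refcnt 2 limit 10240p flows 1024 quantum 1514
"""

# Flow isolation keywords accepted by cake.
ISOLATION_MODES = {
    "flowblind", "srchost", "dsthost", "hosts", "flows",
    "dual-srchost", "dual-dsthost", "triple-isolate",
}


def cake_settings(tc_output):
    """Return {device: settings} for every cake qdisc in the text."""
    settings = {}
    for line in tc_output.splitlines():
        tokens = line.split()
        if tokens[:2] != ["qdisc", "cake"]:
            continue  # skip fq_codel, noqueue, ingress, blank lines
        settings[tokens[tokens.index("dev") + 1]] = {
            "bandwidth": tokens[tokens.index("bandwidth") + 1],
            "isolation": next(t for t in tokens if t in ISOLATION_MODES),
            "nat": "nat" in tokens,
            "overhead": int(tokens[tokens.index("overhead") + 1]),
        }
    return settings


for dev, cfg in cake_settings(TC_OUTPUT).items():
    print(dev, cfg)
```

For the sample above this shows upload (eth1) running with 'flows' and no NAT lookup, while download (ifb-wg-pbr) runs with 'triple-isolate' and 'nat'.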

Thanks for informing me, I have ipv4 only here in my apartment. I had no other choice when I bridged my modem.

Situation now:
What happens now is that, after a period of time, the rpi-4's upload hogs all 30mbit of upload. When I ran 2 separate speed tests on different devices at the same time, they reported 1.2mbit and 0.9mbit of upload.

When I reset the wireguard vpn interface I can manage to get 15mbit upload on the speedtest when the rpi-4 is seeding. But after a while it will hog 90% of the upload bandwidth.

Do you have any suggestions on how to get SQM to work better in my situation? Otherwise I'll just have to live with it.

@moeller0 seems like 'flows nonat' isn't quite working as expected for @Joopieert here. Any ideas?

Okay, I think I will settle for setting up the default SQM on the LAN interface instead of WAN.
Now it works the way I want without any custom script.

There is a switch connected to that LAN interface/port, and all my devices can communicate with each other without going through the router's CPU, so that traffic bypasses SQM.

I think all my outgoing traffic has to pass through the LAN interface anyway. I know it has some impact if I run something directly on the router, but I don't do that.

Is there any downside to doing it this way?

Not exactly sure what you've set up there, but what you wrote made me think of cake-dual-ifb:

Follow-up consideration: is your VPN applied at the same device that is applying cake? If not, 'flows nonat' won't work and cake then cannot distinguish the individual encrypted flows. Solution would be to have the upstream router handle VPN using PBR - anything from your RPi4 torrent client is sent through VPN.

How I had it set up with your script is:

[Router: SQM via your script + PBR + WG_VPN] > [Switch] > [RPI-4]
Result = RPI-4 hogging all the upload bandwidth.

Current setup:
[Router: Default Luci SQM on Lan (Switch) + PBR + WG_VPN] > [Switch] > [RPI-4]
Result = SQM works as expected but is not optimal yet.
Question is: Is it bad practice to use SQM on the LAN interface instead of WAN? Are there any downsides?

Seems not set up correctly for some reason. I'd persevere, but that's me!

This will throttle LAN<>router traffic and prevent you from setting up separate LAN interfaces such as a guest interface. I think WiFi between clients should still run at full rate. It also makes overhead compensation a little tricky, although that's hard or impossible to get right with a mixture of encrypted and unencrypted flows anyway.

Perhaps these limitations are fine for you though.

I could be wrong, but to me it looks like things work as documented, just not as desired... Torrenting typically uses a lot of parallel flows, so under pure per-flow fairness an active torrent seeder or leecher (or whatever terms are currently used to describe up- and downloads) will easily dominate more flow-frugal applications.
That is the use case for which cake grew its special triple and dual host isolation modes, which can at least make sure that a computer running bittorrent will not dominate all other computers.
But you can always limit the number of flows and the total bandwidth to keep torrents from dominating, at the cost of not torrenting at the maximum rate even when no other traffic exists.
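
The arithmetic behind per-flow versus per-host fairness can be sketched in a few lines of Python. This is only a toy model, not how cake is implemented: it assumes every flow is saturating the link, and the flow counts (40 for the torrent box, 1 each for the other devices) and function names are made up for illustration.

```python
def per_flow_share(link_mbit, flows_per_host):
    """Split the link equally between flows; a host gets one
    slice per flow it has open."""
    total_flows = sum(flows_per_host.values())
    return {host: link_mbit * n / total_flows
            for host, n in flows_per_host.items()}


def per_host_share(link_mbit, flows_per_host):
    """Split the link equally between active hosts, regardless of
    how many flows each host has open."""
    active = [host for host, n in flows_per_host.items() if n > 0]
    return {host: (link_mbit / len(active) if n > 0 else 0.0)
            for host, n in flows_per_host.items()}


# Hypothetical flow counts, not measured on this network.
flows = {"rpi4-torrent": 40, "laptop": 1, "phone": 1}

for name, model in (("per-flow", per_flow_share),
                    ("per-host", per_host_share)):
    shares = model(27, flows)
    print(name, {host: round(mbit, 2) for host, mbit in shares.items()})
```

With these assumed numbers, the single-flow devices end up with well under 1 Mbit each under per-flow fairness, which is in the same range as the speed test results reported earlier in the thread, while per-host fairness gives each device a third of the link.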

This works well for bump-in-the-wire configurations (i.e. a device that only runs SQM), but for a normal router it poses three challenges:
a) it will only shape traffic that traverses the LAN interface, so traffic terminated at the router itself will not be seen by the shaper; this is mostly an issue when running server applications on the router.
b) it will not shape traffic between WLANs and internet
c) it will shape traffic between WLANs and LAN.

IMHO this is fine with a wired-only router but less optimal for a combined wifi/wired router.

I have a slight feeling from my earlier testing that b) and c) are no longer the case in OpenWrt. I may be wrong, but that's what I remember when I tested this.

In any case, and in keeping with your observations, I remember not being at all satisfied with the restrictions associated with cake on LAN, and that persevering with cake on WAN via 'flows nonat' and/or IFBs and judicious mirroring was worth it in the end.

Albeit I abandoned VPN use in the end when I concluded that the added latency and bandwidth inconsistencies weren't worth it. Interestingly Vodafone UK seem to boost any video traffic in that I see huge bandwidth without bufferbloat when streaming a 4K video, whereas other traffic is more challenging, so I may as well take the benefit from that boost.

Would 'triple-isolate' work with 'flows nonat' for a mixture of WireGuard-encrypted and unencrypted traffic, relying on skb->hash preservation?

If so, perhaps that's all that @Joopieert needs to use cake on wan effectively for his needs? Or is that wishful thinking?

I do not know, maybe try it?

I think you might be right.

Current setup:
[Router: Default Luci SQM on Lan (Switch) + PBR + WG_VPN] > [Switch] > [RPI-4],[Wireless AP]

I don't think I have these issues:

a) All traffic exiting the LAN interface will be shaped. I don't run any services yet on the router. But it might change in the future. So not a problem for now.
b+c) Because the AP is on the switch, this is not a problem.

Situation with SQM on Lan:

Upsides:
a) Traffic is shaped before encryption, so overhead can stay at the default of 22 or even be disabled?
b) Easier setup with all devices and ap on the same lan port.
c) Basic SQM setup works correctly even with many flows from the rpi-4

Downsides:
a) Can't run bandwidth consuming services on the main router.
b) Can't set up guest networks.

I saw that the default SQM does triple-isolate so I will check it out and report back.

Update: 'flows' and 'triple-isolate' do not work together.
I tried each of them on its own and the speed test upload is still capped at 2-3mbit.