General Discussion of SQM

Which buffers is the question. TCP window buffers are not used at any point by forwarded traffic. A forwarder simply looks at a single packet and decides where to send it. An endpoint, on the other hand, has to plop these packets into a buffer and wait for the process that wants them to read them...

If you're a router, you need buffers to handle queuing, which may be the buffers mentioned. You may also need buffers for hardware DMA or some such thing, and those are different again. The TCP window buffers simply don't exist on any machine other than the endpoint machines.

Does software flow offload work with or against SQM?

Technically it works with SQM, but SQM traffic by definition can't be accelerated (the full kernel stack remains in charge of every single packet, not allowing any bypassing). So while enabling software flow offloading shouldn't break anything (this is different with hardware flow offloading, which would actively fail), you'll lose any kind of speed-up if SQM is enabled as well.

IIRC, with hardware flow offloading SQM will only see the few packets of each flow that pass through the normal kernel stack before the offload triggers. I have no experience with software flow offloading, but given that traffic shaping is quite computationally expensive, even if software flow offload were compatible with SQM, the resulting throughput would be much closer to pure full-kernel-stack performance than to pure software offload performance. Software offloading works by doing less work per packet, and traffic shaping will force you to do more...
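
The "less work per packet versus more work per packet" argument can be put into a toy model. This is only a rough sketch: every cost figure below is made up for illustration (none of it is measured on any router), and `max_throughput_mbps` is a hypothetical helper that assumes a single CPU-bound core and full-size packets.

```python
# Toy model: a CPU-bound router forwards as many packets per second as its
# per-packet cost allows. All cost figures are hypothetical, not measurements.

PACKET_BYTES = 1500            # assumed full-size packet
NS_PER_SECOND = 1_000_000_000


def max_throughput_mbps(cost_ns_per_packet: float) -> float:
    """Throughput ceiling of one core spending cost_ns_per_packet on each packet."""
    packets_per_second = NS_PER_SECOND / cost_ns_per_packet
    return packets_per_second * PACKET_BYTES * 8 / 1_000_000


# Hypothetical per-packet costs in nanoseconds.
FULL_STACK_NS = 30_000         # normal kernel forwarding path
OFFLOAD_SAVING_NS = 12_000     # work skipped by the software fast path
SHAPER_NS = 40_000             # extra work added by the traffic shaper

scenarios = {
    "full stack": FULL_STACK_NS,
    "offload": FULL_STACK_NS - OFFLOAD_SAVING_NS,
    "full stack + shaper": FULL_STACK_NS + SHAPER_NS,
    "offload + shaper": FULL_STACK_NS - OFFLOAD_SAVING_NS + SHAPER_NS,
}

for name, cost in scenarios.items():
    print(f"{name:22s} {max_throughput_mbps(cost):7.1f} Mbps")
```

With these invented numbers the saving from offloading is the same in absolute terms, but once the shaper cost is added it is a much smaller share of the total per-packet work, so the relative speed-up shrinks.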

So do we know if enabling Software offloading inhibits or prevents CAKE SQM from performing its task correctly?

I ask because I have SQM enabled for my PPPoE connection, but I'd really like to use Software offloading to accelerate my WiFi transfer speeds and also unburden the CPU a bit.

It would be golden if the PPPoE traffic would stick to the slow-path for SQM, and any other traffic would go through the fast-path.

Apologies if I got the terminology wrong.

I thought that if I disabled SQM for download, I could take advantage of flow offloading, but it seems like the download actually gets slower by over 10%. Tested on R7800, PPPoE, 300/300, over WiFi.

I thought flow offloading is not directional, so it's either all or nothing, no?

My understanding is that software flow offloading bypasses a bunch of iptables processing. But iptables shouldn't be involved in bridging WiFi to Ethernet unless you have bridge iptables enabled. The main path I'd expect it to accelerate would be WAN <-> LAN, but with SQM involved it probably won't be a major speedup.

useful article: https://www.kernel.org/doc/Documentation/networking/nf_flowtable.txt

It confirms that the major thing software flow offloading does is bypass unnecessary firewall rules. The first packets go through the full firewall, and then once conntrack has the flow established, as soon as a packet arrives at ingress it gets sent directly to neigh_xmit... Now @moeller0, does neigh_xmit then call into the queuing system? I would guess so. So all the packets would go through SQM if that's true.
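
The fast-path idea from that article can be pictured with a toy model. This is only a sketch: `FlowKey`, `ToyForwarder`, and the `offload_after` threshold are hypothetical names invented for illustration, not kernel APIs, and the model deliberately says nothing about whether a qdisc sees fast-path packets.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Simplified 5-tuple identifying a flow (real conntrack tracks more)."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


class ToyForwarder:
    """Illustrative model only: first packets take the slow path, later ones the fast path."""

    def __init__(self, offload_after=2):
        self.offload_after = offload_after  # slow-path packets before a flow is offloaded
        self.seen = {}                      # slow-path packet count per flow
        self.flowtable = set()              # flows that may use the fast path
        self.slow_path_packets = 0
        self.fast_path_packets = 0

    def slow_path(self, key):
        """Stand-in for the full firewall and routing traversal."""
        self.slow_path_packets += 1
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] >= self.offload_after:
            self.flowtable.add(key)         # flow counts as established: offload it

    def receive(self, key):
        """Ingress hook: a flowtable hit skips the slow path entirely."""
        if key in self.flowtable:
            self.fast_path_packets += 1
            return "fast"
        self.slow_path(key)
        return "slow"


fw = ToyForwarder()
flow = FlowKey("192.168.1.10", "203.0.113.5", 40000, 443, "tcp")
paths = [fw.receive(flow) for _ in range(5)]
print(paths)  # ['slow', 'slow', 'fast', 'fast', 'fast']
```

Only the first two packets of the flow reach `slow_path`; everything after that is handled at the ingress lookup, which is why anything hooked in later on the normal path may never see those packets.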

Thanks for the explanation. Made me realize why my LAN WiFi transfers got faster after enabling SoftOff.

I was running WiFi transfer tests while my WAN connection was 100% utilized. I'm pretty sure now the speed-up was due to reducing CPU utilization on the core doing IRQ for the WiFi device.

I believe the issue is that we only get packets after the ingress node of the diagram, so if a packet matches an entry in the flow table, I believe SQM will not see it at all, at least for ingress.

Ah, right! Because ingress is handled with an IFB in the standard setup. I'm using egress of my LAN instead ;) If you have a qdisc on the egress of LAN, does it see the packet from neigh_xmit? I haven't got a clue.

Mmmh, I just did a quick and dirty test on my wndr3700: with or without software flow offloading checked, my SQM ingress settings are honored. I shape my 99 Mbps downstream link to 49 Mbps, since my current main router can only shape reliably to around 70-80 Mbps up+down. A manual single-flow speedtest with wget -O /dev/null http://speedtest.belwue.net/10G --report-speed=bits -4 shows a limit of ~42 Mbps independent of the offload status, but in either case doing this over the 5GHz radio leaves my router's CPU at 1% idle. With SQM disabled, I get ~91 Mbps goodput both with and without flow offloading, but with 20% idle with software offloading and 0-5% idle without.

So I cautiously retract my theory: software flow offloading might be compatible with ingress SQM (but SQM is so expensive that offloading does not really give that much of an advantage).

I have (unscientifically) observed the lower download throughput with SQM ( download == 0 and upload == 275000) and offload enabled, but my router is also close to its limits in this test.
So I have for now disabled the offload.

My R7800 runs out of juice for piece_of_cake / layer_cake at that speed as well (~mid 200 Mbps wired). fq_codel + simplest_tbf gives me the best balance of throughput and latency (up to 500 Mbps wired). What qdisc / script are you running (cat /etc/config/sqm)?

Some of you have probably seen this already, but for those of you running out of juice for SQM: here are my recent results using an RPi 4 for routing and shaping a full gigabit easily.

This one, but LAN/WAN use case does not interest me. 5GHz wifi tops out at around 260Mbps.

I saw that, but I am desperately trying to not multiply the number of devices to manage. Although, it does not look like I am gonna succeed here.

Yeah, I hear you. I eventually installed cfengine and have been trying to automate the heck out of everything... It has helped a lot with desktops and laptops and servers. There's no openwrt package for routers and access points though.

Well, for the several routers I manage I went the /etc/uci-defaults/ route. I package all configuration scripts and packages, disable preserve config in LuCI, and just send a firmware over to the router owner. They just upgrade, the configuration scripts do all the setup, and it is all good. If they mess up the config, I ask them to reset to defaults, and the router is as good as new in a few minutes.

Has ctinfo been backported to work for 4.4.167?

Is it normal that the SQM overhead size does not make any difference on my connection? What's more, I do not seem to see any bufferbloat without SQM either...
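
One possible reason the overhead setting is hard to notice in a bulk speedtest: the shaper accounts each packet as its size plus the configured overhead, so for full-size packets the setting shifts the result by only a few percent. A minimal sketch, assuming that simple per-packet accounting; `ip_rate_share` is a hypothetical helper, and the packet sizes and overhead values are just example numbers, not a recommendation for any particular link.

```python
def ip_rate_share(packet_bytes: int, overhead_bytes: int) -> float:
    """Fraction of the shaper rate left for IP packets when each packet
    is accounted as packet_bytes + overhead_bytes."""
    return packet_bytes / (packet_bytes + overhead_bytes)


# Example packet sizes and overhead values (illustrative only).
for size in (1500, 576, 100):
    for overhead in (0, 22, 44):
        share = ip_rate_share(size, overhead) * 100
        print(f"packet {size:4d} B, overhead {overhead:2d} B -> {share:5.1f}% of shaper rate")
```

For 1500-byte packets the difference between 0 and 44 bytes of overhead is under 3%, which is easily lost in test noise; for small packets the same setting matters far more.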