No idea, but SQM does nothing actively to foil software offloading...
For flows in the outbound direction, I'm curious whether the SQM layer sits before or after the software flow offloading layer.
Software offloading sends packets directly to the xmit path, which means into the qdiscs, so SQM works in both directions AFAIK.
And in the end, which one was the expected/good result? The one with SFO on or SFO off?
OK, I have done a couple of tests myself with SFO and SQM (first time setting up SQM for me).
Well, SQM is clearly working and the difference in latency is like night and day. Additionally, I see a FLOWOFFLOAD target in iptables, and it is being hit, since the counter increases. I'd say they are both working fine together then.
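For anyone who wants to repeat that second check: the FLOWOFFLOAD packet counter can be pulled out of the verbose iptables listing. A small sketch (the helper name `flowoffload_pkts` is mine, not an existing tool):

```shell
# Extract the packet counter of the FLOWOFFLOAD rule from `iptables -vnL`
# output, where the first column is the packet count. On the router:
#   iptables -t filter -vnL FORWARD | flowoffload_pkts
# Run it twice during a transfer; a growing number means the rule is hit.
flowoffload_pkts() {
    awk '/FLOWOFFLOAD/ { print $1; exit }'
}
```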
What is your max speed, @xorbug?
thanks
I just ran tests on a MikroTik hAP ac2 with SQM + SFO and HWO, and my result is amazing!
Wait, you are saying you use HWO+SQM? But:
I'm confused now...
The results are at their best with SQM / SFO + HWO enabled, but maybe HWO has no effect on the MikroTik hAP ac2 for the moment. I ran several tests to show you:
SQM + SFO + HWO http://www.dslreports.com/speedtest/67711976
SQM + SFO http://www.dslreports.com/speedtest/67712001
SQM http://www.dslreports.com/speedtest/67712028
SFO http://www.dslreports.com/speedtest
SFO + HWO http://www.dslreports.com/speedtest/67712079
Eheh, yeah, I guess this is the case...
Thank you for caring and taking the time to test and share. Indeed, it looks like there is no difference between SFO and SFO+HWO, meaning HWO is not being used; but maybe at those speeds there would be no impact even if it were working... while there is definitely a difference between SQM and no SQM.
If you're using hardware flow offloading, then SQM is not working, whatever your speed test results say, since it bypasses the qdisc mechanism.
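For reference, on current OpenWrt both kinds of offloading are toggled in the firewall defaults. A sketch of the relevant `/etc/config/firewall` section, as I understand the option names (keep the hardware option off so packets stay visible to the qdiscs):

```
config defaults
	option flow_offloading '1'	# software flow offloading
	option flow_offloading_hw '0'	# hardware offload off, so SQM still sees the packets
```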
With SQM, no SFO: http://www.dslreports.com/speedtest/68678992
root@OpenWrt:~# tc -s -d qdisc
qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 144618747286 bytes 174506922 pkt (dropped 0, overlimits 0 requeues 27)
backlog 0b 0p requeues 27
maxpacket 9108 drop_overlimit 0 new_flow_count 54047 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev lan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan2 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan3 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan4 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8009: dev wan root refcnt 2 bandwidth 16Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 5048846069 bytes 40710429 pkt (dropped 3345, overlimits 16534528 requeues 0)
backlog 0b 0p requeues 0
memory used: 918272b of 4Mb
capacity estimate: 16Mbit
min/max network layer size: 28 / 1500
min/max overhead-adjusted size: 72 / 1544
average network hdr offset: 14
Tin 0
thresh 16Mbit
target 5ms
interval 100ms
pk_delay 15.3ms
av_delay 4.48ms
sp_delay 3us
backlog 0b
pkts 40713774
bytes 5053775579
way_inds 564415
way_miss 155112
way_cols 0
drops 3345
marks 0
ack_drop 0
sp_flows 1
bk_flows 1
un_flows 0
max_len 17054
quantum 488
qdisc ingress ffff: dev wan parent ffff:fff1 ----------------
Sent 72353669760 bytes 53590324 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 800a: dev ifb4wan root refcnt 2 bandwidth 56Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 71833817021 bytes 52268548 pkt (dropped 1321776, overlimits 87391757 requeues 0)
backlog 0b 0p requeues 0
memory used: 1342536b of 4Mb
capacity estimate: 56Mbit
min/max network layer size: 46 / 1500
min/max overhead-adjusted size: 90 / 1544
average network hdr offset: 14
Tin 0
thresh 56Mbit
target 5ms
interval 100ms
pk_delay 667us
av_delay 136us
sp_delay 8us
backlog 0b
pkts 53590324
bytes 73683624904
way_inds 294724
way_miss 109165
way_cols 0
drops 1321776
marks 0
ack_drop 0
sp_flows 2
bk_flows 1
un_flows 0
max_len 39364
quantum 1514
qdisc noqueue 0: dev wlan0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
root@OpenWrt:~#
With SQM, with SFO:
root@OpenWrt:~# tc -s -d qdisc
qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 144832145622 bytes 174785566 pkt (dropped 0, overlimits 0 requeues 27)
backlog 0b 0p requeues 27
maxpacket 9108 drop_overlimit 0 new_flow_count 54088 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev lan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan2 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan3 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan4 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8009: dev wan root refcnt 2 bandwidth 16Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 5095097320 bytes 40826079 pkt (dropped 4137, overlimits 16665515 requeues 0)
backlog 0b 0p requeues 0
memory used: 918272b of 4Mb
capacity estimate: 16Mbit
min/max network layer size: 28 / 1500
min/max overhead-adjusted size: 72 / 1544
average network hdr offset: 14
Tin 0
thresh 16Mbit
target 5ms
interval 100ms
pk_delay 550us
av_delay 134us
sp_delay 7us
backlog 0b
pkts 40830216
bytes 5101225230
way_inds 565065
way_miss 155330
way_cols 0
drops 4137
marks 0
ack_drop 0
sp_flows 0
bk_flows 1
un_flows 0
max_len 17054
quantum 488
qdisc ingress ffff: dev wan parent ffff:fff1 ----------------
Sent 72435286656 bytes 53667803 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 800a: dev ifb4wan root refcnt 2 bandwidth 56Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 71916206973 bytes 52345548 pkt (dropped 1322255, overlimits 87489041 requeues 0)
backlog 0b 0p requeues 0
memory used: 1342536b of 4Mb
capacity estimate: 56Mbit
min/max network layer size: 46 / 1500
min/max overhead-adjusted size: 90 / 1544
average network hdr offset: 14
Tin 0
thresh 56Mbit
target 5ms
interval 100ms
pk_delay 290us
av_delay 72us
sp_delay 3us
backlog 0b
pkts 53667803
bytes 73766740042
way_inds 298245
way_miss 109318
way_cols 0
drops 1322255
marks 0
ack_drop 0
sp_flows 1
bk_flows 1
un_flows 0
max_len 39364
quantum 1514
qdisc noqueue 0: dev wlan0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
root@OpenWrt:~#
Tests performed on a Belkin RT3200.
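When comparing two dumps like the ones above, the interesting egress figures are cake's pk_delay/av_delay per interface. A sketch of how to pull them out of saved `tc -s -d qdisc` output, assuming the one-stat-per-line layout shown in the pastes above (the helper name `summarize` is my own):

```shell
# Summarize cake queue delays per interface from `tc -s -d qdisc` output.
# On the router you would run:  tc -s -d qdisc | summarize
summarize() {
    awk '
        /^qdisc cake/ { dev = $5 }      # e.g. "qdisc cake 8009: dev wan ..."
        /pk_delay/    { pk[dev] = $2 }  # peak sojourn delay for this tin
        /av_delay/    { av[dev] = $2 }  # average sojourn delay
        END { for (d in pk) printf "%s pk=%s av=%s\n", d, pk[d], av[d] }
    '
}
```

Run it once per capture and compare the numbers side by side; in the dumps above, SFO clearly lowers pk_delay on wan.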
Highly unscientific test from my side:
I tried enabling software offload on my x86 router on a 1 Gbit line (no SQM or shaping) and frankly, I did not notice any large improvement by looking at htop and running a speedtest.
The only change was that YAMon stopped working, so I switched it back off.
@moeller0 since you're the SQM expert, what's the latest take on enabling software or hardware offloading with it? Does it do nothing or should it be avoided? It's a shame the LuCI page still doesn't provide any insight for this.
I have to admit that I have no reliable information to share. However, as far as I understand, software offloading should work with SQM, as it bypasses parts of the network stack higher up than the mere qdiscs SQM uses. Hardware flow offloading, however, will hide all packets from SQM, so the hardware offloading engine needs to offer its own qdiscs, like the NSS stuff on e.g. the R7800, as far as I understand.
The pages from luci-app-sqm? Those have been independent from sqm-scripts for some time now, so anybody with reliable information could create a PR to get well-tested/well-researched changes in.
As I said, I have never bothered to test any offload. But it seems that my current router offers software flow offloading, so I could go and test it... Maybe I will get around to doing that later this year...
It works if the SQM interface is set to wan (the device), but software flow offloading does not help SQM if the interface is set to pppoe-wan (the tunnel). My test on my friend's FTTH line with a 200/20 plan showed that the Xiaomi Mi Router 4A Gigabit Edition is able to shape up to 150 Mbps using CAKE without software flow offloading, and it handles 200 Mbps fine with software flow offloading turned on. But I'm afraid DiffServ does not work when shaping is done on wan instead of pppoe-wan: every packet goes into the Best Effort tin. Other than that, I wasn't able to test whether the nat option works, so I'm not sure about that.
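For context, shaping on the tunnel rather than the device is just a matter of which interface the SQM queue is bound to. A sketch of the `/etc/config/sqm` section for the pppoe-wan case, with sqm-scripts option names as I understand them (the bandwidth numbers are illustrative, not from the post above):

```
config queue
	option enabled '1'
	option interface 'pppoe-wan'	# shape the tunnel so cake still sees DSCP marks
	option download '190000'	# kbit/s, set a bit below the 200 Mbps plan
	option upload '19000'
	option qdisc 'cake'
	option script 'layer_cake.qos'	# the diffserv-aware script
```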
I am wondering if you were able to find time to do any tests comparing SQM vs. SQM+SFO. The information I have found on forums so far about whether SFO does or does not impact SQM (including some great screenshots and info earlier in this same topic) seems inconclusive, or at least not root-caused, same as my own experience with this burning question. It would be great to hear feedback from an expert on the topic.
There is one simple solution to this question: getting hardware with sufficient margins to do SQM without any kind of offloading…
--
I'd never buy a new device with flow-offloading in mind, the technology is too new and quirky to do that with a clear conscience. If you suddenly find yourself in a situation where your old hardware won't cope without it after a 'sudden' speed upgrade, fine - give it a try, no harm done - but don't select hardware with it in mind.
I experienced slh's footnote with my ER-X gateway... I suddenly found my existing hardware was too slow to handle SQM/QoS after an ISP speed upgrade from ~200 to 500 Mbps. So of course the very next thing I tried was to save my hardware by experimenting with software options!
I no longer recall the exact improvement software flow offloading provided with SQM/QoS, but I do recall it was marginal at best (less than 5-10%) and difficult to discern within test result variability. I did at least convince myself it didn't hurt LOL...
Flow offloading and SQM/QoS are not compatible - the CPU needs to handle SQM/QoS. I attributed any observed benefit to software offloading freeing up a few CPU cycles on lan traffic that could then be used for SQM/QoS, but that may be an entirely nonsensical attribution, considering my limited understanding of offloading "fringe benefits."
In the end, I did exactly what slh recommends in the post above - buy enough hardware to not need or care about software flow offloading. I replaced my ER-X gateway with a NanoPi R4S, whose CPU is woefully underutilized with "only" 500 Mbps ISP service. It's embarrassing, really. I think I need to upgrade to faster ISP service...
No, I have to admit I did not, I even forgot to put this on my TODO list.
Hi everybody, is veth compatible with SFO? Thanks!
Without having tested in any objective way, and just based on ordinary usage not including Teams or Zoom calls, I can't tell the difference, albeit I think I am correct in stating that cake-qos-simple is SFO-friendly.