Ah good, I wasn't able to get that by reading around about SFO... also the "Experimental feature. Not fully compatible with QoS/SQM." warning in the LuCI firewall tab appears above the SFO checkbox (when HFO checkbox is not even visible) so maybe I got confused a bit by that...
@dl12345 thanks for taking the effort of sharing these benchmarks... very informative indeed.
Speaking as a layman about the combined operation of SFO+SQM, and about what these graphs mean (I have never seen them before and just imagine them to be measurements for different classes of traffic), could it be the other way round at this point?
I mean, since SFO disabled/enabled has so little impact (as if it's not operating at all, well, except the priority inversion, and the shapes too), maybe it's SQM that "disables" SFO no matter what the settings are and what the CLI/GUI shows, and not the opposite? I.e. are you operating without SFO all the time? Is this possible? Or are those inverted priorities and different shapes clear evidence of SFO?
(sorry in case this looks cocky, i see you know lots more than me on the subject, i don't mean to say you are not aware of what settings you are running or you cannot notice the differences by yourself... just asking to learn something )
This is a good thought but I don't think so. @moeller0 is there anything in the SQM scripts that disables software flow offload?
Reading further on the internet, it's clear that software offloads still send packets via neigh_xmit which I believe still sends things through the qdisc so SQM should work. The priority inversion is weird though.
And in the end, which one was the expected/good result? the one with SFO on or SFO off?
Ok i have done a couple tests myself with SFO and SQM (first time setting up SQM for me).
Well, SQM is clearly working and the difference is like night and day on latency. Additionally i see a FLOWOFFLOAD target in iptables, and it is being hit since the counter increases. I'd say they are both working fine together then.
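For anyone wanting to repeat that counter check, here is a minimal sketch in Python. It assumes you have saved the text of `iptables -L FORWARD -v -n -x` twice, once before and once after a speed test. The sample listing, its numbers, and the helper names (`flowoffload_packets`, `offload_is_hit`, `sample_listing`) are made up for illustration and are not output from a real router.

```python
def flowoffload_packets(iptables_output: str) -> int:
    """Sum the packet counters of every rule whose target is FLOWOFFLOAD."""
    total = 0
    for line in iptables_output.splitlines():
        fields = line.split()
        # Rule lines start with: pkts bytes target prot opt in out ...
        # Header and "Chain ..." lines fail the isdigit() / target checks.
        if len(fields) >= 3 and fields[2] == "FLOWOFFLOAD" and fields[0].isdigit():
            total += int(fields[0])
    return total


def offload_is_hit(before: str, after: str) -> bool:
    """True if the FLOWOFFLOAD packet counter grew between two snapshots."""
    return flowoffload_packets(after) > flowoffload_packets(before)


def sample_listing(pkts: int, nbytes: int) -> str:
    """Build a HYPOTHETICAL listing in the usual `iptables -v -n -x` layout."""
    return (
        "Chain FORWARD (policy DROP 0 packets, 0 bytes)\n"
        "    pkts      bytes target     prot opt in     out     source               destination\n"
        f"{pkts:8d} {nbytes:10d} FLOWOFFLOAD  all  --  *      *       "
        "0.0.0.0/0            0.0.0.0/0            "
        "ctstate RELATED,ESTABLISHED FLOWOFFLOAD\n"
    )


# Illustrative numbers only, not measurements.
BEFORE = sample_listing(1200, 950000)
AFTER = sample_listing(4800, 3900000)

if __name__ == "__main__":
    print("packets before:", flowoffload_packets(BEFORE))
    print("packets after: ", flowoffload_packets(AFTER))
    print("rule is being hit:", offload_is_hit(BEFORE, AFTER))
```

A growing counter only shows that the rule matches traffic. It does not say by itself how much load the offload takes off the CPU.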
The result at the top is with SQM and SFO + HFO enabled, but maybe on the MikroTik hAP ac2 it has no effect for the moment. I have run many tests to show you.
Thank you for caring and taking time to test and share. Indeed it looks like there is no difference between SFO and +HFO, meaning it is not being used, but maybe at those speeds there should be no impact even if it was working... while there is a difference between SQM and no SQM for sure.
I tried enabling software offload on my x86 router on a 1 Gbit line (no SQM or shaping) and frankly, I did not notice any large improvement by looking at htop and running speedtest.
Only change is that Yamon stopped working, so I switched it back.
@moeller0 since you're the SQM expert, what's the latest take on enabling software or hardware offloading with it? Does it do nothing or should it be avoided? It's a shame the LuCI page still doesn't provide any insight for this.
I have to admit that I have no reliable information to share. However, as far as I understand, software offloading should work with SQM, since it only bypasses parts of the network stack that sit above the qdisc layer SQM relies on. Hardware flow offloading, however, will hide all packets from SQM, so the hardware offloading engine needs to offer its own qdiscs, like the NSS stuff on e.g. the R7800, as far as I understand.
The pages from luci-app-sqm? These have been independent from sqm-scripts for some time now, so anybody with reliable information could create a PR to get well-tested/well-researched changes in.
As I said, I have never bothered to test any offload. But it seems that my current router offers software flow offloading so I could go and test it... Maybe I will get around to do that later this year...
It works if the SQM interface is set to wan (device), but software flow offloading does not help with SQM if the interface is set to pppoe-wan (tunnel). My test on my friend's FTTH line with a 200/20 plan showed that the Xiaomi Mi Router 4A Gigabit Edition can shape up to 150 Mbps using CAKE without software flow offloading, and it handles 200 Mbps fine with software flow offloading turned on. But I'm afraid Diffserv is not working when shaping is done on wan instead of pppoe-wan: every packet goes into the Best Effort tin. Other than that, I wasn't able to test whether the nat option is working, so I'm not sure about that.
I am wondering if you were able to find time to do any tests comparing SQM vs. SQM+SFO. The information I have been able to find on forums so far on whether SFO does or does not impact SQM [including some great screenshots and info earlier in this same topic] seems to be inconclusive or at least not root-caused, same as my own experience with this burning question. It would be great to hear feedback from an expert on the topic.
There is one simple solution to this question, getting hardware with sufficient margins to do sqm without any kind of offloading…
--
I'd never buy a new device with flow-offloading in mind, the technology is too new and quirky to do that with a clear conscience. If you suddenly find yourself in a situation where your old hardware won't cope without it after a 'sudden' speed upgrade, fine - give it a try, no harm done - but don't select hardware with it in mind.
I experienced slh's footnote with my ER-X gateway ...I suddenly found my existing hardware was too slow to handle SQM/QoS after an ISP speed upgrade from ~200 to 500 Mbps. So of course the very next thing I tried was to save my hardware by experimenting with software options!
I no longer recall the exact improvement software flow offloading provided with SQM/QoS, but I do recall it was marginal at best (less than 5-10%) and difficult to discern within test result variability. I did at least convince myself it didn't hurt LOL...
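To make "difficult to discern within test result variability" concrete, here is a rough sketch in Python of how one could compare repeated speed test runs. All throughput numbers are invented for illustration (I did not keep my own results), the function name `compare_runs` is hypothetical, and the "gain must exceed twice the run-to-run spread" threshold is an arbitrary rule of thumb, not a proper statistical test.

```python
from statistics import mean, stdev


def compare_runs(baseline, candidate, factor=2.0):
    """Return (gain_pct, noise_pct, distinguishable).

    gain_pct  : change in mean throughput, in percent of the baseline mean
    noise_pct : larger of the two relative standard deviations, in percent
    distinguishable : True only if |gain| exceeds factor * noise
                      (factor=2.0 is an assumed rule of thumb)
    """
    base_mean = mean(baseline)
    cand_mean = mean(candidate)
    gain_pct = (cand_mean - base_mean) / base_mean * 100.0
    noise_pct = max(stdev(baseline) / base_mean,
                    stdev(candidate) / cand_mean) * 100.0
    return gain_pct, noise_pct, abs(gain_pct) > factor * noise_pct


# HYPOTHETICAL throughput samples in Mbps, five runs each.
SQM_ONLY = [300, 320, 290, 310, 305]
SQM_PLUS_SFO = [325, 310, 335, 315, 330]

if __name__ == "__main__":
    gain, noise, clear = compare_runs(SQM_ONLY, SQM_PLUS_SFO)
    print(f"gain: {gain:.1f}%  run-to-run spread: {noise:.1f}%  "
          f"clearly distinguishable: {clear}")
```

With made-up numbers like these, a gain of a few percent sits inside the spread of the runs, which is the situation I was in: more runs per configuration would be needed before claiming a real improvement.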
Flow offloading and SQM/QoS are not compatible - the CPU needs to handle SQM/QoS. I attributed any observed benefit to software offloading freeing up a few CPU cycles on lan traffic that could then be used for SQM/QoS, but that may be an entirely nonsensical attribution, considering my limited understanding of offloading "fringe benefits."
In the end, I did exactly what slh recommends in the post above - buy enough hardware to not need or care about software flow offloading. I replaced my ER-X gateway with a NanoPi R4S, the CPU of which is woefully underutilized with "only" 500 Mbps ISP service. It's embarrassing really. I think I need to upgrade to faster ISP service.