Software flow offloading implications

Looking at the portion you quoted from my earlier post, I realize now I meant to say "I genuinely cannot find any indication that SFO has any negative measurable effect on SQM..." Anyway, I think your working theory sounds very valid. Freeing up CPU, particularly on a device that might otherwise be CPU-bound, for CAKE definitely could show up in visible ways.

Not sure at this point, to be honest. I tend to lean toward "no", but I would recommend those with vnStat and luci-app-statistics to try it. I don't use either of those.

As for nlbwmon, I do use it and will keep an eye on it over the coming days with SFO enabled to see if there is any deviation from the norm.

vnStat should be OK, as it counts traffic at the network interface. As you mention, one should test to confirm.

Not that familiar with the tool there - if it follows iftop to collect info, might be ok.

How about an upload test? Does the tinning still work properly in the upload direction? SQM solutions using ctinfo are better positioned to work with SFO, but it most likely relies on restoring the DSCP mark in the tc filter, since the nftables rules won’t be evaluated for subsequent packets.

Shaping would work with or without ctinfo, but proper tinning is what I am asking about.
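
To illustrate the concern, here is a minimal sketch of the kind of nftables DSCP rule in question. The table name, chain name, port and DSCP class are all hypothetical, picked only for illustration. As described above, once a flow has been handed to the flowtable, later packets of that flow would no longer reach a rule like this, so the mark would have to be restored some other way (for example via ctinfo in the tc filter).

```nftables
# Sketch only: "demo_dscp", "mark_voip", port 5060 and class EF are assumptions.
table inet demo_dscp {
    chain mark_voip {
        type filter hook forward priority mangle; policy accept;
        # Sets DSCP on packets that traverse the forward hook.
        # Packets of an already offloaded flow bypass this chain.
        udp dport 5060 ip dscp set ef
    }
}
```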

Hmmm - upload test should be fine if the device under test is on the other side of the link - e.g. LAN side - I wouldn't do it from the router itself...

Interesting point. Maybe for some applications CAKE might not be the best choice; fq_codel might be another option...

I think the problems arise if you use a virtual interface like pppoe-wan.

fq_codel definitely improved download throughput when I used a CPU bound ER-X gateway. Upload performance was another matter - fq_codel upload throughput and latency performance was much worse than CAKE, so I stuck with CAKE. Could have been a user error on my part, but I never got fq_codel to work well on upload with the ER-X.

Does SFO work with all devices (including x86)?

Yes, it’s just an additional configuration that gets added into nftables.
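
For anyone curious what that additional configuration looks like, here is a rough sketch of the two pieces involved: a flowtable, plus a rule in the forward chain that hands TCP/UDP flows to it. The table name "demo" and the device names are assumptions for illustration; the ruleset fw4 actually generates depends on your version and configuration.

```nftables
# Sketch only: table name "demo" and devices eth0/eth1 are hypothetical.
table inet demo {
    flowtable ft {
        hook ingress priority 0;
        devices = { eth0, eth1 };
    }

    chain forward {
        type filter hook forward priority filter; policy accept;
        # Hand TCP/UDP flows to the flowtable fast path.
        meta l4proto { tcp, udp } flow offload @ft
    }
}
```

Nothing here is architecture-specific, which is consistent with it working on x86 as well.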

Nice, I guess I will enable SFO on every OpenWrt device I have (I don't use SQM).

Pay attention if the same MAC address moves between network interfaces.
"Hardware offload" without hardware support still handles packets at the outermost network devices and will impair such roaming, just like real hardware offload.
Expect obvious freezes on dual-band Wi-Fi, though it may have no impact at all on an x86 router.
On the other hand, such a router can reach wire speed with every packet fully passing through the firewall.

You can add devices if you think they are unjustly omitted... The list is append-only; the hook and the (hw) flag are immutable.
/etc/nftables.d/whatever.nft

flowtable ft {
        hook ingress priority filter
        devices = { docker0 , virbr0 }  # Your devices may vary here
        counter
}

I’ve used it successfully on many different devices without a hitch. I feel like there’s a stigma against SFO at this point. I always notice a drastic reduction in CPU usage with SQM, without any effect on latency or bandwidth.

There are some more glitches:

PPPoE support was absent; this is fixed in 23..5 24..0, but it does not yet fully cover low-level device + VLAN setups.

Q-in-Q and DSA support are still not ready for prime time.

Sometimes (or always) plugging in an unrelated cable destroys the offload table along with all the states inside it; still to be tested on 6.12.

But for simple(r) cases it works 2-3x faster.

Like Reguna, I thought SFO had very few downsides for simple use cases. To clarify, are the stale connections/freeze symptoms expected to occur with software offloading? Or just with hardware offloading (on supported devices)? If both SFO and HFO are impacted, this would be a significant user experience consideration for all-in-one routers where the same SSID is set up on multiple wifi bands.

Here https://github.com/openwrt/firewall4/blob/b6e5157527d361f99ad52eaa6da273cb0f2dfd59/root/usr/share/ucode/fw4.uc#L426 is the function that puts together the device list.

SFO: resolves the devices of the networks mentioned in the firewall configuration.

HFO: resolves further and keeps only physical devices.
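
As a rough illustration of that difference, the two device lists might end up looking like the sketch below. All names here (table "demo", flowtables "ft_sw" and "ft_hw", devices br-lan, wan and lan1) are hypothetical, and both variants are shown side by side only for comparison. The only syntactic difference for hardware offload is the `flags offload` line.

```nftables
# Sketch only: every table, flowtable and device name is an assumption.
table inet demo {
    # Software offload: devices as resolved from the firewall's networks,
    # which can include virtual devices such as a bridge.
    flowtable ft_sw {
        hook ingress priority 0;
        devices = { br-lan, wan };
    }

    # Hardware offload: resolved further down to physical devices only.
    flowtable ft_hw {
        hook ingress priority 0;
        flags offload;
        devices = { lan1, wan };
    }
}
```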

Examples of edge cases where it falls short:

SFO

wan.7, i.e. a VLAN on a device without VLAN offload: it would be better to pick up packets on wan, saving one memory copy needed to align the packet.

On the other hand, if the card does offload, it is safe either way.

HFO

It could easily run SFO for docker0, but it does not.

IMO it is totally safe to go with SFO, but there is room for improvement.

Well-reasoned algorithmic improvements are welcome. I reasoned that the offload levels needed different device lists and botched it down the road, but now it is fine thanks to a helping hand from @jow :wink:
