Netfilter "Flow offload" / HW NAT

For a gigabit WAN pipe:

8 packets * 1500 bytes/packet * 8 bits/byte / (1e9 bits/s) = 96 microseconds, so your suggestion overflows the router's buffers within roughly the first 100 microseconds. Most likely you want a minimum of 10 times that value, and more likely 50 to 100 times, which is in fact close to the default of 1000 packets. However, downstream, say on a 20 Mbit/s wifi connection between an AP and a mobile phone, the same 1000 packets will take:

1000 packets * 1500 bytes/packet * 8 bits/byte / (20e6 bits/s) = 600 ms to empty
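For anyone who wants to redo this arithmetic for their own link, here is a small shell sketch. The helper name `drain_ms` is made up for illustration, and it assumes full-size 1500-byte packets with no framing overhead, the same simplification as the estimates above.

```shell
#!/bin/sh
# drain_ms PACKETS BYTES_PER_PACKET LINK_BITS_PER_SECOND
# Prints how long, in milliseconds, a full queue takes to drain onto the link.
# (Hypothetical helper; ignores Ethernet/wifi framing overhead.)
drain_ms() {
    awk -v p="$1" -v b="$2" -v r="$3" \
        'BEGIN { printf "%.3f\n", p * b * 8 / r * 1000 }'
}

drain_ms 8    1500 1000000000   # 8 packets on a gigabit WAN
drain_ms 1000 1500 1000000000   # 1000 packets on a gigabit WAN
drain_ms 1000 1500 20000000     # 1000 packets on a 20 Mbit/s wifi link
```

This prints 0.096, 12.000 and 600.000 (milliseconds), which matches the roughly 100 microsecond and 600 ms figures above.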

which just goes to show that for VoIP you really need end-to-end quality of service: tag DSCP on your router, shape both directions of the router flow, honor that DSCP in your switches, and use an AP that takes advantage of the DSCP marks to select the WMM queues. There's really no other way; traffic shaping is an integral part of a modern network.
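As a rough illustration of the "tag DSCP on your router" step, the sketch below marks all UDP traffic to and from one VoIP device as EF in the mangle table. The address 192.168.1.50 and the names `RUN`, `VOIP_HOST` and `mark_voip` are assumptions for the example, not anything from this thread. By default it only prints the commands; run it as root with `RUN=` (empty) to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN="${RUN-echo}"
# Hypothetical address of the VoIP phone or ATA on the LAN.
VOIP_HOST="${VOIP_HOST:-192.168.1.50}"

mark_voip() {
    # Mark forwarded UDP from the VoIP device as Expedited Forwarding (EF).
    $RUN iptables -t mangle -A FORWARD -s "$VOIP_HOST" -p udp -j DSCP --set-dscp-class EF
    # Mark forwarded UDP going to the VoIP device the same way.
    $RUN iptables -t mangle -A FORWARD -d "$VOIP_HOST" -p udp -j DSCP --set-dscp-class EF
}

mark_voip
```

Matching on the device address is the crudest possible classifier; matching on the actual RTP port range of your phone or PBX would be tighter, but those ports depend on your setup.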

future is bright :slight_smile:

Fantastic news. I cannot wait until this reaches the master branch.

Wouldn't 600 ms make VoIP unusable? Imagine a queue 1000 packets deep: a VoIP packet that joins at the end of a full queue has to wait 600 ms just to be sent out onto the wire.

Linux routers have TCP/UDP receive and transmit buffers, which can be adjusted to hold received packets before being sent into the transmit queue.

My limited understanding of Linux networking is as follows:

[1] Socket buffer -> [2] QDisc Queue -> [3] Network interface queue -> [4] Wire/Phy

So I would think that [2] and [3] should be as short as possible, i.e. just enough to saturate the wire/phy. [1] should be as big as possible to avoid dropped packets.
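To see roughly where packets can queue on a given box, something like the following read-only sketch may help. The function name `show_queues` is made up, and the mapping onto [1] to [3] is only approximate: `tx_queue_len` is the packet limit that the simple FIFO qdiscs use, and the byte queue limit (where the driver supports it) bounds what sits in the driver below the qdisc.

```shell
#!/bin/sh
# show_queues DEVICE
# Prints the queue-related settings that are readable on this system.
# Anything that is missing (for example inside a container) is skipped.
show_queues() {
    dev="$1"

    # [1] Upper bounds for socket receive/send buffers, in bytes.
    for f in /proc/sys/net/core/rmem_max /proc/sys/net/core/wmem_max; do
        if [ -r "$f" ]; then echo "$f = $(cat "$f")"; fi
    done

    # [2] Packet limit used by the plain FIFO qdiscs, plus qdisc statistics.
    f="/sys/class/net/$dev/tx_queue_len"
    if [ -r "$f" ]; then echo "$f = $(cat "$f")"; fi
    if command -v tc >/dev/null 2>&1; then
        tc -s qdisc show dev "$dev" || true
    fi

    # [3] Byte queue limit of the first hardware transmit queue.
    f="/sys/class/net/$dev/queues/tx-0/byte_queue_limits/limit"
    if [ -r "$f" ]; then echo "$f = $(cat "$f")"; fi

    return 0
}

show_queues "${1:-lo}"
```

Pass the interface name as the first argument; without one it falls back to `lo` so that it at least runs everywhere.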

QDisc can be implemented in most consumer routers, but it would not be able to scale effectively.

IMHO, most consumer routers just do not have enough grunt and resources to shape traffic effectively. Also, it only ever makes sense to shape uplink traffic in a home setup, as there's no effective way to shape the downlink, since it's controlled by your ISP.

From my experience with my 50/20 Mbps link, VoIP, video streaming and app update downloads that saturate the downlink do not really present a major issue when the uplink is relatively free.

My router gets an A from test as well without any QoS enabled, so I guess ISP plays a part as well? It could also be because my links are not considered fast. Anyway, I would think networking is more an art than science :stuck_out_tongue: so probably have to trial and error until we get a setup that works for our own use.

Yes, that's the point, which is why you need prioritization in your switches and your APs. When an AP with WMM gets a VoIP packet with a DSCP tag, it will send it through the AC_VO queue right away instead of making it wait in a single 600 ms FIFO.

In your notation, [2] doesn't need to be short; it just needs to be multi-lane, so that low-latency packets like VoIP, games or whatever go into a short queue that gets serviced right away. No one cares if there's a 600 ms delay in an all-night torrent download.

At high bandwidth, yes, that's correct: many consumer routers will have difficulty with more than, say, 50 or 100 Mbit/s and a decent shaper. Some of the newer ones, such as the ARM-based models, can probably handle 200 to 400 Mbit/s with shapers.

This is a myth. The point of downstream shaping is that torrents, banner-ad downloads and the like don't care about hundreds of milliseconds of delay or some dropped packets, and VoIP does. So if you need more bandwidth, you should figure out which streams to slow down. You do this by making sure the VoIP packets have priority over the torrents and banner ads, so that the bulk traffic builds up a backlog in your qdisc. The size of that backlog tells something like fq_codel that this stream should drop packets; the upstream sender then sees the drops and slows its sending rate. In other words, TCP and many other protocols form a feedback loop, and you need to send feedback to the latency-tolerant streams to slow down so that your latency-sensitive streams can have some bandwidth.
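One common way to get that backlog to form on your own router rather than at the ISP is to redirect incoming WAN traffic through an IFB device and rate-limit it slightly below the line rate, with fq_codel managing the queue. The sketch below is only that, a sketch: the interface `eth0`, the 45mbit rate and the names `RUN`, `WAN`, `DOWN` and `shape_downstream` are assumptions, and it does not add separate priority tiers for DSCP-marked traffic. It prints the commands by default; run it as root with `RUN=` (empty) to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN="${RUN-echo}"
WAN="${WAN:-eth0}"       # hypothetical WAN interface name
DOWN="${DOWN:-45mbit}"   # hypothetical rate, a bit below a 50 Mbit/s downlink

shape_downstream() {
    # Create an intermediate device that incoming traffic can be queued on.
    $RUN ip link add name ifb0 type ifb
    $RUN ip link set dev ifb0 up

    # Redirect everything arriving on the WAN interface to ifb0.
    $RUN tc qdisc add dev "$WAN" handle ffff: ingress
    $RUN tc filter add dev "$WAN" parent ffff: protocol all u32 match u32 0 0 \
        action mirred egress redirect dev ifb0

    # Rate-limit on ifb0 so the queue builds here, then let fq_codel
    # drop from the flows that are building the backlog.
    $RUN tc qdisc add dev ifb0 root handle 1: htb default 10
    $RUN tc class add dev ifb0 parent 1: classid 1:10 htb rate "$DOWN"
    $RUN tc qdisc add dev ifb0 parent 1:10 fq_codel
}

shape_downstream
```

Because the limit is below what the ISP delivers, the bulk senders get their drop feedback from this queue, which is the feedback loop described above.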

Future is here :slight_smile:
Scroll all the way down for a first version of flow offload implemented by @nbd

Flow offload in trunk.

What packages do I need to build in or install in the firmware? How do I configure it? Any drawbacks?

You will need to compile a build for your device with kernel 4.14, the module required for flow offload, and an iptables rule that uses said module.

What kind of rule would you use if you want to offload all traffic?

Add a rule to the FORWARD chain with: "-j FLOWOFFLOAD"

Which of the following 3 modules are required? And am I missing any?


So.... the 2nd rule in the FORWARD chain? Nope.

Try this: Optimized build for the D-Link DIR-860L

Thanks, and so she be where some may look:

iptables -I FORWARD 1 -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD
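If that rule goes into a startup or firewall script, running it repeatedly stacks up duplicate rules. A possible guard is sketched below; the function name `ensure_offload_rule` is made up, and it assumes an iptables build that supports `-C` (check) and has the FLOWOFFLOAD target installed. The block only defines the function; call it as root.

```shell
#!/bin/sh
# Insert the FLOWOFFLOAD rule at the top of FORWARD unless it already exists.
ensure_offload_rule() {
    # Same match and target as the rule quoted above.
    rule="-m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD"

    # $rule is left unquoted on purpose so it splits into separate arguments.
    if iptables -C FORWARD $rule 2>/dev/null; then
        echo "FLOWOFFLOAD rule already present"
    else
        iptables -I FORWARD 1 $rule
    fi
}
```

Note that `-C` only says the rule exists somewhere in the chain, not that it is still in position 1.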

Nope: for me it keeps resetting an established connection; my irc client. I will check my image contents again.

With an image built from config.seed, running on the rango target, my IRC client (irssi on an Ubuntu box) resets its connection every few minutes once FLOWOFFLOAD is installed.

Should maybe add that the irc connection is a secure connection.

The latest series of patches addressed this issue, things appear to be working.

The issues should be fixed now. Please try the latest version.

iptables -I FORWARD 1 -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD

This command messes up w/ NAT Loopback. If I delete it, NAT loopback works again.

What do you mean by "NAT loopback"? Also, please try my staging tree to see if it fixes those issues.


I mean the port forwarding with reflection a.k.a. hairpin NAT. I have a few services that I access using the external IP and it doesn't work with the FLOWOFFLOAD as the first rule in the FORWARD chain.

Still happens on the latest staging build.

Should be fixed now. Please try the latest version.
