Overhead of fiber IPoE

I'm wondering what the real overhead of my connection is; the setup is the following:

Fiber > Huawei EG8145V5 ONT > Mikrotik ac2 with OpenWRT

The Huawei ONT is set to bridge mode and uses VLAN ID 999; my router gets a public IPv4 address + IPv6 from the ISP's DHCP server.

When using the Huawei ONT as a router, speedguide reports an MTU of 1452, but it's 1500 when using bridge mode.

PS: VLAN tagging only works through the ONT's bridge mode settings; doing it from external equipment gives me no connection from the ISP.

To be clear, are you asking about overhead, or MTU?

You've only mentioned things about MTU, and no information regarding any "overhead issue".

I want the overhead value to input in SQM. I know there's a 4-byte overhead because of VLAN tagging, but I'm not sure if there's anything more to account for.

This number would usually be 80-90% of your connection in Mbps, if I recall what you're referencing correctly. The other stuff you're mentioning has little bearing on it.

So SQM doesn't need to account for the overhead amount of the connection?

Yes; and I told you:

Everything else you're talking about seems unrelated to that. All you need to know is the upload/download speed of your connection per your ISP; any difference is your "overhead"... and 10-20% (5-10% may be better, depending on how large your connection is, given you said fiber) of your true speed is configured/reserved to handle quality on the connection.

Hope this helps.

I don't think the OP is talking about speeds, but about the per-packet overhead, which is very much a thing that applies to SQM.

Start with 44. OpenWrt Project: SQM Details has some more in-depth details which you might want to look through.


So there are two issues here that you probably understand but that seem to be mixed up in your argument, which I would like to untangle for the benefit of others finding this thread.

SQM's aim is to keep latency under load in check by carefully managing the network queues/buffers that can store/soak up packets during transit. Unfortunately quite a lot of network devices have over-sized but under-managed buffers/queues. Under load, packets accumulate in these buffers, leading to increasing delay for each additional packet, especially in situations where the ingress speed into the queue is larger than its egress speed. For many users that situation exists at their internet access link, where LAN/WiFi speeds in the >> 100 Mbps range meet WAN speeds in the <= 100 Mbps range (I am glossing over details here).

SQM applies both flow queueing (FQ) and Codel-type signaling, which typically results in relatively short standing queues while still allowing the buffers to act as shock absorbers for bursts of traffic, and also keeps TCP flows running close to capacity in spite of TCP's halving of its congestion window on detecting packet loss/CE marks. But for this to be effective SQM needs to be in control of the most relevant queues in front of the true bottleneck. The way we achieve that is typically by adding a traffic shaper to the mix of Codel and FQ that makes sure that SQM's egress speed is below the relevant link's true bottleneck speed, and hence all packets admitted to, say, the WAN link in our example will find empty buffers and hence will be transmitted immediately*.

Now, typically a link's speed is defined as a gross rate that does not really care about packets (e.g. Gigabit Ethernet has a 1000 Mbps Layer 1 rate (but that is not the true gross speed either; there are additional modulations below that layer, but we can safely ignore that)). For our traffic shaper to be effective the shaper's gross speed needs to be <= the true gross speed of the bottleneck, and for ingress shaping rather < or << than =. A typical recommendation is to set ~90% of the bottleneck's known gross speed (or, if that is unknown, ~90% of the typical speed that one can robustly measure as goodput on speedtests over the relevant bottleneck link).
But since we packetize all information, we also need to account for the size of the packet infrastructure (effectively headers and trailers), since these also need to be transmitted and hence count against a link's gross speed.
To give an example, let's assume we have a link with an arbitrary speed of 100 volume/time; to keep the link's latency under load acceptable we need to avoid admitting more than a gross rate of 100. To keep things simple, let's look at a typical case, TCP/IPv4 over ethernet, and calculate the achievable goodput for different packet sizes:

packet size 1500:
100 * ((1500-20-20)/(1500+38)) = 94.93
packet size 100:
100 * ((100-20-20)/(100+38)) = 43.48
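
If you want to play with these numbers yourself, the calculation is trivial to script; here is a minimal Python sketch (the goodput() helper and its defaults are mine, purely for illustration):

```python
# Throwaway Python for the calculation above: goodput of TCP/IPv4 over
# ethernet, where 40 bytes are the IPv4 + TCP headers and 38 bytes the
# ethernet framing overhead.
def goodput(gross_rate, packet_size, overhead=38, ip_tcp_headers=40):
    return gross_rate * (packet_size - ip_tcp_headers) / (packet_size + overhead)

print(round(goodput(100, 1500), 2))  # 94.93
print(round(goodput(100, 100), 2))   # 43.48
```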

Now, let's assume that instead of configuring SQM with "rate 100 overhead 38" we simply used "rate 97", and redo the two calculations:

packet size 1500:
97 * ((1500-20-20)/(1500)) = 94.41
packet size 100:
97 * ((100-20-20)/(100)) = 58.2
We would need to reduce our shaper rate to 97 * (43.48/58.2) = 72.47 to stay in the acceptable region
72.47 * ((100-20-20)/(100)) = 43.48
but that would sacrifice quite some goodput for larger packets:
97 * ((1500-20-20)/(1500)) - (72.47 * ((1500-20-20)/(1500))) = 23.87, i.e. almost 24% of the link's gross rate

So at packet size 1500 our shaper still works, as it admits a bit less than the 94.93 equivalent goodput to the link, but at packet size 100 we now admit 58.2 instead of the allowed 43.48; in other words we are feeding in more packets than the link can carry, and these will accumulate in the link's over-sized and under-managed buffers, introducing bufferbloat.
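
Another way to look at the same failure: if the shaper believes a packet only costs packet_size bytes while the link actually charges packet_size + 38, the gross rate really offered to the link is inflated accordingly. A quick illustrative sketch (again my own throwaway Python, not SQM code):

```python
# Gross rate actually offered to the link when the shaper ignores the
# 38 bytes of ethernet framing (illustrative only, names are mine).
def offered_gross(shaper_rate, packet_size, true_overhead=38, assumed_overhead=0):
    return shaper_rate * (packet_size + true_overhead) / (packet_size + assumed_overhead)

print(round(offered_gross(97, 1500), 2))  # 99.46 -> still below the link's 100
print(round(offered_gross(97, 100), 2))   # 133.86 -> well above 100, queues build up
```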

The obvious solution to the issue is to give the traffic shaper sufficient information to robustly estimate the effective size/time each packet will require on the bottleneck link, and that requires both a correct gross rate and the applicable overhead each packet incurs on the link.
Now one might ask: why do I need to specify the overhead, shouldn't the kernel know? And that is partly true, except:

  1. The kernel only cares about the amount of overhead it is responsible for itself; for ethernet packets that typically does include the 12 bytes for the source and destination MAC addresses and the 2 bytes for the ethertype field, but things like the frame check sequence (FCS), preamble and start of frame delimiter are typically only added by the network driver/device, so the kernel does not care. And finally the interpacket gap is just a pause between sending packets that never exists as true payload, but it happens to be timed such that it lasts just as long as the transmission of 12 bytes would take.
  2. Even if the kernel knew the overhead on its WAN-side interface perfectly, that is not even guaranteed to be the relevant overhead. For example, if the router is connected via ethernet to an ADSL modem, the kernel will account the size of a full-sized packet as 1514, but as demonstrated above that should be 1538; and even if we assume a perfect kernel using a 1538 byte packet size, that still will not work out correctly (due to ignoring ATM/AAL5's ATM cell quantization):
    "True" effective size of an MTU 1500 packet on an ATM/AAL5 link (assuming the best-case ATM encapsulation: IPoA, VC/Mux RFC-2684, which just incurs an 8 byte overhead on top of the IP packet):
    ceil((1500+8)/48)*53 = 1696
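
For reference, the AAL5 cell quantization from point 2 is easy to reproduce; a tiny illustrative snippet (the function name is mine):

```python
import math

# Effective on-link size on an ATM/AAL5 link: packet plus encapsulation
# overhead, padded up to a whole number of 48-byte cell payloads, each of
# which travels in a 53-byte ATM cell.
def atm_wire_size(packet_size, encap_overhead=8):
    return math.ceil((packet_size + encap_overhead) / 48) * 53

print(atm_wire_size(1500))  # 1696
```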

I hope that this makes it obvious that proper traffic shaping, with the goal of keeping bufferbloat in check, really requires correctly modeling the bottleneck link's properties, so that the shaper can robustly predict the time each packet occupies the link during its transmission; and that correct description requires the gross rate, the per-packet overhead, and potentially fancy additional encodings like ATM/AAL5's quantized 48/53 encoding** (it is also important to model potential minimal packet sizes: for example, an ethernet frame from dst MAC to FCS is always at least 64 bytes, so even a theoretical 1 byte packet will require at least 64 bytes on any link that uses ethernet frames (so PTM, DOCSIS, GPON, ...)).
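
Putting the pieces together, the per-packet accounting a shaper needs roughly looks like the following sketch; this is a simplification in the spirit of cake/SQM's overhead, mpu and atm options, with a made-up function name, not the actual implementation:

```python
import math

# Rough model of the per-packet on-link size a shaper has to budget for
# (simplified, for illustration only).
def effective_size(packet_size, overhead, mpu=0, atm=False):
    size = packet_size + overhead      # add the link-layer framing overhead
    size = max(size, mpu)              # honour a minimum frame size, if any
    if atm:                            # ATM/AAL5: pad to whole 53-byte cells
        size = math.ceil(size / 48) * 53
    return size

# Note: mpu has to be expressed on the same "layer" as the overhead; 64 bytes
# here assumes the overhead counts dst MAC to FCS (18 bytes), not preamble/IPG.
print(effective_size(1500, overhead=42, mpu=64))   # 1542: ethernet + VLAN framing
print(effective_size(1, overhead=18, mpu=64))      # 64: the minimum frame wins
print(effective_size(1500, overhead=8, atm=True))  # 1696: the AAL5 example above
```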

*) Technically that is not perfectly correct; SQM will still batch-admit a bunch of packets if need be, but typically such that the upstream queue does not get more packets than it can transmit in 1 millisecond (or at least a single packet), so effectively that bounds the device's worst-case buffering to adding only a small delay.

**) VDSL2's PTM uses a 64/65 encoding, but unlike AAL5 this is applied not per packet but simply to all bytes independent of their function; and while the ATM/AAL5 expansion needs to be calculated for each packet based on its size (as ATM will pad each data frame to use an integer number of ATM cells, so that each ATM cell only carries data from one higher-level data frame), the PTM encoding can simply be accounted for by reducing the gross shaper rate by 100 - 100*(64/65) = 1.54%.
Sidenote: PTM is actually specified as an annex to one of the ITU's ADSL standards, so not all ADSL links need to use ATM (and in theory VDSL2 links also could use ATM/AAL5); the question is what is going to happen first, the replacement of all ADSL line cards and modems to support PTM or the complete retirement/replacement of ADSL...
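
In numbers, for a hypothetical VDSL2 line syncing at 100 Mbps (a made-up example value), the 64/65 derating above would be applied to the gross shaper rate before the usual safety margin:

```python
# PTM's 64/65 encoding eats ~1.54% of the sync rate before anything else.
sync_rate = 100.0                 # Mbps, hypothetical VDSL2 sync rate
ptm_gross = sync_rate * 64 / 65   # usable gross rate after PTM encoding
print(round(ptm_gross, 2))        # 98.46
```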


In short, we really do not know exactly what to account for as per-packet overhead on GPON links like yours, but overestimating the overhead will only reduce achievable goodput slightly, while underestimating it has the potential of increasing latency under load considerably, so I would recommend erring on the side of too large: either 38+4 = 42 (ethernet plus VLAN) or the 44 from @krazeh (the latter is a realistic estimate of what people might encounter on ADSL links).

As a demonstration of how benign overestimating the per-packet overhead is, here is a comparison of what a shaper will allow as measurable goodput assuming either no per-packet overhead (on top of the IP headers) or the recommended 44 bytes:
100 * ((1500-20-20)/(1500+0)) = 97.3
100 * ((1500-20-20)/(1500+44)) = 94.56
97.3 - 94.56 = 2.74 percentage points worst case, and even on GPON the true overhead will be larger than zero.

100 * ((100-20-20)/(100+0)) = 60
100 * ((100-20-20)/(100+44)) = 41.67
As always, these effects are going to be more pronounced for small packets, here 60 - 41.67 = 18.33 percentage points. But unless you only send small packets, even that should not be a show stopper.
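
A few lines of throwaway Python reproduce these numbers (illustrative only, the helper is mine):

```python
# Goodput at a gross rate of 100, comparing overhead 0 vs. the recommended 44
# bytes; 40 = 20 bytes IPv4 + 20 bytes TCP headers.
def goodput(gross_rate, packet_size, overhead):
    return gross_rate * (packet_size - 40) / (packet_size + overhead)

for size in (1500, 100):
    print(size, round(goodput(100, size, 0), 2), round(goodput(100, size, 44), 2))
# 1500 97.33 94.56
# 100 60.0 41.67
```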

