1 - What happens if the MPU is larger than necessary?
2 - If I have 3 VLAN tags on the network, do I have to add 12 more bytes?
3 - If I have PPPoE, do I have to add 8 more bytes?
SQM will account for very small packets as being a bit larger than they are, which will result in a small loss of potential throughput. This only becomes relevant on links where such small packets can actually occur (e.g. DOCSIS links with their small overhead) and only noticeable if these small packets make up a notable fraction of traffic. A typical case would be reverse ACK traffic on an asymmetric link, where the upstream is only a bit wider than the roughly 1/40 of the downstream required for ACK traffic. Here small ACK packets will take up 50-100% of the uplink capacity. And this is exactly the situation that motivated cake's MPU handling, albeit not because cake was over- but underestimating the on-the-wire size of such packets, so bufferbloat increased. Overestimating the MPU causes no harm to latency under load but costs a typically small fraction of potential throughput.
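As a back-of-the-envelope check on that 1/40 figure, here is a sketch assuming one ACK per two full-size TCP segments (the function name and the exact overhead/MPU values are illustrative; the real ratio depends on the link layer):

```python
# Rough estimate of the uplink fraction consumed by reverse ACK traffic,
# assuming one ACK for every two full-size (1500 byte MTU) TCP segments.
# Overhead and MPU values are illustrative, not universal.

def ack_fraction(mtu, overhead, mpu, acks_per_data=0.5):
    data_wire = mtu + overhead            # on-the-wire size of a full-size packet
    ack_wire = max(40 + overhead, mpu)    # a bare TCP ACK is a 40 byte IP packet
    return acks_per_data * ack_wire / data_wire

# DOCSIS-style accounting: 18 bytes overhead, 64 byte MPU
print(1 / ack_fraction(1500, 18, 64))   # ~47 -> ACKs need ~1/47 of the download rate
# Ethernet L1 accounting: 38 bytes overhead, 84 byte MPU
print(1 / ack_fraction(1500, 38, 84))   # ~37 -> ~1/37
```

The "roughly 1/40" above sits between these two accountings, which is why it works as a rule of thumb.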
Only if these 3 are nested, that is, if a single packet carries all three of these (I know double VLAN tags are possible; no idea whether that is also true for triple tags). Also, only those tags are relevant that affect your SQM-shaped link: if you only shape the WAN link, any VLAN tags that exist only in your LAN are irrelevant to the shaper.
It depends, but a shaper on, say, pppoe-wan does not see the PPPoE overhead, so you have to account for it manually. But note that cake will always interpret manually specified overhead as being on top of the IP packet, so you will effectively always need to add 8 bytes if you have PPPoE encapsulation on the shaped link...
Thanks for the reply. In the case of a fiber link the recommended MPU is 84, but have VLAN tags and PPPoE already been included in this count?
Do you mean the overhead or the MPU here?
PPPoE overhead lives within the ethernet payload and hence has no direct influence on the MPU (it only makes the MPU less likely to be an issue).
Good point, with a VLAN tag on ethernet you get to 88 bytes effectively...
Both... but PPPoE overhead makes it less likely that the MPU ever triggers, as the MPU really only applies if the ethernet payload data is smaller than 42 bytes... (only then does ethernet padding happen, and it is mostly that padding that we need to account for).
Got it. Thanks.
So 4 bytes for vlan overhead?
Does it matter where cake is applied, i.e. on the base device (say wan) or on the vlan device (say wan.1)?
Does that apply in both directions?
Can overhead not be determined by cake? Or at least by OpenWrt? At least this kind of overhead.
Yes, the first VLAN tag lives in the ethernet header (not the payload, and is hence not included in the MTU), but it is included in each packet's overhead and hence also in the MPU.
No, the kernel only includes 14 bytes of overhead per ethernet frame (and zero on PPPoE interfaces), independent of VLAN or no VLAN. Cake, by the way, is nice enough to take care of the 14 bytes the kernel might have added, so all specified overhead is applied on top of the IP layer.
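To make that accounting concrete, a minimal sketch (the function name is mine; the 38/84 values are the usual ethernet L1 figures used elsewhere in this thread):

```python
# How cake sizes a packet once `overhead` and `mpu` are configured:
# the specified overhead is applied on top of the IP packet length
# (cake first compensates for the 14 bytes the kernel may have added),
# and the result is then floored at the MPU.

def cake_wire_size(ip_len, overhead, mpu):
    return max(ip_len + overhead, mpu)

print(cake_wire_size(40, 38, 84))    # bare ACK: 40 + 38 = 78, floored up to 84
print(cake_wire_size(1500, 38, 84))  # full-size packet: 1538
```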
It does, so an ifb sort of inherits that value from the interface it "mirrors".
No, as far as I can tell it does not. Also, there are situations where you know about a VLAN tag on the actual bottleneck but the SQM interface does not have any, or even the other way around; with SQM we really need to properly model the overhead/encapsulation on the real bottleneck link.
Thanks for this. In what situations should the ethernet overhead be added? Doesn't that always include ethernet? VLAN is over ethernet.
I'm confused, since I thought what matters is what's transmitted over the bottleneck. But VLAN is normally local, whereas the bottleneck is normally not local.
So when should ethernet or VLAN overhead be added?
So this is a bit subtle... If you look at the wiki page for the ethernet frame you can see that we have to deal with L2 and L1 ethernet frames, and the difference between the two can matter. Many encapsulations, like PTM on DSL, GEM on GPON, or whatever DOCSIS calls its encapsulation, actually package a full L2 ethernet frame and hence inherit ethernet's minimum L2 size of 64 bytes (68 with VLAN). We then typically have to add the actual link layer's own additional L1 overhead (except for DOCSIS, where the mandated DOCSIS shaper operates on the fiction of pure L2 ethernet frames).
On a true ethernet link we obviously need to add the ethernet L1 overhead as well.
Now, on link layers different from ethernet we typically have a total overhead smaller than on ethernet (and hence also a smaller MPU). So we need to calculate whether the higher overhead/MPU of the ethernet link between router and modem ever becomes the bottleneck: if yes, we need to account for the ethernet L1 overhead and MPU; if no, we only need to care about the typical bottleneck's overhead and MPU. The trick is to figure out the worst-case overhead that can be relevant and then account for that.
Let me give you an example. Say we have a cable/DOCSIS link with a gross rate of 900 Mbps (purposefully selected to show the issue) and connect the modem to our router via gigabit ethernet with a 1000 Mbps gross rate:
We start by looking up overhead and mpu and find:
overhead: 18
mpu: 64
but we remember that on true ethernet we have
overhead: 38 (42 with a VLAN tag, but in the DOCSIS case no VLAN is expected)
mpu: 84
Now, assume fully saturating traffic with minimally sized packets (yes, this is not a normal load, but that is no reason for our shaper to give up either); the DOCSIS link will allow:
(900 * 1000^2) / (64 * 8) = 1757812.5 packets per second
now on our ethernet link these packets would require:
1757812.5 * (84 * 8) / 1000^2 = 1181.25 Mbps
but we only have 1000 Mbps to go around, so clearly under these conditions the ethernet link will become the bottleneck. To be on the safe side we should configure the ethernet overhead and MPU values...
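The same arithmetic as a quick sketch (rates and sizes taken from the example above; the variable names are mine):

```python
# Worst case: a 900 Mbps DOCSIS link fully saturated with minimum-size
# packets (64 bytes at DOCSIS accounting), each of which occupies
# 84 bytes on the gigabit ethernet wire between modem and router.

docsis_gross = 900e6          # bps
docsis_mpu = 64               # bytes
eth_mpu = 84                  # bytes

pps = docsis_gross / (docsis_mpu * 8)
print(pps)                          # 1757812.5 packets per second
eth_rate_needed = pps * eth_mpu * 8 / 1e6
print(eth_rate_needed)              # 1181.25 Mbps, but only 1000 Mbps available
```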
We can also calculate the traffic rate at which that switch-over from the DOCSIS overhead being relevant to the ethernet overhead being relevant happens, but I leave that as an exercise for the reader.
True, however which link is the worst case bottleneck can be a bit non-obvious... which I hopefully illustrated with the example above.
Whenever it becomes relevant. But if you want to be on the safe side: always, unless you know that the real bottleneck overhead is even higher...
Very helpful as ever thanks!
So technically the overhead could change in the case of a variable remote link whose bandwidth goes above and below the bandwidth of the local link? Since then the bottleneck alternates between the remote and local links, and so the bandwidth and overhead values would need to be dynamically adjusted.
But I suppose tc change calls could deal with this wacky scenario?
Yes, but I note that cake-autorate specifically will not really care much about this. Getting the overhead wrong basically interacts with what shaper speed is achievable without too much latency (that is, setting the overhead too high will result in a larger nominal shaper rate still giving acceptable latency under load, and vice versa), and since autorate measures and reacts to that latency it will gracefully deal with this problem to a large degree. It will only run into issues if the link is constantly saturated but the average packet size varies massively on unfortunate timescales. But for normal loads that is typically not happening... forward data traffic tends to use close to the largest packet size, reverse ACK traffic close to the smallest.
But, the gist of this whole thing for configuring of SQM is:
- for gross shaper rate err on the side of too small
- for per-packet-overhead and mpu err on the side of too big
- don't stress out too much
I would simply configure overhead and MPU for the worst case and be done with it... I note that for each link layer it is possible to calculate at what throughput (with minimal packet size) the bottleneck flips, and to only bother with the ethernet overhead if that rate is actually achievable. E.g. for DOCSIS I seem to recall this switch-over being around 750 Mbps, so on a, say, 500 Mbps contract the issue simply does not matter.
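For reference, a sketch of that switch-over calculation (function name is mine): the ethernet link can only become the bottleneck once the shaped link's minimum-packet rate, translated into ethernet wire bytes, exceeds the ethernet gross rate. The result here is consistent with the roughly 750 Mbps recalled above.

```python
# Gross rate on the shaped link above which minimum-size packets would
# overload the ethernet link to the modem; below this rate only the
# shaped link's own overhead/MPU matters.

def crossover_rate(eth_gross_mbps, link_mpu, eth_mpu=84):
    return eth_gross_mbps * link_mpu / eth_mpu

print(crossover_rate(1000, 64))   # DOCSIS (MPU 64) behind gigabit ethernet: ~762 Mbps
```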