CAKE w/ DSCPs - cake-qos-simple

Does recalculating the headers also potentially break the purpose of the checksum before the pedit?

I do not understand your question. Assuming the checksum was correct before editing it is likely wrong after your edits and hence needs to be re-calculated, otherwise the receiver side TCP stack is going to detect a 'broken' packet and is going to drop and re-request that segment... but I would certainly try to check first how the checksums look before and after the edit! Note most packets will be Not-ECT(00) so if your use-defined value is zero no bits are going to change and hence the checksum is not going to change, so expect having to search for the few packets that were ECT(0) or ECT(1) before to find a candidate for broken checksums.... (I wonder is a checksum broken or simply no-matching, but semantics aside, I hope you understand my rationale)

But if the checksum was wrong before pedit, then is there not a danger of papering over a corrupted packet?

Can tcpdump be used to look at checksums?

Sure. But that likelihood is rather slim, after all the IP4 header#s check sum only covers the actual IPv4 header, UDP or TCP packets will have their own checksum, as will the ethernet frame... Also the IPv6 header dropped the header checksum altogether relying on those alternative checksums. I would say the risk is small enough to not bother (assuming random bit flips),

I tried this:

        if [[ -n "${overwrite_ecn_val_dl}" ]]
        then
                printf "\nSetting up filter to overwrite ecn bits to decimal value: '${overwrite_ecn_val_dl}' on download.\n"
                tc filter add dev "${dl_if}" parent 1: protocol ip matchall action pedit ex munge ip dsfield set "${overwrite_ecn_val_dl}" retain 0x3 csum ip4h and tcp and udp
                tc filter add dev "${dl_if}" parent 1: protocol ipv6 matchall action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_dl}" retain 0x3
        fi

but still no dice.

Well, for IPv6 this is not too surprising, as there is no iph4 ;), but you really need to capture the same packets before and after your pedit action to see that only the ECN field changes and that before and after the header checksum is correct for the respective header...

If the header checksum is somehow automatically adjusted with your original pedit I do not expect much change...

Ah, turns out I only needed to add in pipe because:

        if [[ -n "${overwrite_ecn_val_dl}" ]]
        then
                printf "\nSetting up filter to overwrite ecn bits to decimal value: '${overwrite_ecn_val_dl}' on download.\n"
                tc filter add dev "${dl_if}" parent 1: protocol ip matchall action pedit ex munge ip dsfield set "${overwrite_ecn_val_dl}" retain 0x3 pipe csum ip4h
                tc filter add dev "${dl_if}" parent 1: protocol ipv6 matchall action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_dl}" retain 0x3
        fi

works!

The tc syntax is quite fiddly.

For my own understanding, does OpenWrt just drop packets with an incorrect checksum? So the reason the Amazon Prime Video grinds to a halt is because of massive packet drops?

No idea, but even of it shpuld not, the receiver will. This should be confirmable with comparing before amd after pedit packet captures trying to find matching packets....

Now I'm trying to figure out how to match only packets with ECN 1 or 2.

        if [[ -n "${overwrite_ecn_val_dl}" ]]
        then
                printf "\nSetting up filter to overwrite ecn bits to decimal value: '${overwrite_ecn_val_dl}' on download.\n"
                tc filter add dev "${dl_if}" parent 1: protocol ip matchall action pedit ex munge ip dsfield set "${overwrite_ecn_val_dl}" retain 0x3 pipe csum ip4h
                tc filter add dev "${dl_if}" parent 1: protocol ipv6 matchall action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_dl}" retain 0x3
        fi

I think rather than 'matchall' I just use 'dsfield' for IPv4 and 'priority' for IPv6, but then what combination of value and mask for each to capture ECN '1' or '2' but not '0' or '3'?

          dsfield VAL_MASK_8
                 IPv4 only. Match the packet header's DSCP/ECN
                 field. Synonyms to this are tos and precedence.
          priority VAL_MASK_8
                 IPv6 only. Match the header's Traffic Class field,
                 which has the same purpose and semantics of IPv4's
                 ToS field since RFC 3168: upper six bits are DSCP,
                 the lower two ECN.

@dave14305?

So I would simply stack two filters one for ECT(0) and one for ECT(1), since these are mutually exclusive just give the more common one a lower priority sequence number (so a higher priority), and do not use continue (or how it is called).
tc has some sort of hashtable that could be used instead, but for just two tests, maybe it is not worth bothering...
BTW with the coming L4S experiments it will become helpful to do different things to ECT(0) and ECT(1) so two filter rules might actually be the best way forward....

1 Like

Will the rules always be per port or will they be added per IP in the future?

Good question.

Since ultimately an nftables script is generated and implementing per IP rules in nftables is straightforward, it's always possible for the user to add such rules in to the nftables script once it has been generated based on the config.

But it would be nice to facilitate this using nft variables in the config so that the user only needs to modify those and regenerate the nft script in the same way that this is made possible with the existing port based mapping.

I will give it some thought.

@moeller0 and @dave14305, does this look appropriate (and most efficient):

printf "\nSetting up tc filter to restore DSCPs from conntrack on egress packets on interface '${ul_if}'.\n"
tc filter add dev "${ul_if}" parent 1: protocol all matchall action ctinfo dscp 63 continue

if [[ -n "${overwrite_ecn_val_ul}" ]]
then
	printf "\nSetting up filters to overwrite ecn values 0 or 1 with decimal value: '${overwrite_ecn_val_ul}' on upload.\n"
	tc filter add dev "${ul_if}" parent 1: protocol ip u32 match ip dsfield 1 0x3 action pedit ex munge ip dsfield set "${overwrite_ecn_val_ul}" retain 0x3 pipe csum ip4h
	tc filter add dev "${ul_if}" parent 1: protocol ip u32 match ip dsfield 2 0x3 action pedit ex munge ip dsfield set "${overwrite_ecn_val_ul}" retain 0x3 pipe csum ip4h
	tc filter add dev "${ul_if}" parent 1: protocol ipv6 u32 match ip6 priority 1 0x3 action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_ul}" retain 0x3
	tc filter add dev "${ul_if}" parent 1: protocol ipv6 u32 match ip6 priority 2 0x3 action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_ul}" retain 0x3
fi

printf "\nSetting up CAKE on interface: '${dl_if}' with bandwidth: '${cake_dl_rate_Mbps}Mbit/s' and options: '${cake_dl_options}'.\n"
tc qdisc add dev "${dl_if}" root handle 1: cake bandwidth "${cake_dl_rate_Mbps}Mbit" ${cake_dl_options}

if [[ -n "${overwrite_ecn_val_dl}" ]]
then
	printf "\nSetting up filters to overwrite ecn values 0 or 1 with decimal value: '${overwrite_ecn_val_dl}' on download.\n"
	tc filter add dev "${dl_if}" parent 1: protocol ip u32 match ip dsfield 1 0x3 action pedit ex munge ip dsfield set "${overwrite_ecn_val_dl}" retain 0x3 pipe csum ip4h
	tc filter add dev "${dl_if}" parent 1: protocol ip u32 match ip dsfield 2 0x3 action pedit ex munge ip dsfield set "${overwrite_ecn_val_dl}" retain 0x3 pipe csum ip4h
	tc filter add dev "${dl_if}" parent 1: protocol ipv6 u32 match ip6 priority 1 0x3 action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_dl}" retain 0x3
	tc filter add dev "${dl_if}" parent 1: protocol ipv6 u32 match ip6 priority 2 0x3 action pedit ex munge ip6 traffic_class set "${overwrite_ecn_val_dl}" retain 0x3
fi

Does it make sense to match on '1' or '2' then overwrite to decimal value of choice?

A user can always create traffic rules in the LuCI firewall app that are based on IP and/or ports and set DSCP there. The saving to ct mark will occur after the firewall4 mangle rules are processed. You don’t really have to solve for the basic scenarios if they are achievable with the built-in firewall.

1 Like

I never really got my head around why this is desirable to erase ECT bits. I guess it’s spread over a couple different threads. Just seems complicated to me.

For readability, you might want to break the command over several lines to see all the individual components in the filter.

tc filter add dev "${ul_if}" parent 1: protocol ip \
u32 match ip dsfield 1 0x3 \
action pedit ex munge ip dsfield set "${overwrite_ecn_val_ul}" retain 0x3 pipe \
csum ip4h
1 Like

I think I would create two variables for the overwrite value, one for ECT(0) and one for ECT(1), there will be more ECT(1) traffic in the future that is not cake compatible so being able to selectively wipe only ECT(1) (or ECT(0) for testing) makes some sense.

Because sometimes you get packets marked ECT(0) or ECT(1) that are not from an ECN enabled flow (but likely accidents, somebody trying to set a 6bit DSCP but accidentally writing the value to the 8bot TOS bitfield instead) cake will then re-mark such packets CE on coingestion instead of dropping them, but if the transport does not actually respond to the CE marks this is not the right thing that cake wanted to do. Given enough time (and queuing) cake#s BLUE component is going to reign in such flows but only slowly. The real issue is that cake has no way to disable ECN marking (fq_codel in contrast can be configured to ignore ECN and always drop on congestion). And as I said we are about to see experiments with ECT(1) that expect a different marking strategy than the one cake uses, so being able to selectively wipe ECT(1) seems a useful feature for the near future.
In case you ask, these ECT(1) experiments are in no way guaranteed to work as expected, so changing cake to use that new marking scheme is not a clear winner either. Sorry for getting verbose here...

@moeller0 understood about two overwrite variables. Does it make sense for the overwrite variables to allow for something other than zeroing? Can you see a use case for doing anything other than zeroing? At the moment I allow any value for the overwrite but should it rather than be zero or not zero?

For your use-case re-marking to Not-ECT (aka 0) seems the safest option. Now for the ECT(1= experiment it might become helpful to remark ECT(1) to ECT(0), but that really is pie in the sky, so for now re-mark to 00 seems the best all-around choice (as that will make cake switch from CE marking to dropping...).

1 Like

In that case surely we may as well keep the overwrite variables (0, 1, 2, etc.), but just split out into ect (0) and ect (1) cases as you suggested. So variables can be any value and overwrite will effect that value.

Actually I think this is a really good point. Perhaps we should remove the attempt in cake-qos-simple at facilitating rudimentary DSCP classifying since that can already be done powerfully and graphically in LuCi.

My only reservations are:

  • LuCi firewall rules look a bit cluttered already and now there has to be mixture of rules for firewall and rules for QoS jumbled up together; and

  • it's one rule per port so that's going to mean quite a few rules.

But I see your rationale @dave14305 and absent any objections I think I'll go this route.

Do you think LuCi should have a separate section for QoS rules (rather than jumbling firewall and QoS rules together)?

1 Like