Nftables: how to clear ecn bit

Oh, for the signaling TCP does use the IP header ECN bits, but while negotiating whether to use ECN those bits are set to not-ECT, so the negotiation packets will not trigger your nftables "cleaner" and the endpoints can easily end up thinking that ECN is to be used. The problem really occurs whenever a CE packet gets re-marked to not-ECT (or ECT(0) or ECT(1); what counts is that the "slow down" signal is lost): an intermediate node tried to tell the flow "please slow down", but we remove that signal before the flow can see it, so the flow will not react to it...

For ingress that will only work if you use one of the following:
a) a veth pair to "pipe" your ingress traffic locally around after packets entered the nftables domain and hence move the cake qdisc after nftables
b) use tc pedit to manipulate the ECN bits before the traffic-shaper/AQM sees the packets
c) instantiate the internet download shaper as an "egress" shaper on the interface from the CPU to the LAN (switch); that will only work for wired-only routers and will not shape the router's own internet download traffic.

Now, thinking this over, I think there is a way of having nftables act before cake even on an IFB, but it will be happening outside of your NAT domain, which for ECN bit manipulations is not a real issue as you do not even look at the IP addresses or port numbers.

Even if the CE bit is nulled, it doesn't matter because cake will drop the packets anyway if congestion occurs.
What about having a non ecn capable qdisc with ecn enabled streams?

It doesn't matter if the packets get mangled after the ingress qdisc,
as long as the bits that are used for the negotiation get modified.

Nope, if cake sees a packet with ECT(1) or ECT(0) that is selected for "action" it will simply mark it CE instead of dropping it. So if cake sees the packet after the IP ECN bits were cleaned, it will drop the packet and the flow will react correctly; the problem occurs if the cleaning happens after cake changed ECT(0)/ECT(1) to CE to signal congestion. Sure, cake's BLUE component will rather later than sooner ramp up the drop probability for that hash bin, so eventually drops will happen, but only after considerable queue build-up for that hash bin.

If you e.g. use simple.qos/fq_codel you can configure fq_codel either to evaluate ECN (then it will set CE marks) or not to evaluate the ECN bits (then it will drop, independent of the ECN bits, if a packet was deemed actionable).

Nope, as I tried to explain, but clearly failed, the negotiation is going to proceed as it does not use/rely on the IP header ECN bits; it is just that nftables will remove CE marks and hence remove the congestion indication... One solution would be to use nftables to simply drop packets where the ECN bits are CE (11)... that would post-hoc do the right thing...
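
For what it's worth, a minimal sketch of that "drop CE instead of clearing it" idea, written as a standalone ruleset file that could be loaded with nft -f. The table name ecn_ce_drop and the chain name forward are made up for illustration, and the sketch assumes these rules run before anything else re-marks the ECN bits (ordering relative to other chains at the same priority is not something it guarantees). Untested, so treat it as a starting point only.

```
# Hypothetical standalone table; table and chain names are illustrative.
table inet ecn_ce_drop {
    chain forward {
        type filter hook forward priority mangle; policy accept;

        # CE (binary 11) means a node upstream signalled congestion.
        # Dropping the packet turns that mark into a loss, which the
        # sender still treats as a "slow down" signal.
        ip ecn ce counter drop
        ip6 ecn ce counter drop
    }
}
```

The counters are only there so that nft list table inet ecn_ce_drop shows whether CE-marked packets are actually arriving. The price is a retransmission for every packet that would otherwise have been delivered with a mark.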

But there won't be any packets with ECT(1)/ECT(0) bits hitting the cake qdisc.
So cake will drop the packets and not mark them, and the flow will react accordingly, as I wrote above.
For egress at least...
The point is that when the ECN bits are always set to not-ECT, cake will fall back to dropping packets.

By "right packets" I mean the TCP ECN bits; there must be a way to do this with nftables too.
IPtables has an extension for this.
So it would be
Sender: TCP ECN bits set -> cake ingress qdisc -> nftables nulls the ECN bits -> receiver
The receiver thinks ECN is not enabled, so the flow will not be ECN enabled.

But I don't know, all the sites I checked say the IP ECN bits are used to indicate an ECN-capable transport.
Idk why TCP would not honor/use that and proceed with its own ECN thing.
Maybe it's time to do some tcpdumping...

//edit
nft insert rule inet fw4 mangle_forward tcp flags ecn tcp flags ecn delete
Throws an error:
Error: syntax error, unexpected newline, expecting @ or '$'

Why? How can you make sure of this?

Yes, we agree on egress/upload: there nftables will re-set ECT(0)/ECT(1) to not-ECT and cake will drop. My argument has been about the download direction...

Yes, cake will mark if the packet is ECT(0)/ECT(1) and the action comes from cake's codel component; otherwise (not-ECT, or the action comes from the BLUE component) cake will drop.

Well, you can try to interfere with the TCP ECN negotiation by manipulating the ECE and CWR bits...

That is a simplification...

Because that is how RFC 3168 describes how TCP should behave... However, TCP senders will only set ECT(0) in the IP header after having successfully negotiated ECN usage. If you add ECT(0) via nftables/iptables later on the path, TCP will not use ECN signaling unless it did negotiate it successfully with the TCP on the other side.
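
If you want to see this on your own router without tcpdump, here is a rough sketch of a counting-only ruleset. It relies on the RFC 3168 handshake (an ECN-setup SYN carries ECE and CWR in the TCP header, the ECN-setup SYN-ACK carries ECE only) and on nft naming the ECE flag "ecn". The table name ecn_observe and the chain name forward are invented for illustration, and nothing here changes any packet. Untested sketch.

```
# Hypothetical observation-only table; table and chain names are illustrative.
table inet ecn_observe {
    chain forward {
        type filter hook forward priority mangle; policy accept;

        # Negotiation happens in the TCP header:
        # ECN-setup SYN = SYN with ECE ("ecn") and CWR set, no ACK.
        tcp flags & (syn | ack | ecn | cwr) == (syn | ecn | cwr) counter
        # ECN-setup SYN-ACK = SYN+ACK with ECE set and CWR clear.
        tcp flags & (syn | ack | ecn | cwr) == (syn | ack | ecn) counter

        # Signaling happens in the IP header ECN field.
        ip ecn ect0 counter
        ip ecn ect1 counter
        ip ecn ce counter
        ip6 ecn ect0 counter
        ip6 ecn ect1 counter
        ip6 ecn ce counter
    }
}
```

If the first two counters increase while the IP ECN field has already been cleared by another rule, that would match the point above: the handshake still succeeds, only the later marks get lost.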

But to cycle back to the why: why, I ask again? For quick testing just use simple.qos/fq_codel and disable ECN in fq_codel, and for the longer term, teach cake to allow disabling ECN usage...

I was thinking about this too.
Would be easier to remove the ecn part out of cake than figuring out this nftables command :rofl:

I don't know, ECN seems not suitable for internet usage.
It's way too slow.
For local networks with low rtt it is maybe fine.

I tried with the ingress hook thing...
But I can't get fw4 to insert the rules into it.
I have:
ingress-hook.nft

chain ingress_ifbeth2 { type filter hook ingress device ifb-eth2 priority mangle ; }

Included in fw4 config:

config include
	option type 'nftables'
	option path '/etc/firewall/ingress-hook.nft'
	option position 'table-pre'

Seems to work.
But inserting rules into this with another snippet doesn't work.
clear-ecn-in.nft

ip ecn != not-ect ip ecn set not-ect counter
ip6 ecn != not-ect ip6 ecn set not-ect counter

config include
	option type 'nftables'
	option path '/etc/firewall/clear-ecn-in.nft'
	option position 'chain-pre'
	option chain 'ingress_ifbeth2'

No error message, just no rules are inserted, bummer.

//edit
putting everything in one include like this:

chain ingress_ifbeth2 { type filter hook ingress device ifb-eth2 priority mangle ; ip ecn != not-ect ip ecn set not-ect counter; ip6 ecn != not-ect ip6 ecn set not-ect counter; }
Seems to work...
But how to get line breaks like with \ working?
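
On the line-break question: as far as I know, nft accepts a newline inside a chain block as a statement separator just like ";", so no backslash should be needed. A sketch of the same single include written over several lines (same chain name, device and rules as the one-liner; not tested on fw4 here):

```
chain ingress_ifbeth2 {
    # Base chain on the ingress hook of the IFB device.
    type filter hook ingress device ifb-eth2 priority mangle;

    # Reset any ECT(0)/ECT(1)/CE codepoint to not-ECT and count the hits.
    ip ecn != not-ect ip ecn set not-ect counter
    ip6 ecn != not-ect ip6 ecn set not-ect counter
}
```

This assumes the file is still included with position 'table-pre', so that fw4 places the chain inside its own table.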

This works for me with respect to bleaching the ECN bits of upload packets using nftables:

This will have the effect of preventing cake from marking upload packets because the ECT(0) or ECT(1) ECN capability flags will have been scrubbed.

I do not think nftables can help so easily with bleaching the ECN bits of download packets because even the ingress hook of nftables occurs after the tc ingress hook normally used for mirroring ingress packets to the IFB established for the cake download interface. Any thoughts on that @dave14305? One way to have nftables bleach the ECN bits on download would be to switch from the normal tc mirroring to nftables forwarding, and so have nftables bleach the ECN bits and then forward the packets to the IFB. But that means ditching the normal tc-based IFB route.

So to bleach the ECN bits of download packets, @moeller0's suggestion to use tc-pedit:

seems like it might fit better with existing implementations that typically involve a mixture of nftables and tc anyway. It would just involve one or two further tc actions before the normal mirroring action.

I am interested in this myself because I believe my ISP bleaches the ECN bits of packets. When an ISP bleaches ECN bits, cake will not mark download packets but will mark upload packets. This is because the bleaching occurs before cake sees the packets on download (scrubbing the ECT(0) or ECT(1) ECN capability flags), but after cake sees the packets on upload.

So in this case it seems to me desirable to bleach the ECN bits on upload packets to prevent cake from ineffectively marking upload packets.


Actually thinking about this further, perhaps it's best that all ECN bleaching is done using tc rather than nftables because that can presumably easily handle both directions and makes it easier to work with user configurable variables.