Fixed value MSS clamping

Hello.
I'm looking to set up fixed-value MSS clamping on my router.
Automatic path MTU discovery is broken because I am behind a VPN that internally fragments packets larger than the real MTU.
While this works for sending large packets, it tanks throughput, so I'm looking to set a proper MSS value to work around it.
I also can't reduce the interface's MTU, because the IPv6 standard enforces a 1280-byte minimum.

I have found some documentation online on how to do this with nft; I'm just not sure about the correct way to apply it permanently in the context of OpenWrt's fw4.
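Roughly, the rule I found looks like this (it clamps to the path MTU; I assume the fixed-value variant just replaces `rt mtu` with a literal number):

```nft
# Generic example from the nftables documentation:
# clamp the TCP MSS of SYN packets to the route's MTU.
# For a fixed value, replace "rt mtu" with e.g. "1200".
tcp flags syn tcp option maxseg size set rt mtu
```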

Any help is appreciated.

This is an end-to-end technology, so you would configure this on the source client/host. If your router has issues, disable MSS clamping and set the MTU values properly. This will ensure the correct MSS size.

So just to be clear, this isn't possible. You may be able to alter the packet in transit (i.e., artificially altering the clamped MSS), but this may cause other issues.

Any VPN would fragment a packet larger than the real MTU (provided the packet is actually that large), so it's not clear what your statement means.

(So then you disable MSS Clamping.)

  • What's your real MTU!?
  • How low do you think it needs to be?
  • Why do you think that?
  • Then why would you lower/alter the MSS?

You just need to set proper MTUs so that MSS clamping does its job correctly.
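For example, on current OpenWrt the MTU can be set persistently in /etc/config/network (a sketch; the device name and value here are placeholders, not values from this thread):

```
# /etc/config/network (sketch; device name and MTU value are examples)
config device
        option name 'eth0'
        option mtu '1400'
```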


I'm trying to understand the OP's theory: they would rather have the client send two small TCP packets (each of which must be acknowledged) than have the router fragment the packet and deliver the fragments to the other VPN endpoint (one acknowledgement). That could tank download throughput.

I agree this is the simplest advice.

If the target zone has variable MTUs, then you may need to adjust
/etc/nftables.d/whatever.nft (EDIT: the file extension should be .nft; rename the file to disable it)

chain mangle_postrouting {
        type filter hook postrouting priority mangle; policy accept;
        # Clamp the MSS of early packets (SYN set, FIN/RST clear)
        # leaving via a WAN device to the route's path MTU.
        ct packets < 14 oifname $wan_devices tcp flags syn / fin,syn,rst tcp option maxseg size set rt mtu
}

Most of this is in snapshot already.


Sure it is. Refer to:
https://wiki.nftables.org/wiki-nftables/index.php/Mangling_packet_headers

That's indeed exactly what I want.

I forgot the actual value, but it's around 1238 bytes if I'm not mistaken.

Lower than the real MTU. 1200 sounds like a good start; I can find the optimal value after figuring out how to set the MSS.

In my current setup, relying on automatic path MTU discovery to set the MSS leads to a high value (>1450), which reduces throughput significantly.
I have tested various MSS values with iperf3 (the -M switch), and throughput gets significantly better when I set the MSS to 1200.

I want that value to apply to all TCP connections that go through the VPN interface.

As mentioned already, my actual MTU is lower than the minimum imposed by IPv6, so I can't do that without disabling IPv6.
Since the VPN handles larger packets internally, I don't see the need to do that when there is a way to ask the client to send smaller packets via MSS clamping.
If for whatever reason a large packet is sent anyway, the VPN will handle it, at lower throughput.


I will try this with "option maxseg size set 1200" and report back.
Thank you

Yes, I noted this.

That's odd, and I'm not sure how such a low MTU was determined. Feel free to expound. But allow me to clarify my inquiry:

What medium do you use to connect to your ISP?

  • Ethernet (1500 MTU)
  • DSL
  • PPPoE
  • Other?

Can you peel back the onion of encapsulations you use? 1280 is like 20 encapsulations away from 1500.


I am behind multiple VPN tunnels.

Unfortunately, that's not possible with my setup.
I understand it comes off as unusual, but believe me, I have my reasons.
However, thanks to you, I learned about the /etc/nftables.d/ directory; it was what I was looking for.

Mostly, MTU <1280 for v6 and <576 for v4 will be filtered wherever you try to connect.


That's unfortunate, but it still doesn't answer the question. Additionally, you just provided a detail you never mentioned in your first post. To be clear, this information was needed to assist you.

We need to know the REAL physical connection medium you use to connect to your actual Internet Service Provider. This determines your real MTU.

Afterward, we can discuss the nested VPNs, how they're configured, MSS/MTU issues, etc. (and whether OpenWrt is involved).

I don't really see how knowing the physical connection helps, but it's a PPPoE connection.
That's the topmost layer of the encapsulated packet.


1480 is usually the MTU on a PPPoE connection. This is running on OpenWrt, correct?

Can you describe more?

We now need to know how you nested the VPNs.

Then we can discuss their MTUs, subtract the nesting overhead, etc.

Ummmm, that depends on how you're nesting the VPNs. Anyway, again, we needed to know your real maximum MTU to help understand that "topmost layer".

Everything else you're discussing is virtual and runs over that physical connection. If you're uncomfortable discussing how/why you have this setup, you may wish to see if the VPN providers offer support for their service(s).


OK, nineteen more to go.


I appreciate your help.
I know it feels like I'm being dodgy; I can explain the full situation over PM if you'd like, but I'm not comfortable discussing it on a public forum.

Anyway, I have added this to /etc/nftables.d/10-custom-filter-chains.nft and it seems to be doing what I want. I shall experiment with different values and see how they affect throughput.
Though ideally I should adopt proper values for IPv4 and IPv6 connections separately, this is a great start.

chain user_post_forward {
        type filter hook forward priority 1; policy accept;
        # Clamp the MSS of TCP SYN packets entering or leaving the VPN interface.
        iifname { "tun0" } tcp flags syn tcp option maxseg size set 1200
        oifname { "tun0" } tcp flags syn tcp option maxseg size set 1200
}
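A per-family variant of the same chain might look like this (a sketch; the values are placeholders I'd still need to tune, with the IPv6 MSS 20 bytes smaller because the IPv6 header is 20 bytes larger than IPv4's):

```nft
# Sketch: separate fixed MSS per address family (example values, not tuned).
chain user_post_forward {
        type filter hook forward priority 1; policy accept;
        meta nfproto ipv4 iifname { "tun0" } tcp flags syn tcp option maxseg size set 1200
        meta nfproto ipv4 oifname { "tun0" } tcp flags syn tcp option maxseg size set 1200
        meta nfproto ipv6 iifname { "tun0" } tcp flags syn tcp option maxseg size set 1180
        meta nfproto ipv6 oifname { "tun0" } tcp flags syn tcp option maxseg size set 1180
}
```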

Thank you for the help.


OK. I hope this helps.

I thought your issue was "tanking throughput", and you had multiple (possibly nested) VPNs. Perhaps I misunderstood. Let us know your results!

:spiral_notepad: Recall, no endpoint knows this is set (or altered, rather). :wink:


I do have nested tunnels, but they aren't visible to OpenWrt.
It's like this (sorry, I can't picture it well): openwrt <===( server 1 <===> server 2 )==> server 3 <==> internet
Yes, the latency is horrible.


I'm not sure if this is related to your MSS and MTU discussion, but I can tell from your diagram that the servers may cause latency.

You did well.

  • The issues could reside there
  • If you control these servers/tunnels, you must recall their MTU settings

Their MTU is likely 1500 (or higher), so recall you only have 1480 to begin with. Fragmenting/clamping will take place there, no matter what you do.

I do control 2 of the middle servers.
I think you are suggesting that I count the overhead of each tunnel encapsulation, subtract the sum from 1480 (my physical link MTU), and set the result on the interface.
I have done this before, but the resulting MTU was <1280, which the kernel refused to set on the interface because it had an active IPv6 address.
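To illustrate the arithmetic with hypothetical tunnel types (I'm not naming the real ones): if each hop were WireGuard over IPv6, costing about 80 bytes (40 IPv6 + 8 UDP + 32 WireGuard), three nested hops would give:

```
1480 - 3 * 80 = 1240   (below the 1280-byte IPv6 minimum)
```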