Backported "Linux TCP BBR patches for higher wifi throughput and lower queuing delays"

N.B. This patch is not BBRv2 (still in alpha)

For those interested in running BBR on their routers running Samba/CIFS or VPN (tcp connections terminating at the router), I've attempted to backport the patches from this post: https://groups.google.com/forum/#!topic/bbr-dev/8pgyOyUavvY

This is compile-tested and run-tested on mvebu: 1900ACS (Shelby). There are no numbers or comparative statistics yet, which is why I'm putting this out there (to get more testing).

This should solve the problem where TCP BBR's interaction with MAC-layer aggregation (A-MSDU/A-MPDU) hurts WiFi performance.

This patchset will be made redundant/included by default when OpenWrt adopts Kernel 5.4 for the majority of targets (probably v20.07 ~July 2020).

Please test (put it in target/linux/generic/backport-4.1X)! If this works well for others, I'll clean it up and submit it to OpenWrt trunk/master.
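For anyone testing, it helps to first confirm that the running kernel actually offers BBR. Here is a minimal Python sketch, assuming a Linux system with the standard procfs sysctl files; the helper names (`parse_algorithms`, `read_sysctl`, `bbr_status`) are made up for illustration and are not part of the patchset.

```python
from pathlib import Path

# Standard Linux procfs locations for the TCP congestion control settings.
AVAILABLE = Path("/proc/sys/net/ipv4/tcp_available_congestion_control")
CURRENT = Path("/proc/sys/net/ipv4/tcp_congestion_control")


def parse_algorithms(text):
    """Split the whitespace-separated procfs value into a list of names."""
    return text.split()


def read_sysctl(path):
    """Return the file's content, or None if unreadable (e.g. not on Linux)."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def bbr_status():
    """Report whether bbr is offered by the kernel and whether it is the default."""
    available = read_sysctl(AVAILABLE)
    current = read_sysctl(CURRENT)
    return {
        "available": available is not None and "bbr" in parse_algorithms(available),
        "default": current == "bbr",
    }
```

Calling `bbr_status()` on the router returns a small dict. Note that the "available" file only lists algorithms that are built in or whose module is currently loaded, so a missing `bbr` entry may just mean the module has not been loaded yet.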

I really like this precise and correct management of expectations! I also like the option to run BBR :wink:

4.14@ipq806x so far so good ( solid throughput numbers... uploads particularly efficient? )

Will test on 4.19@ipq806x (r7800)

Can someone explain how to measure and report the difference before and after this patch?

Well, BBR makes two promises: first, it will cyclically probe the current path bandwidth and adjust itself so that it does not cause excessive (self-inflicted) bufferbloat; second, it is reasonably tolerant of the relatively low rates of packet loss encountered e.g. over WiFi links. So if you use a VPN from a coffee shop's WiFi network back into your own network, BBR might be a good thing (except that you should not run a VPN over a TCP connection in the first place, but unfortunately that is often not optional).
Also, BBRv1 was reported as not being terribly fair to non-BBR flows, and it might not play too well with an ECN/loss-based AQM such as sqm-scripts (but that should depend on the actual BBR version; v2 is supposed to be better in that regard, and I believe, without evidence, that BBRv1 has also improved in that regard).

Yep, it's not fair to, and doesn't co-exist with, TCP Cubic/Reno or any other loss-based TCP congestion control algorithm.

Theoretically it won't play well with Cake's BLUE or FQ_CoDel's Bulk Dropper. In any case, I agree in that it doesn't play well with ECN.

I don't think BBRv2 will play nicely with ECN still – it looks like they're adopting DCTCP/L4S style rather than SCE or RFC3168.

Though with my use case/ISP, I never receive ECN bits (even though I've enabled it everywhere). I'm pretty sure it's not a config error on my side anyways. I'm stuck with an ISP that uses old Cisco routers and Huawei/FiberHome ONTs/CPEs that probably screw up ECN as collateral due to their effort to nuke DSCP bits.

user@MacBook-Pro ~> netstat -sp TCP | grep ECN
	0 client connection attempted to negotiate ECN
		0 client connection successfully negotiated ECN
		0 time graceful fallback to Non-ECN connection
		0 time lost ECN negotiating SYN, followed by retransmission
		0 server connection attempted to negotiate ECN
		0 server connection successfully negotiated ECN
		0 time lost ECN negotiating SYN-ACK, followed by retransmission
		0 connection using ECN have seen packet loss but no CE
		0 connection using ECN have seen packet loss and CE
		0 connection using ECN received CE but no packet loss
		0 connection fell back to non-ECN due to SYN-loss
		0 connection fell back to non-ECN due to reordering
		0 connection fell back to non-ECN due to excessive CE-markings
user@MacBook-Pro ~> 
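
To track these counters over time, the output above can be turned into a dictionary. This is a rough Python sketch that assumes the "count, then description" line layout shown above; `parse_ecn_counters` and `ecn_ever_negotiated` are hypothetical helper names. Going by the counter names, a zero in the "attempted to negotiate ECN" lines would suggest the host itself never requested ECN, which is worth ruling out before blaming the path.

```python
import re

# Counter lines look like "\t0 client connection attempted to negotiate ECN".
# Shell prompt lines do not start with a number and are skipped.
LINE = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def parse_ecn_counters(output):
    """Map each counter description to its integer value."""
    counters = {}
    for line in output.splitlines():
        match = LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def ecn_ever_negotiated(counters):
    """True if any 'successfully negotiated ECN' counter is non-zero."""
    return any(
        value > 0
        for name, value in counters.items()
        if "successfully negotiated ECN" in name
    )
```

Feeding it the captured text of the command above gives a dict keyed by the counter description, which can be compared between two runs.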

I still regard BBR as the work of the devil. A way of making Google's traffic outcompete anything else...until we're all running BBR.

Hell, HTTP/3 and HTTP/2 are/were based on standards developed by Google. I'm pretty sure we're going to go that way with BBR too, given that they've got the manpower/funding/hours to throw their weight around in the IETF.

For example, Cloudflare, Netflix, and Amazon CloudFront are also using BBR.

It may even be commonly used in Chinese sites (Alibaba/Baidu) as well now – a lot of the BBR review papers come out from Chinese universities. (Proliferation in the East & West)

BBR is actually developed by a surprisingly small team and with surprisingly low resources IIRC. The reason why BBRv1 does not support ECN is/was not lack of interest, but lack of available resources I have heard...

In the meantime, what's a sane alternative to both BBR and the default (reno/cubic)?
I mean, most people want to try BBR as an alternative to the default, so what would be a better option to try? (I'm thinking about either Vegas or yeAH; but I don't know how they play with ECN)

Sorry if I misunderstood... Will using BBR with sqm-scripts actually degrade performance (increase bufferbloat)?

Yes, BBR is not the best TCP congestion control algorithm for a path with an RFC 3168-compliant ECN-marking or dropping AQM like fq_codel or cake. That said, it will most likely not be catastrophic, but keep in mind that BBR typically gets an edge over more traditional TCP CCs like Reno or Cubic. BBRv2 is designed to play nicer with AQMs in general, but might not be RFC 3168 compliant.

The good news is that with cake's host fairness and integration of a BLUE like algorithm, persistent offenders will simply have more of their packets thrown away.

https://ripe76.ripe.net/presentations/10-2018-05-15-bbr.pdf also
http://www.justinesherry.com/papers/ware-imc2019.pdf

My Qnap NAS has the option of using BBR for remote backups - I've turned it off. My CAKE instance does amazing work in keeping latency under control (<3ms average increase on fully utilised link) with per IP fairness too - I cannot tell when my backups are running.

Until I see reports of BBR2's fairness and response to ECN, I simply won't use it or BBR if I have a choice.

The second paper does a much better job at explaining what's wrong with the non-coexistence of BBRv1 with other TCP CCAs.
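
Coexistence of this kind is often summarised with Jain's fairness index, J = (sum of x)^2 / (n * sum of x^2), where x are the per-flow throughputs. It is 1.0 when all flows get equal shares and approaches 1/n when a single flow takes everything. This is a generic metric and a sketch of mine, not something taken from either linked document; `jain_index` is just an illustrative name.

```python
def jain_index(throughputs):
    """Jain's fairness index for a list of per-flow throughputs.

    Returns a value between 1/n (one flow gets everything) and 1.0
    (all flows get the same share).
    """
    n = len(throughputs)
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if n == 0 or squares == 0:
        raise ValueError("need at least one flow with non-zero throughput")
    return (total * total) / (n * squares)
```

For example, measuring one BBR flow and one Cubic flow sharing a bottleneck and passing both throughputs in gives a single number that can be compared across kernels or qdiscs.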

So is this why when I look at YouTube's "Stats for nerds", the graph for network activity periodically changes to red/orange? A BBR flow being reined in by Cake? Interesting.

I remember back in my DD-WRT days I always selected Westwood as that gave me the lowest latencies when visiting websites. I guess things have changed now...

I don't think so. Westwood+/Reno are very good, even today, and if I understand correctly, they don't interfere with SQM.

While it is quite an old commit, I'm pretty sure what has been said in the commit message still applies today (for general usage).

Now I do agree that re-benchmarking needs to be done given that it has been 7 years since.

OK, Westwood is not the fastest. Probably it is now; to be honest, it's been a long trip since Linux 3.0. Anyway, it doesn't matter.
I'm just thinking of what we can do as an alternative to CUBIC, given that BBR breaks SQM (hint: Westwood doesn't).

Well, I just mentioned my experience with old WRT54G running ancient 2.4 kernel. I’m sure a lot has changed since then, both on client and server side.

How can this be benchmarked?
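
One possible approach is to send the same amount of data over the same path with each algorithm and compare the results. Here is a rough Python sketch under several assumptions: it runs on Linux (where `socket.TCP_CONGESTION` is defined), the chosen algorithm is loaded and permitted for the user, and something at `host:port` accepts and discards the data. The function names and default sizes are arbitrary choices for illustration.

```python
import socket
import time


def throughput_mbps(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    return num_bytes * 8 / seconds / 1e6


def timed_send(host, port, algorithm, total_bytes=50_000_000, chunk=64 * 1024):
    """Send about total_bytes to host:port using the given congestion control.

    Returns throughput in Mbit/s as seen by the sender. Linux only:
    socket.TCP_CONGESTION is not defined on other platforms.
    """
    payload = b"\0" * chunk
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Select the algorithm for this socket only, e.g. "bbr" or "cubic".
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algorithm.encode())
        sock.connect((host, port))
        start = time.monotonic()
        while sent < total_bytes:
            sock.sendall(payload)
            sent += chunk
        elapsed = time.monotonic() - start
    return throughput_mbps(sent, elapsed)
```

Running `timed_send(host, port, "cubic")` and `timed_send(host, port, "bbr")` several times each, before and after the patch, gives comparable figures. The timing stops once the data has been handed to the kernel rather than acknowledged by the receiver, so the result is only approximate and works best with large transfers. Throughput alone also says nothing about bufferbloat, so it is worth running a ping alongside to see the latency under load.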
