BracketQos - Rust, EBPF, XDP, & cake, oh my!

moeller0 · September 22, 2022, 6:27pm

root@turris:~# cat /sys/class/net/eth2/queues/tx-0/byte_queue_limits/*
1000
0
7590
1879048192
0

Turris Omnia: BQL seems to work. The 3rd number (`limit`, since the shell expands `*` in alphabetical order) changes under load.
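For anyone repeating this check, a small sketch that prints each BQL file next to its name, so the values do not have to be matched up by position. The helper names `read_bql` and `format_bql` are made up for illustration, and the path in the usage comment is just the one from the output above.

```python
from pathlib import Path


def read_bql(queue_dir):
    """Return {file name: integer value} for one byte_queue_limits directory."""
    values = {}
    # Sort by name so the order matches what `cat .../byte_queue_limits/*` prints.
    for entry in sorted(Path(queue_dir).iterdir()):
        if entry.is_file():
            values[entry.name] = int(entry.read_text().strip())
    return values


def format_bql(values):
    """Render the values one per line, name first."""
    return "\n".join(f"{name:<10} {value}" for name, value in values.items())


# Usage on a router (path taken from the post above):
#   print(format_bql(read_bql("/sys/class/net/eth2/queues/tx-0/byte_queue_limits")))
```

If `limit` moves while the link is loaded, BQL is active on that queue; if the directory is missing or `limit` never changes, the driver most likely lacks BQL support.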

dtaht · September 22, 2022, 7:44pm

@Borromini doesn't look like BQL is there. Sigh. It's only 8 lines of code, and makes a huge difference, especially for bidirectional traffic. There's a mini-BQL tutorial over here: http://www.taht.net/~d/broadcom_aug9_2018.pdf

Borromini · September 22, 2022, 7:45pm

@moeller0 That's on OpenWrt or on Turris OS?

@dtaht I'm getting an 'SSL_ERROR_RX_RECORD_TOO_LONG' from your webserver here?

dtaht · September 22, 2022, 7:49pm

that is an http not https address.

Borromini · September 22, 2022, 7:50pm

Sorry. Overly zealous Firefox.

moeller0 · September 22, 2022, 8:41pm

That is on OpenWrt 19-based TurrisOS 5.4.3. I bought this thing because I wanted automatic updates (from a source I trust) and it has been delivering; however, it currently lags behind upstream OpenWrt*. But I assume that if BQL is available in the old 4.14-series kernel, it will also be available in more modern mvneta drivers.

*) This is not a complaint, just a statement of current fact; upgrades to a more recent OpenWrt base are in the works and will hopefully be rolled out soon. If I wanted/dared I could already test the upcoming TOS6.

dtaht · September 22, 2022, 11:32pm

wow. that is a highly capable driver, with xdp and tso support, but no bql. doesn't look hard to add though. https://elixir.bootlin.com/linux/v6.0-rc6/source/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c#L4417

Borromini · September 23, 2022, 2:09am

I'd be happy to help (and I read the PDF you linked to) but my C skills are extremely limited. I'll gladly test any patches you'd have though, ideally for 5.10 (device is on 22.03).

dtaht · September 23, 2022, 2:16am

Don't have the hardware, and doing BQL really requires having it, due to how hard it is to find all the sources of resets from code inspection. But I'll put it on the todo list. 1024 tx descriptors is a bit much! At a gbit, 30k (22 big packets) is all BQL will put on the ring. Also the flow control to the switch was "interesting", I don't know how much buffering is in the switch itself.
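To put those two numbers side by side, a rough back-of-the-envelope sketch of how long each queue takes to drain at line rate. The 1514-byte frame size, reading "30k" as 30,000 bytes, and the assumption that every descriptor holds one full-size frame are mine, not from the post; the function name is hypothetical.

```python
LINK_RATE_BPS = 1_000_000_000  # 1 Gbit/s, as in the post
TX_DESCRIPTORS = 1024          # ring size mentioned in the post
FRAME_BYTES = 1514             # assumption: full-size Ethernet frame per descriptor
BQL_BYTES = 30_000             # assumption: "30k" read as 30,000 bytes


def drain_time_ms(queued_bytes, rate_bps=LINK_RATE_BPS):
    """Milliseconds needed to transmit queued_bytes at rate_bps."""
    return queued_bytes * 8 / rate_bps * 1000


full_ring_bytes = TX_DESCRIPTORS * FRAME_BYTES
print(f"full ring: {full_ring_bytes} bytes, {drain_time_ms(full_ring_bytes):.2f} ms")
print(f"BQL limit: {BQL_BYTES} bytes, {drain_time_ms(BQL_BYTES):.2f} ms")
```

Under these assumptions a completely full ring is on the order of 12 ms of queue sitting below the qdisc, against roughly a quarter of a millisecond with BQL in place.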

Lochnair · September 23, 2022, 8:39pm

I tend to agree, but in C it's often too easy to do this:

While a bit alien to me as well, I really do like Rust's memory safety features a lot.
Although admittedly I curse at the borrow checker sometimes

Half relatedly I came across this: https://github.com/carbon-language/carbon-lang
Another Google-backed language that aims to be a successor to C++. It's not ready for use yet, and likely won't be for a while, but I'll be keeping an eye on it.

moeller0 · September 24, 2022, 10:52am

What I wish for is a language that works well running inside an interpreter but that can also be compiled... for development and troubleshooting, an interpreted autorate implementation is so much easier to work with. (I guess if the compiler and development environment were small enough to include on a router that would also work, but I somehow doubt that is a viable way forward for low-storage all-in-one routers)...

Lochnair · September 24, 2022, 12:22pm

Seemed more appropriate to continue in the autorate thread, so I've posted my reply there instead

Lochnair · September 30, 2022, 1:12pm

I toyed around with getting tcp_info from a socket, seems to work, result here: https://github.com/Lochnair/tcp_info_test

I likely won't have time to add this into Crusader, so I'll leave that part for someone else
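For readers who want to see the general idea without opening the repo, here is a minimal Linux-only sketch in Python (not the Rust code from the link above). It assumes the long-standing leading part of the kernel's `struct tcp_info` (eight one-byte fields followed by 24 `u32` fields, up to `tcpi_total_retrans`); newer fields such as delivery and ECN counters come later in the struct and are deliberately left out. The function names are hypothetical.

```python
import socket
import struct

# Leading part of Linux's struct tcp_info: 8 x u8, then 24 x u32 (104 bytes).
TCP_INFO_FMT = "=8B24I"
TCP_INFO_LEN = struct.calcsize(TCP_INFO_FMT)

U32_FIELDS = (
    "rto", "ato", "snd_mss", "rcv_mss",
    "unacked", "sacked", "lost", "retrans", "fackets",
    "last_data_sent", "last_ack_sent", "last_data_recv", "last_ack_recv",
    "pmtu", "rcv_ssthresh", "rtt", "rttvar",
    "snd_ssthresh", "snd_cwnd", "advmss", "reordering",
    "rcv_rtt", "rcv_space", "total_retrans",
)


def parse_tcp_info(raw):
    """Decode the first 104 bytes of a tcp_info buffer into a dict."""
    if len(raw) < TCP_INFO_LEN:
        raise ValueError(f"need {TCP_INFO_LEN} bytes, got {len(raw)}")
    fields = struct.unpack(TCP_INFO_FMT, raw[:TCP_INFO_LEN])
    info = {"state": fields[0], "ca_state": fields[1], "retransmits": fields[2]}
    info.update(zip(U32_FIELDS, fields[8:]))  # rtt and rttvar are in microseconds
    return info


def read_tcp_info(sock):
    """Query TCP_INFO on a connected TCP socket (Linux only)."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, TCP_INFO_LEN)
    return parse_tcp_info(raw)
```

Calling `read_tcp_info(sock)` on a timer during a test run, or once at the end, would give values like `rtt`, `snd_mss` and `total_retrans` for that one flow.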

dtaht · September 30, 2022, 1:43pm

@zoxc - even incremental progress like this is very, very helpful. Being able to see tcp marks and drops would be so great to also have in crusader, in addition to the measurement loss metric. Also, rtt, retransmits.... ton of useful info in tcp_info!!!!! My big hope was to be able to collect data from the socket every 10ms or so but even just at the end of the run really helps.

thx @lochnair for taking a stab at it!

https://github.com/cloudflare/quiche looks easier to instrument than tcp!

dtaht · September 30, 2022, 1:59pm

@borromini - I spoke with the mvpp2 maintainer about adding BQL. Are you in a position to test a backport?

Borromini · September 30, 2022, 2:23pm

Sure enough! I backported 5.15 to 22.03 so I can test 5.15, I assume that would make things slightly easier?

moeller0 · September 30, 2022, 2:34pm

Like MSS size! Just keep in mind that there will/should be one info record per TCP flow used in a test, and these can be quite different. For example, with the go-responsiveness networkQuality tool I occasionally see that a test run uses both IPv4 and IPv6 flows*; the reported TCP info, however, tries to report only an aggregate**. flent does this right and reports results for each test flow, but that can turn into quite a long list...

*) No clue why though, but I can clearly see IPv4 and IPv6 measurement flows in parallel.
**) With the tool easily ramping up into the 20 and above parallel flows, I can see the need for some aggregation, but e.g. for MSS aggregating IPv4 and IPv6 flows is not all that helpful.
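One possible middle ground between a single aggregate and a full per-flow list, sketched under assumptions: summarize per address family. The function name and the sample values are hypothetical (1460 and 1440 are simply the usual MSS values for IPv4 and IPv6 on a 1500-byte MTU), not output from any of the tools mentioned.

```python
from collections import defaultdict


def mss_by_family(flows):
    """Summarize MSS per address family.

    flows: iterable of (family label, snd_mss) pairs, one per TCP flow.
    """
    grouped = defaultdict(list)
    for family, snd_mss in flows:
        grouped[family].append(snd_mss)
    return {
        family: {"flows": len(values), "min": min(values), "max": max(values)}
        for family, values in grouped.items()
    }


# Hypothetical test run with three IPv4 flows and two IPv6 flows.
sample = [("IPv4", 1460), ("IPv4", 1460), ("IPv4", 1448), ("IPv6", 1440), ("IPv6", 1440)]
for family, summary in sorted(mss_by_family(sample).items()):
    print(family, summary)
```

This keeps the report to one line per family even with 20 or more parallel flows, while not mixing IPv4 and IPv6 MSS values together.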