A few days ago, a rewrite of libreqos.io (moving it from python to rust) showed up here:
On good hardware it is capable of very complex shaping, and by leveraging xdp and ebpf it can push over 10Gbit/s through thousands of cake instances...
How is openwrt doing on rust support?
How hard is it to use xdp and ebpf in openwrt nowadays?
Once upon a time I'd heard there was a plan to try to go from nftables to ebpf...
Aside from x86 multicores, what's the biggest multicore arm that openwrt runs on at present?
BracketQOS uses pping to continually monitor TCP latency on your network. It doesn't work like ping - it monitors actual customer TCP traffic, timing the latency of each individual connection.
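For anyone who hasn't looked at how pping pulls latency out of live traffic: it matches the TCP timestamp a packet carries out (TSval) against the echo of that timestamp coming back (TSecr). A minimal sketch of that idea in rust (names like FlowTimer are mine, not from pping itself):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Sketch of the passive-RTT idea behind pping: remember when each TCP
// timestamp value (TSval) was first seen leaving in one direction; when
// the same value comes back echoed (TSecr) in the other direction, the
// elapsed time is one round-trip sample for that flow.
struct FlowTimer {
    seen: HashMap<u32, Instant>,
}

impl FlowTimer {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    // Record the first departure of a given TSval.
    fn outbound(&mut self, tsval: u32) {
        self.seen.entry(tsval).or_insert_with(Instant::now);
    }

    // On seeing the echo, consume the entry and return the RTT sample.
    fn inbound_echo(&mut self, tsecr: u32) -> Option<Duration> {
        self.seen.remove(&tsecr).map(|t| t.elapsed())
    }
}

fn main() {
    let mut ft = FlowTimer::new();
    ft.outbound(1000);
    if let Some(rtt) = ft.inbound_echo(1000) {
        println!("rtt sample: {:?}", rtt);
    }
}
```

Real pping of course also keys on the flow 4-tuple and handles wrapping and expiry; this is just the core matching trick.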
@Lochnair can you see how pping is leveraged in the source? I had a look but haven't found it yet.
Based on the results on x86, pping and ebpf pping are too invasive to use at 10Gbit. Looking at that data, though, it seemed to me that hooking the ack-filter code was the most cpu-efficient way of applying a pping-like technique to cake.
Unless I missed something, it looks like it only uses pping to gather latency statistics presented in the GUI. I haven't been able to find anywhere it's used in the daemon to control the shaper.
It doesn't help that I always end up hitting a big problem an existing rust library doesn't solve. The libreqos.io code shells out like 10 times per customer, and your typical fork/exec takes, oh, 50us or so, when all it needs to do is tc qdisc replace dev eth0 root cake option, option, option. So I figured my first rust proggie was going to leverage pure netlink to do that, but, noooo... no lib for it.
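If anyone wants to see that fork/exec overhead for themselves, here's a rough micro-benchmark in rust. /bin/true stands in for tc (the cost being measured is the spawn, not the command); the ~50us figure will vary with machine and kernel:

```rust
use std::process::Command;
use std::time::{Duration, Instant};

// Rough measurement of fork/exec overhead per shell-out, the cost
// libreqos.io pays ~10 times per customer. /bin/true is a stand-in
// for tc; a netlink call would avoid the spawn entirely.
fn avg_spawn_cost(n: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..n {
        Command::new("/bin/true")
            .status()
            .expect("failed to spawn /bin/true");
    }
    start.elapsed() / n
}

fn main() {
    println!("avg fork/exec: {:?}", avg_spawn_cost(50));
}
```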
And yes, I do both ICMP type 8 and 13. I use nix (safe wrapper for libc) to open raw sockets and etherparse to handle parsing and crafting ICMP packets.
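For the curious, crafting an ICMP type 8 (echo request) by hand is tiny; the only fiddly part is the RFC 1071 internet checksum. A self-contained sketch without any crates (etherparse wraps the same thing):

```rust
// RFC 1071 internet checksum: one's-complement sum of 16-bit words.
fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in data.chunks(2) {
        let word = if chunk.len() == 2 {
            u16::from_be_bytes([chunk[0], chunk[1]]) as u32
        } else {
            (chunk[0] as u32) << 8 // odd trailing byte, zero-padded
        };
        sum += word;
    }
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16); // fold carries back in
    }
    !(sum as u16)
}

// Minimal ICMP echo request header: type, code, checksum, id, seq.
fn icmp_echo_request(id: u16, seq: u16) -> [u8; 8] {
    let mut pkt = [0u8; 8];
    pkt[0] = 8; // type 8: echo request
    pkt[1] = 0; // code 0
    pkt[4..6].copy_from_slice(&id.to_be_bytes());
    pkt[6..8].copy_from_slice(&seq.to_be_bytes());
    let csum = internet_checksum(&pkt);
    pkt[2..4].copy_from_slice(&csum.to_be_bytes());
    pkt
}

fn main() {
    // → [08, 00, e5, ca, 12, 34, 00, 01]
    println!("{:02x?}", icmp_echo_request(0x1234, 1));
}
```

Type 13 (timestamp request) is the same shape with three extra 32-bit timestamp fields; sending either still needs a raw socket, which is where nix comes in.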
I have a use for getting ICMP messages back - my long stalled out responsiveness test ( https://github.com/dtaht/wtbb ). I'd intended it to set the TTL to lower values to try and sense, within a tcp stream, where the bloat was. I might leverage crusader for this ( https://github.com/Zoxc/crusader ) or the basic goresponsiveness test...
but after outlining what I wanted it to do, I realized I would be damned if I ever wrote another C program in userspace again, and equally damned, it turned out, if I wanted to somehow get at underlying low level facilities like TCP_INFO, or icmp, in rust or go.
I don't think there's anything newer than Cortex A72's that are supported in master right now? So probably a toss-up between the Rpi4 (quad A72) and the newer Rockchips (Dual A72 + Quad A53 I think).
I'd probably discourage use of the RPi4 unless you already had one handy due to the supply issues and the complexities around getting decent ethernet phys connected. (Lack of free pci-e on the 4B means USB3 ethernet, or the need to check out the multitude of carrier boards if using the compute modules)
I imagine the Bpi-W3 is likely to be the next step up whenever that ends up shipping. Everything else is just... an actual devkit or a server?
Unless you count running the armvirt target within qemu on something like an Ampere or Apple Silicon cpu. I'd look into whether a Mac Mini would suffice for multi-gbit testing or whether you'd run into some weird technical limitation first.
RB5009UG+S+IN has a Cortex A72 quadcore CPU AFAIK, has SFP+ and 2.5 GbE, but isn't officially supported. There's a good port though and I'm running it as my router; just tested master with 5.15 on it the other day.
https://mikrotik.com/product/rb5009ug_s_in might be ideal!!! Do the ethernet ports have BQL? (check the various files: cat /sys/class/net/the_ethernet_device/queues/tx-0/byte_queue_limits/limit )
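A quick loop over sysfs answers that for every interface at once; the byte_queue_limits directory only exists when the driver actually implements BQL:

```shell
# Report BQL support (and the current limit) for each net device.
# byte_queue_limits/ appears under tx-0 only if the driver has BQL.
for dev in /sys/class/net/*; do
  name=$(basename "$dev")
  if [ -d "$dev/queues/tx-0/byte_queue_limits" ]; then
    echo "$name: BQL present, limit=$(cat "$dev/queues/tx-0/byte_queue_limits/limit")"
  else
    echo "$name: no BQL"
  fi
done
```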
Actually, even Rust std uses libc for sockets it seems, so I've given up on getting rid of that dependency.
But for ICMP type 8 smoltcp could be nice, since it handles the whole chain, but it doesn't expose type 13, so I'll stick with what I've got for now.
I hope that something like https://crates.io/crates/syscalls will mature and be supported on all architectures in stable, then we could drop libc for the syscall related stuff.
For TCP_INFO, Nix has facilities for getting socket options, but currently doesn't implement TCP_INFO.
The major thing that makes it a bit painful to add (at least for me being a Rust newb) is that the tcp_info struct uses bitfields, which Rust doesn't have builtin support for. So that'd take some research on how to handle.
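The bitfield part isn't actually that bad to do by hand, for what it's worth. linux/tcp.h declares, e.g., __u8 tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4; and a binding just has to read the whole byte and split it with shifts and masks. A sketch (assuming the usual GCC layout on a little-endian target, where the first-declared field sits in the low bits):

```rust
// struct tcp_info packs the two window-scale values into one byte:
//     __u8 tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;
// With GCC's little-endian bitfield layout, the first-declared field
// (snd_wscale) occupies the low nibble.
fn wscale_fields(byte: u8) -> (u8, u8) {
    let snd_wscale = byte & 0x0f;        // tcpi_snd_wscale
    let rcv_wscale = (byte >> 4) & 0x0f; // tcpi_rcv_wscale
    (snd_wscale, rcv_wscale)
}

fn main() {
    // A raw byte of 0x7a means snd_wscale=10, rcv_wscale=7.
    let (snd, rcv) = wscale_fields(0x7a);
    println!("snd_wscale={} rcv_wscale={}", snd, rcv);
}
```

The rest of tcp_info is plain u32/u64 fields, so a repr(C) struct plus one manual accessor like this would cover it.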
Are you aiming to run something like libreqos.io or Bracket QOS on OpenWrt? I'm not sure I see the use case. Are you looking for something power efficient to cover a single WISP tower?
For optimizing CAKE in a home setting I'd look more in the direction of something like software flow offloading, not sure how the current state of that is.
@Zoxc There is a lot of "we shipped openwrt 22.03.0, now what?", sort of speculation going on.
In my case I'm very frustrated that vendors like mikrotik have only just added cake, and are 6-7 years behind openwrt in many respects. They make good hardware, but....
One of the biggest problems linux has had in the last few years is that it bottlenecks on reads, not writes. Some bleeding edge stuff like vpp and dpdk appeared that could use 1/16th the cores to blast more packets... but lack any method at all for doing good queue control. XDP and ebpf are responses to that while still retaining good egress characteristics.
"Something power efficient for a tower" & well queued & encrypted backhaul would be great! I am allergic to python, but rust seems performant and easier to make run on openwrt/routerOS/vyatta/linux/etc
PS if you (or anyone) can figure out a way to poll tcp_info in rust in crusader, it would be GREAT. For those here that haven't played with it yet, it's here: https://github.com/Zoxc/crusader