Making cake multicore - or other new features?

Probably the number 1 request for cake is to make it shape at 1 Gbit/s on non-x86 hardware. The most often proposed means of doing so is to somehow make it multicore, and that, at the time I last looked at it, was hard.

Conceptually, an implementation is simple: multiple instances of cake, each getting flows from a different core via the mq qdisc, but all sharing one bandwidth parameter. Within cake, that would merely require an atomic locked update to that shared variable or, better, something protected by RCU. I never did figure out how to do either! On 64-bit arches, it looked like two atomic instructions (not kidding).
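To make the shared-variable idea concrete, here is a minimal userspace sketch in C11, not kernel code and not how sch_cake is actually written. It assumes a hypothetical `shared_shaper` struct holding a single atomic "earliest next transmit time" that every per-queue instance charges its packets against with a compare-and-swap. The names `shared_shaper`, `shared_shaper_init` and `shared_shaper_try_send` are invented for illustration, and the integer `ns_per_byte` is a simplification that loses precision at high rates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: one per shaped link, referenced by every
 * per-queue shaper instance. This is NOT an sch_cake structure. */
struct shared_shaper {
    _Atomic uint64_t next_tx_ns; /* earliest time the next packet may leave */
    uint64_t ns_per_byte;        /* inverse of the shaped rate (integer, lossy) */
};

static void shared_shaper_init(struct shared_shaper *s,
                               uint64_t rate_bytes_per_sec)
{
    /* Integer ns-per-byte only works up to 1e9 bytes/s; a real shaper
     * would use fixed-point arithmetic instead. */
    assert(rate_bytes_per_sec > 0 && rate_bytes_per_sec <= 1000000000ULL);
    atomic_init(&s->next_tx_ns, 0);
    s->ns_per_byte = 1000000000ULL / rate_bytes_per_sec;
}

/* Called by any instance, on any core, before it releases a packet.
 * Returns true and charges the packet to the shared schedule if the link
 * is free at now_ns; returns false if the caller has to wait. */
static bool shared_shaper_try_send(struct shared_shaper *s,
                                   uint64_t now_ns, uint32_t pkt_len)
{
    uint64_t next = atomic_load_explicit(&s->next_tx_ns,
                                         memory_order_relaxed);
    for (;;) {
        if (now_ns < next)
            return false; /* shared budget already spent by some instance */

        uint64_t updated = now_ns + (uint64_t)pkt_len * s->ns_per_byte;
        if (atomic_compare_exchange_weak_explicit(&s->next_tx_ns, &next,
                                                  updated,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
        /* CAS failed: 'next' now holds the value another instance wrote
         * (or is unchanged after a spurious failure); re-check and retry. */
    }
}
```

In the uncontended case this is one atomic load plus one compare-and-swap per packet. What the sketch leaves out is the hard part: fairness between instances, batching to limit cache-line bouncing, and fitting any of this into the qdisc locking model.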

In the real world, perhaps some other factors dominate that could also be improved. For example, when I last looked, inbound traffic through the ifb ended up being crunched into a single tx path long before it got to cake (this was 5+ years ago, so perhaps that's got better?).

The XDP-based LibreQoS approach (https://github.com/rchac/LibreQoS) is looking really promising, but I don't know if anyone has tried to get it to work on openwrt, and that implementation is geared more to ISP needs.

What other features is cake missing, going forward? I'm loving the qosify work. I think the ack-filter could be improved (and, in fact, also added to wifi).

What features could be ripped out?

10Gbit/sec or bust!

PS I'd dearly like to find a way to get cake offloaded into more hardware. In every case I've poked into, it requires a deal with the vendor, and the two offload engines I've looked at were very, very weird. I have often thought of tackling a prototype in an FPGA.

On 23 November 2021 08:32:06 CET, Dave Taht <dave.taht@gmail.com> wrote:

The context of my question is basically this:

Is cake baked? Is it done?

How about per MAC address fairness (useful for ISPs and to treat IPv4/6 equally)?

How about configurable number of queues (again helpful for ISPs)?

How about MPLS?

From toke:

Dave Taht <dave.taht@gmail.com> writes:

On Tue, Nov 23, 2021 at 2:39 AM Toke Høiland-Jørgensen <toke@toke.dk> wrote:

Sebastian Moeller <moeller0@gmx.de> writes:

Hi Dave,

On 23 November 2021 08:32:06 CET, Dave Taht <dave.taht@gmail.com> wrote:

The context of my question is basically this:

Is cake baked? Is it done?

How about per MAC address fairness (useful for ISPs and to treat
IPv4/6 equally)?

How about configurable number of queues (again helpful for ISPs)?

FWIW I don't think CAKE is the right thing for ISPs, except in a
deployment where there's a single CAKE instance per customer. For
anything else (i.e., a single shaper that handles multiple customers),
you really need hierarchical policy enforcement like in a traditional
HTB configuration. And retrofitting this on top of CAKE is going to
conflict with the existing functionality, so it probably has to be a
separate qdisc anyway.

What progress has been made on breaking the HTB locks in the last few
years?

None. Don't see that happening any time soon; just the simple pfifo_fast
qdisc is uncovering all kinds of bugs when running in lockless mode.

Jesper basically solved the contention issue by partitioning the traffic
and running multiple instances:

Doesn't work for bandwidth sharing across instances, though, so it
solves the ISP "separate rates per customer" case, but not the CAKE
"shape a single link" case.

Given the enormous number of hw tx/rx queues we see today (64+ on
10gbit), trying to charge off bandwidth per queue in a cake-derived
shaper and protecting the merge with rcu seemed plausible...

Yeah, that was what I was going to try, but it turned out to be
decidedly non-trivial to make sch_cake itself mq-aware, so I gave up. My
hope is that this will be possible once we get sch_bpf, so we can just
have separate instances but they can share a single atomic var for the
bandwidth sync...

-Toke

Makes sense.

Gotcha.

Nice!

BSD port(s). This might also include something like BQL and the ability to run fq_codel also at line rate on those OSes rather than requiring PF>