Hi everyone,
I have a silly question. Or maybe not so silly.
We build mesh routers for crisis response (refugee camps, disaster zones, festivals). Our hardware is the 8dev Jalapeno / MeshPoint.One - IPQ4018, quad-core Cortex-A7 @ 717 MHz, 256 MB RAM. Old hardware by today's standards, but it's what's deployed in the field and we can't swap it out.
We've been doing a deep dive into SQM performance on this platform and found something interesting that I'd love the community's input on.
THE SITUATION
IPQ4018 has NO NSS cores.
So all packet processing is software-only on the ARM cores.
Current performance:
- Raw forwarding (no SQM): ~780 Mbps
- With software flow offload: ~950 Mbps
- With CAKE (single-queue): ~200-250 Mbps <-- the problem
- With fq_codel (no shaping): ~600-800 Mbps
CAKE on a single core is the bottleneck. Meanwhile 3 cores sit mostly idle. Classic.
WHAT I FOUND
- The IPQ4018 EDMA driver exposes 4 TX queues per netdev (EDMA_NETDEV_TX_QUEUE = 4 in edma.h, confirmed in both legacy essedma and new IPQESS drivers).
- CAKE_MQ (merged into net-next for Linux 7.0) creates one CAKE instance per hardware TX queue, distributing the work across cores.
- In theory: 4 TX queues x 4 CPU cores = CAKE distributed across all cores = potentially 600-800 Mbps with full QoS.
Has anyone tested CAKE_MQ on IPQ40xx hardware? Does the EDMA driver's multi-queue implementation actually distribute softirq processing across cores, or does it all end up on core 0 anyway?
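One rough way to answer the "does it all end up on core 0" part empirically is to sample /proc/softirqs before and after a load test and look at the per-CPU share of NET_RX / NET_TX. The sketch below assumes Python is available on the test box; the helper names are made up for illustration, and only the /proc/softirqs layout is taken as given.

```python
import time


def parse_softirqs(text):
    """Parse /proc/softirqs content into {softirq name: [count per CPU]}."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        table[name.strip()] = [int(x) for x in rest.split()[:ncpu]]
    return table


def share_per_cpu(before, after, name):
    """Fraction of the softirqs of type `name` handled by each CPU between two samples."""
    deltas = [a - b for a, b in zip(after[name], before[name])]
    total = sum(deltas)
    return [d / total if total else 0.0 for d in deltas]


def measure(interval=10, path="/proc/softirqs"):
    """Sample twice, `interval` seconds apart, while a throughput test is running."""
    with open(path) as f:
        before = parse_softirqs(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_softirqs(f.read())
    return {name: share_per_cpu(before, after, name) for name in ("NET_RX", "NET_TX")}
```

Calling `measure()` during an iperf3 run and seeing something like 90%+ of NET_RX on CPU0 would suggest the queues are not actually spreading the work; a roughly even split would suggest they are.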
THE BANDWIDTH PROBLEM
Our use case makes bandwidth estimation... interesting:
- WiFi mesh backhaul: anywhere from 10 to 800 Mbps depending on distance, interference, weather, number of mesh hops
- WiFi AP with clients: 5 to 150 simultaneous users, signal quality varies wildly
- WAN uplink: often Starlink or cellular, bandwidth oscillates throughout the day
We can't hardcode a bandwidth value for CAKE because nothing is fixed.
We know about cake-autorate and it looks promising for the WAN side. But for the mesh backhaul and AP interfaces, we're relying on kernel fq_codel + mac80211 per-station fq_codel + AQL, with no explicit shaping.
Question: Is there a simple approach for periodic bandwidth measurement + SQM reconfiguration? Something like:
- Detect low-traffic period
- Run quick bandwidth probe (iperf3 to next mesh hop?)
- Reconfigure CAKE bandwidth parameter
- Repeat every few hours
Or is this overengineering and fq_codel without shaping is genuinely "good enough" for mesh links where bandwidth is unknown?
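To make the probe-and-reconfigure idea concrete, here is a minimal sketch of one cycle. It assumes CAKE is already installed as the root qdisc on the interface, that an iperf3 server is running on the next hop, and that the test is TCP (so the JSON output has `end.sum_received`). The 0.9 headroom factor and the 5/800 Mbit clamps are illustrative guesses, not tested values, and the function names are hypothetical.

```python
import json
import subprocess


def measured_mbit(iperf_json):
    """Extract receiver-side throughput in Mbit/s from `iperf3 -J` output (TCP test)."""
    result = json.loads(iperf_json)
    return result["end"]["sum_received"]["bits_per_second"] / 1e6


def cake_rate_mbit(measured, headroom=0.9, floor=5, ceiling=800):
    """Shape a bit below the measured rate, clamped to assumed sane bounds."""
    return int(max(floor, min(ceiling, measured * headroom)))


def tc_command(dev, rate_mbit):
    """Build the tc invocation that updates the bandwidth of an existing root CAKE qdisc."""
    return ["tc", "qdisc", "change", "dev", dev, "root",
            "cake", "bandwidth", f"{rate_mbit}mbit"]


def probe_and_apply(dev, peer, seconds=5):
    """Run one probe against `peer` and reconfigure CAKE on `dev`."""
    out = subprocess.run(
        ["iperf3", "-c", peer, "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    ).stdout
    rate = cake_rate_mbit(measured_mbit(out))
    subprocess.run(tc_command(dev, rate), check=True)
    return rate
```

The "detect low-traffic period" and "repeat every few hours" steps are left out on purpose; a cron job that first checks interface byte counters would be one way to do it. Note that an active probe on a shared mesh link also steals airtime from users for those few seconds.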
WHAT I’M DOING NOW
Our current stack (all available today on OpenWrt 24.10):
- Gateway WAN: CAKE besteffort + cake-autorate (only interface with explicit shaping)
- Mesh backhaul: kernel fq_codel (default, zero config)
- WiFi AP: mac80211 per-station fq_codel + AQL (driver-level)
- LAN: kernel fq_codel (default)
- IRQ affinity + RPS/XPS distributed across all 4 cores
- NAPI budget tuned to 1000
- CPU governor: performance
This gets us to approximately 300-350 Mbps with CAKE on WAN. We're hoping CAKE_MQ on OpenWrt 25.12 will push that to 600+.
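For anyone wanting to reproduce the RPS/XPS part of the stack above, the sysfs masks could be generated roughly like this. This is a sketch, not our exact script: it assumes 4 RX and 4 TX queues on the device, pins TX queue N to CPU N, and lets every RX queue steer to all CPUs. The function names are hypothetical; the sysfs paths are the standard kernel ones.

```python
def cpu_mask(cpus):
    """Hex bitmask string for a set of CPU indices, as expected by rps_cpus/xps_cpus."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def xps_plan(dev, n_queues=4, n_cpus=4):
    """Map TX queue q to CPU q % n_cpus. Returns {sysfs path: hex mask}."""
    return {
        f"/sys/class/net/{dev}/queues/tx-{q}/xps_cpus": cpu_mask([q % n_cpus])
        for q in range(n_queues)
    }


def rps_plan(dev, n_queues=4, n_cpus=4):
    """Allow every RX queue to steer packets to all CPUs. Returns {sysfs path: hex mask}."""
    all_cpus = cpu_mask(range(n_cpus))
    return {
        f"/sys/class/net/{dev}/queues/rx-{q}/rps_cpus": all_cpus
        for q in range(n_queues)
    }


def apply_plan(plan):
    """Write each mask to its sysfs file (needs root; paths must exist on the device)."""
    for path, mask in plan.items():
        with open(path, "w") as f:
            f.write(mask)
```

For example, `apply_plan(xps_plan("eth0"))` followed by `apply_plan(rps_plan("eth0"))`, where `eth0` stands in for whatever the WAN netdev is called on your build.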
SPECIFIC QUESTIONS
1. CAKE_MQ on IPQ40xx: Has anyone tested it? Does the EDMA multi-queue implementation actually distribute softirq processing across cores?
2. SFE + egress qdisc: We confirmed from source that SFE calls dev_queue_xmit(), which preserves the egress qdisc. Is anyone running SFE + CAKE/fq_codel in production on IPQ40xx? Any gotchas beyond the known ingress/IFB issue?
3. Mesh QoS without bandwidth knowledge: For WiFi mesh links where bandwidth varies 10-800 Mbps, is fq_codel genuinely the right answer? Or are we leaving performance on the table?
4. cake-autorate on Starlink: Anyone running this combination? How well does it adapt to Starlink's bandwidth variations?
5. Am I missing something obvious? Any IPQ4018-specific optimizations we haven't considered?
We've documented our full analysis including per-interface recommendations, CPU impact measurements, and community network research. Happy to share the write-up privately if anyone is interested.
A NOTE ON DAVE TAHT
I want to say something personal here. Dave was a mentor to me. During the years we were building MeshPoint - mesh routers for refugee camps along the Croatian border in 2015-2016 - Dave was incredibly generous with his time and knowledge. He answered emails within hours, jumped on calls whenever I asked, and never once made me feel like my questions were too basic. He genuinely cared about getting networks right for the people who needed them most.
The fact that fq_codel + AQL "just works" on our mesh nodes without any configuration - that's Dave's legacy in every packet we forward. We're trying to build on that foundation for crisis response networks where connectivity saves lives.
The 25.12 dedication is well deserved. Rest in peace, Dave.
Thanks for any insights. Happy to share our benchmark data and test methodology if useful.