LAN data path is now end-to-end: two upstream-applicable bugs found
Picking up where I left off in the thread: the mainline OpenWrt port on the Flint 3 (IPQ5332 + RTL8372N + 2× QCN6274) now has a fully functional Ethernet data path, including procd-mode boot and SSH over LAN. Throughput is wire-rate for a 1 Gbps client (iperf3 sustained at 942 Mbit/s, 0 TX errors, 0 retransmits in MIB).
This session closed two bugs that had been masking each other for weeks. Both are general: they would affect any IPQ53xx board using the in-tree qcom_ppe driver with a DSA-tagged switch, not just the Flint 3.
Bug 1: edma_rx_reap leaks one skb per NAPI poll
In drivers/net/ethernet/qualcomm/ppe/edma_rx.c, the linear-packet path pre-fetches the next iteration's skb at the tail of the loop body:
    while (likely(work_to_do--)) {
        /* ...process skb_N... */
        next_skb = rxdesc_ring->rxfill->skbs[idx];
        rxdesc_ring->rxfill->skbs[idx] = NULL;
        if (unlikely(!next_skb)) { /* ... break ... */ }
    }
work_to_do-- is a post-decrement, so on the last iteration of a batch the loop body still runs and the prefetch still takes skbs[idx] and NULLs the slot, but the while header then sees work_to_do == 0 and exits. The prefetched skb is never processed, and that slot stays NULL until the next refill cycle. Empirically this drops exactly one frame per NAPI poll. With work_to_do == 1 (one IRQ per descriptor on this SoC), every poll delivers one frame and discards the next, so the result is 50% packet loss.
Fix: wrap the in-loop prefetches (linear + scatter) with if (work_to_do) so the prefetch only runs when there will actually be a next iteration. After this fix, ping -c 1000 lands at 1000/1000 received, 0% loss.
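For anyone who wants to see the off-by-one without a board, here is a minimal user-space sketch of the loop shape described above. It is a toy model under my own assumptions, not the driver code: toy_ring, toy_take, toy_reap and toy_simulate are made-up names, and the model assumes each poll starts by taking the buffer at the current index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 64

/* Toy model only: none of these names exist in the qcom_ppe driver. */
struct toy_ring {
	int bufs[RING_SIZE];   /* backing storage standing in for skbs */
	int *skbs[RING_SIZE];  /* non-NULL slot == buffer waiting to be reaped */
	unsigned int idx;      /* next slot to take */
};

struct toy_stats {
	int delivered;
	int leaked;
};

/* Take the buffer at the current slot, NULL the slot, advance the index. */
static int *toy_take(struct toy_ring *r)
{
	int *skb = r->skbs[r->idx];

	if (skb) {
		r->skbs[r->idx] = NULL;
		r->idx = (r->idx + 1) % RING_SIZE;
	}
	return skb;
}

/* One poll with the given budget; guard selects the if (work_to_do) fix. */
static void toy_reap(struct toy_ring *r, int work_to_do, bool guard,
		     struct toy_stats *st)
{
	int *skb = toy_take(r);

	while (skb && work_to_do--) {
		st->delivered++;          /* "process skb_N" */
		skb = NULL;
		if (!guard || work_to_do) /* tail-of-loop prefetch */
			skb = toy_take(r);
	}
	if (skb)                          /* taken from the ring, never processed */
		st->leaked++;
}

/* Run `polls` polls of `budget` descriptors each over a freshly filled ring. */
struct toy_stats toy_simulate(int polls, int budget, bool guard)
{
	struct toy_ring r = { .idx = 0 };
	struct toy_stats st = { 0, 0 };

	/* Keep the toy simple: never run past the buffers posted up front. */
	assert(budget > 0 && polls * (budget + 1) <= RING_SIZE);
	for (int i = 0; i < RING_SIZE; i++)
		r.skbs[i] = &r.bufs[i];
	for (int i = 0; i < polls; i++)
		toy_reap(&r, budget, guard, &st);
	return st;
}
```

In this model toy_simulate(32, 1, false) delivers 32 buffers and leaks 32, which is the 50% loss case; toy_simulate(32, 1, true) delivers 32 and leaks none. With a budget of 8, the unguarded variant leaks one buffer per poll.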
There's a second, related issue worth bundling in the same patch: on IPQ5332 the chip overwrites word2 of the rxdesc before the kernel reads it back, so the index-based opaque pattern (looking up skbs[next_rxdesc_pri->word2 & MASK]) finds the wrong slot. The chip does preserve FIFO order between rxfill prod_idx and rxdesc cons_idx, so the kernel can use cons_idx directly as the skbs[] index. With both fixes together, RX is rock-solid.
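The indexing change can be illustrated the same way. This is a rough sketch assuming a power-of-two ring and a made-up garbage value for the clobbered word2; toy_rxdesc, TOY_MASK and toy_count_wrong_lookups are hypothetical names, and ring wrap-around is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING 8u
#define TOY_MASK (TOY_RING - 1u)

/* Hypothetical descriptor: only the opaque word matters for this sketch. */
struct toy_rxdesc {
	uint32_t word2;
};

/*
 * Post TOY_RING buffers in FIFO order, let the "hardware" complete them in
 * the same order, then count how many lookups return the wrong buffer.
 */
int toy_count_wrong_lookups(bool hw_clobbers_word2, bool use_cons_idx)
{
	int skbs[TOY_RING];
	struct toy_rxdesc desc[TOY_RING];
	int wrong = 0;

	for (uint32_t prod_idx = 0; prod_idx < TOY_RING; prod_idx++) {
		skbs[prod_idx] = (int)prod_idx + 100;       /* buffer id */
		desc[prod_idx].word2 = prod_idx;             /* opaque = slot index */
		if (hw_clobbers_word2)
			desc[prod_idx].word2 = 0xdeadbeefu;  /* made-up garbage */
	}

	for (uint32_t cons_idx = 0; cons_idx < TOY_RING; cons_idx++) {
		/* Either trust the opaque word or rely on FIFO ordering. */
		uint32_t slot = use_cons_idx ? cons_idx
					     : (desc[cons_idx].word2 & TOY_MASK);

		assert(slot < TOY_RING);
		if (skbs[slot] != (int)cons_idx + 100)
			wrong++;
	}
	return wrong;
}
```

With the opaque word intact both lookups agree. Once word2 is clobbered, the opaque lookup mostly lands on the wrong slot, while the cons_idx lookup is unaffected because it never reads word2.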
Bug 2: XGMAC powers up in cut-through with a tiny TX threshold
After Bug 1, RX worked but TX still failed. Specifically, frames larger than ~200 B from the SoC out to the wire would just vanish, and ip -s link show eth0 showed tx_errors and tx_fifo_errors incrementing in lock-step with each lost frame. The threshold was sharp and reproducible:
    ping -s 64 -c 10    → 10/10 received
    ping -s 200 -c 10   → 3/10
    ping -s 500 -c 10   → 0/10
    ping -s 1000 -c 10  → 0/10
    ping -s 1400 -c 10  → 0/10
That's a classic DesignWare XGMAC TX FIFO underflow signature. The qcom_ppe driver writes XGMAC_TX_CONFIG_ADDR (speed + TXEN) but never touches MTL_TXQ_OPERATION_MODE at XGMAC_BASE + 0x1100, so the MAC powers up with TSF=0 (cut-through) and the default tiny TTC. Anything larger than a few hundred bytes starts on the wire before the rest of the frame is in the FIFO, and the MAC underflows and aborts.
Fix: set TSF=1 (Store-and-Forward) and TXQEN=2 in MTL_TXQ_OPERATION_MODE at the end of ppe_port_xgmac_link_up:
    /* Bits 3:1 of MTL_TXQ_OPERATION_MODE: TSF is bit 1, TXQEN is bits 3:2 */
    mtl_mask = GENMASK(3, 1);
    mtl_val = BIT(1) | (2u << 2); /* TSF=1, TXQEN=2 */
    regmap_update_bits(ppe_dev->regmap, reg + 0x1100, mtl_mask, mtl_val);
After this, every frame size from 64 B to 1500 B delivers at 100% and iperf3 sustains 942 Mbit/s with zero tx_underflow_err. The per-frame latency cost is the frame's serialisation time (~1.2 µs for a full MTU at 10G), which is negligible.
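To make the register arithmetic in the three-line fix concrete, here is a small user-space sketch of the same read-modify-write. BIT and GENMASK are local stand-ins for the kernel macros, and toy_update_bits only mimics the masked update that regmap_update_bits performs; the function names are mine, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and GENMASK() on 32 bits. */
#define BIT(n)        (1u << (n))
#define GENMASK(h, l) (((~0u) << (l)) & (~0u >> (31 - (h))))

/* Masked update: only bits set in mask change, everything else is kept. */
static uint32_t toy_update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	assert((val & ~mask) == 0); /* value must fit inside the mask */
	return (old & ~mask) | (val & mask);
}

/* New MTL_TXQ_OPERATION_MODE value for a given old value: TSF=1, TXQEN=2. */
uint32_t toy_mtl_txq_store_forward(uint32_t old)
{
	uint32_t mtl_mask = GENMASK(3, 1);
	uint32_t mtl_val = BIT(1) | (2u << 2);

	return toy_update_bits(old, mtl_mask, mtl_val);
}
```

Starting from 0 this yields 0xA. The point of the masked update is that whatever is already in the other bits of the register (threshold, queue size and so on) survives the write untouched.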
I think upstream qcom_ppe should probably default to TSF=1 unconditionally for any XGMAC port, since cut-through requires careful feeder tuning and there's no obvious harm in store-and-forward for current OpenWrt use cases.
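The latency figure behind the "no obvious harm" argument is simple arithmetic, sketched here for anyone who wants to plug in other frame sizes or link speeds. serialisation_ns is a hypothetical helper, and it ignores preamble, FCS and inter-frame gap.

```c
#include <assert.h>
#include <stdint.h>

/* Time to clock `bytes` onto a link running at `mbps` Mbit/s, in ns. */
uint64_t serialisation_ns(uint32_t bytes, uint32_t mbps)
{
	assert(mbps != 0);
	/* bits / (Mbit/s) gives microseconds; scale by 1000 for nanoseconds. */
	return (uint64_t)bytes * 8u * 1000u / mbps;
}
```

serialisation_ns(1500, 10000) gives 1200 ns, matching the ~1.2 µs quoted above; the same frame on a 1 Gbit/s link would take 12000 ns.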
What this unblocks on Flint 3
- LAN1-4 reach 192.168.1.1, and ping RTT is sub-millisecond at any size.
- Procd boots cleanly with CONFIG_CMDLINE stripped of rdinit=/bin/sh. The earlier "PCS clk_set_rate hang in preinit" was fixed by the PCS skip-reconfig patch that detects U-Boot's existing 10GBASE-R config and leaves it alone.
- Dropbear starts. nc 192.168.1.1 22 returns SSH-2.0-dropbear with the full kex algorithm list immediately.
- The rtl8372n DSA driver registers with DSA_TAG_PROTO_RTL8_4. CPU port (port 3) IVL/SVL is set to 1 and FID/MSTI is 1 in the VLAN-1 entry. Without those bits the chip routes lookups to a different FID and drops ~85% of unicast frames even with a static LUT pin (this was a previous session's find).
What's still pending
- WAN port: RTL8221B PHY at USXGMII. Upstream rtl822x returns 0 in-band caps so managed = "in-band-status" fails; the in-tree workaround is to drop it and rely on MDIO polling.
- ath12k wireless: currently the QCN9274 firmware advertises MLO and mainline ath12k can't register MLO devices the way the firmware wants, so I have a patch that masks the MLO advertisement and lets ath12k come up as two independent non-MLO wiphys. WiFi 7 per-radio works; cross-radio MLD does not. Closing that gap is upstream ath12k work, not a board patch.
- The watchdog is currently status = "disabled"; needs proper procd integration before re-enabling.
Happy to share the two patches inline or as a branch link if anyone wants them. The fixes are mechanical and small (the EDMA one is one new if (work_to_do); the XGMAC one is one new regmap_update_bits) so anyone porting another IPQ53xx board can drop them in without much fuss.