DSA does (transparently) offload to the switch fabric. Traffic between LAN ports is accordingly handled within the switch fabric, in hardware and at line speed, and is never seen by the SoC's CPU.
Edit: Just to take an example, the realtek target, with its single-core 500 MHz RTL83xx mips4k SoC, has a rather underpowered CPU (it's fast enough to run LuCI smoothly, but you don't want it to do routing at any meaningful WAN speed). It does, however, support devices with 8 to 52 1000 Mbit/s Ethernet ports (including some 10 Gbit/s ports for RTL93xx). It uses a DSA switch driver and delivers full speed, plus features like link aggregation, at the line speed of the switch fabric (so e.g. 48 Gbit/s for the gs1900-48).
I am seeing in dmesg that mv88e6xxx_probe is being called twice during boot per the following printk on a WRT3200ACM. Shouldn't the probe only be called once?
Yep, a DSA-capable switch chipset still works as a plain standard Ethernet switch, and switch tags are pushed or popped in hardware by the switch. But it can bite you at L2 if you have set up a bridge between some Ethernet ports of the switch and the WiFi interfaces. Traffic between Ethernet ports and WiFi interfaces has to pass the CPU, i.e. the CPU needs to push or pop the switch tags. I hope we'll see more Linux drivers for hardware offloading (routing, NAT, PPPoE and so on). Unfortunately, some vendors of switch chipsets have declared their offloading APIs a state secret.
There is no material difference between swconfig and DSA in this regard; hardware offloading of wireless interrupts and processing is an orthogonal question (e.g. NSS).
If this is really true, then I would install a dumb/unmanaged switch for all LAN devices and connect this switch to the router. As a consequence, only traffic that needs routing goes to the router.
Edit: thanks for asking, so I made an upload test without HW offloading, and it is better:
No difference in download speed, 67MB/s
These tests were made with an Intel AX200 card on my laptop and a Cudy wr1300 AP (so: 866.7 Mbit/s, 80 MHz, VHT-MCS 9, VHT-NSS 2, Short GI)
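For context, here is a rough sketch of where the 866.7 Mbit/s figure comes from and how the measured 67 MB/s compares to it. The constants are the standard 802.11ac values for an 80 MHz channel; the function and variable names are made up for illustration, and 1 MB is assumed to mean 10^6 bytes.

```python
from fractions import Fraction

def vht_phy_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate,
                      streams, symbol_us):
    """PHY rate in Mbit/s: data bits per OFDM symbol divided by symbol time."""
    bits_per_symbol = data_subcarriers * bits_per_subcarrier * coding_rate * streams
    return float(bits_per_symbol) / symbol_us

# 80 MHz VHT: 234 data subcarriers.
# VHT-MCS 9: 256-QAM (8 bits per subcarrier) with coding rate 5/6.
# Short GI: 3.2 us symbol + 0.4 us guard interval = 3.6 us.
phy_mbps = vht_phy_rate_mbps(234, 8, Fraction(5, 6), streams=2, symbol_us=3.6)

# Measured throughput from the post, assuming 1 MB = 10^6 bytes.
measured_mbps = 67 * 8
efficiency = measured_mbps / phy_mbps

print(f"PHY rate:   {phy_mbps:.1f} Mbit/s")
print(f"Measured:   {measured_mbps} Mbit/s")
print(f"Efficiency: {efficiency:.0%} of the PHY rate")
```

The PHY rate is a raw signalling rate, so real throughput is always well below it because of protocol overhead and airtime sharing.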
Right, but take a look at the LuCI commits; there have been a ton of fixes since rc3. If there was a fix you needed, you can always install a 21.02-snapshot.
The LuCI commits you refer to are also two days old now; in the beginning this was only hours…
The master branch seems to be where the work is happening now.
But I was thinking more that this forum thread is at 160 posts now, and we are actually discussing how a network router's CPU and a switch work more than actual faults. We probably need a big DSA thread where everyone can discuss how DSA works, because it seems to be a hot topic.
So from my viewpoint it feels like 21.02 needs to try its wings, or it will probably never really fly, because it will never be fault-free. In the best of worlds that version will be named 21.02.7 or maybe 8.
Alternatively what exactly are we waiting for now?
It appears to me that the processing overhead for DSA's switch tags is a little bit larger than for VLAN tags. If I understand DSA correctly, the VLAN tags are replaced with the DSA tags, at least for the top VLAN tag. Do you happen to know how VLAN-tagged ports forwarding several VLANs are handled by DSA? Does that also work with just a switch tag, with the switch chipset inserting the corresponding VLAN tag for egress frames? Or does the CPU need to push both the VLAN tag and the switch tag?
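For readers wondering what "pushing" or "popping" a tag means in practice, here is a minimal sketch using the standard 802.1Q VLAN tag. It does not show a DSA switch tag: those formats are vendor-specific and are not modelled here. The function names are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag

def push_vlan_tag(frame: bytes, vid: int, pcp: int = 0) -> bytes:
    """Insert a 4-byte 802.1Q tag after the destination and source MACs."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= pcp <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (pcp << 13) | vid  # priority (3 bits), DEI (1 bit, left 0), VID (12 bits)
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]

def pop_vlan_tag(frame: bytes):
    """Remove the 802.1Q tag and return (vid, untagged_frame)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        raise ValueError("frame carries no 802.1Q tag")
    return tci & 0x0FFF, frame[:12] + frame[16:]

# Untagged frame: dst MAC, src MAC, EtherType 0x0800 (IPv4), dummy payload.
untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = push_vlan_tag(untagged, vid=10)
vid, restored = pop_vlan_tag(tagged)
```

Each push or pop rewrites the frame header, which is why it matters whether the switch hardware or the CPU has to do it.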
No worries! Ethernet traffic between switch ports is handled by the switch chipset, so you also get full throughput with DSA. And I'd like to add that WiFi bridged traffic has to pass the CPU too, since the WiFi interfaces are connected to the CPU.
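To summarise the rule of thumb from this discussion, here is a toy model of which frames end up on the CPU. It is only a sketch: the `Port` class, the `passes_cpu` function and the port names are made up for illustration, and it assumes no hardware offloading of routing or WiFi.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    name: str
    kind: str    # "switch" for a switch-fabric port, "wifi" for a wireless interface
    bridge: str  # name of the bridge (L2 domain) the port belongs to

def passes_cpu(src: Port, dst: Port) -> bool:
    """Return True if a frame from src to dst has to be handled by the CPU."""
    if src.bridge != dst.bridge:
        return True  # different L2 domains: routed in software (no offload assumed)
    if src.kind == "wifi" or dst.kind == "wifi":
        return True  # WiFi interfaces hang off the CPU, not the switch fabric
    return False     # switch port to switch port in one bridge: stays in hardware

lan1 = Port("lan1", "switch", "br-lan")
lan2 = Port("lan2", "switch", "br-lan")
wlan0 = Port("wlan0", "wifi", "br-lan")
wan = Port("wan", "switch", "wan")

for a, b in [(lan1, lan2), (lan1, wlan0), (lan1, wan)]:
    path = "CPU" if passes_cpu(a, b) else "switch fabric"
    print(f"{a.name} -> {b.name}: {path}")
```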
Good post, didn't notice that on the rc2 thread. Can't we just install kernel 5.10 on most devices though and get these improvements? I know many people are running 5.10 on mvebu with little to no issues.