[netifd | odhcpd] concept for placing DSA ports into same DHCP subnet?

Correct, it doesn't require the kernel to do that. The kernel bridge serves as configuration interface for the hardware switch.

No idea what you mean by that. If you want to have individual subnets / broadcast domains on each port, just configure them as independent interfaces.

Yes, because there must be only one interface per pool. If you want to treat multiple switch ports as one logical interface, you must bridge them with a Linux software bridge; there is no way around that.

I would not concur, though maybe you just left out the driver part, as in the bridge driver: the soft bridge (device) does nothing about the configuration of the hardware switch; the bridge driver, in conjunction with the switchdev framework, does.


Yes, that is what I meant and how I configured netifd in the meantime, except that the goal is not individual subnets but to have multiple DSA ports in the same subnet. It is just a bit more work to set up but saves the kernel soft bridge.


Thank you for the clarification.

Too bad, so it is just a bit more work to set up the config then. I would certainly prefer to do away with the soft bridge, so let us disagree here: I remain of the opinion that it is unnecessary overhead with DSA ports on a managed switch.

I made a simplified statement.

I am really curious on how you intend to implement that. Even the official upstream DSA documentation uses software bridges to handle multiple DSA ports as one logical interface on the host system. https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#gateway

I disagree with the assertion that the soft bridge in conjunction with DSA introduces overhead.

Separate section for each Lan/WLan port, mind adjusting the firewall zone and being diligent with the subnet planning, resulting in


Those are just showcases

In this documentation some common configuration scenarios are handled as showcases:

chosen for whatever use they had in mind, but nothing set in stone to be followed to the letter. DHCP on the soft bridge is just a bit more convenient than configuring each DSA port separately, though the latter is apparently my preference.

So different subnets / logical interfaces.

:slight_smile: What do you mean by (different) subnet? In this case, on IPv4, the Lan ports and the clients connecting to them are all within 192.168.84.0/24; one just needs to be diligent with the pool range for each port (to prevent clashes).

Having the same subnet range with different IPs on the various DSA ports might work - I didn't check myself how it behaves in practice, likely similar to having multiple IP addresses on a normal ethernet NIC. But I can't see how such a setup could work if you factor in WLAN. If you would set 192.168.84.*/24 on wlan0, routing will probably break down.

Furthermore it is not clear how dnsmasq reacts if multiple candidate interfaces are available to satisfy a given DHCP pool. Could be that it all works because it simply selects the first DSA port interface which then floods to all others, but that is all a lot of "ifs" and certainly not the way things are intended to work.

Also you're going to need a bridge anyway if you want a common broadcast domain for ethernet and wireless.

It also feels a bit odd that you're essentially adding N additional IP addresses plus ARP entries, associated source selection overhead etc. just to skip an unproven, claimed overhead of a software bridge.

I am totally fine if you implement your personal setups this way, but since you opened such a generic topic, then proposed a configuration deviating from the official DSA guidelines and marked your odd configuration as the solution, I am worried that you're setting a wrong and misleading precedent here.

As the documentation states, these are just showcases, not guidelines. Notwithstanding, your link

picks the gateway config whilst there is also the single port config https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#single-port (sans soft bridge)


Changed the solution, if you are more comfortable with it.


Asked as a question and had a discourse and you disagreed - suppose users reading through the thread will have more trust in you than me. You want me to put up an augmented :warning: in one particular post or the topic (if that is feasible)?


Did not test dnsmasq since not utilising it, odhcpd (as stated in the topic) is fine for me; might see how kea works out.

To my understanding, the single port config does not mean "treat switch as a single port" but rather "configure each port independently".

From quoting the documentation:

single port
Every switch port acts as a different configurable Ethernet port

So it is essentially the same as having three ethernet NICs installed on a PC.

Yep, which is what I am doing with my config

Not quite. Note how the DSA configuration example chooses adjacent, non-overlapping IP subnets while your config (at least judging from the screenshots) uses overlapping subnets (actually the same subnet on each port).
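
To illustrate the difference between adjacent and overlapping subnets, here is a minimal sketch in plain shell that tests whether two IPv4 prefixes share any address. The function names are made up for this example, and it assumes a shell with 64-bit arithmetic (bash, dash and busybox ash normally qualify). The prefixes are the ones mentioned in this thread.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print "first last" (as integers) for the address range of a CIDR prefix.
cidr_range() {
    addr=${1%/*}
    len=${1#*/}
    ip=$(ip_to_int "$addr")
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    first=$(( ip & mask ))
    last=$(( first | (~mask & 0xFFFFFFFF) ))
    echo "$first $last"
}

# Succeed (exit status 0) if the two prefixes share at least one address.
cidr_overlap() {
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Same /24 on two ports, as in the screenshots: overlapping.
cidr_overlap 192.168.84.0/24 192.168.84.0/24 && echo "overlap"
# Adjacent /30 prefixes: disjoint.
cidr_overlap 10.255.1.0/30 10.255.1.4/30 || echo "disjoint"
```

Two ranges overlap exactly when each one starts no later than the other one ends, which is all `cidr_overlap` checks.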

Yep, that had to be changed as pointed out

it happened already with two Lan ports on the same subnet, producing two gateways for the subnet and that did not go down well with routing. After rectification it works fine, incl. Wlan


:warning: downsides of this setup due to the absence of the soft bridge:

  • no cross traffic WLan <> Lan
  • no 802.1Q tag management with the bridge command
  • smaller subnet segments for each port
  • increased (initial) administrative effort for setting up

My understanding was as given above by tl71 and jow, but then presumably bridge fdb should indicate offload as per the documentation, which I am not seeing.

Offloading should apply to port-to-port traffic. Port-to-host (or port-to-CPU) traffic cannot be offloaded since the kernel needs to handle the packets in order to forward them to userspace.

Never seen this as well with the soft bridge, even with

assuming this refers to offloading L2 bridging. There is one curious condition mentioned:

the switchdev driver/device should support:
- Static FDB entries installed on a bridge port

which is curious when reading https://lore.kernel.org/netdev/20200419164251.GM836632@lunn.ch/

For DSA, we have assumed that the software bridge and the hardware bridge are independent, each performs its own learning. Only static entries are kept in sync.

The logic behind this reasoning escapes me

but never mind that ignorance, to test the

the node has been reverted to use the soft bridge (br-lan, enslaving lan0, lan1 and lan2).

then looked up the bridge fdb for lan2 (as example here)

44:8a:5b:47:0b:c2 dev lan2 master br-lan
44:8a:5b:47:0b:c2 dev lan2 vlan 1 self

then went ahead with

bridge fdb add 44:8a:5b:47:0b:c2 dev lan2 vlan 1 self

resulting in

44:8a:5b:47:0b:c2 dev lan2 master br-lan
44:8a:5b:47:0b:c2 dev lan2 vlan 1 self static

So it is static now but nothing about offload still. Next up

ip link set br-lan type bridge vlan_filtering 1

checking bridge fdb again, which now shows

44:8a:5b:47:0b:c2 dev lan2 vlan 1 master br-lan
44:8a:5b:47:0b:c2 dev lan2 master br-lan
44:8a:5b:47:0b:c2 dev lan2 vlan 1 self static

Still no joy with offload though.

  • missing something in setup/config (or some other sort of misconception) to get offload working?
  • kernel bug?
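
For repeating this check without eyeballing each dump, a small filter could look like the sketch below. It assumes that an offloaded entry, when the driver reports one, carries the literal word `offload` in the `bridge fdb` output; the function name is made up for this example, and the sample lines are the ones pasted above.

```shell
#!/bin/sh
# Print FDB lines that carry the "offload" flag; exit status is non-zero
# when no such line exists.
fdb_offloaded() {
    grep -w offload
}

# The lan2 entries from the dump above, used here as sample input.
sample='44:8a:5b:47:0b:c2 dev lan2 vlan 1 master br-lan
44:8a:5b:47:0b:c2 dev lan2 master br-lan
44:8a:5b:47:0b:c2 dev lan2 vlan 1 self static'

if printf '%s\n' "$sample" | fdb_offloaded; then
    echo "offloaded entries found"
else
    echo "no offloaded entries"
fi
```

On a live system the same filter could be fed with `bridge fdb show dev lan2 | fdb_offloaded`. Whether a given DSA driver ever sets the flag is a separate question that this does not answer.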

If frames are "offloaded" (forwarded by the hardware switch fabric), the ifconfig counters of the per-port netdevs will not increase, otherwise they will.

Here are two test scenarios, first bridged, then routed, between my laptop, a wrt3200acm and my desktop. The laptop was wired to lan1, the desktop to lan2.

Bridged scenario:

root@wrt3200acm:~# ip link add br0 type bridge
root@wrt3200acm:~# ip link set br0 up
root@wrt3200acm:~# ip link set lan1 up
root@wrt3200acm:~# ip link set lan2 up
root@wrt3200acm:~# ip link set lan1 master br0
root@wrt3200acm:~# ip link set lan2 master br0
root@wrt3200acm:~# ifconfig lan1 | grep "RX bytes"
          RX bytes:26480 (25.8 KiB)  TX bytes:3062 (2.9 KiB)
root@wrt3200acm:~# ifconfig lan2 | grep "RX bytes"
          RX bytes:3752 (3.6 KiB)  TX bytes:7388 (7.2 KiB)
root@desktop:~# ip addr add 10.255.1.1/24 dev eth0
root@desktop:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  4] local 10.255.1.1 port 5001 connected with 10.255.1.2 port 53724
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
^Croot@desktop:~# 
root@laptop:~# ip addr add 10.255.1.2/24 dev eth0
root@laptop:~# iperf -c 10.255.1.1
------------------------------------------------------------
Client connecting to 10.255.1.1, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.255.1.2 port 53724 connected with 10.255.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   934 Mbits/sec

root@laptop:~#
# Traffic counters only increased slightly due to unrelated packets
root@wrt3200acm:~# ifconfig lan1 | grep "RX bytes"
          RX bytes:28856 (28.1 KiB)  TX bytes:3062 (2.9 KiB)
root@wrt3200acm:~# ifconfig lan2 | grep "RX bytes"
          RX bytes:3959 (3.8 KiB)  TX bytes:7388 (7.2 KiB)

Routed scenario:

root@wrt3200acm:~# ip link del br0
root@wrt3200acm:~# ip link set lan1 up
root@wrt3200acm:~# ip addr add 10.255.1.1/30 dev lan1
root@wrt3200acm:~# ip link set lan2 up
root@wrt3200acm:~# ip addr add 10.255.1.5/30 dev lan2
root@wrt3200acm:~# iptables -I FORWARD -j ACCEPT
root@wrt3200acm:~# ifconfig lan1 | grep "RX bytes"
          RX bytes:41466 (40.4 KiB)  TX bytes:3524 (3.4 KiB)
root@wrt3200acm:~# ifconfig lan2 | grep "RX bytes"
          RX bytes:11724 (11.4 KiB)  TX bytes:7976 (7.7 KiB)
root@desktop:~# ip addr del 10.255.1.1/24 dev eth0
root@desktop:~# ip addr add 10.255.1.6/30 dev eth0
root@desktop:~# ip route add 10.255.1.0/30 via 10.255.1.5 dev eth0
root@desktop:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  4] local 10.255.1.6 port 5001 connected with 10.255.1.2 port 53654
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.06 GBytes   907 Mbits/sec
[  4] local 10.255.1.6 port 5001 connected with 10.255.1.2 port 53660
[  4]  0.0-10.0 sec  1.06 GBytes   907 Mbits/sec
^Croot@desktop:~# 
root@laptop:~# ip addr del 10.255.1.2/24 dev eth0
root@laptop:~# ip addr add 10.255.1.2/30 dev eth0
root@laptop:~# ip route add 10.255.1.4/30 via 10.255.1.1 dev eth0
root@laptop:~# iperf -c 10.255.1.6
------------------------------------------------------------
Client connecting to 10.255.1.6, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.255.1.2 port 53660 connected with 10.255.1.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.06 GBytes   908 Mbits/sec

root@laptop:~# iperf -c 10.255.1.6
------------------------------------------------------------
Client connecting to 10.255.1.6, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.255.1.2 port 53660 connected with 10.255.1.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.06 GBytes   908 Mbits/sec

root@laptop:~#
# Ifconfig counters increased by the amount of routed traffic
root@wrt3200acm:~# ifconfig lan1 | grep "RX bytes"
          RX bytes:2351622094 (2.1 GiB)  TX bytes:50939120 (48.5 MiB)
root@wrt3200acm:~# ifconfig lan2 | grep "RX bytes"
          RX bytes:40160782 (38.2 MiB)  TX bytes:2282781894 (2.1 GiB)

As can be seen, in a routed configuration (each port by itself, without a software bridge) the throughput is decreased and all forwarded frames are hitting the router CPU. In the bridged scenario, the iperf throughput is higher and almost all frames are bypassing the CPU.
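
The counter comparison can be put into a few lines of shell. This is only a sketch: the function names are invented for this example, and the "at least half of the transferred bytes" threshold is an arbitrary assumption, not something the kernel defines. The sample values are the lan1 RX counters from the two runs above.

```shell
#!/bin/sh
# Difference between two samples of a byte counter.
counter_delta() {
    echo $(( $2 - $1 ))
}

# Guess the forwarding path from a counter delta.
# Usage: forwarding_path BEFORE AFTER EXPECTED_BYTES
# Heuristic (an assumption): if the counter grew by at least half of the
# bytes pushed through the port, the frames passed through the CPU.
forwarding_path() {
    delta=$(counter_delta "$1" "$2")
    if [ "$delta" -ge $(( $3 / 2 )) ]; then
        echo cpu
    else
        echo offloaded
    fi
}

# lan1 RX bytes before and after each test; 1000000000 is a rough
# stand-in for the roughly 1 GByte that iperf moved per run.
forwarding_path 26480 28856 1000000000        # bridged
forwarding_path 41466 2351622094 1000000000   # routed
```

On a live system the two samples could be read from `/sys/class/net/lan1/statistics/rx_bytes` before and after the transfer instead of parsing ifconfig output.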

In a bridged scenario there is a hint: ethtool -k on br-lan yields

rx-vlan-offload: off [fixed]
tx-vlan-offload: on

Thank you for that contribution.


A kernel developer has been kind enough to elaborate:

With DSA, we have two sets of tables. The switch performs address learning, and the software bridge performs address learning.
No attempt is made to keep these dynamic FDB entries in sync. There is not enough bandwidth over the MDIO link to keep the two tables in sync.

and further

However, when you dump the FDB using the bridge command, you get to see the combination of both tables.

The hardware will perform forwarding based on its table, and the software bridge based on its table. However, if there is no entry in the hardware table for a given destination MAC address, it will forward the frame to the software bridge, so it can decide what to do with it.

For static FDB entries which the user adds, they are first added to the software bridge, and then pushed down to the switch.
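
As a toy model of the two-table behaviour described in the quote, the lookup order can be sketched in shell. Everything here is illustrative: the table contents, the `aa:bb:cc:dd:ee:*` addresses and the function names are made up, and flooding on an unknown destination is assumed as the software bridge's decision.

```shell
#!/bin/sh
# Hypothetical tables, one "MAC port" pair per line.
hw_fdb='44:8a:5b:47:0b:c2 lan2'
sw_fdb='44:8a:5b:47:0b:c2 lan2
aa:bb:cc:dd:ee:01 lan1'

# Look up a MAC in a table; print the port, or nothing if absent.
fdb_lookup() {
    printf '%s\n' "$1" | awk -v mac="$2" '$1 == mac { print $2; exit }'
}

# Decide who forwards a frame addressed to the given destination MAC.
forward() {
    port=$(fdb_lookup "$hw_fdb" "$1")
    if [ -n "$port" ]; then
        # The switch knows the destination: the frame never reaches the CPU.
        echo "hardware -> $port"
        return
    fi
    # No hardware entry: the frame is handed to the software bridge.
    port=$(fdb_lookup "$sw_fdb" "$1")
    if [ -n "$port" ]; then
        echo "software bridge -> $port"
    else
        echo "software bridge -> flood"
    fi
}

forward 44:8a:5b:47:0b:c2
forward aa:bb:cc:dd:ee:01
forward aa:bb:cc:dd:ee:02
```

The point of the model is only the order of the lookups: the hardware table is consulted first, and the software bridge sees a frame only when the hardware has no entry for it.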


As far as I know, that is unrelated to offloading the data path; it concerns 802.1Q tag handling (stripping / inserting), see https://github.com/torvalds/linux/blob/master/net/ethtool/common.c
