I spent a little time debugging this and reverting the switchdev portion of the 5.4.96 changes fixes the problem:
I saw a patch for linux-next that indicates there may be work to prevent traffic from getting dropped when filtering isn't set up. Not sure if it's related but it appears more work may be needed in setting up VLANs in the near term. Has anyone else seen this or have recommendations on how to address this?
The patch is related to determining if HW offload is supported or not (I believe). When the patches enabling HW offload rolled around they broke my network so I disabled them -- so I may actually have something I need to fix here.
Portion of 5.4.96 to REVERT:
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -461,10 +461,11 @@ static int __switchdev_handle_port_obj_add(struct net_device *dev,
extack = switchdev_notifier_info_to_extack(&port_obj_info->info);
if (check_cb(dev)) {
- /* This flag is only checked if the return value is success. */
- port_obj_info->handled = true;
- return add_cb(dev, port_obj_info->obj, port_obj_info->trans,
- extack);
+ err = add_cb(dev, port_obj_info->obj, port_obj_info->trans,
+ extack);
+ if (err != -EOPNOTSUPP)
+ port_obj_info->handled = true;
+ return err;
}
/* Switch ports might be stacked under e.g. a LAG. Ignore the
@@ -513,9 +514,10 @@ static int __switchdev_handle_port_obj_del(struct net_device *dev,
int err = -EOPNOTSUPP;
if (check_cb(dev)) {
- /* This flag is only checked if the return value is success. */
- port_obj_info->handled = true;
- return del_cb(dev, port_obj_info->obj);
+ err = del_cb(dev, port_obj_info->obj);
+ if (err != -EOPNOTSUPP)
+ port_obj_info->handled = true;
+ return err;
}
/* Switch ports might be stacked under e.g. a LAG. Ignore the
@@ -563,9 +565,10 @@ static int __switchdev_handle_port_attr_set(struct net_device *dev,
int err = -EOPNOTSUPP;
if (check_cb(dev)) {
- port_attr_info->handled = true;
- return set_cb(dev, port_attr_info->attr,
- port_attr_info->trans);
+ err = set_cb(dev, port_attr_info->attr, port_attr_info->trans);
+ if (err != -EOPNOTSUPP)
+ port_attr_info->handled = true;
+ return err;
}
/* Switch ports might be stacked under e.g. a LAG. Ignore the
5.4.97 didn't fix the issue for me -- I only mentioned 5.4.96 because that is where the issue started. I'm a little bit surprised nobody else is complaining, it makes me wonder if the issue isn't really isolated to my setup somehow.
I saw that patch series you linked and although it seems like a similar issue I'm not sure it addresses this. I'm certainly a DSA/switching noob, however, so I'll give it a try without my patch reversion if it gets pulled into trunk.
I applied the patch on 5.4.97 and I get the same effect:
I have 3 network bridges that are configured as untagged VLANs using DSA. The first bridge that is set up works, traffic ends up being dropped on all the others (no network is set up, the activity lights are completely inactive).
Reverting the single patch noted in the first post is still the fix.
From the text of that patch it almost seems like something isn't configured right in the hardware and the bug that is fixed was allowing the switch to still work.
commit bc4e7277cc93b12b6c51b803c32b89700437d607
Author: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Date: Mon Jan 25 13:41:16 2021 +0100
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
commit 20776b465c0c249f5e5b5b4fe077cd24ef1cda86 upstream.
It's not true that switchdev_port_obj_notify() only inspects the
->handled field of "struct switchdev_notifier_port_obj_info" if
call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON()
triggering for a non-zero return combined with ->handled not being
true. But the real problem here is that -EOPNOTSUPP is not being
properly handled.
The wrapper functions switchdev_handle_port_obj_add() et al change a
return value of -EOPNOTSUPP to 0, and the treatment of ->handled in
switchdev_port_obj_notify() seems to be designed to change that back
to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e.,
everybody returned -EOPNOTSUPP).
Currently, as soon as some device down the stack passes the check_cb()
check, ->handled gets set to true, which means that
switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP.
This, for example, means that the detection of hardware offload
support in the MRP code is broken: switchdev_port_obj_add() used by
br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP
code thinks the generation of MRP test frames has been offloaded, no
such frames are actually put on the wire. Similarly,
br_mrp_switchdev_set_ring_role() also always returns 0, causing
mrp->ring_role_offloaded to be set to 1.
To fix this, continue to set ->handled true if any callback returns
success or any error distinct from -EOPNOTSUPP. But if all the
callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so
the logic in switchdev_port_obj_notify() can propagate that
information.
Fixes: 9a9f26e8f7ea ("bridge: mrp: Connect MRP API with the switchdev API")
Fixes: f30f0601eb93 ("switchdev: Add helpers to aid traversal through lower devices")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I have not. If it moves into "testing" on trunk I will make the switch.
There has been some similar history on this in the post-DSA era: luci issue 2798
mv88e6085 f1072004.mdio-mii:10 lan0: configuring for phy/gmii link mode
mv88e6085 f1072004.mdio-mii:10: p0: hw VLAN 1 already used by br-guest
mv88e6085 f1072004.mdio-mii:10 lan1: configuring for phy/gmii link mode
mv88e6085 f1072004.mdio-mii:10: p1: hw VLAN 1 already used by br-guest
mv88e6085 f1072004.mdio-mii:10 lan2: configuring for phy/gmii link mode
mv88e6085 f1072004.mdio-mii:10: p2: hw VLAN 1 already used by br-guest
mv88e6085 f1072004.mdio-mii:10 lan3: configuring for phy/gmii link mode
mv88e6085 f1072004.mdio-mii:10: p3: hw VLAN 1 already used by br-guest
Similar behavior is in my bootlog, both working and non-working versions. But I think clearing up this configuration so it doesn't send the -EONOTSUPP flag may be wise -- the question is how.
If this was an issue with the kernel I would expect others would see this same issue. I'm <5% sure I have VLANs set up appropriately with the conversion to DSA and I don't have enough knowledge about what the switch driver needs to do here.
I have the same problem and I let netifd handle everything.
I did however stumble upon a workaround while hunting down things to try on google last night:
Setting vlan_default_pvid on the bridge devices seems to help. I now have this in /etc/hotplug.d/iface/01-hack-vids:
[ "$ACTION" != ifup -a "$ACTION" != ifupdate ] && exit 0
if [ "$INTERFACE" = "foo" ]; then
vid=1
elif [ "$INTERFACE" = "bar" ]; then
vid=2
elif [ "$INTERFACE" = "baz" ]; then
vid=3
else
exit 0
fi
ip link set dev "br-$INTERFACE" type bridge vlan_default_pvid "$vid"
I have basically zero idea what I'm doing here, but at least with this things seem to work for now (that's on 5.10 testing kernel btw.)
I let netifd handle everything -- I'm not sure where adding scripts is going to interfere and I can't find where network setup documentation has been updated with respect to DSA so I was waiting (patiently until something broke).
I'm not sure it's the entire solution but it definitely directly addresses this issue. The problem is vlan ids all default to the same value causing the warnings I posted a few steps above. The patch in question changed the logic in how the switch driver is passing messages -- prior to this the "setup handled" flag would get passed even if a "not supported" flag is returned from a deeper function call but after the patch this does not happen. I'm guessing the driver for these VLANs never finishes setting them up.
Although DSA isn't explicitly mentioned here I think the very last paragraph hints at how you can get rid of your script and fix this problem directly in setup: openwrt vlan switch configuration (specifically the dot-notation on ifname creating VLAN IDs 106 and 204 in this case.
config interface 'lan'
option type 'bridge'
option ifname 'eth0.106'
option proto 'static'
option ipaddr '192.168.1.1'
option netmask '255.255.255.0'
config interface 'wan'
option ifname 'eth0.204'
option proto 'dhcp'
I have tested and this indeed clears the warning/error message in the kernel logs. I need to rebuild without deleting this patch to make sure it truly fixes this issue.
My bad -- I mostly ignored the documentation (linked page) because it's still addressing switch setup using the "switch" section in /etc/config/network. The VLAN information section is still applicable other than that.
EDIT: I still don't see how to configure tagged vs. untagged without the "switch" section. I don't see how to specify independently how cpu and lan/wan ports get specified as tagged or untagged.
I was playing around, trying to ferret out information on what can be done with netifd, as things have changed since I last poked around. In case it is of any use here is a snippet:
That looks very interesting, thanks. I'll have to read up and experiment some with this.
As a side note: I just tried your earlier suggestion and used: option ifname 'lan3.2' and that also appears to work for my usecase without any additional scripts
I believe the piece I'm missing is configuring the switch as a device via netifd. I'm not completely sure how to set it up as in a few cases where routers are buried on an internal network I'm pulling in the wan port as part of a VLAN, effectively using the router with 5 "lan" ports. Since these are on two different CPUs I'm guessing I need two switch devices and I'll need to investigate the syntax for this.
The thread you linked is informative and it's possible the information is in there. I likely won't have time to play with this until the weekend.
No changes to wireless config, which attach to all 3 networks. I have a few other configurations where wan is bridged with a lan port, I have yet to test that. Also, per the PR anomeome linked for now we still need "bridge" specified in the interface config although it appears at some point this will be going away.
This syntax is slightly different than that shown in the PR -- I couldn't get the PR syntax to work (using list for lan1, lan2, etc).