Support for RTL838x based managed switches

Could you try activating Source Address learning for outgoing packets?
In https://github.com/openwrt/openwrt/blob/master/target/linux/realtek/files-5.4/drivers/net/ethernet/rtl838x_eth.c change line 104
h->cpu_tag[1] = 0x0400; // BIT 10: AS_DPM, BIT 0: L2LEARNING, BIT 1: RVID_SEL
to use values 0x0401 or 0x0403. There is no documentation of what these fields do, but they sound promising. Sorry, I can't try this myself, I am on vacation and don't have the equipment to test.

Hi,
it's been a long time since I was last here...
My Zyxel still works fine, but now I have a (most probably minor) issue:
How to configure mwan3 on a switch like the Zyxel?
The basic docs at https://openwrt.org/docs/guide-user/network/wan/multiwan/mwan3 are very simple and don't match the interfaces on a switch.
Does someone maybe have an example config?

Does anyone here have a device similar to the TP-Link T1600G, running the new stock firmware revision with the light blue web interface, and use an SFP module with it?

How is an SFP module shown in the web interface if there is no link on it? I'm trying to snoop on the module's I2C lines to trace them back to the SoC. However, I can't observe any I2C traffic on the module pins. Now I don't know if I have an incompatible module, since I cannot see any change in the web interface when a module is plugged in (not sure if there should be any), or if there is a problem elsewhere.

I built the latest master to try https://github.com/openwrt/openwrt/pull/4323, but that did not help.

Here's what I got when running "bridge fdb add":

root@OpenWrt:/# bridge fdb add 5e:xx:xx:xx:xx:2c dev switch vlan 100 self
RTNETLINK answers: File exists
root@OpenWrt:/# bridge fdb del 5e:xx:xx:xx:xx:2c dev switch vlan 100 self
RTNETLINK answers: No such file or directory

Note that in "bridge fdb show", we can see an entry for 5c:xx:xx:xx:xx:2c, which is the correct MAC address except that 5e has been replaced by 5c for some reason?

I also tried changing line 104 to 0x0401. This does appear to help, and I don't see the unexpected packets anymore. The output of "bridge fdb" is unchanged from before. However with 0x0403, the problem reappears.

Did you try

bridge fdb add 5e:xx:xx:xx:xx:2c dev eth0 vlan 100

I don't know what the difference between the "switch" and "eth0" devices is, to be honest. I thought eth0 was the CPU port.
If the 0x401 helps, i.e. we apparently enable L2 Learning of that MAC/VLAN-ID combination for the CPU-port, then I would also have expected a corresponding fdb entry with a "self" attribute but no "permanent" attribute. Strange you do not see that. Maybe you could test further and I will have another look at the SDK code. Sometimes there are hints in unexpected places.

Output (with 0x400):

root@OpenWrt:/# bridge fdb add 5e:xx:xx:xx:xx:2c dev eth0 vlan 100
[  161.593199] eth0: vlans aren't supported yet for dev_uc|mc_add()
RTNETLINK answers: Invalid argument

For the output of "bridge fdb" with 0x401, I checked again: the number of lines is the same as before and the output looks exactly the same.

Note that the MAC addresses printed on the switch start with 5c, not 5e. The stock firmware uses 5c:xx:xx:xx:xx:2c for the switch itself while OpenWrt uses 5e:xx:xx:xx:xx:2c (where xx:xx:xx:xx is the same for both). As shown before, the output of "bridge fdb" with openwrt shows an inconsistency between 5c and 5e. So I think there is another problem here.
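For reference, the 5c and 5e prefixes differ in exactly one bit: 0x02 in the first octet, which is the locally administered (U/L) bit of a MAC address. A minimal sketch of that arithmetic follows; the helper names are hypothetical, and whether OpenWrt sets this bit for exactly this reason is an assumption, not something stated in this thread.

```c
#include <stdint.h>

/* Bit 1 (0x02) of the first MAC octet marks a locally administered address. */
#define MAC_LOCALLY_ADMINISTERED 0x02u

/* Hypothetical helper: return the first octet with the U/L bit set. */
static uint8_t mac_set_local(uint8_t first_octet)
{
    return (uint8_t)(first_octet | MAC_LOCALLY_ADMINISTERED);
}

/* Hypothetical helper: non-zero if the first octet has the U/L bit set. */
static int mac_is_local(uint8_t first_octet)
{
    return (first_octet & MAC_LOCALLY_ADMINISTERED) != 0;
}
```

With these helpers, mac_set_local(0x5c) yields 0x5e, the prefix seen under OpenWrt.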

The 5e/5c is something that @blogic introduced to distinguish the MACs for the different VLANs. It is part of the switch configuration scripts. The switch HW is in principle able to distinguish MACs being on different VLANs (Independent VLAN Learning, IVL is enabled by default for all VLANs).
If the VLAN/MAC association works by enabling L2 Learning on sending packets, this appears to be a proper solution for the flooded packets. My suspicion is that we do not see the entry because fdb does not ask for forwarding entries with the proper port/VLAN combination: the driver only returns entries learned from the HW when asked for the right combination. Maybe you want to submit a patch?
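The difference between Independent and Shared VLAN Learning mentioned above can be sketched as a difference in the lookup key. This is only an illustration: the struct and function names are hypothetical and do not come from the rtl838x driver or the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical forwarding-database key: MAC address plus VLAN ID. */
struct fdb_key {
    uint8_t  mac[6];
    uint16_t vid;
};

/* IVL: two keys match only if both the MAC and the VLAN ID are equal. */
static int fdb_match_ivl(const struct fdb_key *a, const struct fdb_key *b)
{
    return a->vid == b->vid && memcmp(a->mac, b->mac, sizeof(a->mac)) == 0;
}

/* SVL: the VLAN ID is ignored, so one MAC maps to one entry for all VLANs. */
static int fdb_match_svl(const struct fdb_key *a, const struct fdb_key *b)
{
    return memcmp(a->mac, b->mac, sizeof(a->mac)) == 0;
}
```

Under IVL, a query with the wrong VLAN ID finds nothing even though the MAC has been learned, which would be consistent with the suspicion above about why the entry does not show up.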

Ah, good to know the 5e/5c is not a bug.

Sure, for now I can submit a patch just for 0x401?

Yes, submit that patch. Maybe with an explanation in a comment: the 4 is bit 10 and called AS_DPM, meaning that the Destination Port Mask given is used to send the packet instead of flooding it, bit 0 is the L2LEARNING bit we figured out.
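To make the bit layout concrete, here is a small sketch of how the values 0x0400, 0x0401 and 0x0403 are composed. The macro and function names are made up for illustration and are not the driver's; the bit meanings are the ones guessed in this thread, since the fields are undocumented.

```c
#include <stdint.h>

/* Bit names as discussed in this thread; semantics are assumptions. */
#define CPU_TAG_L2LEARNING (1u << 0)   /* bit 0: learn the source address */
#define CPU_TAG_RVID_SEL   (1u << 1)   /* bit 1: RVID_SEL, purpose unclear */
#define CPU_TAG_AS_DPM     (1u << 10)  /* bit 10: use the given destination port mask */

/* Hypothetical helper: build the 16-bit value written to cpu_tag[1]. */
static uint16_t cpu_tag1(int learn, int rvid_sel)
{
    uint16_t v = CPU_TAG_AS_DPM;

    if (learn)
        v |= CPU_TAG_L2LEARNING;
    if (rvid_sel)
        v |= CPU_TAG_RVID_SEL;
    return v;
}
```

cpu_tag1(1, 0) gives 0x0401, the value that stopped the flooding after boot; cpu_tag1(1, 1) gives 0x0403, with which the problem reappeared.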

I created https://github.com/openwrt/openwrt/pull/4403

Hmm, the flooding problem is not actually totally fixed. There is no flooding when the switch boots. But it looks like if I go to the web interface and change the VLAN of some port, click Save & Apply, then the flooding starts again, and I don't know how to make it stop.

The "fix" we have for this will only work for the CPU-port, because it only takes effect when the router itself sends packets, so that the Source Address learning works. The VLAN/MAC combinations a port listens to are supposed to be put into the forwarding database by userspace or the kernel (they have the permanent and self attributes). The FDB listing you posted earlier shows that this seems to work after boot. When you change the VLAN of a port, the kernel should ask the driver to reflect this change. If this is not the case, then to me this does not seem to be a driver problem. From hints on the web I am starting to suspect the Linux kernel might not support IVL, only SVL. Can you check that "bridge fdb show" reflects changes of the VLANs for a particular port?

Yes, bridge fdb appears to reflect changes of VLANs for ports.

Overall, I'm not sure what triggers the flooding, but the following seems to be the case:

Given a kernel with the 0x401 modification, and a configuration of:

  • Port 1: vlan 1 untagged, vlan 100 tagged
  • Port 2: vlan 100 untagged
  • Port 3: vlan 100 untagged

Then:

  • If the switch is powered on with devices plugged in ports 2 and 3, there is flooding of packets from port 2 to the switch (seen on port 3). If the device from port 3 is moved to port 1, there is also flooding seen on port 1.
  • If the switch is powered on with devices plugged in ports 1 and 2, there is no flooding. If the device from port 1 is moved to port 3, there is still no flooding.
  • There seems to be no difference in the output of bridge fdb between these cases, other than that one entry 33:33:... may be in a different order.

Are you sure you are not just seeing flooding for a couple of seconds due to STP being reconfigured? When you take the cable out of a port, all learned addresses for that port are forgotten. Then, when you plug a cable into another port, that port is switched from the blocking state to the learning state for a couple of seconds, during which all traffic for MACs that are actually on that port is flooded to all ports. Only then does the port move to the forwarding state, in which no more flooding happens. So the flooding should be there only for a couple of seconds.

I don't think it's only for a couple of seconds. E.g. if I power on the switch with devices plugged in ports 2 and 3, and start pinging the switch from port 2, the ping packets to the switch can still be seen on port 3 after 10 minutes of constant pinging.

I know OpenWrt now supports switches with this chipset, but does it also support the PoE versions of these switches? Does OpenWrt support PoE in software, so that you can see the output in watts per port, turn PoE on/off per port, see the PoE power budget, etc.?

PoE is supported, but the design of the management tools is still under discussion so it is not merged.

@blogic has made two implementations so far. The first one, written in Lua, had both stats and configuration. The second one, written in C, doesn't yet implement any dynamic config. But you do have stats:

root@gs1900-10hp-f:~# ubus call poe info
{
        "firmware": "v22.4",
        "mcu": "ST Micro ST32F100 Microcontroller",
        "budget": 77.000000,
        "consumption": 15.400000,
        "ports": {
                "lan1": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Searching"
                },
                "lan2": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Searching"
                },
                "lan3": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Searching"
                },
                "lan4": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Searching"
                },
                "lan5": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Searching"
                },
                "lan6": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Delivering power",
                        "consumption": 4.400000
                },
                "lan7": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Delivering power",
                        "consumption": 4.200000
                },
                "lan8": {
                        "priority": 0,
                        "mode": "PoE+",
                        "status": "Delivering power",
                        "consumption": 6.300000
                }
        }
}

See https://biot.com/switches/software/poe_management for details of hardware and protocol

The Lua daemon is here: https://git.openwrt.org/?p=openwrt/staging/blogic.git;a=commit;h=2540faec92abf8f5e52eae0e77bfbdb47457252d

The more current C daemon is here: https://patchwork.ozlabs.org/project/openwrt/patch/20210511152243.1167160-1-john@phrozen.org/
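As a quick arithmetic check on the ubus output above, the sketch below sums per-port consumption and computes the remaining budget. The function names are hypothetical and unrelated to either daemon. For the values shown, the three delivering ports add up to 14.9 W while the reported total is 15.4 W; nothing in this thread says what accounts for the difference.

```c
#include <stddef.h>

/* Hypothetical helper: sum the per-port consumption values (in watts). */
static double poe_port_sum(const double *watts, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += watts[i];
    return total;
}

/* Hypothetical helper: budget still available, given the reported total. */
static double poe_headroom(double budget, double consumption)
{
    return budget - consumption;
}
```

With budget 77.0 and consumption 15.4 from the output above, poe_headroom() leaves roughly 61.6 W.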


Zyxel GS1900-24HP (Semi-)Bricked Warning!

Don't flash "openwrt-21.02.0-rc4-realtek-generic-zyxel_gs1900-10hp-initramfs-kernel.bin" to the first active partition directly from the Zyxel web interface, since the device will get stuck in a boot loop. The LEDs will light up in an alternating pattern, first the whole row of ACT/LNK LEDs, then the row of PoE LEDs, and no Ethernet ports will come up (I haven't tried SFP fiber)!

Original partition layout before flashing OpenWrt from Zyxel stock webinterface:

v2.40(AAHM.0) Active partition 0
v2.60(AAHM.4) Backup partition 1

The stock Zyxel v2.60 firmwares for the 10HP and 24HP share the same checksum/hash (although the 24HP has different DDR2 / less 64MB RAM), so I assumed the devices are very similar and tried to flash the device directly, knowing there is a backup partition option via the bootloader. That skipped, for now, the need to open the device and look for the correct pins for serial or TFTP booting, which of course should be the default procedure...

Contrary to the 8/10/16 port models, this 24-port model doesn't come with an external serial pinout accessible through its side air holes. This device also comes with 2 buttons, one reset button on the front and one unknown button on its side, reachable through the air holes!

Case opened... there is a 4-pin J2 on the PoE board, and a 2-pin J2 and a 10-pin J5 (incorrect voltages: 1x ~6V, 8x 0.15V) on the mainboard. Update: there is also a J4 14-pin pinless header under the PoE board on the mainboard, with 3 pins measuring 3.4V on resistors r126, r128, r131.

Busy looking for the USB 3.3V serial/UART adapter and/or a Raspberry Pi...

It looks like ZyXEL provides a unified firmware for all models, something that has come up multiple times. I'd first have tried an initramfs image before actually flashing.

Pin layout seems documented here: https://biot.com/switches/gs1900-24e#firmware. That site is the go-to reference for OpenWrt on the Realtek switches.

Yes, but the OpenWrt images are still unique per model.

I don't understand how you managed to flash the OpenWrt GS1900-10HP image from the webinterface though. It isn't flashable on any model, because of this bug: https://patchwork.ozlabs.org/project/openwrt/patch/20210624210408.19248-1-bjorn@mork.no/

And with that fixed, it should only be flashable on the 10HP due to the AAZI identifier being the only one listed. The 24HP uses AAHM as you show.

The OEM images list all the supported codes. OpenWrt uses only the one matching the specific supported hardware for each image.
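The identifier check described above can be sketched roughly as follows. This is not the bootloader's or the web interface's actual code; the function name is hypothetical, and the idea that the check is a plain string match against a list of codes is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: does the image's list of model codes contain the
 * code of the device it is being flashed on? Returns 1 if so, else 0. */
static int image_supports(const char *const *ids, size_t n, const char *device_id)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(ids[i], device_id) == 0)
            return 1;
    }
    return 0;
}
```

An OpenWrt 10HP image listing only "AAZI" would be rejected by a device reporting "AAHM", while an OEM image listing both codes would be accepted on either model.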
