Support for RTL838x based managed switches

I see. I believe I was misled by the line "DGS-1210-16: A1 and D1 revisions" from:

I can likely get serial access to U-boot, but I take it there aren't any Marvell OpenWrt builds supporting this hardware?

Correct, at this point there is zero support for the Marvell (revA, revB) or Broadcom (revC, revD) variants. Only the Realtek variants are supported, or can be supported relatively easily; these would generally be revF and revG. Don't rely on this rule of thumb without confirming it for the actual model you are interested in.

Hi,

great news about the performance. We need to disable this debug output by default in master for all architectures, or at least rate-limit it.
I have not really looked at hwmon on the Realtek devices so far. I did some tests with the thermal sensors for fan control, but never with the SFP modules. All SFP modules, including DACs, provide sensor data; that is part of the specification. However, the data is usually mostly empty or bogus because the actual sensors are missing, especially in the cheaper modules I am willing to spend money on.
The issues you see are not connected to the Realtek driver, but more likely linked to the configuration of the sensors in hwmon. The driver merely exposes an I2C bus to the sfp kernel code, which somehow hands it over to hwmon. I am not a hwmon expert, I cannot help there, but my understanding is you should be able to do everything you need in userspace by configuring hwmon.

The DACs are notoriously tricky. There is no 100% support yet, because they need to be calibrated based on their length. So far there is some default which works for my cable which is 2m long. My issue is that the length needs to be read out of the EEPROM into the sfp code, then moved into the phylink/DSA layer and from there get to the PHY layer. But there is exactly no support so far in the kernel for such a code-path. The data structures one would like to use in phylink/DSA are all private. This is going to take a bit of time. This does likely explain your problem that the link is seen as up on one side and down on the other. Basically, the links are always calibrated on the RX side with some default for the TX amplifier settings based on the general type of the cable (fibre 10G/1G, 10Gbase-T, DAC). But the RX calibration on the Realtek side does not work yet in all cases, whereas the ARM board gets it done. So the ARM side sees the link as up.

For your VLAN troubles: While hunting for bugs in the RTL931x code I found something which manifested in a similar manner. The following fixes it:

From f1be953397ebd6fc3bcfab0fbbcb3e32323cadc2 Mon Sep 17 00:00:00 2001
From: Birger Koblitz <git@birger-koblitz.de>
Date: Sat, 7 May 2022 09:16:33 +0200
Subject: [PATCH] realtek: Fix stripping of VLAN tags on RTL93xx

On the RTL93xx platforms, the VLAN tags were stripped from egressing
and ingressing packets on a port. Keep all these tags intact.
---
 .../realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c     | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
index 26d6f11fad..1167baf67e 100644
--- a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
+++ b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
@@ -1059,11 +1059,14 @@ static int rtl83xx_port_enable(struct dsa_switch *ds, int port,
        pr_debug("%s: %x %d", __func__, (u32) priv, port);
        priv->ports[port].enable = true;
 
-       /* enable inner tagging on egress, do not keep any tags */
+       /* keep inner and outer tags on ingress and egress:
+        * set IGR_ITAG_KEEP, IGR_OTAG_KEEP, EGR_ITAG_KEEP EGR_OTAG_KEEP (bits 0-3)
+        * on RTL9300, on RTL9310 these are bits (6-9), on RTL9310 also set
+        * ITPID_KEEP (bit 0) and OTPID_KEEP (bit 3) */
        if (priv->family_id == RTL9310_FAMILY_ID)
-               sw_w32(BIT(4), priv->r->vlan_port_tag_sts_ctrl + (port << 2));
+               sw_w32(0x3c9, priv->r->vlan_port_tag_sts_ctrl + (port << 2));
        else
-               sw_w32(1, priv->r->vlan_port_tag_sts_ctrl + (port << 2));
+               sw_w32(0xf, priv->r->vlan_port_tag_sts_ctrl + (port << 2));
 
        if (dsa_is_cpu_port(ds, port))
                return 0;
-- 
2.25.1

I have compiled updated images with this patch. Could you test it?
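
For anyone checking the magic numbers in the patch, here is a minimal sketch (Python, purely illustrative, not part of the driver) of how the two register values follow from the bit positions named in the patch comment. The helper `bits()` and the constant names are made up for this example; only the bit positions, the values 0xf and 0x3c9, and the `port << 2` offset come from the patch itself.

```python
def bits(lo, hi):
    """Return a mask with bits lo..hi (inclusive) set."""
    return ((1 << (hi - lo + 1)) - 1) << lo


# RTL9300: IGR_ITAG_KEEP, IGR_OTAG_KEEP, EGR_ITAG_KEEP, EGR_OTAG_KEEP are bits 0-3.
RTL9300_KEEP_TAGS = bits(0, 3)

# RTL9310: the same four keep bits are bits 6-9, and the patch additionally
# sets ITPID_KEEP (bit 0) and OTPID_KEEP (bit 3).
RTL9310_KEEP_TAGS = bits(6, 9) | (1 << 0) | (1 << 3)


def port_reg_offset(port):
    """Byte offset of a port's 32-bit register from the base, i.e. port << 2."""
    return port << 2


print(hex(RTL9300_KEEP_TAGS), hex(RTL9310_KEEP_TAGS))
```

The two constants evaluate to the values written by `sw_w32()` in the patch, which is a quick way to confirm that the comment and the code agree.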

Unfortunately, no dice. Is there some way I can make sure that the key presses are actually being sent? If they are sent, should they show up in the serial log (and garble the output)? Nothing shows up here.

Tried it out: looks like the upstream device (through lan1 tagged vlan10) now sees the dhcp requests from the vlan10-untagged ports, but the response never makes it back to the device sending the request.

Is there a way to make tcpdump on the switch see the packets being processed by the switch? Right now, tcpdump of lan12 only shows broadcasts and packets addressed at the switch's CPU. And same for trying to sniff lan1 for the dhcp responses.

I don't think this is possible with local snooping on the switch. We would need to send all packets from the port to the cpu, and the ethernet driver cannot handle that even with a modest load. What you could do is to mirror the port and then use a powerful device like a PC for snooping on the destination port. See
https://svanheule.net/switches/testing/mirroring

(assuming that is implemented on the rtl9300)

Mirroring should work, especially if I send it to the 10G device. Is there a way to set it as always mirrored in LuCI?

EDIT: I tried it...

root@XGS1210:~# tc qdisc add dev lan8 clsact
root@XGS1210:~# tc filter add dev lan8 ingress matchall skip_sw action mirred egress mirror dev lan12
RTNETLINK answers: Not supported
We have an error talking to the kernel
root@XGS1210:~# tc filter add dev lan8 egress matchall skip_sw action mirred egress mirror dev lan12
RTNETLINK answers: Not supported
We have an error talking to the kernel

Also, speaking of the DAC, it's this:
sfp sfp-p11: module OEM SFP-H10GB-CU2M rev 03 sn <REDACTED> dc 210913

root@XGS1210:~# ethtool -m lan11
	Identifier                                : 0x03 (SFP)
	Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
	Connector                                 : 0x21 (Copper pigtail)
	Transceiver codes                         : 0x00 0x00 0x00 0x00 0x00 0x04 0x00 0x00 0x00
	Transceiver type                          : Passive Cable
	Encoding                                  : 0x00 (unspecified)
	BR, Nominal                               : 10300MBd
	Rate identifier                           : 0x00 (unspecified)
	Length (SMF,km)                           : 0km
	Length (SMF)                              : 0m
	Length (50um)                             : 0m
	Length (62.5um)                           : 0m
	Length (Copper)                           : 2m
	Length (OM3)                              : 0m
	Passive Cu cmplnce.                       : 0x01 (SFF-8431 appendix E) [SFF-8472 rev10.4 only]
	Vendor name                               : OEM
	Vendor OUI                                : 00:40:20
	Vendor PN                                 : SFP-H10GB-CU2M
	Vendor rev                                : 03
	Option values                             : 0x00 0x00
	BR margin, max                            : 0%
	BR margin, min                            : 0%
	Vendor SN                                 : <REDACTED>
	Date code                                 : 210913
root@XGS1210:~# ethtool -m lan12
	Identifier                                : 0x03 (SFP)
	Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
	Connector                                 : 0x07 (LC)
	Transceiver codes                         : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
	Transceiver type                          : 10G Ethernet: 10G Base-SR
	Encoding                                  : 0x06 (64B/66B)
	BR, Nominal                               : 10300MBd
	Rate identifier                           : 0x00 (unspecified)
	Length (SMF,km)                           : 0km
	Length (SMF)                              : 0m
	Length (50um)                             : 80m
	Length (62.5um)                           : 20m
	Length (Copper)                           : 0m
	Length (OM3)                              : 300m
	Laser wavelength                          : 850nm
	Vendor name                               : Wiitek
	Vendor OUI                                : 00:01:9c
	Vendor PN                                 : SFP-10G-T
	Vendor rev                                : 1
	Option values                             : 0x00 0x1a
	Option                                    : RX_LOS implemented
	Option                                    : TX_FAULT implemented
	Option                                    : TX_DISABLE implemented
	BR margin, max                            : 0%
	BR margin, min                            : 0%
	Vendor SN                                 : WAMZ<REDACTED>
	Date code                                 : 210730
	Optical diagnostics support               : Yes
	Laser bias current                        : 6.000 mA
	Laser output power                        : 0.5000 mW / -3.01 dBm
	Receiver signal average optical power     : 0.4000 mW / -3.98 dBm
	Module temperature                        : 65.50 degrees C / 149.90 degrees F
	Module voltage                            : 3.2735 V
	Alarm/warning flags implemented           : Yes
	Laser bias current high alarm             : Off
	Laser bias current low alarm              : Off
	Laser bias current high warning           : Off
	Laser bias current low warning            : Off
	Laser output power high alarm             : Off
	Laser output power low alarm              : Off
	Laser output power high warning           : Off
	Laser output power low warning            : Off
	Module temperature high alarm             : Off
	Module temperature low alarm              : Off
	Module temperature high warning           : Off
	Module temperature low warning            : Off
	Module voltage high alarm                 : Off
	Module voltage low alarm                  : Off
	Module voltage high warning               : Off
	Module voltage low warning                : Off
	Laser rx power high alarm                 : Off
	Laser rx power low alarm                  : Off
	Laser rx power high warning               : Off
	Laser rx power low warning                : Off
	Laser bias current high alarm threshold   : 15.000 mA
	Laser bias current low alarm threshold    : 1.000 mA
	Laser bias current high warning threshold : 13.000 mA
	Laser bias current low warning threshold  : 2.000 mA
	Laser output power high alarm threshold   : 1.9952 mW / 3.00 dBm
	Laser output power low alarm threshold    : 0.1584 mW / -8.00 dBm
	Laser output power high warning threshold : 1.5848 mW / 2.00 dBm
	Laser output power low warning threshold  : 0.1778 mW / -7.50 dBm
	Module temperature high alarm threshold   : 80.00 degrees C / 176.00 degrees F
	Module temperature low alarm threshold    : -10.00 degrees C / 14.00 degrees F
	Module temperature high warning threshold : 75.00 degrees C / 167.00 degrees F
	Module temperature low warning threshold  : -5.00 degrees C / 23.00 degrees F
	Module voltage high alarm threshold       : 3.6000 V
	Module voltage low alarm threshold        : 3.0000 V
	Module voltage high warning threshold     : 3.5000 V
	Module voltage low warning threshold      : 3.1000 V
	Laser rx power high alarm threshold       : 1.1220 mW / 0.50 dBm
	Laser rx power low alarm threshold        : 0.0199 mW / -17.01 dBm
	Laser rx power high warning threshold     : 1.0000 mW / 0.00 dBm
	Laser rx power low warning threshold      : 0.0223 mW / -16.52 dBm

Mirroring was not implemented on the RTL93xx. I believe I have now added support. It is untested, but it is very similar to the way the RTL83xx does it, so it should work. The image is updated, please try again (and also make sure the failure is not due to a missing tc module).

The DAC cable is very short, and I am quite sure this causes the issues. The SDK distinguishes cables of 0.5m, 1m, 3m, and 5m length, with different starting conditions for RX calibration. It is probably not necessary to be super exact with the length, and my 2m cable seems to work out of pure luck, but 0.2m is definitely at the extreme end of things. Adding support for different DAC lengths will take a bit of time, please do not hold your breath.
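
To illustrate what a length-based selection could look like, here is a rough sketch that maps a cable length to one of the four SDK classes mentioned above. Picking the nearest class is an assumption made for this example and is not taken from the SDK; the names `SDK_DAC_CLASSES_M` and `nearest_dac_class()` are hypothetical.

```python
# Cable length classes (in metres) that the SDK distinguishes, per the post above.
SDK_DAC_CLASSES_M = (0.5, 1.0, 3.0, 5.0)


def nearest_dac_class(length_m):
    """Return the SDK length class closest to the given cable length.

    Assumption: "closest" is the selection rule. On a tie, min() keeps the
    first candidate, so the shorter class is returned.
    """
    if length_m <= 0:
        raise ValueError("cable length must be positive")
    return min(SDK_DAC_CLASSES_M, key=lambda c: abs(c - length_m))


for length in (0.2, 2, 4.5):
    print(length, "->", nearest_dac_class(length))
```

Note that a 2m cable sits exactly between the 1m and 3m classes, so a real implementation would need a deliberate rule for that case rather than the tie-breaking used here.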

I think the Amazon item description is bad: the SFP+ module info cites 2 meters, not 0.2, and it's about the length of a twin bed (can't find my measuring tape at the moment). I'll try an updated image.

EDIT: Actually, the real problem is that I linked to the wrong item of the group. "You purchased another variant..."

Here's the actual variant I got: https://smile.amazon.com/Macroreer-Cables-Optical-Connector-Passive/dp/B071YMK48B

I tried the new image, and the PC sees the SFP+ 10GbE link bounce up and down repeatedly, and the switch sees it stay down. That's even after rebooting to make sure I'd removed the mirroring.

Note that the fancy output from ethtool -m requires the ethtool-full package; plain ethtool just gives a hex dump.
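
If you want to script against that decoded output, here is a minimal sketch that turns the `key : value` lines into a dictionary and pulls out the copper length. It assumes the text layout shown in the dumps above (fields separated by " : "); the function names are made up for this example, and nothing here talks to the hardware directly.

```python
def parse_ethtool_module_info(text):
    """Parse decoded `ethtool -m` output into {field name: [values]}.

    Values are kept in lists because some fields, such as "Option",
    appear on several lines.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # not a "key : value" line, e.g. a shell prompt
        info.setdefault(key.strip(), []).append(value.strip())
    return info


def copper_length_m(info):
    """Return the "Length (Copper)" field in metres, or None if it is absent."""
    values = info.get("Length (Copper)")
    if not values:
        return None
    return int(values[0].rstrip("m"))
```

Fed with the lan11 dump above, `copper_length_m()` would report the 2m that the module EEPROM claims, which is the value the calibration code would eventually need.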

Thinking of the ethtool stats, I found this interesting result in a web search for one of the stats. I'm messing with the collectd ethstat plugin, and the mapping of types is a bit of a pain, especially since I don't really know what I'm doing. For example, do I count Fragments and Jabbers as Errors?

ftp://ftp.romsat.ua/pub/Lan/Firmware%20Edge-Core/ECS4100-28T/v1.2.36.191/ECS4100_V1.2.36.191_SWR-EngReleaseNote.docx

Fast Ethernet Switch ES3526XA-38 (googleusercontent.com)

That is a sure sign the calibration for the cable is not OK.

Did you test the port mirroring? It should work now, and maybe we can figure out what is wrong with the VLAN.

The same cable on the same port worked before the latest image; any changes in SFP+ code? I tried a couple of other changes, and they bounce too.

While I'm experimenting with VLANs, I flipped stuff around so the two 10G machines are on the 2.5G ports of the switch (the SFP+ module is now in one of them instead of in the switch).

I tweaked my VLAN settings:

port              vlan-id
lan1              1 PVID Egress Untagged
                  10
                  1733
lan2              1 PVID Egress Untagged
                  10
                  1733
lan3              1 PVID Egress Untagged
lan4              1 PVID Egress Untagged
lan5              1 PVID Egress Untagged
lan6              1 PVID Egress Untagged
lan7              1 PVID Egress Untagged
lan8              1733 PVID Egress Untagged
lan9              10 PVID Egress Untagged
lan10             1 PVID Egress Untagged
lan11             1 PVID Egress Untagged
lan12             1 PVID Egress Untagged
switch            1
                  10
                  1733

Mirroring from lan1 to lan10, the lan10 machine sees the lan9 machine's DHCP request and response, but they're untagged, and thus get an answer for the wrong subnet.

With mirroring from lan9 to lan10, the lan10 machine sees the lan9 machine's DHCP request, but not the response.

Running tcpdump on the other end of lan1 (direct connection to a 4-port Intel NIC in an x86 OpenWrt box) confirms that the DHCP request is untagged.

There is no change at all with respect to the SFP+ code. There was only the addition of the mirroring code. But as I said, there is no calibration done which would be necessary to make the link reliable. I don't know exactly how that works but I would assume it might even work one time you boot the device and the next time no longer, even though the cable and port are the same, just by chance.

With regard to the VLANs: I am a bit VLAN dyslexic, so maybe @bmork can analyse this better. What I understand is that the request packet arriving on lan9 does not get tagged with PVID 10 as it should. Therefore the outgoing request on lan1 is seen untagged. Correct? That would mean the switch hardware does not properly apply the PVID tag on an untagged port?

I just thought of another thing to try: I made vlan1 tagged on lan9. Looks like the dhcp reply makes it back out of the switch, untagged!

That is: lan1 has vlan1 untagged and vlan10 tagged. port 10 has vlan1 tagged and vlan10 untagged. Yet, the dhcp packets are ending up untagged on both.

I've now changed vlan10 into vlan100 so it's easier to visually distinguish.

EDIT: seems like suddenly, my 2.5G ports stopped being able to send any traffic... and now, I can't even ping anything from the switch OS (via serial console). Booted back to stock, and they work there. Booted back to openwrt, still can't ping in or out from switch OS. So I've flashed back to stock.

Should you not have lan1 tagged for the vlans used on ports 9 and 10 so that packets going out of that port keep their tags? My understanding is that untagged ports add a tag on ingress if they have a pvid configured and will always strip a tag on egress.

My understanding of VLAN tags (I'm not an expert) is this:

Inside the switch, all VLANs have tags, though the binary tag format may be proprietary instead of 802.1q. The PVID determines what VID gets slapped on incoming untagged packets, and Egress Untagged or Egress Tagged for a VID determines whether tags for that VID are stripped upon exiting that port.

What should happen:
Request:

PC (untagged) ⟹ lan9 (PVID=100) ⟹ switch (tag=100) ⟹ lan1 (egress tagged) ⟹ Router (tag=100)

Response:

Router (tag=100) ⟹ lan1 (tag=100) ⟹ switch (tag=100) ⟹ lan9 (egress UNtagged) ⟹ PC (untagged)

What seems to be happening:

Request:

PC (untagged) ⟹ lan9 (PVID=100) ⟹ switch (tag=100) ⟹ lan1 (tag goes poof) ⟹ Router (untagged)

Response:

Router (untagged due to broken request) ⟹ lan1 (PVID=1) ⟹ switch (tag=1) ⟹ discard (lan9 isn't allowed to see vlan 1)

Response with tweak (tagged vlan 1 for debugging):

Router (untagged due to broken request) ⟹ lan1 (PVID=1) ⟹ switch (tag=1) ⟹ lan9 (tag goes poof) ⟹ PC (untagged)
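
The expected behaviour described above can be written down as a small model, which makes it easy to compare against what tcpdump shows. This is only a sketch of the textbook rules (PVID applied to untagged frames on ingress, per-VID tagged/untagged choice on egress); it assumes ingress filtering on port membership, and the `Port` class and function names are invented for this example rather than taken from the driver.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    pvid: int
    untagged: set = field(default_factory=set)  # VIDs sent without a tag
    tagged: set = field(default_factory=set)    # VIDs sent with a tag

    def members(self):
        return self.untagged | self.tagged


def ingress(port, tag):
    """Internal VID for a frame arriving with `tag` (None = untagged), or None if dropped."""
    vid = port.pvid if tag is None else tag
    # Assumption: frames for VLANs the port is not a member of are dropped.
    return vid if vid in port.members() else None


def egress(port, vid):
    """Return (delivered, tag_on_wire) for a frame with internal VID `vid`."""
    if vid not in port.members():
        return (False, None)
    return (True, vid if vid in port.tagged else None)


def forward(src, dst, tag):
    """Pass a frame from src to dst and report what leaves dst."""
    vid = ingress(src, tag)
    if vid is None:
        return (False, None)
    return egress(dst, vid)


lan1 = Port(pvid=1, untagged={1}, tagged={100, 1733})
lan9 = Port(pvid=100, untagged={100})

print(forward(lan9, lan1, None))  # request: PC -> router
print(forward(lan1, lan9, 100))   # response: router -> PC
```

In this model the request leaves lan1 with tag 100 and the tagged response leaves lan9 untagged, so any frame that shows up untagged on lan1 for VLAN 100 points at the egress tagging step in the hardware configuration.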

OK, I'll look into it.

The issue that still confuses me is this picture with two (inner+outer) tags on the switch. This is easy to understand if we talk about q-in-q, and ports where we push or pop one tag at a time. But most of the time we don't deal with q-in-q. Still, all packets have two tags internally on the switch. If you have a packet with vlan=100 coming in on a port with PVID=10,10, will 100 be considered the inner or the outer VLAN? How about if we have PVID=100,10? Or PVID=10,100?

OK, I should probably sit down with a pen and paper and try this out. But it's easier to ask :slight_smile: