Support for RTL838x based managed switches

LuCI runs smoothly, partly because all these devices have a relatively large amount of RAM: at least 128MB, often 256MB.
So far I am not aware of any NAT performance test having been done. But putting such a device at the border of your network is probably not the most obvious use case anyway. More likely it sits behind a router; the developers even discussed making it use DHCP for the initial setup instead of the usual OpenWrt static 192.168.1.1. In any case there are still a lot of possible performance improvements in that area, all unexplored. So it is probably a bit early to give a definitive answer on this.

I wonder if those of you who understand this stuff can make anything of this capture from eth0 on the switch:

15:50:49.714166 ARP, Request who-has 192.168.2.29 tell 192.168.2.1, length 46
        0x0000:  ffff ffff ffff 001b 21a7 98bc 8100 0007
        0x0010:  0806 0001 0800 0604 0001 001b 21a7 98bc
        0x0020:  c0a8 0201 0000 0000 0000 c0a8 021d 0000
        0x0030:  0000 0000 0000 0000 0000 0000 800f 1000
15:50:49.715168 ARP, Request who-has 192.168.2.29 tell 192.168.2.1, length 50
        0x0000:  ffff ffff ffff 001b 21a7 98bc 0806 0001
        0x0010:  0800 0604 0001 001b 21a7 98bc c0a8 0201
        0x0020:  0000 0000 0000 c0a8 021d 0000 0000 0000
        0x0030:  0000 0000 0000 0000 0000 0000 800a 1000
15:50:50.509217 IP 192.168.2.111 > 192.168.2.1: ICMP echo request, id 61676, seq 4, length 64
        0x0000:  001b 21a7 98bc 54ee 759a bf58 0800 4500
        0x0010:  0054 50fb 4000 4001 63ed c0a8 026f c0a8
        0x0020:  0201 0800 7ddf f0ec 0004 dffd a25f 0000
        0x0030:  0000 47ff 0000 0000 0000 1011 1213 1415
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435
        0x0060:  3637 800a 1000
15:50:50.510337 IP 192.168.2.111 > 192.168.2.1: ICMP echo request, id 61676, seq 4, length 64
        0x0000:  001b 21a7 98bc 54ee 759a bf58 8100 0007
        0x0010:  0800 4500 0054 50fb 4000 4001 63ed c0a8
        0x0020:  026f c0a8 0201 0800 7ddf f0ec 0004 dffd
        0x0030:  a25f 0000 0000 47ff 0000 0000 0000 1011
        0x0040:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021
        0x0050:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031
        0x0060:  3233 3435 3637 800f 1000
15:50:50.511576 IP 192.168.2.1 > 192.168.2.111: ICMP echo reply, id 61676, seq 4, length 64
        0x0000:  54ee 759a bf58 001b 21a7 98bc 8100 0007
        0x0010:  0800 4500 0054 b8b9 0000 4001 3c2f c0a8
        0x0020:  0201 c0a8 026f 0000 85df f0ec 0004 dffd
        0x0030:  a25f 0000 0000 47ff 0000 0000 0000 1011
        0x0040:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021
        0x0050:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031
        0x0060:  3233 3435 3637 800f 1000
15:50:50.512740 IP 192.168.2.1 > 192.168.2.111: ICMP echo reply, id 61676, seq 4, length 64
        0x0000:  54ee 759a bf58 001b 21a7 98bc 0800 4500
        0x0010:  0054 b8b9 0000 4001 3c2f c0a8 0201 c0a8
        0x0020:  026f 0000 85df f0ec 0004 dffd a25f 0000
        0x0030:  0000 47ff 0000 0000 0000 1011 1213 1415
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435
        0x0060:  3637 800a 1000

This shows 3 packets, each of which is duplicated with and without a tag (VID 7). The ARP request and the ICMP echo reply entered tagged on port 8, while the ICMP echo request entered untagged on port 3.
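
To check the tagged/untagged reading of a hex dump like the one above, a minimal Python sketch can look for the 802.1Q TPID (0x8100) directly after the two MAC addresses and mask the VLAN ID out of the following two bytes. The function name vlan_id is made up for illustration, and only single-tagged frames are considered.

```python
import struct

def vlan_id(frame: bytes):
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 16:
        return None
    # Bytes 12-13 hold the EtherType/TPID, bytes 14-15 the tag control info.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF if tpid == 0x8100 else None

# First 16 bytes of the two ARP requests from the capture above
first = bytes.fromhex("ffffffffffff001b21a798bc81000007")
second = bytes.fromhex("ffffffffffff001b21a798bc08060001")
print(vlan_id(first), vlan_id(second))  # prints: 7 None
```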

If I read the DSA trailer correctly, then we have consistently

0x800a1000 - untagged
0x800f1000 - tagged

regardless of actual source port. Which is weird, isn't it? Is the switch doing something wrong, or are we bouncing back packets here?

Could this be the trailer that the 83xx driver puts in (see function rtl838x_hw_receive())?
There, too, 0x80pp1000 is appended, where pp is the port number; I am not sure whether this is removed by DSA.
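
As a rough way to test that idea against the capture, the following Python sketch takes the last four bytes of a frame and splits them as 0x80pp1000. The layout is only the guess made in this thread, not something taken from the driver source, and parse_trailer is a hypothetical helper name.

```python
import struct

def parse_trailer(frame: bytes):
    """Split the last 4 bytes of a frame as 0x80 | pp | 0x1000.

    Assumption: the trailer layout is 0x80pp1000 as guessed in this thread.
    Returns the pp byte, or None if the trailer does not match the pattern.
    """
    if len(frame) < 4:
        return None
    marker, pp, tail = struct.unpack("!BBH", frame[-4:])
    if marker != 0x80 or tail != 0x1000:
        return None
    return pp

# Last bytes of an untagged and a tagged frame from the capture above
untagged = bytes.fromhex("3637800a1000")
tagged = bytes.fromhex("3637800f1000")
print(parse_trailer(untagged), parse_trailer(tagged))  # prints: 10 15
```

Under this reading pp would be 10 for the untagged copies and 15 for the tagged ones. How those values relate to the front-panel port numbers is not established here.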

I've received the GPL source package for the GS1900-10HP and put it here: http://get.dyn.mork.no/GS1900-10HP(V2.60(AAZI.2)C0).zip

Maybe someone with access wants to add it to the https://biot.com/switches/gpl_source_drops page?

Zyxel just sent me the sources for the GS1900-24HP too, probably just the same archive :smiley:
Your file is mirrored on the wiki now.

Thanks.

FYI, I just test-built the archive, and it seems both complete and functional. It just needs some of the usual fixups to build in a modern world: splitting the two combined wildcard and non-wildcard rules in the kernel Makefile, changing "/bin/sh is /bin/bash" assumptions in the (pre-generated) Makefiles for the snmp app, and fixing two mknod commands in the final romfs rule so they can be called without having to build the whole thing as root.

Been thinking about this. Not sure running management on a VLAN by default is a good idea. It will probably confuse most users coming from OEM firmware, where management is set up on VLAN 1, which is the untagged PVID on every port by default.

Assuming most new users are using the OEM web GUI to convert (once that is supported), they will already have set up management access. How about simply generating OpenWrt defaults based on that? The OEM config is readily available in the "jffs" partition (better named "JFFS2_CFG" in OEM U-Boot/firmware; maybe we should use their partition names too?).

And parsing just the management settings should not be too difficult:

root@OpenWrt:/# df -hT /mnt/
Filesystem           Type            Size      Used Available Use% Mounted on
/dev/mtdblock3       jffs2           1.0M    248.0K    776.0K  24% /mnt
root@OpenWrt:/# egrep '^manage|^ip'  /mnt/startup-config 
ip address 192.168.99.51 mask 255.255.255.0
management-vlan vlan 203
management access-list default
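
A minimal Python sketch of such a parser, assuming the lines always look like the ones shown above ("ip address A mask M" and "management-vlan vlan N"). Other OEM firmware versions may format them differently, and the function name and dictionary keys are made up for illustration.

```python
import re

def parse_management(config_text: str) -> dict:
    """Extract the management IP, netmask and VLAN from OEM startup-config text.

    Assumption: the line formats match the sample output in this post.
    Lines that do not match are ignored.
    """
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        m = re.fullmatch(r"ip address (\S+) mask (\S+)", line)
        if m:
            settings["ipaddr"], settings["netmask"] = m.groups()
            continue
        m = re.fullmatch(r"management-vlan vlan (\d+)", line)
        if m:
            settings["vlan"] = int(m.group(1))
    return settings

sample = """\
ip address 192.168.99.51 mask 255.255.255.0
management-vlan vlan 203
management access-list default
"""
print(parse_management(sample))
```

The resulting dictionary could then be used to generate the initial OpenWrt network defaults. That step is left out because it depends on how the first-boot configuration ends up being structured.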

You can "borrow" the ssh host keys there too if you want to make the transition even smoother:

root@OpenWrt:/# ls -la /mnt/ssh/
drwxr-xr-x    2 root     root             0 Jan  1  2019 .
drwxr-xr-x    4 root     root             0 Jan  1  1970 ..
-rw-r--r--    1 root     root           493 Jan  1  2019 http_rsa_key
-rw-r--r--    1 root     root           137 Jan  1  2019 http_rsa_modulus
-rw-------    1 root     root           668 Jan  1  2019 ssh_host_dsa_v2_key
-rw-r--r--    1 root     root           601 Jan  1  2019 ssh_host_dsa_v2_key.pub
-rw-------    1 root     root          1679 Jan  1  2019 ssh_host_rsa_v2_key
-rw-r--r--    1 root     root           393 Jan  1  2019 ssh_host_rsa_v2_key.pub
-rw-r--r--    1 root     root          1245 Jan  1  2019 ssl_cert.pem
-rw-r--r--    1 root     root          1704 Jan  1  2019 ssl_key.pem

Maybe it is possible to do both: the default configuration blogic suggests makes a lot of sense for someone who bought the switch just for the purpose of running OpenWrt on it. A tool that converts the existing switch configuration could be an option for someone converting a switch that is already in use. But my feeling is that a lot of users will probably not run the switch with the OEM firmware in the longer run, also because I noticed that many of the newer models have a habit of phoning home first thing when you switch them on and then require you to register for their "cloud" setup.

Coming back to the GPIOs on the 8393M once more:

I've figured out most of the GPIOs for the T1600G-52PS, but some are still missing:

  • I can read the state of two of the three fans - one is missing
  • The i2c pins to the PoE controllers are correct, but no devices are detected. One of the pins on the 8-pin connector is high while it should be low - I suppose that one GPIO is still missing.
  • The fans seem to be controlled by three GPIO lines - I can't find any of them
  • The port LEDs flash once the LED control register is written (mapped to GPIO 37), but I couldn't figure out which GPIOs the bus is actually connected to - shouldn't there be corresponding GPIO lines!?

The TP-Link source code is incomplete, i.e. the SDK does not contain the GPIO driver implementation. I had a look at the Zyxel code, there are more drivers included. On some chips, there are more GPIO registers (Ports E - G), but that doesn't seem to be the case for the 8393M.

I assume the pins of the LEDs go to the shift register ICs on the LED board. Could this be cascaded again to the other HCxxx shift register IC close to the CPU? Or do they go to an RTL8231, or possibly directly to the CPU? But then the question is what the HCxxx shift register close to the CPU is doing. The datasheet of the RTL8231 is in the Telegram channel. The chip has various LED driving modes, normally configured through strapping pins, but pins 17/18 are always the bus. From the pictures I assume the RTL8231 is only controlling the SFP cages plus the attached LEDs, but that could be wrong. Are there other RTL8231 chips around? How are the I2C fan-controller lines connected: to the SoC or to the RTL8231?

"GPIO" 37 is just the bits in the global LED control register. To steer the actual bits, there are several sets of registers that are also available as GPIOs for the individual LEDs when they are run in shift register mode. But evidently, for that to work, the correct settings need to be made on the SoC, such as the number of LEDs, the number of color modes, and whether there are status LEDs that are not port LEDs controlled via the shift registers. For that you have to look at the GPIO driver code for the SoC. For the 9x this has not been tested so far with a mode to directly steer the LEDs; for the two 9x models I have, this simply works out of the box because the strapping pins set everything correctly. If this does not work, then my experience with the other 8x-based TP-Link model says it is due to non-port status LEDs also being controlled by the shift register ICs.

I could not find any RTL8231 on this board - I didn't take off the heat sinks and I'm not going to, as I don't want to do any obvious modifications to this brand new switch. I'll probe it in software once again later today. Maybe I'll try to get root access on the stock firmware to dump registers.

I did not yet look at the SFP cages, as I don't have any SFP modules to test.

I disassembled the device a bit further: the traces definitely lead to the CPU.

I'm nearly 100% sure that only the port LEDs are controlled via the shift registers, as I can control all the other LEDs with individual GPIOs. I'll double check that. Thanks for the hints!

This is the vlan leaking.

/src/DGS-1210-10P/sdk/rtk-sdk/src/app/diag/config/vlan.cli:vlan set vlan-leaky ( <PORT_LIST:ports> | all ) state ( enable | disable )

I am assuming you only see bcast and mcast frames do this?

Thanks for an interesting pointer. Which maps to

    reg_field_write(unit, MAPLE_VLAN_CTRLr, MAPLE_LKYf, &en_leaky)

in sdk/src/dal/maple/dal_maple_vlan.c.

No, unfortunately not. I also see L2/L3 unicast frames.

And the problem isn't really leakage between VLANs. It's tagged frames on a port configured for untagged output only and vice versa. The frames do belong to a VLAN on that port, so VLAN filtering is working.

Looking at the output from an interface connected to lan1 configured with "7 PVID Egress Untagged" and only that, running:

tshark -T fields -e frame.number -e frame.time_delta -e vlan.id -e _ws.col.Source -e _ws.col.Destination -e _ws.col.Protocol -e ip.len  -e _ws.col.Info -ni  eth0 

This is what I see (with some uninteresting frames removed):

42      0.000018193             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=1/256, ttl=64
43      0.000395628     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=1/256, ttl=64 (request in 42)
44      0.000134301             192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=1/256, ttl=64
45      0.999474668             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=2/512, ttl=64
46      0.000555070     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=2/512, ttl=64 (request in 45)
47      0.000032053             192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=2/512, ttl=64
48      1.031439138             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=3/768, ttl=64
49      0.000750461     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=3/768, ttl=64 (request in 48)
50      0.000000244             192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=3/768, ttl=64
51      0.999679503             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=4/1024, ttl=64
52      0.000515403     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=4/1024, ttl=64 (request in 51)
53      0.000104105             192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=4/1024, ttl=64

You see the duplicated replies - with and without a tag. I obviously expected to only see the untagged ones here.

And similarly on the trunk port connected to the other end of that ping session, and configured only with tagged VLANs:

lan8              7
                  203

Edited tshark output with the VLAN in the 3rd column:

251     0.000346338             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=1/256, ttl=64
252     0.000138637     7       192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=1/256, ttl=64
253     0.000089122     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=1/256, ttl=64 (request in 252)
258     0.361303137             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=2/512, ttl=64
259     0.000190944     7       192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=2/512, ttl=64
260     0.000086761     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=2/512, ttl=64 (request in 259)
262     0.943017116             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=3/768, ttl=64
263     0.000193246     7       192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=3/768, ttl=64
264     0.000074042     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=3/768, ttl=64 (request in 263)
269     0.392436534             192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=4/1024, ttl=64
270     0.000188880     7       192.168.2.111   192.168.2.1     ICMP    84      Echo (ping) request  id=0x1653, seq=4/1024, ttl=64
271     0.000077925     7       192.168.2.1     192.168.2.111   ICMP    84      Echo (ping) reply    id=0x1653, seq=4/1024, ttl=64 (request in 270)

We should only have seen tagged frames here, but as you can see the frames coming from lan1 are sent in duplicate - with and without tagging.
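
Spotting these pairs by eye gets tedious on longer captures. The following Python sketch flags frames that show up both with and without a VLAN tag. It assumes the rows have already been split into (vlan, source, destination, info) tuples, and that two frames count as the same when those fields match once tshark's "(request in N)" note is stripped. Both the helper name and that matching rule are illustrative choices, not part of tshark.

```python
import re
from collections import defaultdict

def find_tag_duplicates(rows):
    """Return the frames that were seen both with and without a VLAN tag.

    rows: iterable of (vlan_id, src, dst, info) tuples, where vlan_id is ""
    for an untagged frame, as in the tshark field output above.
    """
    seen = defaultdict(set)
    for vlan_id, src, dst, info in rows:
        # Drop the cross-reference tshark appends to matched requests/replies.
        key = (src, dst, re.sub(r"\s*\((?:request|reply) in \d+\)", "", info))
        seen[key].add(vlan_id)
    return [key for key, vlans in seen.items() if "" in vlans and len(vlans) > 1]

rows = [
    ("", "192.168.2.111", "192.168.2.1",
     "Echo (ping) request  id=0x1653, seq=1/256, ttl=64"),
    ("7", "192.168.2.111", "192.168.2.1",
     "Echo (ping) request  id=0x1653, seq=1/256, ttl=64"),
    ("7", "192.168.2.1", "192.168.2.111",
     "Echo (ping) reply    id=0x1653, seq=1/256, ttl=64 (request in 252)"),
]
for src, dst, info in find_tag_duplicates(rows):
    print(f"seen with and without tag: {src} -> {dst}: {info}")
```

With the three sample rows, taken from the trunk port output above, only the echo request is reported, since the reply appears tagged only.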

I wonder, am I the only one with these issues? I don't think my config is very unusual for a switch device - having a few access ports and a trunk port. But I could definitely have screwed up the code while trying to get stuff running.

Which L3 is that? Is L3 possibly the replies?
--> /src/linux/net/bridge/br_netfilter_hooks.c
looks like the HW only does leak protection for mcast and bcast needs to be done in SW
I am still grepping ... :slight_smile:

I fired up Ghidra (first time use - great tool) and disassembled some of the binaries. I could finally map out the remaining GPIO pins. However, I'm still unable to toggle some of them and I have no idea why.

The GPIO mapping should be (1 means output; the ones marked with TODO cannot be controlled)

  • C7 -> 1 = GPIO15 PoE Reset (500ms Low)
  • A7 -> 0 = GPIO31 Probably Power Supply Fan 2 Input TODO
  • B4 -> 0 = GPIO20 System Fan Input
  • B6 -> 0 = GPIO22 Power Supply Fan 1 Input
  • B7 -> 1 = GPIO23 Fan Green LED
  • C0 -> 1 = GPIO8 Fan Amber LED
  • C3 -> 0 = GPIO11 Button PoE/Speed
  • C4 -> 0 = GPIO12 Button Reset
  • B0 -> 1 = GPIO16 Speed Green LED
  • B1 -> 1 = GPIO17 PoE Green LED
  • C2 -> 1 = GPIO10 PoE Max Green LED
  • A4 -> 1 = GPIO28 Fan Speed Out TODO
  • B3 -> 1 = GPIO19 Fan Speed Out TODO
  • B2 -> 1 = GPIO18 Fan Speed Out TODO

So basically, I'm trying to figure out why GPIOs 18, 19, 28 and 31 don't work as expected.
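
The list above follows a regular pattern: each port letter covers eight consecutive GPIO numbers (A7 = GPIO31, B0 = GPIO16, C0 = GPIO8). A small Python sketch of that mapping is shown here. The port D entry is extrapolated from the pattern and does not appear in the list, and pin_to_gpio is a hypothetical helper name.

```python
# Base GPIO number of each 8-bit port, inferred from the list above.
# Port D = 0..7 is an extrapolation and is not confirmed by the list.
PORT_BASE = {"A": 24, "B": 16, "C": 8, "D": 0}

def pin_to_gpio(pin: str) -> int:
    """Convert a port/bit name such as "C7" to a linear GPIO number."""
    port, bit = pin[0].upper(), int(pin[1:])
    if port not in PORT_BASE or not 0 <= bit <= 7:
        raise ValueError(f"unknown pin {pin!r}")
    return PORT_BASE[port] + bit

# The four pins that do not behave as expected
for pin in ("A4", "B2", "B3", "A7"):
    print(pin, "-> GPIO", pin_to_gpio(pin))
```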

Three-wire fans would have a VCC pin, GND, and tachometer output. To detect the rotation speed, I think you would need to measure the tachometer pulse interval. To control the fan speed, you (probably) need a PWM output that controls fan speed.

This would imply that the GPIO pins are controlled by a timer/counter block, and the pins need to be configured with a pinmux.

Ah, I should have mentioned that: There is a bunch of transistors and a MOSFET around the fan area (see https://biot.com/switches/_detail/wiki/t1600g-52ps_fan.jpg?id=t1600g-52ps). Each fan only reports working/stopped, that works for two of the three fans (I have two gpio-fan instances defined in my .dts and they work just fine). The third one would be A7 or GPIO31, the state of this input never changes.

And 18, 19 and 28 are used to set the speed. The pattern determines the speed:

  • Low - Low - Low = Full Speed
  • Low - High - High = Medium
  • High - High - High = Low Speed

(I measured these pins using an oscilloscope when running the stock firmware; full/medium/low are just observations).
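
Written out as a lookup table, the observed patterns look like this in Python. The order of the three levels (GPIO18, GPIO19, GPIO28) is an assumption, since the measurements above do not say which pin is which, and fan_levels is a hypothetical helper name.

```python
# Observed patterns; 0 = low, 1 = high.
# Assumption: the levels are listed in the order GPIO18, GPIO19, GPIO28.
FAN_GPIOS = (18, 19, 28)
FAN_PATTERNS = {
    "full":   (0, 0, 0),
    "medium": (0, 1, 1),
    "low":    (1, 1, 1),
}

def fan_levels(speed: str) -> dict:
    """Return {gpio: level} for a named speed, per the observed patterns."""
    return dict(zip(FAN_GPIOS, FAN_PATTERNS[speed]))

print(fan_levels("medium"))  # prints: {18: 0, 19: 1, 28: 1}
```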

The ghidra dump supports this:

[Screenshot: Selection_057]

The fan speed to set is calculated based on the actual PoE current consumption.

Strangely enough, the pin level doesn't change when toggled from U-Boot either!

If you want to look at the leaky VLANs, the following adds a debugfs entry:

diff --git a/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x_debugfs.c b/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x_debugfs.c
index 00a87b0863..2e2f62686f 100644
--- a/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x_debugfs.c
+++ b/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x_debugfs.c
@@ -334,6 +334,13 @@ void rtl838x_dbgfs_init(struct rtl838x_switch_priv *priv)
                debugfs_create_x64("bpdu_flood_mask", 0644, rtl838x_dir,
                                (u64 *)(RTL838X_SW_BASE + priv->r->rma_bpdu_fld_pmask));
 
+       if (priv->family_id == RTL8380_FAMILY_ID)
+               debugfs_create_x32("vlan_ctrl", 0644, rtl838x_dir,
+                               (u32 *)(RTL838X_SW_BASE + RTL838X_VLAN_CTRL));
+       else
+               debugfs_create_x32("vlan_ctrl", 0644, rtl838x_dir,
+                               (u32 *)(RTL838X_SW_BASE + RTL839X_VLAN_CTRL));
+
        return;
 err:
        rtl838x_dbgfs_cleanup(priv);

diff --git a/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x.h b/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x.h
index 0528298cb6..ad52a5d857 100644
--- a/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x.h
+++ b/target/linux/rtl838x/files-5.4/drivers/net/dsa/rtl838x.h
@@ -63,6 +63,7 @@
 #define RTL839X_SDS12_13_PWR1                  (0xb980)
 
 /* VLAN registers */
+#define RTL838X_VLAN_CTRL                      (0x3A74)
 #define RTL838X_VLAN_PROFILE(idx)              (0x3A88 + ((idx) << 2))
 #define RTL838X_VLAN_PORT_EGR_FLTR             (0x3A84)
 #define RTL838X_VLAN_PORT_PB_VLAN(port)                (0x3C00 + ((port) << 2))

It shows that by default leaky STP is on and leaky Mcast is off:

root@OpenWrt:/sys/kernel/debug/rtl838x# cat vlan_ctrl 
0x00000020

It is possible to write to that register through debugfs.
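
A read-modify-write helper for that could be sketched in Python as follows. It assumes the debugfs file returns a single hex value as shown above and accepts a "0x"-prefixed value when written. Which bit of vlan_ctrl controls what is not covered here, so the masks are left entirely to the caller, and update_reg is a hypothetical helper name.

```python
def update_reg(path: str, set_mask: int = 0, clear_mask: int = 0) -> int:
    """Read a hex register value from a file, change bits, write it back.

    Bits in set_mask are set first, then bits in clear_mask are cleared.
    Returns the new 32-bit value.
    """
    with open(path) as f:
        value = int(f.read().strip(), 16)
    value = (value | set_mask) & ~clear_mask & 0xFFFFFFFF
    with open(path, "w") as f:
        f.write(f"0x{value:08x}\n")
    return value

# On the switch (the mask below is for illustration only):
# update_reg("/sys/kernel/debug/rtl838x/vlan_ctrl", clear_mask=0x20)
```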

Thanks. That's a nice trick.

BTW, I noticed that the HPE 1910 series is listed under "models" without any details. And there was one for sale for "nothing" in my neighbourhood. But on checking it more closely I believe the entry is wrong. These are not RTL83xx based. The specs claim a 333MHz ARM CPU. And I downloaded

Software Release 1910_5.20.R1519P06
Supported products:

    JE005A HP 1910-16G SWITCH
    JE006A HP 1910-24G SWITCH
    JE007A HP 1910-24G-POE (365W) SWITCH
    JE008A HP 1910-24G-POE(170W) SWITCH
    JE009A HP 1910-48G SWITCH
    JG348A HP 1910-8G SWITCH
    JG349A HP 1910-8G-POE+ (65W) SWITCH
    JG350A HP 1910-8G-POE+ (180W) SWITCH

which contains strings like

Marvell Feroceon
U-Boot 1.1.4 (2017-08-09 - 14:09:56) Marvell version: 3.1.6

and

MV88F
88F6281 Z0
88F6192 Z0
88F6180 Z0
DB-88F6281-BP
RD-88F6281
DB-88F6192-BP
RD-88F6192-NAS
DB-88F6180-BP
RD-88F6180-AP

So that probably means it is more like Kirkwood?