Support for RTL838x based managed switches

Related to Paul Fertser's restart patch on the mailing list, I've written a driver for the watchdog peripheral that can also restart the system.

The driver is included only for the 5.10 kernel. Feel free to test and report back! If your device hangs on reboot, you may need to add "reboot=warm" or "reboot=soft" to the bootargs.

I just tested the DGS-1210-28 with the stock firmware: there I do receive 1500-byte ping responses from the switch, also on ports with a VLAN tag (I saw Ethernet frames going out with 1514 bytes untagged and 1518 bytes tagged).
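
The two frame sizes seen in the capture follow from the usual Ethernet overheads. A minimal sketch of that arithmetic, assuming a 14-byte Ethernet header, a 4-byte 802.1Q tag, and a capture that does not include the FCS (the function name is only for illustration):

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
VLAN_TAG = 4     # 802.1Q tag inserted after the source MAC

def frame_size(mtu: int, tagged: bool = False) -> int:
    """Size of an Ethernet frame as seen in a capture (FCS not counted)."""
    return mtu + ETH_HEADER + (VLAN_TAG if tagged else 0)

print(frame_size(1500))               # 1514
print(frame_size(1500, tagged=True))  # 1518
```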

However, I couldn't elicit responses with big packets from e.g. the web UI for a clear-cut result. I'm not sure why, but I guess the ICMP packets must also originate from the CPU port anyway and cannot be offloaded somehow.

The pings will also come from the CPU port. I didn't think it would be possible to do this kind of test on the original firmware, great you figured this out. I'll continue to dig into any settings that might limit the packet size, then.

Another datum: It occurred to me that on my Netgear GS108T v3 with OpenWrt 21.02.0 I also never noticed any issues talking to the device itself, so I just tested there as well and can also receive back packets with MTU 1500 just fine, with and without VLAN tags.

A friend of mine also has a DGS-1210-28; he reported that he has no issues with 1500-byte packets on the current OpenWrt master. It turns out he has hardware revision F2, while I seem to have F1. Also, his seems to have a much more recently built U-Boot,

U-Boot 2011.12.(2.1.5.67086)-Candidate1 (Jun 22 2020 - 14:58:40)

than mine has

U-Boot 2011.12.(2.1.5.67086)-Candidate1 (Apr 18 2017 - 13:56:40)

I'm having issues flashing a DGS-1210-28, revision F2. I'm unable to flash either the latest stable version (21.02) or the latest snapshot build.

This is the TFTP command used to load the initramfs image into memory:

tftpboot 0x8f000000 10.90.90.92:openwrt-realtek-generic-d-link_dgs-1210-28-initramfs-kernel.bin

Then, following this guide for the DGS-1210-16 G1, I changed the LAN port config so that I could ssh into the switch, and ran sysupgrade over ssh.

After rebooting, I connected to the switch via the serial console again, and this is the output:

U-Boot 2011.12.(2.1.5.67086)-Candidate1 (Jun 22 2020 - 14:58:40)

Board: RTL838x CPU:500MHz LXB:200MHz MEM:300MHz
DRAM:  128 MB
SPI-F: 1x32 MB
Loading 1024B env. variables from offset 0x80000
Board Model = DGS-1210-28-F1 Cameo_bdinfo_get_BoardID [293] 
Switch Model: RTL8382M_8218B_INTPHY_8218B_8214FC_DEMO (Port Count: 28)
Switch Chip: RTL8382
**************************************************
#### RTL8218B config - MAC ID = 0 ####
Now External 8218B
**************************************************
#### RTL8218B config - MAC ID = 8 ####
Now Internal PHY
**************************************************
#### RTL8218B config - MAC ID = 16 ####
Now External 8218B
**************************************************
**** RTL8214FC config - MAC ID = 24 ****
Now External 8214FC
Net:   Net Initialization Skipped
rtl8380#0
Hit Esc key to stop autoboot:  0 

Loading Runtime Image .OS:...FAILED
read: 0x56b69ccf, calculated: 0x5806c19fFS:...FAILED!!
os_ver = 83ddf784, fs_ver = 1.........(os_ver & fs_ver) = 0...
## Booting kernel from Legacy Image at b4e80000 ...
   Image Name:   
   Created:      2020-12-16  10:54:03 UTC
   Image Type:   MIPS Linux Kernel Image (gzip compressed)
   Data Size:    1035510 Bytes = 1011.2 KB
   Load Address: 80000000
   Entry Point:  80262000
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK

Starting kernel ...

Linux version 2.6.19 (simon@208Server) (gcc version 3.4.4 mipssde-6.03.00-20051020) #20 PREEMPT Wed Dec 16 10:53:49 CST 2020
CPU revision is: 00019070
Determined physical RAM map:
 memory: 02000000 @ 00000000 (usable)
User-defined physical RAM map:
 memory: 07a00000 @ 00000000 (usable)
Built 1 zonelists.  Total pages: 30988
Kernel command line: console=ttyS0,115200 mem=122M noinitrd root=/dev/mtdblock7 rw rootfstype=squashfs csb=0x0142C0E0 cso=0x08676FCB csf=0x56C6A823 sfin=<NULL>,32MB,0;10891296 
Primary instruction cache 16kB, physically tagged, 4-way, linesize 16 bytes.
Primary data cache 16kB, 2-way, linesize 16 bytes.
Synthesized TLB refill handler (20 instructions).
Synthesized TLB load handler fastpath (32 instructions).
Synthesized TLB store handler fastpath (32 instructions).
Synthesized TLB modify handler fastpath (31 instructions).
PID hash table entries: 512 (order: 9, 2048 bytes)
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 121088k/124928k available (2015k kernel code, 3724k reserved, 421k data, 108k init, 0k highmem)
Mount-cache hash table entries: 512
Checking for 'wait' instruction...  available.
NET: Registered protocol family 16
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 4096 bind 2048)
TCP reno registered
squashfs: version 3.3 (2007/10/31) Phillip Lougher
JFFS2 version 2.2. (NAND) (C) 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Serial: 8250/16550 driver $Revision: 1.1.1.1 $ 1 ports, IRQ sharing disabled
serial8250: ttyS0 at MMIO 0x0 (irq = 31) is a 16550A
Probe: SPI CS1 Flash Type MX25L25635F
Creating 9 MTD partitions on "Total SPI FLASH":
0x00000000-0x00080000 : "BOOT"
0x00080000-0x000c0000 : "BDINFO"
0x000c0000-0x00100000 : "BDINFO2"
0x00100000-0x00280000 : "KERNEL1"
0x00280000-0x00e80000 : "ROOTFS1"
0x00e80000-0x01000000 : "KERNEL2"
0x01000000-0x01040000 : "SYSINFO"
0x01040000-0x01c40000 : "ROOTFS2"
0x01c40000-0x02000000 : "JFFS2"
IPv4 over IPv4 tunneling driver
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
NET: Registered protocol family 17
VFS: Mounted root (squashfs filesystem) readonly.
Freeing unused kernel memory: 108k freed
init started:  BusyBox v1.00 (2020.12.16-02:52+0000) multi-call binary
Starting pid 14, console : '/etc/rc'
Init RTCORE Driver Module....OK
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
passwd file exit
ssdh_config file exit

 Complete NpHwInit  
RTK.0> device TAP0 entered promiscuous mode
x3sMxRs@FoGn8: not found
w1: not found




DGS-1210-28 login: 

I've also tried sysupgrade via the GUI, and nothing seems to work. Any help is appreciated. Thanks.

EDIT: Added some details about switch from this topic.

U-boot printenv output after failed sysupgrade:

u-boot># printenv    
BID=99
Board_Version=32
Boot_Version=1.01.001
Serial_Number=XXXXXXXXXXXXX
addargs=setenv bootargs console=$(console_device),$(baudrate) mem=$(memsize) noinitrd root=$(image) rw rootfstype=squashfs
baudrate=115200
boardmodel=RTL8382M_8218B_INTPHY_8218B_8214FC_DEMO
bootcmd=run addargs ; bootm 0xb4e80000
bootdelay=1
bootstop=off
console_device=ttyS0
ethact=rtl8380#0
ethaddr=XX:XX:XX:XX:XX:XX
gatewayip=10.90.90.254
hw_version=F2
image=/dev/mtdblock7
ipaddr=10.90.90.90
memsize=122M
netmask=255.0.0.0
serverip=192.168.1.111
stderr=serial
stdin=serial
stdout=serial

Environment size: 611/1020 bytes

Output of cat /proc/mtd from the initramfs image:

root@OpenWrt:/# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00010000 "u-boot"
mtd1: 00040000 00010000 "u-boot-env"
mtd2: 00040000 00010000 "u-boot-env2"
mtd3: 00d80000 00010000 "firmware"
mtd4: 002b0000 00010000 "kernel"
mtd5: 00ad0000 00010000 "rootfs"
mtd6: 00860000 00010000 "rootfs_data"
mtd7: 00180000 00010000 "kernel2"
mtd8: 00040000 00010000 "sysinfo"
mtd9: 00c00000 00010000 "rootfs2"
mtd10: 003c0000 00010000 "jffs2"
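
As a rough cross-check of the two partition tables, the /proc/mtd sizes can be turned back into flash offsets. This is only a sketch under two assumptions that are not stated in the output itself: the top-level partitions are contiguous in the listed order, with "kernel", "rootfs" and "rootfs_data" nested inside "firmware", and the flash is mapped at 0xb4000000 (inferred from the bootm address in the environment above).

```python
import re

PROC_MTD = """\
mtd0: 00080000 00010000 "u-boot"
mtd1: 00040000 00010000 "u-boot-env"
mtd2: 00040000 00010000 "u-boot-env2"
mtd3: 00d80000 00010000 "firmware"
mtd4: 002b0000 00010000 "kernel"
mtd5: 00ad0000 00010000 "rootfs"
mtd6: 00860000 00010000 "rootfs_data"
mtd7: 00180000 00010000 "kernel2"
mtd8: 00040000 00010000 "sysinfo"
mtd9: 00c00000 00010000 "rootfs2"
mtd10: 003c0000 00010000 "jffs2"
"""

# Assumption: these are sub-partitions of "firmware"/"rootfs", not separate regions.
NESTED = {"kernel", "rootfs", "rootfs_data"}
# Assumption: flash base address, inferred from "bootm 0xb4e80000".
FLASH_BASE = 0xB4000000

def top_level_offsets(proc_mtd: str) -> dict[str, tuple[int, int]]:
    """Accumulate partition sizes into (start, end) flash offsets."""
    offsets, pos = {}, 0
    for line in proc_mtd.splitlines():
        m = re.match(r'mtd\d+: ([0-9a-f]+) [0-9a-f]+ "(.+)"', line)
        if not m or m.group(2) in NESTED:
            continue
        size = int(m.group(1), 16)
        offsets[m.group(2)] = (pos, pos + size)
        pos += size
    return offsets

layout = top_level_offsets(PROC_MTD)
for name, (start, end) in layout.items():
    print(f"0x{start:08x}-0x{end:08x} : {name}")

boot_offset = 0xB4E80000 - FLASH_BASE
print([n for n, (s, e) in layout.items() if s <= boot_offset < e])  # ['kernel2']
```

Under these assumptions the computed offsets line up with the OEM table in the boot log, and the bootm address points at the start of "kernel2" (OEM "KERNEL2") rather than at "firmware". That would be consistent with the bootloader still starting the OEM image from the second slot, but it is an inference from the numbers, not a confirmed diagnosis.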

dmesg output from the initramfs image can be found here.

I have now tested this fibre link from the boot loader too, after painting myself into a corner where I had to tftpboot an initramfs over it. The bootloader behaves like OEM - the fibre link comes up only after it is toggled from the other end.

I'm still happily running OpenWrt, which brings the link up without any issues on every boot.

I just noticed this post now. Have I understood correctly that the XS1930-12HP is a supportable device? I know its price tag is borderline absurd, but I'm entertaining the idea of getting one, in the long term. I tried looking around the web for any details of its internals, weeks ago, but found nothing conclusive.

I am relatively sure the XS1930-12HP will be supported at some point, hopefully soon. At the moment the device boots with SMP. There are still issues with the Aquantia AQR813 8x PHY. On the similar EdgeCore, also with an RTL9313 and with 8 RTL8221B 2.5 GBit PHYs, we are able to send some pings. There seems to be some issue with stability due to SMP. I would wait a couple more weeks before buying such an expensive device. The 9300-based devices are also quite nice and work quite well.

I have a couple AMX NXA-ENET8-POE+ 8 port switches that use the same mainboard as the ECS-2100-10P. I am having difficulty interrupting the boot to get to the shell on the console though. Is there a trick that you know?

There are a couple of RTL-based switches where I was not able to figure out how to interrupt the boot, although supposedly there is always a way to stop it with some secret key. So far I have had success with "space", "ESC" and in one case it was "$".

If this takes too long, then there are two options: read out the flash with a clamp, manipulate the boot settings and flash it back. Or read out the flash and then use Ghidra/IDA Pro to disassemble the code, identify the magic key and use that. The first option is much faster unless you are very experienced with your reverse-engineering tool. In both cases you need to open the case and read out the flash using a clamp (some people also like to desolder the flash chip and put a socket in).

In any case, there has rarely been any issue with figuring out how to flash through the web interface. In fact there is a third way forward, but you risk bricking the device and you would need to be able to re-flash it if things go wrong: just take the ECS-2100-10P image and try flashing it through the web interface, possibly after first adjusting any magic bytes at the beginning of the image (compare with the OEM image, set these in the makefile in OWRT) and rebuilding the image. This is really very risky, if not to say careless. There are typically 3 outcomes: you end up in an endless boot loop (now you need to re-flash to get out), u-boot does not like the image and drops to the console (yay, console access), or the image just boots (and you can change the bootloader settings through flash tools under OWRT to interrupt boot).
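
Comparing the start of the OEM image with the start of a candidate image can be done with a few lines of Python. This is only a sketch: the 64-byte window is an assumption (the thread does not say where the magic bytes sit or how long they are), and the two sample headers below are made up for the demonstration. In practice you would read the first bytes of both image files instead.

```python
def header_diff(oem: bytes, candidate: bytes, length: int = 64):
    """Return (offset, oem_byte, candidate_byte) for every byte that differs
    within the first `length` bytes of the two images."""
    a, b = oem[:length], candidate[:length]
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

def read_header(path: str, length: int = 64) -> bytes:
    """Read the first `length` bytes of an image file."""
    with open(path, "rb") as f:
        return f.read(length)

# Made-up headers, for illustration only (not taken from any real firmware).
oem_header = b"ABCD1234"
candidate_header = b"ABXD1234"

for offset, oem_byte, cand_byte in header_diff(oem_header, candidate_header):
    print(f"offset 0x{offset:02x}: OEM {oem_byte:02x} vs candidate {cand_byte:02x}")
```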

Hi, has anyone run OpenWrt on the HP 1910-8-PoE+ (65W) switch JG537A?

I have tried space, esc and enter so far, and those did not work. If you check out my github, I was able to decompress some of the .bin (they call them .bix) files. Originally I tried to flash the ECS OEM firmware directly to the device, then through TFTP. This results in an error, and I am not yet at the point where I could figure out how to change the magic bytes in the firmware, nor could I figure out how to get around that check. My github also has the boot log dump and the binwalk result from the original firmware. I have a couple of these devices, so I think my next step would be to get a clamp or desolder the flash to try to read it.

I investigated this further. The hardware itself is even able to transmit and receive jumbo frames of up to 10000 bytes (including frame overhead). The driver imposes a limit of 1600 bytes including all frame overhead and reports a maximum MTU of 1500 bytes (without overhead) to the kernel for packets on the CPU port. The size limit for packets being switched is 10000 bytes including Ethernet frame overhead. This can only be reported in the 5.10 kernel (but is not, and is therefore ignored); the 5.4 kernel does not take this into account at all.

The fragmented packets I see for ping with -s > 1472 are due to the kernel fragmenting packets whose size is larger than 1514 bytes (including overhead). Now, under DSA you can configure the CPU port, bridge and ports individually with different MTUs. If you start playing with these and the ping size, you can get into situations in which the kernel does not fragment a packet but also does not even try to send it out at all. This is either a misconfiguration or even a bug.
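
The 1472-byte threshold is what remains of a 1500-byte MTU after the IPv4 and ICMP headers. A small sketch of that calculation, assuming a 20-byte IPv4 header without options and the 8-byte ICMP echo header (function names are only for illustration):

```python
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo request/reply header

def max_ping_payload(mtu: int) -> int:
    """Largest `ping -s` value that still fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER

def needs_fragmentation(ping_size: int, mtu: int) -> bool:
    """True if a ping with this payload size exceeds the interface MTU."""
    return ping_size > max_ping_payload(mtu)

print(max_ping_payload(1500))           # 1472
print(needs_fragmentation(1472, 1500))  # False
print(needs_fragmentation(1473, 1500))  # True
```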

The hardware allows setting a size filter for packets leaving (all SoCs) and arriving at (not on RTL838x) the CPU port. Additionally, maximum packet sizes can be defined per port (on RTL93xx) or for all ports together (on RTL83xx), separately for 1GBit and 100/10MBit connections.

This patch: https://github.com/bkobl/openwrt/commit/0bf16d40d09f19a5a7e6c6c68be5d22bdae62099 allows setting MTUs for individual lan ports (on RTL93xx) or for all lan ports on RTL83xx, plus larger MTUs for the CPU port.

Now you can e.g. do:

ip link set dev eth0 mtu 1590
ip link set dev switch mtu 1590
ip link set dev switch.1 mtu 1590
ip link set dev lan1 mtu 1590

The issue is that phylink/DSA strongly links the interface MTUs to that of the CPU port. The lan1 MTU can only be set as high as the MTU of the CPU port. This seems to be on purpose, but is a strange choice for a switch, because it also limits the size of packets going through the switch, which by default happily forwards jumbo frames up to 10k bytes. Ideally one would like to do:

ip link set dev eth0 mtu 1500
ip link set dev lan1 mtu 9000

but DSA does not allow this, even though it is more or less the hardware default. Maybe someone has a good idea how to solve this.
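
The constraint described here can be written down as a simple check. This is a simplified model of the behaviour reported in this post (a port MTU may not exceed the CPU-port MTU), not the kernel's actual logic, and the function name is hypothetical:

```python
def rejected_ports(cpu_mtu: int, port_mtus: dict[str, int]) -> list[str]:
    """Ports whose requested MTU exceeds the CPU-port MTU and would
    therefore be refused under the constraint described above."""
    return [name for name, mtu in port_mtus.items() if mtu > cpu_mtu]

# The working example: everything raised to 1590 together.
print(rejected_ports(1590, {"switch": 1590, "switch.1": 1590, "lan1": 1590}))  # []

# The desired setup: CPU port at 1500, lan1 at 9000.
print(rejected_ports(1500, {"lan1": 9000}))  # ['lan1']
```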

No, I don't have any of those. But I once got the GPL U-Boot sources for those from HP in error as posted here:

Can you or anybody else confirm that TP-Link TL-SG3216 (=t2600g-18ts) is RTL8382M based as stated here:
https://wikidevi.wi-cat.ru/TP-LINK/Switch

I downloaded GPL Code here:
https://static.tp-link.com/resources/gpl/t2600g-18ts_gpl.tar.gz

And found .../sdk/system/drv/swcore/chip.c which contains references to these RTL chips:
RTL8396M
RTL8393M
RTL8392M
RTL8391M
RTL8389M
RTL8389L
RTL8382M
RTL8380M
RTL8377M
RTL8353M
RTL8352M
RTL8332M
RTL8330M
RTL8329M
RTL8328M
RTL8328S

So does it make sense for me to get a v2 model to test it, or is it one of the devices that isn't accessible at all?

I appreciate any feedback.

From what you write it is quite likely that the device is RTL-based. But whether it can be easily supported is a different story. One would need to figure out how to get console access to u-boot, and then there could be issues with flashing. If you have time and interest in hacking the device, then go ahead; if you need something that will surely work, then stay away from it.

I have trouble compiling https://github.com/bkobl/openwrt/commit/0bf16d40d09f19a5a7e6c6c68be5d22bdae62099, it seems to be based on commits that you have not pushed yet:

In file included from drivers/net/ethernet/rtl838x_eth.c:24:
drivers/net/ethernet/rtl838x_eth.c: In function 'rtl93xx_hw_en_rxtx':
drivers/net/ethernet/rtl838x_eth.c:809:53: error: 'RTL931X_L2_UNKN_UC_FLD_PMSK' undeclared (first use in this function); did you mean 'RTL930X_L2_UNKN_UC_FLD_PMSK'?
  809 |                 sw_w32_mask(0, BIT(priv->cpu_port), RTL931X_L2_UNKN_UC_FLD_PMSK);
      |                                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/mach-rtl838x/mach-rtl83xx.h:24:40: note: in definition of macro 'sw_w32'
   24 | #define sw_w32(val, reg)        writel(val, RTL838X_SW_BASE + reg)
      |                                        ^~~
./arch/mips/include/asm/mach-rtl838x/mach-rtl83xx.h:26:41: note: in expansion of macro 'sw_r32'
   26 |                                 sw_w32((sw_r32(reg) & ~(clear)) | (set), reg)
      |                                         ^~~~~~
drivers/net/ethernet/rtl838x_eth.c:809:17: note: in expansion of macro 'sw_w32_mask'
  809 |                 sw_w32_mask(0, BIT(priv->cpu_port), RTL931X_L2_UNKN_UC_FLD_PMSK);
      |                 ^~~~~~~~~~~
drivers/net/ethernet/rtl838x_eth.c:809:53: note: each undeclared identifier is reported only once for each function it appears in
  809 |                 sw_w32_mask(0, BIT(priv->cpu_port), RTL931X_L2_UNKN_UC_FLD_PMSK);
      |                                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/mach-rtl838x/mach-rtl83xx.h:24:40: note: in definition of macro 'sw_w32'
   24 | #define sw_w32(val, reg)        writel(val, RTL838X_SW_BASE + reg)
      |                                        ^~~
./arch/mips/include/asm/mach-rtl838x/mach-rtl83xx.h:26:41: note: in expansion of macro 'sw_r32'
   26 |                                 sw_w32((sw_r32(reg) & ~(clear)) | (set), reg)
      |                                         ^~~~~~
drivers/net/ethernet/rtl838x_eth.c:809:17: note: in expansion of macro 'sw_w32_mask'
  809 |                 sw_w32_mask(0, BIT(priv->cpu_port), RTL931X_L2_UNKN_UC_FLD_PMSK);
      |                 ^~~~~~~~~~~

A grep for that identifier only yields the one location:

#> grep RTL931X_L2_UNKN_UC_FLD_PMSK -r target/linux/realtek/
target/linux/realtek/files-5.10/drivers/net/ethernet/rtl838x_eth.c:		sw_w32_mask(0, BIT(priv->cpu_port), RTL931X_L2_UNKN_UC_FLD_PMSK);

Also, what do you make of the fact that 1500 byte packets seem to work on the DGS-1210-28 F2 (recent master), and on the Netgear GS108T v3 (release version with the 5.4 kernel)?

Oops, some of the RTL931x work slipped into this. You can just comment out line 809; it is only relevant on that architecture. Anyway, this was not a PR because there are obvious anti-features such as limiting the size of frames that are merely being switched. Also, kernel 5.10 sends packets up to 1514 bytes in total over the network in its default configuration, but for some reason it does not do up to 1500 bytes net. The difference is certainly somewhere in the kernel, since what happens is that the packets get fragmented before they reach the Ethernet driver. There was intensive work on DSA and MTUs for 5.10, see e.g. https://patchwork.ozlabs.org/project/netdev/patch/20191123194844.9508-2-olteanv@gmail.com/
I am not convinced that all the assumptions in that patch actually make sense when used with a switch.

Thank you very much, I just bought a used TL-SG3216 for 40 € and I will try my best to get console access to it, I have to wait until it has arrived. I will post some photos and hopefully a bootlog then.
