Support for RTL838x based managed switches

I have a similar issue:

I have a dgs-1210-10p and a (hopefully) sane DSA config.
EDIT:

  • Version: r20255-0582acf429 (from a few days ago).
  • Exact Model: Could be F1, as a sticker on the back (not bottom) indicates, but not sure :man_shrugging:

When I use proto dhcp on one of the VLAN interfaces, the interface gets an address, but IPv4 does not work: the gateway IP never gets an ARP entry in the neighbor table.

(IPv6 however just works fine.)

root@sw1:~# netstat -nt
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0    156 fde6:a09a:b373:10::d1f9:22 fde6:a09a:b373:41::d22f:34530 ESTABLISHED

BUT if I assign that address (or any other) manually, then ARP, and therefore IPv4, works.
At first I thought it was an issue with one untagged and multiple tagged VLANs on that hybrid port, until I realized that DHCP went fine and it is really just ARP...
For now I have just set up all VLANs as tagged on that trunk.

DHCP log from cpe:

Wed Aug 10 19:41:48 2022 daemon.info dnsmasq-dhcp[18565]: DHCPRELEASE(br-vlan16) 192.168.16.180 08:5a:11:a2:7f:10
Wed Aug 10 19:41:49 2022 daemon.info dnsmasq-dhcp[18565]: DHCPRELEASE(br-vlan16) 00:03:00:01:08:5a:11:a2:7f:10
Wed Aug 10 19:41:50 2022 kern.info kernel: [93263.028531] Atheros AR8216/AR8236/AR8316 mdio.0:00: Port 2 is down
Wed Aug 10 19:42:46 2022 kern.info kernel: [93318.323956] Atheros AR8216/AR8236/AR8316 mdio.0:00: Port 2 is up
Wed Aug 10 19:43:11 2022 kern.info kernel: [93343.924115] Atheros AR8216/AR8236/AR8316 mdio.0:00: Port 2 is down
Wed Aug 10 19:43:13 2022 kern.info kernel: [93345.971781] Atheros AR8216/AR8236/AR8316 mdio.0:00: Port 2 is up
Wed Aug 10 19:43:14 2022 daemon.info dnsmasq-dhcp[18565]: DHCPSOLICIT(br-vlan16) 00:03:00:01:08:5a:11:a2:7f:10
Wed Aug 10 19:43:14 2022 daemon.info dnsmasq-dhcp[18565]: DHCPADVERTISE(br-vlan16) fde6:a09a:b373:10::d1f9 00:03:00:01:08:5a:11:a2:7f:10 sw1
Wed Aug 10 19:43:14 2022 daemon.info dnsmasq-dhcp[18565]: DHCPADVERTISE(br-vlan16) 2003:XX:bf2f:9210::d1f9 00:03:00:01:08:5a:11:a2:7f:10 sw1
Wed Aug 10 19:43:16 2022 daemon.info dnsmasq-dhcp[18565]: DHCPREQUEST(br-vlan16) 00:03:00:01:08:5a:11:a2:7f:10
Wed Aug 10 19:43:16 2022 daemon.info dnsmasq-dhcp[18565]: DHCPREPLY(br-vlan16) fde6:a09a:b373:10::d1f9 00:03:00:01:08:5a:11:a2:7f:10 sw1
Wed Aug 10 19:43:16 2022 daemon.info dnsmasq-dhcp[18565]: DHCPREPLY(br-vlan16) 2003:XX:bf2f:9210::d1f9 00:03:00:01:08:5a:11:a2:7f:10 sw1
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPDISCOVER(br-vlan16) 08:5a:11:a2:7f:10
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPOFFER(br-vlan16) 192.168.16.180 08:5a:11:a2:7f:10
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPDISCOVER(br-vlan16) 08:5a:11:a2:7f:10
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPOFFER(br-vlan16) 192.168.16.180 08:5a:11:a2:7f:10
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPREQUEST(br-vlan16) 192.168.16.180 08:5a:11:a2:7f:10
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: DHCPACK(br-vlan16) 192.168.16.180 08:5a:11:a2:7f:10 sw1
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: SLAAC-CONFIRM(br-vlan16) 2003:XX:bf2f:9210:a5a:11ff:fea2:7f10 sw1
Wed Aug 10 19:43:20 2022 daemon.info dnsmasq-dhcp[18565]: SLAAC-CONFIRM(br-vlan16) fde6:a09a:b373:10:a5a:11ff:fea2:7f10 sw1

IPv4 does not work

root@sw1:~# ping 192.168.16.1
PING 192.168.16.1 (192.168.16.1): 56 data bytes
^C
--- 192.168.16.1 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

root@sw1:~# ip n
192.168.16.132 dev switch.16  FAILED
192.168.16.1 dev switch.16  FAILED
2003:XX:bf2f:9210::1 dev switch.16 lladdr 02:00:01:01:00:10 router STALE
fe80::1ff:fe01:43 dev switch.67 lladdr 02:00:01:01:00:43 router STALE
fde6:a09a:b373:10::1 dev switch.16 lladdr 02:00:01:01:00:10 router STALE
fe80::1ff:fe01:10 dev switch.16 lladdr 02:00:01:01:00:10 router REACHABLE

root@sw1:~# ip -o link show dev switch.16
15: switch.16@switch: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000\    link/ether 08:5a:11:a2:7f:10 brd ff:ff:ff:ff:ff:ff

root@sw1:~# ip -o -4 addr show dev switch.16
15: switch.16    inet 192.168.16.180/24 brd 192.168.16.255 scope global switch.16\       valid_lft forever preferred_lft forever

root@sw1:~# ping 192.168.16.1
PING 192.168.16.1 (192.168.16.1): 56 data bytes
^C
--- 192.168.16.1 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

root@sw1:~# ip -4 addr flush dev switch.16
root@sw1:~# ip addr add 192.168.16.180/24 dev switch.16

root@sw1:~# ping -c 2 192.168.16.1
PING 192.168.16.1 (192.168.16.1): 56 data bytes
64 bytes from 192.168.16.1: seq=0 ttl=64 time=1.877 ms
64 bytes from 192.168.16.1: seq=1 ttl=64 time=0.836 ms

--- 192.168.16.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
root@sw1:~# cat /etc/config/network
                                               
config interface                'loopback'
        option  device          'lo'           
        option  proto           'static'
        option  ipaddr          '127.0.0.1'
        option  netmask         '255.0.0.0'
                                               
config device                                  
        option  name            'switch'
        option  type            'bridge'
        list    ports           'lan1'     
        list    ports           'lan2'  
        list    ports           'lan3'
        list    ports           'lan4'
        list    ports           'lan5'     
        list    ports           'lan6'
        list    ports           'lan7'  
        list    ports           'lan8'         
        list    ports           'lan9'       
        list    ports           'lan10' 
        option  macaddr         '08:5a:11:a2:7f:10'
                                               
##############################################################################
                                               
config bridge-vlan
        option  device          'switch'
        option  vlan            '1'
        list    ports           'lan2:u*'
        list    ports           'lan3:u*'
        list    ports           'lan4:u*'
        list    ports           'lan5:u*'
        list    ports           'lan6:u*'
        list    ports           'lan7:u*'

config bridge-vlan
        option  device          'switch'
        option  vlan            '16'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '17'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '64'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '65'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '66'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '67'
        list    ports           'lan8:t'

config bridge-vlan
        option  device          'switch'
        option  vlan            '4094'
        list    ports           'lan1:u*'

##############################################################################

config interface 'vlan1'
        option  device          'switch.1'
        option  proto           'none'

config interface 'vlan16'
        option  device          'switch.16'
        option  proto           'dhcp'

config interface 'vlan16_v6'
        option  device          'switch.16'
        option  proto           'dhcpv6'

config interface 'vlan17'
        option  device          'switch.17'
        option  proto           'none'

config interface 'vlan64'
        option  device          'switch.64'
        option  proto           'none'

config interface 'vlan65'
        option  device          'switch.65'
        option  proto           'none'

config interface 'vlan66'
        option  device          'switch.66'
        option  proto           'none'

config interface 'vlan67'
        option  device          'switch.67'
        option  proto           'none'

config interface 'vlan4094'
        option  device          'switch.4094'
        option  proto           'static'
        option  ip6class        'local'
        option  ip6assign       '64'

This might be relevant, but I have no clue how to interpret it. I assume some offloading on that chip is still buggy?

Wed Aug 10 19:34:16 2022 kern.info kernel: [   46.459861] rtl83xx_l3_nexthop_update: Setting up fwding: ip 192.168.16.1, GW mac 0000020001010010
Wed Aug 10 19:34:16 2022 kern.info kernel: [   46.470024] Route with id 3 to 192.168.0.0 / 16
Wed Aug 10 19:34:16 2022 kern.info kernel: [   46.475127] Using packet counter 0
Wed Aug 10 19:43:52 2022 kern.info kernel: [   77.949502] rtl83xx_l3_nexthop_update: Setting up fwding: ip 192.168.16.1, GW mac 0000020001010010
Wed Aug 10 19:43:52 2022 kern.info kernel: [   77.959657] Route with id 3 to 192.168.0.0 / 16
Wed Aug 10 19:43:52 2022 kern.info kernel: [   77.964745] rtl83xx_l3_nexthop_update: total packets: 128
Wed Aug 10 19:44:07 2022 kern.info kernel: [   92.829857] rtl83xx_l3_nexthop_update: Setting up fwding: ip 192.168.16.1, GW mac 0000020001010010
Wed Aug 10 19:44:07 2022 kern.info kernel: [   92.840006] Route with id 3 to 192.168.0.0 / 16
Wed Aug 10 19:44:07 2022 kern.info kernel: [   92.845095] rtl83xx_l3_nexthop_update: total packets: 246
Wed Aug 10 19:44:46 2022 authpriv.info dropbear[3134]: Child connection from fde6:a09a:b373:41::d22f:34530
Wed Aug 10 19:44:47 2022 authpriv.notice dropbear[3134]: Auth succeeded with blank password for 'root' from fde6:a09a:b373:41::d22f:34530
Wed Aug 10 19:46:51 2022 kern.info kernel: [  257.503554] rtl83xx_fib4_del: found a route with id 1, nh-id 0
Wed Aug 10 19:46:51 2022 kern.err kernel: [  257.510114] rtl83xx-switch switch@1b000000: unknown nexthop, id 0
Wed Aug 10 19:46:51 2022 kern.err kernel: [  257.522297] rtl83xx-switch switch@1b000000: unknown nexthop, id 0
Wed Aug 10 19:46:51 2022 kern.info kernel: [  257.529752] rtl83xx_fib4_del: found a route with id 2, nh-id 0
Wed Aug 10 19:46:51 2022 kern.err kernel: [  257.536491] rtl83xx-switch switch@1b000000: unknown nexthop, id 0
Wed Aug 10 19:46:51 2022 kern.err kernel: [  257.544569] rtl83xx_fib4_del: no such gateway: 0.0.0.0
Wed Aug 10 19:46:51 2022 kern.info kernel: [  257.550350] rtl83xx_fib4_del: found a route with id 3, nh-id 3
Wed Aug 10 19:46:51 2022 kern.err kernel: [  257.557787] rtl83xx_fib4_del: no such gateway: 192.168.16.1
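
For what it's worth, the `GW mac` value in these lines looks like the gateway's MAC address stored in the low 48 bits of a 64-bit number. That is a reading of the log output, not something confirmed from the driver source. A minimal Python sketch of that reading, where `gw_mac_from_log` is a hypothetical helper name:

```python
def gw_mac_from_log(hex_u64: str) -> str:
    """Format the low 48 bits of the logged 64-bit value as a MAC address."""
    value = int(hex_u64, 16)
    mac = value & 0xFFFFFFFFFFFF  # assumption: the MAC sits in the low 48 bits
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


logged = "0000020001010010"       # from the rtl83xx_l3_nexthop_update lines
neighbour = "02:00:01:01:00:10"   # lladdr of the router on switch.16 in `ip n`
print(gw_mac_from_log(logged), gw_mac_from_log(logged) == neighbour)
```

If this reading is right, the logged MAC matches the router's lladdr on switch.16, which would suggest the next hop itself was resolved correctly and the breakage sits somewhere after that step.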

Workaround #1: Disable IPv4 :sunglasses:
Workaround #2: Set a static DHCPv4 lease with name and proto static on sw1 :neutral_face:

OK, the root cause is an additional route.

Via DHCP I also send a route for 192.168.0.0/16 with

    list    dhcp_option         'option:classless-static-route, 192.168.0.0/16,192.168.16.1'

(yes, this collides with the RFC, but it works with most clients anyway).
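
For reference, a minimal Python sketch of how such a route is encoded on the wire as DHCP option 121 (RFC 3442): one prefix-length byte, only the significant octets of the destination, then the router address. RFC 3442 also says a client must ignore the Router option when option 121 is present, which is why the default route is usually repeated inside the option; whether that is the collision meant above is an assumption. The helper name `encode_classless_routes` is hypothetical, and 192.168.16.1 as default gateway is taken from the ping output earlier.

```python
import ipaddress


def encode_classless_routes(routes):
    """Encode (destination, router) pairs as a DHCP option 121 payload (RFC 3442)."""
    payload = bytearray()
    for destination, router in routes:
        net = ipaddress.ip_network(destination)
        significant = (net.prefixlen + 7) // 8   # only the significant octets are sent
        payload.append(net.prefixlen)
        payload += net.network_address.packed[:significant]
        payload += ipaddress.ip_address(router).packed
    return bytes(payload)


# Only the extra route, as in the dhcp_option line above.
only_extra = encode_classless_routes([("192.168.0.0/16", "192.168.16.1")])

# Default route included as well (assumed gateway: 192.168.16.1).
with_default = encode_classless_routes([
    ("0.0.0.0/0", "192.168.16.1"),
    ("192.168.0.0/16", "192.168.16.1"),
])

print(only_extra.hex(" "))
print(with_default.hex(" "))
```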

I have removed this option from dhcp and now I have only the local network and the default route present. And this works.

As soon as I add e.g. 192.168.0.0/16 via 192.168.16.1, I lose IPv4 connectivity.

As I want to use the DGS-1210 as a switch only, I have no specific problem right now, but could someone comment on this issue? For example:

  • is this known?
  • If it's known, why does it happen? Or am I just "holding it wrong" and there is a workaround?
  • Can we expect a fix for that?
  • Probably related: will it ever be possible to route at line speed?

I tested routing between 2 VLANs (on the same switch/bridge device) and got only 30 Mbit/s, which seems a little weak even for that CPU, or not?
D-Link states that 128 static IPv4 and 50 static IPv6 routes are supported. Is it wishful thinking that the routing is done by the ASIC when the original firmware is used? (Dumb me forgot to test the device after unboxing. We just soldered and flashed directly without much thinking :sweat_smile: )

However, when I add another route, I get the following in dmesg:

[  191.739762] rtl83xx_port_ipv4_resolve: resolved mac: 0000020001010010
[  191.747140] rtl83xx_l3_nexthop_update: Setting up fwding: ip 192.168.16.1, GW mac 0000020001010010
[  191.757299] Route with id 3 to 192.168.0.0 / 16
[  191.762496] Using packet counter 0

And then IPv4 no longer works.

TL;DR: Why does IPv4 break if I add another route?

Btw, just throwing out some food for thought as well:

We know the I2C channels share the clock pin, which from an I2C point of view is of course weird, as you can't do shared data transfers. Sharing the clock pin does reduce the pin cost by 1, which may be worth it.

The obvious reason why they want 2 I2C controllers, however, is of course I2C addressing. This way, the SFP cages can be addressed the normal way, without special magic (or impossibilities?).

So in that sense it is logical. My question relating to this: do we have source drops of switches with MORE than two SFP ports? I would suspect that the rtl93xx can do more SFP ports, which is why the manual lists 0-7 I2C data pins (but no actual pins are defined).

IPv4 DHCP works fine for me after I've installed the image to flash; only the initramfs image/netboot doesn't seem to work. The statically configured IP does work fine, however.

I think I kept hearing 'max 50Mbit' for our SoC, so 30 Mbit doesn't seem too far off, especially if you consider ingress + egress? What does 'top' say while you are pushing that much data around?
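
To put those figures in per-packet terms, here is a rough Python calculation. It assumes full 1500-byte packets (the MTU shown for switch.16 earlier in the thread) and ignores Ethernet framing overhead, so treat the result as an order-of-magnitude estimate only; `packets_per_second` is a hypothetical helper.

```python
def packets_per_second(mbit_per_s: float, frame_bytes: int = 1500) -> float:
    """Packets per second for a given throughput, ignoring Ethernet overhead."""
    return mbit_per_s * 1_000_000 / (frame_bytes * 8)


print(packets_per_second(30))   # the measured 30 Mbit/s
print(packets_per_second(50))   # the 'max 50Mbit' figure
```

With software routing, every one of those packets is received and sent again by the CPU, which fits with the load showing up in softirq context rather than in a user process.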

BUT, I do think birger was working on L3 offloading, which our chips seem to support, so in the future this may be possible.

DGS-1210-16 (rtl8382m) and larger have 4 SFP ports.

The DGS-1210 series bit-bangs the I2C for the SFP ports via the RTL8231.

Yeah, it did for me too; the issue was that I sent an additional route via DHCP, and that "extra" route breaks something. It does not matter whether the route comes via DHCP or I add it via ip route. See my last post: Support for RTL838x based managed switches - #1774 by _bernd

As I would have suspected, ksoftirqd/0 consumes most of the resources.

That's great news!

I obviously understand :slight_smile: I'm more looking at this: the rtl93xx apparently has 8 I2C interfaces (with 2 clocks), but we don't know which GPIOs those attach to.

I get that; that is why I wanted to make it clear that the DGS-1210 series are not the ones to look at, as they just bit-bang I2C and use a different SoC.

Check the i2c_mux_data structs in target/linux/realtek/files-5.10/drivers/i2c/muxes/i2c-mux-rtl9300.c
IIRC those are the actual GPIO lines those pins are muxed on. Realtek likes shared I2C clock lines so much, they designed a hardware peripheral that does just that...

Is it just me, or do these SoCs look like an FPGA design turned into dedicated silicon?

Seeing the register definition tables, we've always had the impression this was tests-turned-to-silicon.

That would also explain the weird SPI controller

The SPI controller isn't part of the switchcore, but actually a peripheral with a well-defined register space (well, some control bits are still hidden in the switchcore...). Don't really know the SPI controller that well, but the GPIO and watchdog controllers weren't too bad to write a driver for.

Isn't it weird in the sense that it's actually half-duplex and causes the reset not to actually work (except for forcing it with GPIOs)?

It is, those SPIF control bits have no business being in a PLL register. IIRC the issue is that one half of the connection stays in 4-byte addressing mode on reset, causing the bootloader not to load.
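
A simplified Python model of why such a mismatch breaks reads. It assumes the standard SPI NOR read opcode 0x03 followed by a big-endian address, and it assumes, purely for illustration, that the flash is the side left in 4-byte mode and that the missing address byte is clocked in as zero. `read_command` and `flash_sees` are hypothetical helpers, not driver functions.

```python
READ = 0x03  # standard SPI NOR read opcode


def read_command(address: int, address_bytes: int) -> bytes:
    """Build the opcode plus big-endian address for a SPI NOR read."""
    return bytes([READ]) + address.to_bytes(address_bytes, "big")


def flash_sees(command: bytes, flash_address_bytes: int) -> int:
    """Address the flash decodes; missing bytes are assumed to be clocked in as zero."""
    raw = command[1:1 + flash_address_bytes]
    raw = raw + bytes(flash_address_bytes - len(raw))
    return int.from_bytes(raw, "big")


cmd = read_command(0x001000, 3)     # controller speaks 3-byte addressing
print(hex(flash_sees(cmd, 3)))      # flash also in 3-byte mode: same address
print(hex(flash_sees(cmd, 4)))      # flash left in 4-byte mode: different address
print((1 << 24) // 2**20)           # MiB reachable with a 3-byte address
```

In this model the flash also swallows what the controller already treats as the first data byte, so even a read from address 0 would come back shifted by one byte.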

Sadly, there's only so much we can learn from that driver, as the interesting bits really are in the data structure (which is exactly what you said :stuck_out_tongue: ).

We see 8 SDA pins (93100 has 16!!); we start with pin 8 for the clock, with a second clock at pin 17, so that's at least news! It could be that we have clk0, dat0-7, clk1, and I would guess any of the data pins can be connected to either clk0 or clk1, which means the data pins have 2 mux options. But that probably needs some more digging and experimentation, and some accessible test pads/pins to measure ...

Anyway, I've added the data to the pinout on the wiki ...
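
The guessed layout can be written down as a small Python table. Only the two clock pins (8 and 17) come from the post above; placing dat0-7 on the pins in between, and letting every data pin pair with either clock, are assumptions that still need measuring. All names here are hypothetical.

```python
# Clock pins as stated above: clk0 on pin 8, clk1 on pin 17.
SCL_PINS = {0: 8, 1: 17}

# Assumption: dat0..dat7 occupy the pins in between (9..16).
SDA_PINS = {n: 9 + n for n in range(8)}


def pins_for(channel: int, clock: int) -> tuple:
    """Return (scl_pin, sda_pin) for a data channel paired with a clock."""
    return SCL_PINS[clock], SDA_PINS[channel]


for channel in range(8):
    print(channel, pins_for(channel, 0), pins_for(channel, 1))
```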

Do we know of any device, though, that actually makes use of this? I wonder if someone has a 930x board with a big bunch of SFP cages ... I see the EnGenius ECS2512 would be an interesting candidate, as would the Mestechs_MSG9424. The MS108EUP seems hard to get, though, as does the TL-ST1008F. But none of us on this thread seem to have any of these switches :frowning: They'd be very interesting testing targets for sure.

Yeah, I've read a bit about it, and somewhere it makes sense, it saves you on pins. I think this peripheral can be best treated as an I2C mux, such as the pca9541 and co ... at least the driver is currently already in the mux sub-dir; so that's a great start :stuck_out_tongue:

Edit: looking more closely at the code, it claims we have 2 * 8, so the 'pins=8' is inconsistent with what was written earlier. It would not be weird to have 8 I2C data pins per controller. The i2c.h from the SDK backs this up as well. I guess the best idea is to request the source code for the TP-Link listed above. But my gut tells me they most likely use an RTL8231 as GPIO and bit-bang the I2C ...

Well, remember, all these SoCs start their life as FPGAs. I've worked a bit with Xilinx SoCs/FPGAs, and if you've ever played with Vivado, it's quite obvious too that this is how you build a modern SoC. You buy what you can as modules and glue it all together. If a module is too expensive, you usually 'borrow' something from OpenCores, or write your own module (in VHDL). You do some extensive tests on it, hope all your timings can be met, and then do an ASIC from it; re-test it, and if all checks out, ship it.

You can even find the defines in the code :stuck_out_tongue: https://gitlab.com/olliver/openwrt/realtek_sdk/-/blob/openwrt-dev/bin/sdk_release.sh#L206 for example :slight_smile:

The GPIO is the most basic 'peripheral' an FPGA would have. It really is just a pin, and even in Vivado that's an 'open-source' module, no secrets there (as it is so trivial). Watchdog similar, or using something off-the-shelf should be cheap.

These special bits are where things become interesting. The switch core is most likely the same recycled VHDL they've been using and improving for decades. It's almost as easy as 'drag-n-drop' :slight_smile:

Even their own internal PRL (preloader) won't load? I know there's a register that tells you the strapping bits, and the strapping bits include a pin to set 3 or 4 pin mode. If this were misconfigured (resistor missing/incorrectly placed), then something probably goes wrong. If it is because it's set wrongly somewhere on shutdown, then it's a fixable software bug. A proper bootloader should always be able to do the right thing (tm). But that's much further future work :wink:

I get that everything starts in simulation, moves on to real FPGA, and eventually gets onto a wafer.
Only did some basic FPGA work before I dropped out of college, but it was just a simple Spartan-6 board and no Vivado as Vivado did not support anything before Spartan-7.

I don't think that the bootloader gets stuck, as this has to happen before it; it's probably that the HW fails to memory-map the NOR due to incorrect addressing.
The SparX-5 that I am working on has no FW blobs at all (this is a miracle with ARMv8 and A53), and the HW SPI controller memory-maps the NOR, but the addressing is configurable in a register (defaults to 3 bytes, obviously), and that is how it boots, as it only supports booting from 0x0 of CS0 and only from SPI-NOR.

I doubt anything is done in pure simulation anymore. Yes you need to still simulate a lot, but having an FPGA to run your build on happens 'in parallel' I'd argue for the non-critical stuff. I2C, GPIO, SPI etc etc. Even DDR probably, and they likely have a demo board which has some of these things (flash, ddr).

Before Vivado, there was Xilinx ISE 'web-pack' or something? I've actually used that too :smiley: but that was years ago ...

I really would like to learn more about this. I've seen some posts in this thread years ago ...

I know that as mentioned before, we have the strapping pin. Also, the BOOTROM (burnt into the SoC) in the end is also just software, so should be able to read those strapping pins and do the right thing. But, lazy devs are lazy devs :stuck_out_tongue:

There's always 'something' burnt into the chip, unless they magically map the NOR to 0x00000 (the starting address), which is super unlikely ... The ALU will need to execute SOMEthing, and that something must be at 0x0000 (or whatever they made their init address). Usually, this is where the BOOTROM lives.

Speaking of bootloader, my shameless plug:
U-Boot for XGS12xx switches (and others based on realtek rtl930x): for those that would like to help (testing) with an XGS1xxx device, please please do :smiley: but let's keep discussion in that thread :smiley:
