Support for RTL838x based managed switches

The PR added support for Link Aggregation. Just try it out.
There is a caveat, which is why no big announcement has been made so far: we pulled in upstream code from 5.15 that enables configuring L2 learning per port. The issue is that at the driver level the default is not to learn, whereas at the bridge level learning is enabled by default. And we forgot to merge that code in. So one has to enable L2 learning manually in userspace. Have a look at the discussion at the end of that PR.
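
For anyone who wants to script that manual step, here is a minimal sketch; it is not taken from the PR. It assumes the full iproute2 `bridge` utility is available on the switch and that the ports are called lan1 to lan8 (a hypothetical list, adjust it to your device). By default it only prints the commands, so it can be run on a build host and the output pasted into a shell on the switch.

```python
import subprocess


def learning_commands(ports, state="on"):
    """Build one `bridge link set` argv list per port (nothing is executed)."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return [["bridge", "link", "set", "dev", port, "learning", state]
            for port in ports]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical port list, not taken from any particular device.
    ports = [f"lan{i}" for i in range(1, 9)]
    run_commands(learning_commands(ports))
```

Passing `state="off"` followed by `state="on"` gives the off/on toggle, should that be needed.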

I skipped the flash layout change, the revert and the last logging change, and it builds. But it seems some device tree adjustment is required. I am comparing my diffs against the XGS1250 files, but it doesn't work.

I don't get the switch interface added, and the lan interfaces don't work even after I have manually brought them up.

Any ideas what device tree nodes are missing?

[    5.042466] REALTEK RTL9300 SERDES rtl838x slave mii-0:1a: Detected internal RTL9300 Serdes
[    5.051845] REALTEK RTL9300 SERDES rtl838x slave mii-0:1a: No DT node.
[    5.059156] REALTEK RTL9300 SERDES: probe of rtl838x slave mii-0:1a failed with error -22
[    5.075871] REALTEK RTL9300 SERDES rtl838x slave mii-0:1b: Detected internal RTL9300 Serdes
[    5.085250] REALTEK RTL9300 SERDES rtl838x slave mii-0:1b: No DT node.
[    5.092523] REALTEK RTL9300 SERDES: probe of rtl838x slave mii-0:1b failed with error -22
[    5.108743] REALTEK RTL9300 SERDES rtl838x slave mii-0:3f: Detected internal RTL9300 Serdes
[    5.118110] REALTEK RTL9300 SERDES rtl838x slave mii-0:3f: No DT node.
[    5.125431] REALTEK RTL9300 SERDES: probe of rtl838x slave mii-0:3f failed with error -22

Did you check the DTS changes for the XGS1250-12 and compare to the old one?

Yes, I diffed the XGS1250 files between my old copy of vsmp and the one that got merged, but there was not much difference. Has anyone verified that the XGS1250 still works with OpenWrt master?

A bit off topic, but I'm wondering if anybody has an idea.
The D-Link DGS series uses a dual-firmware layout, which wastes a lot of space for my use, so I set out to figure out how to use the full NOR.
Basically, D-Link verifies the sanity of both the kernel and the rootfs before booting anything, even an initramfs, so if both firmware copies are corrupted then tough luck.

Essentially, they check this by looking for the custom CAMEOTAG magic after the kernel and after the rootfs. The magic is followed by the version (usually just 1) and then a checksum.
Somebody previously did some digging and figured out the format:

    0x0                 0x4                 0x8                 0xC
     +-------------------+-------------------+-------------------+-------------------+
 0x0 |                               FW Version (text)                               |
     +-------------------+-------------------+-------------------+-------------------+
0x10 |'C'  'A'  'M'  'E'  'O'  'T'  'A'  'G' |     OS Version    |    Checksum *1    |
     +-------------------+-------------------+-------------------+-------------------+ 
 
1: Checksum
     data without block header and last 4bytes of checksum
     (unit: 1byte, (addr 0x0) + (addr 0x1) + ...)

Source: https://memo205.wordpress.com/2021/09/20/dgs-1210-28-f1-フゑームウェをパヒ/

The tag can be added using the imgtag tool from the GPL toolchain's host/tools folder, while the checksum is calculated using the genTotalChecksum tool from the same folder.

The CAMEOTAG and version are static so they can just be appended but the checksum is dynamic.
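
As a rough illustration of the layout above, here is a minimal Python sketch that builds the 32-byte tag for a given payload. It is not the vendor's imgtag or genTotalChecksum tool. Assumptions: the checksum is a plain byte-wise sum over the payload only (tag excluded), truncated to 32 bits; both 4-byte fields are stored big-endian; the version text is NUL-padded to 16 bytes. The function and parameter names are made up for the example.

```python
import struct

MAGIC = b"CAMEOTAG"


def byte_sum(data: bytes) -> int:
    """Sum all bytes, truncated to 32 bits (assumed checksum width)."""
    return sum(data) & 0xFFFFFFFF


def build_tag(payload: bytes, fw_version: bytes = b"", os_version: int = 1) -> bytes:
    """Return the 32-byte tag: version text, magic, OS version, checksum."""
    if len(fw_version) > 16:
        raise ValueError("FW version text must fit in 16 bytes")
    tag = fw_version.ljust(16, b"\0")            # 0x00: FW version (text), padding assumed
    tag += MAGIC                                 # 0x10: 'CAMEOTAG'
    tag += struct.pack(">I", os_version)         # 0x18: OS version, big-endian assumed
    tag += struct.pack(">I", byte_sum(payload))  # 0x1C: checksum, big-endian assumed
    return tag


if __name__ == "__main__":
    payload = b"\x01\x02\x03"  # stand-in for a kernel or rootfs image
    print((payload + build_tag(payload, b"1.0")).hex())
```

Any output of this should be compared against an image tagged with the vendor tools before trusting it.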

The issue is that, even with proper tagging and a correct checksum, the bootloader seems to expect the tag and checksum within the respective partitions.
So, it wants the kernel tag to end up being in the kernel partition which is only 1.5MB, so that obviously fails.

On the plus side, you can append the tag and checksum to an empty file and the bootloader accepts that, so there is no custom header check.

Basically, OpenWrt works only because the second FW is left intact and that passes the verification.

I just built from master and I'm seeing these on my XGS1250-12:

# dmesg -l err
[    6.866071] REALTEK RTL9300 SERDES rtl838x slave mii-0:1b: No DT node.
[    6.895676] REALTEK RTL9300 SERDES rtl838x slave mii-0:3f: No DT node.

I just wiped my buildroot (make dirclean & make distclean) and the ramdisk image for the XGS1250-12 is not giving me any network (even with rtk network on). Tried with a snapshot grabbed from the download servers and with my own build. An older build does give functional network. Strangely enough I'm seeing link activity. The network is up. And ethtool reports info.

@anon13997276 Can you check with master on your XGS1250-12?

Console messages when plugging in a cable into port 1:

[  366.967142] switch: port 1(lan1) entered disabled state
[  372.162691] rtl93xx_phylink_mac_config port 0, mode 0, phy-mode: xgmii, speed 1000, link 1
[  372.171946] rtl93xx_phylink_mac_config SDS is 2
[  372.177025] rtl9300_sds_rst 16
[  372.200423] rtl83xx-switch switch@1b000000 lan1: Link is Up - 1Gbps/Full - flow control rx/tx
[  372.212732] switch: port 1(lan1) entered blocking state
[  372.218642] switch: port 1(lan1) entered forwarding state
[  372.227902] rtl93xx_phylink_mac_config port 0, mode 0, phy-mode: xgmii, speed 1000, link 1
[  372.237176] rtl93xx_phylink_mac_config SDS is 2
[  372.242212] rtl9300_sds_rst 16
root@OpenWrt:/# ethtool lan1
Settings for lan1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full 
                                             100baseT/Half 100baseT/Full 
                                             1000baseT/Full 
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes

So I know that for some devices OpenWrt just replaces the entire stock firmware, U-Boot and all (I'm familiar with some Kirkwood devices). Would it be possible to build our own U-Boot for the D-Link devices, thereby bypassing this check?

This was my first idea, but I am not sure the U-Boot from the GPL dump is sufficient: the code that performs the check is conveniently either not present or only present in a precompiled object, like the board detection and config, which D-Link also ships precompiled.

I tried decompiling the U-Boot binary, but Ghidra is not the best with MIPS.

OK, found the part responsible for checking the tag.
It's a precompiled object in common/cmd_tool.o

I am not sure, but isn't this a breach of GPLv2?

OK, so it looks really easy to bypass the checks: they are just called in do_bootm, so I will just remove the calls and that's it.

Yeah, just remove the call to fixup_checksum_linux in do_bootm and then it works

Hi,

I am on vacation and could only smuggle an XGS1210 past my wife. I wanted to work on getting the network up without the help of U-Boot, but have not gotten around to that either.
Does the network work for the 3 10GBit ports?

There were quite a few changes to the code in the last few days, so it is entirely possible something got broken at the last minute. If it is the MAC-PHY link, then my suspicion would be on something in dsa.c.

Yep, oddly enough they do. Tried all three; they reply to pings. Switched to a plain gigabit port: again nothing...

OK, this helps to limit it to something going on with the connection between SerDes and the 8218D PHY. We actually got stuck in quarantine here for another week at least. But the issue with the 8218D is something I can also look into with the XGS1210.

No rush. Hope you test negative soon :slightly_smiling_face: .

@blogic Is there a way to get realtek-poe to be more verbose? PoE stopped working at some point on my GS1900-8HP v1. The OEM firmware still powers PoE clients as expected. All I'm seeing is that the port my PoE client is connected to has its status listed as 'unknown', e.g. the 1st port here. Early on, lowering the power budget (the device is specced for 70 W, ZyXEL says) worked for a bit, but that's not the case anymore.

# ubus call poe info
{
	"firmware": "v17.1",
	"mcu": "ST Micro ST32F100 Microcontroller",
	"budget": 65.000000,
	"consumption": 0.000000,
	"ports": {
		"lan1": {
			"priority": 2,
			"mode": "PoE+",
			"status": "unknown"
		},
		"lan2": {
			"priority": 2,
			"mode": "PoE+",
			"status": "Searching"
		}
	}
}

Configuration:


config global
        option budget   '65'

config port
        option enable   '1'
        option id       '1'
        option name     'lan1'
        option poe_plus '1'
        option priority '2'

config port
        option enable   '1'
        option id       '2'
        option name     'lan2'
        option poe_plus '1'
        option priority '2'
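
To make it easier to spot such ports on larger switches, here is a small sketch that filters the `ubus call poe info` output by status. It only relies on the JSON structure shown above; the helper name is made up, and the sample is a trimmed copy of that output. How the JSON gets from the switch to wherever Python runs (for example captured over ssh) is left open.

```python
import json

# Trimmed copy of the `ubus call poe info` output shown above.
SAMPLE = """
{
    "budget": 65.000000,
    "consumption": 0.000000,
    "ports": {
        "lan1": {"priority": 2, "mode": "PoE+", "status": "unknown"},
        "lan2": {"priority": 2, "mode": "PoE+", "status": "Searching"}
    }
}
"""


def ports_with_status(info, status="unknown"):
    """Return the sorted names of ports whose status matches (case-insensitive)."""
    wanted = status.lower()
    return sorted(name for name, port in info.get("ports", {}).items()
                  if port.get("status", "").lower() == wanted)


if __name__ == "__main__":
    print(ports_with_status(json.loads(SAMPLE)))
```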

@bmork Any instructions on how I can check that packets aren't leaking to the CPU port after your patch? So I can provide my Tested-by.

(Patchwork seems to be down atm.)

I simply snoop on "switch" using tcpdump, observing that there are almost no unicast packets addressed to external clients.

For example, snooping while I start pinging between two clients on lan1 (10.11.12.13) and lan2 (10.11.12.12) I see this:

root@OpenWrt:/# tcpdump -ni switch
[98987.064784] device switch entered promiscuous mode
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on switch, link-type EN10MB (Ethernet), capture size 262144 bytes
12:42:24.743708 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:26.328199 IP 10.11.12.13 > 10.11.12.12: ICMP echo request, id 39923, seq 1, length 64
12:42:26.743079 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:28.745909 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:29.193655 DTPv1, length 38
12:42:29.697754 IP6 fe80::bea5:11ff:fe9f:e123 > ff02::1: ICMP6, router advertisement, length 120
12:42:30.748743 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:32.748181 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:34.751008 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:36.750410 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:38.753188 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
12:42:40.752645 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42
^C
12 packets captured

There is one initial ICMP packet because the address was unknown to the switch, but it learns from that packet and none of the remaining unicast packets show up on the CPU port.

The rest of the above are all multicast packets.
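
If eyeballing the capture gets tedious, the same check can be approximated with a short script. This is only a sketch under assumptions: it parses the one-line `tcpdump -n` format shown above, looks at IPv4 only, and treats any packet whose destination is neither multicast nor 255.255.255.255 and which does not involve one of the switch's own addresses as forwarded unicast. Subnet-directed broadcasts are not recognised because the netmask is unknown. The names are made up for the example.

```python
import ipaddress
import re

# "IP src[.port] > dst[.port]:" as printed by tcpdump -n; IP6 lines do not match.
PKT = re.compile(r"\bIP (\d+(?:\.\d+){3})(?:\.\d+)? > (\d+(?:\.\d+){3})(?:\.\d+)?:")
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def forwarded_unicast(lines, own_addrs=()):
    """Return capture lines carrying unicast IPv4 between hosts other than the switch."""
    own = {ipaddress.ip_address(a) for a in own_addrs}
    hits = []
    for line in lines:
        m = PKT.search(line)
        if not m:
            continue  # STP, DTP, IPv6 and other non-IPv4 lines
        src, dst = (ipaddress.ip_address(g) for g in m.groups())
        if dst.is_multicast or dst == LIMITED_BROADCAST:
            continue
        if src in own or dst in own:
            continue  # traffic to or from the switch itself is expected
        hits.append(line.rstrip())
    return hits


# A few lines copied from the capture above.
CAPTURE = [
    "12:42:24.743708 STP 802.1d, Config, Flags [none], bridge-id 8025.80:e8:6f:97:78:00.8001, length 42",
    "12:42:26.328199 IP 10.11.12.13 > 10.11.12.12: ICMP echo request, id 39923, seq 1, length 64",
    "12:42:29.193655 DTPv1, length 38",
    "12:42:29.697754 IP6 fe80::bea5:11ff:fe9f:e123 > ff02::1: ICMP6, router advertisement, length 120",
]

if __name__ == "__main__":
    for hit in forwarded_unicast(CAPTURE):
        print(hit)
```

A single hit at the start of a flow is expected, as explained above; a steady stream of hits would suggest the ports are not learning.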

Note that the difference with and without the patch is only the initial inconsistency. If you toggle learning off and on, then the bridge and DSA ports will become synchronized even without the patch.

But "no one" ever does that, making this fix critical for sane bridge behaviour.

EDIT: BTW, I've now also put the patch into one of my "production" GS1900-10HPs. There I can simply observe the port statistics to validate the patch (this was how I initially discovered the bug). Ran a series of speedtests on the NR7101 hanging off the switch for fun, getting 850 Mbps over 5G :slight_smile: Makes it easy to verify traffic spikes on the ports involved and no others. Not to mention that I'd expect the ethernet driver to explode with something like that thrown at it.

WRT port statistics monitoring, in case anyone wonders about that, I added simple ethtool statistics support to mini_snmpd. It's in https://github.com/troglobit/mini-snmpd . I kind of had plans to integrate that with the OpenWrt package once it ended up in a release. But I see that this is now more than a year old. How time flies...

Anyway, for now I simply run it with a config file like this:

root@gs1900-10hp:~# cat /etc/mini_snmpd.conf 
# mini-snmpd.conf

location       = "xxxxx"
contact        = "xxxxx"
description    = "ZyXEL GS1900-10HP"

# Vendor OID tree
#vendor         = ".1.3.6.1.4.1"

# true/false, or yes/no
authentication = true
community      = "public"

# MIB poll timeout, sec
timeout        = 1

# Disks to monitor, i.e. mount points in UCD-SNMP-MIB::dskTable
#disk-table     = { "/", }

# Interfaces to monitor, currently only for IF-MIB::ifTable
iface-table    = { "lan1", "lan2", "lan3", "lan4","lan7", "lan8", "lan9", "lan10" }

ethtool "lan*" {
        rx_bytes      = ifInOctets
        rx_mc_packets = ifInMulticastPkts
        rx_bc_packets = ifInBroadcastPkts
        rx_packets    = ifInUcastPkts
        #rx_errors     =
        rx_drops      = dot1dTpPortInDiscards
        tx_bytes      = ifOutOctets
        tx_mc_packets = ifOutMulticastPkts
        tx_bc_packets = ifOutBroadcastPkts
        tx_packets    = ifOutUcastPkts
        #tx_errors     =
        tx_drops      = ifOutDiscards
}

And then I can just monitor the OpenWrt realtek switches like any other switch using cacti.

If it isn't clear: The point of all that is that the lanX interface counters only show us the CPU traffic on the realtek DSA driver. The port counters are only available using ethtool. The changes to mini_snmpd let you return named ethtool counters instead of the usual interface counters.

For example, the counters of the port attached to the NR7101 demonstrate why the interface counters are useless (or of little interest at least):

root@gs1900-10hp:~# ifconfig lan7
lan7      Link encap:Ethernet  HWaddr BC:CF:4F:D1:6B:38  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:136 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6672 (6.5 KiB)  TX bytes:0 (0.0 B)

root@gs1900-10hp:~# ethtool -S lan7
NIC statistics:
     tx_packets: 0
     tx_bytes: 0
     rx_packets: 136
     rx_bytes: 6672
     ifInOctets: 13605459970
     ifOutOctets: 2400042653
     dot1dTpPortInDiscards: 0
     ifInUcastPkts: 10533929
     ifInMulticastPkts: 4
     ifInBroadcastPkts: 131
     ifOutUcastPkts: 3216870
     ifOutMulticastPkts: 8601
     ifOutBroadcastPkts: 617
     ifOutDiscards: 0
     .3SingleCollisionFrames: 0
     .3MultipleCollisionFrames: 0
     .3DeferredTransmissions: 0
     .3LateCollisions: 0
     .3ExcessiveCollisions: 0
     .3SymbolErrors: 0
     .3ControlInUnknownOpcodes: 0
     .3InPauseFrames: 0
     .3OutPauseFrames: 0
     DropEvents: 0
     tx_BroadcastPkts: 617
     tx_MulticastPkts: 8601
     CRCAlignErrors: 0
     tx_UndersizePkts: 0
     rx_UndersizePkts: 0
     rx_UndersizedropPkts: 0
     tx_OversizePkts: 0
     rx_OversizePkts: 0
     Fragments: 0
     Jabbers: 0
     Collisions: 0
     tx_Pkts64Octets: 239642
     rx_Pkts64Octets: 37010
     tx_Pkts65to127Octets: 1450137
     rx_Pkts65to127Octets: 1336624
     tx_Pkts128to255Octets: 20590
     rx_Pkts128to255Octets: 192270
     tx_Pkts256to511Octets: 14381
     rx_Pkts256to511Octets: 20533
     tx_Pkts512to1023Octets: 11779
     rx_Pkts512to1023Octets: 71524
     tx_Pkts1024to1518Octets: 1489559
     rx_StatsPkts1024to1518Octets: 8876103
     tx_Pkts1519toMaxOctets: 0
     rx_Pkts1519toMaxOctets: 0
     rxMacDiscards: 0
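
The gap between the two sets of counters can also be put into numbers with a few lines of Python. This is only a sketch: it parses `ethtool -S` text as shown above and assumes the counter names of this driver (rx_packets for what reached the CPU, ifIn*Pkts for what the port itself received); other drivers name their counters differently. The sample is an excerpt of the output above.

```python
# Excerpt of the `ethtool -S lan7` output shown above.
SAMPLE = """\
NIC statistics:
     tx_packets: 0
     tx_bytes: 0
     rx_packets: 136
     rx_bytes: 6672
     ifInOctets: 13605459970
     ifInUcastPkts: 10533929
     ifInMulticastPkts: 4
     ifInBroadcastPkts: 131
"""


def parse_ethtool_stats(text):
    """Turn `name: value` lines into a dict, skipping the header line."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value.strip())
    return stats


def port_rx_packets(stats):
    """Packets received on the switch port, per the hardware counters."""
    return (stats["ifInUcastPkts"] + stats["ifInMulticastPkts"]
            + stats["ifInBroadcastPkts"])


if __name__ == "__main__":
    stats = parse_ethtool_stats(SAMPLE)
    print("seen by CPU:", stats["rx_packets"])
    print("seen by port:", port_rx_packets(stats))
```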

EDIT: One shortcoming I meant to mention: mini_snmpd is limited to monitoring 8 ports, which is why you see I dropped lan5 and lan6 in that config. This is hard to fix. In theory we can just increase the static tables, but it's really designed for a limited number of ports and that will cost both memory and CPU. Anyway, it's great for the small switches.

ZyXEL GS1900-24 v1 support pull request has been submitted: https://github.com/openwrt/openwrt/pull/9400

When will there be a default PoE package in the OpenWrt image?

There is some disagreement with how to integrate realtek-poe; see this patchwork thread.

Some issues with realtek-poe:

  • Using the realtek-poe-add-support-for-PoE-on-Realtek-switches.diff patch acquired by hitting diff here,
  • applying it with patch -p1 < /tmp/realtek-poe-add-support-for-PoE-on-Realtek-switches.diff
  • to a copy of the openwrt tree one commit after current snapshot master that I just used to build OpenWrt for the GS1900-24HP v1
  • then running make,
  • (after a make clean, I do get an .ipk file);
  • Installing the resulting realtek-poe package on a running GS1900-24HP v1 and running it produces an interesting segfault:
[ 2184.754425] do_page_fault(): sending SIGSEGV to realtek-poe for invalid write access to 004020cc
[ 2184.764434] epc = 77dbeff9 in libubox.so.20211120[77dba000+17000]
[ 2184.771449] ra  = 00401487 in realtek-poe[400000+3000]

... Using the toolchain's objdump:

./build_dir/toolchain-mips_4kec_gcc-11.2.0_musl/binutils-2.37/binutils/objdump -dDS /tmp/realtek-poe | less

... I get:

  401480:       1a00 095c       jal     402570 <ustream_consume@mips16plt>
  401484:       6d0c            li      a1,12
  401486:       9980            lw      a0,0(s1)
  401488:       e42b            subu    v0,a0,s1
        if (list_empty(&cmd_pending))
  40148a:       2203            beqz    v0,401492 <poe_stream_msg_cb+0x46>
  40148c:       1a00 0507       jal     40141c <poe_cmd_send.isra.0>

At this point, I'm honestly not sure how to debug this going forward. I had to edit this post to remove incorrect assumptions -- I was looking at the wrong list_empty(&cmd_pending) in main.c, and even then, at 401488, I'm looking at something that GCC generated (I guess that's what inlining is?)
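
One thing that may help when relating the fault report to the disassembly: the sketch below pulls epc and ra out of the kernel messages above and converts them into offsets within the mapped object. It rests on two assumptions that should be double-checked: that bit 0 of a MIPS code address is the ISA mode bit (set for MIPS16 code, which the @mips16plt symbol suggests this is) and must be cleared before comparing with objdump, and that each mapping starts at the object's load base. The function name is made up.

```python
import re

# The fault report quoted above.
LOG = """\
[ 2184.754425] do_page_fault(): sending SIGSEGV to realtek-poe for invalid write access to 004020cc
[ 2184.764434] epc = 77dbeff9 in libubox.so.20211120[77dba000+17000]
[ 2184.771449] ra  = 00401487 in realtek-poe[400000+3000]
"""

FAULT = re.compile(
    r"(epc|ra)\s*=\s*([0-9a-f]+) in ([^\[\s]+)\[([0-9a-f]+)\+([0-9a-f]+)\]")


def decode_fault(log):
    """Map 'epc'/'ra' to object name, real address and offset into the mapping."""
    result = {}
    for reg, addr, obj, base, _size in FAULT.findall(log):
        raw = int(addr, 16)
        address = raw & ~1  # clear the assumed ISA mode bit
        result[reg] = {
            "object": obj,
            "mips16": bool(raw & 1),
            "address": address,
            "offset": address - int(base, 16),
        }
    return result


if __name__ == "__main__":
    for reg, info in decode_fault(LOG).items():
        print(f"{reg}: {info['object']} address {info['address']:#x} "
              f"offset {info['offset']:#x} mips16={info['mips16']}")
```

With the mode bit cleared, ra becomes 0x401486, which is the lw right after the jal to ustream_consume and its delay slot in the listing above. The faulting instruction itself sits at offset 0x4ff8 of libubox under the same assumptions, so disassembling the library around that offset might be a next step.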