Support for RTL838x based managed switches

howels · May 28, 2024, 5:22pm

I'll have a go at porting the main commit to add the model itself once the switch arrives. These 1920s are so widely available that I'm sure it will be welcome.

howels · May 28, 2024, 5:30pm

Do you have any history on why the commits have not been submitted? At first pass, they add a lot of exception handling along with the new model and an update to the DTS. I'm wondering whether they created any issues or whether it was simply a lack of available time.

howels · May 28, 2024, 5:35pm

That's great. My Zyxel GS1900-48 is partitioned in two, which means everything has to be crammed into 8MB.

andyboeh · May 28, 2024, 5:37pm

Please read the few posts following the one with janh's patches in this thread. I do not have any more information than that.

janh · May 28, 2024, 10:36pm

Only these three commits are relevant for HPE 1920-48G:
realtek: add hardware version check for RTL8214FC
WIP: support RTL8214FC on RTL8393
realtek: add support for HPE 1920-48G

The previous changes are mostly already upstreamed. I didn't want to submit support for HPE 1920-48G itself, because the code for the RTL8214FC PHY is unfinished.

Some changes will be required to apply it to the current main branch (it is more likely to still apply to the 23.05 branch without changes). There have already been some small changes for RTL8214FC recently (realtek/rtl839x: respect phy-is-integrated property, realtek: add RTL821X_CHIP_ID).

Technically, there aren't even two separate partitions on these devices. The entire flash between the bootloader and the factory partition is used for a filesystem. This filesystem can store as many firmware images as there is space for, but only one can be set as the boot image (and a second one as a backup image).

However, with OpenWrt, a somewhat hacky approach is used instead, where only the beginning of the filesystem is actually valid and contains the kernel as a file, while the rest is used directly for the rootfs and overlay. Actually using the filesystem properly wouldn't have been possible, because the beginning of each erase block is reserved for the filesystem.

sudoBash418 · May 29, 2024, 3:21am

Sorry if this is the wrong place to ask, but is port isolation supported?

I have a ZyXEL GS1900-24Ev2 and I've been trying to set up something like the following:

Port 1 is a trunk port to a router.
Ports 5-8 are access ports to the same VLAN, but they should be isolated such that they cannot communicate with each other (ideally, they can only communicate with port 1).
Ports 23-24 will be (filtered) trunk ports to a pair of APs, with client isolation enabled for certain networks. They should also only be able to communicate with the router.

Toggling the bridge port isolation flag (/sys/devices/platform/switch@1b000000/net/lan5/brport/isolated) doesn't appear to do anything: /sys/kernel/debug/rtl838x/lan5/port_ctrl still reads port_isolation = 0x10ffffff (and clients can still communicate).

The only way I've found to affect the hardware port isolation (based on my limited understanding of the rtl83xx drivers) is to add/remove ports from bridges. However, I can't find a way to attach the same port to multiple bridges (seems unsupported by Linux?), so this technique doesn't seem useful to me (because port 1 would need to be bridged separately to each access port).

Aside: using separate VLANs as a workaround is probably technically possible, but seems like a massive headache, especially without a way to fix the tags on the router port (i.e. somehow map port-specific VLAN IDs to/from a single VLAN ID on the router port).
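
For readers trying to interpret the port_isolation value quoted above, here is a minimal Python sketch that decodes such a bitmask. It assumes each set bit marks a port that this port may forward to, and that the CPU port sits at bit 28 on RTL838x; both are assumptions, as is the helper naming and the idea of an uplink at bit 0 (the mapping from lanN names to bit positions is not given in this thread).

```python
CPU_PORT = 28  # assumption: CPU port index on RTL838x


def ports_in_mask(mask: int) -> list[int]:
    """Return the bit positions (port indices) that are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def mask_for_ports(ports) -> int:
    """Build a bitmask with one bit set per port index."""
    mask = 0
    for port in ports:
        mask |= 1 << port
    return mask


# Value read from /sys/kernel/debug/rtl838x/lan5/port_ctrl in the post above.
current = 0x10ffffff
print(ports_in_mask(current))  # bits 0-23 plus bit 28

# Hypothetical target: only an uplink at bit 0 and the CPU port are reachable.
wanted = mask_for_ports([0, CPU_PORT])
print(hex(wanted))
```

Read this way, 0x10ffffff means the port can still reach every one of the 24 front ports plus bit 28, which would be consistent with the isolation flag having no effect in hardware.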

howels · May 29, 2024, 10:29am

Does this account for the unusual space allocation on my Zyxel GS1900-48? I have only a tiny amount of space in the overlay, as it seems the kernel and overlay are both stuck in only 8MB of the possible 16MB flash space.

root@GS1900-48:~# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                 2.8M      2.8M         0 100% /rom
tmpfs                    59.2M     44.0K     59.2M   0% /tmp
/dev/mtdblock8          832.0K    540.0K    292.0K  65% /overlay
overlayfs:/overlay      832.0K    540.0K    292.0K  65% /
tmpfs                   512.0K         0    512.0K   0% /dev

MTD partitions:

root@GS1900-48:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00040000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00010000 00010000 "u-boot-env2"
mtd3: 00100000 00010000 "jffs"
mtd4: 00100000 00010000 "jffs2"
mtd5: 006d0000 00010000 "firmware"
mtd6: 00350000 00010000 "kernel"
mtd7: 00380000 00010000 "rootfs"
mtd8: 000d0000 00010000 "rootfs_data"
mtd9: 006d0000 00010000 "runtime2"

Wondering how I can use mtd9 - runtime2 for my overlay.
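
To make the numbers in that table easier to read, here is a small Python sketch that parses the /proc/mtd output pasted above and converts the hex sizes to KiB. The function name and the list of "top-level" partitions are my own assumptions for illustration (kernel, rootfs and rootfs_data are treated as sub-splits of "firmware" because their sizes add up that way).

```python
PROC_MTD = """dev:    size   erasesize  name
mtd0: 00040000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00010000 00010000 "u-boot-env2"
mtd3: 00100000 00010000 "jffs"
mtd4: 00100000 00010000 "jffs2"
mtd5: 006d0000 00010000 "firmware"
mtd6: 00350000 00010000 "kernel"
mtd7: 00380000 00010000 "rootfs"
mtd8: 000d0000 00010000 "rootfs_data"
mtd9: 006d0000 00010000 "runtime2"
"""


def parse_mtd(text: str) -> dict[str, int]:
    """Map partition name -> size in bytes, skipping the header line."""
    sizes = {}
    for line in text.strip().splitlines()[1:]:
        _dev, size, _erasesize, name = line.split(None, 3)
        sizes[name.strip('"')] = int(size, 16)
    return sizes


sizes = parse_mtd(PROC_MTD)
for name, size in sizes.items():
    print(f"{name:12s} {size // 1024:6d} KiB")

# Assumption: these are the non-overlapping top-level partitions.
TOP_LEVEL = ["u-boot", "u-boot-env", "u-boot-env2", "jffs", "jffs2",
             "firmware", "runtime2"]
total = sum(sizes[name] for name in TOP_LEVEL)
print(f"total: {total // (1024 * 1024)} MiB")
```

The 832 KiB rootfs_data matches the /overlay size in the df output, and "firmware" and "runtime2" are the same size, so the second image slot holds roughly as much flash as the whole running firmware.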

Found the results - no easy way forward (bootloader update, mtd-concat etc, all very intrusive). I'll probably move to the HPE 1920 with 32MB flash and sell the Zyxel GS1900s. Building custom images on 23.05.2 was hard enough, gonna be much harder with a 6.6 kernel in future.

howels · May 29, 2024, 11:44am

sudoBash418:

Sorry if this is the wrong place to ask, but is port isolation supported?

Wired Ethernet is designed to allow communication within an L2 domain; that is the standard.
To achieve what you want, you could install the VXLAN and VRF modules, then run FRR with BGP EVPN to each AP. An L2 EVPN would let you control MAC visibility within the overlay and perform the isolation you require. FRR and the VXLAN modules are available as packages in the repo, although RTL838x switches are a poor choice due to performance constraints. The VRF module needs to be built yourself, but that isn't too hard.
Larger switches that run wireless APs will often use the underlay/overlay model to perform wired client isolation.
If you want to use OpenWrt, you might consider something like the Mellanox SN2010 for more performance.

andyboeh · May 29, 2024, 2:36pm

AFAIK, a bootloader update shouldn't be needed; the bootloader will just see the second partition as corrupted. Using mtd-concat might be an option for quite a few of the Realtek devices once the kernel grows.

In my region, used HPEs are even much cheaper than the GS1900s, unless you need PoE.

howels · May 29, 2024, 3:05pm

Same here - I just had the misfortune to see the GS1900 support before I noticed that HPE 1920 support was added.

sudoBash418 · May 30, 2024, 3:37am

With all due respect, your response seems to have little to do with my post.

L2 port isolation is a feature supported by the OEM firmware (see section 30.3 of the user manual).
I'm wondering what degree of support OpenWrt has for this functionality, in general and on this specific device.

Any solution or workaround that requires all packets to traverse the CPU port, including your suggestion, would not be suitable as the performance impact would be extreme.

howels · May 30, 2024, 10:10am

No problem, I hadn't heard of this MAC-based isolation before. It looks like the switch groups ports into those that are limited and those that are enabled to forward L2. TP-Link's documentation describes it in more detail.

andyboeh · May 31, 2024, 7:20am

@robimarko I'm trying to talk to the LM63 on my DGS-1210-28MP and I've run into the same problem. Did you ever get this sorted? Unfortunately, I discovered your posts just now, after having tried the same things.

As a bonus, however, I can confirm that the SFP GPIOs are the same as for the DGS-1210-52, enabling full SFP support on this switch.

robimarko · May 31, 2024, 11:08am

As far as I remember, I did not. At work we decided that the whole Realtek mess, and the trouble we would have to go through to fit everything on the NOR, would not be worth it, so we abandoned it.

I have no idea how stock FW manages to talk to LM63 since the clock line was never working for me in Linux.

howels · May 31, 2024, 1:49pm

(reposting to reply to correct post)

Built the 1920-48g branch and loaded it via TFTP, but I'm getting repeated panics. A subsequent boot panicked just once, but the boot does not complete. The shell is not started and the device is not reachable via 192.168.1.1:

[   10.710045] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   10.729656] rcu:     1-...!: (1 GPs behind) idle=420/0/0x0 softirq=0/0 fqs=1  (false positive?)
[   10.757579]  (detected by 0, t=2102 jiffies, g=-1191, q=5760)
[   10.776577] Sending NMI from CPU 0 to CPUs 1:
[   10.791038] NMI backtrace for cpu 1 skipped: idling at r4k_wait_irqoff+0x1c/0x24
[   10.815833] rcu: rcu_sched kthread starved for 2099 jiffies! g-1191 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[   10.849787] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   10.879425] rcu: RCU grace-period kthread stack dump:
[   10.896119] task:rcu_sched       state:I stack:    0 pid:   11 ppid:     2 flags:0x00100000
[   10.923768] Stack : 80760000 8009e544 00000000 00000000 8205be38 80760000 807c84e0 8203bac0
[   10.951443]         80760000 00000001 80760000 807c8610 00000000 805fe65c 807c8610 00000000
[   10.979121]         807d0000 80094908 ffff8b19 8060155c 80760000 80760000 807d0000 00000000
[   11.006797]         00000000 814a9440 ffff8b19 8009d8d0 06800001 8203bac0 807c84e0 807c84e0
[   11.034474]         00000000 80095084 8149d600 80097f0c 81350000 80760000 80090000 8069c5c4
[   11.062152]         ...
[   11.070212] Call Trace:
[   11.078279] [<805fe2f4>] __schedule+0x284/0x598
[   11.093272] [<805fe65c>] schedule+0x54/0xf8
[   11.107108] [<8060155c>] schedule_timeout+0x68/0xe0
[   11.123266] [<80095084>] rcu_gp_fqs_loop+0x2e4/0x37c
[   11.139681] [<80098fd4>] rcu_gp_kthread+0x120/0x14c
[   11.155825] [<8004d86c>] kthread+0x13c/0x144
[   11.169949] [<80001c58>] ret_from_kernel_thread+0x14/0x1c
[   11.187809] 
[   21.699831] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   21.719447] rcu:     1-...!: (0 ticks this GP) idle=430/0/0x0 softirq=0/0 fqs=0  (false positive?)
[   21.748232]  (detected by 0, t=2102 jiffies, g=-1187, q=5589)
[   21.767230] Sending NMI from CPU 0 to CPUs 1:
[   21.781689] NMI backtrace for cpu 1 skipped: idling at r4k_wait_irqoff+0x1c/0x24
[   21.806487] rcu: rcu_sched kthread starved for 2102 jiffies! g-1187 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[   21.840441] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   21.870078] rcu: RCU grace-period kthread stack dump:
[   21.886772] task:rcu_sched       state:I stack:    0 pid:   11 ppid:     2 flags:0x00100000
[   21.914421] Stack : 80760000 8009e544 00000000 00000000 8205be38 00000020 00000000 8203bac0
[   21.942096]         80760000 00000001 80760000 807c8610 00000001 805fe65c 807c8610 00000001
[   21.969773]         807d0000 80094908 ffff937c 8060155c 807c84e0 814ab600 00000001 00000000
[   21.997450]         00000000 814a9610 ffff937c 8009d8d0 23800001 8203bac0 807c84e0 807c84e0
[   22.025128]         00000000 80095084 814ab600 80097f0c 81350000 80760000 80090000 8069c5c4
[   22.052804]         ...
[   22.060865] Call Trace:
[   22.068932] [<805fe2f4>] __schedule+0x284/0x598
[   22.083926] [<805fe65c>] schedule+0x54/0xf8
[   22.097761] [<8060155c>] schedule_timeout+0x68/0xe0
[   22.113918] [<80095084>] rcu_gp_fqs_loop+0x2e4/0x37c
[   22.130335] [<80098fd4>] rcu_gp_kthread+0x120/0x14c
[   22.146479] [<8004d86c>] kthread+0x13c/0x144
[   22.160600] [<80001c58>] ret_from_kernel_thread+0x14/0x1c

Any idea what's up here?

Full log: https://pastebin.com/KKyMd0G4

howels · May 31, 2024, 2:02pm

Some unusual results: I swapped the NIC used for the TFTP connection from an old USB 100Mb NIC to a 1Gb one, and the switch now fails at different points and actually reboots instead of sitting at the console.

Tried booting with the original image and with my merge of the three commits onto 23.05.3; both fail to finish booting. I wonder if I have an updated hardware version or defective hardware.

andyboeh · May 31, 2024, 8:20pm

OK, thanks. I just hooked up my logic analyzer and - it started working. As soon as the logic analyzer is disconnected, i2c is dead again.

@svanheule Any ideas how to properly configure the RTL8231 for this?

Edit: I just tried a small DSO and this does not make it work. Same problem as already reported: SDA works, SCL is always low.

Edit 2: Finally, it's working. I traced my way through i2c-gpio.c, gpiolib.c and gpio-rtl8231.c to find the correct configuration to force-drive the SCL pin. The magic configuration is as simple as:

	/* LM63 */
	i2c-gpio-4 {
		compatible = "i2c-gpio";
		sda-gpios = <&gpio1 32 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
		scl-gpios = <&gpio1 31 GPIO_ACTIVE_HIGH>;
		i2c-gpio,delay-us = <2>;
		i2c-gpio,scl-open-drain;
		#address-cells = <1>;
		#size-cells = <0>;

		lm63@4c {
			compatible = "national,lm63";
			reg = <0x4c>;
		};
	};

The line i2c-gpio,scl-open-drain; isn't very intuitive. Reading the code finally made it clear that this sets the flag GPIOD_OUT_HIGH instead of GPIOD_OUT_HIGH_OPEN_DRAIN, resulting in the pin being force-driven.
The documentation mentions that this is deprecated but doesn't offer any alternative. However, according to the kernel source, the current kernel has the flag i2c-gpio,scl-has-no-pullup (which is somehow undocumented!?), which could be used instead and is clearer. I'll add a note that this can be replaced in 6.6 when I submit the PR.

sudoBash418 · June 5, 2024, 4:24am

FWIW, I've been using a simple DTS patch to double the available firmware space on my Zyxel GS1900-24E.

DTS Patch

diff --git a/target/linux/realtek/dts-5.15/rtl8380_zyxel_gs1900.dtsi b/target/linux/realtek/dts-5.15/rtl8380_zyxel_gs1900.dtsi
index 5993c1b798..202e1ccf28 100644
--- a/target/linux/realtek/dts-5.15/rtl8380_zyxel_gs1900.dtsi
+++ b/target/linux/realtek/dts-5.15/rtl8380_zyxel_gs1900.dtsi
@@ -89,16 +89,12 @@
 				label = "jffs2";
 				reg = <0x160000 0x100000>;
 			};
-			partition@b260000 {
+			partition@260000 {
 				label = "firmware";
-				reg = <0x260000 0x6d0000>;
+				reg = <0x260000 0xda0000>;
 				compatible = "openwrt,uimage", "denx,uimage";
 				openwrt,ih-magic = <0x83800000>;
 			};
-			partition@930000 {
-				label = "runtime2";
-				reg = <0x930000 0x6d0000>;
-			};
 		};
 	};
 };
diff --git a/target/linux/realtek/image/common.mk b/target/linux/realtek/image/common.mk
index 37370f1999..9673c20951 100644
--- a/target/linux/realtek/image/common.mk
+++ b/target/linux/realtek/image/common.mk
@@ -58,7 +58,7 @@ endef
 
 define Device/zyxel_gs1900
   DEVICE_VENDOR := ZyXEL
-  IMAGE_SIZE := 6976k
+  IMAGE_SIZE := 13952k
   UIMAGE_MAGIC := 0x83800000
   KERNEL_INITRAMFS := \
 	kernel-bin | \

Not sure if this configuration risks data loss, but I haven't run into any issues during my kernel development so far.
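
As a sanity check on the numbers in the patch, here is a short Python sketch of the arithmetic. The offsets and sizes are copied from the diff; the 16 MiB flash size is an assumption based on the figure mentioned earlier in the thread for the GS1900-48, and the constant names are mine.

```python
FLASH_SIZE = 16 * 1024 * 1024  # assumption: 16 MiB NOR flash

FIRMWARE_OFFSET = 0x260000
OLD_FIRMWARE_SIZE = 0x6d0000
RUNTIME2_OFFSET = 0x930000
RUNTIME2_SIZE = 0x6d0000
NEW_FIRMWARE_SIZE = 0xda0000

# "runtime2" starts exactly where the old "firmware" partition ended.
contiguous = FIRMWARE_OFFSET + OLD_FIRMWARE_SIZE == RUNTIME2_OFFSET

# The new partition is the old "firmware" and "runtime2" merged.
merged = OLD_FIRMWARE_SIZE + RUNTIME2_SIZE == NEW_FIRMWARE_SIZE

# End address of the enlarged partition and its size in KiB (for IMAGE_SIZE).
new_end = FIRMWARE_OFFSET + NEW_FIRMWARE_SIZE
new_size_kib = NEW_FIRMWARE_SIZE // 1024
old_size_kib = OLD_FIRMWARE_SIZE // 1024

print(contiguous, merged)
print(hex(new_end), new_end == FLASH_SIZE)
print(old_size_kib, new_size_kib)
```

Under those assumptions, the enlarged partition ends exactly at the 16 MiB boundary, and the old and new sizes work out to 6976 KiB and 13952 KiB, matching the IMAGE_SIZE values in common.mk.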

howels · June 6, 2024, 10:45am

Good idea, will test it out as I have a spare GS1900 here.

sencha · June 6, 2024, 5:04pm

Another interesting nuance about the 1920-48G is its more powerful CPU (RTL8393M): MIPS 34Kc @ 700MHz vs. MIPS 4KEc @ 500MHz.

Anyone interested in assisting @howels in development of the firmware for it?

I am willing to help but am not sure how, as I don't have the switch yet.