Support for RTL838x based managed switches

Just git checkout a previous commit and then compile as usual.

Ok, thanks for the info.

Unfortunately, I didn't find enough time for this today. Will try to look into this more tomorrow evening after work.

Test 1:
=> detected / network works:
Log (message too long to post inline): https://pastebin.com/UkabFnNZ

Test 2:
=> detected / network works:

Test 2b (same commit, but using CONFIG_TESTING_KERNEL=y):
=> seems not detected / network does not work:

[    3.774191] rtl83xx-switch switch@1b000000 lan1 (uninitialized): PHY [mdio-bus:00] driver [REALTEK RTL8218D] (irq=POLL)
[    3.787987] rtl83xx-switch switch@1b000000 lan2 (uninitialized): PHY [mdio-bus:01] driver [REALTEK RTL8218D] (irq=POLL)
[    3.802473] rtl83xx-switch switch@1b000000 lan3 (uninitialized): PHY [mdio-bus:02] driver [REALTEK RTL8218D] (irq=POLL)
[    3.816367] rtl83xx-switch switch@1b000000 lan4 (uninitialized): PHY [mdio-bus:03] driver [REALTEK RTL8218D] (irq=POLL)
[    3.830160] rtl83xx-switch switch@1b000000 lan5 (uninitialized): PHY [mdio-bus:04] driver [REALTEK RTL8218D] (irq=POLL)
[    3.844635] rtl83xx-switch switch@1b000000 lan6 (uninitialized): PHY [mdio-bus:05] driver [REALTEK RTL8218D] (irq=POLL)
[    3.858444] rtl83xx-switch switch@1b000000 lan7 (uninitialized): PHY [mdio-bus:06] driver [REALTEK RTL8218D] (irq=POLL)
[    3.872389] rtl83xx-switch switch@1b000000 lan8 (uninitialized): PHY [mdio-bus:07] driver [REALTEK RTL8218D] (irq=POLL)
[    3.900530] rtl83xx-switch switch@1b000000: Link is Up - 10Gbps/Full - flow control off
[    3.920768] rtl83xx-switch switch@1b000000 lan9 (uninitialized): validation of usxgmii with support 00,00000000,00018000,000e706c and advertisement 00,00000000,00018000,000e706c failed: -EINVAL
[    3.954266] rtl83xx-switch switch@1b000000 lan9 (uninitialized): failed to connect to PHY: -EINVAL
[    3.964312] rtl83xx-switch switch@1b000000 lan9 (uninitialized): error -22 setting up PHY for tree 0, switch 0, port 24
[    3.992809] rtl83xx-switch switch@1b000000 lan10 (uninitialized): validation of usxgmii with support 00,00000000,00018000,000e706c and advertisement 00,00000000,00018000,000e706c failed: -EINVAL
[    4.032378] rtl83xx-switch switch@1b000000 lan10 (uninitialized): failed to connect to PHY: -EINVAL
[    4.042514] rtl83xx-switch switch@1b000000 lan10 (uninitialized): error -22 setting up PHY for tree 0, switch 0, port 25
[    4.073093] rtl83xx-switch switch@1b000000 lan11 (uninitialized): validation of usxgmii with support 00,00000000,00018000,000e706c and advertisement 00,00000000,00018000,000e706c failed: -EINVAL
[    4.112610] rtl83xx-switch switch@1b000000 lan11 (uninitialized): failed to connect to PHY: -EINVAL
[    4.122737] rtl83xx-switch switch@1b000000 lan11 (uninitialized): error -22 setting up PHY for tree 0, switch 0, port 26

Full log: https://pastebin.com/W4X6uTwA

Can you please add PHY_INTERFACE_MODE_USXGMII to the list of supported PHY interface modes? For how to do that, see https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=9272d9919596967b581cab02c2374a74081b2c6e

I had not noticed that the HPE 1920-48G (JG927A) recently appeared as a snapshot build on https://firmware-selector.openwrt.org/

Thanks to everyone involved.

Thanks,

I applied:

diff --git a/target/linux/realtek/files-6.6/drivers/net/dsa/rtl83xx/dsa.c b/target/linux/realtek/files-6.6/drivers/net/dsa/rtl83xx/dsa.c
index d61122e330..f9d37fb3bd 100644
--- a/target/linux/realtek/files-6.6/drivers/net/dsa/rtl83xx/dsa.c
+++ b/target/linux/realtek/files-6.6/drivers/net/dsa/rtl83xx/dsa.c
@@ -684,6 +684,7 @@ static void rtl83xx_phylink_get_caps(struct dsa_switch *ds, int port,
        __set_bit(PHY_INTERFACE_MODE_QSGMII, config->supported_interfaces);
        __set_bit(PHY_INTERFACE_MODE_SGMII, config->supported_interfaces);
        __set_bit(PHY_INTERFACE_MODE_XGMII, config->supported_interfaces);
+       __set_bit(PHY_INTERFACE_MODE_USXGMII, config->supported_interfaces);
        __set_bit(PHY_INTERFACE_MODE_1000BASEX, config->supported_interfaces);
 }
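
For anyone wondering why one missing bit breaks lan9 to lan11: as far as I understand it, phylink refuses any PHY interface mode that the MAC driver did not advertise in supported_interfaces, which matches the "validation of usxgmii ... failed: -EINVAL" lines above. Below is a minimal standalone sketch of that check. It is not the kernel code; every model_* and MODEL_* name is made up for illustration, and the real bitmap uses the kernel's phy_interface_t values.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's PHY_INTERFACE_MODE_* values. */
enum model_iface {
	MODEL_IFACE_QSGMII,
	MODEL_IFACE_SGMII,
	MODEL_IFACE_XGMII,
	MODEL_IFACE_USXGMII,
	MODEL_IFACE_1000BASEX
};

/* Simplified model of config->supported_interfaces: one bit per mode. */
struct model_caps {
	unsigned long supported_interfaces;
};

void model_set_bit(enum model_iface mode, struct model_caps *caps)
{
	caps->supported_interfaces |= 1UL << mode;
}

/* Modes advertised before the patch: no USXGMII bit. */
void model_get_caps_before(struct model_caps *caps)
{
	caps->supported_interfaces = 0;
	model_set_bit(MODEL_IFACE_QSGMII, caps);
	model_set_bit(MODEL_IFACE_SGMII, caps);
	model_set_bit(MODEL_IFACE_XGMII, caps);
	model_set_bit(MODEL_IFACE_1000BASEX, caps);
}

/* Modes advertised after the patch: the same list plus USXGMII. */
void model_get_caps_after(struct model_caps *caps)
{
	model_get_caps_before(caps);
	model_set_bit(MODEL_IFACE_USXGMII, caps);
}

/* Returns 0 if the requested mode is advertised, -EINVAL otherwise. */
int model_validate(const struct model_caps *caps, enum model_iface mode)
{
	if (!(caps->supported_interfaces & (1UL << mode)))
		return -EINVAL;
	return 0;
}
```

With the "before" caps, model_validate() rejects MODEL_IFACE_USXGMII with -EINVAL; with the "after" caps it returns 0, which is the effect the one-line patch has for the ports wired up as usxgmii.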

It seems to be detected as some other device (I don't know whether that's a detection error or they actually fitted something different), but the network connection works now:

[    3.755311] rtl83xx-switch switch@1b000000: configuring for fixed/internal link mode
[    3.764018] rtl93xx_phylink_mac_config port 28, mode 1, phy-mode: internal, speed -1, link 0
[    3.774148] rtl83xx-switch switch@1b000000 lan1 (uninitialized): PHY [mdio-bus:00] driver [REALTEK RTL8218D] (irq=POLL)
[    3.787944] rtl83xx-switch switch@1b000000 lan2 (uninitialized): PHY [mdio-bus:01] driver [REALTEK RTL8218D] (irq=POLL)
[    3.801753] rtl83xx-switch switch@1b000000 lan3 (uninitialized): PHY [mdio-bus:02] driver [REALTEK RTL8218D] (irq=POLL)
[    3.815611] rtl83xx-switch switch@1b000000 lan4 (uninitialized): PHY [mdio-bus:03] driver [REALTEK RTL8218D] (irq=POLL)
[    3.829328] rtl83xx-switch switch@1b000000 lan5 (uninitialized): PHY [mdio-bus:04] driver [REALTEK RTL8218D] (irq=POLL)
[    3.843864] rtl83xx-switch switch@1b000000 lan6 (uninitialized): PHY [mdio-bus:05] driver [REALTEK RTL8218D] (irq=POLL)
[    3.857675] rtl83xx-switch switch@1b000000 lan7 (uninitialized): PHY [mdio-bus:06] driver [REALTEK RTL8218D] (irq=POLL)
[    3.871583] rtl83xx-switch switch@1b000000 lan8 (uninitialized): PHY [mdio-bus:07] driver [REALTEK RTL8218D] (irq=POLL)
[    3.899814] rtl83xx-switch switch@1b000000: Link is Up - 10Gbps/Full - flow control off
[    3.920007] rtl83xx-switch switch@1b000000 lan9 (uninitialized): PHY [mdio-bus:18] driver [Aquantia AQR113C] (irq=POLL)
[    3.951816] rtl83xx-switch switch@1b000000 lan10 (uninitialized): PHY [mdio-bus:19] driver [Aquantia AQR113C] (irq=POLL)
[    3.982901] rtl83xx-switch switch@1b000000 lan11 (uninitialized): PHY [mdio-bus:1a] driver [Aquantia AQR113C] (irq=POLL)

Full kernel log: https://pastebin.com/LmgufyRg

The 10 GbE ports are Aquantia PHYs, so that looks correct.

Oh, I didn't know that. Thanks for pointing it out.

The Aquantia PHYs might contribute to the (relatively) high price and the low number of 10Gb ports. Hopefully, as we start to see all-Realtek 10Gb solutions like the Realtek RTL8261BE, prices will drop and 10Gb port counts will increase.

How's the fan behaviour on JG927A?
Datasheet indicates only one fan level on JG927A. (Two on JG928A)

Device tree doesn't look like it has fan stuff in it.

Issues with the default fan speed have been highlighted on JG922A (and in my attempt to get JG926A). RTL8231 init sets all GPIOs to input, which lowers the fan speed set by U-Boot, at least on JG922A and JG926A.

@hitech95 and I would be interested to know :)

Can anyone comment?

Please forgive my ignorance, but what's wrong with LACP at the moment? I haven't ever tried it on realtek/openwrt but it was on my todo list.

You have to configure it by command line/text config, right? From what I recall when I looked, netifd has support? Last I looked, there was a patch to do it in LuCI that wasn't ready for upstreaming yet?

We want hardware support for LAG instead of using the CPU. With earlier kernels, the Realtek DSA driver didn't support hardware LAG, so the only route was software LAG on the limited CPU. I'm hoping it is possible in hardware now, although I'm not sure what to check.

OK cool. Yeah, I've seen patches for supporting link aggregation in DSA switches for other brands. I'd have to compare what those do with what's in 6.6 for realtek vs. what was in 5.15.

I just quickly read through target/linux/realtek/files-6.6/drivers/net/dsa/rtl83xx/dsa.c
Edit: there's also stuff in common.c

Looks like there's a bunch of LAG functions there. Haven't had a look at what changed between 5.15 and 6.6.

For future reference to all:
Is the patch that added link aggregation for realtek this one? Merged in 2022?

https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=32e5b5ee6b86956d7f97736615bb56c8a28cd841

What are the steps to reproduce LAG working only on the CPU?

Actually, the first code for LAG was added before that: https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=6f67c079e61a01d599af34dbb852a4fe99d0f52d

At the time, the kernel didn't support LAGs for DSA switches yet (so the implementation uses a netdevice event handler in the driver). Then the commit you linked implemented the actual DSA LAG callbacks from the kernel. However, it also kept the old implementation.

The result is that the current implementation is a mess that just doesn't work properly. Some time ago, I started an attempt to fix it (https://github.com/janh/openwrt/commits/realtek-fix-lag/), but didn't finish it. Note that in the current state of this branch, hardware offloading of LAG is intentionally disabled, as rtl83xx_lag_can_offload returns false. I did this because I wanted a reference for testing, which unfortunately turned up some other more important issues to work on.
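
To illustrate what "intentionally disabled" means in practice, here is a rough standalone model of the fallback, under my assumption about how the DSA core reacts: when a driver's port_lag_join returns -EOPNOTSUPP, the bond is not treated as failed, it just stays in software on the CPU. All model_* names and the struct are hypothetical; only the idea of a can_offload check that returns false comes from the branch described above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a requested LAG. */
struct model_lag_request {
	int member_ports;
};

/* Where the LAG traffic ends up being handled. */
enum model_lag_path {
	MODEL_LAG_HARDWARE,
	MODEL_LAG_SOFTWARE,
	MODEL_LAG_FAILED
};

/* Models the current state of the branch: offloading is always refused. */
bool model_lag_can_offload(const struct model_lag_request *req)
{
	(void)req;
	return false;
}

/* Models a driver's port_lag_join callback. */
int model_port_lag_join(const struct model_lag_request *req)
{
	if (!model_lag_can_offload(req))
		return -EOPNOTSUPP;

	/* Programming the hardware trunk tables would happen here. */
	return 0;
}

/* Models the caller: -EOPNOTSUPP means "keep the bond in software". */
enum model_lag_path model_core_handle_join(const struct model_lag_request *req)
{
	int err = model_port_lag_join(req);

	if (err == -EOPNOTSUPP)
		return MODEL_LAG_SOFTWARE;
	if (err)
		return MODEL_LAG_FAILED;
	return MODEL_LAG_HARDWARE;
}
```

In this model every join ends up as MODEL_LAG_SOFTWARE, which is the kind of reference behaviour you would want when comparing against a hardware-offloaded LAG.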

Here is a previous comment from me about LAG support:

Awesome. Thanks for the explanation. Greatly appreciated.

@olliver Can you help me out here: Where did you get these page names from: https://svanheule.net/realtek/mango/register/serdes_indrt_access_ctrl ?

Hi @janh - thanks for the insight. Are you aware of any code for LAG/LACP in other chipsets that currently works that one could use as a cheat sheet for comparison? I realise things aren't standardised but sometimes stare and compare gives insights.

Also, it's my understanding that 'hardware' Layer3-4 and higher hash methods depend on the chipsets, yes?
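
For context on what a "hash method" does here, a generic sketch (not taken from any Realtek datasheet or driver): the switch folds some header fields of each frame into a small number and uses it to pick one member port of the LAG, so that all frames of one flow leave on the same link. Which fields can be included (MACs only, or also IP addresses and L4 ports) is the part that depends on the chipset. The model_* functions and the XOR folding are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 6-byte MAC address into one byte by XOR. */
uint8_t model_fold_mac(const uint8_t mac[6])
{
	uint8_t h = 0;

	for (size_t i = 0; i < 6; i++)
		h ^= mac[i];
	return h;
}

/* Layer 2 style policy: source MAC XOR destination MAC, modulo member count. */
unsigned int model_l2_hash_member(const uint8_t src[6], const uint8_t dst[6],
				  unsigned int members)
{
	if (members == 0)
		return 0;
	return (unsigned int)(model_fold_mac(src) ^ model_fold_mac(dst)) % members;
}

/* Layer 3+4 style policy: IPv4 addresses and L4 ports folded to one byte. */
unsigned int model_l34_hash_member(uint32_t src_ip, uint32_t dst_ip,
				   uint16_t src_port, uint16_t dst_port,
				   unsigned int members)
{
	uint32_t h = src_ip ^ dst_ip ^ src_port ^ dst_port;

	if (members == 0)
		return 0;
	h ^= h >> 16;
	h ^= h >> 8;
	return (h & 0xff) % members;
}
```

With a layer 2 policy, all traffic between two hosts always lands on the same member port; a layer 3+4 policy can spread several connections between the same two hosts across members, which is why people care whether the hardware supports it.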

To me it looks like we need to find targets that have a port_lag_join which returns something other than EOPNOTSUPP?

Looking for port_lag_join in upstream:

mv88e6xxx, ocelot / felix VSC9959 VSC9953 (NXP LS1028A), qca8k DSA driver.