Switch down after some time - ZyXEL GS1900-24E rtl838x

Hello!

Two months ago I installed OpenWrt 23.05.3 on a ZyXEL GS1900-24E (rtl838x) switch, and everything worked perfectly. For the past couple of days (I didn't change anything), the switch has been getting stuck after ~13 minutes: no LuCI, no SSH, and no switching/routing functionality anymore. After a restart it works for another 10-15 minutes.

I also updated to the latest OpenWrt version, 23.05.5, and see the same behavior.

I think I need to narrow down the problem a bit first. What is the best way to get proper error messages or similar diagnostics?

Try to find the pin layout and get yourself a console adapter.

I have no clue how reliable that is, but the following site suggests there's a UART interface on a header labeled "JP1":
https://svanheule.net/switches/gs1900-24ep

Assuming your 24E is the same as the 24EP described.

Connect a computer to that serial console and let it run until it crashes.
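
If the capture has to run unattended until the hang, it can help to prefix every console line with the host's wall-clock time. A minimal sketch, assuming a Linux host with GNU `stty`; the `stamp` helper, the adapter path `/dev/ttyUSB0` and the log file name are only examples:

```shell
#!/bin/sh
# Hypothetical helper: prefix every line read from stdin with the host's
# wall-clock time, so the moment the switch goes quiet can be read off later.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Assumed usage (adapter path and log name are examples, adjust as needed):
#   stty -F /dev/ttyUSB0 115200 cs8 -cstopb -parenb raw -echo
#   stamp < /dev/ttyUSB0 | tee -a switch-console.log
```

This only reads from the console. For an interactive session that also keeps a log, `screen -L` writes what it sees to a `screenlog.0` file in the current directory.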

https://svanheule.net/switches/gs1900-24e#pinouts

Ah, cool! Thanks!

I opened the case, and connected via serial.

I connected the following (switch pin -> my USB serial adapter):
JP2.1: 3.3V -> 3V3
JP2.2: TX -> TXD
JP2.3: RX -> RXD
JP2.4: GND -> GND

Is that correct?

sudo su
screen /dev/ttyUSB0 115200  # 115200 baud, 8N1

 ?��MH2
       ���������������

(That procedure worked fine for another switch, at least.)

Do I need settings other than 115200, 8N1?

EDIT: Sorry, it worked after swapping TX and RX:
JP2.2: TX -> RXD
JP2.3: RX -> TXD

Unfortunately, the serial console shows no log messages related to the issue.

[   51.818071] switch: port 24(lan24) entered blocking state
[   51.824282] switch: port 24(lan24) entered disabled state
[   51.832252] device lan24 entered promiscuous mode
[   51.861484] rtl83xx_fib_event: FIB_RULE ADD/DEL for IPv6 not supported
[   54.618123] RTL8380 Link change: status: 1, ports 400000
[   54.973209] rtl83xx-switch switch@1b000000 lan24: Link is Up - 1Gbps/Full - flow control rx/tx
[   54.983065] switch: port 24(lan24) entered blocking state
[   54.989158] switch: port 24(lan24) entered forwarding state
[   55.113729] RTL8380 Link change: status: 1, ports 100000
[   55.615210] rtl83xx-switch switch@1b000000 lan22: Link is Up - 1Gbps/Full - flow control rx/tx
[   55.625161] switch: port 22(lan22) entered blocking state
[   55.631403] switch: port 22(lan22) entered forwarding state
[  515.345073] RTL8380 Link change: status: 1, ports 100000
[  515.352007] rtl83xx-switch switch@1b000000 lan22: Link is Down
[  515.358679] switch: port 22(lan22) entered disabled state
[  515.367812] rtl83xx-switch switch@1b000000 lan22: Link is Up - 1Gbps/Full - flow control rx/tx
[  515.379328] switch: port 22(lan22) entered blocking state
[  515.385561] switch: port 22(lan22) entered forwarding state
[  516.334686] rtl83xx-switch switch@1b000000 lan22: Link is Down
[  516.341462] switch: port 22(lan22) entered disabled state

[ 55.631403] is the last message after the switch has fully started.
Then the problem occurs.
[ 515.345073] is where I unplugged one LAN cable.

What else can I do to track down the issue?

I would like to bump this topic =)

Are there any other debugging options to find or at least narrow down the problem?

Is the console stuck too when the switch stops working? If not, then you have lots of debugging options there. Are links down? Can you do a link down/up? Does that change anything? Etc.

So far no one knows what you have tried, which makes it difficult for us to say what else you can do.
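
As a starting point, here is a rough sketch of what could be collected on the serial console, once right after boot and once when the switch is stuck. The `section` and `snapshot` helpers are made up for this post, and the command list is only an assumption about what is present on a default image; anything missing is noted and skipped:

```shell
#!/bin/sh
# Hypothetical helper: run one command under a header line.
# Missing tools and failing commands are noted instead of aborting.
section() {
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "($1 not available)"
    fi
}

# Hypothetical helper: write a snapshot of the system state to the file in $1.
snapshot() {
    {
        section uptime
        section free
        section ip link show
        section ip addr show
        section cat /proc/interrupts
        section logread
        section dmesg
    } > "$1"
}
```

For example, run `snapshot /tmp/after-boot.txt` once the switch is up and `snapshot /tmp/stuck.txt` once it stops forwarding, then compare the two before rebooting (`/tmp` does not survive a restart). To test a link down/up from the console, `ip link set dev lan22 down` followed by `ip link set dev lan22 up` should do it, and `ping -c 3` to a host behind that port shows whether traffic passes again. Posting those results would show what has actually been tested.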

Thank you!

The console seems to still work...

If I replug a LAN cable, I see the link change on the console, but it still doesn't work.

Everything I tried is posted in this topic. I don't have deep knowledge of OpenWrt or of how to find the root cause. Restarting seems to fix it for the next 10 minutes.

No, there is no information about what you have tried and what the results were.

Only your conclusion:

Based on this, we can only assume that you have tested everything there is to test and that the conclusion is correct.