I added the PHY ID of the recognized PHY (10G PHY, probably Aquantia 413C, used on ports 25-28) to your polling list and suddenly the LEDs lit up for 1G connections as well, not just for 10G and 2.5G connections. So this is an improvement. Still no traffic when watching with tcpdump, but progress nonetheless. The other ports (1-24 with 2.5G PHY, 29+30 with SFP) were not tested.
With all 10G ports connected (port 25@1G, port 26@2.5G, port 27@10G, port 28@10G) and the remote side indicating link up, the SDS_EXT status looks like this:
If I'm not mistaken, this should be a SerDes issue then. I can refresh my calibration patch and you can try whether it improves anything. It didn't have any effect on my devices so far, but they are already mostly working. I remember it helped a bit on Plasmacloud devices.
EDIT: There is also some patching still missing, which could help as well. I can prepare that too.
Sadly, no directly visible change. I didn’t have time to test with a 10G connection to another computer/switch, just a 10G loopback patch cable between two ports of the same switch. In theory, the first packet on the switch should have caused a storm due to the loop. No such storm happened when I sent a broadcast packet (ARP request for another machine) locally.
I did dump the SDS registers again and will check for changes.
I'm currently taking a deeper dive into the setup process with the bare information the SDK provides. The goal is to have a comprehensive process of configuring different parts of the RX and TX chains. While this may not make it work instantly on every device, it gives better control knobs to work with. But this will still take a few days.
I can quickly compare your SerDes dump with mine later just to check for obvious issues. But don't expect groundbreaking findings there.
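For anyone who wants to do the same comparison locally, here is a minimal sketch of diffing two SerDes register dumps. The dump line format (`<sds> <page> <reg> <value>`) and the function names `parse_dump` and `diff_dumps` are assumptions made for illustration only; adjust the regular expression to whatever your dump actually looks like.

```python
import re

# Hypothetical dump line format (an assumption, not the real debugfs output):
#   "<sds> <page> <reg> <value>", e.g. "2 0x06 0x03 0x8a01"
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s*$"
)

def parse_dump(text):
    """Map (sds, page, reg) -> value; lines that do not match are skipped."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        key = (int(m.group(1)), int(m.group(2), 16), int(m.group(3), 16))
        regs[key] = int(m.group(4), 16)
    return regs

def diff_dumps(a, b):
    """Return (key, value_a, value_b) for every register that differs.

    A value of None means the register is missing from that dump.
    """
    diffs = []
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key), b.get(key)
        if va != vb:
            diffs.append((key, va, vb))
    return diffs

if __name__ == "__main__":
    mine = parse_dump("2 0x06 0x03 0x8a01\n2 0x06 0x04 0x0000\n")
    yours = parse_dump("2 0x06 0x03 0x8a01\n2 0x06 0x04 0x0001\n")
    for (sds, page, reg), va, vb in diff_dumps(mine, yours):
        print(f"sds {sds} page {page:#04x} reg {reg:#04x}: {va} -> {vb}")
```

Status and counter registers will differ between any two devices, so a diff like this only narrows down where to look.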
After rebasing against latest master, the last two patches do not apply anymore. Is there another branch I can pick from or has an alternative patch already been merged?
Only some cleanup and minor stuff has been merged; the next stuff for RTL931x is still in my queue. I just updated the branch so that it applies again, but there is no functional change so far. Still working on the rest.
Memo for everyone: Since this device uses Aquantia PHYs and we are slowly moving towards 6.18, keep https://github.com/openwrt/openwrt/pull/22690 in mind. The solution is a device-specific DTS fix for now, until we manage to support USXGMII without autonegotiation.
This also includes definitions for this switch and the non-PoE variant since the model numbers were included in my firmware image too. As soon as some networking is working reliably, it can be flashed easily to drop this annoying serial upload.
I'm glad/surprised that this works for this series. I tried very similar code for the GS1920 series and it didn't work; maybe I was just too stupid to get it right.
Anyway, this makes it web-flashable via the stock UI, right?
I used Claude to analyze the stock firmware of my series. It found out which checks are done by the web upgrade validator, and then it told me "no, zynsig isn't capable of doing that but look, there's mkzynfw where you only need to add a new profile and it should work". So I did that and, after getting the other parameters correct, it actually worked. The web UI just accepted the image and started to flash it.
Hopefully this also works for GS1920. I would try it, but my GS1920 is now a crucial part of my home network, so I would rather not tinker with it too much. If I understood it correctly, the important parts which are actually checked are the checksum (zynsig already had that), a minimum file size, and information in that MMT, e.g. the model number. If all of that is correct, it will just flash the whole image to the address given in the header.
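To make the three checks concrete, here is a rough sketch of what such a validator could look like. Everything specific in it is invented for illustration: the header layout (`HEADER`), the `MIN_SIZE` value, the byte-sum checksum and the function names are assumptions, not the real ZyNOS image format or what mkzynfw produces.

```python
import struct

# All of the following is hypothetical and only mirrors the three checks
# named above: checksum, minimum file size, model number.
MIN_SIZE = 64                   # made-up minimum image size in bytes
HEADER = struct.Struct(">HH")   # made-up header: checksum, model id

def checksum16(data):
    """Placeholder checksum: 16-bit sum of all bytes (not the vendor algorithm)."""
    return sum(data) & 0xFFFF

def build_image(model_id, payload):
    """Prepend the made-up header to a payload."""
    return HEADER.pack(checksum16(payload), model_id) + payload

def validate(image, expected_model):
    """Run the checks in order and return a short verdict string."""
    if len(image) < MIN_SIZE:
        return "too small"
    csum, model = HEADER.unpack_from(image)
    if csum != checksum16(image[HEADER.size:]):
        return "bad checksum"
    if model != expected_model:
        return "wrong model"
    return "ok"

if __name__ == "__main__":
    img = build_image(0x1234, b"\x01" * 100)
    print(validate(img, 0x1234))
    print(validate(img, 0x4321))
```

The point is only that an image passing these checks is accepted regardless of what the payload contains, which would explain why adding a matching profile was enough.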
I have now arranged my build recipes in a way that I can even do sysupgrade -F zyxel-firmware.bin with a file directly downloaded from Zyxel to restore the vendor firmware. Not that I really want that, but it works flawlessly if someone needs it.