Hello,
I am seeking feedback on how best to add support for the new Cudy M3000, which has a Motorcomm YT8821 PHY on its WAN port. I understand why the 2.5G WAN port behaves unreliably and have one possible fix ready as a PR (beware of the giant message history). However, I am not sure which approach is best for OpenWrt.
EDIT: Some additional context that I forgot to add: an older hardware revision of the Cudy M3000 is already supported by OpenWrt. That hardware revision had a different PHY chip on the WAN port (RTL8221B-VB-CG).
Background
The 2.5G WAN port in the new Cudy M3000 is connected to the Motorcomm YT8821 PHY. The PHY is not working reliably because there is an address conflict on the router's MDIO bus.
- There is an internal GbE PHY in the MT7981B SoC (this PHY is connected to the LAN port). This PHY always listens on MDIO address 0.
- Then there is the external YT8821 PHY. This PHY is strapped to listen on MDIO address 1, but the PHY also listens on address 0 by default (Motorcomm considers that address a broadcast address by default).
The result is that the YT8821 reacts to commands intended for the internal MT7981B PHY, and this messes up its internal state. For example, it only runs at 100 Mbps on 1 Gbps links. Another example is that calling "ip link set dev eth1 down" (i.e. bringing down the LAN port) brings down the WAN port as well.
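To make the conflict concrete, here is a minimal toy model of the bus in Python. It is only a sketch: the `Phy` and `MdioBus` classes are made up for illustration, and it assumes that taking the LAN interface down ends up as a power-down write (the standard clause 22 `BMCR` power-down bit) to the PHY at address 0.

```python
BMCR = 0x00           # IEEE 802.3 clause 22 basic mode control register
BMCR_PDOWN = 0x0800   # power-down bit in BMCR


class Phy:
    """Toy PHY that answers on its strapped address and, optionally, on address 0."""

    def __init__(self, name, strapped_addr, broadcast_on_zero=False):
        self.name = name
        self.addresses = {strapped_addr}
        if broadcast_on_zero:
            self.addresses.add(0)
        self.regs = {BMCR: 0x0000}

    @property
    def powered_down(self):
        return bool(self.regs[BMCR] & BMCR_PDOWN)


class MdioBus:
    """Toy bus: every PHY that claims an address latches writes sent to it."""

    def __init__(self, phys):
        self.phys = phys

    def write(self, addr, reg, value):
        for phy in self.phys:
            if addr in phy.addresses:
                phy.regs[reg] = value


lan = Phy("MT7981B internal GbE PHY (LAN)", strapped_addr=0)
wan = Phy("YT8821 (WAN)", strapped_addr=1, broadcast_on_zero=True)
bus = MdioBus([lan, wan])

# Assumption: bringing the LAN interface down powers down the PHY at address 0.
bus.write(0, BMCR, BMCR_PDOWN)

for phy in (lan, wan):
    print(f"{phy.name}: powered_down={phy.powered_down}")
```

In this model both PHYs end up powered down, even though the write was only meant for the LAN PHY. Constructing `wan` with `broadcast_on_zero=False` leaves the WAN PHY untouched, which is what the register fix aims for.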
The canonical way of fixing this is to reconfigure one internal register in the YT8821 such that the YT8821 stops listening on address 0. The problem is how to achieve that.
What I have tried
I initially thought that this could be fixed in the kernel. I prototyped some patches and sent them to LKML, but the patches do not fit nicely into mainline (the fix is either not reliable, or it requires ugly changes in the core code), so I'd say they were rejected.
The two kernel-based approaches I tried were the following:
- First, I tried to patch the YT8821 driver so that it would stop the PHY from listening on address 0 during PHY probing. This fixed the PHY issues on the router, but closer discussion revealed that this fix is not guaranteed to work. The problem is that the driver's probe callback is called too late. That is, the kernel will communicate with the MT7981B internal PHY on address 0 before the YT8821 is told to stop listening on address 0. On the other hand, the patch was just a few lines of code.
  After sending the patch to LKML, I also received feedback that the PHY should be reconfigured by the bootloader/U-Boot before the kernel boots.
  If anyone wants to try this patch, it is just the https://github.com/openwrt/openwrt/pull/21584 PR.
- Second, I tried to rework how Linux detects and brings up the PHYs such that the YT8821 would be reconfigured before Linux first touches MDIO address 0. My motivation was that replacing the U-Boot in the Cudy M3000 is not without disadvantages (IIUC you lose the easy TFTP rollback path to the Cudy firmware). I did succeed at that and I think the fix is now reliable. However, the cost is that the patch is more complex and touches the core PHY infrastructure in the Linux kernel.
  I sent this patch set to LKML and I again received feedback that this should be fixed by the bootloader. The PHY subsystem maintainers were (quite understandably) uneasy about the workarounds introduced to the PHY core code. (Please don't harass them, I mostly agree with their conclusions.)
  If anyone wants to try this patch, it is on the following branch: https://github.com/JakubVanek/openwrt/tree/cudy-m3000-kernel-based-solution/
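The difference between the two kernel approaches comes down to ordering, which can be sketched as follows. This is only an illustration: the step names and the `run` helper are hypothetical and do not correspond to any kernel function.

```python
def run(steps):
    """Replay bus operations in order.

    Returns True if the WAN PHY reacted to an access meant for address 0.
    """
    wan_listens_on_zero = True   # YT8821 default behaviour described above
    wan_clobbered = False
    for step in steps:
        if step == "disable_broadcast":
            # Sent to the strapped address 1, so only the YT8821 reacts.
            wan_listens_on_zero = False
        elif step == "access_addr0":
            # Intended for the internal MT7981B PHY only.
            if wan_listens_on_zero:
                wan_clobbered = True
        else:
            raise ValueError(f"unknown step: {step}")
    return wan_clobbered


# Fix applied in the driver's probe callback: address 0 is touched first.
probe_time_fix = ["access_addr0", "disable_broadcast", "access_addr0"]

# Fix applied before anything touches address 0 (reworked bring-up or bootloader).
early_fix = ["disable_broadcast", "access_addr0", "access_addr0"]

print("probe-time fix clobbers WAN PHY:", run(probe_time_fix))
print("early fix clobbers WAN PHY:", run(early_fix))
```

In this model the probe-time fix is only safe if the early access to address 0 happens to be harmless, which is why it appears to work but is not guaranteed to.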
So, I think the result is that there is no kernel-based solution that would be acceptable to the mainline kernel.
I have some ideas about how to handle this in U-Boot. I played with the U-Boot CLI and I think that the fix can be implemented through a few U-Boot commands. You just need to deassert the PHY reset line, wait for a bit and then do a few MDIO bus writes. However, the commands (mii) are not supported by the stock Cudy U-Boot and so another U-Boot build would have to be used.
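The sequence described above (release reset, wait, a few MDIO accesses) could look roughly like the sketch below. It is written in Python against a fake board purely to show the order of operations. Every register number, bit position and delay in it is a placeholder I made up for illustration, not a datasheet value, and `FakeBoard` is hypothetical. It also assumes the broadcast setting lives in an extended register reached through an address/data register pair.

```python
import time

# All values below are illustrative placeholders, NOT datasheet values.
YT8821_ADDR = 1             # strapped MDIO address (from the post)
EXT_ADDR_REG = 0x1E         # assumed indirect-access "address" register
EXT_DATA_REG = 0x1F         # assumed indirect-access "data" register
HYPO_MDIO_CFG = 0x1234      # hypothetical extended register with the broadcast bit
HYPO_BROADCAST_EN = 1 << 6  # hypothetical bit position
RESET_SETTLE_S = 0.05       # hypothetical settle time after releasing reset


class FakeBoard:
    """Stand-in for the hardware: logs operations and models one extended register."""

    def __init__(self):
        self.log = []
        self.ext_regs = {HYPO_MDIO_CFG: HYPO_BROADCAST_EN | 0x0001}
        self._selected = None

    def set_phy_reset(self, asserted):
        self.log.append(("reset", asserted))

    def mdio_write(self, addr, reg, value):
        self.log.append(("write", addr, reg, value))
        if addr != YT8821_ADDR:
            return
        if reg == EXT_ADDR_REG:
            self._selected = value
        elif reg == EXT_DATA_REG and self._selected is not None:
            self.ext_regs[self._selected] = value

    def mdio_read(self, addr, reg):
        self.log.append(("read", addr, reg))
        if addr == YT8821_ADDR and reg == EXT_DATA_REG and self._selected is not None:
            return self.ext_regs.get(self._selected, 0)
        return 0xFFFF


def disable_broadcast(board):
    board.set_phy_reset(False)      # 1. deassert the PHY reset line
    time.sleep(RESET_SETTLE_S)      # 2. wait for the PHY to come up
    # 3. read-modify-write the extended register, always via address 1,
    #    never via address 0, so the internal PHY is not disturbed.
    board.mdio_write(YT8821_ADDR, EXT_ADDR_REG, HYPO_MDIO_CFG)
    value = board.mdio_read(YT8821_ADDR, EXT_DATA_REG)
    board.mdio_write(YT8821_ADDR, EXT_ADDR_REG, HYPO_MDIO_CFG)
    board.mdio_write(YT8821_ADDR, EXT_DATA_REG, value & ~HYPO_BROADCAST_EN & 0xFFFF)


board = FakeBoard()
disable_broadcast(board)
print(hex(board.ext_regs[HYPO_MDIO_CFG]))
```

The point of the sketch is the ordering and the fact that every access goes to the strapped address. The real sequence would have to be taken from the PR or the Motorcomm documentation.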
What now?
I am not sure how to proceed. As I outlined in https://github.com/openwrt/openwrt/pull/21584#issuecomment-3980643746 , I see the following options:
- OpenWrt could carry the "simple" kernel patch that appears to be working, but is not guaranteed to do so. The patch is not upstreamable.
- OpenWrt could carry the "complex" kernel patch that is more robust, but will be more difficult to keep up-to-date and carries more risk (it touches core code instead of just one driver). This patch is also not upstreamable.
- I think another "simple" kernel patch could be created. That patch would hard-reset the YT8821 PHY during its initialization and then quickly disable the broadcast address. This could be more reliable than the original "simple" patch. I think a similar patch was used for the RTL8221B-VB-CG earlier. The disadvantage here is that I think it might not be possible to use the same device tree for the old RTL8221B-based Cudy M3000 and for the new YT8821-based Cudy M3000. (How should new users know which one to pick, though?) I also doubt that this patch will be upstreamable.
- OpenWrt could adopt a new "second-stage" U-Boot for the Cudy M3000. The idea is that the vendor U-Boot would boot another small U-Boot that would fix the PHY configuration and then boot Linux. I think it could be possible to achieve this by embedding both the U-Boot and Linux into a single FIT image. No kernel patches or hacks would be required then.
- OpenWrt could decide that the Cudy M3000 is only supported after replacing the vendor's bootloader with a custom one. The custom bootloader would also fix the PHY configuration.
- Finally, OpenWrt could decide that the maintenance burden is not worth adding support for the new Cudy M3000.
What are your thoughts on this?
Best regards
Jakub Vaněk

