sfp.c: OEM SFP-10G-T (RTL8261BE/CE) misidentified as ROLLBALL — probe-based fix

[SOLVED] OEM SFP-10G-T (RTL8261BE/CE) working at 10G on BPI-R4 — kernel patch

Hardware: Banana Pi BPI-R4 (MT7988A), SFP2 port
Module: OEM SFP-10G-T — Realtek RTL8261BE-CG copper 10G SFP+
Kernel: Linux 6.12 (OpenWrt 25.12)


The Problem

The OEM SFP-10G-T module (Realtek RTL8261BE-CG) is incorrectly matched by the kernel
as a ROLLBALL module and fails to establish any link:

sfp sfp2: module OEM SFP-10G-T rev A has been found in the quirk list
sfp sfp2: probing phy device through the [MDIO_I2C_ROLLBALL] protocol
sfp sfp2: no PHY detected, 24 tries left
sfp sfp2: no PHY detected, 23 tries left
...

Root cause: The sfp_quirk_rollball_cc quirk in drivers/net/phy/sfp.c was written
for modules with a Marvell 88X3310 PHY behind a RollBall I2C-to-MDIO bridge.
The RTL8261BE (and RTL8261CE) is a pure Media Converter — it has no I2C-to-MDIO
bridge at all. Sending ROLLBALL protocol commands to it results in infinite
"no PHY detected" retries.

The RTL8261BE auto-switches its SerDes speed based on copper autoneg:

  • 10G copper → 10GBASE-R SerDes
  • 2.5G copper → 2500BASE-X SerDes
  • 1G copper → 1000BASE-X SerDes

No MDIO access is needed or possible. The MAC just needs to use the EEPROM-declared
link mode (10gbase-r) directly.


The Fix

Instead of blindly applying the ROLLBALL quirk, we probe for the I2C-to-MDIO bridge
first. Real ROLLBALL modules respond with CMD_DONE within ~70 ms. Modules without a
bridge (RTL8261BE, RTL8261CE) never respond — the probe times out after 200 ms and
we fall back to MDIO_I2C_NONE.

This approach is safe for all existing ROLLBALL modules and fixes RTL8261BE/CE
without adding new per-chip quirks.
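
To make the timing concrete, here is a small userspace simulation of the polling loop. It is only a sketch, not kernel code: struct fake_module and fake_read_cmd are made-up stand-ins for the I2C reads, and the 70 ms response time is just the figure quoted above. Only the 20 ms interval and the 10 attempts are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define POLL_INTERVAL_MS 20   /* mirrors msleep(20) in the patch */
#define POLL_ATTEMPTS    10   /* 10 * 20 ms = 200 ms budget */
#define CMD_DONE         0x04

/* Hypothetical stand-in for a module; done_after_ms < 0 means "no bridge". */
struct fake_module {
	int done_after_ms;
};

/* Value the command register would read back at simulated time now_ms. */
static unsigned char fake_read_cmd(const struct fake_module *m, int now_ms)
{
	if (m->done_after_ms >= 0 && now_ms >= m->done_after_ms)
		return CMD_DONE;
	return 0x00;
}

/* Returns true if CMD_DONE is seen within the budget. *elapsed_ms reports
 * how much simulated time the probe consumed.
 */
static bool probe_bridge(const struct fake_module *m, int *elapsed_ms)
{
	int now_ms = 0;

	assert(m && elapsed_ms);

	for (int i = 0; i < POLL_ATTEMPTS; i++) {
		now_ms += POLL_INTERVAL_MS;	/* simulated msleep(20) */
		if (fake_read_cmd(m, now_ms) == CMD_DONE) {
			*elapsed_ms = now_ms;
			return true;
		}
	}

	*elapsed_ms = now_ms;
	return false;
}
```

With done_after_ms = 70 the probe succeeds on the fourth poll (80 ms of simulated time); with done_after_ms = -1 it gives up after the full 200 ms.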

Patch for drivers/net/phy/sfp.c (applies on top of upstream 6.12):

--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ sfp_fixup_potron / sfp_fixup_rollball_cc @@

+/* Local mirror of rollball constants from mdio-i2c.c */
+#define SFP_ROLLBALL_PHY_ADDR   0x51
+#define SFP_ROLLBALL_MDIO_PAGE  3
+#define SFP_ROLLBALL_CMD_ADDR   0x80
+#define SFP_ROLLBALL_CMD_READ   0x02
+#define SFP_ROLLBALL_CMD_DONE   0x04
+
+static int sfp_rollball_a2_write(struct sfp *sfp, u8 reg, const u8 *data, int len)
+{
+	struct i2c_msg msg;
+	u8 buf[8];
+
+	/* buf holds the register byte plus at most 7 data bytes */
+	if (len < 0 || len > (int)sizeof(buf) - 1)
+		return -EINVAL;
+
+	buf[0] = reg;
+	memcpy(buf + 1, data, len);
+	msg.addr  = SFP_ROLLBALL_PHY_ADDR;
+	msg.flags = 0;
+	msg.len   = len + 1;
+	msg.buf   = buf;
+	return i2c_transfer(sfp->i2c, &msg, 1) == 1 ? 0 : -EIO;
+}
+
+static int sfp_rollball_a2_read1(struct sfp *sfp, u8 reg, u8 *val)
+{
+	struct i2c_msg msgs[2];
+
+	msgs[0].addr  = SFP_ROLLBALL_PHY_ADDR;
+	msgs[0].flags = 0;
+	msgs[0].len   = 1;
+	msgs[0].buf   = &reg;
+	msgs[1].addr  = SFP_ROLLBALL_PHY_ADDR;
+	msgs[1].flags = I2C_M_RD;
+	msgs[1].len   = 1;
+	msgs[1].buf   = val;
+	return i2c_transfer(sfp->i2c, msgs, 2) == 2 ? 0 : -EIO;
+}
+
+/* Probe for a RollBall I2C-to-MDIO bridge by sending the unlock password,
+ * switching to the MDIO page, issuing CMD_READ and polling for CMD_DONE.
+ * A real RollBall bridge asserts CMD_DONE within ~70 ms; modules without a
+ * bridge (e.g. RTL8261BE pure media converter) never assert it, so the poll
+ * times out after 200 ms.  Returns true only when CMD_DONE is observed.
+ */
+static bool sfp_has_rollball_bridge(struct sfp *sfp)
+{
+	u8 pw[4] = { 0xff, 0xff, 0xff, 0xff };
+	u8 page = SFP_ROLLBALL_MDIO_PAGE;
+	u8 cmd  = SFP_ROLLBALL_CMD_READ;
+	u8 saved_page = 0, res;
+	int i;
+
+	if (sfp_rollball_a2_write(sfp, SFP_VSL + 3, pw, sizeof(pw)) < 0)
+		return false;
+
+	sfp_rollball_a2_read1(sfp, SFP_PAGE, &saved_page);
+
+	if (sfp_rollball_a2_write(sfp, SFP_PAGE, &page, 1) < 0 ||
+	    sfp_rollball_a2_write(sfp, SFP_ROLLBALL_CMD_ADDR, &cmd, 1) < 0)
+		goto restore;
+
+	for (i = 0; i < 10; i++) {
+		msleep(20);
+		if (sfp_rollball_a2_read1(sfp, SFP_ROLLBALL_CMD_ADDR, &res) == 0 &&
+		    res == SFP_ROLLBALL_CMD_DONE) {
+			sfp_rollball_a2_write(sfp, SFP_PAGE, &saved_page, 1);
+			return true;
+		}
+	}
+
+restore:
+	sfp_rollball_a2_write(sfp, SFP_PAGE, &saved_page, 1);
+	return false;
+}
+
 static void sfp_fixup_rollball_cc(struct sfp *sfp)
-{
-	sfp_fixup_rollball(sfp);
-
-	/* Some RollBall SFPs may have wrong (zero) extended compliance code
-	 * burned in EEPROM. For PHY probing we need the correct one.
-	 */
-	sfp->id.base.extended_cc = SFF8024_ECC_10GBASE_T_SFI;
-}
+{
+	/* Probe for I2C-to-MDIO bridge: real RollBall modules assert CMD_DONE
+	 * within ~70 ms; pure media converters (e.g. RTL8261BE) have no bridge
+	 * and the probe times out after 200 ms -- skip PHY probing for those.
+	 */
+	if (!sfp_has_rollball_bridge(sfp)) {
+		sfp->mdio_protocol = MDIO_I2C_NONE;
+		return;
+	}
+	sfp_fixup_rollball(sfp);
+	sfp->id.base.extended_cc = SFF8024_ECC_10GBASE_T_SFI;
+}

 /* Add SFP-10G-T-I (industrial grade, same chip) to the quirk list */
-	SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc),
+	SFP_QUIRK_F("OEM", "SFP-10G-T-I", sfp_fixup_rollball_cc),
+	SFP_QUIRK_F("OEM", "SFP-10G-T",   sfp_fixup_rollball_cc),

Results

dmesg — after patch (working)

sfp sfp2: probing phy device through the [MDIO_I2C_NONE] protocol
mtk_soc_eth 15100000.ethernet sfp-lan: Link is Up - 10Gbps/Full - flow control off

iperf3 — BPI-R4 ↔ BPI-R4 via OEM SFP-10G-T + Cat5e 2m

[SUM]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec  2900   sender
[SUM]   0.00-10.01  sec  10.9 GBytes  9.39 Gbits/sec         receiver

9.41 Gbits/sec — 94% of 10GbE line rate.
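
For context, 9.41 Gbit/s is about what framing overhead alone allows for TCP over 10GbE. The sketch below works that out under assumptions the post does not state: a 1500-byte MTU, IPv4, and TCP timestamps enabled. The function name and constants are illustrative.

```c
#include <assert.h>

/* Per-frame bytes on the wire besides the L2 payload:
 * preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
 */
#define ETH_WIRE_OVERHEAD  38
#define IPV4_HEADER        20
#define TCP_HEADER         20
#define TCP_TIMESTAMP_OPT  12	/* assumed: timestamps enabled */

/* Upper bound on TCP payload throughput for a given line rate and MTU. */
static double tcp_goodput_gbps(double line_rate_gbps, int mtu)
{
	int payload = mtu - IPV4_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPT;
	int on_wire = mtu + ETH_WIRE_OVERHEAD;

	assert(payload > 0);
	return line_rate_gbps * payload / on_wire;
}
```

For tcp_goodput_gbps(10.0, 1500) this is 10 * 1448 / 1538, which is a little over 9.41 Gbit/s. Under these assumptions the measured sender rate is at the practical ceiling for a 1500-byte MTU.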


Notes

  • Tested on OpenWrt 25.12 / Linux 6.12, BPI-R4 MT7988A
  • Covers SFP-10G-T and SFP-10G-T-I (industrial grade, same chip)
  • Should also fix RTL8261CE — same architecture, no I2C-to-MDIO bridge
  • Existing ROLLBALL modules (Marvell 88X3310) are unaffected — probe succeeds
    and they continue to use MDIO_I2C_ROLLBALL as before
  • Multi-speed limitation: MAC is locked to 10gbase-r. If the link partner
    only supports 2.5G or 1G, traffic will not pass (SerDes/MAC speed mismatch).
    Suitable for 10G-only setups. Full multi-speed support is not feasible —
    the RTL8261BE in Media Converter mode has no I2C-to-MDIO bridge, making
    runtime SerDes speed detection impossible.
  • Candidate for upstream drivers/net/phy/sfp.c

The forum will not help here. Fixes for 25.12 (aka kernel 6.12) should go mainline first, so a PR is the right place to get things roll(ball)ing. @bmork might be the one with some insight.

Thanks for pointing me in the right direction — patch submitted to linux-netdev:

https://lore.kernel.org/netdev/20260515115527.17241-1-petr.wozniak@gmail.com/

I was thinking of OpenWrt mainline (aka snapshot), but upstream is even better. Andrew is the right person to talk to.

Btw, thanks for taking care of this. It will also help the Realtek switches.

Quick update: patch was revised and v3 submitted to netdev today. Full thread:

https://lore.kernel.org/netdev/20260515174044.26036-1-petr.wozniak@gmail.com/

See also the discussion in this issue (if you are not already aware): https://github.com/openwrt/openwrt/pull/22910

Do all SFPs based on the RTL8261BE use Media Converter mode rather than USXGMII?

Is it possible to manually select the mode with ethtool?
Maybe the quirk needs to set the supported-mode bits so that the user can manually choose the mode when the link speed is known and fixed.

Ideally with these modules, there would be a way to try 1000Base-X / 2500Base-X / 5000Base-X / 10GBase-R in a loop, but that would require something to detect when the SerDes link is working (which is probably not accessible from an SFP driver).
In the worst case, userspace could do it: set a mode, wait for any RX frame, on timeout select the next mode, and so on.
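
The userspace loop described here could look roughly like the sketch below. Everything in it is hypothetical: struct link_ops and its two hooks stand in for whatever mechanism would actually switch the MAC mode and detect received frames, and struct fake_link exists only to exercise the loop.

```c
#include <assert.h>
#include <stdbool.h>

enum serdes_mode {
	MODE_1000BASE_X,
	MODE_2500BASE_X,
	MODE_5000BASE_X,
	MODE_10GBASE_R,
	MODE_COUNT
};

/* Hypothetical hooks; a real tool would back these with its own mechanism. */
struct link_ops {
	void (*set_mode)(void *ctx, enum serdes_mode mode);
	bool (*rx_seen_within)(void *ctx, int timeout_ms);
	void *ctx;
};

/* Try each mode in turn; return the first one that sees RX traffic,
 * or -1 if no mode does within the per-mode timeout.
 */
static int find_working_mode(const struct link_ops *ops, int timeout_ms)
{
	assert(ops && ops->set_mode && ops->rx_seen_within);

	for (int m = MODE_1000BASE_X; m < MODE_COUNT; m++) {
		ops->set_mode(ops->ctx, (enum serdes_mode)m);
		if (ops->rx_seen_within(ops->ctx, timeout_ms))
			return m;
	}
	return -1;
}

/* Fake link for demonstration: frames arrive only in the `working` mode. */
struct fake_link {
	enum serdes_mode current;
	enum serdes_mode working;
	int switches;
};

static void fake_set_mode(void *ctx, enum serdes_mode mode)
{
	struct fake_link *l = ctx;

	l->current = mode;
	l->switches++;
}

static bool fake_rx_seen(void *ctx, int timeout_ms)
{
	struct fake_link *l = ctx;

	(void)timeout_ms;	/* the fake answers immediately */
	return l->current == l->working;
}
```

With a fake link whose working mode is MODE_2500BASE_X, find_working_mode returns that mode after two mode switches. Any such loop would have to run again on every copper link change, since the negotiated speed can differ each time.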

Here is the full picture of why multi-speed does not work and why mode-cycling would not help:

The RTL8261BE is designed primarily for direct PCB integration, where its MDC/MDIO management pins are wired directly to the host MAC's MDIO bus. In that configuration all registers are fully accessible and the host can configure speed, mode, rate adaptor, everything.

When this chip is placed inside an SFP module a problem arises: the SFP standard exposes management via I2C (SDA/SCL), not MDIO. To bridge between the SFP cage I2C lines and the chip's internal MDIO interface you need a dedicated I2C-to-MDIO bridge circuit. This is exactly what RollBall modules implement - and why our fix needed to distinguish RollBall (bridge present) from this OEM module (no bridge).

The OEM manufacturer chose not to populate that bridge. The chip's MDIO pins exist physically inside the module but are electrically unreachable from outside. No register access, no speed configuration, no mode switching from the host side.

As for what the chip does autonomously: in Media Converter mode it does automatically adapt its SerDes speed to match whatever copper autoneg settles on. So when the copper peer is 2.5G, the chip switches its SerDes side to 2500Base-X - but MT7988A has no way to know that happened. Without a PHY driver (which requires MDIO access) phylink never gets a speed notification and stays at 10GBase-R. The result is a SerDes speed mismatch and no traffic.

The mode-cycling idea would require MT7988A to blindly probe SerDes speeds until its PCS sees a link. This is technically conceivable but: it is a non-standard phylink pattern that no maintainer would accept upstream, the timing is undefined, and it would need to re-run on every copper link event. More practically: this module is a 10G-only product. The manufacturer does not claim or support multi-speed operation and there is no straightforward path to enabling it without MDIO access.

For 10G peers the module works correctly with the patched kernel. For mixed-speed environments a module with a proper PHY management interface is the right tool.

But at least manual setting (with ethtool and maybe the OpenWrt speed config) should work, if the driver allows the different modes to be selected.

With fixed speed, is pause support advertised and working?

Sadly most if not all SFPs have very limited public datasheets, so it's difficult to know whether an I2C-to-MDIO bridge is present or not.

ethtool controls what the MAC side advertises to phylink — it has no path to the RTL8261BE's copper PHY. The chip's SerDes always presents 10GBASE-R to the MT7988A regardless of copper link speed; there is no command channel to switch it. Setting speed 1000 via ethtool would just take the MAC to 1000base-X, which the module never asserts → link down.

On pause: haven't specifically tested it. The dmesg shows flow control off at link-up, which is what the MAC reports from USXGMII — but without MDIO access to the SFP PHY there's no way to read what the module itself advertises.

The speed limitation is in the hardware architecture (no MDIO bridge = no SerDes mode control), not the driver. Short of MTK publishing confidential MAC register docs, there's nothing software can add here.

How can "The chip's SerDes always presents 10GBASE-R to the MT7988A regardless of copper link speed" and "2.5G copper → 2500BASE-X SerDes" both be true at the same time?
I know there is no way to force the link parameters on the RTL8261, but if it negotiates on the copper side and then matches that speed on the SerDes side, why wouldn't manually matching it on the MAC side with ethtool work?

As for flow control: if it is enabled on the copper side (we could check what is advertised to the link partner), I wonder whether the MAC needs to be configured to accept it, and whether ethtool can set this.
I suppose it wouldn't hurt to allow RX of pause frames (even if we don't know whether the link partner would actually send them), but on the TX side, sending frames the link partner doesn't expect might cause issues.

You're right to flag the contradiction. Post #1 was inaccurate — I described the chip's general multi-speed design capability, not its actual SFP module behavior. Hardware testing (including a Mikrotik setup with a 2.5G copper peer) confirms Post #10 is correct: the module's SerDes stays at 10GBASE-R regardless of copper negotiation speed.

So your ethtool suggestion wouldn't help — setting the MAC to 2.5G would break the link because the module's SerDes side is still running 10GBASE-R. The module is effectively 10G-only from the MAC's perspective.

This also means the patch has no multi-speed implications: it only fixes the MDIO_I2C_ROLLBALL misidentification that caused the kernel to stall looking for a PHY that doesn't exist.

So the SFP sets the RTL8261BE in USXGMII mode, not in Media Converter mode.
If I understand this correctly, when connected to a 10G copper it would be the same as 10GBase-R basically, but for other speeds (at least for 5G / 2.5G, the datasheet also mentions SGMII for 1G / 100M) the MAC would need to be set in USXGMII mode explicitly.

So would the SFP need a quirk with .support to set PHY_INTERFACE_MODE_USXGMII, since sfp_module_parse_support would never set that?
But then I wonder about sfp_select_interface, which wants to tie link_mode to the interface mode.

I don't know if I'm looking at the correct code, but with PHY_INTERFACE_MODE_USXGMII, phy_caps_from_interface would return all speeds, and phylink_get_inband_type would enable communication with the MAC.

Thanks for the detailed analysis of the USXGMII mode. You may well be right about how the RTL8261BE is configured internally.

However, this is out of scope for the current patch, which has one specific goal: fix the kernel misidentifying this module as a RollBall device and spending 5+ minutes searching for a PHY that doesn't exist. That problem is now solved.

Multi-speed support via USXGMII would be a separate effort requiring its own patch, its own review cycle, and — critically — a way to reach the PHY configuration registers from the host side, which this SFP form factor does not expose.

If you'd like to pursue USXGMII support for this module, I'd suggest opening a new thread so it gets the focused attention it deserves rather than being mixed into this one.