Filogic: can't figure out how to use the second gmac with external phy

I am trying to add support for a new mt7986 board with two Maxlinear GPY211C phys, one connected to switch port 5 and the other to the second gmac. The one attached to the switch works fine, but I cannot figure out what is wrong with the gmac connection.

No matter what I do, I am unable to send or receive any packets via the gmac2 netdev (eth1). It does work with OEM firmware, so the hardware must be OK.

The relevant part of my dts looks like this:

&eth {
	status = "okay";

	gmac0: mac@0 {
		compatible = "mediatek,eth-mac";
		reg = <0>;
		phy-mode = "2500base-x";

		fixed-link {
			speed = <2500>;
			full-duplex;
			pause;
		};
	};

	// FIXME:  this fails to forward any traffic over mac@1
	mac@1 {
		compatible = "mediatek,eth-mac";
		reg = <1>;
//		phy-mode = "sgmii";
		phy-mode = "2500base-x"; // phy will automatically switch to sgmii
		phy-handle = <&phy6>;
	};

	mdio: mdio-bus {
		#address-cells = <1>;
		#size-cells = <0>;
	};
};

&mdio {
	// must reset before probing phy5 and phy6- using dummy as workaround
	phy@0 {
		compatible = "ethernet-phy-ieee802.3-c45";
		reg = <0>;
		reset-gpios = <&pio 6 GPIO_ACTIVE_LOW>;
		reset-assert-us = <50000>;
		reset-deassert-us = <20000>;

	};

	phy5: phy@5 {
		compatible = "ethernet-phy-ieee802.3-c45";
		reg = <5>;
	};

	phy6: phy@6 {
		compatible = "ethernet-phy-ieee802.3-c45";
		reg = <6>;
	};

	switch: switch@1f {
		compatible = "mediatek,mt7531";
		reg = <31>;
		reset-gpios = <&pio 5 GPIO_ACTIVE_HIGH>;
		interrupt-controller;
		#interrupt-cells = <1>;
		interrupt-parent = <&pio>;
		interrupts = <66 IRQ_TYPE_LEVEL_HIGH>;
	};
};

&switch {
	ports {
		#address-cells = <1>;
		#size-cells = <0>;

		port@0 {
			reg = <0>;
			label = "lan2";
		};

		port@1 {
			reg = <1>;
			label = "lan3";
		};

		port@2 {
			reg = <2>;
			label = "lan4";
		};

		port@5 {
			reg = <5>;
			label = "lan1";
			phy-mode = "2500base-x";
			phy-handle = <&phy5>;
		};

		port@6 {
			reg = <6>;
			label = "cpu";
			ethernet = <&gmac0>;
			phy-mode = "2500base-x";

			fixed-link {
				speed = <2500>;
				full-duplex;
				pause;
			};
		};
	};
};

(NOTE: Ignore phy@0; it is a temporary hack. It ensures that phy@5 and phy@6 are powered up in time for probing. I don't know exactly what gpio 6 is connected to, but I don't think it can be the phy@5 reset line like the OEM firmware claimed. It more likely enables a power domain shared by both phys.)

Everything looks very good on the surface. Looks like the phy is attached and muxing set up as expected:

root@(none):/# ifconfig eth1 up
[  138.256433] mtk_soc_eth 15100000.ethernet eth1: PHY [mdio-bus:06] driver [Maxlinear Ethernet GPY211C] (irq=POLL)
[  138.266626] mtk_soc_eth 15100000.ethernet eth1: configuring for phy/2500base-x link mode
[  138.274705] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gdm1_to_gmac1_esw isn't present on the SoC
[  138.284856] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gmac2_gmac0_to_gephy isn't present on the SoC
[  138.295266] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_u3_gmac2_to_qphy isn't present on the SoC
[  138.305328] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gmac1_gmac2_to_sgmii_rgmii isn't present on the SoC
[  138.316264] set_mux_gmac12_to_gephy_sgmii: mtk_soc_eth 15100000.ethernet: path gmac2_sgmii in set_mux_gmac12_to_gephy_sgmii updated = 1

ethtool is also fine. Without anything connected:

root@(none):/# ethtool eth1
Settings for eth1:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Half
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 6
        Transceiver: external
        MDI-X: Unknown
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: no

Connecting to a 1gig device also looks good to me:

root@(none):/# [ 1123.207295] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gdm1_to_gmac1_esw isn't present on the SoC
[ 1123.217465] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gmac2_gmac0_to_gephy isn't present on the SoC
[ 1123.227875] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_u3_gmac2_to_qphy isn't present on the SoC
[ 1123.237938] mtk_eth_mux_setup: mtk_soc_eth 15100000.ethernet: mux mux_gmac1_gmac2_to_sgmii_rgmii isn't present on the SoC
[ 1123.248873] set_mux_gmac12_to_gephy_sgmii: mtk_soc_eth 15100000.ethernet: path gmac2_sgmii in set_mux_gmac12_to_gephy_sgmii updated = 1
[ 1123.261032] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 1123.269473] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
root@(none):/# ethtool eth1
Settings for eth1:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: slave
        Port: Twisted Pair
        PHYAD: 6
        Transceiver: external
        MDI-X: on (auto)
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes

But I still can't receive or transmit anything. No errors. Nothing. Just silence.

FWIW, connecting "lan1" (switch port 5) to the same 1gig device looks very similar, except that it works...

root@(none):/# [ 1217.769488] mt7530 mdio-bus:1f lan1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 1217.776989] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
root@(none):/# ethtool lan1
Settings for lan1:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: slave
        Port: Twisted Pair
        PHYAD: 5
        Transceiver: external
        MDI-X: on (auto)
        Supports Wake-on: pg
        Wake-on: d
        Link detected: yes

I assume there's some important detail I'm missing here. But what?

Ok, made some progress wrt verifying that the problem is where expected:

The phy datasheet is made public (thanks a lot to Maxlinear!):

Reading out the SGMII status register (30.9), I see that the phy attached to eth1 (the second mtk_soc_eth gmac) reports only the reset value:

root@OpenWrt:/# mdio mdio-bus mmd 6:30 raw 9
0x0008

That is, the auto-neg enabled bit is set and nothing else.

Reading the same register on the phy attached to switch port 5 I see with and without a 1gig connection:

root@OpenWrt:/# mdio mdio-bus mmd 5:30 raw 9
0x002e
root@OpenWrt:/# mdio mdio-bus mmd 5:30 raw 9
0x000e

The additional bits indicate "link active" and "SGMII link rate 1000 Mbit/s", with an additional "auto-neg completed" when I actually connect something.
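To make the comparison between the three readings easier to repeat, here is a small sketch that splits the value into fields. The bit positions are an assumption inferred from the values quoted in this thread (bits 1:0 data rate, bit 2 link, bit 3 auto-neg enabled, bit 5 auto-neg completed), not copied from the datasheet, and the function name is made up for illustration.

```shell
#!/bin/sh
# Decode a reading of the GPY211C SGMII status register (MMD 30, register 9).
# ASSUMPTION: bit layout inferred from the three values in this thread:
#   bits 1:0  SGMII data rate field (2 seen together with "1000 Mbit/s")
#   bit  2    SGMII link active
#   bit  3    auto-negotiation enabled
#   bit  5    auto-negotiation completed
decode_sgmii_stat() {
	val=$(( $1 ))
	rate=$(( val & 0x3 ))
	link=$(( (val >> 2) & 1 ))
	anen=$(( (val >> 3) & 1 ))
	anok=$(( (val >> 5) & 1 ))
	echo "rate=$rate link=$link an_enabled=$anen an_done=$anok"
}

# The three values read above: eth1 phy, lan1 phy without and with a link partner.
for v in 0x0008 0x000e 0x002e; do
	printf '%s: %s\n' "$v" "$(decode_sgmii_stat "$v")"
done
```

On the board, the argument could come straight from the mdio tool, e.g. decode_sgmii_stat "$(mdio mdio-bus mmd 6:30 raw 9)".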

So this explains why it doesn't work, obviously. There is no SGMII link between the mac and phy.

But I am still as clueless as before wrt the root cause of that. Is there anyone who has tested the second gmac on the MT7986 and can provide some hints? Anything at all is appreciated, like examples of known working device trees for some board.

The most relevant example I have found so far is the BPi-R3, which has SFP slot 1 connected to this gmac. Not exactly the same, but pretty similar if we assume an SFP module with a phy. Did anyone try that and get it to work? @daniel ? @nbd ? @VA1DER ?

AFAICS, the only significant difference between my configuration and the BPi-R3 is that the latter has

managed = "in-band-status";

which completely breaks phy communication if I try it. I guess that's expected, given that there is no SGMII link to run the in-band status over.

Yay! Noticed that a number of patches to drivers/net/ethernet/mediatek/mtk_sgmii.c in mainline differ from the pending ones we have. Backporting the mainline ones and configuring the mac like this actually works:

	mac@1 {
		compatible = "mediatek,eth-mac";
		reg = <1>;
		phy-mode = "sgmii";
		phy-handle = <&phy6>;
		managed = "in-band-status";
	};

So then I assume it wasn't just me this time :)

I need somewhere to document this, in case someone wonders why I'm having this discussion with myself :)

No, still not friends with this board unfortunately. Can't find a mac config which works regardless of link partner and setting.

Connecting to a 2.5Gbps capable peer works fine with

	mac@1 {
		compatible = "mediatek,eth-mac";
		reg = <1>;
		phy-mode = "2500base-x";
		phy-handle = <&phy6>;
	};

but not if I try to force the speed on either end. Then it breaks. Any attempt to use "in-band-status" makes the 2.5Gbps peer fail regardless of speed. Adding a bit of debug output to the backported mtk_sgmii.c suggests that the negotiation stuff is "working". If there only were some packets too...

Enabling autoneg on the remote end:

[  211.414825] mtk_soc_eth 15100000.ethernet eth1: Link is Down
[  215.575282] mtk_pcs_config: advertise=0x20
[  215.579381] mtk_pcs_config: link_timer=0x989680
[  215.583892] mtk_pcs_config: sgm_mode=0x0, use_an=0 bmcr=0x0
[  215.589468] mtk_pcs_link_up: 
[  215.592420] mtk_pcs_link_up: forcing sgm_mode=0x18
[  215.597213] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control rx/tx

and packets flow.

Forcing 1000/full on the remote end:

[  550.454825] mtk_soc_eth 15100000.ethernet eth1: Link is Down
[  552.535335] mtk_pcs_config: advertise=0x1
[  552.539339] mtk_pcs_config: link_timer=0x186a00
[  552.543852] mtk_pcs_config: sgm_mode=0x1, use_an=0 bmcr=0x0
[  552.549419] mtk_pcs_link_up: 
[  552.552371] mtk_pcs_link_up: forcing sgm_mode=0x18
[  552.557160] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx

and there are no packets in either direction. Similar with 100/full:

[  935.254832] mtk_soc_eth 15100000.ethernet eth1: Link is Down
[  937.335155] mtk_pcs_link_up: 
[  937.338119] mtk_pcs_link_up: forcing sgm_mode=0x14
[  937.342910] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 100Mbps/Full - flow control rx/tx

This is with the following printks added in suitable places in mtk_sgmii.c:

	pr_info("%s: advertise=0x%x\n", __func__, advertise);
	pr_info("%s: link_timer=0x%x\n", __func__, link_timer);
	pr_info("%s: sgm_mode=0x%x, use_an=%u bmcr=0x%x\n", __func__, sgm_mode, use_an, bmcr);
	pr_info("%s: \n", __func__);
	pr_info("%s: forcing sgm_mode=0x%x\n", __func__, sgm_mode);

For the working 2.5Gbps autoneg case, this is what the phy says:

root@OpenWrt:/# mdio mdio-bus mmd 6:30 raw 8
0x24da

and for the 1Gbps/100Mbps:

root@OpenWrt:/# mdio mdio-bus mmd 6:30 raw 8
0x34da

So ANEG is enabled on the SGMII link between mac and phy when the speed is supposed to be lower than max. Probably makes sense?
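A quick way to see exactly which bits changed between the two readings is to XOR them. This is only a sketch; the helper name is made up, and the interpretation of the differing bit as the SGMII ANEG enable is the one given above.

```shell
#!/bin/sh
# Print the bits that differ between two register readings.
reg_diff() {
	printf '0x%04x\n' $(( $1 ^ $2 ))
}

# SGMII control register (30.8): 2.5Gbps case vs 1Gbps/100Mbps case
reg_diff 0x24da 0x34da
```

The only difference is 0x1000, i.e. bit 12, so nothing else in that register changes between the working and the non-working case.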

For the PHY reset, you can simply put the reset-gpios to the &mdio node. Note that Linux uses reset-delay-us and reset-post-delay-us for MDIO reset delays.

For MTK's SGMII PCS, I don't have their datasheet so I don't know how it works. But I've been working on DesignWare's PCS, so I can try to clear some things up:

  • There are two types of autoneg taking place here: 1. The Clause 28 autoneg between the two link partners on the RJ45/UTP side (this is what you see with ethtool). 2. The Clause 37 autoneg between the two PCS (MAC and PHY). Normally a link change event on the UTP side will also restart the Clause 37 AN, if enabled.

  • Clause 37 is originally for 1000Base-X, which is 1Gbps-only with a 1.25 Gbaud SerDes. To support lower speeds (100Mbps and 10Mbps), Cisco made SGMII, which uses symbol replication to achieve lower speeds (so the SerDes speed remains at 1.25 Gbaud).

  • There's also 2500Base-X (some IP vendors call it High-SGMII), which is merely an overclocked 1000Base-X (SerDes runs at 3.125 Gbaud). Note that 2500Base-X and 1000Base-X (or SGMII) use different SerDes speeds, so Clause 37 autoneg is not able to switch between 2.5G and 1G/100M/10M, and a manual switch of the SerDes speed is required.
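The numbers in the list above can be checked with a bit of arithmetic. The sketch below assumes 8b/10b line coding (8 data bits per 10 line bits) for both SerDes speeds; the function names are made up for illustration.

```shell
#!/bin/sh
# Payload rate of an 8b/10b-coded SerDes lane: 8 data bits per 10 line bits.
payload_mbps() {		# $1 = line rate in Mbaud
	echo $(( $1 * 8 / 10 ))
}

# SGMII keeps the lane at 1.25 Gbaud and repeats each symbol to reach lower speeds.
sgmii_replication() {		# $1 = MAC speed in Mbit/s (10, 100 or 1000)
	echo $(( 1000 / $1 ))
}

echo "1000Base-X / SGMII lane: $(payload_mbps 1250) Mbit/s"
echo "2500Base-X lane:         $(payload_mbps 3125) Mbit/s"
for s in 1000 100 10; do
	echo "SGMII at $s Mbit/s: each symbol sent $(sgmii_replication "$s") time(s)"
done
```

So 10/100/1000 all share one lane speed and differ only in replication, while 2.5G needs the lane itself to run faster, which is why in-band autoneg alone cannot move between the two groups.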

Thanks a lot for your helpful hints. Yes, references to the relevant 802.3 clauses make the problem description clearer.

AFAICS, the clause 28 autoneg works perfectly. ethtool reports expected values on both ends.

So I guess the problem is related to the PCS clause 37 autoneg between MAC and PHY, although I know so little about this subject that I'm a bit hesitant to conclude. But from your description I wonder if the issue could be that the SerDes clock isn't changed when it should be? Trying to run SGMII with an overclocked SerDes will probably not work too well. I'll try to add some more debugging around that.

Wrt the PHY reset: Yes, I eventually figured that out after resorting to actually reading the docs and discovering the assert/deassert => delay/post-delay difference between phy and mdio-bus.