Strange Ethernet bug with MT7621 device

Hi, I'm trying to port OpenWrt 21.02 to an unreleased device powered by an MT7621, but I'm hitting a weird bug. From what I could trace, some people say it's a silicon bug that can be worked around by disabling TC on the port, but I couldn't find any code reference for that in the newer 5.4 kernel.

NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out

This issue only appears if I boot with Ethernet plugged in; the kernel spits out a stack trace and then crashes:

With Ethernet unplugged, the device boots normally, and eth works fine once it is plugged in:

DTS code:

Does anyone have any ideas?

Take a look at - https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/ramips/dts/mt7621.dtsi;h=7636f9d8000a55bd453de0dab1db685ef292fabf;hb=b4d7885af70a42df7577157b96f941500bd17bfb#l466

This is the default mt7621 dtsi file used by all mt7621 devices. Create a corresponding DTS for your new device that includes it and overrides the settings you need.

			switch0: switch@1f {
				compatible = "mediatek,mt7621";
				#address-cells = <1>;
				#size-cells = <0>;
				reg = <0x1f>;
				mediatek,mcm;
				resets = <&rstctrl 2>;
				reset-names = "mcm";

				ports {
					#address-cells = <1>;
					#size-cells = <0>;
					reg = <0>;

					port@0 {
						status = "disabled";
						reg = <0>;
						label = "lan0";
					};

					port@1 {
						status = "disabled";
						reg = <1>;
						label = "lan1";
					};

					port@2 {
						status = "disabled";
						reg = <2>;
						label = "lan2";
					};

					port@3 {
						status = "disabled";
						reg = <3>;
						label = "lan3";
					};

					port@4 {
						status = "disabled";
						reg = <4>;
						label = "lan4";
					};

					port@6 {
						reg = <6>;
						label = "cpu";
						ethernet = <&gmac0>;
						phy-mode = "rgmii";

						fixed-link {
							speed = <1000>;
							full-duplex;
						};
					};
				};
			};

Just override port 6 in your device DTS, e.g. remove full-duplex, pause, etc.

Uhh... I don't think you're supposed to touch port 6, which connects the gmac to the internal switch. My device only has 2 ports coming out of that switch and they work just fine with my DTS; the other unconnected ports are left disabled by default.

ok, without hacking the code, you can use ethtool to control the ports.

The kernel pretty much crashes right after boot, so it's not possible to use ethtool that early to control it.

Okay, the patch here seems to work: https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg60090.html

I wonder why this isn't getting upstreamed to OpenWrt? Probably some other devices have this issue too?

Looking at the patch, it needs a proper title and needs to be submitted to the master branch first.

As per statement - https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg60091.html

The way the patch is done could be better: rather than #ifdef DISABLE_MTK_FC, why not do an if (priv->id == ID_MT7621) to turn off FC, since this platform is the one that has the issue?
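
To illustrate the runtime check, here is a rough sketch, not the actual driver code. The struct, the enum values, the bit positions and the function name port_mac_control are all made up for this example; only the idea of testing priv->id against ID_MT7621 comes from the suggestion above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's chip IDs and register bits.
 * The names echo the thread; the values are assumptions for this sketch. */
enum chip_id { ID_MT7530, ID_MT7621 };

#define FC_TX_EN (1u << 5) /* assumed bit: transmit flow control */
#define FC_RX_EN (1u << 4) /* assumed bit: receive flow control */

struct switch_priv {
	enum chip_id id;
};

/* Return the MAC control value to program for a port. On the affected
 * MT7621 both flow-control bits are cleared; other chips keep the
 * value unchanged. */
uint32_t port_mac_control(const struct switch_priv *priv, uint32_t mcr)
{
	if (priv->id == ID_MT7621)
		mcr &= ~(FC_TX_EN | FC_RX_EN);
	return mcr;
}
```

Compared with the #ifdef, the same build then works for every chip the driver supports, and only the affected platform loses flow control.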

I've read somewhere that only a certain CPU revision of the MT7621 has this issue, not all of them. There was some code in a previous branch that detected the revision and disabled the FC function if it matched; dunno where it has gone....