[Solved] D-Link DIR-860L - mt7621 ethernet connectivity issues

Hello,

I own a D-Link DIR-860L, running:

# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='19.07.4'
DISTRIB_REVISION='r11208-ce6496d796'
DISTRIB_TARGET='ramips/mt7621'
DISTRIB_ARCH='mipsel_24kc'
DISTRIB_DESCRIPTION='OpenWrt 19.07.4 r11208-ce6496d796'
DISTRIB_TAINTS=''

and I am experiencing some peculiar disconnects on the LAN side.

After flashing OpenWrt 19.07.4 clean and doing the basic setup, I noticed in the log random disconnects on the LAN side. Worth mentioning that I'm only using 100Mbit ethernet adapters.
The log records are in the form:

mtk_soc_eth 1e100000.ethernet eth0: port x link down
mtk_soc_eth 1e100000.ethernet eth0: port x link up

The port order on this router looks messed up and I haven't yet figured out which physical port belongs to which port number. All 4 LAN ports are in use, and I was able to identify the port for the laptop on my desk because its cable has a different color. It's plugged into port 1 (router), which shows up as port 4 in OpenWRT. Apparently, port 3 on the router is port 2 in OpenWRT; it is connected to a cheap Gembird USB 100Mbit NIC feeding a Raspberry Pi Zero.
Now the peculiarities:

  1. Port 1 (router) - port 4 (OpenWRT) got disconnected many times, sometimes after only 10-20 minutes, for no apparent reason. One scenario I could almost always reproduce was bouncing the pppoe interface, as in this snippet:
Sat Oct 24 18:17:02 2020 user.notice firewall: Reloading firewall due to ifup of wan (pppoe-wan)
Sat Oct 24 18:25:41 2020 kern.info kernel: [ 1005.881400] mtk_soc_eth 1e100000.ethernet eth0: port 4 link down
Sat Oct 24 18:25:44 2020 kern.info kernel: [ 1008.340405] mtk_soc_eth 1e100000.ethernet eth0: port 4 link up
  2. If I disconnected/reconnected the Raspberry Pi Zero (at the Pi end, without touching the router) - port 3 (router) - port 2 (OpenWRT), that would usually bounce port 1 (router) - port 4 (OpenWRT) too.
Sat Oct 24 17:37:32 2020 kern.info kernel: [59968.686679] mtk_soc_eth 1e100000.ethernet eth0: port 2 link down
Sat Oct 24 17:37:37 2020 kern.info kernel: [59973.079235] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
Sat Oct 24 17:56:26 2020 kern.info kernel: [61102.245327] mtk_soc_eth 1e100000.ethernet eth0: port 4 link down
Sat Oct 24 17:56:30 2020 kern.info kernel: [61106.337661] mtk_soc_eth 1e100000.ethernet eth0: port 4 link up
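
To get an overview of which ports flap and how often, the log can be summarized per port. This is only a minimal sketch: it assumes the log lines end in "port N link up/down" exactly as in the snippets above, and the function name count_flaps is made up for illustration.

```shell
# count_flaps: read syslog lines on stdin and print, per switch port,
# how many "link down" and "link up" events were logged.
count_flaps() {
    awk '
        # Only lines that end in "port N link up" or "port N link down".
        NF >= 4 && /mtk_soc_eth/ && $(NF-3) == "port" && $(NF-1) == "link" {
            count[$(NF-2) " " $NF]++
            ports[$(NF-2)] = 1
        }
        END {
            for (p in ports)
                printf "port %s: %d down, %d up\n", p, count[p " down"], count[p " up"]
        }
    ' | sort
}

# Usage on the router:
#   logread | count_flaps
```

The port numbers printed are the OpenWRT ones, not the labels on the router case.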

I tried disabling the VLAN functionality and that didn't help. Went on and disabled "Force link" on the LAN interface and that seemed to calm down the random disconnects + the pppoe induced ones + the Raspberry Pi Zero induced ones.
Still, port 1 (router) - port 4 (OpenWRT), the desktop system, would lose connectivity randomly without any "mtk_soc_eth 1e100000.ethernet eth0" records in the log.

Last try - I own a spare (new) Gembird USB 100Mbit NIC and swapped it onto the Raspberry Pi Zero. It has been some hours now and I haven't noticed any disconnects (including unlogged ones).

And .. I have two theories:

  • the old Gembird USB 100Mbit NIC is faulty and might have injected some noise in the Ethernet Transformer on the router PCB. Could be that the coils for the ports 2 & 4 (OpenWRT) are adjacent and can influence each other. But this doesn't explain the pppoe bouncing induced disconnection on port 4 (OpenWRT).

  • stumbled upon the following unresolved issues with the mt7621 driver:
    https://bugs.openwrt.org/index.php?do=details&task_id=1449&string=mt76
    https://github.com/openwrt/openwrt/pull/2847
    Wondering if what I was experiencing (maybe still am - not yet sure it's all fine now) is related to these driver issues.

I have experience with other 2 such D-Link DIR-860L routers, one running OpenWRT 18.x and the other one 19.07.3, but both are connected on only one LAN port to a Gigabit Switch and I never noticed such strange disconnects.

I'm also not sure I understand what force_link really means for the issue I'm reporting, or whether deactivating it actually helped. The documentation entry below isn't really helpful, since the LAN interface is brought up at boot by default:


Name: force_link
Type: boolean
Required: no
Default: 1 for protocol static, else 0
Description: Specifies whether ip address, route, and optionally gateway are assigned to the interface regardless of the link being active ('1') or only after the link has become active ('0'); when set to '1', carrier sense events do not invoke hotplug handlers
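
For anyone who wants to compare both settings, the option can be toggled from the shell with uci. A minimal sketch, assuming the interface section is named 'lan'; set_force_link is a hypothetical helper name, not an OpenWrt command:

```shell
# set_force_link IFACE VALUE
# Hypothetical helper: stores force_link=VALUE (0 or 1) for the given
# interface section of /etc/config/network via uci.
set_force_link() {
    iface=${1:-lan}
    value=${2:-0}
    case "$value" in
        0|1) ;;
        *) echo "force_link must be 0 or 1" >&2; return 1 ;;
    esac
    uci set "network.$iface.force_link=$value" && uci commit network
}

# Usage on the router (the reload applies the stored setting):
#   set_force_link lan 0
#   /etc/init.d/network reload
```

With '0', going by the description above, addresses and routes should only be assigned once the link is up, and carrier events should reach the hotplug handlers again.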

Any help appreciated.

After some more hours I still get the disconnect:

Sun Oct 25 00:11:04 2020 kern.info kernel: [ 6214.071526] mtk_soc_eth 1e100000.ethernet eth0: port 4 link down
Sun Oct 25 00:11:06 2020 kern.info kernel: [ 6216.537596] mtk_soc_eth 1e100000.ethernet eth0: port 4 link up

Apparently changing the Gembird USB 100Mbit NIC didn't help.
I'm also speculating now that my laptop puts its NIC (port 4) into some power-management mode, but that wouldn't make sense. I was working on the laptop, with some pauses while reading things, and it never went into power saving. Anyway, the only PM options active are turning off the display and HDD; it never sleeps or hibernates.

P.S. A few minutes later while reading some threads here in the forum:

Sun Oct 25 00:27:48 2020 kern.info kernel: [ 7218.755150] mtk_soc_eth 1e100000.ethernet eth0: port 4 link down
Sun Oct 25 00:27:51 2020 kern.info kernel: [ 7221.657865] mtk_soc_eth 1e100000.ethernet eth0: port 4 link up

Some progress with the investigation.
I've been monitoring and grepping the log for the last 20 hours, and this evening I made some changes based on what I've learned from this bug report:

  • yesterday (~20 h ago) I learned that the LAN ports order is reversed compared to what is labeled on the router:
    Router - Openwrt
    1 - 4
    2 - 3
    3 - 2
    4 - 1
  • I also learned that switching the cables order (using different ports) helped (no idea why - it is also reported in the bug report link above)
  • the LAN interface option force_link plays no role with this issue, I enabled it back
  • I was left with only port 3 ( the old Gembird USB 100Mbit NIC - Raspberry Pi Zero) experiencing the disconnect issue every few minutes:
Sun Oct 25 04:34:29 2020 kern.info kernel: [ 1904.203995] mtk_soc_eth 1e100000.ethernet eth0: port 3 link down
Sun Oct 25 04:34:31 2020 kern.info kernel: [ 1906.656149] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
Sun Oct 25 05:31:25 2020 kern.info kernel: [ 5320.867217] mtk_soc_eth 1e100000.ethernet eth0: port 3 link down
Sun Oct 25 05:31:28 2020 kern.info kernel: [ 5323.463332] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
Sun Oct 25 06:10:56 2020 kern.info kernel: [ 7691.475076] mtk_soc_eth 1e100000.ethernet eth0: port 3 link down
Sun Oct 25 06:11:00 2020 kern.info kernel: [ 7695.525705] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
....

The same kept repeating until 8 a.m., the moment when Kodi (that's what's running on the Pi Zero) stopped the playback of Internet Radio (meaning no more traffic was active). I deliberately left Kodi playing with the sound switched off and killed it with a cron job at 8 a.m. (I was still asleep). In the afternoon I started Kodi again, playing an Internet Radio stream, and the disconnects started to show up again on port 3, meaning they only happen when there is some traffic - Internet Radio is around 128 kbps.

  • around 16:00 I swapped the Gembird USB 100Mbit NIC on the Raspberry Pi Zero again (reinstalled the new spare one) and rebooted the router. I started Kodi again and it's still playing while I'm writing this post; there are no more disconnects on port 3:
# logread | grep mtk_soc_eth
Sun Oct 25 16:46:18 2020 kern.err kernel: [    3.311836] mtk_soc_eth 1e100000.ethernet: generated random MAC address xx:xx:xx:xx:xx:xx
Sun Oct 25 16:46:18 2020 kern.info kernel: [    4.734205] mtk_soc_eth 1e100000.ethernet: loaded mt7530 driver
Sun Oct 25 16:46:18 2020 kern.info kernel: [    4.746679] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 21
Sun Oct 25 16:46:18 2020 kern.info kernel: [    4.934063] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
Sun Oct 25 16:46:18 2020 kern.info kernel: [    5.734607] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
Sun Oct 25 16:46:18 2020 kern.info kernel: [    5.749464] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Sun Oct 25 16:46:18 2020 kern.info kernel: [    6.601206] mtk_soc_eth 1e100000.ethernet eth0: port 0 link up
Sun Oct 25 16:46:18 2020 kern.info kernel: [    8.134014] mtk_soc_eth 1e100000.ethernet: PPE started
Sun Oct 25 16:46:18 2020 kern.info kernel: [   11.667792] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x6060000c, 0x10c = 0x80818
Sun Oct 25 16:46:27 2020 kern.info kernel: [   23.804844] mtk_soc_eth 1e100000.ethernet: PPE started
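
To avoid mixing up the two numbering schemes while reading the log, the reversed LAN order listed above can be put into a tiny helper. This is just a sketch of the mapping I observed on my unit (label = 5 - port); port_label is a made-up name and other hardware revisions may differ.

```shell
# port_label: translate an OpenWRT switch port number (1-4) into the
# LAN label printed on the router case, assuming the reversed order
# observed above. Anything else is reported as "unknown".
port_label() {
    case "$1" in
        1|2|3|4) echo "LAN$((5 - $1))" ;;
        *) echo "unknown" ;;
    esac
}

# Usage:
#   port_label 3
```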

Still confused about what's actually causing the issue:

  • is it something electrical? noise or bad connections (poor quality plastic RJ45 sockets on the router).
  • or is it driver related? Worth mentioning that I got a few (2-3) records about port 0 disconnecting and that is internal, shouldn't have anything to do with the external wires and NIC cards.

P.S. Side Note
While I was playing around with the router, I had a look at the logs of another Raspberry connected to my LAN. It's a secured box with a static IP and a very restrictive firewall that logs and drops everything it isn't meant to receive or send. I get this every time I restart the router:

[Sun Oct 25 16:46:58 2020] smsc95xx 1-1.1:1.0 eth0: link down
[Sun Oct 25 16:47:07 2020] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
[Sun Oct 25 16:47:10 2020] UDP DROP: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:yy:yy:yy:yy:yy:yy:zz:zz:zz:zz:zz:zz:aa:aa:aa:aa:aa:aa:aa:aa SRC=192.168.1.1 DST=192.168.1.255 LEN=1029 TOS=0x00 PREC=0x00 TTL=64 ID=4495 DF PROTO=UDP SPT=46693 DPT=4919 LEN=1009

Where:

  • yy:yy:yy:yy:yy:yy router LAN MAC
  • zz:zz:zz:zz:zz:zz some constant MAC not present in my LAN (reoccurring)
  • aa:aa:aa:aa:aa:aa:aa:aa some dynamically generated string (MAC not present in my LAN)

I have no 192.168.1.1 configured in my router, LAN IP & Gateway is 192.168.1.100
Is 192.168.1.1 hard coded somewhere?
Here's the relevant section from my /etc/config/network

config interface 'lan'
        option type 'bridge'
        option ifname 'eth0.1'
        option proto 'static'
        option ipaddr '192.168.1.100'
        option netmask '255.255.255.0'
        option broadcast '192.168.1.255'
        option delegate '0'

I managed to resolve the reported issue and here are some more details:

Further investigation:
I had to do some more investigation for another issue I was experiencing with the 2.4 GHz WiFi (also resolved). While flashing OpenWRT 19.07.3 (an intentional downgrade) I had to put the router on its side to reach the reset button and bring it into recovery mode, which twisted all the ethernet cables a little.
After the successful flash I noticed a lot of link down/link up records in the log, almost flooding it, for almost all ports. I pushed the connectors firmly into the sockets and restarted the router. After the restart only port 3 was left reconnecting like crazy, every few seconds, so I swapped the Gembird USB 100Mbit NIC again (installed the new one). I restarted the router again and didn't notice any disconnects.
Meanwhile I resolved the WiFi issue (it wasn't related to OpenWRT) and flashed back the latest OpenWrt 19.07.4.
It's been over 48 hours now and I haven't seen any random "link down/ link up" records in the log, even after restarting the router several times.
Worth noting that during the investigations I haven't seen any errors or re-transmits in the ethernet stats counters on either the router or the clients (ifconfig).
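
Besides ifconfig, the error counters can also be read from the standard Linux sysfs statistics files. A minimal sketch, with show_err_counters being a made-up name; as far as I understand, eth0 here is the SoC's link to the internal switch, so these are interface-wide numbers and not per-port ones.

```shell
# show_err_counters [IFACE] [BASEDIR]
# Print the error-related counters of one interface.
# IFACE defaults to eth0; BASEDIR defaults to /sys/class/net and is
# only a parameter so the function can be tried against a test tree.
show_err_counters() {
    ifname=${1:-eth0}
    base=${2:-/sys/class/net}
    for c in rx_errors tx_errors rx_dropped tx_dropped rx_crc_errors collisions; do
        f="$base/$ifname/statistics/$c"
        # Skip counters that are missing on this system.
        [ -r "$f" ] && printf '%s %s=%s\n' "$ifname" "$c" "$(cat "$f")"
    done
    return 0
}

# Usage on the router or on a Linux client:
#   show_err_counters eth0
```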

Conclusion:
It looks like there is indeed a contact (electrical) issue with the cheap plastic RJ45 sockets and the mt7530 driver seems to reset the interface when noticing link issues instead of trying to fix/re-transmit.
After firmly pushing the ethernet connectors and rebooting the router (reloading the mt7530 driver), the disconnects simply disappear.

I'm OK now and hope that my experience serves as a lesson for others.

Will put the thread on Resolved.


Well, I marked this thread as resolved because I thought I had found a workaround to silence the ethernet port down/up events.
Apparently they still occur randomly after 24-48 hours of uptime, and the only way to get rid of them is to restart the router. Firmly pushing in the ethernet connectors doesn't seem to help, and I'm starting to believe it's 100% a driver issue. Besides, I found out that port 0 is actually the WAN interface; I don't know why I believed it was something internal.
Here are the latest findings:

logread | grep mtk_soc_eth
Tue Nov  3 08:32:06 2020 kern.info kernel: [13934.966911] mtk_soc_eth 1e100000.ethernet eth0: port 0 link down
Tue Nov  3 08:34:32 2020 kern.info kernel: [14080.806664] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 08:35:05 2020 kern.info kernel: [14114.330869] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 08:35:08 2020 kern.info kernel: [14116.886705] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 08:43:33 2020 kern.info kernel: [14622.072861] mtk_soc_eth 1e100000.ethernet eth0: port 0 link up
Tue Nov  3 08:43:54 2020 kern.info kernel: [14643.302157] mtk_soc_eth 1e100000.ethernet eth0: port 0 link down
Tue Nov  3 08:43:57 2020 kern.info kernel: [14646.305265] mtk_soc_eth 1e100000.ethernet eth0: port 0 link up
Tue Nov  3 08:44:13 2020 kern.info kernel: [14661.993053] mtk_soc_eth 1e100000.ethernet eth0: port 0 link down
Tue Nov  3 08:44:16 2020 kern.info kernel: [14665.036617] mtk_soc_eth 1e100000.ethernet eth0: port 0 link up
Tue Nov  3 08:44:16 2020 kern.info kernel: [14665.388770] mtk_soc_eth 1e100000.ethernet eth0: port 0 link down
Tue Nov  3 08:44:20 2020 kern.info kernel: [14668.896265] mtk_soc_eth 1e100000.ethernet eth0: port 0 link up
Tue Nov  3 08:51:33 2020 kern.info kernel: [14767.849840] mtk_soc_eth 1e100000.ethernet eth0: port 2 link down
Tue Nov  3 08:51:40 2020 kern.info kernel: [14774.988523] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
Tue Nov  3 08:52:51 2020 kern.info kernel: [14846.043023] mtk_soc_eth 1e100000.ethernet eth0: port 2 link down
Tue Nov  3 08:52:52 2020 kern.info kernel: [14847.627645] mtk_soc_eth 1e100000.ethernet eth0: port 2 link up
Tue Nov  3 08:55:35 2020 kern.info kernel: [15010.573220] mtk_soc_eth 1e100000.ethernet eth0: port 3 link down
Tue Nov  3 08:55:38 2020 kern.info kernel: [15013.094936] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
Tue Nov  3 10:03:39 2020 kern.info kernel: [19094.142556] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 10:48:42 2020 kern.info kernel: [21797.517598] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 10:49:31 2020 kern.info kernel: [21846.510586] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 10:49:34 2020 kern.info kernel: [21848.982607] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 12:20:55 2020 kern.info kernel: [27330.217991] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 12:48:34 2020 kern.info kernel: [28989.218387] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 12:49:25 2020 kern.info kernel: [29039.855558] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 12:49:27 2020 kern.info kernel: [29042.328412] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 13:20:36 2020 kern.info kernel: [30910.671226] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 20:53:19 2020 kern.info kernel: [58073.881214] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Tue Nov  3 20:54:11 2020 kern.info kernel: [58125.937881] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Tue Nov  3 20:54:14 2020 kern.info kernel: [58128.426172] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up
Wed Nov  4 00:01:20 2020 kern.info kernel: [69354.827702] mtk_soc_eth 1e100000.ethernet eth0: port 3 link down
Wed Nov  4 00:01:23 2020 kern.info kernel: [69357.325027] mtk_soc_eth 1e100000.ethernet eth0: port 3 link up
Wed Nov  4 00:36:49 2020 kern.info kernel: [71483.600453] mtk_soc_eth 1e100000.ethernet eth0: port 1 link down
Wed Nov  4 00:36:52 2020 kern.info kernel: [71486.158273] mtk_soc_eth 1e100000.ethernet eth0: port 1 link up

(port 4 is connected to a second laptop that I only start once a week, thus, the port is most of the time down)

Only a reboot (ethernet driver re-initialization) silences the ethernet down/up messages. And these are not just messages but actual disconnects: packets are lost (no communication) while the link is down.
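
To see how long each outage lasted, the down/up records can be paired per port using the kernel timestamp in square brackets. A rough sketch, assuming the log format shown above and no reboot between a "down" and its "up" (the kernel timestamp restarts at boot); flap_durations is a made-up name.

```shell
# flap_durations: read syslog lines on stdin and print how long each
# port stayed down, based on the kernel timestamp in square brackets.
flap_durations() {
    awk '
        NF >= 4 && $(NF-3) == "port" && $(NF-1) == "link" &&
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets; "+ 0" turns the text into a number.
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            port = $(NF-2)
            if ($NF == "down") {
                down_at[port] = ts
            } else if ($NF == "up" && (port in down_at)) {
                printf "port %s down for %.1f s\n", port, ts - down_at[port]
                delete down_at[port]
            }
        }
    '
}

# Usage on the router:
#   logread | grep mtk_soc_eth | flap_durations
```

An "up" record without a preceding "down" in the input is simply skipped. Long durations can also just mean the attached device was switched off.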


Same issue on Xiaomi R3P Pro, it's so annoying.

I pushed the WAN cable more firmly and I don't get errors on port 0 anymore (over 3 weeks now).
Then I changed the cables for port 1 and 3 some two weeks ago, ... although they looked OK, and I haven't seen any errors ever since.
I tend to believe that there is a HW issue: the Ethernet connectors on the router are really poor quality and the router is very sensitive. On top of that, the Ethernet driver itself is poor quality; it should try to handle such errors better instead of just resetting the port.


If your problem is solved, please consider marking this topic as [Solved]. See How to mark a topic as [Solved] for a short how-to.

The problem as a whole is not solved, as the Ethernet driver for this SoC is bad quality. However, there is a bug report related to this issue and this thread looks to be a duplicate ... with a little more investigation info and a workaround instead of a solution - use perfectly manufactured RJ45 connectors, push them with a rubber hammer, glue them with superglue and never touch them again...
Will put it on solved if that makes you happier :slight_smile:
