TL-WR841N v8 ath79 Link Unstable

In rc1 eth0 was very unstable but it is stable now in final 19.07.
There's now a problem with eth1 (WAN) though.

[ 8450.943196] eth1: link down
[ 8451.981464] eth1: link up (100Mbps/Full duplex)
[37071.909045] eth1: link down
[37072.988861] eth1: link up (100Mbps/Full duplex)
[48492.291332] eth1: link up (10Mbps/Half duplex)
[48632.691559] eth1: link up (100Mbps/Full duplex)
[55697.449886] eth1: link down
[55698.491082] eth1: link up (100Mbps/Full duplex)

Here's the report for the eth0 issue but there has been no reply for a while so I thought I'd post it here.

https://bugs.openwrt.org/index.php?do=details&task_id=2216

Loss of connection, and especially linking at 10 Mbps, could be a hardware issue: a bad cable or lightning damage to the port in the router or modem.

The standard for diagnosing hardware issues is to return to stock firmware and see if it continues to have problems.


The switch driver was changed in 19.07 for this device and that's probably causing these issues. I was on 18.06 for a very long time and this never happened. You can also see in the ticket that the issues only appeared after upgrading to 19.07, so it's not a hardware failure but rather a bug in the new switch driver.

I have an 841n v8 too, but upgraded to 64 MB RAM and 8 MB flash.
I've been running 19.07.0 ath79 for hours without any issue.
The stock model has only 32 MB of RAM, so I think this issue is related to insufficient RAM.

Please check dmesg. I think you will find the same messages there.

I'm watching the UART console and don't see any problems.
Here are the latest dmesg messages:

[   15.764291] Loading modules backported from Linux version v4.19.85-0-gc63ee2939dc1
[   15.772200] Backport generated by backports.git v4.19.85-1-0-g8a8be258
[   15.783834] random: crng init done
[   15.791199] ip_tables: (C) 2000-2006 Netfilter Core Team
[   15.812667] nf_conntrack version 0.5.0 (1024 buckets, 4096 max)
[   15.823069] ctnetlink v0.93: registering with nfnetlink.
[   15.952764] xt_time: kernel timezone is -0000
[   16.078753] PPP generic driver version 2.4.2
[   16.086557] NET: Registered protocol family 24
[   16.160848] ath: EEPROM regdomain: 0x0
[   16.160861] ath: EEPROM indicates default country code should be used
[   16.160865] ath: doing EEPROM country->regdmn map search
[   16.160883] ath: country maps to regdmn code: 0x3a
[   16.160890] ath: Country alpha2 being used: US
[   16.160894] ath: Regpair used: 0x3a
[   16.178205] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   16.182417] ieee80211 phy0: Atheros AR9340 Rev:3 mem=0xb8100000, irq=2
[   16.230385] kmodloader: done loading kernel modules from /etc/modules.d/*
[   47.066925] eth0: link up (1000Mbps/Full duplex)
[   47.088580] br-lan: port 1(eth0.1) entered blocking state
[   47.094276] br-lan: port 1(eth0.1) entered disabled state
[   47.100209] device eth0.1 entered promiscuous mode
[   47.105169] device eth0 entered promiscuous mode
[   47.161043] br-lan: port 1(eth0.1) entered blocking state
[   47.166643] br-lan: port 1(eth0.1) entered forwarding state
[   47.470798] jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
[   47.477920] jffs2_build_filesystem(): unlocking the mtd device... 
[   47.515221] done.
[   47.523675] jffs2_build_filesystem(): erasing all blocks after the end marker... 
[   49.382318] eth1: link up (100Mbps/Full duplex)
[   51.420950] done.
[   51.422988] jffs2: notice: (1173) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 u.
[   51.721315] overlayfs: upper fs does not support tmpfile.
root@OpenWrt:/# uptime
 17:13:34 up 34 min,  load average: 0.00, 0.00, 0.00

Check again in a few hours. It happens at random intervals.

I have about 7 MB RAM free on my device so don't really think it's related to RAM.

Well, you're right. After a few minutes of heavy load testing (downloading a torrent while watching a video stream), the issue showed up, but the internet connection kept working.

[ 3151.230793] eth1: link up (10Mbps/Half duplex)
[ 3468.429171] eth1: link down
[ 8372.031970] eth1: link up (10Mbps/Half duplex)
[ 8483.312309] eth1: link up (100Mbps/Full duplex)
[ 8648.671362] eth1: link down
[ 8648.785204] eth1: link up (100Mbps/Full duplex)
[ 8699.792408] eth1: link up (10Mbps/Half duplex)
[ 8753.872414] eth1: link up (100Mbps/Full duplex)
[ 8787.152712] eth1: link down
[ 8787.264991] eth1: link up (100Mbps/Full duplex)
[ 8829.952701] eth1: link up (10Mbps/Half duplex)
[ 8838.272289] eth1: link up (100Mbps/Full duplex)

Yes, but the internet won't work until the link is back up. I guess this confirms that there's definitely an issue with the new switch driver.

There's not even heavy load on my interfaces so load is probably not related. It could be related though if it happens more frequently under heavy load.
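
One way to check whether the drops really get more frequent under load is to save the dmesg output to a PC and count the link events there. Below is a minimal Python sketch, assuming the kernel log format shown in this thread. The names `parse_link_events` and `summarize` are made up for illustration and are not part of OpenWrt, and the script is meant to run on a desktop, not on the router.

```python
import re

# Matches kernel lines such as
#   [ 8450.943196] eth1: link down
#   [ 8451.981464] eth1: link up (100Mbps/Full duplex)
LINK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<iface>\w+): link (?P<state>up|down)"
    r"(?: \((?P<speed>\d+)Mbps/(?P<duplex>Full|Half) duplex\))?"
)


def parse_link_events(text, iface="eth1"):
    """Return a list of (timestamp, state, speed_mbps) tuples for one interface.

    speed_mbps is None for "link down" lines.
    """
    events = []
    for line in text.splitlines():
        m = LINK_RE.search(line)
        if m and m.group("iface") == iface:
            speed = int(m.group("speed")) if m.group("speed") else None
            events.append((float(m.group("ts")), m.group("state"), speed))
    return events


def summarize(events):
    """Count link-down events and 10 Mbps renegotiations, and list the gaps
    (in seconds of uptime) between consecutive link events."""
    downs = sum(1 for _, state, _ in events if state == "down")
    slow = sum(1 for _, state, speed in events if state == "up" and speed == 10)
    gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
    return {
        "events": len(events),
        "link_down": downs,
        "renegotiated_10mbps": slow,
        "gaps": gaps,
    }


# Sample lines copied from the logs posted in this thread.
sample = """\
[ 8450.943196] eth1: link down
[ 8451.981464] eth1: link up (100Mbps/Full duplex)
[48492.291332] eth1: link up (10Mbps/Half duplex)
[   47.066925] eth0: link up (1000Mbps/Full duplex)
"""

events = parse_link_events(sample)
summary = summarize(events)
print(summary)
```

Running it once on a log captured while idle and once on a log captured under load, then comparing the counts and gaps, would show whether load makes any difference.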

Another issue with the ath79 firmware: sometimes the router accidentally enters failsafe mode on reboot. I see these messages on the serial UART console:

[    5.868712] init: - preinit -
[    5.960699] random: procd: uninitialized urandom read (4 bytes read)
[    6.909058] random: jshn: uninitialized urandom read (4 bytes read)
[    7.327749] random: jshn: uninitialized urandom read (4 bytes read)
[    7.567130] random: jshn: uninitialized urandom read (4 bytes read)
[    7.638687] random: jshn: uninitialized urandom read (4 bytes read)
Press the [f] key and hit [enter] to enter failsafe mode
Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
[    9.860896] eth0: link up (1000Mbps/Full duplex)
- failsafe button rfkill was pressed -
- failsafe -
[   12.309911] urandom_read: 3 callbacks suppressed
[   12.309921] random: dropbearkey: uninitialized urandom read (32 bytes read)
Generating 1024 [   12.323834] random: dropbearkey: uninitialized urandom read (32 bytes read)
bit rsa key, this may take a while...
Public key portion is:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCKZVT/UJoTHz2CkNHrYmfzzBDfa/6Yd6+LJVcrURD15lbN3ch7AG/tc/ZolkeOrqkKD6dM2h4TsYv4+)
Fingerprint: sha1!! b9:a4:d3:94:71:b3:34:ae:a3:85:c8:e9:a7:c8:63:fd:74:52:5e:ad


BusyBox v1.30.1 () built-in shell (ash)

ash: can't access tty; job control turned off
  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07-SNAPSHOT, r10868-08d9828b76
 -----------------------------------------------------
================= FAILSAFE MODE active ================
special commands:
* firstboot          reset settings to factory defaults
* mount_root     mount root-partition with config files

after mount_root:
* passwd                         change root's password
* /etc/config               directory with config files

for more help see:
https://openwrt.org/docs/guide-user/troubleshooting/
- failsafe_and_factory_reset
- root_password_reset
=======================================================

I noticed that when the Wi-Fi on/off switch on the router is in the on position, the router accidentally enters failsafe mode. With the switch off, everything works well.
The issue happens on the 841n v8/v10 and 841hp v3 (same hardware as the v10) models.

19.07 final is running fine for almost 3 hours now with the old switch driver. No more connection drops.

I'm also getting this problem on the WAN side of an 841N v9 when using the tiny ath79 build.
Going back to ar71xx tiny makes it go away.

But in my case it's on the WAN side, not the switch.

WAN is one port on the switch.

This exact same thing happens for me on a GL.iNet AR150 as well.

What's the suggested way to deal with it? Changing to 19.07 ar71xx?

Which OpenWrt image have you flashed? Please provide the link to the image.

https://downloads.openwrt.org/releases/19.07.0/targets/ath79/generic/openwrt-19.07.0-ath79-generic-glinet_gl-ar150-squashfs-sysupgrade.bin

The issue is the same as OP's. Intermittent WAN connection drops which show

eth1: link up (10Mbps/Half duplex)
eth1: link up (100Mbps/Full duplex)

in dmesg, but nothing in logread. This router has been running for years with 17.01 ar71xx and 18.06 ar71xx, and has never shown this issue before.

It also seems to be limited to this switch, because I have a bunch of TL-WDR4300s as well, with a different switch but the same SoC, and they do not have any issues on ath79.

I'm running 19.07 with the old switch driver without any issues, but you'd need to build the image yourself.

BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07.0, r10860-a3ffeb413b
 -----------------------------------------------------
root@OpenWrt:~# uptime
 11:35:36 up 7 days, 12:25,  load average: 0.65, 0.25, 0.12

I have installed 19.07 ar71xx now. It has been stable for two days. No connection drops, nothing in dmesg showing any link changes.