Losing PPPoE connection: Onboard switch failing?

My NETGEAR WNDR3700v4 running OpenWrt 19.07.6 seems to be behaving oddly.

With annoying frequency, LAN-side networking disappears on both wired and wireless. For example, I can't ping or ssh to the device over a wired connection.

It seems I can restore it by simply unplugging the WAN-side RJ45 and plugging it back in, which strikes me as weird...

logread tells me that we've lost the WAN-side connection, but I wonder if this is simply because the ports / switch have frozen so no traffic is arriving, rather than the ISP actually not sending traffic (date trimmed from the timestamp field to save space):

13:58:23 daemon.debug pppd[2442]: sent [LCP EchoRep id=0x66 magic=0x5659ba5a]
13:58:33 daemon.debug pppd[2442]: rcvd [LCP EchoReq id=0x67 magic=0xa432d8d]
13:58:33 daemon.debug pppd[2442]: sent [LCP EchoRep id=0x67 magic=0x5659ba5a]
13:58:45 daemon.debug pppd[2442]: rcvd [LCP EchoReq id=0x68 magic=0xa432d8d]
13:58:45 daemon.debug pppd[2442]: sent [LCP EchoRep id=0x68 magic=0x5659ba5a]
13:58:52 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7a magic=0x5659ba5a]
13:58:53 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7b magic=0x5659ba5a]
13:58:54 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7c magic=0x5659ba5a]
13:58:55 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7d magic=0x5659ba5a]
13:58:56 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7e magic=0x5659ba5a]
13:58:57 daemon.info pppd[2442]: No response to 5 echo-requests
13:58:57 daemon.notice pppd[2442]: Serial link appears to be disconnected.
13:58:57 daemon.info pppd[2442]: Connect time 56.8 minutes.
13:58:57 daemon.info pppd[2442]: Sent 111166559 bytes, received 3198688592 bytes.
13:58:57 daemon.notice netifd: Network device 'pppoe-wan' link is down
13:58:57 daemon.debug pppd[2442]: Script /lib/netifd/ppp-down started (pid 2655)
13:58:57 daemon.debug pppd[2442]: sent [LCP TermReq id=0x2 "Peer not responding"]
13:58:57 daemon.notice netifd: Interface 'wan' has lost the connection
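
To see exactly when the peer went quiet, the pppd debug lines can be tabulated rather than read by eye. The following is a minimal Python sketch, not an OpenWrt tool: it assumes each line carries an HH:MM:SS time somewhere before the `pppd[pid]:` tag, as in the excerpt above, and the function name `summarize_lcp` is made up for illustration.

```python
import re
from datetime import datetime

# Assumed line shape (taken from the excerpt above):
#   13:58:52 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7a magic=...]
LCP_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) .*?pppd\[\d+\]: "
    r"(?P<dir>sent|rcvd) \[LCP (?P<kind>\w+) id=(?P<id>0x[0-9a-fA-F]+)"
)


def summarize_lcp(lines):
    """Return (time of last LCP packet received, unanswered echo-requests).

    The second value maps the id of each echo-request we sent to the time
    it was sent, for requests that never got a matching echo-reply.
    """
    last_rcvd = None
    pending = {}
    for line in lines:
        m = LCP_LINE.search(line)
        if not m:
            continue  # not an LCP packet line (e.g. netifd messages)
        t = datetime.strptime(m["time"], "%H:%M:%S").time()
        if m["dir"] == "rcvd":
            last_rcvd = t
            if m["kind"] == "EchoRep":
                pending.pop(m["id"], None)  # our request was answered
        elif m["kind"] == "EchoReq":
            pending[m["id"]] = t  # our request, waiting for a reply
    return last_rcvd, pending


sample = """\
13:58:45 daemon.debug pppd[2442]: rcvd [LCP EchoReq id=0x68 magic=0xa432d8d]
13:58:45 daemon.debug pppd[2442]: sent [LCP EchoRep id=0x68 magic=0x5659ba5a]
13:58:52 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7a magic=0x5659ba5a]
13:58:53 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7b magic=0x5659ba5a]
13:58:54 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7c magic=0x5659ba5a]
13:58:55 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7d magic=0x5659ba5a]
13:58:56 daemon.debug pppd[2442]: sent [LCP EchoReq id=0x7e magic=0x5659ba5a]
13:58:57 daemon.info pppd[2442]: No response to 5 echo-requests
"""

last_rcvd, pending = summarize_lcp(sample.splitlines())
print("last packet from peer:", last_rcvd)
print("unanswered echo-requests:", sorted(pending))
```

On the excerpt above this reports the last inbound packet at 13:58:45 and five unanswered requests (0x7a to 0x7e). That only confirms nothing was received from the peer after that point; it cannot by itself tell a frozen port or switch apart from the ISP going silent.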

dmesg doesn't suggest anything catastrophic (the last two messages are each from a recovery, with no details before them of a possible cause):

[   40.777530] br-lan: port 2(wlan0) entered forwarding state
[   90.027795] pppoe-wan: renamed from ppp0
[ 6042.429564] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
[29936.738456] pppoe-wan: renamed from ppp0
[33647.050681] pppoe-wan: renamed from ppp0

Can anyone suggest which way around things are failing and / or how I can look for things to fix?

TIA
IanC

Given the age of these devices, nothing is really impossible. Power supply units in particular tend to fail first, and a failing one can cause a wide range of issues, so if you have a compatible one, test it. The rest of your device has also been through quite some heat and usage over the years.

Great idea on the power supply / regulation there. It is a 2.5A device, but the power supply in use was only 1.5A. Possibly some brown-out style interruption was happening. 2.5A supply now back in place to see if it helps.

IanC