No internet after several hours

Board: Linksys EA6350v3
OpenWrt: SNAPSHOT r11152-4791afa734

After several hours (or days) of casual usage, the internet connection stops working.
ping 8.8.8.8 (from the board) - 100% packet loss
ifconfig eth1 down && ifconfig eth1 up - no effect.
service network restart - no effect.
Restart wan from LuCI UI - no effect.
Nothing obviously suspicious in top / dmesg / logread.

Reboot always fixes the issue instantly.
I suppose I can create a cron job to reboot automatically, but I would prefer to find the root cause.

Any suggestions please?

Is there a reason or desire to run a snapshot version instead of a main release?

Have you tried running an official release build (18.06.5 or 19.07.0-rc1 or rc2)?

Are you running any special/non-default packages? What changes have you made to the default configuration?

Is there a reason or desire to run a snapshot version instead of a main release?

There was no main release when I decided to install it in October (18.06.5 doesn't support this board and 19.07.0-rc1 was released in November).

Non-default packages: OpenVPN, DNSCrypt, Samba 4, configured using instructions from this site.
The configuration is close to default, except for the 10.0.0.1 address and a WAN MAC override.

Have you tried disabling dnscrypt?

And what about resetting everything to defaults?

Yes, I tried that.
Yesterday I also flashed 19.07.0-rc2 and reset everything.
It was working fine during the day, but this morning I found it in this weird state again.

Forgot to mention - reconnecting the wan cable also helps.

It might be something with the dhcp lease on the wan. What is in the logs?
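To narrow the log down to WAN and DHCP client events, a small filter like the one below may help. This is only a sketch: the function name `wan_events` is made up for illustration, and the pattern assumes the WAN interface is named 'wan' on device 'eth1', as in this thread.

```shell
#!/bin/sh
# Hypothetical helper: keep only netifd/udhcpc/kernel lines that mention
# the WAN side. Interface names 'wan' and 'eth1' are taken from this thread;
# adjust them for other boards.
wan_events() {
    grep -E "netifd: .*(udhcpc|'wan'|'eth1')|kernel: .*eth1"
}
```

On the router it could be used as `logread | wan_events`, so that lease renewals and link changes are not buried under hostapd and avahi messages.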

Literally nothing at that time.

After service network restart:

daemon.notice hostapd: wlan0: ACS-COMPLETED freq=2437 channel=6
daemon.notice hostapd: wlan0: interface state ACS->HT_SCAN
daemon.notice hostapd: 20/40 MHz operation not permitted on channel pri=6 sec=10 based on overlapping BSSes
daemon.err hostapd: Using interface wlan0 with hwaddr 60:38:e0:a3:85:fc and ssid "OpenWRT"
kern.info kernel: [75167.383333] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
kern.info kernel: [75167.385175] br-lan: port 2(wlan0) entered blocking state
kern.info kernel: [75167.389057] br-lan: port 2(wlan0) entered forwarding state
daemon.notice hostapd: wlan0: interface state HT_SCAN->ENABLED
daemon.notice hostapd: wlan0: AP-ENABLED
daemon.notice netifd: Network device 'wlan0' link is up
daemon.info avahi-daemon[1599]: Joining mDNS multicast group on interface wlan0.IPv6 with address fe80::6238:e0ff:fea3:85fc.
daemon.info avahi-daemon[1599]: New relevant interface wlan0.IPv6 for mDNS.
daemon.info avahi-daemon[1599]: Registering new address record for fe80::6238:e0ff:fea3:85fc on wlan0.*.

After reconnecting the wan cable:

daemon.notice netifd: Network device 'eth1' link is down
daemon.notice netifd: Interface 'wan' has link connectivity loss
daemon.notice netifd: Network device 'eth1' link is up
daemon.notice netifd: Interface 'wan' has link connectivity
kern.info kernel: [75333.209694] ess_edma c080000.edma: eth1: GMAC Link is up with phy_speed=100
daemon.notice netifd: wan (19383): udhcpc: sending select for <IP>
daemon.notice netifd: wan (19383): udhcpc: lease of <IP> obtained, lease time 9611
daemon.info avahi-daemon[1599]: Joining mDNS multicast group on interface eth1.IPv4 with address <IP>.
daemon.info avahi-daemon[1599]: New relevant interface eth1.IPv4 for mDNS.
daemon.info avahi-daemon[1599]: Registering new address record for <IP> on eth1.IPv4.
daemon.notice netifd: Interface 'wan' is now up
user.notice firewall: Reloading firewall due to ifup of wan (eth1)
authpriv.info dropbear[14332]: Early exit: Terminated by signal
authpriv.info dropbear[20352]: Not backgrounding
authpriv.info dropbear[20354]: Not backgrounding

Can you time exactly how long the internet connection lasts before it stops working? 9611 seconds would equate to about 160 minutes, or a bit under 3 hours. (If that number is in minutes, it would be a bit over 6.5 days.) A persistent ping with timestamps from one of your devices to, say, 8.8.8.8 would help you figure this out.
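A rough sketch of both ideas, in case it is useful. The function names and the `PING` variable are made up for illustration; the script uses only integer arithmetic and the common `-c` and `-W` ping options, so it should run in BusyBox ash as well as on a desktop shell.

```shell
#!/bin/sh
# PING is an illustrative override so a different ping binary can be used.
PING="${PING:-ping}"

# Convert a lease time in seconds to minutes and to hours + minutes.
lease_summary() {
    secs="$1"
    echo "$((secs / 60)) min ($((secs / 3600)) h $((secs % 3600 / 60)) min)"
}

# Send one echo request and print a timestamped "ok" or "FAIL" line.
probe_once() {
    host="$1"
    if $PING -c 1 -W 2 "$host" >/dev/null 2>&1; then
        status=ok
    else
        status=FAIL
    fi
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$host" "$status"
}
```

`lease_summary 9611` prints `160 min (2 h 40 min)`. For the persistent ping, something like `while true; do probe_once 8.8.8.8; sleep 10; done >> ping.log` gives one timestamped line every ten seconds or so, and the first FAIL line can then be compared against the lease time.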

Unfortunately there's no pattern. It worked fine for about 25 hours, lost the connection, then lost again after only 15 minutes, then after 3 hours, then after 20 or so.
No correlation with the load - sometimes it happens when the board is actively used, sometimes overnight when it's not used at all.

I've set up a cron job that pings 8.8.8.8, logs the result to a file, and reboots on failure, so I should have better statistics in a few days.
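For anyone who wants to do the same, a watchdog along these lines should work. It is a minimal sketch, not the exact script used here: the function name, the variables `TARGET`, `LOG`, `PING` and `ON_FAIL`, and the log path are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Illustrative defaults; every name and path here is an assumption.
TARGET="${TARGET:-8.8.8.8}"
LOG="${LOG:-/root/wan-watchdog.log}"   # /tmp is a RAM disk on OpenWrt and is lost on reboot
PING="${PING:-ping}"
ON_FAIL="${ON_FAIL:-reboot}"

wan_watchdog() {
    # Three probes, two-second timeout each; any reply counts as "up".
    if $PING -c 3 -W 2 "$TARGET" >/dev/null 2>&1; then
        return 0
    fi
    # Record when it failed and how long the board had been up.
    echo "$(date '+%Y-%m-%d %H:%M:%S') no reply from $TARGET, uptime $(cut -d. -f1 /proc/uptime)s" >> "$LOG"
    $ON_FAIL
}
```

To use it, save it as a script with a final line that calls `wan_watchdog`, make it executable, and add an entry such as `*/5 * * * * /root/wan-watchdog.sh` to `/etc/crontabs/root`. The logged uptime values then show how long each connection lasted before it dropped.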

I have a lot of trouble with IPQ40xx devices as well. Do you perhaps use IPv6 too? If so, I can reproduce the internet dropouts quite easily just by playing a YouTube video on a LAN client. I can still reboot OpenWrt via Wi-Fi, but the LAN and WAN interfaces are dead. For details, see "Fritz!Box 4040 drops out internet connection if watching Youtube from LAN" or https://bugs.openwrt.org/index.php?do=details&task_id=2591

I recommend upgrading to the official 19.07 release and testing with it. At least for me it helped a lot, and my network has been stable so far.