Intermittent WAN

Hey all,
I've had a few years of frustration with my TP-Link AX23 and finally risked the biscuit and installed OpenWrt. It's been a tad jarring despite pre-install research. So far most things are fine, and I'm still figuring out and setting up some things, but one thing is dogging me...the WAN!

So everything works...until it doesn't. I know, vague, not helpful...but that's the problem. Everything will work for a few hours, then nothing. Internal LAN is all fine but nothing can be sent outbound. There is nothing in dmesg or the system log that shows the link went down, burped or back flipped. I can't find anything as to why this is happening. In fact sometimes it will stop working and the last thing in the log will be from like an hour or more earlier. Log output level is set to Debug as well.

There have been times where this looks like DNS, but then it isn't. A few times it looked like the WAN lost its IP, but that hasn't been found to be the case. Which made me think maybe this is a series of events where the WAN interface is talking with the ISP modem and its DHCP lease renewal goes weird?

Honestly I'm at a loss and I was never huge into networking in the first place. Reading docs has become a blur so I might be missing all sorts of things. So if you have some insight as to what I could watch or check when WAN dies again I'd appreciate it. Obviously if you need to know some settings, ask. Given the preamble of "I dunno", I "dunno" what settings might be useful to anyone who might take a look at this and suggest things.
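
For what it's worth, the three suspects mentioned above (DNS, a lost WAN address, a DHCP renewal going sideways) can usually be told apart with a handful of checks run on the router while the outage is happening. Below is a rough sketch, not a tested tool: the function names are made up, `wan` is assumed to be the WAN device name, and 1.1.1.1 and openwrt.org are arbitrary probe targets.

```shell
#!/bin/sh
# Hypothetical triage helper. Each argument is 1 (check passed) or 0 (failed).
classify_outage() {
    has_ip=$1; gw_ok=$2; inet_ok=$3; dns_ok=$4
    if [ "$has_ip" != 1 ]; then
        echo "no WAN address: look at the DHCP lease"
    elif [ "$gw_ok" != 1 ]; then
        echo "gateway unreachable: link or modem side"
    elif [ "$inet_ok" != 1 ]; then
        echo "gateway ok, internet unreachable: routing or upstream"
    elif [ "$dns_ok" != 1 ]; then
        echo "IP works, names do not: DNS"
    else
        echo "all checks passed from the router itself"
    fi
}

# Run a command quietly and print 1 on success, 0 on failure.
ok() {
    if "$@" >/dev/null 2>&1; then echo 1; else echo 0; fi
}

# Gather the four checks on the router and classify them.
wan_triage() {
    wan_dev=${1:-wan}    # assumption: the WAN device is called "wan"
    # Gateway of the default route that leaves through the WAN device, if any.
    gw=$(ip -4 route show | awk -v d="$wan_dev" \
        '$1 == "default" && $2 == "via" && index($0 " ", " dev " d " ") { print $3; exit }')
    has_ip=0
    ip -4 addr show dev "$wan_dev" | grep -q 'inet ' && has_ip=1
    gw_ok=0
    [ -n "$gw" ] && gw_ok=$(ok ping -c 2 -W 2 "$gw")
    classify_outage "$has_ip" "$gw_ok" \
        "$(ok ping -c 2 -W 2 1.1.1.1)" \
        "$(ok nslookup openwrt.org)"
}
```

Calling `wan_triage` over ssh during an outage prints one line. Since it runs on the router, it only tests the router's own path; if it reports that everything passes while LAN clients are still cut off, forwarding or the gateway handed to the clients becomes the more likely suspect. Some ISP gateways also ignore ping, so a failed gateway check on its own is not conclusive.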

Post dmesg output right after an outage, not the whole thing, just the relevant part.
Use the </> button when you do.

I'll try. Sadly with dmesg not having the human readable flag it's hard to tell how far back to go, especially if it goes down but I might not notice for a few minutes. Frankly I'm hoping just because I posted about this it will magically work forever and ever. We will see if it makes it through the day and I'll post back with dmesg. So far it seems to happen about 3 times a day so it might be a few more hours before it presents.

Non-BusyBox dmesg has -H for human-readable timestamps; not sure it exists in OpenWrt too.

Doesn't on mine. Pretty cut down.


Usage: dmesg [-cr] [-n LEVEL] [-s SIZE]

Print or control the kernel ring buffer

	-c		Clear ring buffer after printing
	-n LEVEL	Set console logging level
	-s SIZE		Buffer size
	-r		Print raw message buffer
# apk info dmesg
dmesg-2.41.3-r1 description:
dmesg is used to examine or control the kernel ring buffer

dmesg-2.41.3-r1 webpage:
https://www.kernel.org/pub/linux/utils/util-linux/

dmesg-2.41.3-r1 installed size:
66 KiB

(didn't install/test that myself, so I don't know if -H, --human is enabled there.)
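
Without -H, the bracketed number at the start of each dmesg line is seconds since boot, so one workaround is to subtract it from the current uptime and read each message as "this many seconds ago". A minimal sketch, assuming the usual `[  123.456789] message` layout; `dmesg_age` is a made-up name, not an existing tool.

```shell
#!/bin/sh
# dmesg_age UPTIME_SECONDS
# Reads dmesg output on stdin and replaces the "[seconds-since-boot]" prefix
# with "[Ns ago]". Lines without a timestamp are passed through unchanged.
dmesg_age() {
    awk -v up="$1" '
    match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
        ts = substr($0, 2, RLENGTH - 2) + 0          # number inside the brackets
        printf "[%ds ago] %s\n", up - ts, substr($0, RLENGTH + 2)
        next
    }
    { print }'
}

# Usage on the router (first field of /proc/uptime is seconds since boot):
#   dmesg | dmesg_age "$(cut -d" " -f1 /proc/uptime)" | tail -n 40
```

That makes it easier to judge how far back to read when the drop was only noticed a few minutes after the fact.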

Not to flood with casual conversation but bed is calling. So far it's holding up, appendages crossed. I hadn't noted it in the OP but I had tried to restore the original firmware because it wasn't looking like I could figure the WAN issue out. However that went about as well as the default TP-Link firmware worked (it didn't : slaps knee) so I reflashed OpenWrt and restored from backup. I'm starting to wonder if reflashing didn't inadvertently fix something.

However I've written a little script that polls every 60 seconds for outbound traffic. If the WAN dies while I'm sleeping the script should dump dmesg within 60 seconds of the failure...hopefully I wake up to nothing, otherwise I'll just wake up to livid roommates...and the dmesg dump.
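
The script itself was not posted, so the following is only a guess at what such a poller could look like, not the actual one. The probe address, file names and the `run` argument are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical WAN watchdog sketch; names and defaults are illustrative.
TARGET="${TARGET:-1.1.1.1}"              # assumed probe address
OUT_DIR="${OUT_DIR:-/tmp}"               # /tmp is RAM on OpenWrt, lost on reboot
PING_CMD="${PING_CMD:-ping -c 3 -W 2}"
last_state=up

# Succeeds when the probe target answers.
wan_ok() {
    $PING_CMD "$TARGET" >/dev/null 2>&1
}

# Save kernel and system logs under a timestamped name.
dump_logs() {
    stamp=$(date +%Y%m%d-%H%M%S)
    dmesg   > "$OUT_DIR/wan-down-$stamp.dmesg"   2>/dev/null
    logread > "$OUT_DIR/wan-down-$stamp.logread" 2>/dev/null
    echo "$stamp WAN probe failed, logs saved" >> "$OUT_DIR/wan-watch.log"
}

# One poll: dump only on the up -> down transition so /tmp does not fill up.
check_once() {
    if wan_ok; then
        last_state=up
        return 0
    fi
    [ "$last_state" = up ] && dump_logs
    last_state=down
    return 1
}

# Loop only when started as: sh watchdog.sh run
if [ "${1:-}" = run ]; then
    while :; do
        check_once
        sleep 60
    done
fi
```

Capturing `logread` next to `dmesg` is deliberate: DHCP client and netifd messages land in the system log rather than the kernel ring buffer, so a lease problem would not necessarily show up in dmesg at all.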

  1. If you could show us the configuration, it would be easier to understand the situation.

  2. If the output from the "dmesg" and "logread" commands is very long for you, you can always try deleting old messages and starting from a clean state before a connectivity loss event (it's crude and ugly, but it works).
#!/bin/sh
dmesg -c; /etc/init.d/log restart

  3. And do you detect any "core" files in "/tmp"?

Config would be insanely huge, would it not? Also config of what? Like LuCI? Or all the various networking files?

The size of dmesg isn't really an issue, if I capture an event I'll just share what looks relevant rather than the whole thing since boot or log rotation etc.

If there were any core dumps in /tmp they are gone now. I'd have also thought that would have shown in dmesg. The crux of all this for me is that any time this happened, as I said, there was either no log entry for hours before (and nothing when it happened), or dmesg would just show repeats of the same stuff as when everything was operating fine.

Yesterday it worked for almost 8ish hours and I actually thought maybe something self corrected...then it died again. That was when I decided I'd try to go back to the TP-Link firmware. It sucks but it never crapped out like what I've been experiencing. However when that failed and I re-flashed OpenWrt it's been good since. Comically, my account got approved after all of this.

As is the trend, it seems the second you reach out for help it works or you figure it out. The classic: take the car to the mechanic, say it makes a ping, and it no longer pings...or in the case of a router, it doesn't ping...but pings once it's at the mechanic's. :wink: I'm hoping that's the case here rather than waking to an outage. If I do wake to anything I'll check /tmp for core dumps.

Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </>" button (this works best in the 'Markdown' composer view):

Remember to redact passwords, VPN keys, MAC addresses and any public IP addresses you may have:

ubus call system board
cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall
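
Redacting by hand is easy to get wrong in either direction, so a small filter can take care of the obvious secrets before pasting. This is only a sketch: `redact` is a made-up helper, and the list of option names it masks is an assumption that may need extending for other packages. Public IP addresses are deliberately not handled and still need a manual pass; private addresses and netmasks are left alone.

```shell
#!/bin/sh
# Hypothetical redaction filter: reads config text on stdin, writes a masked copy.
# Assumption: secrets live in the option names listed in the first expression.
redact() {
    sed -E \
        -e "s/(option (key|password|private_key|preshared_key|macaddr) +)'[^']*'/\1'REDACTED'/" \
        -e 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/XX:XX:XX:XX:XX:XX/g'
}

# Usage on the router:
#   for f in network wireless dhcp firewall; do
#       echo "== /etc/config/$f"; redact < "/etc/config/$f"
#   done
```

The first expression masks the quoted value of the listed options; the second catches any MAC address that appears elsewhere in the text.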

Here is the stuff you asked for on top of "So far so good!" So outside silly beliefs like saying this "jinx's" it I think I might be in the clear. Woke to no drops and no angry house mates... So either something was awry with the initial flash or the reflash fixed something. I know packages are wiped on flash/upgrade but I should have only had the same packages I do now. However after banging my head on this issue for a week or so my memory could be off.

ubus call system board

{
	"kernel": "5.15.137",
	"hostname": "OpenWrt",
	"system": "MediaTek MT7621 ver:1 eco:4",
	"model": "TP-Link Archer AX23 v1",
	"board_name": "tplink,archer-ax23-v1",
	"rootfs_type": "squashfs",
	"release": {
		"distribution": "OpenWrt",
		"version": "23.05.2",
		"revision": "r23630-842932a63d",
		"target": "ramips/mt7621",
		"description": "OpenWrt 23.05.2 r23630-842932a63d"
	}
}

cat /etc/config/network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr 'numbers'
	option netmask 'numbers'

config globals 'globals'
	option ula_prefix 'numbers:letters'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'
	option ipv6 '0'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr 'numbers'
	option netmask 'numbers'
	option delegate '0'
	list dns 'numbers'
	option gateway 'numbers'

config interface 'wan'
	option device 'wan'
	option proto 'dhcp'
	option peerdns '0'
	option delegate '0'

config device
	option name 'wan'
	option macaddr 'numbers:letters'

	
cat /etc/config/wireless

config wifi-device 'radio0'
	option type 'mac80211'
	option path '1e140000.pcie/pci0000:00/0000:00:01.0/0000:02:00.0'
	option band '2g'
	option channel '1'
	option htmode 'HE20'
	option country 'CA'
	option cell_density '0'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'numbers:letters'
	option encryption 'psk2'
	option key 'numbers:letters:symbols:ohmy'

config wifi-device 'radio1'
	option type 'mac80211'
	option path '1e140000.pcie/pci0000:00/0000:00:01.0/0000:02:00.0+1'
	option band '5g'
	option channel '36'
	option htmode 'HE80'
	option cell_density '0'
	option country 'CA'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option ssid 'numbers:letters'
	option encryption 'psk2'
	option key 'numbers:letters:symbols:ohmy'

cat /etc/config/dhcp

config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option cachesize '1000'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.d/resolv.conf.auto'
	option localservice '1'
	option ednspacket_max '1232'
	list addnhosts '/etc/dnsmasq.hosts'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv4 'server'
	option dhcpv6 'server'
	option ra 'server'
	list ra_flags 'managed-config'
	list ra_flags 'other-config'
	option ignore '1'
	option dynamicdhcp '0'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'
	option start '100'
	option limit '150'
	option leasetime '12h'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'
	option piofolder '/tmp/odhcpd-piofolder'

cat /etc/config/firewall

config defaults
	option input 'REJECT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option flow_offloading '1'
	option flow_offloading_hw '1'
	option synflood_protect '1'
	option drop_invalid '1'

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	list network 'lan'

config zone
	option name 'wan'
	option input 'REJECT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option masq '1'
	option mtu_fix '1'
	list network 'wan'

config forwarding
	option src 'lan'
	option dest 'wan'

config rule
	option name 'Allow-DHCP-Renew'
	option src 'wan'
	option proto 'udp'
	option dest_port '68'
	option target 'ACCEPT'
	option family 'ipv4'

config rule
	option name 'Allow-Ping'
	option src 'wan'
	option proto 'icmp'
	option icmp_type 'echo-request'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-IGMP'
	option src 'wan'
	option proto 'igmp'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-DHCPv6'
	option src 'wan'
	option proto 'udp'
	option dest_port '546'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-MLD'
	option src 'wan'
	option proto 'icmp'
	option src_ip 'fe80::/10'
	list icmp_type '130/0'
	list icmp_type '131/0'
	list icmp_type '132/0'
	list icmp_type '143/0'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Input'
	option src 'wan'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	list icmp_type 'router-solicitation'
	list icmp_type 'neighbour-solicitation'
	list icmp_type 'router-advertisement'
	list icmp_type 'neighbour-advertisement'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Forward'
	option src 'wan'
	option dest '*'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-IPSec-ESP'
	option src 'wan'
	option dest 'lan'
	option proto 'esp'
	option target 'ACCEPT'

config rule
	option name 'Allow-ISAKMP'
	option src 'wan'
	option dest 'lan'
	option dest_port '500'
	option proto 'udp'
	option target 'ACCEPT'

config redirect
	option dest 'lan'
	option target 'DNAT'
	option name 'http'
	option src 'wan'
	option dest_ip 'numbers'
	option dest_port '80'
	option src_dport '80'

config redirect
	option dest 'lan'
	option target 'DNAT'
	option name 'https'
	option src 'wan'
	option src_dport '443'
	option dest_ip 'numbers'
	option dest_port '443'

config redirect
	option dest 'lan'
	option target 'DNAT'
	option name 'smtp'
	option src 'wan'
	option src_dport '25'
	option dest_ip 'numbers'
	option dest_port '25'

config redirect
	option dest 'lan'
	option target 'DNAT'
	option name 'XMPP'
	option src 'wan'
	option src_dport '5222'
	option dest_ip 'numbers'
	option dest_port '5222'

config rule
	option name 'Allow-OpenVPN'
	option src 'wan'
	option dest_port '1194'
	option proto 'udp'
	option target 'ACCEPT'	

You should upgrade to 24.10

You have over-redacted here. But one thing is clear: the gateway line should not be there. Delete it.
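
For context on the gateway line: as far as I understand it, a `gateway` option on a static interface makes the router install a default route through that interface, so it can end up with a second default route pointing back into the LAN alongside the one learned via DHCP on the WAN. Here is a small sketch for checking that and for removing the option from the shell. The function names are hypothetical, and `network.lan.gateway` assumes the interface section is named `lan` as in the config posted above.

```shell
#!/bin/sh
# Print the device of every IPv4 default route found on stdin
# (expects lines in the style of `ip -4 route show`).
default_route_devs() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) if ($i == "dev") print $(i + 1)
    }'
}

# Remove the gateway option from the lan interface and reload networking.
# Router only; not called automatically.
remove_lan_gateway() {
    uci -q delete network.lan.gateway
    uci commit network
    /etc/init.d/network restart
}

# Usage on the router:
#   ip -4 route show | default_route_devs    # ideally prints only the WAN device
#   remove_lan_gateway
```

If the first command lists `br-lan` as well as `wan`, the LAN gateway is competing with the WAN route.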

Is there a reason you have disabled the lan dhcp server?

I just grabbed the image that was for my hardware, which when I initially looked was 23.05.2. I'll take a peek at that (24.10.5) in a bit but at the same time it seems like I just got stability and I'm a tad worried to look at it wrong much less redo it again.

But redaction is fun! OK so gateway on LAN is bad (removed for testing - it's been working for hours though). I'd wager some things are from just trying anything to see what was wrong. It also didn't sit right with me that on reboot it worked...for a while. It really seemed like a renew or cron/update was periodically screwing something up but with nothing in the logs I couldn't figure out what.

Yes, this install of OpenWrt is new...and new new, as I had just reflashed by the time my account here was approved. The cruft is from a previous backup from the week's initial attempts. However, the answer to your question is also why I installed OpenWrt...the TP-Link firmware is garbage.

I had fought with the TP-Link firmware for well over a year as all advertised features were implemented in a way that made them a nightmare to use if not flatly useless. Over time I just delegated all the functions (other than physical interfaces and wireless) to other machines on the network. So DHCP is handled by my DNS server. If I can continue to keep OpenWrt good I may migrate things back.

You should upgrade asap because you’re on an old and unsupported version.

I’m not sure where you saw the references to use 23.05, but it is always best to use the firmware selector to download the latest version.

RFC 1918 addresses and subnet masks don’t need to be redacted. They don’t reveal any sensitive information.

Is this a pihole? Or something else? Are you positive it is working properly and that it is issuing the correct gateway address? Does this dhcp/dns server continue to be reachable when your wan appears to be down?

Yeah using pihole these days, it used to be BSD and hand written iptable rules. It's working properly and yes everything (internal) is reachable when WAN would die. I had also wondered if maybe my ISP was having issues that just "luckily" happened to coincide with me trying "wrt" but any time I'd reboot the router it would magically work again. I never had these issues prior to the firmware change.

Still scary (I thought I may have been in brick land a few times this week so my nerves of steel are more like tin atm) but I'll move to 24.10.5 once everyone goes to bed and I won't be B____ed at for "no network."

I'm pretty sure I used the same page you linked to me. How or why it defaulted to a 23.05 version if 24 is current I don't know or remember. Either way thank you to everyone that's chimed in and to you for pointing out I've flashed an old version. Scary or not one thing I hadn't got to yet was reading up on how to safely / correctly update/upgrade for security patches. I know upgrading base packages can be risky but that's as far as I had got before I started having WAN dropouts.

Did you, in fact, upgrade packages at any point? This is not recommended and can/will cause problems.

Upgrading the system via sysupgrade is the correct approach.
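
A cautious way to do that from the shell is to verify the downloaded image and take a config backup before flashing. The sketch below assumes the image has already been copied to the router and that its sha256sum was taken from the download page; `verify_image` and `pre_upgrade_check` are made-up helper names, and the backup path is just an example.

```shell
#!/bin/sh
# verify_image FILE EXPECTED_SHA256: succeeds only if the checksum matches.
verify_image() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# pre_upgrade_check IMAGE EXPECTED_SHA256 (router only; does not flash anything)
pre_upgrade_check() {
    image=$1; expected=$2
    verify_image "$image" "$expected" || {
        echo "checksum mismatch: do not flash" >&2
        return 1
    }
    # Save the current configuration. /tmp is RAM, so copy this file off the
    # router before flashing.
    sysupgrade -b "/tmp/backup-$(date +%Y%m%d).tar.gz" || return 1
    # Ask sysupgrade to validate the image for this board without writing it.
    sysupgrade -T "$image" || return 1
    echo "image looks valid; copy the backup off the router, then run: sysupgrade -v $image"
}
```

The actual flash is left as a manual last step on purpose, so nothing gets written until the checksum, the backup and the image test have all passed.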

The only thing that should have been changed outside settings was I installed packages for openvpn, easy-rsa and something else that was a dependency of those iirc. I watched/read some things about how using the luci > upgrade > software > updates was a bad idea. Though that prompted questions about OK so why is it there that I didn't find answers to.

I thought something was still off. Upon reflashing I could have sworn I re-added the openvpn packages but there is nothing for it in the LuCI web UI. I've been so focused on the WAN stability I didn't notice the VPN menu option was still missing. I just reinstalled openvpn and easy-rsa as a test because I'll be upgrading shortly anyhow but the VPN menu still isn't there...looking through packages as I write this, the thing I forgot was the luci-app-openvpn package...I thought that was the one that adds the web UI for it but after installing it's still not there. I suppose I'll not delve into this too deep until I've moved to the 24 build.

As a running log things are still stable. Hopefully in a few hours I'll upgrade and continue to be stable. I'll close this if I can survive the night heh. Certainly looking like either a fluke issue or I bodged something on the initial flash.

This may be on topic (if the older firmware was a baseline for something being a bit off) or off topic but I just noticed something. At some point the firmware selection page had a url that had the openwrt version. This made me think perhaps I was given the old version because when I searched for the firmware selection page I had used a search engine link that already had something like https://firmware-selector.openwrt.org/?version=23.05.2 thus it defaulted to an older version. However if I try to use that url or even select the older version from the drop down I get "Failed to fetch." So this really is confusing how the site this week even allowed me to download that old version.

Being stubborn and trying to figure things out, I also seem to have figured out the VPN issue. However the firmware site, the missing VPN menu and a few other things feel like a hybrid gaslighting, idiosyncrasy and sharp edges situation. The VPN menu stayed hidden despite refreshing the page and restarting uhttpd. Only after clicking the OpenWrt logo did it finally show up. So lots of little "you have to know to know" idiosyncrasies to learn here. I also noted this same behavior, where the menu item doesn't show until the logo is clicked, in the 24 build too.

Flash to 24.10.5 is done, I'll see how things go. Still good, I dare say things are working...still tons to figure out / learn though.