Archer C7 V2 - Traffic not working between some LAN hosts

Hi,

I've had an unresolved issue for a few years now, and I've been noticing it more often lately.

There's a strange problem where suddenly one or more LAN hosts connected over WiFi cannot communicate with certain other LAN hosts (in this case one host that is also connected over WiFi).

Basically, a ping from one computer in my LAN fails, but the same ping from another LAN host (or from the router itself) works.

I have tried many things over the years: different drivers, tweaking transmit power, etc.

I have no firewall in place in the LAN or on the LAN hosts. I have also tried to manually add the destination host - which I'm trying to ping - to the ARP cache, but that doesn't help. My guess is that OpenWrt (or the hardware) somehow drops the packets.

I'm currently running 21.02 but earlier I used 19.07.7.

Does anyone have a solution to my problem?

I'm attaching a screenshot so you can see the issue when troubleshooting from the terminal.

That sounds terribly similar to an issue I posted on Reddit.

Does the problem still happen on 21.02?

Try changing the MAC address of the client to see if it goes away.

If the computers are all on the same subnet, the routing functions are not involved. At that point, it is layer 2 (switching), and could only be impacted by a bug in the wireless stack of OpenWrt. However, I find this unlikely given that you said that another machine can ping without issue. This suggests a problem with your hosts.
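To illustrate the same-subnet point, here is a rough sketch in POSIX shell. The helper names `ip_to_int` and `same_subnet` are made up for this example and the addresses are placeholders. Two hosts are on the same subnet when their addresses are equal after applying the netmask, and in that case traffic between them is bridged at layer 2 rather than routed.

```sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if both addresses fall in the same subnet.
# Arguments: address A, address B, netmask (all dotted-quad).
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Example with placeholder addresses in a /24:
if same_subnet 192.168.1.10 192.168.1.20 255.255.255.0; then
    echo "same subnet: traffic is switched, not routed"
fi
```

If the check fails for two of your hosts, the router's routing and firewall functions would be in the path after all.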

Beyond that, you haven’t provided enough info to troubleshoot anything. Please provide details about the tests you have run (which IPs > IPs work, what doesn’t work, etc.), the OS and wired/WiFi status of each device, and your network configuration details. We need to know what routers/APs you are using and how they are set up (topology and configuration).

Please copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:

cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall

Well, I upgraded to 21.02 earlier today and the problem still occurs. I had some hope that somebody had perhaps fixed it, but no.

Which host should I change the MAC address for: the source host which cannot reach the destination host, the destination host, or both?

If you looked at the picture I sent, I clearly showed a test where I tried to ping the same host from two hosts: one computer and my Archer C7 V2 running OpenWrt. And most likely it's not an ARP issue, as I tried to manually add the ARP entry on the host which couldn't ping the destination host.

What's interesting is that this ain't happening all the time and right now the destination host is reachable from both Archer C7 as well as the host which couldn't ping the destination host before.

This ain't a new error for me as I have had the issue for several years. Sometimes I cannot ping a particular host from my laptop and sometimes one of my Raspberry Pis running HomeAssistant cannot reach the API of my home alarm running in the LAN.

It's happening from time to time with different hosts, with different hardware, different architecture and different Linux distributions.

The host which cannot reach, can it reach any other host?

Well, try changing one at a time and test.
It's just a hunch based on the workaround of my issue.

You have hit this problem

@tomplast - it is possible that @sammo has provided a link that may describe the issue. But if not, you need to give us more specific info, like I asked for earlier. Otherwise we cannot help.

First and foremost, the link that @sammo supplied did not solve anything. I tried the solution provided and it seemed to work for a few hours - like most solutions I have tried - but in the end the problem came back. The last few days I have had days when everything is working and then suddenly it stops working.

Today there were two hosts I couldn't reach. I did a simple telnet 192.168.1.12 8123 (HomeAssistant port) from my laptop (and from my browser on my phone) and I couldn't reach the host. But if I did the same test from the router everything worked just fine. The interesting part is that after doing a couple of telnet attempts from my laptop I suddenly got through and then it was stable, like I pierced through some kind of wall of some sort. While doing the test I did a tcpdump from the Archer C7 router and while I couldn't reach the host there was a total silence in the output - like it never reached the router or passed the Data Link-layer perhaps?

I have another Archer C7 V2 which has exactly the same behaviour. Also, if I recall correctly, I may have had a similar issue with the original stock firmware.

Anyhow, I'm pasting a previous test I did a few days ago when it was not working. I hope it's the details you requested.

I'm trying to reach a host with the IP address 192.168.1.227.

192.168.1.227 is running Tasmota on a Sonoff ZB Bridge. It is connected through the 2.4 GHz wireless network.

My Archer C7 V2 is responsible for both wired and wireless networks. DHCP is enabled and I'm using a 192.168.1.0/24 network. Have no firewall enabled on the LAN side.

TEST 1 - Laptop connected through WIFI

I'm using a laptop connected through WiFi, on the same wireless network that 192.168.1.227 is connected to. The OS I'm using is Manjaro and the kernel is 5.4.

IP: 192.168.1.191

PING TEST <<
[tomas@academy ~]$ ping -c 3 192.168.1.227
PING 192.168.1.227 (192.168.1.227) 56(84) bytes of data.
From 192.168.1.191 icmp_seq=1 Destination Host Unreachable
From 192.168.1.191 icmp_seq=2 Destination Host Unreachable
From 192.168.1.191 icmp_seq=3 Destination Host Unreachable

--- 192.168.1.227 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2038ms
pipe 3

This fails.

TRACEROUTE TEST <<
[tomas@academy ~]$ traceroute 192.168.1.227
traceroute to 192.168.1.227 (192.168.1.227), 30 hops max, 60 byte packets
1 academy.lan (192.168.1.191) 3043.221 ms !H 3043.114 ms !H 3043.081 ms !H
[tomas@academy ~]$

Seems reasonable, besides the high response time.

TEST 2 - Stationary computer connected through wired network

I'm using a stationary computer connected through wired network, a cable directly to Archer C7. The OS I'm using is Manjaro and the kernel is 5.10.

IP: 192.168.1.174

PING TEST <<
[tomas@wombat]$ ping -c 3 192.168.1.227
PING 192.168.1.227 (192.168.1.227) 56(84) bytes of data.
64 bytes from 192.168.1.227: icmp_seq=1 ttl=255 time=5.38 ms
64 bytes from 192.168.1.227: icmp_seq=2 ttl=255 time=21.7 ms
64 bytes from 192.168.1.227: icmp_seq=3 ttl=255 time=46.0 ms

--- 192.168.1.227 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 5.375/24.358/45.959/16.671 ms

GOOD, but not awesome response time.

TRACEROUTE TEST <<
[tomas@wombat]$ traceroute 192.168.1.227
traceroute to 192.168.1.227 (192.168.1.227), 30 hops max, 60 byte packets
1 zbbridge1.l

GOOD!

TEST 3 - From Archer C7
IP: 192.168.1.1
OpenWrt 21.02

PING TEST <<
root@OpenWrt:~# ping -c 3 192.168.1.227
PING 192.168.1.227 (192.168.1.227): 56 data bytes
64 bytes from 192.168.1.227: seq=0 ttl=255 time=33.361 ms
64 bytes from 192.168.1.227: seq=1 ttl=255 time=64.697 ms
64 bytes from 192.168.1.227: seq=2 ttl=255 time=81.042 ms

--- 192.168.1.227 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 33.361/59.700/81.042 ms

GOOD, but not awesome response time.

TRACEROUTE TEST <<
root@OpenWrt:~# traceroute 192.168.1.227
traceroute to 192.168.1.227 (192.168.1.227), 30 hops max, 38 byte packets
1 zbbridge1.lan (192.168.1.227) 45.810 ms 6.215 ms 6.221 ms

Finally, here are the files you asked me to concat.


/etc/config/network:

config interface 'loopback'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'
	option device 'lo'

config globals 'globals'
	option ula_prefix 'fd44:333c:f2c3::/48'

config interface 'lan'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option device 'br-lan'

config interface 'wan'
	option proto 'dhcp'
	option device 'eth0.2'

config interface 'wan6'
	option proto 'dhcpv6'
	option device 'eth0.2'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '2 3 4 5 0t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 6t'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'eth1.1'



/etc/config/wireless:


config wifi-device 'radio0'
	option type 'mac80211'
	option channel '36'
	option hwmode '11a'
	option path 'pci0000:00/0000:00:00.0'
	option htmode 'VHT80'
	option legacy_rates '0'
	option country 'SE'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'yyy-5GHz'
	option key 'xxx'
	option encryption 'psk2+ccmp'
	option disassoc_low_ack '0'

config wifi-device 'radio1'
	option type 'mac80211'
	option channel '11'
	option hwmode '11g'
	option path 'platform/ahb/18100000.wmac'
	option htmode 'HT20'
	option legacy_rates '0'
	option country 'SE'
	option txpower '19'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option key 'xxx'
	option ssid 'yyy'
	option encryption 'psk2'




/etc/config/dhcp:
config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option localservice '1'
	option resolvfile '/tmp/resolv.conf.d/resolv.conf.auto'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv6 'server'
	option ra 'server'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'




/etc/config/firewall:
config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'

config zone
	option name 'lan'
	list network 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'

config zone
	option name 'wan'
	list network 'wan'
	list network 'wan6'
	option input 'REJECT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option masq '1'
	option mtu_fix '1'

config forwarding
	option src 'lan'
	option dest 'wan'

config rule
	option name 'Allow-DHCP-Renew'
	option src 'wan'
	option proto 'udp'
	option dest_port '68'
	option target 'ACCEPT'
	option family 'ipv4'

config rule
	option name 'Allow-Ping'
	option src 'wan'
	option proto 'icmp'
	option icmp_type 'echo-request'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-IGMP'
	option src 'wan'
	option proto 'igmp'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-DHCPv6'
	option src 'wan'
	option proto 'udp'
	option src_ip 'fc00::/6'
	option dest_ip 'fc00::/6'
	option dest_port '546'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-MLD'
	option src 'wan'
	option proto 'icmp'
	option src_ip 'fe80::/10'
	list icmp_type '130/0'
	list icmp_type '131/0'
	list icmp_type '132/0'
	list icmp_type '143/0'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Input'
	option src 'wan'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	list icmp_type 'router-solicitation'
	list icmp_type 'neighbour-solicitation'
	list icmp_type 'router-advertisement'
	list icmp_type 'neighbour-advertisement'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Forward'
	option src 'wan'
	option dest '*'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-IPSec-ESP'
	option src 'wan'
	option dest 'lan'
	option proto 'esp'
	option target 'ACCEPT'

config rule
	option name 'Allow-ISAKMP'
	option src 'wan'
	option dest 'lan'
	option dest_port '500'
	option proto 'udp'
	option target 'ACCEPT'

config include
	option path '/etc/firewall.user'

config redirect
	option src 'wan'
	option name 'WireGuard'
	option src_dport '51820'
	option target 'DNAT'
	option dest_ip '192.168.1.12'
	option dest 'lan'
	list proto 'udp'
	option dest_port '51820'

config rule
	list proto 'all'
	option name 'Disable media box'
	list src_ip '192.168.1.203'
	option dest 'wan'
	option target 'DROP'
	option src 'lan'
	option enabled '0'

config redirect
	option dest_port '51820'
	option src 'wan'
	option name 'Wireguard'
	option src_dport '51820'
	option target 'DNAT'
	option dest_ip '192.168.1.12'
	option dest 'lan'
	list proto 'udp'

There's nothing in your files that would be unusual or likely the cause of your issues. So, it is either an issue with your wireless (i.e. a driver or hardware issue), or a problem that is related to the specific hosts in question.

To that end, the best way to figure out what might be happening is to run some specific tests when the problem is manifesting. Forgive me if you have actually already done these, but it's hard to follow the relatively long post (in which case summarizing could be useful)...

These are all ping tests. Useful responses are simply normal / high and/or variable latency / fail

  1. Wired host to another wired host
  2. Wired host to the router
  3. Wired host to a wireless host
  4. Wireless host to another wireless host
  5. Wireless host to the router
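If it helps, here is a minimal sketch for turning ping output into one of those three answers. `classify_ping` is a hypothetical helper, not something that ships with OpenWrt, and the 20 ms default threshold is an arbitrary assumption you should adjust. It only looks at the summary line that both iputils and BusyBox ping print, so it reports "fail" when no reply came back at all, and it does not detect variable latency or partial packet loss.

```sh
# Read ping output on stdin and print "normal", "high" or "fail".
# $1: average-latency threshold in ms (assumed default: 20).
classify_ping() {
    awk -v limit="${1:-20}" '
        # Summary line, e.g. "rtt min/avg/max/mdev = 5.3/24.3/45.9/16.6 ms"
        # (iputils) or "round-trip min/avg/max = 33.3/59.7/81.0 ms" (BusyBox).
        /min\/avg\/max/ {
            split($0, halves, "= ")
            split(halves[2], rtt, "/")
            avg = rtt[2]
            found = 1
        }
        END {
            if (!found)               print "fail"    # no replies, no summary
            else if (avg + 0 > limit) print "high"
            else                      print "normal"
        }'
}

# Usage on a host (the address is just an example):
#   ping -c 3 192.168.1.227 | classify_ping
```

Running that once per row of the list, while the problem is happening, gives a compact result table to post.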

I gave you the configuration files you asked for. Also, in my post you could see that wired hosts did not have any issue reaching the target destinations; only the wireless hosts had this behaviour.

So basically:
WIRED > WIRELESS = WORKS
WIRED > WIRED = WORKS
WIRELESS > WIRELESS = PROBLEMATIC
WIRELESS > WIRED = PROBLEMATIC

And also, I wrote that I did a telnet test which didn't work initially but after a few attempts it came through.

Try changing your 5G bandwidth to 40MHz instead of 80. See if that improves the wifi tests that are problematic.

Nope, it didn't help. I'm pretty sure I have tried this a few times over the years as well. I have read about so many people who had issues with the Archer C7 V2, and many of them try this or that, and for a while it looks like whatever they tried worked, but it's just pure chance.

I appreciate the effort but I'm tired of trying the same things over and over. I want to identify the problem behind my issues, not just testing a lot of possibilities.

Do you have any ideas how to get some real verbose information from OpenWrt and find out exactly what's happening?

If I run a tcpdump on OpenWrt I can clearly see that my cellphone (which cannot access one particular host right now) makes an ARP request to the router about the host but OpenWrt never responds!

I have a C7 V2 and have never had these issues.

Run arp -a and post the output.

I tried to ping 192.168.1.12 from my laptop (wireless connection) and I couldn't reach the host on the first attempt (for about 10 seconds). In the meantime I checked the ARP table on my laptop and it didn't have an entry for 192.168.1.12. I also checked the ARP table in OpenWrt (from which I could ping 192.168.1.12, albeit with extremely high latency this time - over 500 ms) and the ARP entry was correct there.

After a minute or so I tried with my laptop once again and then suddenly I got an ARP reply from OpenWrt and could finally ping 192.168.1.12.

I think I'm spending a few hours every month restarting or screaming at this router. It's driving me nuts that nothing seems to help and it's driving me even more nuts that I can find so many people having issues with this router even though so many people think this is one of the best routers ever.

Is power management enabled for the NIC on 192.168.1.12?

Very unlikely, it's a Raspberry Pi running the Hassio distribution.

If I run a tcpdump on OpenWrt I can clearly see that my cellphone (which cannot access one particular host right now) makes an ARP request to the router about the host but OpenWrt never responds!

Just a note that an ARP request is simply broadcast to the entire subnet, and it is the responsibility of the host with the requested IP to reply. OpenWrt does not own that IP, therefore it is not expected to reply. The blame could lie with the target host itself (maybe it is sleeping, etc.), with a poor WiFi connection to the target host (packet drops), or with OpenWrt failing to relay the ARP packets. It would be nice if you could run tcpdump on the target machine.
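As a rough sketch of how to compare captures from the router and the target, the helper below counts ARP requests and replies per requested IP in tcpdump's text output. `arp_summary` is a hypothetical name, and it assumes the usual `ARP, Request who-has ... tell ...` / `ARP, Reply ... is-at ...` line format that tcpdump prints with `-n`. The interface `br-lan` is taken from your posted config, and the file path is just an example.

```sh
# Read tcpdump text output on stdin and print, for every IP that was
# asked for, how many ARP requests and ARP replies were seen.
arp_summary() {
    awk '
        /ARP, Request who-has/ {
            for (i = 1; i <= NF; i++)
                if ($i == "who-has") req[$(i + 1)]++
        }
        /ARP, Reply/ {
            for (i = 1; i <= NF; i++)
                if ($i == "is-at") rep[$(i - 1)]++
        }
        END {
            for (ip in req)
                printf "%s requests=%d replies=%d\n", ip, req[ip], rep[ip] + 0
        }'
}

# Usage while the problem is happening (capture 100 ARP packets, then analyze):
#   tcpdump -n -l -i br-lan -c 100 arp > /tmp/arp.txt
#   arp_summary < /tmp/arp.txt
```

If the router's capture shows requests for the target but the target's own capture shows none, the broadcast is being lost on the way to that client. If the target sees the requests and replies, but the replies never show up on the router, the loss is in the other direction.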

Could it be related to this post: Device stops passing broadcast packets to wireless interfaces after 24H of uptime?