ARP requests not answered

Hi.

I have OpenWrt 22.03.0 installed on an ASUS RT-AC58U with a number of Linux / Android clients connected via WiFi.

At the point I boot the router, everything is fine. Then after a random period of time, usually hours or days, I will find that various local devices on the same network can no longer communicate with each other.

Having been through every log on every system, the one thing I see every time it happens is ARP requests with no response. Below is an example (tcpdump on the router) of two Android devices both sending ARP requests for the same Linux system (as a result of a ping); one gets a response and the other doesn't:

12:08:43.729424 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729482 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729534 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729614 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729647 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729713 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.729424 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
12:08:43.800402 ARP, Reply 192.168.10.3 is-at bc:a5:11:ba:d2:57, length 28
12:08:43.800451 ARP, Reply 192.168.10.3 is-at bc:a5:11:ba:d2:57, length 28

12:08:56.232285 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:56.232353 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:56.232389 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:56.232411 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:56.232479 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:56.232285 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.001877 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.001940 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.001978 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.002001 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.002069 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.001877 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770221 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770286 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770325 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770348 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770411 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28
12:08:57.770221 ARP, Request who-has 192.168.10.3 tell 192.168.10.108, length 28

The requests from 192.168.10.108 continue indefinitely.
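
For anyone who wants to check a longer capture for this pattern, here is a minimal Python sketch that reads tcpdump text in the format above and counts requests for which no reply for the same target shows up within a short window. The function name `unanswered_requesters` and the one-second window are assumptions made for illustration. Note that without `-e` the reply lines do not show who the reply was sent to, so a reply to one requester can hide a missing reply to another inside the same window.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Default tcpdump ARP lines, e.g.
#   12:08:43.729424 ARP, Request who-has 192.168.10.3 tell 192.168.10.216, length 28
#   12:08:43.800402 ARP, Reply 192.168.10.3 is-at bc:a5:11:ba:d2:57, length 28
REQUEST_RE = re.compile(
    r"^(\d\d:\d\d:\d\d\.\d+) ARP, Request who-has ([\d.]+) tell ([\d.]+)"
)
REPLY_RE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) ARP, Reply ([\d.]+) is-at ")


def parse_time(stamp):
    # tcpdump prints time of day only, so captures crossing midnight
    # are not handled by this sketch.
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def unanswered_requesters(lines, window_seconds=1.0):
    """Return {(target_ip, requester_ip): count} for requests that saw no
    reply for target_ip within window_seconds after the request."""
    requests = []                # (time, target, requester)
    replies = defaultdict(list)  # target -> [reply times]
    for line in lines:
        line = line.strip()
        m = REQUEST_RE.match(line)
        if m:
            requests.append((parse_time(m.group(1)), m.group(2), m.group(3)))
            continue
        m = REPLY_RE.match(line)
        if m:
            replies[m.group(2)].append(parse_time(m.group(1)))

    window = timedelta(seconds=window_seconds)
    missing = defaultdict(int)
    for when, target, requester in requests:
        if not any(when <= r <= when + window for r in replies[target]):
            missing[(target, requester)] += 1
    return dict(missing)
```

Feeding it the saved output of something like `tcpdump -n -i br-lan arp` (interface name assumed) gives one entry per requester/target pair that went unanswered, which makes it easier to see whether the same clients are hit each time.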

logread on the router does not report anything when this is happening.

There is no pattern to which devices are affected; it can be a different source and destination device every time. I can't believe they all have a problem. The common denominator seems to be the router.

To resolve the problem I've tried:

  • Stopping and starting the WiFi on the source.
  • Stopping and starting the WiFi on the destination.
  • Rebooting the source.
  • Rebooting the destination.

None of these has any effect. The only thing that fixes it is to reboot the router, which is OK for a bit and then the situation repeats itself.

I didn't have this problem when I was using my ISP's router.

If anyone has any suggestions as to what the problem could be, I would very much appreciate any advice.

If you need any additional information, please let me know.

Thanks in advance.

I'm seeing similar issues on my network. Same OpenWrt version on a Linksys AX3200, with a fairly vanilla configuration, so it's not just you.

The last time it happened I also ran a trace on the destination device (192.168.10.3) and, while it received the requests from 192.168.10.216, it did not receive the requests from 192.168.10.108.
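
Comparing the two traces by hand gets tedious, so here is a small Python sketch of the same check: it lists the requesters whose ARP requests for a given target appear in the router capture but not in the capture taken on the destination. The function names `requesters` and `lost_between` are made up for this example, and it assumes both captures are plain tcpdump text covering the same time span.

```python
import re

# Matches the request part of a tcpdump ARP line, with or without
# an interface/direction prefix in front of "ARP,".
REQUEST_RE = re.compile(r"ARP, Request who-has ([\d.]+) tell ([\d.]+)")


def requesters(lines, target):
    """Set of IPs that asked 'who-has <target>' in the given capture lines."""
    found = set()
    for line in lines:
        m = REQUEST_RE.search(line)
        if m and m.group(1) == target:
            found.add(m.group(2))
    return found


def lost_between(router_lines, destination_lines, target):
    """Requesters seen by the router but never seen by the destination."""
    return sorted(requesters(router_lines, target)
                  - requesters(destination_lines, target))
```

A non-empty result means the request reached the router but was not delivered to the destination, which points at forwarding on the router rather than at the destination ignoring the request.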

Can confirm this issue on my ASUS RT-AC85P with a vanilla install. Hope someone will be able to identify and resolve the issue?

From this thread:

  • ASUS RT-AC85P: Mediatek MT7621
  • Linksys AX3200: Mediatek MT7622

From another thread:

  • Xiaomi Mi Router 3 Pro: Ralink MT7620A

It appears that predominantly Mediatek chipsets are affected by this?

Update: I just upgraded to OpenWrt 22.03.2 and the issue still persists.

2. Update: A bug report has been filed here.

The bug link suggests someone was to try snapshot and report back...at exactly the same time as your update.

(It should also be noted the fix is believed to be there.)

So...have you tried snapshot?

3. Update:

  1. The firmware has been upgraded to a snapshot-version and the issue persisted.
  2. The upgraded snapshot-firmware has been reset to defaults and the issue persisted.

Did this ever get resolved? I had the same issue and still do on 23.05.0. I see no resolutions in the bug report.
Model: Netgear R6220
Arch: MediaTek MT7621 ver:1 eco:3

An update on this topic would interest me too. I ran into exactly the same issue on a Netgear R7500 v1 today.
I had not used that device for years, so I don't know which firmware versions might have worked in the past.

Right now I tried 23.05.0.

My router is based on a Qualcomm Atheros IPQ8064. So maybe this is a new issue that is no longer strictly MediaTek-related.

I guess I am having the same problem: Cannot access home server from AP - #25 by corvin. Any news on the fix? I am on the latest stable; I did not have this issue on the latest stable 22.X.Y release.

Did you manage to resolve this issue? Since I moved to 23.05.2 I am struggling.

I am having similar problems, mostly with IoT devices. Enabling "Multi to Unicast" has improved the situation significantly for me.

Tried that, did not help.

Hi,
I am not sure if my issue is exactly the same, but it looks similar. I dug into it with tcpdump and noticed that the OpenWrt router can ping, and therefore ARP-resolve, all the devices. Similarly, my laptop can do that via VPN. A successful request looks like this (note the Out direction flag):

17:40:54.707115 br-lan Out ARP, Request who-has 10.1.2.205 tell 10.1.2.2, length 28
17:40:54.707155 if-lan Out ARP, Request who-has 10.1.2.205 tell 10.1.2.2, length 28
17:40:54.708673 if-lan P   ARP, Reply 10.1.2.205 is-at xxx, length 46
17:40:54.708673 br-lan In  ARP, Reply 10.1.2.205 is-at xxx, length 46

An unsuccessful request looks like this:

17:39:32.821519 if-lan B   ARP, Request who-has 10.1.2.205 tell 10.1.2.209, length 28
17:39:32.821519 br-lan B   ARP, Request who-has 10.1.2.205 tell 10.1.2.209, length 28

As you can see, the target device apparently does not respond at all to the broadcast (B) packet. I do not know whether this is expected behavior. The failing request came from a device in the same LAN as the target device.
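
To make the two cases easier to compare, here is a rough Python sketch that groups lines like the ones above by target IP and records on which interface and with which direction flag each request and reply was seen. It assumes the output came from a capture on all interfaces (`tcpdump -i any`), where, as far as I know, In and Out mark frames addressed to or sent by the router itself, B and M mark link-layer broadcast and multicast, and P marks frames addressed to another host. If that is right, a B on an ARP request is normal, since requests are broadcast; the unusual part is that no reply follows. The function names `summarize` and `targets_without_reply` are assumptions for illustration.

```python
import re
from collections import defaultdict

# Lines such as:
#   17:40:54.707115 br-lan Out ARP, Request who-has 10.1.2.205 tell 10.1.2.2, length 28
#   17:40:54.708673 if-lan P   ARP, Reply 10.1.2.205 is-at xxx, length 46
LINE_RE = re.compile(
    r"^(?P<time>\S+)\s+(?P<iface>\S+)\s+(?P<flag>In|Out|B|M|P)\s+ARP, "
    r"(?P<kind>Request who-has|Reply) (?P<ip>[\d.]+)"
)


def summarize(lines):
    """Map target IP -> {'request': {(iface, flag)}, 'reply': {(iface, flag)}}."""
    seen = defaultdict(lambda: {"request": set(), "reply": set()})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not an ARP line in this format
        kind = "request" if m.group("kind").startswith("Request") else "reply"
        seen[m.group("ip")][kind].add((m.group("iface"), m.group("flag")))
    return dict(seen)


def targets_without_reply(lines):
    """Target IPs that were asked for but never answered in this capture."""
    return sorted(
        ip for ip, entry in summarize(lines).items()
        if entry["request"] and not entry["reply"]
    )
```

Run it on one capture snippet at a time: the reply lines do not name the requester, so mixing a working and a failing exchange for the same target in one input would make the failing one look answered.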

Here is some background info about my setup, which bridges a batman-adv mesh interface together with the other ports. This may make my case special?

network:

config device
	option name 'br-lan'
	option type 'bridge'
#	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'
	list ports 'if-lan'
	list ports 'bat0.1'

config device
	option name 'lan1'
	option macaddr 'xxx'

config device
	option name 'lan2'
	option macaddr 'xxx'

config device
	option name 'lan3'
	option macaddr 'xxx'

config device
	option name 'lan4'
	option macaddr 'xxx'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '10.1.2.2'
	option netmask '255.255.255.0'
	option ip6assign '64'
	option ip6hint '2'

config interface 'bat0'
	option proto 'batadv'
	option routing_algo 'BATMAN_IV'
	option ap_isolation '1'
	option gw_mode 'off'
	option log_level '0'
	option distributed_arp_table '1'
	option orig_interval '2000'

config interface 'mesh'
	option mtu '1532'
	option proto 'batadv_hardif'
	option master 'bat0'

wireless:

config wifi-device 'radio0'
	option type 'mac80211'
	option path 'pci0000:00/0000:00:00.0/0000:01:00.0'
	option htmode 'HT20'
	option hwmode '11g'
	option country 'DE'
	option legacy_rates '1'
	option channel '2'

config wifi-iface 'lan'
	option device 'radio0'
	option mode 'ap'
	option encryption 'psk2+ccmp'
	option ssid 'xxx'
	option ifname 'if-lan'
	option key 'xxx'
	option network 'lan'

config wifi-iface 'batman'
	option ifname 'if-bat'
	option device 'radio0'
	option mode 'mesh'
	option mesh_id 'xxx'
	option mesh_fwding '0'
	option encryption 'sae'
	option key 'xxx'
	option network 'mesh'
	option disabled '0'

Any hints are appreciated!

EDIT:
Additionally, I noticed that the LAN devices can still ping the router itself and also other access points, which are technically reached via WiFi but with the packets moving through batman-adv. At this point I have not been able to test whether pinging another WiFi device connected to another access point works.