I'm using the latest code from GitHub, compiled from source with IPv6 support disabled.
If the DM200 tries to communicate with a host whose MAC address it already has, but more than 5 minutes have passed since the last packet it received from that host, the DM200 sends packets to the host's MAC address and they never arrive at the host.
(tcpdump on the DM200 shows the outgoing packets, but tcpdump on the host shows nothing.)
The packets start to arrive only after the DM200 sends a broadcast ARP request again and the host replies to it.
I already tried changing the switch connected to the DM200, and I also tried connecting the DM200 directly to the host, but nothing changes.
Needed to reproduce (all on the same LAN):
- Netgear DM200
- A host that won't try to communicate with the DM200 for at least 300 seconds (e.g. one that does not use the DM200 as its gateway)
- A PC logged in to the DM200 via ssh
Steps to reproduce (all steps are executed on the DM200):
- log in to the DM200 via ssh
- ping the host on the LAN [replies start immediately]
- stop the ping
- wait at least 300 seconds
- if you run the "arp" command you can see that the DM200 still has the MAC address of the host in its cache (flag 0x2 means the ARP entry is complete)
- ping the host again [PING REPLIES WON'T ARRIVE] (actually, the ICMP requests never reach the host, so the host obviously cannot reply; verified with tcpdump on both devices)
- after 10-15 seconds you will see ping replies arriving (this is because the ARP entry becomes "incomplete" (flag 0x0) and the DM200 sends a broadcast ARP request)
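To make the flag check in the steps above easier to repeat, here is a minimal sketch in Python that reads text in the format of /proc/net/arp and reports whether each entry is complete (0x2) or incomplete (0x0). It assumes you save the output of "cat /proc/net/arp" from the DM200 and run the script on a PC, since the DM200 itself may not have Python installed. The function name, the sample addresses and the interface name are made up for illustration.

```python
# Flag bits as defined in the Linux header <linux/if_arp.h>.
ATF_COM = 0x02   # entry is complete (MAC address known)


def parse_arp_table(text):
    """Parse text in /proc/net/arp format into a list of dicts."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed or truncated lines
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        flag_bits = int(flags, 16)
        entries.append({
            "ip": ip,
            "mac": mac,
            "device": device,
            "flags": flag_bits,
            "complete": bool(flag_bits & ATF_COM),
        })
    return entries


# Hypothetical sample: the addresses and interface name are invented.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.50     0x1         0x2         aa:bb:cc:dd:ee:ff     *        br-lan
192.168.1.51     0x1         0x0         00:00:00:00:00:00     *        br-lan
"""

for entry in parse_arp_table(SAMPLE):
    state = "complete" if entry["complete"] else "incomplete"
    print(entry["ip"], entry["mac"], hex(entry["flags"]), state)
```

Saving one snapshot right before the second ping and another 10-15 seconds later should show the entry for the host going from complete to incomplete and back, if the behaviour matches what I describe.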
If you attach Wireshark to tcpdump running on both the DM200 and the host, then after the 300 seconds you can see packets leaving the DM200 interface, but nothing arrives at the host interface until the broadcast ARP request.
It seems that packets intended for a host with which the DM200 last communicated more than 300 seconds earlier are NOT actually leaving the LAN interface of the DM200, even though Wireshark shows them.
This problem doesn't affect only ICMP packets, but all packets sent to a host after 300 seconds without communication with it.
You can do the same test with netcat trying to connect to a random port:
- the first time you'll get a "connection refused" immediately
- after 300 seconds you'll have to wait 10-15 seconds to get the "connection refused"
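For anyone who wants to measure the delay instead of eyeballing it, the netcat test can be sketched in Python as below. This assumes a device where Python is available, which is probably not the DM200 itself but could be another router being used for comparison. The function name, the host address and the port in the usage comment are hypothetical.

```python
import socket
import time


def time_connect(host, port, timeout=30.0):
    """Return (seconds_elapsed, outcome) for one TCP connection attempt."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The host answered with a TCP reset, so the packets got through.
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        outcome = "error: %s" % exc
    return time.monotonic() - start, outcome


# Example usage (hypothetical LAN host and closed port):
#   elapsed, outcome = time_connect("192.168.1.50", 12345)
#   print("%s after %.2f s" % (outcome, elapsed))
```

Run it once, wait at least 300 seconds, then run it again. On an affected device the second "refused" should take noticeably longer than the first.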
I tried with many devices as the "host", so I think the problem is on the DM200 and not on the other host.
Could this be a problem with the integrated LAN interface?
All my other OpenWrt devices are running kernel 4.9 and this problem is NOT present on them.
The DM200 is the only one with kernel 4.14.79, so could this problem be related to the kernel?
Can someone with another device running kernel 4.14 try to reproduce these steps?
I think a quick way to mitigate this problem is to invalidate the ARP cache entry after 300 seconds, so the DM200 immediately sends a broadcast ARP request again (as if it were the first time).