IPv6 NDP relay works only on second attempt

I have a non-delegated /64 prefix. Host 2404:7a80:9621:7100::10 in my LAN is serving some content to the Internet. Often, when trying to access it, the first connection attempt fails, and the host seems to become reachable only about ten seconds after the first attempt. I inspected the traffic with tcpdump on my OpenWrt (21.02) router to see what's going on:

The ISP router sends a Neighbor Solicitation to my OpenWrt router's WAN port:

03:08:07.742029 IP6 (class 0xb8, hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::207:7dff:fe99:f7c5 > 2404:7a80:9621:7100::10: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2404:7a80:9621:7100::10
	  source link-address option (1), length 8 (1): 00:07:7d:99:f7:c5

Almost immediately, my router sends a Neighbor Solicitation out of the LAN port (ff02::1:ff00:10 is the "Solicited-Node Multicast Address" for any address whose low 24 bits are 00:0010, so it corresponds to the host), and the host responds to it:

03:08:07.743163 IP6 fe80::1aa6:f7ff:fe8d:c0d3 > ff02::1:ff00:10: ICMP6, neighbor solicitation, who has 2404:7a80:9621:7100::10, length 32
03:08:07.743342 IP6 2404:7a80:9621:7100::10 > fe80::1aa6:f7ff:fe8d:c0d3: ICMP6, neighbor advertisement, tgt is 2404:7a80:9621:7100::10, length 32
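
For reference, here is a minimal Python sketch of how that solicited-node multicast address is derived: the low 24 bits of the unicast address are appended to the fixed prefix ff02::1:ff00:0/104. The function name is made up for illustration; only the standard-library `ipaddress` module is used.

```python
import ipaddress

def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Map a unicast IPv6 address to its solicited-node multicast address.

    Hypothetical helper for illustration: keeps the low 24 bits of the
    address and ORs them into the fixed prefix ff02::1:ff00:0/104.
    """
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)

# The LAN host from the trace above
print(solicited_node_multicast("2404:7a80:9621:7100::10"))  # ff02::1:ff00:10
```

Any other address ending in the same 24 bits maps to the same group, which is why the NS destination alone does not identify the host; the "who has" target does.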

However, after that, there are 10 seconds of radio silence; my router doesn't respond to the ISP router. After 10 seconds, the ISP router tries again, and only then gets a response from my router (and for some reason, my router sends two responses: one without and one with the destination link-address option, which corresponds to the router's WAN port):

03:08:17.779822 IP6 (class 0xb8, hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::207:7dff:fe99:f7c5 > 2404:7a80:9621:7100::10: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2404:7a80:9621:7100::10
	  source link-address option (1), length 8 (1): 00:07:7d:99:f7:c5
03:08:17.780000 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) fe80::1aa6:f7ff:fe8d:c0d4 > fe80::207:7dff:fe99:f7c5: [icmp6 sum ok] ICMP6, neighbor advertisement, length 24, tgt is 2404:7a80:9621:7100::10, Flags [solicited]
03:08:17.781569 IP6 (flowlabel 0xcd5a1, hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::1aa6:f7ff:fe8d:c0d4 > fe80::207:7dff:fe99:f7c5: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is 2404:7a80:9621:7100::10, Flags [solicited]
	  destination link-address option (2), length 8 (1): 18:a6:f7:8d:c0:d4
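
To put numbers on the gap, here is a small Python sketch that subtracts the tcpdump timestamps copied from the traces above. The helper name is made up for illustration, and it assumes both timestamps fall on the same day (no midnight wrap).

```python
from datetime import datetime

def seconds_between(t1: str, t2: str) -> float:
    """Elapsed seconds between two tcpdump-style HH:MM:SS.ffffff timestamps.

    Hypothetical helper; assumes both timestamps are from the same day.
    """
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)).total_seconds()

first_ns = "03:08:07.742029"   # first NS from the ISP router (unanswered)
second_ns = "03:08:17.779822"  # second NS from the ISP router
first_na = "03:08:17.780000"   # first NA sent back on the WAN side

print(f"NS retry interval:       {seconds_between(first_ns, second_ns):.6f} s")
print(f"NA delay after 2nd NS:   {seconds_between(second_ns, first_na):.6f} s")
```

So the retry comes roughly 10.04 s after the first NS, while the reply to the second NS goes out in well under a millisecond.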

To me, this behaviour looks like a bug. I've set RA-Service, DHCPv6-Service and NDP-Proxy to "relay" on both the WAN and LAN interfaces, with the WAN interface as the master. My understanding is that setting NDP-Proxy to "relay" should make the router respond correctly and swiftly to upstream Neighbor Solicitation messages, but this happens only on the second attempt.

For further info, here's an excerpt of my /etc/config/dhcp:

config dhcp 'wan6'
	option interface 'wan6'
	option ignore '1'
	option master '1'
	option dhcpv6 'relay'
	option ra 'relay'
	option ndp 'relay'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv4 'server'
	option ra_slaac '1'
	list ra_flags 'managed-config'
	list ra_flags 'other-config'
	option ra 'relay'
	option dhcpv6 'relay'
	option ndp 'relay'

Does this seem like a bug, or am I misunderstanding something about the config or something else? If it's a bug, where should I report it or consult for more information?

related to this? IP6 relay problems

i never found a solution

Might be! In your case, R7800 is running openwrt and relaying NA/NS, right? Interesting that it pings the hosts in lan instead of sending NS. I might have filtered pings out in my tcpdump, maybe I should redo the traces.

Also, I've been trying to look into the odhcpd source code and considered building a version with more debug tracing, but then again, life is too short for chasing every bug in unfamiliar codebases, so perhaps not :joy:

i think maybe it's not sending the relay NA messages correctly and confusing the upstream router. they are relayed to the WAN without the router flag set and with the IP6 address of the client (on the LAN), but with the MAC address of the R7800 WAN. surely that is going to make the upstream router think that the R7800 has changed address.
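
For anyone checking those flags in a capture, here is a minimal Python sketch of how the Neighbor Advertisement flag bits are laid out. It assumes the RFC 4861 layout, where the first byte after the ICMPv6 checksum carries the R, S and O bits. The function name is made up for illustration and is not part of odhcpd or tcpdump.

```python
def decode_na_flags(flags_byte: int) -> list[str]:
    """Decode the first flags byte of a Neighbor Advertisement.

    Hypothetical helper; bit positions assumed from RFC 4861:
    0x80 = Router, 0x40 = Solicited, 0x20 = Override.
    """
    names = []
    if flags_byte & 0x80:
        names.append("router")
    if flags_byte & 0x40:
        names.append("solicited")
    if flags_byte & 0x20:
        names.append("override")
    return names

# An NA that tcpdump prints as "Flags [solicited]", like the relayed ones
print(decode_na_flags(0x40))  # ['solicited']
# What an NA with the router flag also set would decode to
print(decode_na_flags(0xC0))  # ['router', 'solicited']
```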

is anyone able to provide any advice on this? pinging the 'LAN' port of the ISP router (connected to the WAN port of OpenWRT) from a Client on the OpenWRT lan results in jitter up to 800-900+ms (normally it's 1-2ms).

looking at the NS/NA traffic, the ISP router does an NS for the Client every 2 or 3 seconds, and it usually takes 200-600ms for a corresponding NA to be sent back. OpenWRT very promptly sends a ping to the PC Client after receiving the NS and receives the response almost immediately. most of the delay seems to occur after this, with the kernel (or whatever is responsible) taking a while to send the NA back to the ISP router.

i don't know much about IPV6 routing unfortunately.

  1. should the ISP router be sending NS quite so often for something it has been receiving constant traffic from?
  2. should there be such a big delay in responding to a NS? if not, any idea what the cause might be, can it be tuned?

not sure if it will help you, i fixed the latency problem on mine by disabling odhcpd's NDP proxying - i couldn't work out why, but something to do with the way it sends pings to trigger the kernel to do a neighbour solicitation/advertisement and then responds. instead i had to update the ndppd package to the master branch so it would update the routing tables (enable autowire on the proxy entries).

this branch uses master of my fork (which is currently the same as the source master, i just didn't want it changing underneath me): https://github.com/facboy/routing/tree/ndppd
