Fundamental problem with ping and IPv6

I have ongoing problems with my ISP's IPv6 service. As a result, I've set up a connectivity logger that pings various addresses, both local to my own network and across the wider Internet, to track IPv6 connectivity issues. It uses collectd.

I have just upgraded to OpenWrt 21.02.2 on a TP-Link Archer C7 v2:

# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='21.02.2'
DISTRIB_REVISION='r16495-bf0c965af0'
DISTRIB_TARGET='ath79/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt 21.02.2 r16495-bf0c965af0'
DISTRIB_TAINTS=''

# cat /proc/cpuinfo
system type		: Qualcomm Atheros QCA9558 ver 1 rev 0
machine			: TP-Link Archer C7 v2

On booting, the ISP's IPv4 connectivity comes up immediately. The IPv6 connectivity can take hours.

While I am waiting for the IPv6 protocol to come up, the collectd ping module fails to record any information. Looking in syslog, I see the following entries:

Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied

There is one 'Permission denied' for each IPv6 host defined in /etc/collectd.conf - but even IPv4 reachable hosts are not recorded.
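To check that the error count really matches the number of IPv6 hosts, the log lines can be tallied per timestamp. This is a minimal sketch, assuming the log format shown above; the function name count_sendto_errors is my own invention and not part of collectd or OpenWrt.

```shell
# Tally "ping_sendto: Permission denied" lines per timestamp.
# Reads syslog text on stdin; assumes the logread format shown above,
# where fields 1-5 hold the date and time.
count_sendto_errors() {
    awk '/collectd\[[0-9]+\]: ping_sendto: Permission denied/ {
             ts = $1 " " $2 " " $3 " " $4 " " $5
             if (!(ts in n)) order[++k] = ts   # remember first-seen order
             n[ts]++
         }
         END { for (i = 1; i <= k; i++) print n[order[i]], order[i] }'
}

# Usage on the router: logread | count_sendto_errors
```

Each output line is a count followed by the timestamp it belongs to, so a count equal to the number of IPv6 hosts in /etc/collectd.conf would confirm the one-error-per-host pattern.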

In order to diagnose further, I installed noping, which also chokes on unreachable IPv6 hosts.

If I use the built-in BusyBox ping, I get a similar (possibly identical) issue.

ping ::1 pings the IPv6 local loopback address. This works.

# ping ::1
PING ::1 (::1): 56 data bytes
64 bytes from ::1: seq=0 ttl=64 time=0.283 ms
64 bytes from ::1: seq=1 ttl=64 time=0.291 ms
64 bytes from ::1: seq=2 ttl=64 time=0.286 ms

Pinging a local IPv6 address also works:

# ping -6 fd3c:458e:50a8::6d7
PING fd3c:458e:50a8::6d7 (fd3c:458e:50a8::6d7): 56 data bytes
64 bytes from fd3c:458e:50a8::6d7: seq=0 ttl=64 time=0.458 ms
64 bytes from fd3c:458e:50a8::6d7: seq=1 ttl=64 time=0.644 ms
64 bytes from fd3c:458e:50a8::6d7: seq=2 ttl=64 time=0.600 ms

Trying to ping a remote IPv6 address fails:

# nslookup one.one.one.one
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:      one.one.one.one
Address 1: 1.0.0.1
Address 2: 1.1.1.1
Address 3: 2606:4700:4700::1001
Address 4: 2606:4700:4700::1111

# ping 2606:4700:4700::1111
PING 2606:4700:4700::1111 (2606:4700:4700::1111): 56 data bytes
ping: sendto: Permission denied

Pinging an IPv4 address shows there is external connectivity via IPv4:

# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=59 time=1.417 ms
64 bytes from 1.1.1.1: seq=1 ttl=59 time=1.225 ms
64 bytes from 1.1.1.1: seq=2 ttl=59 time=1.945 ms

So there appears to be a fundamental problem affecting the IPv6 protocol stack such that if a remote host is unreachable, we are generating a 'Permission denied' rather than "Destination Unreachable" or "Time Exceeded". This affects the collectd ping module, Busybox ping, and liboping used in noping.

The incorrectly handled 'Permission denied' in the collectd ping module means all ping results fail or are discarded, and nothing ends up being recorded in the rrd database, even though other parameters, such as cpu, are successfully being recorded.

What can I do to diagnose further and raise a bug in the correct place? It looks like the same issue is affecting several different code-bases, so it's not appropriate to raise bugs there.

Edit to add:
Five and three-quarter hours after booting, IPv6 came up.
And collectd is now happy*, rrd databases are being updated and graphs are being produced.

  • "Time Exceeded" means that all routers in the path decremented the TTL counter until it == 0; the router reporting ttl==0 sends the Time Exceeded - that didn't occur.
  • "Destination Unreachable" has other conditions: it is an error message sent by a router. Since you were the originating source of the packet, that isn't the case here. For it to apply, an upstream IPv6 router would have had to receive the packet and have no onward route to deliver it, which didn't occur either.

If your IPv6 interface isn't up, Permission Denied could be the appropriate response, depending on the issue. It could also be, and commonly is, a firewall issue (e.g. you may have denied output on the interface doing the ping6).

OK, thanks for the explanation of the precise meanings of 'Time Exceeded' and 'Destination Unreachable'. It's a little irritating that a router (which would know if the destination is unreachable) doesn't return that status when you try to ping outside directly attached networks.

It's not a firewall issue*. The IPv6 protocol does come up eventually; it just takes a long time. I have taken comprehensive logs using tcpdump and Wireshark, and can see that Neighbor Discovery is not working properly between the OpenWrt router and the upstream Huawei router operated by the ISP. The ISP have been... unhelpful. It is galling that IPv4 comes up as soon as the router is booted, while IPv6 takes a variable amount of time, from minutes to hours. I'm not ruling out that OpenWrt implements Neighbor Discovery incorrectly, but I would have expected a lot of bug reports if that were the case. The Huawei appears to be ignoring Router Solicitations; details are in my previous post: OpenWrt IPv6 issue?

*Well, I don't see how the firewall would cause the behaviour I'm seeing. If you can suggest some diagnostics to demonstrate it is a firewall problem, or rule it out, I'd be grateful.
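One way to narrow this down is to look at which error the kernel hands back to ping, because different local causes map to different errno values. According to ip-route(8), a prohibit route gives local senders EACCES ("Permission denied"), an unreachable route gives EHOSTUNREACH and a blackhole route gives EINVAL, whereas a netfilter DROP on output usually surfaces as EPERM ("Operation not permitted"). The sketch below is only a rough hint table: sendto_hint is a hypothetical helper name, and the mapping lists common causes, not every possibility.

```shell
# Map the error text printed by ping to a likely local cause.
# The mapping follows ip-route(8) and common netfilter behaviour;
# it is a hint, not proof, and other causes are possible.
sendto_hint() {
    case "$1" in
        *"Permission denied"*)       echo "EACCES: likely a prohibit route or rule" ;;
        *"Operation not permitted"*) echo "EPERM: likely a firewall drop on output" ;;
        *"Network is unreachable"*)  echo "ENETUNREACH: no matching route" ;;
        *"No route to host"*)        echo "EHOSTUNREACH: possibly an unreachable route" ;;
        *"Invalid argument"*)        echo "EINVAL: possibly a blackhole route" ;;
        *)                           echo "unknown: check logs" ;;
    esac
}

# Commands worth running on the router while IPv6 is down
# (assuming ip and ip6tables are installed; output not shown here):
#   ip -6 route show                        # look for prohibit/unreachable entries
#   ip -6 route get 2606:4700:4700::1111    # which route the kernel would pick
#   ip6tables -S OUTPUT                     # any DROP/REJECT rules on output?
```

Calling sendto_hint 'ping: sendto: Permission denied' prints the EACCES line, which would point towards the routing table and away from a firewall drop, though the commands in the comments are what would actually confirm it.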

Similar issue resolved in this thread...

Thanks for that, I'll look into it, and try to understand why that should work.

Well, I tried that, and got a lot of messages in the log complaining about the configuration. I'll redo it when I get the time.

Meanwhile, the IPv6 stability on 21.02.2 seems better. The collectd statistics show ping responses from the link-local address of the ISP's concentrator to be (a) more stable, (b) consistently low-delay and (c) subject to a far lower drop rate.

--- fe80::<redacted> ping statistics ---
100 packets transmitted, 100 packets received, 0% packet loss
round-trip min/avg/max = 1.078/1.957/6.722 ms

That said, some round-trip delays are a lot higher, and there is some packet loss.

Now it might be that the ISP have upgraded the software on their Huawei device (which I'm getting loop detection packets from), but on the face of it, it is more likely that something has changed in the OpenWrt IPv6 stack.

If/when I get the time, I'll set up a test router and try unplugging and replugging the physical interface while running a packet capture, to try to diagnose (again) what is going amiss with the Neighbor Discovery Protocol.
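For that capture, restricting tcpdump to Neighbor Discovery traffic keeps the file small and easy to read in Wireshark. The following is a sketch under assumptions: nd_filter and capture_nd are hypothetical helper names, the interface and file names are placeholders, and the filter reads the ICMPv6 type at a fixed offset, so it would miss packets that carry IPv6 extension headers.

```shell
# Build a tcpdump filter matching Neighbor Discovery messages only:
# ICMPv6 types 133 (RS), 134 (RA), 135 (NS), 136 (NA), 137 (Redirect).
# ip6[40] is the ICMPv6 type byte when no extension headers are present.
nd_filter() {
    echo 'icmp6 and ip6[40] >= 133 and ip6[40] <= 137'
}

# Capture matching packets to a file for later inspection.
# $1 = interface name, $2 = output file (both supplied by the caller).
capture_nd() {
    tcpdump -i "$1" -n -w "$2" "$(nd_filter)"
}

# Usage: capture_nd eth0.2 /tmp/nd.pcap   (eth0.2 is a placeholder WAN name)
```

Starting the capture before replugging the cable should show whether Router Solicitations leave the router and whether any Router Advertisement comes back.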