I have ongoing problems with my ISP's IPv6 service. As a result, I've set up a connectivity logger, using collectd, that pings various addresses on my own network and across the wider Internet to track IPv6 connectivity issues.
I have just upgraded to OpenWrt 21.02.2 on a TP-Link Archer C7 v2:
# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='21.02.2'
DISTRIB_REVISION='r16495-bf0c965af0'
DISTRIB_TARGET='ath79/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt 21.02.2 r16495-bf0c965af0'
DISTRIB_TAINTS=''
# cat /proc/cpuinfo
system type : Qualcomm Atheros QCA9558 ver 1 rev 0
machine : TP-Link Archer C7 v2
On booting, the ISP's IPv4 connectivity comes up immediately. The IPv6 connectivity can take hours.
While I am waiting for the IPv6 protocol to come up, the collectd ping module fails to record any information. Looking in syslog, I see the following entries:
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
Mon Mar 21 15:47:15 2022 daemon.err collectd[12933]: ping_sendto: Permission denied
There is one 'Permission denied' for each IPv6 host defined in /etc/collectd.conf, but even hosts that are reachable over IPv4 are not recorded.
To diagnose further, I installed noping, which also chokes on unreachable IPv6 hosts.
If I use the built-in BusyBox ping, I get a similar (identical?) issue.
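To confirm the one-error-per-IPv6-host pattern, the errors can be counted directly from the log and compared with the number of IPv6 hosts in /etc/collectd.conf. This is only a minimal sketch: `count_ping_denied` is a hypothetical helper name, and it assumes the log lines have the same layout as the ones shown above.

```shell
# Hypothetical helper: count collectd 'ping_sendto: Permission denied'
# lines read from stdin and print the total.
count_ping_denied() {
    grep -c 'collectd\[[0-9]*\]: ping_sendto: Permission denied'
}

# On the router (logread is OpenWrt's syslog reader):
#   logread | count_ping_denied
```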
ping ::1 pings the IPv6 local loopback address. This works.
# ping ::1
PING ::1 (::1): 56 data bytes
64 bytes from ::1: seq=0 ttl=64 time=0.283 ms
64 bytes from ::1: seq=1 ttl=64 time=0.291 ms
64 bytes from ::1: seq=2 ttl=64 time=0.286 ms
Pinging a local IPv6 address also works:
# ping -6 fd3c:458e:50a8::6d7
PING fd3c:458e:50a8::6d7 (fd3c:458e:50a8::6d7): 56 data bytes
64 bytes from fd3c:458e:50a8::6d7: seq=0 ttl=64 time=0.458 ms
64 bytes from fd3c:458e:50a8::6d7: seq=1 ttl=64 time=0.644 ms
64 bytes from fd3c:458e:50a8::6d7: seq=2 ttl=64 time=0.600 ms
Trying to ping a remote IPv6 address fails:
# nslookup one.one.one.one
Server: 127.0.0.1
Address: 127.0.0.1#53
Name: one.one.one.one
Address 1: 1.0.0.1
Address 2: 1.1.1.1
Address 3: 2606:4700:4700::1001
Address 4: 2606:4700:4700::1111
# ping 2606:4700:4700::1111
PING 2606:4700:4700::1111 (2606:4700:4700::1111): 56 data bytes
ping: sendto: Permission denied
Pinging an IPv4 address shows there is external connectivity via IPv4:
# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=59 time=1.417 ms
64 bytes from 1.1.1.1: seq=1 ttl=59 time=1.225 ms
64 bytes from 1.1.1.1: seq=2 ttl=59 time=1.945 ms
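The manual tests above could be repeated in one pass with a small wrapper that reduces each ping run to a single word. This is a rough sketch, not a tested tool: `classify_ping` is a hypothetical helper name, and it only recognises the two output patterns seen in this post.

```shell
# Hypothetical helper: classify ping output read from stdin as
# "denied" (sendto failed with 'Permission denied'), "ok" (a reply
# was received) or "other" (anything else, e.g. silent timeout).
classify_ping() {
    _ping_out=$(cat)
    case "$_ping_out" in
        *"Permission denied"*) echo denied ;;
        *"bytes from"*)        echo ok ;;
        *)                     echo other ;;
    esac
}

# On the router, using the addresses from the tests above:
#   for a in ::1 fd3c:458e:50a8::6d7 2606:4700:4700::1111 1.1.1.1; do
#       echo "$a: $(ping -c 1 -W 2 "$a" 2>&1 | classify_ping)"
#   done
```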
So there appears to be a fundamental problem in the IPv6 protocol stack: if a remote host is unreachable, we get 'Permission denied' rather than 'Destination Unreachable' or 'Time Exceeded'. This affects the collectd ping module, BusyBox ping, and liboping as used in noping.
The incorrectly handled 'Permission denied' in the collectd ping module means all ping results fail or are discarded, and nothing ends up being recorded in the rrd database, even though other parameters, such as cpu, are successfully being recorded.
What can I do to diagnose further and raise a bug in the correct place? The same issue seems to affect several different code-bases, so it doesn't look appropriate to raise bugs against each of them.
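One possible first check is the IPv6 routing table while IPv6 is still down. According to the iproute2 documentation, a route of type `prohibit` makes local senders get EACCES (the error behind 'Permission denied'), whereas an `unreachable` route gives EHOSTUNREACH. Whether such a route exists on this router is an assumption to verify, not a confirmed cause. In the sketch below, `reject_routes` is a hypothetical helper name, and it only matches the standard iproute2 route type names.

```shell
# Hypothetical helper: from `ip -6 route show` output on stdin, keep
# only routes whose type rejects traffic instead of forwarding it.
reject_routes() {
    grep -E '^(prohibit|unreachable|blackhole|throw) '
}

# On the router (assumes the installed `ip` supports these subcommands):
#   ip -6 route show | reject_routes
#   ip -6 route get 2606:4700:4700::1111
#   ip -6 rule show
```

If a reject-type route matches the remote address while the ISP's IPv6 is down, that would point at the routing configuration rather than at collectd, BusyBox ping, or liboping individually.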
Edit to add:
Five and three-quarter hours after booting, IPv6 came up.
collectd is now happy*: rrd databases are being updated and graphs are being produced.
* Apart from the do_page_fault(): sending SIGSEGV to reader#[n] for invalid write access to [somewhere], which appears to be a known problem:
https://github.com/openwrt/openwrt/issues/8113
https://github.com/openwrt/openwrt/issues/8114