Clients lose IPv6 connection after a few hours

Hi,

I'm running a dual-router configuration with my ISP router and OpenWRT 19.07.2 on a Meraki Z1. After anywhere from a few hours to a day, the LAN clients lose their IPv6 connectivity. I'm not sure of the exact interval.

I do not run an exotic setup. It's just your standard configuration with the RA server enabled for stateless client auto-configuration and the DHCPv6 server disabled. The IPv6 prefix is requested via DHCPv6-PD and I successfully get a /60 prefix (which is dynamic but hasn't changed yet). Out of that prefix a /64 prefix is assigned to my LAN interface.
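To make the prefix arithmetic concrete, here is a small Python sketch (not something OpenWRT runs) of how a /64 can be carved out of the delegated /60. It uses the anonymized prefix from this thread and assumes that ip6hint '0' picks the subnet with index 0.

```python
import ipaddress

# Delegated prefix as it appears (anonymized) in this thread.
delegated = ipaddress.ip_network("2001:aaaa:aaaa:bbf0::/60")

# A /60 leaves 64 - 60 = 4 bits for subnetting, so 2**4 = 16 possible /64s.
subnets = list(delegated.subnets(new_prefix=64))

# Assumption: ip6hint '0' selects the subnet with index 0.
lan_prefix = subnets[0]

print(len(subnets))
print(lan_prefix)
print(subnets[-1])
```

The first subnet matches the /64 that shows up on br-lan in the route listings further down.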

From the router's perspective, there is still a working IPv6 connection. The delegated prefix remains valid and is still listed on the OpenWRT status page. I can still ping public IPv6 addresses from OpenWRT. A restart of the WAN6 interface resolves the problem until it happens again. I've read it could be a multicast problem, but ifconfig reports multicast is still running on all interfaces. It looks more like a routing problem to me.

Here are the relevant parts of my configuration.

# /etc/config/dhcp

config dhcp 'lan'
        option interface 'lan'
        option start '100'
        option limit '150'
        option leasetime '12h'
        list domain 'home.shibe.nl'
        list dns '2001:aaaa:aaaa:bbf0::1'
        option ra 'server'

config odhcpd 'odhcpd'
        option maindhcp '0'
        option leasefile '/tmp/hosts/odhcpd'
        option leasetrigger '/usr/sbin/odhcpd-update'
        option loglevel '4'

# /etc/config/

config interface 'lan'
        option type 'bridge'
        option ifname 'eth0.1'
        option proto 'static'
        option ipaddr '192.168.200.1'
        option netmask '255.255.255.0'
        option ip6ifaceid '::1'
        option ip6hint '0'
        option ip6assign '64'

config interface 'wan6'
        option ifname 'br-wan'
        option proto 'dhcpv6'
        list dns '2606:4700:4700::1111'
        list dns '2606:4700:4700::1001'
        option reqprefix 'auto'
        option reqaddress 'try'
        option peerdns '0'

These are the routes on my OpenWRT router in a working state (2001:aaaa:aaaa:bb00::/64 is the prefix assigned to my ISP router, 2001:aaaa:aaaa:bbf0::/64 is the prefix assigned to my OpenWRT router via DHCPv6-PD). Both come from a delegated prefix length of /60.

default from 2001:aaaa:aaaa:bb00:cccc:cccc:cccc:5460 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bb00::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bbf0::/60 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
2001:aaaa:aaaa:bb00::/64 dev br-wan  metric 256
2001:aaaa:aaaa:bbf0::/64 dev br-lan  metric 1024
unreachable 2001:aaaa:aaaa:bbf0::/60 dev lo  metric 2147483647  error -148
fddc:404e:e9ed::/64 dev br-lan  metric 1024
unreachable fddc:404e:e9ed::/48 dev lo  metric 2147483647  error -148
fe80::/64 dev eth0  metric 256
fe80::/64 dev br-wan  metric 256
fe80::/64 dev br-lan  metric 256
anycast 2001:aaaa:aaaa:bb00:: dev br-wan  metric 0
anycast 2001:aaaa:aaaa:bbf0:: dev br-lan  metric 0
anycast fddc:404e:e9ed:: dev br-lan  metric 0
anycast fe80:: dev eth0  metric 0
anycast fe80:: dev br-wan  metric 0
anycast fe80:: dev br-lan  metric 0
ff00::/8 dev eth0  metric 256
ff00::/8 dev br-lan  metric 256
ff00::/8 dev br-wan  metric 256

These are the routes on my OpenWRT router in a non-working state.

default from ::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bb00:cccc:cccc:cccc:5460 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bb00::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bbf0::/60 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
::/64 dev br-wan  metric 256
2001:aaaa:aaaa:bb00::/64 dev br-wan  metric 256
2001:aaaa:aaaa:bbf0::/64 dev br-lan  metric 1024
unreachable 2001:aaaa:aaaa:bbf0::/60 dev lo  metric 2147483647  error -148
fddc:404e:e9ed::/64 dev br-lan  metric 1024
unreachable fddc:404e:e9ed::/48 dev lo  metric 2147483647  error -148
fe80::/64 dev eth0  metric 256
fe80::/64 dev br-wan  metric 256
fe80::/64 dev br-lan  metric 256
anycast 2001:aaaa:aaaa:bb00:: dev br-wan  metric 0
anycast 2001:aaaa:aaaa:bbf0:: dev br-lan  metric 0
anycast fddc:404e:e9ed:: dev br-lan  metric 0
anycast fe80:: dev eth0  metric 0
anycast fe80:: dev br-wan  metric 0
anycast fe80:: dev br-lan  metric 0
ff00::/8 dev eth0  metric 256
ff00::/8 dev br-lan  metric 256
ff00::/8 dev br-wan  metric 256

Notice that the non-working state has two additional routes.

default from ::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
::/64 dev br-wan  metric 256
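Spotting such differences by eye is error-prone, so here is a minimal Python sketch for comparing two saved route listings. The function name route_diff and the shortened sample listings are only for illustration; paste in the full listings in practice.

```python
def route_diff(working: str, broken: str):
    """Return (added, removed) routes, comparing whitespace-normalized lines."""
    def norm(text):
        # Collapse runs of spaces so "br-wan  metric" and "br-wan metric" match.
        return {" ".join(line.split()) for line in text.splitlines() if line.strip()}
    w, b = norm(working), norm(broken)
    return sorted(b - w), sorted(w - b)

# Shortened excerpts of the two listings above.
WORKING = """\
default from 2001:aaaa:aaaa:bb00::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
2001:aaaa:aaaa:bb00::/64 dev br-wan  metric 256
"""
BROKEN = """\
default from ::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
default from 2001:aaaa:aaaa:bb00::/64 via fe80::dddd:dddd:dddd:b650 dev br-wan  metric 512
::/64 dev br-wan  metric 256
2001:aaaa:aaaa:bb00::/64 dev br-wan  metric 256
"""

added, removed = route_diff(WORKING, BROKEN)
for route in added:
    print("+", route)
for route in removed:
    print("-", route)
```

Comparing sets of normalized lines ignores ordering, which helps because the order of routes can differ between captures.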

From a client's perspective, the connection falls back to IPv4. The behavior is not consistent across clients. Some clients lose all their IPv6 addresses, including the ULA address. Others lose only the temporary address generated by the privacy extensions. Unlike the router, the clients can no longer route IPv6 traffic.
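To check quickly which kinds of addresses a client still holds, a rough Python sketch like the following can group them by scope. The helper names classify and group_by_scope are made up for this example, and the sample addresses are the anonymized ones from this thread.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")             # unique local addresses
LINK_LOCAL = ipaddress.ip_network("fe80::/10")     # link-local addresses
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")  # global unicast space

def classify(addr: str) -> str:
    """Return a coarse scope label for one IPv6 address."""
    ip = ipaddress.ip_address(addr)
    if ip in LINK_LOCAL:
        return "link-local"
    if ip in ULA:
        return "ULA"
    if ip in GLOBAL_UNICAST:
        return "global"
    return "other"

def group_by_scope(addrs):
    """Map each scope label to the list of addresses that fall into it."""
    groups = {}
    for addr in addrs:
        groups.setdefault(classify(addr), []).append(addr)
    return groups

client_addrs = [
    "2001:aaaa:aaaa:bbf0:dddd:dddd:dddd:16e9",
    "fddc:404e:e9ed:0:dddd:dddd:dddd:16e9",
    "fe80::dddd:dddd:dddd:16e9",
]
print(group_by_scope(client_addrs))
```

A client that has fallen back to IPv4 but still shows an entry under "global" has kept an address and most likely lost a route instead.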

These are the routes on a client in a working state. The 7ed0 suffix is the gateway, 16e9 is the stateless auto-configured address, and 1e5 is the temporary address.

IPv6 Route Table
===========================================================================
Active Routes:
 If Metric Network Destination      Gateway
  4    301 ::/0                     fe80::dddd:dddd:dddd:7ed0
  1    331 ::1/128                  On-link
  4    301 2001:aaaa:aaaa:bbf0::/60 fe80::dddd:dddd:dddd:7ed0
  4    301 2001:aaaa:aaaa:bbf0::/64 On-link
  4    301 2001:aaaa:aaaa:bbf0:dddd:dddd:dddd:16e9/128
                                    On-link
  4    301 2001:aaaa:aaaa:bbf0:dddd:dddd:dddd:1e5/128
                                    On-link
  4    301 fddc:404e:e9ed::/48      fe80::dddd:dddd:dddd:7ed0
  4    301 fddc:404e:e9ed::/64      On-link
  4    301 fddc:404e:e9ed:0:dddd:dddd:dddd:16e9/128
                                    On-link
  4    301 fddc:404e:e9ed:0:dddd:dddd:dddd:1e5/128
                                    On-link
  4    301 fe80::/64                On-link
  4    301 fe80::dddd:dddd:dddd:16e9/128
                                    On-link
  1    331 ff00::/8                 On-link
  4    301 ff00::/8                 On-link
===========================================================================

These are the routes on a client in a non-working state. The 7ed0 suffix is the gateway, 16e9 is the stateless auto-configured address, and 36a9 is the temporary address.

IPv6 Route Table
===========================================================================
Active Routes:
 If Metric Network Destination      Gateway
  4    291 ::/0                     fe80::dddd:dddd:dddd:7ed0
  1    331 ::1/128                  On-link
  4    291 2001:aaaa:aaaa:bbf0::/60 fe80::dddd:dddd:dddd:7ed0
  4    291 2001:aaaa:aaaa:bbf0::/64 On-link
  4    291 2001:aaaa:aaaa:bbf0:dddd:dddd:dddd:16e9/128
                                    On-link
  4    291 fddc:404e:e9ed::/48      fe80::dddd:dddd:dddd:7ed0
  4    291 fddc:404e:e9ed::/64      On-link
  4    291 fddc:404e:e9ed:0:dddd:dddd:dddd:36a9/128
                                    On-link
  4    291 fddc:404e:e9ed:0:dddd:dddd:dddd:16e9/128
                                    On-link
  4    291 fe80::/64                On-link
  4    291 fe80::dddd:dddd:dddd:16e9/128
                                    On-link
  1    331 ff00::/8                 On-link
  4    291 ff00::/8                 On-link
===========================================================================

Notice that the working state has an additional route.

  4    301 2001:aaaa:aaaa:bbf0:dddd:dddd:dddd:1e5/128
                                    On-link

The routing table seems to be messed up on the clients, and I don't know if I can even trust this output. The output differs between clients, depending on whether IPv6 addresses have disappeared from the client's interface. This is just an example from a single client.

Does anyone know what causes this problem after a period of time, usually within 24 hours? Restarting the WAN6 interface and rebooting the clients solves the problem.

Thank you.

I saw this fix in the OpenWRT git, posted 17 hours ago. Could this be the problem? My WAN prefix size is as big as the delegated prefix on the LAN. Both are /60. Notice that the on-link route is also missing in the non-working state.

Let's switch to the master branch tonight :grinning:. @dedeckeh you're an absolute god.

Should not be the case. Try /64 on lan.

...wait, it already is.

:thinking:

Let us know how the switch works out!

Yeah, I meant to say that the delegated prefix length received on the WAN is equal to the downstream delegated prefix length. Both are /60. Out of that /60 prefix I assigned a /64 prefix to my LAN interface.

The link click counter tells me you didn't even read what the fix was about... Hehe. To clarify:

odhcpd includes RIO RA options according to requirement L3 in RFC7084.
However if the delegated prefix length received on the wan is equal to the
downstream delegated prefix length on the Lan this may pollute the
routing table of type C hosts as the RIO routing entry can take
precedence of the PIO routing entry meaning all traffic for the on link
hosts will go via the router iso direct on link communication.
If the traffic is dropped in the router hosts are unreachable; therefore
don't include RIO options with prefixes and prefix length identical to
those in a PIO RA option
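The rule in that commit message can be sketched in a few lines of Python. This is only an illustration of the described behavior, not odhcpd's actual C code, and the function name rio_prefixes is invented for the example.

```python
import ipaddress

def rio_prefixes(delegated, pio):
    """Return the delegated prefixes to advertise as RIO options.

    Any prefix that is identical (same network and same length) to a prefix
    already advertised in a PIO option is skipped.
    """
    pio_set = {ipaddress.ip_network(p) for p in pio}
    return [str(net) for net in map(ipaddress.ip_network, delegated)
            if net not in pio_set]

# Delegated /60 with a /64 PIO on the LAN: the RIO differs, so it is kept.
print(rio_prefixes(["2001:aaaa:aaaa:bbf0::/60"], ["2001:aaaa:aaaa:bbf0::/64"]))

# Delegated prefix identical to the PIO prefix: the RIO is dropped.
print(rio_prefixes(["2001:aaaa:aaaa:bbf0::/64"], ["2001:aaaa:aaaa:bbf0::/64"]))
```

Comparing parsed networks instead of raw strings means different spellings of the same prefix still count as identical.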

And yeah, will do. I'm not building the master branch, by the way. As the fix in the commit is rather simple, I chose to stay on the stable 19.07.2 instead: just build the odhcpd package and replace it on my existing installation, or backport the fix if there's any incompatibility, though I doubt there is, judging by the commits since 19.07.2. I'll have to wait around 2 days to see whether the changes are effective, and I'll report back here.

As I prefer a stable release over running the master branch, I took the time to compare all commits between e53fec89 (release of v19.07.2) and 5ce07702 (the last commit, with the supposed fix) of the odhcpd project. I couldn't see any changes that would break with the current stable release of OpenWRT. The more recent commits have some nice optimizations in them, so let's include them too instead of only backporting the fix.

I just compiled my own package, removed the existing odhcpd-ipv6only on my device, and installed the updated odhcpd package using opkg. I also tracked the changes in the filesystem during this process. Everything went smoothly.

I can't say yet that this fixes my problem, but I'm pretty sure it will. Here's my compiled package for OpenWRT 19.07.2 with target device ar71xx/nand (MIPS 24KC), built from snapshot 5ce07702 using the OpenWRT SDK.

If you badly need this fix too, please note that my compiled package is most likely not compatible with your device. If you can't wait for the next service release of OpenWRT, you can easily build your own odhcpd package for OpenWRT v19.07.2 using the SDK. You should be done building an updated package in 10 minutes if you know your way around Linux and the SDK.

Download the SDK version 19.07.2 for your target device and set it up. The whole process is documented in the manual. Just skip the 'load package lists' section and do the following instead. First, update the package feeds with the ./scripts/feeds update -a command. Then modify the odhcpd makefile in feeds/base/package/network/services/odhcpd/ to tell the SDK to build odhcpd based on snapshot 5ce07702.

Notice the PKG_SOURCE_DATE, PKG_SOURCE_VERSION, PKG_MIRROR_HASH, and PKG_ASLR_PIE_REGULAR lines.

[..]
PKG_NAME:=odhcpd
PKG_RELEASE:=3

PKG_SOURCE_PROTO:=git
PKG_SOURCE_URL=$(PROJECT_GIT)/project/odhcpd.git
PKG_SOURCE_DATE:=2020-05-04
PKG_SOURCE_VERSION:=5ce077026b991f49d96464587386f93d92f56385
PKG_MIRROR_HASH:=5fcb4e9f219398ac09ab87e942d1a9a3f4c58431dceefa30b429a19d2dba8ff6

PKG_MAINTAINER:=Hans Dedecker <dedeckeh@gmail.com>
PKG_LICENSE:=GPL-2.0

PKG_INSTALL:=1
PKG_CONFIG_DEPENDS:=CONFIG_PACKAGE_odhcpd_$(BUILD_VARIANT)_ext_cer_id
PKG_ASLR_PIE_REGULAR:=1

include $(INCLUDE_DIR)/package.mk
include $(INCLUDE_DIR)/cmake.mk
[..]

After modifying the makefile, make the package available with ./scripts/feeds install odhcpd. Continue the build process as described in the 'usage' section of the manual. When the build is complete, you can remove the existing odhcpd-ipv6only package from your router and install your own odhcpd-ipv6only package using opkg or the LuCI web interface. You can find the newly built package file in bin/packages/<target architecture>/base. Only deploy odhcpd-ipv6only.ipk, as the other packages are dependencies that are already installed on your device.

Feel free to message me if you can't figure out how to build your own. I'm happy to build one for your device while we wait for the next OpenWRT service release.

I can't edit my posts anymore, but I can confirm that upgrading to the most recent odhcpd build works for OpenWRT 19.07.2. You need to compile odhcpd yourself or wait for the next OpenWRT service release.

So for clarity: the problem was that the routing table gets messed up if the upstream delegated prefix length is the same as the downstream delegated prefix length from OpenWRT's perspective.

Best timing to finally upgrade from my LEDE installation with radvd :smile:
