IPv6 source address selection broken for packets generated on the router

Hi, while debugging DHCPv6 issues on my router, I noticed that IPv6 messages generated on the router have the wrong source address.

Disclaimer: I replaced the actual IPv6 prefixes with random values. Call me stupid or overly cautious. Fake addresses start with 1234:4567.

The problem

Here is the output of ip -6 addr:

[snip]
3: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 1234:4567:1111::1/60 scope global dynamic
       valid_lft 86222sec preferred_lft 3422sec
    inet6 fdec:69e6:7d50::1/60 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::f29f:c2ff:fe60:91e0/64 scope link
       valid_lft forever preferred_lft forever
5: eth0.2@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 1234:4567:89ab:cdef:f29f:c2ff:fe60:91e1/64 scope global dynamic
       valid_lft 2591818sec preferred_lft 604618sec
    inet6 1234:4567:89ab:cdef:1559:813d:406f:6f4d/128 scope global dynamic
       valid_lft 86222sec preferred_lft 3422sec
    inet6 fe80::f29f:c2ff:fe60:91e1/64 scope link
       valid_lft forever preferred_lft forever

My ISP assigns 1234:4567:89ab:cdef::/64 to the public router interface and passes 1234:4567:1111::/48 down via DHCPv6-PD.

Next, I captured an ICMPv6 ping to ipv6.google.com, issued on the router itself, using Wireshark on the router. The round trip looks like this:

43	173.811099	1234:4567:1111::1	2a00:1450:400a:802::200e	ICMPv6	118	Echo (ping) request id=0x1a40, seq=1, hop limit=64 (reply in 44)
44	173.812327	2a00:1450:400a:802::200e	1234:4567:1111::1	ICMPv6	118	Echo (ping) reply id=0x1a40, seq=1, hop limit=55 (request in 43)

Notice how the source address is 1234:4567:1111::1. My ISP assigns 1234:4567:1111::/48 to me using DHCPv6-PD, but that address is only assigned to br-lan and not to eth0.2. Clearly the ping packet left the router on eth0.2 (in fact, that was the only interface I pointed Wireshark at). So I'd expect it to pick one of the addresses assigned to eth0.2, i.e. some address in 1234:4567:89ab:cdef::/64.

The problem is not unique to ping packets. DHCPv6 packets to my ISP are also sent from an address in the DHCPv6-PD prefix and not from an address of the external interface. (My ISP says this might be the source of my DHCPv6 problems, but that is a different story.)

Investigation

After reading up on IPv6 source address selection, a surprisingly interesting and complicated topic, I confirmed that Linux should pick an address from the outgoing interface. So let's check the routing table for IPv6:

default from 1234:4567:89ab:cdef:1559:813d:406f:6f4d via fe80::ca9c:1dff:fe93:343f dev eth0.2 proto static metric 512 pref medium
default from 1234:4567:89ab:cdef::/64 via fe80::ca9c:1dff:fe93:343f dev eth0.2 proto static metric 512 pref medium
default from 1234:4567:1111::/48 via fe80::ca9c:1dff:fe93:343f dev eth0.2 proto static metric 512 pref medium
1234:4567:89ab:cdef::/64 dev eth0.2 proto static metric 256 pref medium
1234:4567:1111::/64 dev br-lan proto static metric 1024 pref medium
[snip]

Whoa... the default routes are gated on the source address.

Conclusion: in order to know my outgoing interface I have to know my source address, but to know the source address I have to know the outgoing interface.
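To make the effect of the source-gated routes concrete, here is a small Python sketch. It is not the kernel's actual lookup, only plain prefix matching with the standard ipaddress module, and it models just the three default routes shown above. The function name and the route list are made up for illustration.

```python
import ipaddress

# "from" prefixes of the three default routes listed above (all via eth0.2).
# Anything hidden behind [snip] in the routing table is not modeled here.
DEFAULT_ROUTE_SOURCES = [
    "1234:4567:89ab:cdef:1559:813d:406f:6f4d/128",
    "1234:4567:89ab:cdef::/64",
    "1234:4567:1111::/48",
]

def eligible_default_routes(source):
    """Return the 'from' prefixes of the default routes that accept this source.

    Hypothetical helper: it only checks whether the source address lies
    inside each route's source prefix, in the order the routes are listed.
    """
    addr = ipaddress.ip_address(source)
    return [
        prefix
        for prefix in DEFAULT_ROUTE_SOURCES
        if addr in ipaddress.ip_network(prefix)
    ]

if __name__ == "__main__":
    for src in ("1234:4567:1111::1",
                "1234:4567:89ab:cdef:f29f:c2ff:fe60:91e1",
                "fdec:69e6:7d50::1"):
        print(src, "->", eligible_default_routes(src))
```

In this model the br-lan address falls under the /48 route and the eth0.2 address under the /64 route, while the fdec: address on br-lan matches none of the three. Which default route is eligible can only be decided once a source address is known, which is exactly the circular dependency described above.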

Trying to fix it

Now, I tried to replace the 3 default route entries with just a single one:

default via fe80::ca9c:1dff:fe93:343f dev eth0.2 proto static metric 512 pref medium

When I retried ping -6 ipv6.google.com, Wireshark showed:

259	1507.821756	1234:4567:89ab:cdef:f29f:c2ff:fe60:91e1	2a00:1450:400a:802::200e	ICMPv6	118	Echo (ping) request id=0xc943, seq=0, hop limit=64 (reply in 260)
260	1507.822988	2a00:1450:400a:802::200e	1234:4567:89ab:cdef:f29f:c2ff:fe60:91e1	ICMPv6	118	Echo (ping) reply id=0xc943, seq=0, hop limit=55 (request in 259)

Notice how the source address is now actually from a prefix assigned to eth0.2. Great!

Actual questions

  1. Can the observed behaviour be considered a bug?
  2. Why is OpenWrt using complicated source routes instead of a single default route?

No idea about DHCPv6, but it certainly affects IPv6 connectivity via a tunnel broker:
https://bugs.openwrt.org/index.php?do=details&task_id=2167

The main reason may be to allow multiple IPv6 WAN interfaces without complicated policy routing. It also stops the ULAs from reaching the internet, since there won't be any default routes for them.

BTW, it will use the first global IPv6 address it finds. If you create an interface with ifname @loopback and an address from a delegated/routed prefix, it will use that instead of the LAN address.

For some reason it doesn't work for 6in4-tunnel.

I use a 6in4 tunnel and I have no problem using ping6 on the router. However, ip -6 route get won't succeed unless I specify a from address, though that shouldn't be a problem.

BTW, regarding the problem with ip6prefix mentioned in the bug report: ip6prefix is (or at least can be) a list, which allows you to specify multiple prefixes for each interface.

Thank you for all your replies!

How does this enable multiple WAN interfaces? Would a binary need to bind a socket to a source IPv6 address before making a connection or sending a packet? While this is 100% possible, I doubt most UDP/TCP clients do that (servers are probably different). If a socket is not bound to a source address, I don't see how the current routing scheme works with multiple WAN interfaces.

As for ULAs, one could have a dedicated routing rule for fc00::/7.
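As a quick sanity check of which addresses such an fc00::/7 rule would cover, here is a minimal Python sketch using the standard ipaddress module. The helper name is made up for illustration, and it only tests prefix membership; it says nothing about how the rule itself would be configured.

```python
import ipaddress

# Unique local addresses (ULAs) are the addresses inside fc00::/7.
ULA_PREFIX = ipaddress.ip_network("fc00::/7")

def is_ula(address):
    """Return True if the IPv6 address lies inside fc00::/7 (hypothetical helper)."""
    return ipaddress.ip_address(address) in ULA_PREFIX

if __name__ == "__main__":
    # Addresses taken from the ip -6 addr output earlier in the thread.
    for addr in ("fdec:69e6:7d50::1", "1234:4567:1111::1", "fe80::f29f:c2ff:fe60:91e0"):
        print(addr, is_ula(addr))
```

Of the br-lan addresses shown earlier, only fdec:69e6:7d50::1 falls inside fc00::/7; the delegated-prefix address and the link-local address do not.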

I will try that, thank you for the idea!

I was reading http://www.davidc.net/networking/ipv6-source-address-selection-linux among other pages, and I was under the assumption that source address selection is a lot more sophisticated than that. It sounds like, without any special setup, it should pick the address on the packet's outgoing interface that shares the longest prefix with the destination address. TBH that sounds very reasonable, and I don't like OpenWrt working around this logic.
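The longest-matching-prefix idea can be sketched in a few lines of Python. This is only that one tie-breaking step, written under simplifying assumptions: real source address selection (RFC 6724) applies several other rules first, such as scope and labels, and the kernel's implementation is not reproduced here. The function names and the destination address in the example are made up for illustration.

```python
import ipaddress

def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv6 addresses (0 to 128)."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    # The position of the highest differing bit tells how many leading bits agree.
    return 128 - diff.bit_length()

def pick_source(candidates, destination):
    """Pick the candidate sharing the longest prefix with the destination.

    Simplified sketch: ties go to the first candidate, and no other
    selection rule (scope, deprecation, labels, ...) is considered.
    """
    return max(candidates, key=lambda c: common_prefix_len(c, destination))

if __name__ == "__main__":
    candidates = [
        "1234:4567:1111::1",                        # br-lan address from the thread
        "1234:4567:89ab:cdef:f29f:c2ff:fe60:91e1",  # eth0.2 address from the thread
    ]
    # Hypothetical destination inside the WAN /64, chosen for illustration only.
    print(pick_source(candidates, "1234:4567:89ab:cdef::99"))
```

For a destination such as ipv6.google.com, both candidates share only a couple of leading bits with it, so this step alone would not separate them; the preference for an address on the outgoing interface comes from a different rule.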

If you are talking about the router itself, then I guess so. But you could use ip6class and have one LAN network that uses the IPv6 prefix from one WAN, and another LAN network that uses the IPv6 prefix from the other WAN. You can also use both prefixes on the same LAN, but then it's up to the hosts on the LAN to select the source address. On Linux, address selection can be configured using ip addrlabel. From the man page:

IPv6 address labels are used for address selection; they are described in RFC 3484. Precedence is managed by userspace, and only the label itself is stored in the kernel.