OpenWrt Forum Archive

Topic: 6rd and ipv6-support

The content of this topic has been archived between 12 Jul 2015 and 30 Apr 2018. Unfortunately there are posts – most likely complete pages – missing.

CyrusFF wrote:

Ummn, do you mean a second router on the same link with "secondary router"?

I have a "primary router" handling the connectivity to ISP. That router has 6in4 tunnel etc. and runs dnsmasq and 6relayd etc.

And then I have a second OpenWrt router, which I called a "secondary router". That router is connected via wired LAN to the primary router in the same LAN subnet and mostly acts as a second wifi access point, but it has no role in DHCP and should also have no need for 6relayd, right?

(Last edited by hnyman on 13 Sep 2013, 14:46)

CyrusFF wrote:

Yes, but what should be the result? Assume the server is bad and restart the whole exchange? Try to deduce some value from the IAs already there?

I figured out at least one way to cause the timeout error:
If I re-flash my "primary router" acting as the 6relayd server and reboot it after the flash operation, then odhcp6c on the secondary router gets stuck in "poll again in 300 seconds" mode (which after your fix probably corresponds to the former UINT32_MAX timeout). I rebooted the first router at 20:07, so after the next RENEW round odhcp6c seems to acknowledge a valid response, but the timeout remains 300 seconds, as odhcp6c is somehow stuck with the invalid timeout.

Could it be that if the server instance changes, odhcp6c somehow doesn't fully accept the response from the new server? I am thinking of your comment about "restart the whole exchange". Maybe that should be done?

Tue Sep 17 18:50:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 1800s)
Tue Sep 17 19:20:56 2013 daemon.notice odhcp6c[887]: Sending RENEW (timeout 1080s)
Tue Sep 17 19:20:56 2013 daemon.notice odhcp6c[887]: Got a valid reply after 1ms
Tue Sep 17 19:20:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 1800s)
Tue Sep 17 19:50:56 2013 daemon.notice odhcp6c[887]: Sending RENEW (timeout 1080s)
Tue Sep 17 19:50:56 2013 daemon.notice odhcp6c[887]: Got a valid reply after 1ms
Tue Sep 17 19:50:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 1800s)
<< first router rebooted at this point >>
Tue Sep 17 20:20:56 2013 daemon.notice odhcp6c[887]: Sending RENEW (timeout 1080s)
Tue Sep 17 20:20:56 2013 daemon.notice odhcp6c[887]: Got a valid reply after 2ms
Tue Sep 17 20:20:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 300s)
Tue Sep 17 20:25:56 2013 daemon.notice odhcp6c[887]: Sending RENEW (timeout 300s)
Tue Sep 17 20:25:56 2013 daemon.notice odhcp6c[887]: Got a valid reply after 2ms
Tue Sep 17 20:25:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 300s)
Tue Sep 17 20:30:56 2013 daemon.notice odhcp6c[887]: Sending RENEW (timeout 300s)
Tue Sep 17 20:30:56 2013 daemon.notice odhcp6c[887]: Got a valid reply after 1ms
Tue Sep 17 20:30:56 2013 daemon.notice odhcp6c[887]: Sending <POLL> (timeout 300s)

EDIT:
I tested: a normal reboot of the router running 6relayd is enough to make the other router's odhcp6c get stuck in "renew in 300s" mode.

(Last edited by hnyman on 17 Sep 2013, 21:58)

Thank you very much for your help.
I found the issue: when the server forgets about your lease (e.g. through a restart), it includes an error code in the lease. The old behaviour was to ignore that error and simply not parse it, which led to this strange behaviour. I just committed a fix that restarts the transaction if the client receives such an error status (e.g. NoBinding).
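To make the fix described above more concrete, here is a minimal C sketch of how a client might classify the status code found inside an IA option. It is not odhcp6c's actual code: the function names are hypothetical, and the policy "any non-Success status triggers a restart" is an assumption made for illustration. Only the numeric status values are taken from RFC 3315 (section 24.4) and RFC 3633.

```c
#include <stdbool.h>
#include <stdint.h>

/* DHCPv6 status codes from RFC 3315 section 24.4; code 6 is from RFC 3633. */
enum dhcpv6_status {
    DHCPV6_STATUS_SUCCESS         = 0,
    DHCPV6_STATUS_UNSPEC_FAIL     = 1,
    DHCPV6_STATUS_NO_ADDRS_AVAIL  = 2,
    DHCPV6_STATUS_NO_BINDING      = 3,
    DHCPV6_STATUS_NOT_ON_LINK     = 4,
    DHCPV6_STATUS_USE_MULTICAST   = 5,
    DHCPV6_STATUS_NO_PREFIX_AVAIL = 6,
};

/* Hypothetical helper: map a status code to its RFC name for log output. */
const char *dhcpv6_status_name(uint16_t code)
{
    static const char *const names[] = {
        "Success", "UnspecFail", "NoAddrsAvail", "NoBinding",
        "NotOnLink", "UseMulticast", "NoPrefixAvail",
    };

    if (code < sizeof(names) / sizeof(names[0]))
        return names[code];
    return "Unknown";
}

/*
 * Assumed policy (an illustration, not odhcp6c's logic): a non-Success
 * status inside an IA of a RENEW reply means the server no longer knows
 * the lease, so the client should drop its state and start over with
 * SOLICIT instead of keeping the stale timers.
 */
bool ia_status_requires_restart(uint16_t ia_status)
{
    return ia_status != DHCPV6_STATUS_SUCCESS;
}
```

With these definitions, dhcpv6_status_name(3) yields "NoBinding", which is the status a restarted server would typically report for a lease it has forgotten.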

CyrusFF wrote:

Thank you very much for your help.

Thanks. I have a stable environment with two identical routers that I regularly flash to the newest version, so I guess I will spot issues like that more clearly than most users ;-)

But could you please consider adding that uci config option to the 6relayd config, to make it possible to prevent its launch? Such a uci option would survive a flash and would help suppress 6relayd on routers that do not need it.

(Last edited by hnyman on 18 Sep 2013, 12:22)

I'm currently in the middle of refactoring 6relayd and will rewrite the startup behaviour, though it may still take a few weeks until it's committed.

Hi CyrusFF, why does the IPv6 prefix section appear in the Interface section of LuCI for trunk but not for Attitude Adjustment?
OK, I found it:
the /etc/config/network changes were not merged.
https://dev.openwrt.org/changeset/36384/
Is it possible for you to push all the required changes to AA?

(Last edited by alphasparc on 28 Sep 2013, 12:42)

I've been having trouble with 6relayd; I think I found a bug and would like to propose a solution.

Basically, the bug is that 6relayd is advertising bad route information on RAs -- please see https://forum.openwrt.org/viewtopic.php?id=46584 for the full story.

It seems that there is a bug in the function send_router_advert in router.c. Specifically, if I change the data structure "routes" to accommodate a 128-bit address and make the necessary changes, everything works as I think it should.

Here is the expanded structure definition:

    struct {
        uint8_t type;
        uint8_t len;
        uint8_t prefix;
        uint8_t flags;
        uint32_t lifetime;
        uint32_t addr[4];
    } routes[RELAYD_MAX_PREFIXES];

Note that addr[2] is now addr[4].
The last two lines of the following are the extra code that fills the two additional uint32_t fields:

        routes[routes_cnt].addr[0] = addr->addr.s6_addr32[0];
        routes[routes_cnt].addr[1] = addr->addr.s6_addr32[1];
        routes[routes_cnt].addr[2] = addr->addr.s6_addr32[2];
        routes[routes_cnt].addr[3] = addr->addr.s6_addr32[3];

Does it look plausible? It certainly fixes the problem I reported, and kinda makes sense to me. FYI, I'm running a current version of BB on a Linksys NSLU2, which is an ARM running big-endian.
Is it worth filing a patch with OpenWrt?
Regards
Mike

Nope, it is OK to cut off trailing bytes in the route struct so this shouldn't be problematic. I've seen that radvdump has problems with deciphering those but they are technically correct. Do you have any other clues on why this should be problematic the way it is and what your solution improves other than the output of radvdump?

CyrusFF wrote:

Nope, it is OK to cut off trailing bytes in the route struct so this shouldn't be problematic. I've seen that radvdump has problems with deciphering those but they are technically correct. Do you have any other clues on why this should be problematic the way it is and what your solution improves other than the output of radvdump?

The main improvement is that the routing works after the fix, and doesn't work without it. The fix has completely restored IPv6 connectivity between my private network (Macs, Linux boxes, printers, etc.) and the internet. I was just using radvdump as a diagnostic.

I've assumed that radvdump is reporting the real content of the RA, because what was happening before the fix corresponds with what I'd expect to happen with faulty routing information -- devices using the RAs were unable to ping6 through the router. Then, having applied the fix (and seen the improvement in the radvdump outputs), devices were indeed able to ping6 through. BTW, the device running radvdump is a FreeBSD box.

TBH, I don't understand how it could be OK to cut off the trailing bytes, because by cutting them off, surely the rightmost 8 bytes of the IPv6 router address are cut off, so the data being passed in the RA would not actually contain a full route address at all. By my reasoning, the odd-looking route addresses displayed were constructed using four bytes of prefix and four bytes of junk from beyond the end of the data structure. It seems consistent with what radvdump was reporting and what was actually happening in terms of routing. I have to admit, though, I'm guessing a lot of this on the basis of what seems plausible... :-)

(Last edited by mikebrady on 3 Oct 2013, 16:54)

No, that isn't the point. These route options indicate that a target network (which is the last part of the route struct) is reachable through the router that sent the advert. So the last field doesn't indicate a router but a route target, and thus cutting off a part is legitimate. This is also in compliance with http://tools.ietf.org/html/rfc4191#section-2.3.

Just checking again with radvdump shows the same crap that you encountered, e.g. route 2001:db80::1903:0:0:708/48.
However, looking at Wireshark dumps, the route is recognized correctly there, and it is also obvious that the random garbage after the route is actually a product of radvdump's imagination ;-)

But having seen so much crap in IPv6 implementations, especially of the kind you mentioned, it could very well be clients having issues. Nevertheless, if this really fixes things for you, you can send me a patch ( cyrus - at - openwrt - dot - org) and I can evaluate it for my queue.
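As an illustration of the RFC 4191 section 2.3 rule discussed in this exchange, the following C sketch encodes a single Route Information Option. The option's Length field is counted in units of 8 octets and depends on the prefix length: 1 is enough for a /0, 2 covers prefixes up to /64, and 3 is required for anything longer. That is why a 16-byte struct with addr[2] is sufficient for prefixes of /64 or shorter. The function names are hypothetical and this is not 6relayd's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ND_OPT_ROUTE_INFORMATION 24 /* option type assigned by RFC 4191 */

/* Smallest valid Length value (in units of 8 octets) for a prefix length. */
uint8_t route_info_len_units(uint8_t prefix_len)
{
    if (prefix_len == 0)
        return 1;   /* header only, no prefix bytes */
    if (prefix_len <= 64)
        return 2;   /* header + first 8 prefix bytes */
    return 3;       /* header + all 16 prefix bytes */
}

/*
 * Hypothetical encoder: writes one Route Information Option into buf.
 * Returns the number of bytes written, or 0 if prefix_len is invalid
 * or buf is too small. Bits beyond prefix_len are sent as zero.
 */
size_t route_info_encode(uint8_t *buf, size_t buflen,
                         const uint8_t prefix[16], uint8_t prefix_len,
                         uint8_t flags, uint32_t lifetime)
{
    if (prefix_len > 128)
        return 0;

    uint8_t units = route_info_len_units(prefix_len);
    size_t total = (size_t)units * 8;
    if (buflen < total)
        return 0;

    memset(buf, 0, total);
    buf[0] = ND_OPT_ROUTE_INFORMATION;
    buf[1] = units;
    buf[2] = prefix_len;
    buf[3] = flags;                      /* route preference bits */
    buf[4] = (uint8_t)(lifetime >> 24);  /* lifetime, network byte order */
    buf[5] = (uint8_t)(lifetime >> 16);
    buf[6] = (uint8_t)(lifetime >> 8);
    buf[7] = (uint8_t)lifetime;

    /* Copy only the significant prefix bits. */
    size_t full = prefix_len / 8;
    uint8_t rem = prefix_len % 8;
    memcpy(buf + 8, prefix, full);
    if (rem)
        buf[8 + full] = prefix[full] & (uint8_t)(0xff << (8 - rem));

    return total;
}
```

For a /48 prefix this produces a 16-byte option whose last two bytes are zero padding, so a receiver must use the Length field to decide how many prefix bytes are actually present. A tool that always reads 16 prefix bytes would pick up whatever follows the option, which would be consistent with the garbage seen in the radvdump output.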

Thanks for the clarification. As you say, in spite of being strictly unnecessary, it might be useful. I'll send the patch.

I have another special situation for you:
Thanks to a buggy wireless driver causing an endless reboot loop, I had trouble with my main router today and had to flash and boot the router multiple times.

Apparently I booted the main router (6relayd server) just after the secondary router (odhcp6c client) had received an answer to SOLICIT and sent a REQUEST message. The booting server then did not answer, and as the timeout is really long, the client is apparently still waiting for the answer.

  Thu Oct 10 20:19:56 2013 daemon.notice odhcp6c[1115]: Sending REQUEST (timeout 4294967295s)

Does the timeout for REQUEST really need to be that long?

Thu Oct 10 19:49:55 2013 daemon.notice odhcp6c[1115]: Sending RENEW (timeout 1080s)
Thu Oct 10 19:49:55 2013 daemon.notice odhcp6c[1115]: Got a valid reply after 1ms
Thu Oct 10 19:49:55 2013 daemon.notice odhcp6c[1115]: Sending <POLL> (timeout 1800s)
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: Sending RENEW (timeout 1080s)
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: Got a valid reply after 2ms
Thu Oct 10 20:19:55 2013 daemon.warn odhcp6c[1115]: Server returned IAID status 3!
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: Sending RELEASE (timeout 3s)
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: (re)starting transaction on br-lan
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: Sending SOLICIT (timeout 4294967295s)
Thu Oct 10 20:19:55 2013 daemon.notice odhcp6c[1115]: Got a valid reply after 37ms
Thu Oct 10 20:19:55 2013 daemon.notice netifd: Interface 'lan6' has lost the connection
Thu Oct 10 20:19:55 2013 daemon.notice netifd: Interface 'lan6' is now up
Thu Oct 10 20:19:56 2013 daemon.notice odhcp6c[1115]: Sending REQUEST (timeout 4294967295s)
Thu Oct 10 20:19:56 2013 daemon.info dnsmasq[1385]: reading /tmp/resolv.conf.auto
Thu Oct 10 20:19:56 2013 daemon.info dnsmasq[1385]: using nameserver fd4d:a3ef:f00::1#53
Thu Oct 10 20:19:56 2013 daemon.info dnsmasq[1385]: using nameserver 192.168.1.1#53
Thu Oct 10 20:19:56 2013 daemon.info dnsmasq[1385]: using local addresses only for domain lan
Thu Oct 10 20:35:52 2013 daemon.info dnsmasq[1385]: reading /tmp/resolv.conf.auto
Thu Oct 10 20:35:52 2013 daemon.info dnsmasq[1385]: using nameserver 2001:xxxx:yyy::1#53
Thu Oct 10 20:35:52 2013 daemon.info dnsmasq[1385]: using nameserver fd4d:a3ef:f00::1#53
Thu Oct 10 20:35:52 2013 daemon.info dnsmasq[1385]: using nameserver 192.168.1.1#53
Thu Oct 10 20:35:52 2013 daemon.info dnsmasq[1385]: using local addresses only for domain lan
Thu Oct 10 21:36:25 2013 daemon.info dnsmasq[1385]: reading /tmp/resolv.conf.auto
Thu Oct 10 21:36:25 2013 daemon.info dnsmasq[1385]: using nameserver fd4d:a3ef:f00::1#53
Thu Oct 10 21:36:25 2013 daemon.info dnsmasq[1385]: using nameserver 192.168.1.1#53
Thu Oct 10 21:36:25 2013 daemon.info dnsmasq[1385]: using local addresses only for domain lan
Thu Oct 10 22:03:38 2013 daemon.info dnsmasq[1385]: reading /tmp/resolv.conf.auto
Thu Oct 10 22:03:38 2013 daemon.info dnsmasq[1385]: using nameserver 2001:xxxx:yyy::1#53

(Last edited by hnyman on 11 Oct 2013, 10:31)

OK, thanks. I set the timeout to 60s now, which is not quite what the standard says but should be good enough for now.
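For reference, the standard (RFC 3315, section 14) does not give REQUEST a single fixed timeout. It uses an exponentially backed-off retransmission timer with an initial value of 1 s (REQ_TIMEOUT), a cap of 30 s (REQ_MAX_RT) and at most 10 transmissions (REQ_MAX_RC). The sketch below is an illustration of that algorithm, not odhcp6c's implementation; the function names are hypothetical, and the random jitter is passed in as a parameter so the arithmetic stays reproducible.

```c
#include <stdint.h>

/* REQUEST retransmission parameters from RFC 3315 section 5.5. */
#define REQ_TIMEOUT_MS 1000.0   /* initial retransmission time (IRT) */
#define REQ_MAX_RT_MS  30000.0  /* maximum retransmission time (MRT) */
#define REQ_MAX_RC     10       /* maximum number of transmissions (MRC) */

/*
 * Next retransmission timeout per RFC 3315 section 14.
 * prev_rt_ms <= 0 means "first transmission".
 * rand_factor is the jitter term RAND, expected in [-0.1, +0.1].
 * mrt_ms == 0 means "no upper bound".
 */
double dhcpv6_next_rt(double prev_rt_ms, double irt_ms, double mrt_ms,
                      double rand_factor)
{
    double rt;

    if (prev_rt_ms <= 0.0)
        rt = irt_ms + rand_factor * irt_ms;
    else
        rt = 2.0 * prev_rt_ms + rand_factor * prev_rt_ms;

    if (mrt_ms > 0.0 && rt > mrt_ms)
        rt = mrt_ms + rand_factor * mrt_ms;

    return rt;
}

/* Total time spent waiting on REQUEST before giving up, with zero jitter. */
double dhcpv6_request_total_wait_ms(void)
{
    double rt = 0.0;
    double total = 0.0;

    for (int i = 0; i < REQ_MAX_RC; i++) {
        rt = dhcpv6_next_rt(rt, REQ_TIMEOUT_MS, REQ_MAX_RT_MS, 0.0);
        total += rt;
    }
    return total;
}
```

Under these assumptions the waits are 1, 2, 4, 8 and 16 s followed by five waits of 30 s, so a client would give up on REQUEST after roughly three minutes rather than waiting 4294967295 s (UINT32_MAX) for a server that never answers.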

Looks like I lost ipv6 connectivity between 38578 and 38610. Might be due to recent netifd changes?

When looking at the changes, there is not much else that could cause this (for an ar71xx build).
https://dev.openwrt.org/changeset?new=38610%40trunk&old=38578%40trunk

The sixxs tunnel interface does not come up, not even with ifup.
And the LAN interface only gets a link-local address.

EDIT:
Reverting just today's netifd changes brings IPv6 connectivity back. That is, the build is otherwise r38610, but with netifd as it was before the changes in r38606.

(Last edited by hnyman on 30 Oct 2013, 20:54)

Are there issues with ipv4 as well? What kind of config do you use?

nbd wrote:

Are there issues with ipv4 as well? What kind of config do you use?

I didn't notice anything special with ipv4. Traffic flows ok.

I have a static 6in4 tunnel for sixxs.

But the LAN also did not get an IPv6 address, and that is controlled by 6relayd, I guess.

One of my suspicions is that the protocol scripts (6in4.sh etc.) have not yet been adapted to the netifd functions being moved around between scripts. E.g. 6in4.sh references netifd-proto.sh, which is now marked non-executable, right?   http://nbd.name/gitweb.cgi?p=openwrt.gi … 43;hb=HEAD  and http://nbd.name/gitweb.cgi?p=luci2/neti … 7e;hb=HEAD

Similarly, some of the functions have been moved to utils.sh. I am not sure whether the IPv6 protocol handlers have been adapted to the changes (if the changes actually do require that kind of update to the script references).

EDIT:
Fixed by r38627, or actually by the underlying change in netifd: http://nbd.name/gitweb.cgi?p=luci2/neti … ddfcdc5cb9

(Last edited by hnyman on 31 Oct 2013, 16:26)

Hi CyrusFF,
I no longer have IPv6 routing. The WAN and the clients do have a public IPv6 address.

Edit1: Yet everything seems ok with the ping test.

ping -6 ipv6.google.com

Pinging ipv6.l.google.com [2a00:1450:400c:c05::67] with 32 bytes of data:
Reply from 2a00:1450:400c:c05::67: time=675ms
Reply from 2a00:1450:400c:c05::67: time=29ms
Reply from 2a00:1450:400c:c05::67: time=28ms
Reply from 2a00:1450:400c:c05::67: time=28ms

Ping statistics for 2a00:1450:400c:c05::67:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 28ms, Maximum = 675ms, Average = 190ms

Edit2: I think 6relayd returns a bad gateway: an fe80:: address instead of a 2a01:: one.

(Last edited by Manani on 23 Nov 2013, 12:48)

I am running r38896 and for Apple clients (iPhone, iPad etc.) everything is hunky-dory. However, my Android device, which previously worked fine (with r386xx versions on another platform that I have since replaced), is now almost always without a prefix after a few hours.

I also note that I am getting a firewall reload every time the ISP sends me an RA:

Sun Dec  1 04:01:54 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Sun Dec  1 04:13:29 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Sun Dec  1 04:25:30 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Sun Dec  1 04:37:55 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()

Could I be missing a package in my build? Other than 6relayd and odhcp6c, I shouldn't need anything in particular, should I? Or do I need odhcpd? I'd love to figure out what has changed to break my Androids. Is anyone else having issues, or is it just me?

I have tried with ULA prefix compat, without a ULA prefix at all, with the DHCPv6 server disabled, stateless only, stateful+stateless, and stateful only, and it never seems to keep a prefix longer than 7xxx seconds, or whatever the lifetime in the first RA it receives is.

(Last edited by cconn on 1 Dec 2013, 05:25)

I seem to have lost clients' ipv6 connectivity between 39150 and 39154.
Everything worked with 39150. With 39154, 6relayd starts but does not stay alive (it does not stay in the process list). The router itself can ping ipv6.google.com, but clients do not get connectivity.

39152 made changes to dnsmasq startup handling, and I wonder if that has had an impact.

With 39154 I see an extra log line:
Sat Dec 21 20:08:13 2013 daemon.warn 6relayd[1342]: Termination requested by signal.

EDIT:
Interesting: after flashing 39155 there is no such line in the log and IPv6 works again for clients. I don't see how 39155 could have affected the wired connection, so there might be a problem that surfaces sometimes, but not always.

(Last edited by hnyman on 21 Dec 2013, 22:54)

I have the same issue with current trunk. At boot the router broadcasts the prefix and gives out DHCPv6 leases, but shortly afterwards it all stops.


Thu Jan  1 00:00:56 1970 daemon.notice netifd: Interface 'wan6' is now up
Thu Jan  1 00:00:56 1970 user.notice firewall: Reloading firewall due to ifup of wan6 (wlan0)
Thu Jan  1 00:00:57 1970 daemon.warn 6relayd[851]: Termination requested by signal.
Thu Jan  1 00:00:58 1970 daemon.info dnsmasq[973]: read /etc/hosts - 1 addresses
Thu Jan  1 00:00:58 1970 daemon.info dnsmasq[973]: read /tmp/hosts/6relayd - 0 addresses
Thu Jan  1 00:00:58 1970 daemon.info dnsmasq-dhcp[973]: read /etc/ethers - 0 addresses
Thu Jan  1 00:00:58 1970 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Tue Dec 31 15:33:57 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Tue Dec 31 15:39:13 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()
Tue Dec 31 15:48:39 2013 user.notice firewall: Reloading firewall due to ifup of wan6 ()


So this is with current trunk, no upstream IPv6, and default config (including the default WAN6 DHCP)

I have noticed with the last few trunk builds in late December that 6relayd may not stay alive during the startup process. It is mentioned in the logs, but the process silently disappears somewhere before the boot is complete. Router itself can ping ipv6.google.com, but LAN clients do not get ipv6 connectivity as 6relayd is not there providing RA.

Issuing a "/etc/init.d/6relayd restart" command fixes things. No changes needed for the config.

I guess that this might be due to the recent changes in netifd startup order.

I have filed bug #14710 about this: https://dev.openwrt.org/ticket/14710

EDIT: Fixed by r39184.

(Last edited by hnyman on 26 Jan 2014, 22:16)

Since 6relayd was retired, I only get IPv6 on the public WAN interface. The problem remains with the default settings after a new installation.
IPv6 test on the router is correct.
To recap, I have native 6rd.

Edit: I do not know what really happened; everything is OK after the nth installation.

(Last edited by Manani on 26 Jan 2014, 21:58)

Just rebuilt r39479, and the DHCPv6 server is now refusing requests from the previously working odhcp6c:

Feb  5 14:05:53.218: IPv6 DHCP: Received SOLICIT from FE80::68ED:9D14:A4DE:44CC on Virtual-Access3.1104
Feb  5 14:05:53.218: IPv6 DHCP: detailed packet contents
Feb  5 14:05:53.218:   src FE80::68ED:9D14:A4DE:44CC (Virtual-Access3.1104)
Feb  5 14:05:53.218:   dst FF02::1:2
Feb  5 14:05:53.218:   type SOLICIT(1), xid 5331819
Feb  5 14:05:53.218:   option ELAPSED-TIME(8), len 2
Feb  5 14:05:53.218:     elapsed-time 0
Feb  5 14:05:53.218:   option ORO(6), len 20
Feb  5 14:05:53.218:     SIP-DOMAIN,SIP-ADDRESS,DNS-SERVERS,DOMAIN-LIST,SNTP-ADDRESS,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN
Feb  5 14:05:53.218:   option CLIENTID(1), len 10
Feb  5 14:05:53.218:     00030001F81A67A4BD24
Feb  5 14:05:53.218:   option UNKNOWN(39), len 9
Feb  5 14:05:53.218:   option IA-NA(3), len 12
Feb  5 14:05:53.218:     IAID 0x00000001, T1 0, T2 0
Feb  5 14:05:53.218:   option IA-PD(25), len 41
Feb  5 14:05:53.218:     IAID 0x00000001, T1 0, T2 0
Feb  5 14:05:53.218:     option IAPREFIX(26), len 25
Feb  5 14:05:53.218:       preferred 0, valid 0, prefix ::/48
Feb  5 14:05:53.218: IPv6 DHCP: Option UNKNOWN(39) is not processed
Feb  5 14:05:53.218: IPv6 DHCP: Using interface pool v6POOL
Feb  5 14:05:53.218: IPv6 DHCP: No binding for IA_PD 00000001
Feb  5 14:05:53.218: IPv6 DHCP: Reclaiming addresses for client FE80::68ED:9D14:A4DE:44CC 1


And on the WRT:

Wed Feb  5 19:15:49 2014 daemon.notice odhcp6c[5678]: (re)starting transaction on pppoe-wan
Wed Feb  5 19:15:49 2014 daemon.notice odhcp6c[5678]: Starting SOLICIT transaction (timeout 4294967295s, max rc 0)
Wed Feb  5 19:15:49 2014 daemon.notice odhcp6c[5678]: Got a valid reply after 6ms
Wed Feb  5 19:15:50 2014 daemon.notice odhcp6c[5678]: Starting SOLICIT transaction (timeout 4294967295s, max rc 0)
Wed Feb  5 19:15:50 2014 daemon.notice odhcp6c[5678]: Got a valid reply after 6ms
Wed Feb  5 19:15:51 2014 daemon.notice odhcp6c[5678]: Starting REQUEST transaction (timeout 4294967295s, max rc 10)
Wed Feb  5 19:15:51 2014 daemon.notice odhcp6c[5678]: Send REQUEST message (elapsed 0ms, rc 0)
Wed Feb  5 19:15:51 2014 daemon.notice odhcp6c[5678]: Got a valid reply after 6ms
Wed Feb  5 19:15:51 2014 daemon.warn odhcp6c[5678]: Server returned IA_NA status 2 (NOADDRS-AVAIL)

Downgraded back to r39379 and it's all working again.

Also, I seem to have lost the 6relayd config menu in LuCI. What option do I have to enable to get it back?

You can configure DHCPv6 alongside DHCP now in the interface config. Static leases can be configured along the dhcp ones. 6relayd is now odhcpd so please check that you have it.

Will have a look at the dhcpv6 client tomorrow.

CyrusFF wrote:

You can configure DHCPv6 alongside DHCP now in the interface config. Static leases can be configured along the dhcp ones. 6relayd is now odhcpd so please check that you have it.

Will have a look at the dhcpv6 client tomorrow.


erf.....6relayd was enabled in the .config but package odhcpd was not....(!!)

I am trying a rebuild without 6relayd and with odhcpd enabled.