RFC 8157 - GRE Tunnel Bonding Protocol

qbdwlr0 · September 10, 2018, 4:08pm

I am investigating the effort to add the signalling protocol outlined in RFC 8157 (https://tools.ietf.org/html/rfc8157) to OpenWRT.
This protocol allows the CPE to use an LTE GRE tunnel as backup or added capacity for a DSL GRE tunnel.

I have been spending time familiarizing myself with OpenWRT in general: building, installing, configuring. I am now at the point where I am looking at the code for other protocols that use signalling to establish connections (DHCP, PPPoE, etc), but I would like some input before I begin making changes to reduce the amount of rewrite/wasted effort.

Is there a particular protocol's source that I should model?
Any advice on this area of development is appreciated.

jeff · September 10, 2018, 4:15pm

First question is if the protocol is supported in the upstream kernel.

If "yes", then it may be mainly a matter of enabling the proper kernel configuration, along with the build-system dependencies (it would seem that, for example, you need GRE to use "fancy GRE").

I'd look at the way GRE itself is handled by netifd.

Edit: That's an "informational" RFC. Also, how are you going to handle what looks to be a provider-end requirement:

A Hybrid Access Aggregation Point (HAAP) is the network function that resides in the provider's networks to terminate these bonded connections.

qbdwlr0 · September 10, 2018, 4:35pm

This is not my individual effort. There is a team that has been working on the HAAP side of this for a while, but without the CPE component, it is not a complete solution. The HAAP team has been using canned router tester interaction for initial test, but I have been tasked with providing an actual client. Ideally, we want to eventually make the CPE component available through OpenWRT.

lleachii · September 10, 2018, 4:54pm

Also from RFC8157:

The solution described in this document is currently implemented by Huawei and deployed by Deutsche Telekom AG.

This is a proprietary protocol...and might be property of Huawei...it does seem they don't mind others implementing it...

This document will enable other developers to build interoperable implementations.

...have you attempted to ask the authors about it?

N. Leymann
C. Heidemann
Deutsche Telekom AG
M. Zhang
B. Sarikaya
Huawei
M. Cullen
Painless Security

qbdwlr0 · September 10, 2018, 5:34pm

Huawei has their own CPE and therefore a complete solution. I am investigating an open source implementation of the protocol.

lleachii · September 10, 2018, 5:38pm

I understood that already; but I asked have you contacted them in order to ask them???

Perhaps, they're free to distribute the code; or know where an Open Source implementation of the protocol could be found.

Because...if not, you must realize that someone has to WRITE this into source code.

For example...the code could be included in an Open Source request (if their firmware is Open Source and required to do so).

qbdwlr0 · September 10, 2018, 5:53pm

Of course I realize that someone has to write the protocol into source code. That someone is me. I am starting to write it now.

qbdwlr0 · September 10, 2018, 11:48pm

If there is no more general advice, I am just going to start asking very specific questions.

I think it does make sense to augment the existing gre implementation rather than duplicating it as a new interface type. The developer's guide doesn't appear to cover this type of development.

For instance, if I think I need to add new attributes to gre.sh for netifd, then which file am I changing?

: find . -name gre.sh
./package/network/config/gre/files/gre.sh
./staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-brcm2708/lib/netifd/proto/gre.sh
./build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/gre-1/.pkgdir/gre/lib/netifd/proto/gre.sh
./build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/gre-1/ipkg-arm_cortex-a7_neon-vfpv4/gre/lib/netifd/proto/gre.sh
./build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-brcm2708/lib/netifd/proto/gre.sh
./build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root.orig-brcm2708/lib/netifd/proto/gre.sh

If I need to make changes to the Linux files, which file needs to be changed?

: find . -name ip_gre.c
./build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-brcm2708_bcm2709/linux-4.9.120/net/ipv4/ip_gre.c
./build_dir/toolchain-arm_cortex-a7+neon-vfpv4_gcc-7.3.0_musl_eabi/linux-4.9.120/net/ipv4/ip_gre.c

jeff · September 10, 2018, 11:54pm

I'd define a new "protocol" for netifd, but you can use gre.sh and the like as a template. You should probably consider it as a package, much as the gre package is created as well -- package/network/config/gre

Other developers might disagree, but especially with a memory-constrained device, minimal is better.

Kernel patching is done after the sources are downloaded. You can see the patches in the OpenWrt git archive. The kernel patches seem to be under ./target/ and also see https://openwrt.org/docs/guide-developer/patches

qbdwlr0 · September 12, 2018, 3:19pm

A few more questions.

1. What is the feeling on code written in C++? I am wondering because the HAAP side has a lot of the signalling code done and there is potential to reuse this code, but it is written in C++.
2. When I was looking at dhcp and pppoe for examples of signalling implementation, I saw that they are both user space processes, so it makes sense to do something similar for gtb. dhcp is part of the busybox suite, whereas ppp lives under ppp-default/ppp-2.4.7/pppd/. Is the pppoe model an appropriate one for me to follow?

jeff · September 12, 2018, 4:06pm

I don't consider myself qualified to answer (2), perhaps @jow or one of the other "core-level" developers has some thoughts?

On C++, I understand the desire to reuse code, but with many devices straining for resources because of their 4 MB (or even 8 MB) of flash, requiring the C++ libraries could put the use of the code out of the reach of many users. libstdcpp is a 400 kB load if added as a package (which would be the case for most users), perhaps half of that if compressed and installed as part of the ROM. Even compressed, that may be too much for older or inexpensive devices on which people wish to run OpenWrt (or with which ISPs would like to provision their subscribers).

qbdwlr0 · October 19, 2018, 11:36am

I have finally had some time to spend on this effort, and the user space implementation and package are coming together. I will be starting the kernel space implementation shortly.
I have a question about packaging. In addition to the process that does the control packet exchange, I have modified tcpdump-full to know how to parse the control packets for debug and interop testing. It is not clear to me how I should package the tcpdump changes. I think I cannot include them in my package since tcpdump-full is its own package. Do I submit them as a standalone modification to tcpdump-full?
Also, what sort of turnaround can I expect on a review process for the main changes? At what stage in the coding should I reach out to get feedback?
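For debugging a control packet exchange like the one described in this post, a minimal sketch of packing and parsing a control message could look like the following. It is written in Python for brevity and is not OpenWrt-ready code. The header layout (a keyed GRE header, a 4-bit message type and a 4-bit tunnel type, then type/length/value attributes) and every numeric constant are assumptions based on a reading of RFC 8157 and should be checked against the RFC text. The function names and the attribute type in the usage line are hypothetical.

```python
import struct

# Assumed values from a reading of RFC 8157; verify against the RFC before use.
GRE_FLAG_KEY = 0x2000            # K bit set, C and S bits clear, version 0
GRE_PROTO_BONDING_CTRL = 0xB7EA  # assumed GRE protocol type for control messages
MSG_TYPES = {1: "Setup Request", 2: "Setup Accept", 3: "Setup Deny",
             4: "Hello", 5: "Tear Down", 6: "Notify"}
TUNNEL_TYPES = {1: "DSL", 2: "LTE"}


def pack_control(key, msg_type, tunnel_type, attributes=()):
    """Build GRE header + key + MsgType/T-Type byte + TLV attributes."""
    header = struct.pack("!HHI", GRE_FLAG_KEY, GRE_PROTO_BONDING_CTRL, key)
    type_byte = bytes([((msg_type & 0x0F) << 4) | (tunnel_type & 0x0F)])
    # Assumption: the 2-byte length field counts only the attribute value.
    tlvs = b"".join(struct.pack("!BH", a_type, len(value)) + value
                    for a_type, value in attributes)
    return header + type_byte + tlvs


def parse_control(packet):
    """Parse a message built by pack_control(); raise ValueError if malformed."""
    if len(packet) < 9:
        raise ValueError("too short for a keyed GRE control message")
    flags, proto, key = struct.unpack_from("!HHI", packet, 0)
    if not (flags & GRE_FLAG_KEY) or proto != GRE_PROTO_BONDING_CTRL:
        raise ValueError("not a keyed bonding control message")
    msg_type, tunnel_type = packet[8] >> 4, packet[8] & 0x0F
    attributes, offset = [], 9
    # Fewer than 3 trailing bytes cannot hold a TLV header and are ignored.
    while offset + 3 <= len(packet):
        a_type, a_len = struct.unpack_from("!BH", packet, offset)
        offset += 3
        value = packet[offset:offset + a_len]
        if len(value) != a_len:
            raise ValueError("truncated attribute")
        attributes.append((a_type, value))
        offset += a_len
    return {
        "key": key,
        "msg_type": MSG_TYPES.get(msg_type, f"unknown({msg_type})"),
        "tunnel_type": TUNNEL_TYPES.get(tunnel_type, f"unknown({tunnel_type})"),
        "attributes": attributes,
    }
```

A round trip such as `parse_control(pack_control(0x11223344, 1, 2, [(7, b"\x01\x02")]))` exercises both directions. Keeping the parser separate from the daemon makes it easier to compare its output with a tcpdump capture during interop testing.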

jeff · October 19, 2018, 1:12pm

In my opinion, it would be a different package than tcpdump-full as the additional functionality is neither in the upstream code base, nor supporting a wide need.

qbdwlr0 · October 19, 2018, 1:30pm

OK, so if my base package is called grehag, then there would be a supplementary package called grehag-dump that has a dependency on tcpdump-full?

jeff · October 19, 2018, 1:47pm

I don't know how your "version" interacts with mainline tcpdump; by changing the code or by calling it or its libraries. I had assumed that it was a code patch. If a code patch, then something like tcpdump-grehag would be the way I'd approach it myself. There wouldn't be any dependencies if you can examine packets without generating them, as well as the other way around.

Not quite enough coffee -- tcpdump-grehag (or whatever you choose to call it) would actually conflict with tcpdump-mini and tcpdump (and possibly libpcap, if OpenWrt splits it out), as it would be installing executables and object files that would reside in the same location.

jow · October 19, 2018, 2:06pm

As @jeff already noted, C++ is a rather uncommon choice for low level OpenWrt software components due to its comparatively large footprint. Apart from that, there is no argument against writing your reference implementation in C++.

That is correct. Ideally the user space process takes care of all the heavy lifting and serves as the controlling instance for your protocol traffic, while the netifd integration script merely talks to your userspace process to bring up and tear down connections.

Without knowing any deeper details about the protocol or your kernel side implementation I cannot be more specific here but usually such userspace control processes fall into two categories:

Either it is a sort of tunneling daemon, for example OpenVPN, which once started spawns a Linux netdev which encapsulates/decapsulates traffic and terminates said netdev when shut down
or it is a control client, like udhcpc or dhclient, which given an existing netdev (e.g. eth0 ethernet interface) sends a number of packets over the existing interface to obtain some sort of configuration which is then directly or indirectly applied to the system.

Depending on the nature of your userspace control application, the netifd integration will look slightly different. In some cases, the nature of the protocol makes it impracticable to model as a network interface handler (e.g. relayd), and it is better implemented as a normal system daemon.

The nominal way would be submitting your changes upstream to the tcpdump project: http://www.tcpdump.org/#contribute

In a second step you can then either add the upstream-accepted change as a patch to the OpenWrt package or change the OpenWrt package to simply build a newer version of tcpdump with your protocol dissector included.

Ideally as soon as possible. Getting changes reviewed in OpenWrt can take a very long time, especially for topics of little public interest (like implementations for protocols no one uses yet).

qbdwlr0 · November 14, 2018, 6:04pm

I have been running into a few issues where I could use some advice.
First issue: The two GRE tunnel legs of the bundle need to communicate with the same server on the other side, i.e., the control packets are sent to the same dst IP but over different devices. So I attempted to install routes for each tunnel leg using my new /lib/netifd/proto/*.sh script, but this doesn't appear to work using OpenWRT utilities because they ignore the table attribute for IPv4 routes. Is this correct? I have been using the busybox "ip rule/route" to get around this, but I wasn't sure if this is the best way to do it.
Second issue: For interface configuration, I can specify an IP address and mask that installs a subnet, but there doesn't seem to be a way to specify a next-hop address with OpenWRT utilities. For this issue, I have also been using the busybox "ip route" as a workaround.
Third issue: When I have the two tunnel legs set up and the control packet exchange begins, the exchange is completely successful, but for every packet that the server sends to my client (which I am receiving), tcpdump additionally shows an ICMP unreachable response being sent back to the server from something on OpenWRT. I am not sure what could be sending the ICMP unreachable response or how to stop it.
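The "ip rule/route" workaround from the first two issues can be sketched as one routing table per tunnel leg, selected by the leg's source address. The Python helper below only builds the command strings and does not execute anything. The function name, device names, table numbers and addresses are hypothetical, the addresses come from the documentation ranges, and the approach assumes the control process binds each socket to the local address of the leg it uses.

```python
def leg_policy_commands(dev, local_ip, gateway, server_ip, table):
    """Return the ip commands that pin one tunnel leg to its own routing table.

    The route sends traffic for the server through this leg's next hop, and
    the rule makes packets sourced from this leg's address use that table.
    """
    return [
        f"ip route add {server_ip}/32 via {gateway} dev {dev} table {table}",
        f"ip rule add from {local_ip}/32 table {table}",
    ]


if __name__ == "__main__":
    # Hypothetical legs: same server address, different device and next hop.
    legs = [
        ("dsl0", "192.0.2.10", "192.0.2.1", 100),
        ("lte0", "198.51.100.10", "198.51.100.1", 101),
    ]
    for dev, local_ip, gateway, table in legs:
        for command in leg_policy_commands(dev, local_ip, gateway,
                                           "203.0.113.5", table):
            print(command)
```

Generating the commands from one place keeps the two legs symmetric, and the same strings with `add` replaced by `del` can be used on teardown.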

poshul · January 24, 2023, 8:14am

I'm gonna play necromancer here for a bit since this still seems to be a pending issue.
Telekom has published their specific implementation https://www.telekom.de/hilfe/downloads/technischer-netzzugang-hybrid.pdf
They also have released all the GPLed sources related to their implementing router: https://www.telekom.de/hilfe/downloads/quellcode-dt-hybrid-gpl.zip which has their reference implementation.