Vectoring on Lantiq VRX200 / VR9 - missing callback for sending error samples

I built this branch and, so far, everything seems to run fine (knock on wood). With the lantiq-branch I had several random crashes/reboots.

I collect them every hour. Is this fine-grained enough?

AFAIK, this needs to be rewritten/updated for 21.02/master. When I updated from 19.04.x to 21.02-rc on another device, no data was collected any more.

Both branches are now updated to build with the testing kernel.

I only had such severe problems with the version that was briefly in the branch last night. But it's also possible that the concurrency issue just leads to different kinds of issues.

For monitoring line stability this should be enough.

Personally, I monitor the output of all useful dsl_cpe_pipe.sh commands every half hour. Currently, I am also monitoring dsmstatg every minute to check if there are any irregularities.
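A periodic check like the one described above could be sketched roughly as follows. This is only a minimal sketch: the log directory and the function name are made up for illustration, and it assumes dsl_cpe_pipe.sh is in the PATH and accepts the command name (e.g. dsmstatg) as its first argument.

```shell
#!/bin/sh
# Hypothetical helper: append the output of one dsl_cpe_pipe.sh command,
# prefixed with a UTC timestamp, to a per-command log file.
DSL_PIPE="${DSL_PIPE:-dsl_cpe_pipe.sh}"     # assumed to be in PATH
LOG_DIR="${LOG_DIR:-/tmp/dsl-monitor}"      # assumed location, lost on reboot

log_dsl_cmd() {
    cmd="$1"
    mkdir -p "$LOG_DIR" || return 1
    # Keep the output even if the command fails, but mark it as an error.
    out="$("$DSL_PIPE" "$cmd" 2>&1)" || out="ERROR: $out"
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$out" >> "$LOG_DIR/$cmd.log"
}

# Example use from a script run by cron every minute:
#   log_dsl_cmd dsmstatg
```

Calling this from cron once per minute would give the same granularity as the dsmstatg check mentioned above; the other commands could be logged from a second, less frequent job.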

By the way, an interesting effect of AVM's driver is that you can capture the error reports using tcpdump (SNAP OUI 0x0019a7 and protocol ID 0x0003, this filter works: llc and ether[14:4]=0xaaaa0300 and ether[18:4]=0x19a70003). One thing that still needs fixing is the source MAC address, which is currently all-zeroes. It seems to work regardless, but the spec says to use the proper device MAC address.
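For anyone who wants to try the capture, here is a small sketch of how the filter above is put together. The function names are made up, the byte offsets assume untagged 802.3 frames, and dsl0 is only assumed as the interface name.

```shell
#!/bin/sh
# Build a tcpdump filter matching a SNAP header with the given OUI and
# protocol ID. The LLC header (aa aa 03) starts at offset 14, followed by
# the 3-byte OUI and the 2-byte protocol ID.
snap_filter() {
    oui="$1"                  # six hex digits, e.g. 0019a7
    pid="$2"                  # four hex digits, e.g. 0003
    oui_first="${oui%????}"   # first OUI byte
    oui_rest="${oui#??}"      # last two OUI bytes
    printf 'llc and ether[14:4]=0xaaaa03%s and ether[18:4]=0x%s%s\n' \
        "$oui_first" "$oui_rest" "$pid"
}

# Capture the error reports; -e prints the link-level header (including the
# source MAC), -n disables name resolution.
capture_error_reports() {
    tcpdump -i "${1:-dsl0}" -e -n "$(snap_filter 0019a7 0003)"
}
```

Running capture_error_reports without arguments would listen on dsl0; pass another interface name as the first argument if yours differs.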

I thought so. I believe it is not that hard to collect values in 21.02, maybe even easier (since the root requirement may not apply anymore, what with the line stats exposed through ubus.)

Thank you. I just built it with the testing kernel, and it is working.

Pointer: I rewrote the scripts to collect Lantiq DSL values in collectd for 21.02. (Unfortunately, ltq-vdsl-app does not expose DSM statistics through ubus.)

Thanks. Already installed. I'll give you feedback in the other thread, if I encounter any problems.

I had an uptime of 7 days. The line had to be disconnected a day ago, since I switched from DTAG to o2.

During the 7 days, I had a stable SNR of 11.5 dB +/- 0.5 dB and the line did not suffer from deterioration. Let's see how it develops with my new ISP; since it is basically the same VPE, I doubt that there are any significant changes. But I'll keep you posted.

Are there any plans to get this work merged into OpenWrt?

Yes, I plan to get this merged eventually. However, there are a few reasons for not yet doing that:

  • The source MAC address is not yet set correctly.

  • Testing to make sure this actually works (results from @stonerl and myself look good so far; my own line has been in sync for over 12 days now, and the SNR margin is also absolutely stable).

  • I am still waiting on a license clarification from AVM. While the code must be licensed under GPLv2 (it is built into the kernel image in AVM firmware), this is not clearly marked in the source code. There is only the MODULE_LICENSE macro, and the header refers to a separate file named "LICENSE" which doesn't actually exist.

Quick question: if I wanted to join the testing fun, is there a simple instruction somewhere on how I can quickly build my own instance from your sources? (I have built OpenWrt in the past, but only by leveraging @hnyman's really nice scripts, so I am a bit rusty with proper OpenWrt builds.)

The normal OpenWrt build instructions apply. Look at the README or the wiki for more details. As this is based on master, don't forget to add the luci package during configuration, if you want a web interface.

The only difference here is that you need to either start from my ltq-vectoring-avm branch, or apply the two commits onto current master yourself. Also, you probably want to add the frame size fix for the switch. Putting the patch into target/linux/lantiq/patches-5.4 / target/linux/lantiq/patches-5.10 should work.

With the current version in my branch, the MAC address should be set correctly.

I couldn't test the actual code on a line with vectoring so far, as I ran into a NAND issue on my test device. But I did some testing previously and tried the code on another device, so I expect it to work.

If someone else wants to test this, it is necessary to look at the actual source address of the error reports using tcpdump. Checking the output of the dsmmcg command is not enough, as the value is only actually applied just after the DSL firmware is loaded.
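To make that check a bit less manual, something like the following sketch could be used. It assumes the default `tcpdump -e` line layout (timestamp, source MAC, ">", destination MAC, ...); the function name is made up, and dsl0 is again only assumed as the interface name.

```shell
#!/bin/sh
# Read `tcpdump -e` output on stdin, print each distinct source MAC once,
# and return non-zero if any of them is still all-zeroes.
check_src_macs() {
    awk '
        $3 == ">" { seen[$2] = 1 }      # field 2 is the source MAC
        END {
            bad = 0
            for (mac in seen) {
                print mac
                if (mac == "00:00:00:00:00:00") bad = 1
            }
            exit bad
        }'
}

# Example (captures 10 error reports, then checks them):
#   tcpdump -i dsl0 -e -n -c 10 \
#       'llc and ether[14:4]=0xaaaa0300 and ether[18:4]=0x19a70003' | check_src_macs
```

If the only address printed is the device's own MAC and the exit status is 0, the source address is being applied to the actual reports, not just shown by dsmmcg.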

The MAC address is set correctly, confirmed using tcpdump.

I noticed that dmesg prints "protocol 0000 is buggy, dev dsl0" for each packet while tcpdump is active. It should still work, but I'll look into fixing this.

I can confirm that the MAC address is set correctly.

One problem I have is that, since I switched from DTAG to o2, I cannot thoroughly test line uptime. For some weird reason, o2 decided to hard reset the line every night between 2am and 6am, sometimes up to 4 times a night. But I'll continue testing.

@moeller0 If you prefer not to build or cannot build your own image, just let me know which device you have and which packages you need.

In all likelihood it is not O2 but Telekom who actually operates your TAL and DSLAM port. These reconnects could be caused by the Assia DSL Expresse DLM system, which Telekom uses to individually configure each link for their preferred rate/stability trade-off point.

For a quick test that actually would be nice. I have a BT HomeHub5A; having nano and the LuCI GUI would be nice (and since I am used to it and the HH5A seems to have enough flash, maybe mc)...

The BT Hub 5 is OK with the new release. I'm running it with AdGuardHome on it.

Sadly, I know my upstream is an ECI hub, so I'm pretty sure I don't get vectoring. (There are rumors that some ECI hubs do have vectoring, but that is unconfirmed.)

btw. richb-hanover/OpenWrtScripts: A set of scripts for maintaining and testing OpenWrt (github.com)

He's got an auto-update script that I've adapted and use to set up from a clean upgrade.

Some of my changes. This deletes the ATM interface and sets up for VDSL. If you use this MAKE SURE you put the relevant firmware file in place or remove that line from config so it uses the inbuilt firmware.

# === Set up the WAN (eth0) interface ==================
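# The original script switches the WAN from DHCP to PPPoE (typical for DSL/ADSL).
# From http://wiki.openwrt.org/doc/howto/internet.connection
# This adapted version sets up VDSL (PTM) with DHCP on dsl0.101 instead.
# For PPPoE, supply values for DSLUSERNAME and DSLPASSWORD
# and uncomment the related lines.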
#
# echo 'Configuring WAN link for PPPoE'
# DSLUSERNAME=YOUR-DSL-USERNAME
# DSLPASSWORD=YOUR-DSL-PASSWORD
uci delete network.atm
uci delete network.dsl
uci delete network.wan

uci set network.globals.packet_steering='1' #Use every cpu to handle packet traffic

uci set network.dsl=dsl
uci set network.dsl.annex='b'
uci set network.dsl.xfer_mode='ptm'
uci set network.dsl.line_mode='vdsl'
uci set network.dsl.ds_snr_offset='0'
uci set network.dsl.tone='a'
uci set network.dsl.firmware='/lib/firmware/vr9-B-dsl-5.9.1.4.0.7.bin'

uci set network.lan.igmp_snooping='1'

uci set network.wan=interface
uci set network.wan.device='dsl0.101'
#uci set network.wan.ifname='dsl0.101'
uci set network.wan.proto='dhcp'
uci set network.wan.mtu='1500'
uci set network.wan.ipv6='1'

# uci set network.wan.username=$DSLUSERNAME
# uci set network.wan.password=$DSLPASSWORD
uci commit network
ifup wan
echo 'Waiting for link to initialize'
sleep 20

# === Update the software packages =============
# Download and update all the interesting packages
# Some of these are pre-installed, but there is no harm in
# updating/installing them a second time.
echo 'Updating software packages'
opkg update                    # retrieve updated packages
opkg install luci              # install the web GUI
# opkg install snmpd fprobe    # install snmpd & fprobe
opkg install luci-app-sqm      # install the SQM modules to get fq_codel etc
# opkg install ppp-mod-pppoe   # install PPPoE module
# opkg install avahi-daemon    # install the mDNS daemon
# opkg install netperf         # install the netperf module for speed testing
opkg install bind-tools        # DNS tools
opkg install ca-certificates   # CA certs for SSL
opkg install igmpproxy         # IGMP proxy for TV

It also auto-installs packages (I have an SQM script lower in the file).
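Regarding the firmware warning above: one way to avoid pointing the config at a missing blob might be to check for the file first. This is just a sketch with a made-up function name; it falls back to removing the option so the inbuilt firmware is used.

```shell
#!/bin/sh
# Set network.dsl.firmware only if the blob actually exists; otherwise drop
# the option so the inbuilt firmware is used.
set_dsl_firmware() {
    fw="$1"
    if [ -f "$fw" ]; then
        uci set network.dsl.firmware="$fw"
    else
        echo "Firmware $fw not found, using inbuilt firmware" >&2
        # -q keeps uci quiet if the option was never set
        uci -q delete network.dsl.firmware || true
    fi
}

# Example, as a replacement for the plain 'uci set network.dsl.firmware=...' line:
#   set_dsl_firmware /lib/firmware/vr9-B-dsl-5.9.1.4.0.7.bin
```

The call has to come before 'uci commit network' for the change to be saved.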

My testing is delayed. In spite of @stonerl's great help, I realized I had re-used the HH5A's power supply and need a replacement before I can get started. I ordered one, but it has not arrived yet...

Thank you so much for the work you've done.
I was able to maintain a connection for over 11 days.

I've only just rebooted (so I am unable to share stats), but I can confirm that your branch definitely makes a difference. Without your patches the SNR just steadily drops until the speed degrades or the line disconnects. I seem to get the best results using the most recent Fritz blob, 5.9.1.4.0.7.

I will leave my current connection up and get back to you with full stats after it's been up for a while.

If anyone is interested, I compiled a firmware for the BT HH5A for testing. It doesn't just have the OP's branch merged in: it also has the eth-burst speed-up patches and the DEU patches, SMP, baby jumbo MTU, kernel 5.10, DSA, etc. It's compiled using the latest glibc 2.34 git branch as well. Please note this firmware contains no DSL blob, as it is illegal to distribute it. iperf3 seems to be underperforming a little when hosted on the router, though; turning on packet steering seems to increase performance.