Vectoring on Lantiq VRX200 / VR9 - missing callback for sending error samples

If that's any help: that's my BT Hub 5 transformer.

They did. Sorta. They made it so USB is the default standard for phones. And that was an utter shitshow before they did. Now it's just awkward.

xkcd: Standards

:relevent:

To get OpenWrt to compile with glibc on the Lantiq platform, these two files need to be deleted:

toolchain/gcc/patches/11.2.0/850-use_shared_libgcc.patch
toolchain/gcc/patches/11.2.0/851-libgcc_no_compat.patch

Otherwise the compiler complains of an undefined reference to unwind_resume when it compiles the Lantiq parts of the code. This isn't specific to gcc 11.2.0 either, so the corresponding files also need to be deleted when using a different version of gcc.
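A minimal sketch of that cleanup in Python, in case someone wants to script it. It assumes it is pointed at the root of the OpenWrt source tree and that the other gcc version directories use the same two file names as 11.2.0 (I have not checked every version). The function names are made up for this example.

```python
from pathlib import Path

# The two patch file names from the post above.
PATCH_NAMES = ("850-use_shared_libgcc.patch", "851-libgcc_no_compat.patch")


def find_patches(tree_root):
    """Return the matching patch files under every gcc version directory."""
    patches_dir = Path(tree_root) / "toolchain" / "gcc" / "patches"
    found = []
    for name in PATCH_NAMES:
        # One subdirectory per gcc version, e.g. patches/11.2.0/<name>
        found.extend(sorted(patches_dir.glob(f"*/{name}")))
    return found


def delete_patches(tree_root, dry_run=True):
    """Delete the patches; with dry_run=True only report what would go."""
    paths = find_patches(tree_root)
    for path in paths:
        if not dry_run:
            path.unlink()
    return paths
```

Calling delete_patches(".") from the tree root only lists the files; pass dry_run=False to actually remove them.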

What firmwares are people using btw?

I've been using the following for a while.

vr9-B-dsl-5.9.0.12.1.7.bin
vr9-B-dsl-5.9.1.4.0.7.bin

    "firmware_version": "5.9.1.4.0.7",

I got these from Lantiq firmware files - openwrt.ebilan.co.uk

I'm making a new post in case the former is considered off topic.

stats summary from web interface:

DSL
DSL Status
Line State:Showtime with TC-Layer sync
Line Mode:G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)
Line Uptime:19h 9m 10s
Annex:B
Data Rate:103.595 Mb/s / 44.199 Mb/s
Max. Attainable Data Rate (ATTNDR):111.477 Mb/s / 48.213 Mb/s
Latency:0.13 ms / 0.00 ms
Line Attenuation (LATN):11.7 dB / 14.5 dB
Signal Attenuation (SATN):11.7 dB / 14.5 dB
Noise Margin (SNR):8.2 dB / 10.1 dB
Aggregate Transmit Power (ACTATP):6.8 dB / 13.3 dB
Forward Error Correction Seconds (FECS):0 / 135568
Errored seconds (ES):0 / 612
Severely Errored Seconds (SES):0 / 83
Loss of Signal Seconds (LOSS):0 / 0
Unavailable Seconds (UAS):455 / 455
Header Error Code Errors (HEC):0 / 0
Non Pre-emptive CRC errors (CRC_P):0 / 0
Pre-emptive CRC errors (CRCP_P):0 / 0
ATU-C System Vendor ID:Broadcom 177.197
Power Management Mode:L0 - Synchronized

Full dslstat from terminal:

{
	"api_version": "4.17.18.6",
	"firmware_version": "5.9.1.4.0.7",
	"chipset": "Lantiq-VRX200",
	"driver_version": "1.5.17.6",
	"state": "Showtime with TC-Layer sync",
	"state_num": 7,
	"up": true,
	"uptime": 69014,
	"atu_c": {
		"vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			177,
			197
		],
		"vendor": "Broadcom 177.197",
		"system_vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			0,
			0
		],
		"system_vendor": "Broadcom",
		"version": [
			130,
			123,
			204,
			192,
			0,
			16,
			65,
			65,
			49,
			54,
			49,
			48,
			70,
			83,
			48,
			72
		],
		"serial": [
			65,
			65,
			49,
			54,
			49,
			48,
			70,
			83,
			48,
			72,
			51,
			45,
			48,
			51,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0
		]
	},
	"power_state": "L0 - Synchronized",
	"power_state_num": 0,
	"xtse": [
		0,
		0,
		0,
		0,
		0,
		0,
		0,
		2
	],
	"annex": "B",
	"standard": "G.993.2",
	"profile": "17a",
	"mode": "G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)",
	"upstream": {
		"vector": true,
		"trellis": true,
		"bitswap": true,
		"retx": true,
		"virtual_noise": false,
		"interleave_delay": 0,
		"data_rate": 44199000,
		"latn": 14.500000,
		"satn": 14.500000,
		"snr": 10.000000,
		"actps": -90.100000,
		"actatp": 13.300000,
		"attndr": 48170000
	},
	"downstream": {
		"vector": true,
		"trellis": true,
		"bitswap": true,
		"retx": true,
		"virtual_noise": false,
		"interleave_delay": 130,
		"data_rate": 103595000,
		"latn": 11.700000,
		"satn": 11.700000,
		"snr": 8.300000,
		"actps": -90.100000,
		"actatp": 6.800000,
		"attndr": 111820800
	},
	"errors": {
		"near": {
			"es": 0,
			"ses": 0,
			"loss": 0,
			"uas": 455,
			"lofs": 0,
			"fecs": 0,
			"hec": 0,
			"ibe": 0,
			"crc_p": 0,
			"crcp_p": 0,
			"cv_p": 0,
			"cvp_p": 0,
			"rx_corrupted": 8973,
			"rx_uncorrected_protected": 8690,
			"rx_retransmitted": 0,
			"rx_corrected": 283,
			"tx_retransmitted": 4715
		},
		"far": {
			"es": 612,
			"ses": 83,
			"loss": 0,
			"uas": 455,
			"lofs": 0,
			"fecs": 135568,
			"hec": 0,
			"ibe": 0,
			"crc_p": 0,
			"crcp_p": 0,
			"cv_p": 0,
			"cvp_p": 0,
			"rx_corrupted": 3705820,
			"rx_uncorrected_protected": 3375420,
			"rx_retransmitted": 0,
			"rx_corrected": 330400,
			"tx_retransmitted": 1823315
		}
	}
}
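The byte arrays in that output are easier to read once decoded. Here is a small sketch, assuming the 8-byte vendor ID follows the usual G.994.1 layout (2 bytes country code, 4 ASCII bytes provider code, 2 vendor-specific bytes) and that the serial is zero-padded ASCII. The function names are my own.

```python
def decode_vendor_id(octets):
    """Split an 8-octet vendor ID into its three assumed fields."""
    if len(octets) != 8:
        raise ValueError("expected 8 octets")
    raw = bytes(octets)
    return {
        "country_code": raw[0:2].hex(),
        "provider_code": raw[2:6].decode("ascii", errors="replace"),
        "vendor_specific": f"{raw[6]}.{raw[7]}",
    }


def decode_padded_ascii(octets):
    """Decode a zero-padded ASCII field such as the serial number."""
    return bytes(octets).rstrip(b"\x00").decode("ascii", errors="replace")
```

With the values above, the provider code comes out as "BDCM" and the vendor-specific part as "177.197", which matches the "Broadcom 177.197" string. The serial decodes to "AA1610FS0H3-03".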

dsl_cpe_pipe.sh dsmstatg

nReturn=0 n_processed=277819 n_fw_dropped_size=0 n_mei_dropped_size=1 n_mei_dropped_no_pp_cb=0 n_pp_dropped=0
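For anyone who wants to log these counters over time, here is a rough sketch that turns such a line into a dictionary. It only assumes the key=value format shown above. What each counter means is my guess from its name (n_mei_dropped_no_pp_cb presumably counts error samples dropped because no callback was registered).

```python
def parse_counters(line):
    """Parse space-separated key=value integer pairs into a dict."""
    counters = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        # Skip anything that is not key=<integer>.
        if sep and value.lstrip("-").isdigit():
            counters[key] = int(value)
    return counters


def total_dropped(counters):
    """Sum every counter whose name contains 'dropped'."""
    return sum(v for k, v in counters.items() if "dropped" in k)
```

For the line above this gives n_processed 277819 and a total of 1 dropped sample.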

The SNR offset is set to 0. It's working fantastically well at the moment with the SNR above 8. I can't thank you enough for sharing your work; this is a game changer for these devices.

Errors appear to be counted even when first syncing up? Or is this some kind of history being kept by the other end? Do you think it would be possible to take the initial errors being reported and keep an offset, so as to see how many have actually happened during just one period of uptime? Something along the lines of a "far errors this uptime:" stat?

Yes, this is the "far" side which phones in its current error totals collected throughout its own uptime. With errors -- even on the "near" side -- you should always look at the relative difference/increase, not the absolute numbers.

Collectd, using my scripts, does exactly that: it only collects the increase.
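To illustrate the idea (this is not the collectd script, just a sketch): take one snapshot of the dslstat JSON right after sync as a baseline and subtract it from later snapshots. It assumes both snapshots are already parsed into dictionaries with the "errors" / "near" / "far" layout shown earlier. The function name is hypothetical.

```python
def error_deltas(baseline, current):
    """Return per-counter increases since the baseline snapshot."""
    deltas = {}
    for side in ("near", "far"):
        base_side = baseline["errors"][side]
        cur_side = current["errors"][side]
        deltas[side] = {}
        for name, value in cur_side.items():
            start = base_side.get(name, 0)
            # If a counter went backwards, assume the other end reset it
            # and count from zero instead of reporting a negative value.
            deltas[side][name] = value - start if value >= start else value
    return deltas
```

So with a baseline of far "es": 612 taken at sync and a later reading of 615, the "this uptime" figure would be 3.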

Awesome, I'll look into it, thanks.

I was using 5.7.8.11.0.7, but I thought I'd give 5.9.1.4.0.7 a try.

I wonder whether, with the new PPA driver, all previous testing of firmware blobs isn't invalidated anyway, at least partially?
Given the source of the PPA driver, I would certainly test Fritzbox firmware blobs again....

And I will, once I've solved my PSU issue....

I've tested the Fritz 5.9.1.4.0.7 for over 10 days and it's definitely reliable. The older Fritz one you mentioned is definitely slower.
I'm currently testing the one from the Netgear DM200. It seems a little slower so far, but the speed is building up.

It seems to depend on the interleave delay.

There isn't anything in the vectoring driver that is specific to a DSL firmware version. The only reason for picking AVM's version of the driver is that it queues the packets instead of calling the PTM driver directly (because the latter doesn't really work with the PTM driver in OpenWrt).

Aside from that, I used the latest Fritzbox DSL firmware (5.9.1.4.0.7) during my tests. Note that there are actually different variants of that firmware. Unfortunately, there doesn't seem to be an easy way to get the AVM version number. If you extract a current VR9 Fritzbox firmware, it contains two versions of the DSL firmware: the primary one seems to have been in use since 7.21, and the older one (with "release" prefix) looks to be the one from 7.12. Here are the SHA256 sums:

bc0d9cb5d8fa71cfc7be2aa408c3a7d153bad56163ee70c4214211b996baa6f7  xcpe_591407_590D01_a_avm0712.bin
8a676d7b7e07e34c9cc97f5350e04f73abe68e21e9d52661e6169a4235e76a28  xcpe_591407_590D02_b_avm0712.bin
b9e71e2150a3815c4dc6f6489fb95a0839889609ab88679a8062c346db2aba37  xcpe_591407_590D01_a_avm0721.bin
9d5277b36b322e66d634f5bcf3c9a72e97b80db03506f8494e345790fa22c29b  xcpe_591407_590D02_b_avm0721.bin
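If you want to check which variant you have, something like the following sketch would do it. The lookup table is just the four sums listed above, and the function names are made up for this example.

```python
import hashlib

# SHA256 sums from the list above, mapped to the file names given there.
KNOWN_SHA256 = {
    "bc0d9cb5d8fa71cfc7be2aa408c3a7d153bad56163ee70c4214211b996baa6f7": "xcpe_591407_590D01_a_avm0712.bin",
    "8a676d7b7e07e34c9cc97f5350e04f73abe68e21e9d52661e6169a4235e76a28": "xcpe_591407_590D02_b_avm0712.bin",
    "b9e71e2150a3815c4dc6f6489fb95a0839889609ab88679a8062c346db2aba37": "xcpe_591407_590D01_a_avm0721.bin",
    "9d5277b36b322e66d634f5bcf3c9a72e97b80db03506f8494e345790fa22c29b": "xcpe_591407_590D02_b_avm0721.bin",
}


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large blobs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_firmware(path):
    """Return the matching name from the list, or None if unknown."""
    return KNOWN_SHA256.get(sha256_of_file(path))
```

identify_firmware returns None for any blob that isn't one of the four variants listed here.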

That is interesting information I have never read before. If I understand you correctly, that means that there are firmware blobs that are inherently unsuited to be used with OpenWrt's driver?

No, I was trying to say the opposite. The vectoring driver doesn't even directly interact with the DSL firmware. (Of course an incompatibility between firmware and kernel driver is possible in theory, but there doesn't seem to be any specific evidence for that.)

The issue here is that the unmodified vectoring kernel driver from Lantiq doesn't play nice with the PTM kernel driver that is used in OpenWrt. As discussed above, the original vectoring driver expects the PTM driver to support concurrent calls to its methods. The variant of the vectoring driver from AVM doesn't have this requirement, as it queues the packets regularly. This way, the kernel performs the necessary locking that is expected by the PTM driver.
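As a rough userspace analogy (this is not the kernel code, and none of these names exist in either driver): if several callers hand their packets to a queue and only one consumer drains it, the sink's send routine is never entered concurrently, so the sink itself needs no extra locking.

```python
import queue
import threading


class NonReentrantSink:
    """Stand-in for a driver whose send() must not be called concurrently."""

    def __init__(self):
        self.sent = []
        self.overlaps = 0
        self._busy = False

    def send(self, packet):
        if self._busy:
            self.overlaps += 1  # would indicate a concurrent call
        self._busy = True
        self.sent.append(packet)
        self._busy = False


def run_queued(sink, producers, packets_each):
    """Let several producers enqueue packets; one worker calls the sink."""
    pending = queue.Queue()

    def produce(producer_id):
        for seq in range(packets_each):
            pending.put((producer_id, seq))

    def drain():
        while True:
            item = pending.get()
            if item is None:  # sentinel: all producers are done
                return
            sink.send(item)

    worker = threading.Thread(target=drain)
    worker.start()
    threads = [threading.Thread(target=produce, args=(p,)) for p in range(producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    pending.put(None)
    worker.join()
    return sink
```

The producers never touch the sink directly, which is the point: serialization happens in one place instead of being expected from every caller.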

I added a few small changes to the driver. The procfs path now uses a more sensible value, and the kernel log message "protocol 0000 is buggy, dev dsl0" should be fixed.

I took moeller0's post to mean that, now that there is extra data being passed to the DSLAM, the firmware blobs that were more problematic than others might behave differently and are worth looking at again.

Not related: I can't say for sure, but I think I remember that the blobs from tdt.de actually accepted fewer dms / dmms commands than the Fritz ones.

Yes, I agree that it makes sense to re-test firmware versions that previously worked less well. I was primarily responding to the last sentence in @moeller0's message, which referred to the source of the driver. Using the vectoring driver from AVM doesn't mean it is necessary to use AVM's DSL firmware to get functional vectoring. But it is of course still possible that their firmware performs better in practice.

The DSL firmwares from AVM definitely contain some extensions. There are also a few additional commands that are not accessible on OpenWrt at all (this message on the mailing list shows some: https://lists.openwrt.org/pipermail/openwrt-devel/2017-July/017210.html).

I've updated my firmware as per the link above to include janh's latest update. I've reverted dnsmasq back to an older version to stop the crashing until they figure it out, so it should be good to test with.

Yes, I agree with your interpretation. My thought was that AVM will probably have tested their own drivers against the firmware blobs they use, and so we might be able to profit from that to some degree. But I guess we do not have the most recent AVM driver source anyway, so it is unclear whether we have an AVM-tested combo on our hands....

The vectoring driver is just a very small part of all the DSL-related drivers. All the other drivers remain unchanged. While the source archive from AVM is recent (it corresponds to firmware 7.27, which was released 3 months ago), it doesn't contain sources for the other drivers. So, using the same software stack as AVM is unlikely to ever happen. But the results in this thread look positive so far, so that doesn't seem to matter anyway.

Just a quick update. Firmware 5.9.1.4.0.7 did the trick for me. The DSL line has been up for more than two days.

PPPoE is reset every 24h, and with the old firmware (5.7) this resulted in a DSL reconnect as well. This behavior is gone now.

So firmware 5.9.x and this branch seem to be the perfect fit.

But I’ll wait and see, fingers crossed…