No more VDSL vectoring after upgrading from 22.03 to 23.05 [not reproducible anymore]

Hi there,

After upgrading from 22.03 to 23.05, my VDSL line no longer synchronizes with vectoring: before 129/42 Mbit/s, after 20/1.5 Mbit/s.

This happens on both Lantiq-VRX200 devices I own, a FRITZ Box 7360 v2 and an Arcadyan VGV7510KW22 (o2 Box 6431).

If I downgrade the image, everything is fine again and vectoring is on.

I tested this by upgrading and downgrading a few times: 22.03 always works, while 23.05 does not, on both devices!

No other configurations changed during up/downgrading.

Configuration is:
Annex B (all) (in 23.05 the description is longer)
Tone, Encapsulation: auto
Line mode: VDSL
SNR offset: 0.0 dB
Firmware: 7430-07-31-5.9.1.4.0.7-5.9.0.D.0.1.bin (re-uploaded after every flash)
(version extracted myself, but it doesn't matter; the firmware from the image has exactly the same effect)

Anyone got an idea what changed from 22.03 to 23.05 that might break VDSL Vectoring?


Snippets from /etc/init.d/dsl_control dslstat (working first, then not working):

	"api_version": "4.17.18.6",
	"firmware_version": "5.9.1.4.0.7",
	"chipset": "Lantiq-VRX200",
	"driver_version": "1.5.17.6",
	"state": "Showtime with TC-Layer sync",
	"state_num": 7,
	"up": true,
	"uptime": 9,
	"atu_c": {
		
	},
	"power_state": "L0 - Synchronized",
	"power_state_num": 0,
	"xtse": [
		0,
		0,
		0,
		0,
		0,
		0,
		0,
		2
	],
	"annex": "B",
	"standard": "G.993.2",
	"profile": "17a",
	"mode": "G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)",
	"upstream": {
		"vector": true,
		"trellis": true,
		"bitswap": false,
		"retx": true,
		"virtual_noise": false,
		"interleave_delay": 0,
		"data_rate": 42463000,
		"latn": 10.900000,
		"satn": 10.900000,
		"snr": 6.800000,
		"actps": -90.100000,
		"actatp": 14.500000,
		"attndr": 44076000
	},



{
	"api_version": "4.17.18.6",
	"firmware_version": "5.9.1.4.0.7",
	"chipset": "Lantiq-VRX200",
	"driver_version": "1.5.17.6",
	"state": "Showtime with TC-Layer sync",
	"state_num": 7,
	"up": true,
	"uptime": 55,
	"atu_c": {
		"vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			193,
			144
		],
		"vendor": "Broadcom 193.144",
		"system_vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			0,
			0
		],
		"system_vendor": "Broadcom",
		"version": [
			118,
			49,
			49,
			46,
			48,
			48,
			46,
			48,
			57,
			0,
			0,
			0,
			0,
			0,
			0,
			0
		],
		"serial": [
			72,
			49,
			48,
			75,
			52,
			48,
			48,
			48,
			54,
			53,
			53,
			95,
			48,
			52,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0
		]
	},
	"power_state": "L0 - Synchronized",
	"power_state_num": 0,
	"xtse": [
		0,
		0,
		0,
		0,
		0,
		0,
		0,
		2
	],
	"annex": "B",
	"standard": "G.993.2",
	"profile": "17a",
	"mode": "G.993.2 (VDSL2, Profile 17a)",
	"upstream": {
		"vector": false,
		"trellis": true,
		"bitswap": false,
		"retx": true,
		"virtual_noise": false,
		"ra_mode": "At initialization",
		"ra_mode_num": 1,
		"interleave_delay": 0,
		"inp": 42.000000,
		"data_rate": 1615000,
		"latn": 2.400000,
		"satn": 2.400000,
		"snr": 11.400000,
		"actatp": 11.800000,
		"attndr": 1615000,
		"mineftr": 1477000
	},

You need to re-upload the external firmware and add the firmware file to /etc/sysupgrade.conf to keep vectoring.

What? Of course I re-uploaded the FW to the boxes after every flash; as you can see from the snippets, it's always the same.

You edited for clarity. Actually, I saw similar reports in some other bug trackers. Try another/higher version of the vectoring FW...

I tried all kinds of firmware versions; it changes nothing. I just didn't write all that. I'm by now pretty sure it has absolutely nothing to do with the FW file itself (as long as it supports vectoring) but with the way it is being used.

Also, the fact that the exact same thing happens on two quite different VRX200 devices and is reliably reproducible is a strong indicator that something is wrong with a non-user-configurable setting regarding VDSL synchronization.

xref https://github.com/openwrt/openwrt/issues/14545

You can join the party at the GitHub issue, as it seems to be a regression introduced in v23 (over there it seemed all doom and gloom, as nothing worked out).

I'd rather not join that party, as for the most part it had nothing to do with VDSL sync but rather with L3 stuff.
What's the process for getting an actual dev involved who can (given the clear evidence I provided) go through the git changes or do a binary search with a proper dev build setup?

I'll also try later to flash prebuilt versions between the two I used; maybe I can pin down the breaking change further this way.
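
Stepping through prebuilt images between a known-good and a known-bad one is a plain bisection. Here is a minimal sketch, assuming an ordered list of images where the first is known good and the last is known bad. The labels and the `is_good` check are placeholders, not real OpenWrt image names or tooling. In practice the check is a person flashing the image and reading the mode line from dslstat.

```python
def first_bad(builds, is_good):
    """Return the first build that fails, given builds[0] good and builds[-1] bad."""
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid]):
            lo = mid
        else:
            hi = mid
    return builds[hi]


# Hypothetical labels, not real OpenWrt image names.
builds = [f"build-{i}" for i in range(8)]
# Stand-in for "flash it and check for vectoring"; pretends build-5 broke it.
print(first_bad(builds, lambda b: int(b.split("-")[1]) < 5))
```

Bisection only narrows things down if each image gives the same result every time it is tested, so each step should be repeated at least once before trusting it.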

The actual dev was involved over there and made sure the vectoring firmware got installed, but to no avail. The additional detail of a v23 regression was not known at the time, so if you restart the discussion it may go somewhere. If possible, try a snapshot: if the problem is fixed there, even without knowing the cause, the next release will be fine again.

For what it's worth, my ZyXEL P2812HNU (Lantiq VRX200) runs fine on 23.05.3:

DSL Status

Line State: Showtime with TC-Layer sync
Line Mode: G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)
Line Uptime: 22d 18h 7m 38s
Data Rate: 111.214 Mb/s / 31.423 Mb/s
Noise Margin: 6.0 dB / 4.9 dB

So it's not vectoring itself which is broken.

Could you post the full output, and maybe even screenshots taken with go-dsl, please? Which ISP are you a customer of, and are you by chance located in Germany? If so, which ISP operates the physical lines?

Yes, I'm by all chances on a Telekom-operated landline in Germany.
ISP is Vodafone/Vitroconnect, but I don't think that matters for the DSL part.

There is no luci-mod-dsl in 22?

Actually, what I just noticed: I followed the steps to extract the FW from the FRITZ Box ROM, but that results in an Annex A FW (because the last step is to patch B to A). Still, this FW has worked for at least a year now with OWRT 22 without problems?

I will also try lantiq-vrx200-b.bin later, just to be sure. I just noticed you can install it directly via opkg; it is a slightly older version but should support vectoring as well.

Full output of working (old) FW state

{
	"api_version": "4.17.18.6",
	"firmware_version": "5.9.1.4.0.7",
	"chipset": "Lantiq-VRX200",
	"driver_version": "1.5.17.6",
	"state": "Showtime with TC-Layer sync",
	"state_num": 7,
	"up": true,
	"uptime": 15847,
	"atu_c": {
		"vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			193,
			144
		],
		"vendor": "Broadcom 193.144",
		"system_vendor_id": [
			181,
			0,
			66,
			68,
			67,
			77,
			0,
			0
		],
		"system_vendor": "Broadcom",
		"version": [
			118,
			49,
			49,
			46,
			48,
			48,
			46,
			48,
			57,
			0,
			0,
			0,
			0,
			0,
			0,
			0
		],
		"serial": [
			72,
			49,
			48,
			75,
			52,
			48,
			48,
			48,
			54,
			53,
			53,
			95,
			48,
			52,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0,
			0
		]
	},
	"power_state": "L0 - Synchronized",
	"power_state_num": 0,
	"xtse": [
		0,
		0,
		0,
		0,
		0,
		0,
		0,
		2
	],
	"annex": "B",
	"standard": "G.993.2",
	"profile": "17a",
	"mode": "G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)",
	"upstream": {
		"vector": true,
		"trellis": true,
		"bitswap": true,
		"retx": true,
		"virtual_noise": false,
		"interleave_delay": 0,
		"data_rate": 42463000,
		"latn": 10.900000,
		"satn": 10.700000,
		"snr": 7.100000,
		"actps": -90.100000,
		"actatp": 14.400000,
		"attndr": 44316000
	},
	"downstream": {
		"vector": true,
		"trellis": true,
		"bitswap": true,
		"retx": true,
		"virtual_noise": true,
		"interleave_delay": 150,
		"data_rate": 109996000,
		"latn": 11.700000,
		"satn": 11.700000,
		"snr": 11.700000,
		"actps": -90.100000,
		"actatp": -0.700000,
		"attndr": 130449408
	},
	"errors": {
		"near": {
			"es": 5,
			"ses": 1,
			"loss": 3,
			"uas": 316,
			"lofs": 0,
			"fecs": 0,
			"hec": 0,
			"ibe": 0,
			"crc_p": 0,
			"crcp_p": 0,
			"cv_p": 0,
			"cvp_p": 0,
			"rx_corrupted": 41321,
			"rx_uncorrected_protected": 23508,
			"rx_retransmitted": 0,
			"rx_corrected": 17813,
			"tx_retransmitted": 9594
		},
		"far": {
			"es": 424,
			"ses": 83,
			"loss": 0,
			"uas": 316,
			"lofs": 0,
			"fecs": 135735,
			"hec": 0,
			"ibe": 0,
			"crc_p": 0,
			"crcp_p": 0,
			"cv_p": 0,
			"cvp_p": 0,
			"rx_corrupted": 698601,
			"rx_uncorrected_protected": 352757,
			"rx_retransmitted": 0,
			"rx_corrected": 345844,
			"tx_retransmitted": 791215
		}
	},
	"erb": {
		"sent": 36494,
		"discarded": 0
	}
}
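
As an aside, the byte arrays under atu_c are easier to read once decoded. A minimal sketch, assuming "version" and "serial" are zero-padded ASCII and that "vendor_id" follows the usual G.994.1 layout (2-byte country code, 4-character vendor code, 2 vendor-specific bytes). The helper names are made up for illustration.

```python
def decode_ascii(values):
    """Decode a zero-padded list of byte values into a string."""
    raw = bytes(values).rstrip(b"\x00")
    return raw.decode("ascii", errors="replace")


def decode_vendor_id(values):
    """Split an 8-byte vendor ID into (country code, vendor code, vendor-specific)."""
    country = (values[0] << 8) | values[1]
    vendor = bytes(values[2:6]).decode("ascii", errors="replace")
    specific = f"{values[6]}.{values[7]}"
    return country, vendor, specific


# Values copied from the dslstat output above.
version = [118, 49, 49, 46, 48, 48, 46, 48, 57, 0, 0, 0, 0, 0, 0, 0]
vendor_id = [181, 0, 66, 68, 67, 77, 193, 144]

print(decode_ascii(version))        # v11.00.09
print(decode_vendor_id(vendor_id))  # (46336, 'BDCM', '193.144')
```

The "BDCM" and "193.144" parts line up with the "Broadcom 193.144" string that dslstat already prints for "vendor".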

So what happens is that the DSLAM somehow recognises your modem as incapable of vectoring and hence provisions it with a fall-back VDSL2 profile that is limited to frequencies <= 2.2 MHz (the same first 512 sub-carriers used by ADSL; since Telekom did not want to replace all ADSL modems, these bins are excluded from vectoring). The question is why it does that. The traditional cause seems to be firmware blobs that do not support vectoring, but that is not the case on your link: as I understand it, using the exact same blob on OpenWrt 22 in exactly the same device results in a proper negotiation of VDSL2 with vectoring...
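
To spot this state quickly in saved dslstat output, something like the following could be used. It is only a heuristic sketch: it assumes the JSON has the fields seen in the outputs posted above ("up", "standard", and "vector" under "upstream"/"downstream"), and the function name is made up. A line without vectoring is not necessarily on the fall-back profile; on a vectored Telekom port it very likely is.

```python
import json


def looks_like_fallback(stat):
    """Heuristic: line is up on VDSL2 but no direction reports vectoring."""
    if not stat.get("up"):
        return False
    directions = [stat.get(key, {}) for key in ("upstream", "downstream")]
    vectoring = any(d.get("vector") for d in directions)
    return stat.get("standard") == "G.993.2" and not vectoring


# Trimmed-down versions of the two snippets posted above.
working = json.loads(
    '{"up": true, "standard": "G.993.2",'
    ' "upstream": {"vector": true, "data_rate": 42463000}}'
)
broken = json.loads(
    '{"up": true, "standard": "G.993.2",'
    ' "upstream": {"vector": false, "data_rate": 1615000}}'
)

for name, stat in (("working", working), ("broken", broken)):
    rate = stat["upstream"]["data_rate"] / 1e6
    print(f"{name}: fallback={looks_like_fallback(stat)}, upstream={rate:.1f} Mbit/s")
```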

Correct, I can only think of changed local initialization/configuration of that blob. Somehow that blob has to be "uploaded" to the modem / started? Are there parameters used during that process?

To answer my own question: of course parameters must be used, as you can provide them via the UI. Maybe there are more hidden parameters? Maybe the ones provided via the UI get processed differently?

All my clear evidence just vanished into thin air. Seriously, I tested this back and forth yesterday to make sure I get consistent results. And it was always 22 working and 23 not.

But not today: every firmware blob (including the Annex A ones) now works on both routers. And it doesn't seem that I can get it to fail anymore; I tried to resync numerous times, and it always works.

I don't know, maybe my failures from yesterday triggered some technician on the other side today, who knows....

Some line cards were known to occasionally get stuck in the fall-back profile, requiring a reset (of either the port or the full card, I do not know). I had thought that only happened in the past with older line card firmware versions, but apparently this can still happen.

But you are good then?

For now. Still, the question remains why in a "certain environment" 22 works and 23 does not. If the line card is faulty or something, I would expect it to consistently not work until "something" changes. And if that something is a different OWRT version (but not the DSL firmware blob version) ... that leaves some questions.

Oh, if your 23 build tried to connect to the DSLAM before you uploaded the vectoring firmware, the line card might not have been wrong in recognising a non-vectoring-capable device and assigning the non-vectoring fall-back to it. If I am correct, the real error then was that the line card apparently did not (immediately) switch back to the normal vectoring profile after you changed the firmware and initiated a resync/retrain. If you ask me, the switch to FTTH cannot come soon enough, as that will make all of this arcane DSL voodoo a thing to scare people with in campfire stories and not something one needs to remember in daily life...

Good point, but the same would apply to 22, as neither has my custom blob after flashing and I have to upload it to both versions afterwards. But maybe the timing or triggering of the sync is different for 22 or something. And it still doesn't explain why it suddenly works one day later.

FTTH is scheduled for June here, but it was also scheduled for last June. Twice a year they send their door-to-door salesman squads to sell adhesion contracts including "cost sharing" for digging up the pavement. And nobody wants that :smile: