OpenWrt 21.02 dsl_control?

Today I built OpenWrt 21.02 for the first time, and I would like to know who the "GENIUS" is who decided to change /etc/init.d/dsl_control.

In all previous versions dsl_control status provided a nice dsl status table like the following:

ATU-C Vendor ID:                          Infineon 178.6
ATU-C System Vendor ID:                   45,43,49,20,74,65,6C,65
Chipset:                                  Lantiq-VRX200
Firmware Version:                         5.9.1.4.0.7
API Version:                              4.17.18.6
XTSE Capabilities:                        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2
Annex:                                    B
Line Mode:                                G.993.2 (VDSL2)
Profile:                                  17a
Line State:                               UP [0x801: showtime_tc_sync]
Forward Error Correction Seconds (FECS):  Near: 195400 / Far: 1
Errored seconds (ES):                     Near: 8 / Far: 1661
Severely Errored Seconds (SES):           Near: 6 / Far: 11
Loss of Signal Seconds (LOSS):            Near: 0 / Far: 974
Unavailable Seconds (UAS):                Near: 58 / Far: 58
Header Error Code Errors (HEC):           Near: 0 / Far: 0
Non Pre-emtive CRC errors (CRC_P):        Near: 18 / Far: 0
Pre-emtive CRC errors (CRCP_P):           Near: 0 / Far: 0
Power Management Mode:                    L0 - Synchronized
Latency [Interleave Delay]:               8.0 ms [Interleave]   0.0 ms [Fast]
Data Rate:                                Down: 33.152 Mb/s / Up: 6.809 Mb/s
Line Attenuation (LATN):                  Down: 24.7 dB / Up: 33.6 dB
Signal Attenuation (SATN):                Down: 22.4 dB / Up: 32.9 dB
Noise Margin (SNR):                       Down: 5.7 dB / Up: 6.1 dB
Aggregate Transmit Power (ACTATP):        Down: 3.8 dB / Up: 3.9 dB
Max. Attainable Data Rate (ATTNDR):       Down: 32.604 Mb/s / Up: 6.952 Mb/s
Line Uptime Seconds:                      4114
Line Uptime:                              1h 8m 34s

Now instead it returns just:
running

Is this a joke!!??

Not to mention dsl_status lucistat, which I used in scripts on all my OpenWrt routers to feed DSL stats to a database via an MQTT tool.

PLEASE CAN WE HAVE /etc/init.d/dsl_control back as it was?
Who makes these decisions about breaking backwards compatibility without offering suitable replacements?

Sorry about the rant, but this is a regression bad enough for me to go back to 19.07

I assume that it is related to this change:

(and https://github.com/openwrt/openwrt/commit/dea953744dff41794ccecbe7b0854ae99bdfe570 )

You can read about the new syntax in the sources.

Or use directly the underlying ubus call:
ubus call dsl metrics
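
For scripts, single values can be pulled out of that JSON with jsonfilter, which ships with OpenWrt. A minimal sketch, assuming the key paths match what your build prints (errors.near.es follows the structure mentioned later in this thread; check the output of "ubus call dsl metrics" on your own device first). The helper name dsl_metric is made up for illustration:

```shell
# dsl_metric is a hypothetical helper, not part of OpenWrt.
# It prints one value from the "ubus call dsl metrics" JSON.
# $1 is a jsonfilter path expression, e.g. '@.errors.near.es'
# (the path is an assumption, verify it against your own output).
dsl_metric() {
    ubus call dsl metrics | jsonfilter -e "$1"
}
```

On the router, dsl_metric '@.errors.near.es' should then print the near-end errored seconds counter, provided that key exists in your build.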

I would like to encourage the development team to be mindful of the impact when they remove existing features.
Many of us use OpenWrt from the command line and build scripts that depend on well-established tools.
In particular, dsl_control was a stable and widely used tool that had been there for years.

The issue is that if we can't rely upon existing tools to be future proof, then we have to build more resource consuming solutions.
For example, I see that /usr/share/libubox/jshn.sh is available to read JSON objects in shell. Can I rely upon it, or do I have to ignore it and build my own solution because in the future you will change it or make it disappear? If the latter, clearly my solutions would have a larger footprint.

The same applies to "ubus call dsl metrics": is it going to change? Will the developers maintain backwards compatibility?

Look, this is not your first rodeo and you are not new to OpenWrt or open source in general. You know the drill: things are going to keep changing, but typically not out of malice, and reasonable requests to maintain backward compatibility will likely at least receive thoughtful consideration and discussion.
IMHO relying on dsl_control is a bit tricky, as that clearly is an OpenWrt internal script, so maybe calling . /lib/functions/lantiq_dsl.sh ; dsl_cmd is more future proof?
The upshot seems to be that the new method for dsl_control is significantly faster than the old one, so maybe for your use case the new method actually offers some advantages that make up for the undesirable cost of having to change your code?
IIRC the justification for the change was the speedup, so that calling dsl_control cyclically to harvest statistics, for uses similar to yours, becomes less costly...

dsl_control response time was never an issue, so an improvement adds no benefits. On the other hand, the need to reproduce the exact output is an issue.

I am now working on a quick and ugly script. However I found that, for example in the old version I had:
dsl.atuc_vendor_id="Infineon 178.6"
dsl.atuc_system_vendor_id="45,43,49,20,74,65,6C,65"

Whilst now I have:
"atu_c": {
    "vendor_id": [ 255, 181, 71, 83, 80, 78, 0, 16 ],
    "system_vendor_id": [ 0, 0, 48, 48, 48, 48, 0, 0 ],

I looked at the parse_vendorid function in 19.07's /lib/functions/lantiq_dsl.sh (which is no longer present in 21.02), but I cannot find a match for the atu_c->vendor_id above, nor does system_vendor_id make any sense (yes, I converted decimal to hex).

Vendor IDs are quite useful when contacting dumb ISP support helpdesks for example, especially in my case as I manage several sites across Europe.

Where are the above in 21.02? Do I have to enable some optional package?
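
One thing worth checking is whether those arrays are simply raw bytes that still need decoding. A rough sketch that turns such a list of decimal bytes into printable ASCII; bytes_to_ascii is a made-up helper name, and reading the array as ASCII is an assumption rather than something 21.02 documents:

```shell
# bytes_to_ascii: hypothetical helper, not part of OpenWrt.
# $1 is a comma- or space-separated list of decimal byte values.
# Printable bytes (32..126) are emitted as characters, all others as ".".
bytes_to_ascii() (
    out=""
    for b in $(printf '%s' "$1" | tr ',' ' '); do
        if [ "$b" -ge 32 ] && [ "$b" -le 126 ]; then
            # %b understands \0ooo octal escapes, so convert the byte to octal first
            out="$out$(printf '%b' "\\0$(printf '%o' "$b")")"
        else
            out="$out."
        fi
    done
    printf '%s\n' "$out"
)

bytes_to_ascii "255,181,71,83,80,78,0,16"   # prints ..GSPN..
bytes_to_ascii "0,0,48,48,48,48,0,0"        # prints ..0000..
```

Bytes 2 to 5 of the first array spell GSPN, i.e. 0x4753504E, which the vendor table quoted further down in this thread lists as Globespan. The first two bytes (0xff, 0xb5) differ from the 0xb500 prefix that the m_vendor code quoted there checks for.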

You are entitled to your opinion, but:

$ time /etc/init.d/dsl_control dslstat
real	0m 2.66s
user	0m 0.90s
sys	0m 1.76s

$ time ubus call dsl metrics
real	0m 0.02s
user	0m 0.00s
sys	0m 0.01s

paints a different picture, and in fact it was the long duration of "dsl_control dslstat" that motivated the change. You might disagree about the value of the speedup versus the required changes in consumers of dsl_control's output, but claiming "an improvement [in speed] adds no benefits" is simply factually wrong, unless we all agree that your subjective judgement is the final arbiter.
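
To put those timings in perspective for periodic polling, here is a back-of-the-envelope calculation of the CPU share each method would take. The 5-second polling interval is purely an assumption for illustration, and user+sys from the measurements above is taken as CPU time per call:

```shell
# cpu_share: hypothetical helper. Prints CPU seconds per call divided by the
# polling interval, i.e. the fraction of one core kept busy by polling.
# $1 = user time (s), $2 = sys time (s), $3 = polling interval (s, assumed)
cpu_share() {
    awk -v u="$1" -v s="$2" -v i="$3" 'BEGIN { printf "%.3f\n", (u + s) / i }'
}

cpu_share 0.90 1.76 5   # dsl_control dslstat:   prints 0.532
cpu_share 0.00 0.01 5   # ubus call dsl metrics: prints 0.002
```

With a longer interval the share shrinks proportionally, so how much this matters depends on how often the stats are collected.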

It would if there was a guarantee that dsl_control output would be a stable API, but is that actually true?

Maybe https://github.com/openwrt/openwrt/commit/5372205ca9afea8e51c1762eabcaf5a97350bbaf can help here:

static void m_vendor(const char *id, const uint8_t *value) {
	// ITU-T T.35: U.S.
	if (U16(value[0], value[1]) != 0xb500)
		return;

	const char *str = NULL;
	switch (U32(value[2], value[3], value[4], value[5])) {
	STR_CASE(0x414C4342, "Alcatel")
	STR_CASE(0x414E4456, "Analog Devices")
	STR_CASE(0x4244434D, "Broadcom")
	STR_CASE(0x43454E54, "Centillium")
	STR_CASE(0x4753504E, "Globespan")
	STR_CASE(0x494B4E53, "Ikanos")
	STR_CASE(0x4946544E, "Infineon")
	STR_CASE(0x54535443, "Texas Instruments")
	STR_CASE(0x544D4D42, "Thomson MultiMedia Broadband")
	STR_CASE(0x5443544E, "Trend Chip Technologies")
	STR_CASE(0x53544D49, "ST Micro")
	};

or you might want to use:
(this is the CPE)
. /lib/functions/lantiq_dsl.sh ; dsl_cmd g997listrg 0

which returns something like
'nReturn=0 nDirection=0 G994VendorID="..IFTNY." SystemVendorID="..IFTN.." VersionNumber="0123456789012345" SerialNumber="01234567890123456789012345678901" SelfTestResult=0 XTSECapabilities=(00,00,00,00,00,00,00,07)'

(this is the DSLAM/MSAN)
. /lib/functions/lantiq_dsl.sh ; dsl_cmd g997listrg 1
'nReturn=0 nDirection=1 G994VendorID="..BDCM.." SystemVendorID="..BDCM.." VersionNumber="v12.03.90 " SerialNumber="eq nr port:33 oemid softwarerev" SelfTestResult=0 XTSECapabilities=(00,00,00,00,00,00,00,02)'

All you need to do is translate the vendor 4 letter code into the current name...
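
That translation can be scripted. A sketch, assuming the vendor code always sits in characters 3 to 6 of G994VendorID as in the two samples above; the function names are invented for this example, and the names are the ASCII form of the hex constants in the commit quoted earlier:

```shell
# g994_vendor_code: hypothetical helper. Extracts the 4-letter vendor code
# from a g997listrg result line ($1), skipping the first two characters.
g994_vendor_code() {
    printf '%s\n' "$1" | sed -n 's/.*G994VendorID="..\(....\).*/\1/p'
}

# vendor_name: hypothetical helper. Maps a 4-letter code to a vendor name;
# unknown codes are printed unchanged.
vendor_name() {
    case "$1" in
        ALCB) echo "Alcatel" ;;
        ANDV) echo "Analog Devices" ;;
        BDCM) echo "Broadcom" ;;
        CENT) echo "Centillium" ;;
        GSPN) echo "Globespan" ;;
        IKNS) echo "Ikanos" ;;
        IFTN) echo "Infineon" ;;
        TSTC) echo "Texas Instruments" ;;
        TMMB) echo "Thomson MultiMedia Broadband" ;;
        TCTN) echo "Trend Chip Technologies" ;;
        STMI) echo "ST Micro" ;;
        *)    echo "$1" ;;
    esac
}

line='nReturn=0 nDirection=1 G994VendorID="..BDCM.." SystemVendorID="..BDCM.."'
vendor_name "$(g994_vendor_code "$line")"   # prints Broadcom
```

On a device where dsl_cmd is available, the sample line would be replaced by the output of dsl_cmd g997listrg 1.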

Good question, my lantiq modem is in hibernation and I have not yet looked at the most recent changes, but I am confident that the shell code from old lantiq_dsl.sh that implemented dsl_cmd could be copied into a stand alone script and be used from there. Actually, I will have to look at that in the future again, but due to reliability and retrain issues with the lantiq modem I switched over to a broadcom modem.

It looks like

is still part of the package, which in the end should suffice to talk to the firmware blob.

Just my 2 cents: parsing command output is always the wrong approach, so IMHO the ubus change should really improve data sampling and script handling.
Also, about the breakage: this is a new version, so it's really just an update of the package.
It would be a breakage if the change had been made within the same version (say, a minor update to 19.x changing the script).
They ported the data to ubus, which is the correct way to handle this kind of thing, and we are complaining about the effort...

In all the time (years) I have used OpenWRT modem routers (now I use only BT Home Hub 5), I found 3 use cases for which it becomes necessary to query dsl parameters. OpenWRT support for these 3 use cases makes this firmware stand out against any proprietary modem/router:

  1. manual query to check whether the line is performing as expected
  2. manual query to troubleshoot connection issues
  3. script to feed a database at regular intervals with data necessary to analyse line performance and degradation through time (particularly useful to support the case with stubborn ISPs when typical FTTC line degradation occurs over long periods of time)

Although a fast response time would indeed be desirable, none of those cases (especially 3) depends on a response time below 2 seconds.
However cases 1 and 2 depend upon a human readable report, whilst case 3 depends on scripts that are no longer available.

Therefore in light of the simple comparative impact analysis above, my assessment that the change applied adds no benefits is based on facts, rather than opinion.
To see added value I would have to see a positive impact on the 3 use cases above. Instead, as a consequence of the change, none of the 3 use cases can be achieved now.

My impression, on the other hand, is that your assessment is based on an opinion that does not take into account the typical use cases above.

On your following messages:

  • thank you for providing the vendor table.
  • I confirm that neither dsl_cmd nor lantiq_dsl.sh are in 21.02

I would like to ask what your typical hardware configuration for xDSL (FTTC) is. I find myself stuck with the BT Home Hub 5, which is an excellent modem router, but in some of my cases it has become a little too slow. On the other hand, I do not want to depend on a proprietary modem.

Except, that in the end, the source of the information here is always a query to the firmware-blob, which seems to report oddly formatted data packed in ASCII strings. So all methods query the same source. But I take it you argue against parsing the formatted output of dsl_control, to that I agree :wink:

I am upset because there was no user impact analysis. Practically they removed what made OpenWrt outstanding against proprietary xDSL modem/routers.
I suppose it can be added, but why not re-create the existing user interface as part of the change?

Two points:

The response time is only a symptom, the problem is the load the "old way" caused.

Take LuCI for example. It displays DSL stats on the front page, and they are refreshed every few seconds. As a consequence, the "old way" led to a load of ~ 0.5 just for keeping the front page open.

Calling status on an /etc/init.d script should always have displayed the service status; this was finally codified back in 2018. dsl_control had been an exception for a long time, having historically used the "status" argument to output ... well, the line status. To bring it in line with other /etc/init.d scripts, the change in syntax was necessary. This change happened separately, quite some time before the switch to ubus.

Addendum: I agree, there is no "beautiful" way to get DSL stats on the shell right now, at the moment it is purely machine-readable (though not impossible to parse for human eyes.) Reformatting the ubus output into a table should be easy to do, it's just up to somebody to create this reformatting script. That somebody ... could be you.
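
As a starting point for such a script, here is a sketch of the two formatting pieces the old table needs: a label column padded to 42 characters (counted from the old output quoted in this thread) and the uptime line. It leaves out fetching the values from ubus, and row and fmt_uptime are names invented here:

```shell
# row: hypothetical helper. Prints "Label:" padded to 42 columns, then the value.
row() {
    printf '%-42s%s\n' "$1:" "$2"
}

# fmt_uptime: hypothetical helper. Turns seconds into "Nd Nh Nm Ns",
# leaving out leading units that are not reached (no "0d" for short uptimes).
fmt_uptime() (
    t=$1
    d=$((t / 86400)); h=$((t % 86400 / 3600)); m=$((t % 3600 / 60)); s=$((t % 60))
    out="${s}s"
    [ "$t" -ge 60 ]    && out="${m}m $out"
    [ "$t" -ge 3600 ]  && out="${h}h $out"
    [ "$t" -ge 86400 ] && out="${d}d $out"
    printf '%s\n' "$out"
)

row "Line Uptime Seconds" 4114
row "Line Uptime" "$(fmt_uptime 4114)"      # value: 1h 8m 34s
row "Line Uptime" "$(fmt_uptime 183692)"    # value: 2d 3h 1m 32s
```

The uptime format reproduces the two samples posted in this thread; whether the old script handled every edge case the same way has not been checked.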

The new architecture is indeed an improvement, but improvements should not come with a detriment to usability, otherwise they are incomplete.

Sorry, it is based on your assessment of facts, which in itself is a subjective judgement. A fact and an evaluation of a fact are two separate things. But I get your annoyance; I would be annoyed as well if my so far working solution stopped doing so to offer new features I would not need.

Which is your right to think. I happen to disagree and will offer https://github.com/moeller0/lantiq_dsl_parser as indicator that I might be more involved in that question than you might assume.

Well, then use dsl_cpe_pipe.sh as shown above, which should do the trick as well.

I only have a single link, and I used a HH5A as bridged modem under OpenWrt as well. Since my ISP's vectoring line cards are all Broadcom based and have not played nicely with the xrx200 ever since the ISP activated G.INP in both directions, I switched to a zyxel vmg1312-B30A (configured as bridge).

I actually tried that initially, but already ran into CPU issues when trying with a 50/10 link as router/firewall/SQM, so I relegated the HH5A to bridged modem only. Which worked well until the ISP activated vectoring and bi-directional G.INP and increased the Sync-limits to 116/40, at which point I got multiple re-syncs per week, sometimes per day...

Yes, I understand. I came from the same position and was simply "converted" when the lantiq became unstable (through no fault of lantiq) and the broadcom proved to be rock solid. I have since rewired my internal phone line to further improve the line (fewer FEC errors, zero CRCs). I am tempted to try the HH5A again, but it would also make a decent AP to increase WiFi coverage: decisions, decisions, decisions :wink:

None of my providers support vectoring (I have Infostrada and Vodafone in Italy, Plusnet and TalkTalk in the UK), however I am using the vectoring Lantiq firmware as follows:

ATU-C Vendor ID:                          Broadcom 192.20
ATU-C System Vendor ID:                   Broadcom
Chipset:                                  Lantiq-VRX200
Firmware Version:                         5.9.1.4.0.7
API Version:                              4.17.18.6
XTSE Capabilities:                        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2
Annex:                                    B
Line Mode:                                G.993.2 (VDSL2)
Profile:                                  17a
Line State:                               UP [0x801: showtime_tc_sync]
Forward Error Correction Seconds (FECS):  Near: 0 / Far: 55
Errored seconds (ES):                     Near: 0 / Far: 4
Severely Errored Seconds (SES):           Near: 0 / Far: 0
Loss of Signal Seconds (LOSS):            Near: 0 / Far: 0
Unavailable Seconds (UAS):                Near: 173 / Far: 173
Header Error Code Errors (HEC):           Near: 0 / Far: 0
Non Pre-emtive CRC errors (CRC_P):        Near: 0 / Far: 0
Pre-emtive CRC errors (CRCP_P):           Near: 0 / Far: 0
Power Management Mode:                    L0 - Synchronized
Latency [Interleave Delay]:               0.14 ms [Fast]   0.0 ms [Fast]
Data Rate:                                Down: 86.081 Mb/s / Up: 18.911 Mb/s
Line Attenuation (LATN):                  Down: 8.3 dB / Up: 8.6 dB
Signal Attenuation (SATN):                Down: 8.3 dB / Up: 8.5 dB
Noise Margin (SNR):                       Down: 6.1 dB / Up: 8.0 dB
Aggregate Transmit Power (ACTATP):        Down: -21.9 dB / Up: 13.9 dB
Max. Attainable Data Rate (ATTNDR):       Down: 98.992 Mb/s / Up: 28.134 Mb/s
Line Uptime Seconds:                      183692
Line Uptime:                              2d 3h 1m 32s

The above is Vodafone IT; as you can see, the line is quite clean and never drops (the only restarts are due to my maintenance). But in Italy FTTC is 100% VDSL2 (no microfilters), and telephone is supplied via VoIP. In the UK the service isn't as good: I have always had problems, and they still share broadband with phone on the same line with microfilters.

I am wondering if I am missing something by continuing to use BT HH5.

The performance issues I was referring to were due to my requirement to also run TFTP and NFS shares (to boot diskless servers). Also, my router on the 100/20 VDSL2 line is just about performing OK with software NAT offload. If I had a faster line I think it would become a bottleneck.

No idea, but here in Germany quite a number of folks using xrx200 modems reported issues after vectoring and G.INP activation. Not restricted to HH5As but also with the over-here quite popular AVM Fritzbox 7490. I believe AVM managed to get things improved enough to stop being an issue, but they are using more recent VDSL drivers than OpenWrt and matching firmware blobs. I have a hunch that these blobs might not work optimally with the older drivers in OpenWrt, but no proof for that hypothesis.

My own line was quite dirty, so the sub-optimal combination of lantiq modem and broadcom linecard was just struggling/limping along. I went for a broadcom modem as written above, and it turned out that @takimata had already written a data collector for those that can be run as part of OpenWrt's statistics collection. In the end I got better monitoring for the new modem as well (plus dslstats to get QLN and SNRmargin/bit loading per carrier plots). I intended to go back to the HH5A after confirming that my line's instability was not caused by the lantiq modem... but it turned out the zyxel was stable immediately (albeit syncing a bit lower). So instead of complaining to my ISP and going back to the HH5A I simply stuck with the other modem (my family appreciates a stable internet way more than a slightly faster but less stable one :wink: )
Then over Christmas, I repositioned the telephone socket in my apartment and got rid of 12 meters of wiring, and with it most of the CRC errors the broadcom modem was still collecting...
So maybe one of these might help your UK line as well?

I can believe that the HH5A is nice and compact, but the SoC was designed in a world of ~50/10 DSL plans, not necessarily 80/20 or 100/40... it had a good run.

Going back to "ubus call dsl metrics": while working on a human-readable report that reproduces "dsl_control status", I noticed that errors->[near|far]->fecs is always 0 (this is a G.992.5 ADSL2+ line) whilst "es" increases, which is not a likely result. Could you please double-check whether "fecs" is picked up correctly?

One more question: please, could you let me know where I can find latency status on the new json structure, equivalent to the following lucistat output:
dsl.latency_s_down="Interleave"
dsl.latency_s_up="Fast"

This is also quite an important parameter to check when lines go bad.

Since my HH5A is powered down, there is not much I can do right now. Looking at the new code, things look reasonably okay, but the C code uses ioctl to actually query the values, so that is a bit hard to confirm/debug without a working lantiq modem...