DSL line degradation

Hi,

I am using a Fritz!Box 7520 with OpenWrt 23.05.2 on a DSL line from 1&1 (ds-lite). I noticed that there are a lot of errors during the day (especially "far"). There is a mandatory disconnect from the ISP each night, but instead of terminating the PPPoE connection, I get a complete line reset. According to the ISP technician, because of the errors during the day, the mandatory line reset is also used to "recalibrate" the line metrics, which results in permanently dropped data rates. My line degraded from approx. 63 Mbit/s downstream to 39 Mbit/s.

My question is why there are so many errors with OpenWrt but not with another box running the stock Fritz!Box firmware? I am using the original modem firmware for the 7520 on OpenWrt.

The output of the dsl metrics before a reboot:

root@OpenWrt:~#  ubus call dsl metrics
{
        "api_version": "4.23.1",
        "firmware_version": "8.13.1.5.0.7",
        "chipset": "Lantiq-VRX500",
        "driver_version": "1.11.1",
        "state": "Showtime with TC-Layer sync",
        "state_num": 7,
        "up": true,
        "uptime": 23685,
        "atu_c": {
                "vendor_id": [
                        181,
                        0,
                        66,
                        68,
                        67,
                        77,
                        194,
                        120
                ],
                "vendor": "Broadcom 194.120",
                "system_vendor_id": [
                        181,
                        0,
                        66,
                        68,
                        67,
                        77,
                        0,
                        0
                ],
                "system_vendor": "Broadcom",
                "version": [
                        118,
                        49,
                        50,
                        46,
                        48,
                        52,
                        46,
                        49,
                        50,
                        48,
                        32,
                        32,
                        32,
                        32,
                        32,
                        0
                ],
                "serial": [
                        101,
                        113,
                        32,
                        110,
                        114,
                        32,
                        112,
                        111,
                        114,
                        116,
                        58,
                        49,
                        54,
                        32,
                        32,
                        111,
                        101,
                        109,
                        105,
                        100,
                        32,
                        115,
                        111,
                        102,
                        116,
                        119,
                        97,
                        114,
                        101,
                        114,
                        101,
                        118
                ]
        },
        "power_state": "L0 - Synchronized",
        "power_state_num": 0,
        "xtse": [
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                2
        ],
        "annex": "B",
        "standard": "G.993.2",
        "profile": "17a",
        "mode": "G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)",
        "upstream": {
                "vector": true,
                "trellis": true,
                "bitswap": false,
                "retx": true,
                "virtual_noise": false,
                "ra_mode": "At initialization",
                "ra_mode_num": 1,
                "interleave_delay": 0,
                "inp": 45.000000,
                "data_rate": 12736000,
                "latn": 25.900000,
                "satn": 25.700000,
                "snr": 17.700000,
                "actatp": 9.900000,
                "attndr": 29711000,
                "mineftr": 12733000
        },
        "downstream": {
                "vector": true,
                "trellis": true,
                "bitswap": true,
                "retx": true,
                "virtual_noise": false,
                "ra_mode": "At initialization",
                "ra_mode_num": 1,
                "interleave_delay": 180,
                "inp": 73.000000,
                "data_rate": 39990000,
                "latn": 21.300000,
                "satn": 21.300000,
                "snr": 17.100000,
                "actatp": 14.500000,
                "attndr": 71502096,
                "mineftr": 39968000
        },
        "olr": {
                "downstream": {
                        "bitswap": {
                                "requested": 16,
                                "executed": 8,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sra": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sos": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        }
                },
                "upstream": {
                        "bitswap": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sra": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sos": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        }
                }
        },
        "errors": {
                "near": {
                        "es": 0,
                        "ses": 0,
                        "loss": 3,
                        "uas": 335,
                        "lofs": 0,
                        "fecs": 138,
                        "leftrs": 2,
                        "cv_c": 0,
                        "fec_c": 70,
                        "hec": 0,
                        "ibe": 0,
                        "crc_p": 0,
                        "crcp_p": 0,
                        "cv_p": 7,
                        "cvp_p": 0,
                        "rx_corrupted": 25105,
                        "rx_uncorrected_protected": 24861,
                        "rx_retransmitted": 0,
                        "rx_corrected": 244,
                        "tx_retransmitted": 9826
                },
                "far": {
                        "es": 95,
                        "ses": 46,
                        "loss": 4,
                        "uas": 335,
                        "lofs": 0,
                        "fecs": 1596,
                        "leftrs": 837,
                        "cv_c": 292,
                        "fec_c": 51140,
                        "hec": 0,
                        "ibe": 0,
                        "crc_p": 0,
                        "crcp_p": 0,
                        "cv_p": 0,
                        "cvp_p": 0,
                        "rx_corrupted": 2395966,
                        "rx_uncorrected_protected": 2391857,
                        "rx_retransmitted": 0,
                        "rx_corrected": 4109,
                        "tx_retransmitted": 17248669
                }
        },
        "erb": {
                "sent": 109847,
                "discarded": 0
        }
}

Same output after resetting the router (I also changed the firmware file):

root@OpenWrt:~# ubus call dsl metrics
{
        "api_version": "4.23.1",
        "firmware_version": "8.13.1.10.1.7",
        "chipset": "Lantiq-VRX500",
        "driver_version": "1.11.1",
        "state": "Showtime with TC-Layer sync",
        "state_num": 7,
        "up": true,
        "uptime": 46,
        "atu_c": {
                "vendor_id": [
                        181,
                        0,
                        66,
                        68,
                        67,
                        77,
                        194,
                        120
                ],
                "vendor": "Broadcom 194.120",
                "system_vendor_id": [
                        181,
                        0,
                        66,
                        68,
                        67,
                        77,
                        0,
                        0
                ],
                "system_vendor": "Broadcom",
                "version": [
                        118,
                        49,
                        50,
                        46,
                        48,
                        52,
                        46,
                        49,
                        50,
                        48,
                        32,
                        32,
                        32,
                        32,
                        32,
                        0
                ],
                "serial": [
                        101,
                        113,
                        32,
                        110,
                        114,
                        32,
                        112,
                        111,
                        114,
                        116,
                        58,
                        49,
                        54,
                        32,
                        32,
                        111,
                        101,
                        109,
                        105,
                        100,
                        32,
                        115,
                        111,
                        102,
                        116,
                        119,
                        97,
                        114,
                        101,
                        114,
                        101,
                        118
                ]
        },
        "power_state": "L0 - Synchronized",
        "power_state_num": 0,
        "xtse": [
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                2
        ],
        "annex": "B",
        "standard": "G.993.2",
        "profile": "17a",
        "mode": "G.993.2 (VDSL2, Profile 17a, with down- and upstream vectoring)",
        "upstream": {
                "vector": true,
                "trellis": true,
                "bitswap": false,
                "retx": true,
                "virtual_noise": false,
                "ra_mode": "At initialization",
                "ra_mode_num": 1,
                "interleave_delay": 0,
                "inp": 45.000000,
                "data_rate": 12736000,
                "latn": 25.900000,
                "satn": 25.900000,
                "snr": 17.700000,
                "actatp": 9.900000,
                "attndr": 29536000,
                "mineftr": 12733000
        },
        "downstream": {
                "vector": true,
                "trellis": true,
                "bitswap": false,
                "retx": true,
                "virtual_noise": false,
                "ra_mode": "At initialization",
                "ra_mode_num": 1,
                "interleave_delay": 180,
                "inp": 73.000000,
                "data_rate": 39990000,
                "latn": 21.300000,
                "satn": 21.300000,
                "snr": 18.700000,
                "actatp": 14.500000,
                "attndr": 79478784,
                "mineftr": 39990000
        },
        "olr": {
                "downstream": {
                        "bitswap": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sra": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sos": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        }
                },
                "upstream": {
                        "bitswap": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sra": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        },
                        "sos": {
                                "requested": 0,
                                "executed": 0,
                                "rejected": 0,
                                "timeout": 0
                        }
                }
        },
        "errors": {
                "near": {
                        "es": 0,
                        "ses": 0,
                        "loss": 0,
                        "uas": 144,
                        "lofs": 0,
                        "fecs": 0,
                        "leftrs": 0,
                        "cv_c": 0,
                        "fec_c": 0,
                        "hec": 0,
                        "ibe": 0,
                        "crc_p": 0,
                        "crcp_p": 0,
                        "cv_p": 7,
                        "cvp_p": 0,
                        "rx_corrupted": 0,
                        "rx_uncorrected_protected": 0,
                        "rx_retransmitted": 0,
                        "rx_corrected": 0,
                        "tx_retransmitted": 0
                },
                "far": {
                        "es": 95,
                        "ses": 46,
                        "loss": 4,
                        "uas": 144,
                        "lofs": 0,
                        "fecs": 1596,
                        "leftrs": 843,
                        "cv_c": 292,
                        "fec_c": 51140,
                        "hec": 0,
                        "ibe": 0,
                        "crc_p": 0,
                        "crcp_p": 0,
                        "cv_p": 0,
                        "cvp_p": 0,
                        "rx_corrupted": 2413845,
                        "rx_uncorrected_protected": 2409736,
                        "rx_retransmitted": 0,
                        "rx_corrected": 4109,
                        "tx_retransmitted": 17248670
                }
        },
        "erb": {
                "sent": 676,
                "discarded": 0
        }
}
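A side note on reading the `atu_c` block: the `version` array, and the middle of `vendor_id`, look like plain ASCII bytes padded with spaces and NULs. The Python sketch below decodes them. The helper name `decode_ascii_field` is made up for illustration, and the split of `vendor_id` into two country bytes, four vendor-code bytes and two vendor-specific bytes is an assumption based on the usual G.994.1 layout, not something shown in the output itself.

```python
def decode_ascii_field(values):
    """Turn a list of byte values into text, dropping NUL and space padding."""
    text = bytes(values).decode("ascii", errors="replace")
    return text.strip("\x00 ")

# Arrays copied from the "atu_c" section of the ubus output
version = [118, 49, 50, 46, 48, 52, 46, 49, 50, 48, 32, 32, 32, 32, 32, 0]
vendor_id = [181, 0, 66, 68, 67, 77, 194, 120]

print(decode_ascii_field(version))         # v12.04.120
# Assumption: bytes 2..5 hold the four-character vendor code
print(decode_ascii_field(vendor_id[2:6]))  # BDCM
```

The decoded version string matches the "Broadcom 12.4.120" that FritzOS reports for the remote side.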

As you can see, the "far" errors are numerous and do not get reset, most likely causing the line to degrade after a new sync.

Thanks for your support!

Please install and use the go-dsl tool and post a screenshot (make sure to show maximum and minimum and to autoscale the graphs by ticking the respective checkboxes in the GUI). That might give some clue as to what is happening to your signal quality.
Typical suspects are:
a) powerline (PLC) adapters in the neighbourhood
b) slightly defective switching-type power supplies
c) bad contacts in TAE or APL connectors
d) deteriorated/broken wires

That said, the far errors are large because they are maintained by the DSLAM and will not be reset when you reboot your modem. So they are mostly problematic if they grow; if they are just large, all it means is that the link has been up for a long time or that there were issues in the past...
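To make "only problematic if they grow" concrete, here is a minimal Python sketch that diffs a few of the "far" counters from the two `ubus call dsl metrics` outputs posted at the top of the thread. The function name `counter_deltas` is hypothetical; the numbers are copied from those outputs.

```python
def counter_deltas(before, after):
    """Return {name: after - before} for every counter whose value changed."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }

# Selected "far" counters before the reboot ...
far_before = {"es": 95, "ses": 46, "fecs": 1596, "leftrs": 837, "cv_c": 292,
              "fec_c": 51140, "rx_corrupted": 2395966,
              "tx_retransmitted": 17248669}
# ... and 46 seconds into the new sync
far_after = {"es": 95, "ses": 46, "fecs": 1596, "leftrs": 843, "cv_c": 292,
             "fec_c": 51140, "rx_corrupted": 2413845,
             "tx_retransmitted": 17248670}

print(counter_deltas(far_before, far_after))
# {'leftrs': 6, 'rx_corrupted': 17879, 'tx_retransmitted': 1}
```

Most counters are identical across the reboot, which fits the explanation that the DSLAM keeps them; only the differences between two readings say anything about what happened in between.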

a) Possible. I can see some SSIDs from powerline adapters.
b) I would rule that one out, because a few years ago I had another box with a different power supply on which the line also degraded.
c) The TAE was freshly installed by the DTAG technician, and I also changed the TAE-F/RJ-12 cable to the router. The APL connection is messy.
d) The DSLAM is over 730 m away. The wires on the house (around the APL) are a mess and very old.

Still, with the stock firmware, the line will not degrade (at least not so extremely fast).

The graphs below are all empty, so I left them out.

Mmh, please let go-dsl run for 24 hours and retake the screenshot. This will give more meaningful maximum and minimum plots and will also populate the 'graphs below', which show the errors in blocks of 5 minutes over the last 24 hours.

At the moment it looks like the long cable makes your line susceptible to interference (as seen in the intervention of the DLM/dynamic port optimisation), but we do not yet see clear signs of that interference itself...

Does the program itself need to run all the time or did you mean letting the DSL sync stay up for 24 h?

You need to let the program run, as it collects data from the modem every few seconds, I believe.

This is the line after running for approx. 12 h. I had to abort because the line kept degrading to an all-time low, and the DTAG technician will reset it today. I am now running my FB 7412 with stock firmware and will let go-dsl run on that as well for comparison.

Now here are the results of the FB 7520 with OpenWrt. The image quality is worse because I used the -web option and thought printing to PDF would give me high-resolution vector graphics, which it did not. The red bar at around 4 h marks the line reset that happens during the mandatory disconnect.

You can press the save button in the GUI, which will export each individual plot as an SVG file.

In the signal-to-noise ratio plot we see quite a number of differences between maximum and minimum that might be caused by PLC adapters or similar... If you can, please try again with a ~24 hour measurement.

Excellent idea, for FritzOS please use:

./dsl-gui -d fritzbox -o LoadSupportData=1 fritz.box

to get the full set of information

I tried a different approach and flashed FritzOS on the 7520, which should give less confounded results as the same hardware is used. Unfortunately, the FB ran out of RAM after about 8 h, so the results were unrecoverably lost (go-dsl can't save when there is an error). However, I observed nearly no errors, both near and far, at any point in time in go-dsl. Furthermore, FritzOS also did not report any line errors. This leads me to the conclusion that the problem lies in the way OpenWrt handles the line, e.g. vectoring.

I found this post from @janh (Vectoring on Lantiq VRX200 / VR9 - missing callback for sending error samples), which points out that with the smaller modem (VRX200) there have been issues with deteriorating lines in the past. That specific issue has already been fixed in OpenWrt, but there may be something else related at play in my case. Could this be a configuration error by any chance, e.g. using PTM instead of ATM or using a wrong tone? It almost looks like there are communication errors due to incompatibilities.

The firmware files were all taken from the latest (v7.57) Fritz FW:

aca_fw.bin
ppe_fw.bin
vr11-B-dsl.bin

PS: I will run go-dsl again with FritzOS via -web. However, I want to refrain from using OpenWrt for a few days or weeks because I do not want my bandwidth to go down even further. It should recover over time due to the bandwidth management running on the DSLAM. I would really like to make this work (and it should, as @elder_tinkerer runs the same box without my issues), but if there is no chance of fixing the problem, I could try setting up the other FB 7412 in (half/full) bridge mode and use OpenWrt on the 7520.

I have used an FB 7520 under OpenWrt, configured as a bridged modem, for more than 6 months on a 100/40 Mbps Telekom link, so vectoring is in play, as is G.INP. Still, I get full sync (116/46), and there was only a single unscheduled resync in 7 months (and the scheduled resyncs were all from ~7 months ago). However, I do live pretty close to the DSLAM, which might make a difference.

I had exactly those issues with xrx200 under OpenWrt before, but Jan's patches fixed that for me; I never had these issues with the 7520...

I think we can conclude that OpenWrt does a subpar job with the modem, which becomes a problem under certain circumstances (bad wires/interference). Did you follow https://openwrt.org/toh/avm/avm_fritz_box_7530? If you could post your /etc/config/network, I might compare it to mine, although I doubt that this is the cause of any problems. I will post the results from the FritzOS analysis tomorrow. Thank you so far for your support!

Yes, pretty much...

Sure:

root@OpenWrt_FB7520:~# cat /etc/config/network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fdcd:4cf3:765d::/48'

config atm-bridge 'atm'
	option vpi '1'
	option vci '32'
	option encaps 'llc'
	option payload 'bridged'
	option nameprefix 'dsl'

config dsl 'dsl'
	option tone 'b'
	option annex 'b'
	option ds_snr_offset '0'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config interface 'lan'
	option proto 'static'
	option ip6assign '60'
	option device 'br-lan.42'
	list ipaddr '192.168.100.1/24'

config device
	option name 'dsl0'
	option macaddr '98:9B:CB:C0:F5:BB'

config interface 'wan'
	option device 'dsl0'
	option proto 'none'

config interface 'wan6'
	option device '@wan'
	option proto 'dhcpv6'

config bridge-vlan
	option device 'br-lan'
	option vlan '42'
	list ports 'lan1:t'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config bridge-vlan
	option device 'br-lan'
	option vlan '7'
	list ports 'lan1:t*'

config device
	option type '8021q'
	option ifname 'dsl0'
	option vid '7'
	option name 'dsl0.7'

config device
	option type 'bridge'
	option name 'br-dsl'
	list ports 'br-lan.7'
	list ports 'dsl0.7'

config interface 'MODEM'
	option proto 'none'
	option device 'br-dsl'

Note that I use VLAN 7 to reach the DSL bridge/internet and VLAN 42 to reach br-lan and the 7520's GUI, all over the same LAN port pair between router and modem.

There might be interference that happens only at specific times, which would be visible in the minimum and maximum SNR lines, but only if the data was collected when the issue happened.

The problem is that with go-dsl, FritzOS will run out of memory after approx. 9-10 h, resulting in a disconnect in go-dsl. The process failed again tonight, so I took a new sample over 9 h today. Interestingly, almost no errors are reported, as opposed to the massive amount I got with OpenWrt (on the order of 10^9). But that might just be false reporting, as there were exactly 0 "far" errors immediately after I installed FritzOS again, while those errors persisted under OpenWrt even after reboots.

           State:    Showtime
            Mode:    VDSL2 Profile 17a
          Uptime:    9 hours, 7 minutes

          Remote:    Broadcom 12.4.120
           Modem:    AVM 1.180.131.100

     Actual rate:       34990 kbit/s      12736 kbit/s 
 Attainable rate:       79790 kbit/s      27742 kbit/s 
         MINEFTR:       34968 kbit/s      12733 kbit/s 

         Bitswap:         off               off        
 Rate adaptation:         off               off        

    Interleaving:           0 ms              0 ms     
             INP:        73.0 symbols      45.0 symbols
  Retransmission:          on                on        

       Vectoring:        full              full        

     Attenuation:        21.0 dB           26.0 dB     
      SNR margin:        21.0 dB           18.0 dB     
  Transmit power:        14.0 dBm          10.0 dBm    

    RTX TX Count:          43                 0        
     RTX C Count:          43                 0        
    RTX UC Count:           0                 0        

       FEC Count:           0                 0        
       CRC Count:           0                 0        

        ES Count:           0                 0        
       SES Count:           0                 0        

The far errors are detected and maintained by the DSLAM as far as I understand, and the DSLAM just increments them. I believe FritzOS subtracts the value from just after synchronisation from the reported values, as that would be the most relevant number for the current uptime, I guess.

That is not good, sorry for having sent you down that path... Querying my 7520 under OpenWrt, I can let go-dsl run for ages and nothing crashes... Maybe -o LoadSupportData=0 would be less demanding; the biggest difference would be that you do not get the QLN and HLOG plots, but these only show data collected during the synchronisation phase, so for longer-term error checking the basic go-dsl would work... Heck, even without go-dsl, simply posting screenshots of the Internet->DSL Informationen tab from the FritzOS GUI would help (I would like to see the Spektrum plots after an uptime of 24 hours, with the link "Maximum und Minimum anzeigen" clicked before taking the screenshot).

An update that changes this will be available later today.

For the older modems, an additional kernel driver is necessary to handle vectoring, which was missing in OpenWrt. On VRX500, the same task is handled by the firmware, so this issue doesn't exist here.

No, if either of these options was the issue, you wouldn't get a connection at all. VDSL2 connections in Germany always use PTM. And the tone option only affects the carrier set that is used during the handshake phase (i.e. at the very beginning of the connection).

Exactly. On OpenWrt the raw counters from the modem are reported. This means that errors from before the current connection (and from before the device was even started) may be included for those counters that are maintained by the DSLAM. In some cases it may even report spurious values (like the downstream FEC counter in your second screenshot, which is suspiciously close to the maximum value of an unsigned 32-bit integer).
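As a rough illustration of what "suspiciously close to the maximum" means, the following Python sketch flags readings near 2^32 - 1. The function name `looks_spurious` and the 1 % threshold are assumptions made for this example, not anything go-dsl or the driver does, and the first test value is invented rather than taken from the screenshot.

```python
UINT32_MAX = 2**32 - 1  # 4294967295

def looks_spurious(value, tolerance=0.01):
    """Return True if value lies within `tolerance` (a fraction) of UINT32_MAX.

    The 1 % default is an arbitrary illustration threshold.
    """
    return value >= UINT32_MAX * (1 - tolerance)

# Hypothetical reading close to the 32-bit limit (not the actual screenshot value)
print(looks_spurious(4294960000))  # True
# The far-end fec_c value from the first post, for comparison
print(looks_spurious(51140))       # False
```

A counter of roughly 4.29 * 10^9 is therefore more likely an artefact of the raw register than a real error count.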

Thus, you should ignore the absolute counters in the table, and look only at changes instead. The graphs should make this easy.

The error rates in your screenshots from vendor firmware and OpenWrt are both very low, so there is no indication of an actual issue so far (and all the errors that occurred while monitoring were corrected ones anyway).

The other data also doesn't show any issue, so it is unclear why the Assia/DLM system has limited your line to such a low data rate. I would try to keep the line uninterrupted for a few days or weeks, maybe that will cause the system to reassess your line and increase the rates again.
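To put a number on "such a low data rate", this small Python sketch compares the synced rate with the attainable rate (`attndr`) from the second `ubus call dsl metrics` output in the first post. The helper name `rate_utilisation` is hypothetical, and `attndr` is only the modem's own estimate, so treat the result as a rough indication.

```python
def rate_utilisation(data_rate, attndr):
    """Fraction of the estimated attainable rate used by the current sync."""
    return data_rate / attndr

# Values in bit/s, taken from the output after the reset
ds = rate_utilisation(39990000, 79478784)  # downstream
us = rate_utilisation(12736000, 29536000)  # upstream
print(f"downstream: {ds:.1%}, upstream: {us:.1%}")
# downstream: 50.3%, upstream: 43.1%
```

With the downstream sync at about half of what the modem considers attainable, the limit appears to come from the profile applied on the DSLAM side rather than from the line itself.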

I also suspect that the most likely cause of this is the loading of support data, because other than that, only accesses to the web interface and the officially supported TR-064 interface are made.

It's been a while since I was using VDSL/DTAG, but I remember that it took me over a week (but less than two weeks) to get back up to speed after a modem reboot/firmware upgrade.

The time the dynamic port optimisation (that is Telekom's name for this system) takes to recover from throttling is unfortunately not strictly predictable for a single line. Sometimes it takes one week, sometimes two, and occasionally much longer. However, one observation that a number of people have made is that the first step often takes quite some time, while subsequent rate limit increases can follow in much more rapid sequence, so knock on wood.
However for the OP I still wonder what causes the downgrading in the first place as the error counters seem to be in the OK range...

@janh It seems that during the mandatory disconnect DNS resolution gets confused, and since I was using the hostname fritz.box, go-dsl lost its connection. Coincidentally, the reported RAM usage went up and up during the go-dsl monitoring. So there does not seem to be a problem, and I will now let the program run using the IP address. PS: I could not find out how to configure a different ListenAddress:port using the TOML file so that I could access the graphs from another device.

Topic:
The DLM (Dynamic Line Management) throttled once again while I was using the stock OS on the FB. I suspect this is an aftereffect of the error counter on the port rather than a problem that persists even with the stock OS. I will report tomorrow, hopefully with a 24 h report.

The port optimiser also gets triggered by frequent resyncs, so if you test and switch between OpenWrt and FritzOS on the same device, and hence resync frequently, the system might misdiagnose this as a sign of link instability and throttle your link. The port optimiser essentially operates heuristically and seems to err on the side of stability over throughput.