Lantiq xDSL - how to log SRA activity?

While resyncs are logged in the system logs, SRA caused sync rate adjustments don't appear to be. Fritz!OS seems able to do this (though AVM dumb it down and only report the first rate change) - does anyone have any idea how I could implement this in OpenWrt?

Background: I'm experiencing irregular WAN DHCP lease renewal failures and in discussing this with my service provider's support team it would be very helpful if I could pinpoint any SRA activity. The IPoE session has a 30 minute lease limit; some days I get no failures, other days I've had as many as 4 (so far). A new lease is granted within 5-15 seconds of a failure, so the authentication is basically ok but something is choking the renewals - when it happens the 15 minute renewal attempt (half way point in the lease time) fails as do all subsequent renewal requests (I've seen around 5-6 logged) up until the lease expires. In nearly all cases I'm aware of so far, there is little other WAN activity at the time.

I'm not aware that the (internal) DSL modem logs anything in the system log. Of course resyncs cause a WAN disconnect, and that is logged in the syslog, but I am not sure that the modem itself logs anything.

(Edit) In a way it could, but it doesn't: A change in line status triggers several scripts in /etc/hotplug.d/dsl/, one of them for example takes care of the LED signaling of the line status. However, nothing in there logs anything about the line status into the syslog.

I am also not aware that SRA activity prompts a modem statistics value to change -- other than the line rate itself of course.

That being said, it would be trivial to check the line rates at regular intervals (this can be every minute or even more frequent) and, if they change, log time and new rate, to the syslog or to some other file. Is that what you want?

Also, as a sidenote, if you're running a Vectoring line, you should absolutely consider switching to a current snapshot. Improvements to the error handling that significantly improve line stability have not landed in a release OpenWrt version yet.

There are also counters for SRA events, accessible via dsl_cpe_pipe.sh:

# dsl_cpe_pipe.sh osg 0 # OlrStatistics_US
nReturn=0 nDirection=0 nBitswapRequested=57552 nExtBitswapRequested=0 nBitswapExecuted=57552 nBitswapRejected=0 nBitswapTimeout=0 nSraRequested=0 nSraExecuted=0 nSraRejected=0 nSraTimeout=0 nSosRequested=0 nSosExecuted=0 nSosRejected=0 nSosTimeout=0

# dsl_cpe_pipe.sh osg 1 # OlrStatistics_DS
nReturn=0 nDirection=0 nBitswapRequested=142 nExtBitswapRequested=0 nBitswapExecuted=112 nBitswapRejected=30 nBitswapTimeout=0 nSraRequested=0 nSraExecuted=0 nSraRejected=0 nSraTimeout=0 nSosRequested=0 nSosExecuted=0 nSosRejected=0 nSosTimeout=0

Oh right, with all the time I spent on ubus-exposed statistics recently I forgot about the dsl command pipe (in my defense: mainly because it's not availble to collectd-exec). Thanks for the correction.

There's a couple of netifd log lines reporting 'dsl0' as up as well as some kernel output, but nothing else.

On the hotplug approach: after some reading and experimenting with the generic logger script shown here it looks like hotplug has events that trigger on a DSL rate change which reflect closely enough to SRA activity for my purposes.

Specifically, the variable DSL_NOTIFICATION_TYPE (used in /etc/hotplug.d/dsl/led_dsl.sh) values:

  • DSL_DATARATE_STATUS_US
  • DSL_DATARATE_STATUS_DS

reflect changes to the actual data rate in the upstream and downstream respectively.

If I get organised enough I'll post a simple hotplug script to fill in some gaps in this area.

Yes... :wink:

Looks like this is a thread where I get thoroughly and rightfully corrected from all sides. :wink: I honestly never saw any syslog lines regarding dsl0, but that might also be because my Lantiq dsl0 went up once and then never down again over the last month.

That is very interesting, but also it makes it really easy to do what you want. I wonder if it would be worth the effort of documenting catch-able hotplug events somewhere. Edit/infodump: the relevant git commit

I think it would be very helpful if added to the hotplug Wiki page. I'm currently not able to do this myself but if no-one else gets in beforehand I'll try and remember to do so when I have some time off in a couple of months time.

Thanks for finding that, it's a very useful base for a Wiki update. An observation from what I've seen so far: both data rate value environment variables are present in both the _US and _DS notification cases, which is slightly different to the description in that text.

On a separate note (pertaining to the background info in my original post): so far there seems to be some sort of relationship between SRA activity and the WAN lease renewal failures mentioned (at least in the the current VDSL session), however it's definitely not clear what the relationship might be as the first signs of trouble have never been less than about 5 minutes after the sync rate change. The failures have all occurred within 30 minutes (one full lease period) after a sync rate change (mostly within 15 minutes, but I've seen one instance with a successful renewal after the sync rate change followed by a failure). Quite odd... :thinking: :roll_eyes:

Based on my experimentation and the git commit found by @takimata, I've added DSL information to the hotplug Wiki page along with an example logging script.

2 Likes

That's a really handy script, to the point where I am thinking: for DSL users, would this not be generally good info to have in the syslog? An extension/rework of hotplug.d/dsl/led_dsl.sh even, to not just visually notify about status changes, but also log them?

Anyway, here's a sidbar question. And sorry if it's vague, it's off my memories of a several years back. I used to have problems with my back-then-ADSL line spontaneously resynching. And I seem to remember that I wrote a similar script and ran into problems when I reacted to all line status changes. I don't exactly remember why, and how, but by reacting to the changes and logging them I caused some sort of race condition and disturbed the DSL synching process itself. (Yes, weird. It should not even be possible to do that.) You probably tested your script very thoroughly and went through quite a few sync processes, did you ever observe something similar?

+1 - whether led_dsl.sh is extended or another script added

No, but my line has been relatively stable since the last fault was rectified - only occasional SRA events.

I seem to remember that early in OpenWrt's Lantiq DSL support there was no SMP - if what you saw was on a uniprocessor kernel I could see that there might have been situations where that could have happened for lack of spare CPU capacity.

Yeah, disregard my question.

I thought about what I did back then a bit longer, and I'm now sure that I didn't actually go through hotplug.d, but rather wrote a script that regularly polls the DSL line state through the dsl CPE pipe script. In hindsight not the smartest way to do it, querying the modem at significant points in its sync process probably caused it to hang.

I'm not entirely sure if the hotplug events didn't exist back then, or I just didn't know about them, which is weird because I most definitely knew about the dsl_led script.

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.