Lynx
2347
Has the lualanes issue been fixed? Or is it still necessary to compile a patched version?
That's a bit of an issue at present with the lua version.
To my understanding, this cannot be fixed by the sqm-autorate team - please educate me if you believe that I am wrong. I accept that this still means that there is an issue for you. We are tracking this at
stry
2349
The problem for me isn't the ICMP reflectors being used; it's that at peak times the pings start fluctuating over the connection.
The script then detects the fluctuations and lowers the set bandwidth. So the only option for me is to raise max_delta_RTT to a value high enough that normal fluctuations don't cause problems.
I think the default reflectors are perfect for most people; the only thing is that the delta has to be adjusted for users who have a high RTT and high fluctuation.
stry
2350
I am currently using the shell version.
@stry have you tried the lua sqm-autorate script?
Lynx
2352
Correction: lua version team.
If you were the sqm-autorate team then I suppose you could switch language.
stry
2353
Not yet - Will try it this afternoon.
So a 350ms threshold is 'special'... especially with close-by reflectors. I think you might profit from a temporal criterion in addition to the magnitude criterion, where you only act if X out of the last Y samples cross the threshold.
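A minimal sketch of such an X-out-of-Y check in shell, assuming the delay samples (in ms) are passed as arguments, oldest first. The function name `x_of_y_exceeded` and its argument order are hypothetical and not part of sqm-autorate:

```shell
#!/bin/sh
# Hypothetical helper, not taken from sqm-autorate.
# Succeeds (exit status 0) when at least X of the last Y delay samples
# are above the threshold.
# usage: x_of_y_exceeded THRESHOLD_MS X Y sample1 sample2 ...
x_of_y_exceeded() {
    threshold=$1; x=$2; y=$3
    shift 3
    # keep only the last Y samples, then count those above the threshold
    printf '%s\n' "$@" | tail -n "$y" | awk -v t="$threshold" -v x="$x" '
        $1 + 0 > t + 0 { n++ }
        END { exit !(n + 0 >= x + 0) }'
}

# One isolated 350ms spike among the last 5 samples: no action.
if x_of_y_exceeded 15 3 5 10 12 350 11 10; then
    echo "decrease bandwidth"
else
    echo "hold"
fi
```

With X=3 and Y=5 a single spike is ignored, while three elevated samples in the last five would trigger a decrease. The values of X and Y are a policy choice, just like the threshold itself.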
Because we are deep in policy land here, tolerating up to 350ms excursions essentially means that bufferbloat is not controlled well at all. But then this is why the threshold parameter can and should be adapted to a user's preferences; policy needs to be set individually by and for each network.
The lua code will appear to work better for you as it already does a 'temporal discounting'*, but at the cost that the threshold parameter is not easily settable by observing real RTTs/OWDs.
*) Technically it bandpass filters the delay measurements, which effectively reduces the magnitude of individual spikes, which in turn means that unless the high delay persists long enough your 350ms spike will not cross a considerably lower threshold.
This behavior might fit your policy well.
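To illustrate how filtering shrinks an isolated spike, here is a rough sketch using a plain exponentially weighted moving average. This is a simpler low-pass stand-in chosen for illustration, not the filter the lua code actually uses; `ewma_filter` and the smoothing factor 0.1 are assumptions:

```shell
#!/bin/sh
# Illustrative only: a simple EWMA (low-pass) smoother, NOT the lua
# version's actual filter.
# usage: ewma_filter ALPHA   (reads one delay sample in ms per line)
ewma_filter() {
    awk -v a="$1" '
        NR == 1 { e = $1 }                       # seed with the first sample
        NR > 1  { e = a * $1 + (1 - a) * e }     # blend new sample into the average
        { printf "%.1f\n", e }'
}

# A single 350ms spike in an otherwise steady 40ms series
printf '%s\n' 40 40 350 40 40 | ewma_filter 0.1
# prints 40.0, 40.0, 71.0, 67.9, 65.1 (one value per line)
```

The raw spike of 350ms shows up as only 71ms after smoothing, so it would stay below, say, a 100ms threshold unless the high delay persisted for several samples. It also shows the cost mentioned above: the filtered values no longer correspond directly to the RTTs you observe.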
Lynx
2355
I find it curious though because with a 350ms spike wouldn't that just affect one or two ticks and so only result in that much worth of decrease?
Also the script in main actually pulls out the average over each tick (I have not yet changed it to minimum):
echo $(/usr/bin/ping -i 0.00 -c 10 $reflector | tail -1 | awk '{print $4}' | cut -d '/' -f 2) >> $RTTs&
The second entity in the fourth column is the average from the 10 pings for that tick:
rtt min/avg/max/mdev = 38.456/44.683/49.664/4.902 ms, pipe 3, ipg/ewma 19.571/45.029 ms
So presumably @stry encounters a big spike across the average of 10 pings for each tick that punishes bandwidth. One spike increases the average, but not the minimum, so perhaps I really should change this to minimum now @moeller0! Agreed?
@stry please could you try changing the line:
echo $(/usr/bin/ping -i 0.00 -c 10 $reflector | tail -1 | awk '{print $4}' | cut -d '/' -f 2) >> $RTTs&
To:
echo $(/usr/bin/ping -i 0.00 -c 10 $reflector | tail -1 | awk '{print $4}' | cut -d '/' -f 1) >> $RTTs&
Because it takes the average for each ping cycle, the script is a little overly sensitive to occasional spikes. So this simple change may improve things.
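For reference, a small sketch of what the two `cut` field choices select, run against the sample summary line quoted above instead of a live ping. The helper name `rtt_field` is made up for this example:

```shell
#!/bin/sh
# Sample ping summary line copied from the post above
summary='rtt min/avg/max/mdev = 38.456/44.683/49.664/4.902 ms, pipe 3, ipg/ewma 19.571/45.029 ms'

# Hypothetical helper: pick field N (1=min, 2=avg, 3=max, 4=mdev)
# from the fourth whitespace-separated column of the summary line.
rtt_field() {
    printf '%s\n' "$1" | awk '{print $4}' | cut -d '/' -f "$2"
}

rtt_field "$summary" 1   # prints 38.456 (minimum)
rtt_field "$summary" 2   # prints 44.683 (average)
```

A single slow reply among the 10 pings raises the second field but leaves the first untouched, which is why the minimum is less sensitive to occasional spikes.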
@stry also, would be great if you could post some data showing ramping up of download vs sustained high load - would be good to see how the shell script copes with ramping up bandwidth and then oscillation around peak bandwidth.
We discussed that in the past. I agree that minimum seems to be the best statistic here, since link bufferbloat will invariably affect ALL connections. This assumes, however, that spuriously low RTT results like 0 are not possible, e.g. as a consequence of an unhandled error.
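One way to guard against such spurious values might look like the sketch below, assuming the RTT arrives as a plain decimal string in ms. `valid_rtt` and the optional floor argument are hypothetical and not something the script currently has:

```shell
#!/bin/sh
# Hypothetical guard, not part of the existing script.
# Succeeds only if VALUE is a plain non-negative decimal number
# strictly greater than FLOOR_MS (default 0).
# usage: valid_rtt VALUE [FLOOR_MS]
valid_rtt() {
    printf '%s\n' "$1" | awk -v floor="${2:-0}" '
        /^[0-9]+(\.[0-9]+)?$/ && $1 + 0 > floor + 0 { ok = 1 }
        END { exit !ok }'
}

# Only record a minimum RTT that passes the check
for rtt in 38.456 0 "" abc; do
    if valid_rtt "$rtt"; then
        echo "keep '$rtt'"
    else
        echo "drop '$rtt'"
    fi
done
```

An empty result (for example when ping fails and prints no summary line), a literal 0, or non-numeric output is dropped rather than being treated as a very low RTT.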
tievolu
2357
I've been beavering away this morning and found over 500 more ICMP type 13 reflectors, distributed all over the world.
Who shall I send them to?
If you have a github account, maybe share the list in its own repository?
tievolu
2359
Done. Here's the list, in the same format that sqm-autorate uses:
Lynx
2360
Dear @tievolu,
The efficacy of your reflector selection script and also that you were the first to be using timestamps for one way delays makes me super curious about your perl version of the autorate challenge.
Any chance at all you might consider adding that too to your GitHub? I'd love to try it out. It would be great to be able to output some kind of monitoring lines to see how it fares.
tievolu
2361
Sharing it feels like exposing myself. 
I'm sure the code is horribly inefficient and I have no idea whether it will work for anyone else, but I've created a private repo and invited you so you can have a look. You'll need to edit the config near the top of the script to set your upload/download interfaces if you want to try it out, and you'll need to handle log rotation. I also run it as a service with an init script, but I haven't included that.
By the way, I've modified it a lot recently to go with the approach of resting at a "standard" bandwidth, and increasing/decreasing as necessary in response to load and latency.
EDIT: One more thing - you need to run it like this: ./qos_monitor.pl daemon
Nomid
2362
Testing version 5.0 out. I like what you have done with the shell output and logging, pretty neat.
At first it was a slight bit confusing (and I read the README), but when I see it in action it makes perfect sense. Cool job guys.
dtaht
2363
Occasionally I like to get folk to think sideways and out of the box, whenever y'all get some time... https://www.youtube.com/watch?v=DAN-7sWFxLw&list=PLrninrcyMo3L-hsJv23hFyDGRaeBY1EJO&index=2
Lynx
2364
Tickle the cross traffic!
Do I understand correctly that this would require VPS for control at both ends?
Could you summarize the main idea? I am 26 minutes in and so far still in the exposition stage... (of a problem I believe I am not fully naive about, rightly so?)
Lynx
2366
Probably my enthusiasm got the better of me, like with Sprout.
There is a paper here:
Seems like this is end to end, so not a viable solution for the general use case?