CAKE w/ Adaptive Bandwidth [October 2021 to September 2022]

Pulling it and about to test, but curious if you're running a Lua newer than 5.1? This fails on Lua 5.1: local cur_process_id = posix.getpid()["pid"]. Apparently the ["pid"] was added in a later version of Lua.

I guess there's probably a way to check current Lua version for the runtime and set that variable accordingly. I can look into that shortly.
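A rough sketch of that runtime check, in case it's useful. `_VERSION` is a standard Lua global; the helper names `lua_version` and `lua_at_least` are made up for illustration and are not in the script.

```lua
-- _VERSION is a standard Lua global, e.g. "Lua 5.1" or "Lua 5.4".
-- Both helpers below are hypothetical names, not part of sqm-autorate.
local function lua_version(version_string)
    local major, minor = (version_string or _VERSION):match("^Lua (%d+)%.(%d+)")
    return tonumber(major), tonumber(minor)
end

-- True when the running (or given) version is at least want_major.want_minor.
local function lua_at_least(want_major, want_minor, version_string)
    local major, minor = lua_version(version_string)
    if major == nil then return false end
    return major > want_major or (major == want_major and minor >= want_minor)
end

print(_VERSION, lua_at_least(5, 3))
```

This only tells you about the interpreter itself. It says nothing about the versions of the libraries that are loaded into it.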

Now to test...

dlakelan@tintin:~/Consulting/sqm-autorate$ lua -v
Lua 5.1.5  Copyright (C) 1994-2012 Lua.org, PUC-Rio

For me, that line failed, saying you're trying to subtract a number from a table.

Weird! This is on my OpenWrt box:

root@OpenWrt:~# lua -v
Lua 5.1.5  Copyright (C) 1994-2012 Lua.org, PUC-Rio (double int32)

I get this error when trying to run with the ["pid"] tacked on that var declaration:

lua: test2.lua:38: attempt to index a number value

I suspect it's a difference in the version of luaposix between the two.

On my desktop:

dlakelan@tintin:~/Consulting/sqm-autorate$ apt policy lua-posix
lua-posix:
  Installed: 33.4.0-3+b1
  Candidate: 33.4.0-3+b1
  Version table:
 *** 33.4.0-3+b1 500
        500 http://httpredir.debian.org/debian stable/main amd64 Packages
        500 http://httpredir.debian.org/debian testing/main amd64 Packages
        500 http://httpredir.debian.org/debian unstable/main amd64 Packages
        100 /var/lib/dpkg/status

whatever that means.

Ah, that's got to be it! OpenWrt has luaposix - 35.1-1
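Since both boxes report Lua 5.1.5, one way to cope with either luaposix would be to look at what getpid() actually handed back rather than at any version number. A minimal sketch, assuming (as the two error messages above suggest) that one version returns a table with a pid field and the other returns a plain number. `normalize_pid` is a hypothetical helper name.

```lua
-- Accepts whatever posix.getpid() returned and yields a plain number.
-- normalize_pid is a hypothetical helper, not part of luaposix or the script.
local function normalize_pid(getpid_result)
    if type(getpid_result) == "table" then
        -- Table-returning behaviour, as seen with the 33.4.0 Debian package
        return getpid_result.pid
    end
    -- Number-returning behaviour, as seen with 35.1-1 on OpenWrt
    return getpid_result
end

-- Usage, guarded so the sketch also runs where luaposix is not installed.
local ok, posix = pcall(require, "posix")
if ok and type(posix) == "table" and posix.getpid then
    local cur_process_id = normalize_pid(posix.getpid())
    print("pid:", cur_process_id)
end
```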

Ok, I grabbed the 35.1-1 version from luarocks instead of the Debian package...

root@OpenWrt:~# lua test2.lua 
rx_bytes_path: /sys/class/net/ifb4eth0/statistics/tx_bytes
tx_bytes_path: /sys/class/net/eth0/statistics/tx_bytes
timedata =      nil
timedata =      test2.lua:143: attempt to perform arithmetic on global 'rtt' (a nil value)
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
^Clua: test2.lua:231: interrupted!
stack traceback:
        [C]: in function 'nanosleep'
        test2.lua:231: in function 'conductor'
        test2.lua:235: in main chunk
        [C]: ?

Ok, I think I sorted that and pushed.

Close!

root@OpenWrt:~# lua test2.lua 
rx_bytes_path: /sys/class/net/ifb4eth0/statistics/tx_bytes
tx_bytes_path: /sys/class/net/eth0/statistics/tx_bytes
timedata =      nil
timedata =      test2.lua:149: attempt to concatenate global 'downlink_time' (a nil value)
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
timedata =      cannot resume dead coroutine
^Clua: test2.lua:232: interrupted!
stack traceback:
        [C]: in function 'nanosleep'
        test2.lua:232: in function 'conductor'
        test2.lua:236: in main chunk
        [C]: ?

Oh right, that's in the debug print... hmmm, I'll sort that. Ok, I now have it referring to the stats table...

root@OpenWrt:~# lua test2.lua 
rx_bytes_path: /sys/class/net/ifb4eth0/statistics/tx_bytes
tx_bytes_path: /sys/class/net/eth0/statistics/tx_bytes
timedata =      nil
Reflector IP: 9.9.9.9  |  Current time: 15109009  |  TX at: 15108508  |  RTT: 501  |  UL time: 9  |  DL time: 492  |  Source IP: 9.9.9.9
timedata =      table: 0x7f96f6593de0
lua: test2.lua:219: attempt to index field '?' (a nil value)
stack traceback:
        test2.lua:219: in function 'conductor'
        test2.lua:236: in main chunk
        [C]: ?

Is it because .upewma and .downewma don't exist yet?

hmmm......

Ok, I think you can help here since I actually based my data table on a kind of copy-pasta-edit from your code.

This is how we make the data table...

		local stats = {
		   reflector = sa.addr,
		   originalTS = tsResp[6],
		   receiveTS = tsResp[7],
		   transmitTS = tsResp[8],
		   rtt = time_after_midnight_ms - tsResp[6],
		   uplink_time = tsResp[7] - tsResp[6],
		   downlink_time = time_after_midnight_ms - tsResp[8]}

I'm trying to do:

            OWDbaseline[timedata.reflector].upewma = OWDbaseline[timedata.reflector].upewma * slowfactor + (1-slowfactor) * timedata.uplink_time

and I think it's saying that timedata.reflector is a nil value, which suggests that sa.addr is nil. In fact, if I search for that text, it's never defined anywhere :slight_smile: so I probably deleted it or something.

I don't understand something... looking at the code:

                    print('Reflector IP: '..reflector..'  |  Current time: '..time_after_midnight_ms..

clearly prints out Reflector IP: 9.9.9.9 but where is the variable "reflector" defined?

Yeah, that's some shrapnel from early iterations where receive_ts_ping() took in a reflector parameter. Once we changed to having receive_ts_ping() run as a coroutine, I dropped that parameter and then set reflector = sa.addr. At that point in the code, it was guaranteed to be one of our reflector IPs by nature of:

local pos = get_table_position(reflector_array_v4, sa.addr)
...
if (pos > 0 and src_pkt_id == pkt_id) then
...

My take on the error was that you're attempting to set the .upewma to OWDbaseline[timedata.reflector].upewma * ... but OWDbaseline[timedata.reflector].upewma is nil on the first time through the loop.

the weird part is that it prints out something meaningful...

Yeah, the first time through, that's it! I'll set them up. After that I've gotta go, just invited to an emergency Left4Dead2 session :slight_smile: but see if you can work out some basic stuff. It looks like it's working on your machine minus bugs. I need a testbed so we don't have to test remotely like this.

Ok, I pushed a version that should create the ewma values if they're not already there.
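For anyone following along, "create the ewma values if they're not already there" could look roughly like the sketch below. It is not the pushed code. `update_ewma` and the 0.5 factor are made up for illustration, and seeding the EWMA with the first sample is just one reasonable choice. Note that "attempt to index field '?' (a nil value)" usually means the table entry for the reflector is itself missing, so the sketch creates the whole entry rather than only its fields.

```lua
-- Hypothetical helper: seeds the entry on the first sample, then applies
-- the same EWMA form used in the thread: old * factor + (1 - factor) * new.
local function update_ewma(owd, reflector, uplink_time, downlink_time, factor)
    local entry = owd[reflector]
    if entry == nil then
        -- First sample for this reflector: there is no history to blend with
        entry = { upewma = uplink_time, downewma = downlink_time }
        owd[reflector] = entry
        return entry
    end
    entry.upewma = entry.upewma * factor + (1 - factor) * uplink_time
    entry.downewma = entry.downewma * factor + (1 - factor) * downlink_time
    return entry
end

local OWDbaseline = {}
local timedata = { reflector = "9.9.9.9", uplink_time = 8, downlink_time = 12 }
-- 0.5 is an arbitrary factor for the demo, not the script's slowfactor
update_ewma(OWDbaseline, timedata.reflector, timedata.uplink_time, timedata.downlink_time, 0.5)
update_ewma(OWDbaseline, timedata.reflector, 10, 14, 0.5)
print(OWDbaseline["9.9.9.9"].upewma, OWDbaseline["9.9.9.9"].downewma)
```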

Similar (if not the same) error... but I'll look into it so you can catch your L4D2 session! Thanks a bunch for the help this evening!

root@OpenWrt:~# lua test2.lua 
rx_bytes_path: /sys/class/net/ifb4eth0/statistics/tx_bytes
tx_bytes_path: /sys/class/net/eth0/statistics/tx_bytes
timedata =      nil
Reflector IP: 9.9.9.9  |  Current time: 16404750  |  TX at: 16404249  |  RTT: 501  |  UL time: 7  |  DL time: 494  |  Source IP: 9.9.9.9
timedata =      table: 0x7f57d8f1ee90
lua: test2.lua:219: attempt to index field '?' (a nil value)
stack traceback:
        test2.lua:219: in function 'conductor'
        test2.lua:249: in main chunk
        [C]: ?

Zombie apocalypse averted. Well, no, actually we made it through a couple of levels and then died because we need more players. Ok, I pushed some additional fixes.

Oh boy! This should be interesting because I didn't realize you were still working on it and I made a bunch of changes too. I will take my changes and commit them on my branch instead until we can compare.

Just pushed my updates: https://github.com/Fail-Safe/sqm-autorate/commit/84272c8b8ad056a9185bb22e91257d86bdbb0f1f

It's coming along, for sure. However, there's an issue I can't seem to resolve and it's around the downlink time. For example:

Reflector 149.112.112.10 up baseline = 7 down baseline = 494
Reflector 9.9.9.9 up baseline = 7.8 down baseline = 493.1
Reflector 9.9.9.10 up baseline = 9.9 down baseline = 491.1
Reflector 149.112.112.112 up baseline = 19 down baseline = 482
Reflector 149.112.112.11 up baseline = 10 down baseline = 490
Reflector 149.112.112.10 up baseline = 7 down baseline = 494
Reflector 9.9.9.9 up baseline = 6.4 down baseline = 493.8
Reflector 9.9.9.10 up baseline = 9.2 down baseline = 491.8
Reflector 149.112.112.112 up baseline = 19 down baseline = 482
Reflector 149.112.112.11 up baseline = 10 down baseline = 490
Reflector IP: 149.112.112.10  |  Current time: 21835437  |  TX at: 21834936  |  RTT: 501  |  UL time: 6  |  DL time: 495

Basically the tick time is falsely skewing the downlink time and RTT. I have tried moving the nanosleep to multiple other spots in the conductor loop, but can't seem to affect the downlink TS.

Let me think on it tomorrow. Rather than one big sleep I think we want a bunch of little sleeps looking for receives and then when enough time has passed we do a send... So it's a little more complicated.
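The "many little sleeps" idea might be sketched like this. Everything here is an assumption for illustration: the `conductor` signature, the option names, and the 500 ms / 10 ms values are not taken from the script. The clock, sleep, send and receive functions are passed in so the loop can be tried with a fake clock. In the real script they would presumably wrap nanosleep and a non-blocking socket read.

```lua
-- Hypothetical polling loop: short sleeps, replies stamped as soon as they
-- are seen, and a send only when a full tick has elapsed.
local function conductor(opts)
    local next_send = opts.now()
    local deadline = next_send + opts.duration_ms
    while opts.now() < deadline do
        -- Drain every reply that is already waiting
        local reply = opts.try_receive()
        while reply do
            opts.on_reply(reply, opts.now())
            reply = opts.try_receive()
        end
        -- Send only when the tick interval has passed
        if opts.now() >= next_send then
            opts.send()
            next_send = next_send + opts.tick_ms
        end
        opts.sleep(opts.poll_ms)
    end
end

-- Demo with a fake millisecond clock instead of real sockets and sleeps
local clock, sends, polls = 0, 0, 0
conductor({
    now = function() return clock end,
    sleep = function(ms) clock = clock + ms end,
    try_receive = function() polls = polls + 1; return nil end,
    on_reply = function() end,
    send = function() sends = sends + 1 end,
    tick_ms = 500, poll_ms = 10, duration_ms = 1000,
})
print(sends, polls)
```

With this shape, the receive timestamp can be off by at most one poll interval instead of a whole tick.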

Sounds like a plan. FWIW, with the sleep removed, the data looks really good. I know that's not the ideal situation, but wanted to confirm the sleep was what was skewing the DL time:

Reflector 149.112.112.10 up baseline = 9.378772789432 down baseline = 28.992749458312
Reflector 9.9.9.9 up baseline = 9.7714186061496 down baseline = 23.871238756222
Reflector 9.9.9.10 up baseline = 8.7436160679502 down baseline = 26.26278690689
Reflector 149.112.112.112 up baseline = 7.8720013713354 down baseline = 32.387970967232
Reflector 149.112.112.11 up baseline = 9.1828892084889 down baseline = 29.603219387768
Reflector 149.112.112.10 up baseline = 10.851292718198 down baseline = 13.134334236123
Reflector 9.9.9.9 up baseline = 10.194248216772 down baseline = 11.994944021763
Reflector 9.9.9.10 up baseline = 9.9702585337904 down baseline = 13.119697435658
Reflector 149.112.112.112 up baseline = 4.450000203979 down baseline = 18.872833330872
Reflector 149.112.112.11 up baseline = 10.850000450494 down baseline = 13.14974461526

Reflector IP: 9.9.9.9  |  Current time: 23261503  |  TX at: 23261483  |  RTT: 20  |  UL time: 8  |  DL time: 12

Reflector 149.112.112.10 up baseline = 9.378772789432 down baseline = 28.992749458312
Reflector 9.9.9.9 up baseline = 9.5942767455347 down baseline = 22.684114880599
Reflector 9.9.9.10 up baseline = 8.7436160679502 down baseline = 26.26278690689
Reflector 149.112.112.112 up baseline = 7.8720013713354 down baseline = 32.387970967232
Reflector 149.112.112.11 up baseline = 9.1828892084889 down baseline = 29.603219387768
Reflector 149.112.112.10 up baseline = 10.851292718198 down baseline = 13.134334236123
Reflector 9.9.9.9 up baseline = 8.4388496433544 down baseline = 11.998988804353
Reflector 9.9.9.10 up baseline = 9.9702585337904 down baseline = 13.119697435658
Reflector 149.112.112.112 up baseline = 4.450000203979 down baseline = 18.872833330872
Reflector 149.112.112.11 up baseline = 10.850000450494 down baseline = 13.14974461526

Reflector IP: 149.112.112.112  |  Current time: 23261504  |  TX at: 23261479  |  RTT: 25  |  UL time: 11  |  DL time: 14

Reflector 149.112.112.10 up baseline = 9.378772789432 down baseline = 28.992749458312
Reflector 9.9.9.9 up baseline = 9.5942767455347 down baseline = 22.684114880599
Reflector 9.9.9.10 up baseline = 8.7436160679502 down baseline = 26.26278690689
Reflector 149.112.112.112 up baseline = 8.1848012342019 down baseline = 30.549173870509
Reflector 149.112.112.11 up baseline = 9.1828892084889 down baseline = 29.603219387768
Reflector 149.112.112.10 up baseline = 10.851292718198 down baseline = 13.134334236123
Reflector 9.9.9.9 up baseline = 8.4388496433544 down baseline = 11.998988804353
Reflector 9.9.9.10 up baseline = 9.9702585337904 down baseline = 13.119697435658
Reflector 149.112.112.112 up baseline = 9.6900000407958 down baseline = 14.974566666174
Reflector 149.112.112.11 up baseline = 10.850000450494 down baseline = 13.14974461526

Reflector IP: 149.112.112.10  |  Current time: 23261504  |  TX at: 23261479  |  RTT: 25  |  UL time: 11  |  DL time: 14

Reflector 149.112.112.10 up baseline = 9.5408955104888 down baseline = 27.493474512481
Reflector 9.9.9.9 up baseline = 9.5942767455347 down baseline = 22.684114880599
Reflector 9.9.9.10 up baseline = 8.7436160679502 down baseline = 26.26278690689
Reflector 149.112.112.112 up baseline = 8.1848012342019 down baseline = 30.549173870509
Reflector 149.112.112.11 up baseline = 9.1828892084889 down baseline = 29.603219387768
Reflector 149.112.112.10 up baseline = 10.97025854364 down baseline = 13.826866847225
Reflector 9.9.9.9 up baseline = 8.4388496433544 down baseline = 11.998988804353
Reflector 9.9.9.10 up baseline = 9.9702585337904 down baseline = 13.119697435658
Reflector 149.112.112.112 up baseline = 9.6900000407958 down baseline = 14.974566666174
Reflector 149.112.112.11 up baseline = 10.850000450494 down baseline = 13.14974461526

Reflector IP: 149.112.112.112  |  Current time: 23261504  |  TX at: 23261480  |  RTT: 24  |  UL time: 10  |  DL time: 14

Have a great night!

Fantastic work - very exciting to see this coming along.

May I ask - I can follow most of the code but not this:

            OWDbaseline[timedata.reflector].upewma = OWDbaseline[timedata.reflector].upewma * slowfactor + (1-slowfactor) * timedata.uplink_time
            OWDrecent[timedata.reflector].upewma = OWDrecent[timedata.reflector].upewma * fastfactor + (1-fastfactor) * timedata.uplink_time
            OWDbaseline[timedata.reflector].downewma = OWDbaseline[timedata.reflector].downewma * slowfactor + (1-slowfactor) * timedata.downlink_time
            OWDrecent[timedata.reflector].downewma = OWDrecent[timedata.reflector].downewma * fastfactor + (1-fastfactor) * timedata.downlink_time

            for ref,val in pairs(OWDbaseline) do
                upewma = val.upewma and val.upewma or "?" -- Hacky Lua version of a ternary
                downewma = val.downewma and val.downewma or "?" -- Hacky Lua version of a ternary
                print("Reflector " .. ref .. " up baseline = " .. upewma  .. " down baseline = " .. downewma)
            end
            for ref,val in pairs(OWDrecent) do
                upewma = val.upewma and val.upewma or "?" -- Hacky Lua version of a ternary
                downewma = val.downewma and val.downewma or "?" -- Hacky Lua version of a ternary
                print("Reflector " .. ref .. " up baseline = " .. upewma  .. " down baseline = " .. downewma)
            end
        end

Any chance you could summarize how this works?

Corresponding code in the OWD shell is:

		# Check for any bad OWD values for reflector and if found just
		# maintain previous baseline for reflector and continue to next reflector
		if [ "$uplink_OWD" = "999999999" ] || [ "$downlink_OWD" = "999999999" ]; then
			echo $reflector $prev_uplink_baseline $prev_downlink_baseline >> $BASELINES_cur
			if [ $enable_verbose_output -eq 1 ]; then
				echo $reflector "No Response. Skipping this reflector."
			fi
			continue
		fi

		delta_uplink_OWD=$( call_awk "${uplink_OWD} - ${prev_uplink_baseline}" )
		delta_downlink_OWD=$( call_awk "${downlink_OWD} - ${prev_downlink_baseline}" )

		if [ $enable_verbose_output -eq 1 ]; then
			printf "%25s;%14.2f;%14.2f;%14.2f;%14.2f;%14.2f;%14.2f;\n" $reflector $prev_downlink_baseline $downlink_OWD $delta_downlink_OWD $prev_uplink_baseline $uplink_OWD $delta_uplink_OWD
		fi

		if awk "BEGIN {exit !($delta_uplink_OWD >= 0)}"; then
			cur_uplink_baseline=$( call_awk "( 1 - ${alpha_OWD_increase} ) * ${prev_uplink_baseline} + ${alpha_OWD_increase} * ${uplink_OWD} " )
		else
			cur_uplink_baseline=$( call_awk "( 1 - ${alpha_OWD_decrease} ) * ${prev_uplink_baseline} + ${alpha_OWD_decrease} * ${uplink_OWD} " )
		fi

		if awk "BEGIN {exit !($delta_downlink_OWD >= 0)}"; then
			cur_downlink_baseline=$( call_awk "( 1 - ${alpha_OWD_increase} ) * ${prev_downlink_baseline} + ${alpha_OWD_increase} * ${downlink_OWD} " )
		else
			cur_downlink_baseline=$( call_awk "( 1 - ${alpha_OWD_decrease} ) * ${prev_downlink_baseline} + ${alpha_OWD_decrease} * ${downlink_OWD} " )
		fi

		echo $reflector $cur_uplink_baseline   $cur_downlink_baseline >> $BASELINES_cur

		if awk "BEGIN {exit !($delta_uplink_OWD < $min_uplink_delta)}"; then
			min_uplink_delta=$delta_uplink_OWD
		fi

		if awk "BEGIN {exit !($delta_downlink_OWD < $min_downlink_delta)}"; then
			min_downlink_delta=$delta_downlink_OWD
		fi

So in the shell version, the factor (alpha_OWD_increase or alpha_OWD_decrease) is chosen depending on the sign of the delta.
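To make the comparison easier, here is the shell baseline update restated as a small Lua sketch. The function name `update_baseline` and the alpha values in the demo are invented for illustration. Only the structure (pick alpha by the sign of the delta, then blend) comes from the shell code above.

```lua
-- Hypothetical Lua restatement of the shell baseline update:
-- a non-negative delta uses alpha_increase, a negative one uses alpha_decrease.
local function update_baseline(prev_baseline, owd, alpha_increase, alpha_decrease)
    local delta = owd - prev_baseline
    local alpha = (delta >= 0) and alpha_increase or alpha_decrease
    return (1 - alpha) * prev_baseline + alpha * owd
end

-- Demo with made-up alphas (0.5 for increases, 0.25 for decreases)
print(update_baseline(10, 20, 0.5, 0.25)) -- OWD above baseline: alpha_increase applies
print(update_baseline(20, 10, 0.5, 0.25)) -- OWD below baseline: alpha_decrease applies
```

The Lua code quoted above differs in that it keeps two separate averages per reflector, one updated with slowfactor and one with fastfactor, instead of switching alpha on a single baseline.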

In your code I am struggling to understand the role of OWDrecent and its associated ewma. That may be because I'm totally unfamiliar with Lua.