CAKE w/ Adaptive Bandwidth [October 2021 to September 2022]

Lynx · December 7, 2021, 10:42am

Right, but I thought from @Lochnair it seemed hping3 allowed smaller durations for the same server? That is surely just about the ping utility itself, no? I am wondering if nping is a little on the sluggish side.

Also curious about what you make of the timestamp options above.

Lynx · December 7, 2021, 10:51am

root@OpenWrt:~# nping --icmp-type 13 9.9.9.9 -c 1

Starting Nping 0.7.91 ( https://nmap.org/nping ) at 2021-12-07 10:48 UTC
SENT (0.0463s) ICMP [10.9.79.124 > 9.9.9.9 Timestamp request (type=13/code=0) id=65076 seq=1 orig=0 recv=0 trans=0] IP [ttl=64 id=65281 iplen=40 ]
RCVD (0.1023s) ICMP [9.9.9.9 > 10.9.79.124 Timestamp reply (type=14/code=0) id=65076 seq=1 orig=0 recv=38939018 trans=38939018] IP [ttl=55 id=28512 iplen=40 ]

Max rtt: 55.925ms | Min rtt: 55.925ms | Avg rtt: 55.925ms
Raw packets sent: 1 (40B) | Rcvd: 1 (46B) | Lost: 0 (0.00%)
Nping done: 1 IP address pinged in 1.12 seconds
root@OpenWrt:~# t_end=$(date +%s.%N)
root@OpenWrt:~#
root@OpenWrt:~#
root@OpenWrt:~# echo $t_start
1638874138.931538566
root@OpenWrt:~# echo $t_end
1638874140.065827210

moeller0 · December 7, 2021, 11:11am

IMHO current nping output is not sufficient for our task, we really need a local timestamp...

Yes, great find:

root@turris:~# time nping --icmp-type 13 9.9.9.9 -c 1 --icmp-orig-time now

Starting Nping 0.7.70 ( https://nmap.org/nping ) at 2021-12-07 12:01 CET
SENT (0.1333s) ICMP [95.112.150.73 > 9.9.9.9 Timestamp request (type=13/code=0) id=12706 seq=1 orig=39712000 recv=0 trans=0] IP [ttl=64 id=54886 iplen=40 ]
RCVD (0.3113s) ICMP [9.9.9.9 > 95.112.150.73 Timestamp reply (type=14/code=0) id=12706 seq=1 orig=39712000 recv=39716490 trans=39716490] IP [ttl=58 id=6329 iplen=40 ]
 
Max rtt: 177.826ms | Min rtt: 177.826ms | Avg rtt: 177.826ms
Raw packets sent: 1 (40B) | Rcvd: 1 (40B) | Lost: 0 (0.00%)
Nping done: 1 IP address pinged in 1.21 seconds
real    0m 1.21s
user    0m 0.01s
sys     0m 0.00s
root@turris:~#

I would say partly: it does allow setting an origin time (in whatever time base), and that is included in the output, but taking 1.2 seconds seems a bit long

orig=39712000 recv=39716490 trans=39716490], rtt: 177.826

39716490 - 39712000 = 4490
(39712000 + 177.826) - 39716490 = -4312.174

The offset seems to be better, but not perfect (but we did not expect perfect anyway), but I am not sure that orig + rtt is correct:

root@turris:~# time nping --icmp-type 13 9.9.9.9 -c 1 -v2

Starting Nping 0.7.70 ( https://nmap.org/nping ) at 2021-12-07 12:21 CET
SENT (0.1319s) ICMP [95.112.150.73 > 9.9.9.9 Timestamp request (type=13/code=0) id=19390 seq=1 orig=0 recv=0 trans=0] IP [ver=4 ihl=5 tos=0x00 iplen=40 id=46310 foff=0 ttl=64 proto=1 csum=0xbe23]
RCVD (0.3002s) ICMP [9.9.9.9 > 95.112.150.73 Timestamp reply (type=14/code=0) id=19390 seq=1 orig=0 recv=40922954 trans=40922954] IP [ver=4 ihl=5 tos=0x00 iplen=40 id=43900 foff=0 ttl=58 proto=1 csum=0xcd8d]
 
Max rtt: 168.227ms | Min rtt: 168.227ms | Avg rtt: 168.227ms
Raw packets sent: 1 (40B) | Rcvd: 1 (40B) | Lost: 0 (0.00%)
Tx time: 0.00126s | Tx bytes/s: 31796.50 | Tx pkts/s: 794.91
Rx time: 1.00053s | Rx bytes/s: 39.98 | Rx pkts/s: 1.00
Nping done: 1 IP address pinged in 1.21 seconds
real    0m 1.21s
user    0m 0.00s
sys     0m 0.01s

I grudgingly agree that nping might not be the "droids we are looking for" here... for multiple reasons

moeller0 · December 7, 2021, 11:16am

Try t_start=$( printf "%10.0f" "$( date -d "1970-01-01 UTC $(date +%T.%N)" +%s.%N )"e3 ) instead as that should give us time in milliseconds after midnight UTC, which is the reference time type 13/14 should use (but not all reflectors are using that reference clock). That way no additional multiplication required...
38939018 - 1638874138931 = -1.63883519991e+12 (these offsets are a bit unwieldy)
1638874140065 - 38939018 = 1.63883520105e+12
1638874140065 - 1638874138931 = 1134 ms

That is really taking too long... at least for tick rates > 0.5 Hz...

Lynx · December 7, 2021, 11:28am

More hping3 testing needed now?

BlueKingMuch · December 7, 2021, 11:54am

I'd also suggest using two different tick rates: a fast one to check local state such as the current bandwidth usage (with a RAM disk this can be set very low, even to a few ms), and a second one, driven by the first, to trigger pings for latency measurement. In low-load scenarios a constant 1s ping tick rate isn't really necessary; it is more interesting to catch the events when high load is going out or coming in, and that can be steered by using the load as a tick rate factor. Dumbed down, for example:

loadfactor=0.2          # for example; 1 is fully saturated, 0 is idle

base_ping_tickrate=1    # seconds
min_tickrate=0.1        # seconds

actual_ping_tickrate() {
	# interval = base / (loadfactor * 5), but never below min_tickrate;
	# an idle link (loadfactor 0) falls back to the base interval
	awk -v b="$base_ping_tickrate" -v l="$loadfactor" -v m="$min_tickrate" 'BEGIN {
		t = (l > 0) ? b / (l * 5) : b
		if (t < m) t = m
		print t
	}'
}

In this example, the tick rate would be 1/(0.2*5) = 1/1 = 1s with a 20% load on the connection.

moeller0 · December 7, 2021, 12:01pm

I respectfully would like to argue that such optimizations should wait until we are sure that the algorithm is robust and reliable. The point is, if the true bottleneck rate changes significantly our load estimates will be wrong and without delay measurements we will never notice (or rather notice later than possible).

BlueKingMuch · December 7, 2021, 12:05pm

True, I just wanted to add it because earlier there were some concerns about the load that this autorate tool's requests place on the public IPs (once it is rolled out and/or used on a larger scale). This could help reduce unnecessarily high load, so I wanted to briefly document the idea in this thread before I forget it.

anon50098793 · December 7, 2021, 12:19pm

exhausted interrogating tc for all it has(up/down load etc)? i.e.

excuse-the-trademark-inelegant-code

#!/bin/sh

# Print the backlog (bytes) that tc reports for one direction.
# Assumes the first matching backlog row is upload and the last is download.
gettc() {
	local RES=0 PICK
	case "${1}" in
		up)   PICK="head -n 1" ;;
		down) PICK="tail -n 1" ;;
		*)    return 1 ;;
	esac

	# Drop the "backlog" label and the "b" unit suffix, expand "K" to
	# thousands, then sum the values left on the selected row
	for val in $(tc -s qdisc | grep backlog | grep -E '(K$|b$)' | tr -s '\t' ' ' | sed 's| backlog||g; s|b||g; s|K|000|g' | $PICK); do
		RES=$((RES + val))
	done
	echo "$RES"
}

UP=$(gettc up)
DOWN=$(gettc down)

while :; do
	# Remember the previous readings, then take new ones
	UPl=$UP
	DOWNl=$DOWN
	UP=$(gettc up)
	DOWN=$(gettc down)

	# nc = no change, low = backlog fell, high = backlog grew
	if [ "$UP" -eq "$UPl" ]; then
		printf '  UPnc:%s ' "$UP"
	elif [ "$UP" -lt "$UPl" ]; then
		printf ' UPlow:%s ' "$UP"
	else
		printf 'UPhigh:%s ' "$UP"
	fi

	if [ "$DOWN" -eq "$DOWNl" ]; then
		printf 'DOWNnc:%s' "$DOWN"
	elif [ "$DOWN" -lt "$DOWNl" ]; then
		printf 'DOWNlow:%s' "$DOWN"
	else
		printf 'DOWNhigh:%s' "$DOWN"
	fi
	printf '\n'
	sleep 3
done

(i.e. if 0 multiply rate increment/decrement x 2(or X num0reads) within rate boundaries for said direction?)

moeller0 · December 7, 2021, 12:50pm

Not sure what you are aiming at, but the problem I am trying to highlight here is that we do not have reliable load numbers. Let me explain: what we want to know is how the current short-term traffic volume compares to the maximum volume that could cross the true bottleneck in the same interval. What we actually do, however, is compare the former with the maximum volume that we could push through our shaper in the same interval. If the shaper undershoots the true bottleneck rate that is not a problem: we overestimate the load and will tend to open up the shaper until bufferbloat tells us to scale back a bit. If, however, the shaper overshoots the true rate we are in trouble, unless we also check the bufferbloat. But when the shaper overshoots, our load numbers become untrustworthy to a degree (they probably underestimate the true load). And that is why we can use load numbers as an additional heuristic but should primarily rely on the delay changes; at least that is my take on that.
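
To make the priority order concrete, here is a minimal sketch of "delay first, load second". The function name, the thresholds and the 90%/105% step sizes are all made up for illustration; nothing here is an agreed design from this thread.

```shell
#!/bin/sh
# Hypothetical thresholds, chosen only for illustration.
delay_thr_ms=15     # delay increase (ms) that is treated as bufferbloat
load_thr_pct=50     # load in percent of the shaper rate that counts as busy

# next_rate CUR_RATE_KBPS LOAD_PCT DELAY_DELTA_MS
# Prints the shaper rate to use for the next interval (integers only).
next_rate() {
	rate=$1; load=$2; delta=$3
	if [ "$delta" -gt "$delay_thr_ms" ]; then
		# Delay says we are overshooting: back off, whatever the load claims
		echo $(( rate * 90 / 100 ))
	elif [ "$load" -gt "$load_thr_pct" ]; then
		# No bufferbloat and the link is busy: probe upwards
		echo $(( rate * 105 / 100 ))
	else
		# Idle and no delay increase: leave the shaper alone
		echo "$rate"
	fi
}
```

The delay check comes first on purpose: the load figure is relative to the shaper rate, so it can only be trusted once the delay measurement has ruled out an overshoot.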

moeller0 · December 7, 2021, 1:07pm

Mmmh

user@work-horse:~$ start_ms=$( printf "%14.0f" "$(( $(date '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; \
> sudo hping3 9.9.9.9 --icmp --icmp-ts -c 1 ; \
> end_ms=$( printf "%14.0f" "$(( $(date '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; \
> echo "$start_ms $end_ms"
HPING 9.9.9.9 (eth0 9.9.9.9): icmp mode set, 28 headers + 0 data bytes
len=46 ip=9.9.9.9 ttl=57 id=1798 icmp_seq=0 rtt=27.9 ms
ICMP timestamp: Originate=46783647 Receive=46783654 Transmit=46783654
ICMP timestamp RTT tsrtt=28


--- 9.9.9.9 hping statistic ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 27.9/27.9/27.9 ms
      50383571       50383715
user@work-horse:~$

The originate timestamp looks quite different from the manually created one, but that might be related to my clock not being in UTC...

start_ms=$( printf "%14.0f" "$(( $(date -u '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; \
sudo hping3 9.9.9.9 --icmp --icmp-ts -c 1 ; \
end_ms=$( printf "%14.0f" "$(( $(date -u '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; \
echo "$start_ms $end_ms"

Yepp, that was it: timestamps in UTC since midnight:
printf "%14.0f" "$(( $(date -u '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6
pretty ugly
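
A possibly less ugly variant, as a sketch: since Unix time ignores leap seconds, milliseconds since midnight UTC are simply the epoch time in milliseconds modulo 86400000. This assumes a `date` that supports `%N` (GNU date does; on busybox it depends on the build) and a shell with 64-bit arithmetic. The function name is hypothetical.

```shell
#!/bin/sh
# ms_since_midnight_utc [EPOCH_NS]
# Prints milliseconds since midnight UTC, the time base ICMP type 13/14 uses.
# Without an argument the current time is read via date (needs %N support).
ms_since_midnight_utc() {
	ns=${1:-$(date -u +%s%N)}
	# ns -> ms, then keep only the part since the last UTC midnight
	echo $(( (ns / 1000000) % 86400000 ))
}
```

Fed with the earlier t_start of 1638874138.931538566 (written as nanoseconds, 1638874138931538566) this yields 38938931, which sits just below the recv=38939018 the reflector reported in that nping run.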

Lynx · December 7, 2021, 1:11pm

Were you able to demonstrate determination of the upload time as proportion of RTT?

If so that's a mini victory, at least in my book.

Excited by the prospect of being able to show proof of concept by saturating upload and showing the upload time increase (whilst download time remains the same), then later saturating download and showing the download time increase (whilst upload time remains the same).

That would be super rewarding / cool!

moeller0 · December 7, 2021, 1:25pm

Not sure what you mean (on an ubuntu host where I can use hping3):

user@work-horse:~$ start_ms=$( printf "%14.0f" "$(( $(date -u '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; sudo hping3 9.9.9.9 --icmp --icmp-ts -c 1 ; end_ms=$( printf "%14.0f" "$(( $(date -u '+%-H *3600000000000 + %-M *60000000000 + %-S *1000000000 + %-N') ))"e-6 ) ; echo "$start_ms $end_ms"
HPING 9.9.9.9 (eth0 9.9.9.9): icmp mode set, 28 headers + 0 data bytes
len=46 ip=9.9.9.9 ttl=57 id=26580 icmp_seq=0 rtt=19.8 ms
ICMP timestamp: Originate=47890115 Receive=47890122 Transmit=47890122
ICMP timestamp RTT tsrtt=20


--- 9.9.9.9 hping statistic ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 19.8/19.8/19.8 ms
      47890065       47890182
user@work-horse:~$

end_ms - start_ms
47890182 - 47890065 = 117ms total run-time

Originate - start_ms
47890115 - 47890065 = 50ms delay between calling date and hping3 emitting its packet

Receive - Originate
47890122 - 47890115 = 7ms there_delay_ms

(Originate + rtt) - Transmit
(47890115+19.8) - 47890122 = 12.8ms and_back_delay

Since we can not expect precise clock synchronization between local and remote host this looks pretty decent.

Receive - start_ms
47890122 - 47890065 = 57ms there_delay_ms

end_ms - Transmit
47890182 - 47890122 = 60ms and_back_delay

Now these are just singular values, but that looks pretty sane to me, I am happy to believe that that gunk before, during and after the hping3 call causes additional delay in the ~100 ms range.
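
The two hping3-internal splits above (Receive - Originate, and (Originate + rtt) - Transmit) can be scripted. This is only a sketch: it assumes the "ICMP timestamp:" line looks exactly like the output pasted above, and the function name is hypothetical. The results are only as good as the clock offset between the two hosts.

```shell
#!/bin/sh
# owd_split TS_LINE RTT_MS
# TS_LINE: the "ICMP timestamp: Originate=... Receive=... Transmit=..." line
# RTT_MS:  the unrounded value of the rtt= field from the line before it
# Prints "<there_delay_ms> <and_back_delay_ms>".
owd_split() {
	echo "$1" | awk -v rtt="$2" '{
		# Pick the three timestamps out of the key=value fields
		for (i = 1; i <= NF; i++) {
			split($i, kv, "=")
			if (kv[1] == "Originate") orig = kv[2]
			if (kv[1] == "Receive")   recv = kv[2]
			if (kv[1] == "Transmit")  trans = kv[2]
		}
		up = recv - orig              # local send -> reflector receive
		down = (orig + rtt) - trans   # reflector send -> local receive
		printf "%d %.1f\n", up, down
	}'
}
```

For the run above, `owd_split "ICMP timestamp: Originate=47890115 Receive=47890122 Transmit=47890122" 19.8` prints `7 12.8`.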

Especially since the hping3 command takes ~90ms to complete:

user@work-horse:~$ sudo time -p hping3 9.9.9.9 --icmp --icmp-ts -c 1
HPING 9.9.9.9 (eth0 9.9.9.9): icmp mode set, 28 headers + 0 data bytes
len=46 ip=9.9.9.9 ttl=57 id=10069 icmp_seq=0 rtt=23.8 ms
ICMP timestamp: Originate=48520135 Receive=48520143 Transmit=48520143
ICMP timestamp RTT tsrtt=24


--- 9.9.9.9 hping statistic ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 23.8/23.8/23.8 ms
real 0.09
user 0.00
sys 0.00

Now what would need testing is how repeatable these numbers are....

Any further tests will need to wait though....
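
For the repeatability check, a small helper along these lines might do. It is a sketch with a hypothetical function name; it just summarises a list of RTT samples read from stdin, one value per line.

```shell
#!/bin/sh
# rtt_stats: read one RTT value (ms) per line from stdin and print
# the sample count plus min/avg/max, rounded to one decimal.
rtt_stats() {
	awk 'NR == 1 { min = max = $1 }
	     { if ($1 < min) min = $1; if ($1 > max) max = $1; sum += $1 }
	     END { if (NR) printf "n=%d min=%.1f avg=%.1f max=%.1f\n", NR, min, sum / NR, max }'
}

# Possible use, assuming hping3 prints "rtt=<value> ms" as in the runs above:
#   sudo hping3 9.9.9.9 --icmp --icmp-ts -c 10 \
#     | sed -n 's/.*rtt=\([0-9.]*\) ms.*/\1/p' | rtt_stats
```

Fed with the three RTTs seen so far (27.9, 19.8 and 23.8 ms) it prints `n=3 min=19.8 avg=23.8 max=27.9`.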

Lynx · December 7, 2021, 1:34pm

Nice. So am I right you estimated the RTT of 19.8ms to comprise 12.8ms on uplink and remainder downlink?

I wonder what the significance of this is:

--tcp-timestamp

Enable the TCP timestamp option, and try to guess the timestamp update frequency and the remote system uptime.

Could that be exploited? It seems a shame to be relying upon local system time before and after ping does stuff given variable system load etc. I wonder if there is a way round that. To get more precise time at point of send. Like do we need ping utility to output timestamp or put timestamp in. Nping offered that, but I don't see equivalent in hping3.

Sorry if this sounds stupid, I am slightly struggling to follow this.

moeller0 · December 7, 2021, 1:47pm

Well, if the clocks were synchronized, that would be the split of the 19.8ms RTT into the two directions. But since I am certain the clocks are not synchronized, the exact numbers will be slightly off... (it is still true that they need to add up to not more than 19.8ms; the sum can be smaller, as Receive and Transmit do not need to be identical as they are in the current case).

About TCP timestamps not sure we want to go that route, as TCP requires a 3-way handshake to start a connection up in the first place.

Oh the timestamp when hping sent the packet is in there (look e.g. for Originate=48520135*) and, as @dlakelan proposed and my quick and dirty tests seem to confirm: Originate+tsrtt ~ local Receive
So hping3 would not need additional timestamps (but I wanted to actually see local timestamps making sense with the hping internal numbers before trusting hping; as the saying goes, "trust, but verify", which I have heard from reliable sources does sound better in the original Russian, "Doveryay, no proveryay"; it even rhymes ;)

*) hping3 fills this automatically with the current time, which IMHO is the better default, in nping the --icmp-orig-time now thing would work better as an explicit override to allow different values, but the default of zero is somewhat boring.

Lynx · December 7, 2021, 1:53pm

For the dummies in this thread like me, would you be able to give a simplistic explanation of what these various timestamps actually mean? I gather that originate is the timestamp when the packet is sent? Then I'm lost. Why are receive and transmit the same?

For anyone wanting to follow, this is helpful:

Originate timestamp is the time the sender last touched the message before sending it.

Receive timestamp is the time the echoer first touched it on receipt.

Transmit timestamp is the time the echoer last touched the message on sending it.

All timestamps are in units of milliseconds since midnight UT. If the time is not available in milliseconds or cannot be provided with respect to midnight UT then any time can be inserted in a timestamp provided the high order bit of the timestamp is also set to indicate this non-standard value.

The use of Timestamp and Timestamp Reply messages to synchronize the clocks of Internet nodes has largely been replaced by the UDP-based Network Time Protocol and the Precision Time Protocol.

So with:

ICMP timestamp: Originate=47890115 Receive=47890122 Transmit=47890122
ICMP timestamp RTT tsrtt=20

Does this mean RTT is 20ms and uplink is 47890122-47890115=7ms?

dlakelan · December 7, 2021, 2:01pm

Yes that is my interpretation.

Lynx · December 7, 2021, 2:01pm

Then we need to get hping3 as an OpenWrt package!

It seems to completely outclass nping - at least for our purposes.

moeller0 · December 7, 2021, 2:02pm

So I am no expert and have not looked at the code, but the RFC/documentation of ICMP type 13/14 pairs indicates that there are 3 timestamps involved that are ideally all in milliseconds since midnight in Coordinated Universal Time (UTC). Other clock bases are allowed but should set the highest bit of the 32-bit timestamp value to indicate so (something we should be able to ignore):

"Timestamp or Timestamp Reply Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |      Code     |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Identifier          |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Originate Timestamp                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Receive Timestamp                                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Transmit Timestamp                                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

IP Fields:

Addresses

  The address of the source in a timestamp message will be the
  destination of the timestamp reply message.  To form a timestamp
  reply message, the source and destination addresses are simply
  reversed, the type code changed to 14, and the checksum
  recomputed.

ICMP Fields:

Type

  13 for timestamp message;

  14 for timestamp reply message.

Code

  0

Checksum

  The checksum is the 16-bit ones's complement of the one's
  complement sum of the ICMP message starting with the ICMP Type.
  For computing the checksum , the checksum field should be zero.
  This checksum may be replaced in the future.

Identifier

  If code = 0, an identifier to aid in matching timestamp and
  replies, may be zero.

Sequence Number

  If code = 0, a sequence number to aid in matching timestamp and
  replies, may be zero.

Description

  The data received (a timestamp) in the message is returned in the
  reply together with an additional timestamp.  The timestamp is 32
  bits of milliseconds since midnight UT.  One use of these
  timestamps is described by Mills [5].

The Originate Timestamp is the time the sender last touched the
message before sending it, the Receive Timestamp is the time the
echoer first touched it on receipt, and the Transmit Timestamp is
the time the echoer last touched the message on sending it.

  If the time is not available in miliseconds or cannot be provided
  with respect to midnight UT then any time can be inserted in a
  timestamp provided the high order bit of the timestamp is also set
  to indicate this non-standard value.

  The identifier and sequence number may be used by the echo sender
  to aid in matching the replies with the requests.  For example,
  the identifier might be used like a port in TCP or UDP to identify
  a session, and the sequence number might be incremented on each
  request sent.  The destination returns these same values in the
  reply.

  Code 0 may be received from a gateway or a host."

The bolded section describes the three named timestamps...
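
Two practical consequences of that text, as a sketch with hypothetical helper names: a reflector that sets the high-order bit is not using milliseconds since midnight UT, and differences between such timestamps wrap around at midnight UTC (86400000 ms). This assumes a shell with arithmetic wider than 32 bits.

```shell
#!/bin/sh
# ts_is_standard TS
# Succeeds if the high-order bit of the 32-bit timestamp is clear, i.e. the
# value claims to be milliseconds since midnight UT.
ts_is_standard() {
	[ $(( $1 & 0x80000000 )) -eq 0 ]
}

# ts_diff A B
# Prints A - B in ms, folded into [-43200000, 43200000) so that a pair of
# timestamps taken on either side of midnight UTC still gives a small number.
ts_diff() {
	d=$(( $1 - $2 ))
	if [ "$d" -ge 43200000 ]; then
		d=$(( d - 86400000 ))
	elif [ "$d" -lt -43200000 ]; then
		d=$(( d + 86400000 ))
	fi
	echo "$d"
}
```

For example, `ts_diff 5 86399990` gives 15 rather than -86399985, while an ordinary pair such as `ts_diff 47890122 47890115` still gives 7.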

moeller0 · December 7, 2021, 2:06pm

len=46 ip=9.9.9.9 ttl=57 id=10069 icmp_seq=0 rtt=23.8 ms
ICMP timestamp: Originate=48520135 Receive=48520143 Transmit=48520143
ICMP timestamp RTT tsrtt=24

Pretty much, but we can do a tad better than tsrtt=24 and use the non rounded rtt=23.8 from above instead....

Only if the clocks of the two systems were synchronised to the millisecond, which is unlikely to be the case. The examples we are looking at here are quite well synchronized (I had expected much worse offsets), but so far we are only looking at a single reflector, so that niceness might not be universal.