Latency Debugging / Plotting script

Several people are trying to use either SQM or my script to solve latency issues on their networks. To help debug those issues, I've written a simple Julia script that runs traceroute to a given site, then pings every hop on that route for a given time and plots the latency as a time series. It's tested only on Linux with the standard Linux traceroute and ping tools, though there's no reason it shouldn't work in, say, WSL on Windows.

It's available on master in my https://github.com/dlakelan/routerperf repo as juliapingplot.jl

For those of you who are having mysterious latency issues, such as @segal_72, you might want to try it out to debug some of them. This is VERY rough and not guaranteed to work at all, so I'd appreciate some testing and feedback. I'm very happy to add a bunch of features as time goes on; it's complementary to the gaming script in the same GitHub repo.

To use it, you'll need to download Julia 1.7-rc1 from https://julialang.org/downloads/, then run julia at the command line and have it load the script. The first time you run it, you'll want to uncomment the first few lines, which make Julia grab the needed packages from the Julia package server.

cd into the directory where you've saved my file, then run julia at the command line. At the julia> prompt, type:

julia> include("juliapingplot.jl")

The FIRST time you run it, it will take a while to grab the needed packages. After that, you can comment out those first lines, and it'll take a shorter (but still nontrivial) time to just load the packages. Julia is a just-in-time compiled language and has an issue with "time to first plot", so be a bit patient: it may be some seconds before the plot window appears.

You can change which site it pings by editing the script. I'm going to bed, but those of you in a different part of the globe may want to try it out and give some basic feedback.

Yepp, you really mean that.... :wink: I did try with the current stable (1.6) to no avail :wink: (I want to keep installing stuff like julia via homebrew on my mac*)
That reminds me that "ability to read" often is insufficient if not exercised.

*) I assume that Apple's ping, with its lack of support for -D, is not going to work, but I did not even get that far....

I only test on 1.7 because there is a change in the package manager which is 100% necessary to run on my system. How did 1.6 fail?

When it comes to Julia I always download the latest. It's a trivial install into my home directory (unpack... Then run)

With the first 3 lines uncommented:

mac-book:routerperf user$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.3 (2021-09-23)
 _/ |\__'_|_|_|\__'_|  |  Built by Homebrew (v1.6.3_3)
|__/                   |

julia> include("juliapingplot.jl")
  Activating environment at `/Users/Shared/space/data_local/moeller/PRIVATE/samba/privat/MOEWE/techno_kram/CODE/routerperf/Project.toml`
    Updating registry at `~/.julia/registries/General`
   Resolving package versions...
  No Changes to `/Users/Shared/space/data_local/moeller/PRIVATE/samba/privat/MOEWE/techno_kram/CODE/routerperf/Project.toml`
  No Changes to `/Users/Shared/space/data_local/moeller/PRIVATE/samba/privat/MOEWE/techno_kram/CODE/routerperf/Manifest.toml`
0×0 DataFrame

julia> 

with these lines commented out:

mac-book:routerperf user$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.3 (2021-09-23)
 _/ |\__'_|_|_|\__'_|  |  Built by Homebrew (v1.6.3_3)
|__/                   |

julia> include("juliapingplot.jl")
ERROR: LoadError: ArgumentError: Package StatsPlots not found in current path:
- Run `import Pkg; Pkg.add("StatsPlots")` to install the StatsPlots package.

Stacktrace:
 [1] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:893
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:444
 [3] top-level scope
   @ REPL[1]:1
in expression starting at /Users/Shared/space/data_local/moeller/PRIVATE/samba/privat/MOEWE/techno_kram/CODE/routerperf/juliapingplot.jl:5

julia> 

Hmm. Actually the only redundant part is the Pkg.add; once you've done that, you don't need to keep doing it. But keep the using Pkg and the activate lines uncommented.

I'll put some logic in to make commenting and uncommenting irrelevant. The bigger issue is that the uncommented version runs and does nothing, I think perhaps because the Mac traceroute or ping are not producing what this script expects.
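Roughly, that logic could look something like the sketch below: check which packages Julia can already find and only call Pkg.add for the rest. This is not what the script currently does. The helper names are made up, and the package list in the usage comment is an assumption based on the two names that show up in the output above (StatsPlots and DataFrames).

```julia
using Pkg

# Return the subset of `names` that Julia cannot locate in the active environment.
missing_packages(names) = [n for n in names if Base.find_package(n) === nothing]

# Install only what is missing; does nothing (and needs no network) when
# every package is already available. Helper name is hypothetical.
function ensure_packages(names)
    todo = missing_packages(names)
    isempty(todo) || Pkg.add(todo)
    return todo
end

# Usage (assumed package list, run from the directory holding Project.toml):
# Pkg.activate(".")
# ensure_packages(["StatsPlots", "DataFrames"])
```

With something like this at the top, the same file could be included on the first run and on every later run without editing it.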

Okay, as expected that works... (as much/little as the full version).

Most likely. Unfortunately, on my Ubuntu 20 LTS, Julia is at version 1.4, and there I get:

user@work-horse:~/CODE/routerperf$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.4.1
 _/ |\__'_|_|_|\__'_|  |  Ubuntu   julia/1.4.1+dfsg-1
|__/                   |

julia> include("juliapingplot.jl")
 Activating environment at `~/CODE/routerperf/Project.toml`
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Resolving package versions...
   Updating `~/CODE/routerperf/Project.toml`
 [no changes]
   Updating `~/CODE/routerperf/Manifest.toml`
 [no changes]
ERROR: LoadError: BoundsError: attempt to access 2-element Array{SubString{String},1} at index [3]
Stacktrace:
 [1] getindex at ./array.jl:788 [inlined]
 [2] gethops(::String) at /home/moeller/CODE/routerperf/juliapingplot.jl:26
 [3] top-level scope at /home/moeller/CODE/routerperf/juliapingplot.jl:77
 [4] include(::String) at ./client.jl:439
 [5] top-level scope at REPL[1]:1
in expression starting at /home/moeller/CODE/routerperf/juliapingplot.jl:77

julia> 

but there installing the packages also caused some errors (that I did not catch/log).

Just try grabbing the 1.7 tar, unpack it, and then run that version from the unpacked directory... No install required. I think 1.4 is from early 2020.

Still not working (I guess I need to look into my Ubuntu installation)

user@work-horse:~/CODE/julia/julia-1.7.0-rc1/bin$ ./julia                                                                                                            
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0-rc1 (2021-09-12)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("juliapingplot.jl")
  Activating project at `~/CODE/julia/julia-1.7.0-rc1/bin`
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
  No Changes to `~/CODE/julia/julia-1.7.0-rc1/bin/Project.toml`
  No Changes to `~/CODE/julia/julia-1.7.0-rc1/bin/Manifest.toml`
ERROR: LoadError: BoundsError: attempt to access 2-element Vector{SubString{String}} at index [3]
Stacktrace:
 [1] getindex
   @ ./array.jl:839 [inlined]
 [2] gethops(dname::String)
   @ Main ~/CODE/julia/julia-1.7.0-rc1/bin/juliapingplot.jl:26
 [3] top-level scope
   @ ~/CODE/julia/julia-1.7.0-rc1/bin/juliapingplot.jl:77
 [4] include(fname::String)
   @ Base.MainInclude ./client.jl:451
 [5] top-level scope
   @ REPL[1]:1
in expression starting at /home/moeller/CODE/julia/julia-1.7.0-rc1/bin/juliapingplot.jl:77

julia> 

This is probably because your traceroute is outputting something different from mine. Can you paste output of

traceroute -n -w 0.5 google.com

Here is the graphical output of a run I did while pinging www.google.com and running the FAST.com speed test:

As you can see, even at 800Mbps or so of speedtest, my latency increase stays down in the 5ms range, except for one outlier from one site along the way (which may well be that site responding less quickly rather than anything else). The code outputs a data frame with all the measurements and could be made to write that to a CSV file for later analysis. It could also potentially do statistical analysis of that data to determine where in the chain the delays occur, what the average delay increase is, etc. Those are the kinds of things I'd like to add ultimately. In particular, for some of what @segal_72 has been experiencing, I suspect that he's got congestion along the path rather than at the local connection. Collecting this kind of data can ultimately help to understand what's going wrong.
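To give an idea of what that per-hop analysis and CSV export might look like, here is a minimal sketch. It does not use the script's actual data frame: it assumes the measurements are available as a plain vector of (addr, ms) named tuples, and the function names and column names are invented for illustration.

```julia
using Statistics

# Summarize latency per hop address: the minimum serves as the idle baseline,
# and the largest increase over that baseline hints at where delay builds up.
# `samples` is assumed to be a vector of (addr, ms) named tuples.
function hop_summary(samples)
    out = NamedTuple[]
    for addr in unique(s.addr for s in samples)
        ms = [s.ms for s in samples if s.addr == addr]
        base = minimum(ms)
        push!(out, (addr = addr, n = length(ms), min_ms = base,
                    median_ms = median(ms), max_increase_ms = maximum(ms) - base))
    end
    return out
end

# Write the summary as CSV using only Base I/O (no extra packages).
function write_csv(io::IO, rows)
    println(io, "addr,n,min_ms,median_ms,max_increase_ms")
    for r in rows
        println(io, join((r.addr, r.n, r.min_ms, r.median_ms, r.max_increase_ms), ","))
    end
end
```

If the increase first appears at some hop in the middle of the route and persists on later hops, that would point at congestion along the path rather than on the local link.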

In order to make it work, the script does need to parse the output of both "traceroute" and "ping"... my traceroute outputs like this:

traceroute to www.google.com (142.250.188.228), 30 hops max, 60 byte packets
 1  10.79.1.6  0.268 ms  0.234 ms  0.308 ms
 2  192.168.1.254  2.987 ms  1.265 ms  4.953 ms
 3  162.207.92.1  5.319 ms  11.280 ms  5.690 ms
 4  70.232.229.68  6.284 ms  12.283 ms  5.920 ms
 5  12.242.115.21  12.660 ms  11.913 ms  13.252 ms
 6  12.255.10.176  11.570 ms  17.743 ms  12.827 ms
 7  * * 10.252.217.126  15.254 ms
 8  142.251.60.129  13.688 ms  13.564 ms  13.573 ms
 9  142.250.188.228  13.981 ms 142.251.60.129  13.357 ms 142.250.188.228  13.037 ms

and my ping like this:

PING www.google.com(2607:f8b0:4007:80a::2004) 56 data bytes
[1633974125.855457] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=1 ttl=116 time=4.71 ms
[1633974126.856720] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=2 ttl=116 time=4.19 ms
[1633974127.858339] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=3 ttl=116 time=4.54 ms
[1633974128.859661] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=4 ttl=116 time=4.27 ms
[1633974129.862127] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=5 ttl=116 time=5.37 ms
[1633974130.862500] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=6 ttl=116 time=4.28 ms
[1633974131.863939] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=7 ttl=116 time=4.36 ms
[1633974132.866574] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=8 ttl=116 time=5.07 ms
[1633974133.866879] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=9 ttl=116 time=4.24 ms
[1633974134.868568] 64 bytes from 2607:f8b0:4007:80a::2004: icmp_seq=10 ttl=116 time=4.64 ms

--- www.google.com ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9013ms
rtt min/avg/max/mdev = 4.192/4.567/5.367/0.370 ms

So those are the formats it's expecting. If someone has issues, please send the format of your traceroute and ping output so we can deduce how to modify the script to read that data properly.
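For reference, here is a small sketch of how a ping line in that format can be picked apart. It is not the parsing code from the script; the function name and regex are my own illustration, and it assumes ping was run with -n and -D so each reply line starts with a [timestamp] and shows a numeric address.

```julia
# Parse one line of `ping -n -D` output. Returns (t, addr, seq, ms), or
# `nothing` for the header, the statistics summary, and any other line that
# is not an echo reply. Function name is hypothetical.
function parse_ping_line(line::AbstractString)
    m = match(r"^\[(\d+\.\d+)\] \d+ bytes from ([0-9A-Fa-f:.]+): icmp_seq=(\d+) ttl=\d+ time=([0-9.]+) ms", line)
    m === nothing && return nothing
    return (t = parse(Float64, m[1]), addr = String(m[2]),
            seq = parse(Int, m[3]), ms = parse(Float64, m[4]))
end
```

If your ping prints something this pattern doesn't match (no timestamp, a hostname before the address, different wording), the function just returns nothing for every line, which is the kind of silent "runs and does nothing" behavior to watch out for.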

user@work-horse:~/.julia$ traceroute -n -w 0.5 google.com
traceroute to google.com (142.250.181.206), 30 hops max, 60 byte packets
 1  192.168.42.1  0.283 ms  0.264 ms  0.255 ms
 2  62.52.200.147  9.308 ms  9.488 ms  9.821 ms
 3  62.53.12.14  9.764 ms 62.53.12.12  9.738 ms  9.178 ms
 4  62.53.25.59  9.075 ms  9.050 ms  9.023 ms
 5  74.125.48.102  9.283 ms  9.055 ms  9.265 ms
 6  * * *
 7  209.85.251.206  10.693 ms 209.85.251.130  9.652 ms 209.85.251.206  10.636 ms
 8  209.85.240.161  9.590 ms 108.170.253.85  9.971 ms 209.85.240.83  9.446 ms
 9  108.170.253.49  10.461 ms 142.250.181.206  9.433 ms  9.403 ms

similar to yours, except hop 6 where mine only has 3 entries, while yours has four...

user@work-horse:~/CODE/julia/julia-1.7.0-rc1/bin$ ping -n -D -w 10 www.google.com
PING www.google.com(2a00:1450:4005:802::2004) 56 data bytes
[1633977962.957324] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=1 ttl=118 time=9.64 ms
[1633977963.958362] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=2 ttl=118 time=8.91 ms
[1633977964.959678] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=3 ttl=118 time=9.19 ms
[1633977965.960913] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=4 ttl=118 time=9.10 ms
[1633977966.962372] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=5 ttl=118 time=9.37 ms
[1633977967.963654] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=6 ttl=118 time=9.17 ms
[1633977968.964961] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=7 ttl=118 time=9.16 ms
[1633977969.966174] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=8 ttl=118 time=9.09 ms
[1633977970.967463] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=9 ttl=118 time=9.16 ms
[1633977971.968717] 64 bytes from 2a00:1450:4005:802::2004: icmp_seq=10 ttl=118 time=9.12 ms

--- www.google.com ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9012ms
rtt min/avg/max/mdev = 8.908/9.190/9.639/0.184 ms

ping looks the same...

Ah, I think I'm not handling the case where there's no response at all (i.e. hop 6)!

This seems to be a regex issue. I think I can fix that.
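One regex-based way to handle hops like that is sketched below. It is an illustration rather than the script's real gethops code: the function name is made up, it assumes traceroute was run with -n, and like the existing code it only keeps the first responder on each line.

```julia
# Parse one line of `traceroute -n` output into (n, addr, ms), using the
# first responder on the line and skipping any leading "*" timeouts.
# Returns `nothing` for the header line and for hops that never answered
# ("* * *"). Function name is hypothetical.
function parse_hop(line::AbstractString)
    m = match(r"^\s*(\d+)\s+(?:\*\s+)*([0-9A-Fa-f:.]+)\s+([0-9.]+) ms", line)
    m === nothing && return nothing
    return (n = parse(Int, m[1]), addr = String(m[2]), ms = parse(Float64, m[3]))
end
```

The point of returning nothing instead of indexing into a split result is that a silent hop can simply be skipped by the caller rather than raising a BoundsError.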

Here's the diff for the patch I'm using:

@@ -22,6 +22,9 @@ function gethops(dname)
     while true
         if(!eof(trout))
             line = lstrip(readline(trout));
+            if match(r"[0-9.]+ ms",line) === nothing
+                continue
+            end
             s = split(line,r"[ *]+")
             push!(hops,(n=s[1],addr=s[2],ms=s[3]))
             #@show hops

Fixed it, thanks! Does something that looks about sane...

julia> include("juliapingplot.jl")
ERROR: LoadError: syntax: "<" is not a unary operator
Stacktrace:
 [1] top-level scope
   @ C:\Users\0901\juliapingplot.jl:7
 [2] include(fname::String)
   @ Base.MainInclude .\client.jl:444
 [3] top-level scope
   @ REPL[14]:1
in expression starting at C:\Users\0901\juliapingplot.jl:7

Check to see if the file is corrupted or something. There is no "<" character on line 7 of the script. In fact there is no "<" character anywhere in the file.

hello,
I can't install the script in Julia

julia> add https://github.com/dlakelan/routerperf/blob/master/juliapingplot.jl
ERROR: syntax: extra token "https" after end of expression
(roby) pkg> add http://github.com/dlakelan/routerperf/blob/master/juliapingplot.jl
     Cloning git-repo `http://github.com/dlakelan/routerperf/blob/master/juliapingplot.jl`
ERROR: failed to clone from http://github.com/dlakelan/routerperf/blob/master/juliapingplot.jl, error: GitError(Code:ERROR, Class:HTTP, unexpected http status code: 404)

Don't try to add the script as a package; just download the file to your computer, then "include" the file at the prompt.

How can I download the script directly in Julia?

On the shell command line, run:

wget https://raw.githubusercontent.com/dlakelan/routerperf/master/juliapingplot.jl
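If you'd rather stay inside Julia, something along these lines should also work. Treat it as a sketch: the helper names are made up, and it relies on the Downloads standard library that ships with Julia 1.6 and later. Note that it fetches the raw file, not the github.com "blob" page; saving the blob page gives you HTML instead of the script, which is one possible (unconfirmed) cause of syntax errors like the "<" one above.

```julia
using Downloads

# Turn a github.com ".../blob/<branch>/<path>" page URL into the matching
# raw file URL. URLs that are already raw are returned unchanged.
function raw_url(url::AbstractString)
    url = replace(url, r"^https?://github\.com/" => "https://raw.githubusercontent.com/")
    return replace(url, "/blob/" => "/"; count = 1)
end

# Download the raw script into the current directory (needs network access).
# Helper name is hypothetical.
fetch_script(url, dest = basename(url)) = Downloads.download(raw_url(url), dest)

# Usage at the julia> prompt:
# fetch_script("https://github.com/dlakelan/routerperf/blob/master/juliapingplot.jl")
# include("juliapingplot.jl")
```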