Shaping performance

@dlakelan how about we start in a hackish fashion and simply append* "date +%s%N >> stat_output.txt ; cat /proc/stat >> stat_output.txt" and then parse this with a lua script after the measurement is done? Then we have the freedom to easily play with the different variables under far less time pressure. Note that this will require a real date, as busybox date seems to insist upon reporting just seconds...
We can combine this with your idea of using GNU sleep to run this, say, every 100ms to get reasonably high resolution?

*) This is the gist of the recording; it might still make sense to do this from a script so that other data, like interface traffic counters and wlan rates, can also be collected, timestamped, and saved out.

1 Like

As a first pass this seems reasonable; let's also collect /proc/net/dev to figure out bandwidths.
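The bandwidth numbers then just fall out of differencing the byte counters between samples; roughly something like this sketch (the interface name and interval are placeholders, and the parsing assumes a space after the interface name's colon, which holds on current kernels):

#!/bin/sh
## Sketch: average rx/tx bandwidth for one interface from two
## /proc/net/dev samples taken INTERVAL seconds apart.
IFACE="${1:-eth0}"   ## placeholder, pass your WAN interface instead
INTERVAL=5

read_bytes() {
   ## rx_bytes is field 2, tx_bytes is field 10 on the interface's line
   awk -v ifc="$IFACE" '$1 == (ifc ":") { print $2, $10 }' /proc/net/dev
}

set -- $(read_bytes); RX1=$1; TX1=$2
sleep "$INTERVAL"
set -- $(read_bytes); RX2=$1; TX2=$2

## bytes/s * 8 / 10^6 = Mbit/s (integer arithmetic, so approximate)
echo "rx: $(( (RX2 - RX1) * 8 / INTERVAL / 1000000 )) Mbit/s"
echo "tx: $(( (TX2 - TX1) * 8 / INTERVAL / 1000000 )) Mbit/s"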

You guys are amazing! So glad you're figuring it out. I'm just going to watch from the sidelines for now, but I think this will be a really useful feature in the wiki once it's sorted out.

Here is basic shell code to get started:

#!/bin/sh

## truncate the output file
echo "" > /tmp/stat_output.txt
i=1

echo "Beginning data collection. Run speed test now."

while [ "$i" -lt 200 ]; do ## 20 seconds of data collection
   date +%s%N >> /tmp/stat_output.txt
   cat /proc/stat >> /tmp/stat_output.txt
   echo "" >> /tmp/stat_output.txt ## blank line for separating two files
   cat /proc/net/dev >> /tmp/stat_output.txt
   sleep 0.1
   i=$((i+1))
done

echo "Data collection done. See data file in /tmp/stat_output.txt"

## run analysis script here

@moeller0, can you run this on your router with the speed test and then send me the collected data file? I will start by analyzing it in R to see what info it really shows us; perhaps we have a methodology problem that would be useful to know about before coding up a lua script.

2 Likes

@moeller0

What if we just collected this kind of file and uploaded it to an "analyzer"? At the end of the script we basically say "Do you want to upload the statistics file to contribute to the OpenWrt performance estimator?" and if the user says "Y" we wget --post-file= it up to some website that then runs the parser/analyzer and adds the results to a database. The user is free to decide not to contribute, but we don't support analysis of the data on the router. A major reason for that split is that analyzing the file on the router would require running parser/analysis code that is much less auditable. It's far easier for people to conclude "gee, the data collected is innocuous, I could contribute it no problem" than "gee, this complicated lua parser/analyzer is OK to run as root on my router".
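In script form that opt-in step might look roughly like this (the endpoint URL is a made-up placeholder, and note that busybox wget lacks --post-file, so the full GNU wget package, or curl, would be needed):

#!/bin/sh
## Sketch of the opt-in upload step; UPLOAD_URL is purely hypothetical
UPLOAD_URL="https://example.org/sqm-collector"

printf "Do you want to upload the statistics file to contribute to the OpenWrt performance estimator? [y/N] "
read answer
case "$answer" in
   y|Y)
      wget --post-file=/tmp/stat_output.txt -O - "$UPLOAD_URL" \
         && echo "Upload complete, thanks!" \
         || echo "Upload failed; the data is still in /tmp/stat_output.txt"
      ;;
   *)
      echo "Skipping upload; the data stays in /tmp/stat_output.txt"
      ;;
esac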

I'd say it's no problem to make the whole database public information (say, a sqlite file that you can download), and of course the results of our analysis would be used to predict the max bandwidth for each router in the ToH.

To do it this way, I'd basically just add a header on top of the file... something that has, say, the kernel version, the SQM package version, the router HW specifications, and a sha256 hash of some unique identifier, such as the concatenation of /proc/version, /etc/openwrt_release, and the MAC address of eth0.
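Such a header could be generated with a few lines along these lines (a sketch only: the exact field set, sqm-scripts as the package name, and /tmp/sysinfo/model as the hardware string are assumptions):

#!/bin/sh
## Sketch of the header; the pseudonymous ID is a sha256 over the three
## items mentioned above
ID=$(cat /proc/version /etc/openwrt_release /sys/class/net/eth0/address | sha256sum | cut -d ' ' -f 1)

{
   echo "###id $ID"
   echo "###kernel $(uname -r)"
   echo "###sqm $(opkg status sqm-scripts 2>/dev/null | grep '^Version:')"
   echo "###hw $(cat /tmp/sysinfo/model 2>/dev/null)"
} > /tmp/stat_output.txt   ## write the header first, then append samples below it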

1 Like

Okay, on my Netgear WNDR3700v2, after:
opkg update ; opkg install coreutils-date coreutils-sleep

I ran:

#!/bin/sh

## truncate the output file
echo "" > /tmp/stat_output.txt
i=1

echo "Beginning data collection. Run speed test now."

while [ "$i" -lt 600 ]; do ## 60 seconds of data collection
   echo "###date" >> /tmp/stat_output.txt
   date +%s%N >> /tmp/stat_output.txt
   echo "###/proc/stat" >> /tmp/stat_output.txt
   cat /proc/stat >> /tmp/stat_output.txt
   echo "" >> /tmp/stat_output.txt ## blank line for separating two files
   echo "###/proc/net/dev" >> /tmp/stat_output.txt
   cat /proc/net/dev >> /tmp/stat_output.txt
   sleep 0.1
   i=$(( ${i} + 1 ))
done

echo "Data collection done. See data file in /tmp/stat_output.txt"

## run analysis script here

exit 0

The speedtest result is here:
http://www.dslreports.com/speedtest/32640248

SQM info:

root@nacktmulle:~# cat /etc/config/sqm

config queue
	option debug_logging '0'
	option verbosity '5'
	option upload '9545'
	option linklayer 'ethernet'
	option overhead '34'
	option linklayer_advanced '1'
	option tcMTU '2047'
	option tcTSIZE '128'
	option tcMPU '64'
	option qdisc_advanced '1'
	option ingress_ecn 'ECN'
	option egress_ecn 'NOECN'
	option qdisc_really_really_advanced '1'
	option squash_dscp '0'
	option squash_ingress '0'
	option download '46246'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option interface 'pppoe-wan'
	option linklayer_adaptation_mechanism 'default'
	option iqdisc_opts 'nat dual-dsthost ingress mpu 64'
	option eqdisc_opts 'nat dual-srchost mpu 64'
	option enabled '1'

root@router:~# uname -a
Linux router 4.9.91 #0 Tue Apr 24 15:31:14 2018 mips GNU/Linux

root@router:~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r6755-d089a5d773'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r6755-d089a5d773'
DISTRIB_TAINTS='no-all busybox'

root@router:~# cat /proc/version
Linux version 4.9.91 (perus@ub1710) (gcc version 7.3.0 (OpenWrt GCC 7.3.0 r6687-d13c7acd9e) ) #0 Tue Apr 24 15:31:14 2018

The script and result file can be found under http://workupload.com/archive/YPzFKTt (please note that this is the first time I have tried a free file hoster, so let me know if this thing is malicious); here are the hash values for the uploaded files to make modifications detectable:

bash-3.2$ openssl sha1 proc_sampler.sh 
SHA1(proc_sampler.sh)= beff294dba4b573f140950b5756af63d09509aad
bash-3.2$ openssl sha1 stat_output.txt 
SHA1(stat_output.txt)= 16cd13ea610861005cd6f8b4c973244322d1fff0

I started the speedtest after around 8 seconds, but the speedtest itself has a rather long preparation time. The measurement was done over the 5GHz radio. The upload test shows some issues, but the goal here is not to get a perfect measurement, rather an initial, reasonably well-documented data set for testing whether our discussed approach has merit :wink:

Many Thanks

1 Like

This sounds great. As a first step I would say let's get an analyzer coded and tested, and then we can just distribute that to interested users. That will allow anything under the sun as analysis software, be it R, Octave, or, for the so inclined, even Forth :wink:

1 Like

what no LOLCODE? https://en.wikipedia.org/wiki/LOLCODE

1 Like

Quick thought - does this code account for multi-core processors? Will a dual-core always look 50% free since only one CPU is working? I just don't know enough to know how CPU utilization is calculated here.

The /proc/stat output gives per-cpu usage, so we should be able to account for multi-core at least somewhat. We'll see how things go. All of that is down to the analysis code rather than the data collection code.
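For what it's worth, the usual way to get utilization out of /proc/stat is to difference the per-CPU jiffy counters between two samples; a rough sketch of that calculation (not meant as the eventual analysis code):

#!/bin/sh
## Sketch: per-CPU busy percentage over a 1 second interval. The /proc/stat
## columns are user nice system idle iowait irq softirq steal; "busy" here
## is everything except idle+iowait.
snapshot() { grep '^cpu[0-9]' /proc/stat; }

S1=$(snapshot); sleep 1; S2=$(snapshot)

{ echo "$S1"; echo "$S2"; } | awk '
{
   total = $2+$3+$4+$5+$6+$7+$8+$9
   idle  = $5+$6
   if ($1 in t1) {                      # second sample for this CPU
      dt = total - t1[$1]; di = idle - i1[$1]
      if (dt > 0) printf "%s busy %.1f%%\n", $1, 100*(dt-di)/dt
   } else { t1[$1] = total; i1[$1] = idle }
}'

So a dual-core that is saturating one core would show up as one CPU near 100% busy and the other near idle, rather than as a flat 50%.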

1 Like

Probably nothing malicious, but it's hard to see the files (there's too much clicking required).

I'm partial to pastebin.com (and there are tons of others...). You can post files (write-only) anonymously, or sign up for an account, which allows you to edit your posted files.

1 Like

Re the workupload link: I downloaded the zip file and uncompressed it without any issues. So at least for the purposes of putting together a parser/data extractor program I'm set; now I just need to find a little block of time, which probably won't happen until next week given all the various issues I'm handling right now.

I haven't forgotten this project, but I've been extremely busy with personal and family activities. So here are some thoughts I've had over the downtime:

The format of these various files is not particularly uniform or easy to parse; hand-writing a parser will involve a lot of by-hand if-then garbage: read n bytes from here, then skip q bytes, then read to the end of the line... etc etc etc

So I had the thought that perhaps an ANTLR grammar with a javascript target would make sense: let a machine write the parser code, and then just walk the parse tree, extract the key bits of information, and slam them into a sqlite file. Once the sqlite file is created, analysis of the data in R is how I'd proceed.

So, any thoughts? Does anyone have specific experience with ANTLR?

To give an idea how this might look, here's the skeleton of the grammar:

STATFILE : STATENTRY* EOF ;

STATENTRY: DATE STAT PROCDEV ;

DATE : COMMENT* INT ;

STAT : COMMENT* CPUENT* IGNORABLE ;

PROCDEV : COMMENT* PDHEADER INTFC ;

INTFC : WS* INAME ":" (WS* INT)* ;

etc etc

Then ANTLR does its magic: you run it all and extract just the juicy bits you care about, i.e. you look at every STATENTRY, grab its associated date, the CPUENTs, and the INTFC info, do a calculation, and then store a row in the SQL table.
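Whatever produces the rows, the sqlite side can stay simple; something like this, where the schema and the example values are purely illustrative:

sqlite3 /tmp/sqm_perf.db <<'EOF'
CREATE TABLE IF NOT EXISTS samples (
    run_id   TEXT,     -- pseudonymous ID from the header
    t_ns     INTEGER,  -- timestamp from date +%s%N
    cpu      TEXT,     -- "cpu0", "cpu1", ...
    busy_pct REAL,     -- derived from the /proc/stat deltas
    iface    TEXT,     -- "pppoe-wan", "eth0", ...
    rx_mbps  REAL,
    tx_mbps  REAL
);
-- one example row with made-up values
INSERT INTO samples VALUES ('deadbeef', 1525000000000000000, 'cpu0', 42.5, 'pppoe-wan', 35.1, 8.9);
EOF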

An alternative idea I had was to use something like m4 to text-process the raw data file into something much simpler to parse, and then manually parse it in R.

For example, it could strip out all the irrelevant lines and irrelevant numbers on given lines, and leave you with a file that looked like:

{
time : 123456,
cpuIdle : {193331,19300,19331},
intface : {{"lo",...},{"eth0",...}...},
...
}

It seems like it wouldn't be too hard to make the output compliant JSON and then parse it with a JSON parser. Most of the m4 macros would be responsible for just reading lines and deleting them :wink: while the other macros would be responsible for transforming things like

lo : 1999 1999 1999 0 0 0 0 0 ...

into

{'lo',1999,1999,1999,0,0,0,0,0,...}

or some such thing

This has considerable appeal in that it probably involves a lot less overhead, but on the other hand, m4 macros aren't exactly the easiest thing to write, debug, or understand. Still, considering the complexity of getting ANTLR to do this relatively simple task, I'm kinda in favor of this approach.

+1 to that; I know that using an intermediate format like JSON is not going to save any work in writing the initial parser, but it will be much easier to switch to different consumers of the parsed data that way; plus, unlike sqlite, those files are still easy to read without any additional tools. But I have no knowledge whatsoever of ANTLR or m4, so all I can do is watch and be duly impressed.

I've actually used m4 in the past for crunching large text files into more useful formats, whereas I haven't used ANTLR (though I have skimmed through the book and was considering it for another project) and I don't do Java. ANTLR will work with several other languages, of which javascript is probably one I could do, but... eh, I'm inclined to try the m4 thing, massage it all into JSON, and let people use that.

That being said, it's probably over a decade since I wrote anything in m4...

I've been thinking about this a bit too and feel like a synthetic benchmark with a PC acting as a server might be more reproducible and better for building a database. It wouldn't be as good for crowdsourcing perhaps and might not be as "real-world" but I think it would be more scientific and potentially more reliable.

Something simple like setting the SQM bandwidth to a really big number (e.g. 1 Gbps) and then running iperf locally with a PC acting as a server. The actual bandwidth achieved should be limited by the CPU's ability to deal with SQM/Cake/Codel. Since the SQM bandwidth is set too high, the actual latency might be high, but the bandwidth should be close to the max bandwidth the router can handle. At that point, you could try setting an SQM bandwidth slightly lower (e.g. 10% lower) than the previous "max" value, and ensure with a simple ping that latency actually remains low. Not sure if Flent or something else could simplify this.

As an alternate approach, maybe inquire on one of the bloat/cake/make-wifi-fast/codel mailing lists to see if someone has an idea on how to benchmark or would consider writing a script to do so. The script could ramp up bandwidth while monitoring latency and CPU utilization, producing a final approximate max bandwidth a router can handle.
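In rough outline such a ramp-up script might look something like this (just a sketch under several assumptions: iperf3 at both ends, a LAN host at 192.168.1.100 running iperf3 -s, the default sqm-scripts config section name, and made-up step size and latency threshold):

#!/bin/sh
## Sketch: raise the shaper rate in steps, load the link with iperf3, and
## stop once the average ping under load exceeds a (made-up) threshold.
SERVER=192.168.1.100    ## LAN host running "iperf3 -s" (placeholder)
STEP=50000              ## kbit/s added per round (placeholder)
THRESH_MS=10            ## "acceptable" latency under load (placeholder)

rate=$STEP
while :; do
   uci set sqm.@queue[0].download="$rate"
   uci set sqm.@queue[0].upload="$rate"
   uci commit sqm && /etc/init.d/sqm restart

   iperf3 -c "$SERVER" -t 15 -R >/dev/null 2>&1 &   ## download-direction load
   avg=$(ping -c 10 "$SERVER" | awk -F'= ' '/round-trip|rtt/ { print $2 }' | cut -d '/' -f 2)
   wait

   echo "rate ${rate} kbit/s -> average ping ${avg} ms under load"
   [ "${avg%%.*}" -ge "$THRESH_MS" ] && break   ## crude integer comparison
   rate=$((rate + STEP))
done
echo "approximate limit: $((rate - STEP)) kbit/s"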

Just an idea. I have no skills to write any of this, but think the data would be meaningful to help people select a router. It could also be helpful to understand the characteristics of CPU performance that influence SQM performance, which could in theory guide future CPU or router designs (ok, now I'm really dreaming).

A synthetic benchmark will be more reproducible yes, but I don't think better for two reasons:

  1. Real-world performance is what people care about; synthetic benchmarks give lower-noise (variance) but more biased (different from what's wanted) information relative to something like the speed-test results.
  2. Synthetic benchmarks won't have the ISP equipment in the picture, in particular the range of behaviors possible with DOCSIS or DSL.

In the end the main thing is to provide a predictive estimate, from a statistical model, of the real-world performance that you can reliably achieve, so that some speed rating can be assigned. I personally imagine this speed rating being rounded to the nearest 25 Mbps, so you'd see something like 75 for some ancient WRT54GL, 125 for some 2007 hardware, and maybe 250 or 325 for a WRT3200ACM (based on results in a different thread), or whatever... This lets people do things like say "Hey, I can get 100 Mbps now and might go to 200 or 250 in a few years, but I doubt I'll get more than 300 Mbps in the next 7 years, so I can go with any router on this list with more than a 275 rating."

Or alternatively: "Hey, I might get 500 Mbps in a year or two, I should really move up to an x86 now, even though if I bought a WRT3200ACM it would handle the 200 Mbps I have today..."

As long as it facilitates that kind of decision making effectively, I don't think we need dramatically more precision than that. People who want to tinker and eke out a few extra tens of Mbps will need to do their own testing on their particular link, at their particular load level, and with their particular set of installed packages anyway.

Ok, here's how to get a really basic JSON output. First grab the stat_output.txt file and put it in a directory.

Then install GNU m4 on your Linux machine and put this m4 macro script into a file called jsonify.m4 in the same directory:

divert(-1)dnl
define(`delfirstbrack',`define(`delfirstbrack',`')dnl')
define(`delcomma',`')
pushdef(`delcomma',`dnl')
define(`cpuent0',["$1"`,translit('$2`,` ',`,')]')
define(`cpuentn',`,'["$1"`,translit('$2`,` ',`,')]')
define(`date',`delfirstbrack]}
delcomma,
{
"time" : popdef(`delcomma')')
define(`procstat',`,
"cpuvals" :[')

define(`ctxt',`] dnl')
define(`intr',`dnl')
define(btime,`dnl')
define(processes,`dnl')
define(procs_running,`dnl')
define(procs_blocked,`dnl')
define(softirq,`dnl')
define(procnetdev,`,
"interfaces" :[pushdef(`delcomma',`dnl')dnl')
define(Inter,`
dnl')
define(face,`dnl')

define(interfacedata,`delcomma,
define(`delcomma',`')dnl
[patsubst("'$1`",` +',`')`,' patsubst('$2`,` +',`,')]')
divert dnl
m4wrap(]}])dnl
[

Then put this script called jsonify.sh into the same directory:

#!/bin/sh
cat jsonify.m4 $1 | sed -E -e "s:#*::g"\
			-e "s:/proc/stat:procstat:"\
			-e "s:/proc/net/dev:procnetdev:"\
			-e "s:(cpu) (.*):cpuent0(\1,\2):"\
			-e "s:(cpu[0-9]+) (.*):cpuentn(\1,\2):"\
			-e "/^ face/,/date/ s/(.*):(.*)/interfacedata(\1,\2)/" | m4


Then chmod u+x jsonify.sh and run

./jsonify.sh stat_output.txt

It should give you a sequence of JSON objects
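Before pointing R at it, it's worth a quick check that the result is actually well-formed (assuming Python is available on the analysis machine):

./jsonify.sh stat_output.txt > stat_output.json
python -m json.tool stat_output.json > /dev/null && echo "well-formed JSON"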

EDIT: It should but I haven't actually tried to parse it, so if you find errors let me know. I'll get a chance to test reading it in probably some time next week.

EDIT: ok, I admit I screwed a few things up so fixing them...