fq_codel_fast: helpers, tests, and testers?

I've long had some potential improvements to fq_codel sitting over here: https://github.com/dtaht/fq_codel_fast/ (it's probably a bit out of date...)

As openwrt bakes fq_codel into the kernel rather than building it as a module, I've not got around to testing it
on smaller platforms, or developing the ideas further. My original intent was to create something that did hardware multiqueue (hw mq) better, but that turned out to be hard.

By getting rid of the flows parameter, we got a constant-size flow table and thus more room for cpu optimization.
I don't know of anyone that tries to use flows for anything, but... if they need it, they can scream here.

By getting rid of the non-O(1) search for the fattest flow, behavior under overload should
be improved, but this adds a constant cost for keeping track of the fattest flow, which might hurt normal performance. I prefer O(1) behaviors on overload, sooo....

It benchmarked out as about 5% less cpu at gigE with these two mods on x86 hardware, and better
on overload in general, on that hardware.

Bug: I never figured out how to do the "drop from current flow" thing right. It's just math on pointers, and that always hurts my brain, so I just disabled it. Patches gladly accepted.... see the commit logs. Sorry for being sloppy...

"Feature": It has a very, very preliminary implementation of "SCE" support, which conflates two things
that it perhaps shouldn't. GRO has become a real problem of late: splitting GRO'd packets apart
where it matters is probably a good idea, and not splitting where it doesn't matter is also a good idea.

And as for SCE vs L4S, oy vey. It's the most heated debate in the IETF I've ever been in, and I'd like more openwrt folks to participate in it on the tsvwg mailing list and to help with testing. Some recent results: https://github.com/heistp/sce-l4s-ect1#key-findings

The two concepts are conflated in the code: to enable both GRO splitting AND SCE, use the
ce_threshold parameter, probably with a threshold of 2.5ms or so.
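
For anyone wanting to try that, here is a minimal sketch of what setting the parameter could look like. It assumes the stock fq_codel `ce_threshold` syntax from iproute2; whether an unmodified tc can pass the option through to fq_codel_fast is not confirmed in this thread, and on stock fq_codel the option only sets a CE marking threshold. The helper name and the interface name are placeholders. It only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a tc command that sets ce_threshold.
# Assumption: stock fq_codel option syntax; swap the qdisc name only if your
# tc build knows about fq_codel_fast.
ce_threshold_cmd() {
    iface="$1"                 # e.g. eth0 (placeholder)
    threshold="${2:-2.5ms}"    # the "2.5ms or so" suggested above
    printf 'tc qdisc replace dev %s root fq_codel ce_threshold %s\n' \
        "$iface" "$threshold"
}

# Usage: inspect the output, then run it by hand as root.
# ce_threshold_cmd eth0 2.5ms
```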

Any testing here should NOT make it back into openwrt mainline, but given THIS bug

commit b723748750ece7d844cdf2f52c01d37f83387208
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Mon Apr 27 16:11:05 2020 +0200

tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040

RFC 6040 recommends propagating an ECT(1) mark from an outer tunnel header
to the inner header if that inner header is already marked as ECT(0). When
RFC 6040 decapsulation was implemented, this case of propagation was not
added. This simply appears to be an oversight, so let's fix that.

Fixes: eccc1bb8d4b4 ("tunnel: drop packet if ECN present with not-ECT")
Reported-by: Bob Briscoe <ietf@bobbriscoe.net>
Reported-by: Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com>
Cc: Dave Taht <dave.taht@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

which definitely needs to be backported into all of openwrt. More folks exercising ALL tunnel types in the presence of fq_codel and ECT(1) marking needs to happen also.
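
When testing tunnels it helps to confirm which ECN codepoint actually arrives on the inner packets. The sketch below decodes the two ECN bits of an IPv4 TOS (or IPv6 traffic class) byte using the RFC 3168 codepoints; the function name is made up for illustration, and the tcpdump line in the comment uses a placeholder interface.

```shell
#!/bin/sh
# Decode the low two bits (the ECN field) of a TOS / traffic-class byte.
# Codepoints per RFC 3168: 00 Not-ECT, 01 ECT(1), 10 ECT(0), 11 CE.
ecn_name() {
    case $(( $1 & 3 )) in
        0) echo "Not-ECT" ;;
        1) echo "ECT(1)" ;;
        2) echo "ECT(0)" ;;
        3) echo "CE" ;;
    esac
}

# Example: capture only IPv4 packets marked ECT(1) after decapsulation.
# tcpdump -ni eth0 'ip[1] & 0x3 = 1'
```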


Hi,

Your opening post is a bit of a ramble (that's ok, I do it too) and I'm trying to orient myself, as I came here from the ath10k AQL thread.

Looking over the source, it appears to be related not to ath10k-ct wireless AQL but to wired ethernet interface SQM. If that's the case, I can't help on my r7500v2 (ath10k based), as it is a wireless AP/switch only at this point, but I might make time to try it on my DIY ubuntu (20.04) router, currently configured with SQM (fq_codel)/QoS.

If that's of interest for testing let me know.


That would be very helpful, thanks. I think, however, that fq_codel is compiled into the kernel, not as a module, on ubuntu too. Let me know. If you have the kernel headers and build-essential installed, and IF it's a module, you can just do a make && make install on ubuntu 20.

The core algorithm is the same between wifi and ethernet. It's just easier to experiment with ethernet first. And I've lacked time for over a year, to experiment.

Note: fq_codel runs at line rate on the ethernet, no sqm required. It might run faster or better with these mods, both at line rate and within sqm, but I just plain ran out of energy to pursue it. So I figured that with a bit of help and a team set up, we could make faster progress on various fronts. I have a little spare time, and given how popular and endless the ath10k thread seems to be, I thought maybe I could attract a few more people over to this simpler way to explore some new ideas.


I took a quick peek before I posted and I suspect it is...

[2] $ sudo lsmod | grep codel
sch_fq_codel           20480  6
[4] $ sudo modinfo sch_fq_codel | grep codel
filename:       /lib/modules/5.4.0-29-generic/kernel/net/sched/sch_fq_codel.ko
name:           sch_fq_codel

but being an enthusiast and not a professional, I've been fooled before.

EDIT 0: Once I work my way through the issue below, I'll post on your github site (this forum is for openwrt and I'd like it to stay that way)

I ran into this when trying to run make:

sudo make
make -C /lib/modules/5.4.0-29-generic/build SUBDIRS=/home/ul/tmp/fq_codel_fast modules LDFLAGS_MODULE="--build-id=0x51bd334bffe5fca5171fdc08c1e83eb7b5c1db1c" CFLAGS_MODULE="-DCAKE_VERSION=\\\"51bd334bffe5fca5171fdc08c1e83eb7b5c1db1c\\\""
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-29-generic'
make[2]: *** No rule to make target 'arch/x86/tools/relocs_32.c', needed by 'arch/x86/tools/relocs_32.o'.  Stop.
make[1]: *** [arch/x86/Makefile:232: archscripts] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-29-generic'
make: *** [Makefile:8: default] Error 2

I have the kernel headers:

apt search linux-headers-$(uname -r)
Sorting... Done
Full Text Search... Done
linux-headers-5.4.0-29-generic/focal-updates,focal-security,now 5.4.0-29.33 amd64 [installed,automatic]
  Linux kernel headers for version 5.4.0 on 64 bit x86 SMP

EDIT 1: It builds...

need to change

...
$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) ...
...

in the makefile to:

...
$(MAKE) -C $(KDIR) M=$(PWD) ...
...
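
The same change can be applied with a one-line sed, sketched here under the assumption that the Makefile spells the argument exactly as `SUBDIRS=$(PWD)` as shown above. As far as I know, kbuild stopped honoring `SUBDIRS=` for out-of-tree builds around the 5.4 kernel, which would explain the odd "No rule to make target" error. The function name is made up for illustration; a `.bak` copy of the original file is kept.

```shell
#!/bin/sh
# Replace SUBDIRS=$(PWD) with M=$(PWD) in the given Makefile, keeping a backup.
# The single quotes stop the shell from expanding $(PWD); sed sees it literally.
fix_kbuild_makefile() {
    sed -i.bak 's/SUBDIRS=\$(PWD)/M=$(PWD)/' "$1"
}

# Usage, from the fq_codel_fast checkout:
# fix_kbuild_makefile Makefile && make
```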

The router is in heavy use due to COVID-19, but hopefully I can give it a quick try in the next couple of days.


is it still the ubuntu default qdisc for ethernet?


so... I'm participating to be helpful, but also as a learning exercise for myself, i.e. please be patient.

I think yes, based on (enp1s0 is WAN):

[38] $ tc -d -s qdisc show dev enp1s0
qdisc htb 1: root refcnt 2 r2q 10 default 0x12 direct_packets_stat 0 ver 3.17 direct_qlen 1000
 Sent 5707192195 bytes 38092976 pkt (dropped 25413, overlimits 6186964 requeues 4) 
 backlog 0b 0p requeues 4
qdisc fq_codel 120: parent 1:12 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms memory_limit 32Mb ecn 
 Sent 5688504342 bytes 37935193 pkt (dropped 25411, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
  maxpacket 7837 drop_overlimit 0 new_flow_count 13646731 ecn_mark 0
  new_flows_len 0 old_flows_len 1
qdisc fq_codel 130: parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms memory_limit 32Mb ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 110: parent 1:11 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms memory_limit 32Mb ecn 
 Sent 18687853 bytes 157783 pkt (dropped 2, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
  maxpacket 1394 drop_overlimit 0 new_flow_count 135571 ecn_mark 0
  new_flows_len 1 old_flows_len 26
qdisc ingress ffff: parent ffff:fff1 ---------------- 
 Sent 67814104374 bytes 57591849 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0

turn sqm off and hopefully fq_codel is running native on that interface. It wouldn't surprise me if it ended up being pfifo_fast or sch_fq at this point, though.
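
A quick way to check is to read the kernel's default qdisc setting and then look at what is actually attached once sqm is stopped. This is a sketch: the function name is made up, the path argument exists only so the helper can be pointed at a test file, and enp1s0 is the interface name used elsewhere in this thread.

```shell
#!/bin/sh
# Print the qdisc the kernel attaches to new interfaces by default.
# With no argument this reads the standard net.core.default_qdisc sysctl file.
default_qdisc() {
    cat "${1:-/proc/sys/net/core/default_qdisc}"
}

# Usage:
# default_qdisc
# tc qdisc show dev enp1s0    # what is really attached with sqm stopped
```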


it'll have to wait a bit. my son is on his xbox and two other family members are streaming/gaming.

There is also the small fact that I recently "rebuilt" (and upgraded to 20.04 from 18.04) the router due to a failing HD. That disruption in service was tolerated but not much appreciated so I have to minimize potential outages for the time being.

FWIW, I recall that I saw fq_codel on that interface before enabling sqm, but I'll stop it when I have a moment and see. I also recall that testing via "dslreports" gave me an "F" for bufferbloat before enabling sqm and all "A's" after that (so I'm confused about your "no sqm required" comment above). It'll be interesting to see...

BTW "Bug" in your opening post refers to commit f00ad01 (related to the function fq_codel_drop)?

I'm still reading so "feature" may be a week or two off...


Certainly sqm is required for your environment, and I know dang full well how upset the whole family can get when the network acts up! All the bufferbloat work was basically based on trying to make the internet work "right" for jim gettys' family network - one major, old, brilliant geek, a geeky son, an artistic daughter, and a schoolteacher spouse, that kept going "Daaadd - stop using the internet, I'm trying to {game, upload some pics, do a videoconference}." To read that story, going back to 2010, and moving forward to present day, is enlightening. https://gettys.wordpress.com/2010/09/16/interesting-puzzle-surprising-results/

My test was merely to see if fq_codel was the default at line rate also. if it is, it's kind of painful to get a new version installed, and if it's buggy, really hard to get the box back.


so, anyway, if it's the default,

DON'T do
make install
in the fq_codel_fast dir. Just run make. Then, rather than installing:

sqm stop
for i in $(ls /sys/class/net)   # each device you have
do
    tc qdisc replace dev "$i" root pfifo_fast
done

rmmod sch_fq_codel.ko # you need to remove the kernel module when all the users are gone
insmod ./sch_fq_codel.ko # temporarily replace the existing kernel module

restart sqm
beat it up

I do fear it will be buggy and lock this kernel up. If it does, you'll need a reboot, but at least this way
you won't permanently mess up your box.


The kernel does not want to let go of sch_fq_codel. I can replace the qdisc with pfifo_fast, but I'm not sure I'm really stopping sqm (/usr/lib/sqm/stop-sqm enp1s0 completes without error...)

EDIT 1: skimming through /usr/lib/sqm/stop-sqm and its support scripts, ifb4<if> should not be present and I should not see /var/run/<if>.state if stop-sqm completes successfully. As I still saw both, something is not working as I expected.

EDIT 2: from /usr/lib/sqm/run.sh (on my openwrt AP) it looks like I need to:

IFACE=enp1s0 SCRIPT=/usr/lib/sqm/simple.qos /usr/lib/sqm/stop-sqm

to "sqm stop"

(silly, I saw the comment at the top of "stop-sqm" that says "allow passing in the IFACE as first command line argument", but there is no code like IFACE=$1 to actually do that)

EDIT 3: even after stopping sqm, I can't rmmod fq_codel. I tried rmmod on several modules associated with sqm, ip link set <all-ifs> down, and searching the kernel source for *module_get or *module_put to find what might be keeping the "used by" count at 1. No luck, but maybe it doesn't matter; see next post.
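
One thing worth checking in this situation: as far as I understand it, every attached qdisc instance holds a reference on its module, so a single interface (or ifb device) still carrying fq_codel, for example as the system default qdisc, would be enough to keep the count above zero. The sketch below pulls the "Used by" count out of lsmod-style output; the function name is made up for illustration.

```shell
#!/bin/sh
# Print the "Used by" count (third lsmod column) for exactly one module name.
# Reads lsmod-style text on stdin so it can also be fed saved output.
module_users() {
    awk -v m="$1" '$1 == m { print $3 }'
}

# Usage:
# lsmod | module_users sch_fq_codel
# tc qdisc show | grep fq_codel    # which devices still have it attached
```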

my 45 min window of opportunity to play with the network just closed so it'll be 24 hours before I can try again

I need a "lab" to be of much help. I can do that on the cheap, but it will take time in the present environment.


so fq_codel_fast/sqm (simple.qos) are loaded and appear to be working at the moment.

I didn't follow your instructions

What I did:

sudo ~/tmp/sqm.sh stop enp1s0 # DIY start/stop sqm script
sudo insmod ~/tmp/fq_codel_fast/sch_fq_codel_fast.ko
[24] $ lsmod | grep sch
sch_htb                28672  2
sch_fq_codel_fast      20480  4
sch_ingress            16384  1
sch_cake               32768  0
sch_fq_codel           20480  2
# change QDISC=fq_codel_fast in /etc/sqm/enp1s0.iface.conf
[14] $ sudo ~/tmp/sqm.sh start enp1s0
Starting SQM script: simple.qos on enp1s0, in: 26000 Kbps, out: 3100 Kbps
Using generic sqm_start_default function.
get_burst (by duration): the calculated burst/quantum size of 387 bytes was below the minimum of 1749 bytes.
get_burst (by duration): the calculated burst/quantum size of 387 bytes was below the minimum of 1749 bytes.
WARNING: qdisc fq_codel_fast does not support a limit
WARNING: qdisc fq_codel_fast does not support a limit
WARNING: qdisc fq_codel_fast does not support a limit
WARNING: qdisc fq_codel_fast does not support a limit
simple.qos was started on enp1s0 successfully

# ping 8.8.8.8 and do ookla speed test (no obvious change from using fq_codel/simple.qos or cake/piece_of_cake.qos - I think a good sign but I'd like to look "deeper" if possible on my not so fast "broadband")

[31] $ sudo tc -d -s qdisc show dev enp1s0
[sudo] password for ul: 
qdisc htb 1: root refcnt 2 r2q 10 default 0x12 direct_packets_stat 0 ver 3.17 direct_qlen 1000
 Sent 11583428 bytes 54082 pkt (dropped 611, overlimits 6186 requeues 0) 
 backlog 0b 0p requeues 0
qdisc fq_codel_fast 120: parent 1:12 
 Sent 11514074 bytes 53515 pkt (dropped 611, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc fq_codel_fast 130: parent 1:13 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc fq_codel_fast 110: parent 1:11 
 Sent 69354 bytes 567 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc ingress ffff: parent ffff:fff1 ---------------- 
 Sent 324709486 bytes 242045 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0

I'll let this run for a bit, and I can start and stop it, but as I've shut down the nextcloud server also running on this box for safety, I will not let it go indefinitely.

HTH

EDIT 0: I let it run for a couple of hours (worked fine, no obvious issues) but shut it off, as I need to do domestic activities and can't be around to fix it if it breaks. I can enable/disable it with little or no network disruption, so if there is something you'd like me to test or examine, let me know.


heh. I totally forgot how I did the implementation. When I started it was a drop-in replacement for fq_codel! Thx for taking a look. sqm and tc would need mods to play with it harder. Glad to hear it didn't crash and to see it drop packets where needed. The principal measurement to start with was merely to see whether it consumed less cpu or not.


I'll have to think about how best to measure that on my network (netperf to load the network and some sort of mpstat or such). Given my relatively slow broadband and the fact that this is running on x86, my load usually looks like it does right now.
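
If mpstat is not installed, a rough equivalent can be built from two samples of the aggregate "cpu" line in /proc/stat, taken before and after a load test. This is only a sketch with a made-up function name: it sums the user through steal columns and treats idle plus iowait as idle time. On a router, much of the qdisc cost tends to land in the softirq column, so that column may be worth looking at on its own too.

```shell
#!/bin/sh
# Busy-CPU percentage between two "cpu ..." lines sampled from /proc/stat.
# Columns: $2 user, $3 nice, $4 system, $5 idle, $6 iowait, $7 irq,
# $8 softirq, $9 steal (guest columns are skipped; they are part of user).
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6
            if (NR == 1) { t0 = total; idle0 = idle }
            else {
                dt = total - t0
                di = idle - idle0
                if (dt > 0) printf "%.1f\n", 100 * (dt - di) / dt
                else print "0.0"
            }
        }'
}

# Usage: sample, run the load (e.g. a netperf test), sample again.
# before=$(grep '^cpu ' /proc/stat); sleep 10; after=$(grep '^cpu ' /proc/stat)
# cpu_busy_pct "$before" "$after"
```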

[1] $ uptime
 11:17:05 up  2:25,  1 user,  load average: 0.00, 0.00, 0.00

R7800 users who use their device as a GW/AP on 100+ Mbps up/down connections generally see significant load when using sqm, so that might be a better device to test on. I can fork the fq_codel_fast git repo, port it to openwrt ipq806x, and test that it loads on my r7500v2.

Beyond that, another brave openwrt user will have to try it/test it with WAN and sqm.

You've indicated your goal with this is to try it on ath10k-ct wifi, which might be something I'm able to test... but first things first, it needs to be built for openwrt.


and my other question was, what's the default qdisc for ubuntu 20.04 at line rate, without sqm?


well, that's an end, end goal. A shorter-term goal is figuring out, on any hardware, whether the cpu cost savings from the performance improvements I'd made by ripping out every unused feature would balance out against the benefit from gso-splitting.

another OT long term problem, I'm pretty convinced now that ack-filtering is a win, and that's only in cake...

but gotta partner up with a few people that have chops I don't to code and test. Ever since toke graduated I've been without a voice. I'm good at theory. Lousy at kernel programming.


I am under the impression there is a firmware offload for codel, at least, on this chip. I somehow doubt it's made it over to openwrt?


based on what I saw over the past day, it's fq_codel.


Yea! I was afraid they'd listened to certain parties and made sch_fq the default. I DO strongly recommend sch_fq if you have a heavy tcp-serving workload, and it is evolving steadily to make
udp serving apps like quic fly, but if you are using network namespaces, containers, vpns, vms, voip, videoconferencing, operating below 10Gbit, or routing packets in any way... fq_codel remains the best all-round thing as a default qdisc for linux. Still.

That debate was just brutal: https://github.com/systemd/systemd/issues/9725#issuecomment-413369212


If you are referring to the nss cores, openwrt does not officially support it (but I see you're aware of the forum thread about it...)
