Rpi4 < $(community_build)

God, man. You're too fast!

Okay, I'll check whether irqbalance makes any difference. As for rpi-perftweaks.sh, I'll take a look, but it's been a while since I dabbled in hardcore bash (time to Google I guess :sweat_smile:).

As per the other point, hard agree. Let me see if I can find a spare USB and I'll give it a whirl.

Thanks again for everything!

meh... feel free to discuss on this thread... most of the tweaks can pretty much be reduced to two or three simple commands...

just knowing which ones is the pita...

Heheh, can relate.

Then I'll get on reading and report back if I find anything. :blush:

This strikes me as a little off... if sqm was enabled during this test... it's likely not tuned so well...

In addition to the above comments...

  • treat all tests separately... (sort out core bottlenecks without sqm first) then just try to sort out sqm as directly as possible (client directly wired to the pi or as best you can)

mixing the two of these things will make isolating stuff very difficult...

Now that you mention it, I tried running again from all previous devices and only the AP reported speeds lower than WAN. I find it quite likely SQM was being the bottleneck then.

PC:

   Speedtest by Ookla

     Server: INFINITUM
        ISP: Telmex
    Latency:     2.09 ms   (3.53 ms jitter)
   Download:   754.75 Mbps (data used: 421.3 MB)
     Upload:   204.47 Mbps (data used: 201.2 MB)
Packet Loss:     0.0%

Router:

Retrieving speedtest.net configuration...
Testing from Telmex (REDACTED)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by INFINITUM (REDACTED) [5.88 km]: 25.44 ms
Testing download speed................................................................................
Download: 752.06 Mbit/s
Testing upload speed......................................................................................................
Upload: 175.55 Mbit/s

AP:

     Server: Coeficiente Comunicaciones
        ISP: Telmex
    Latency:    67.25 ms   (0.13 ms jitter)
   Download:   589.14 Mbps (data used: 693.0 MB)
     Upload:   201.05 Mbps (data used: 245.4 MB)
Packet Loss:     0.0%

Given it's through scripts, how should I go about tuning, aside from setting speeds a smidge (5% - 10%) lower than average?
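For the "a smidge lower" part, the arithmetic can be sketched like this. SQM takes its rates in kbit/s; sqm_rate is just a made-up helper name, and the percentage is a starting guess rather than a recommendation:

```shell
#!/bin/sh
# sqm_rate MBPS PERCENT
# Prints a shaper rate in kbit/s that is PERCENT below the measured speed.
sqm_rate() {
    awk -v mbps="$1" -v pct="$2" \
        'BEGIN { printf "%d\n", mbps * 1000 * (100 - pct) / 100 }'
}

# Measured PC download from the speedtest above, 5% lower
sqm_rate 754.75 5    # prints 717012
# Measured PC upload, 5% lower
sqm_rate 204.47 5
```

Ideally the input would be the average of several runs at different times of day rather than a single measurement.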

strip everything back to the bare essentials then slowly change one thing at a time...

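cat <<'TTT' > /bin/toaster.sh
#!/bin/sh
# ETH0INTERRUPTNUM2 is a placeholder: use the eth0 IRQ number from /proc/interrupts
# the value is a hex cpu bitmask, so 2 means CPU1
echo -n 2 > /proc/irq/ETH0INTERRUPTNUM2/smp_affinity
TTT
chmod +x /bin/toaster.sh
# point the build at the custom script instead of the default tweaks
echo 'PERFTWEAKS_SCRIPT="/bin/toaster.sh"' >> /root/wrt.ini
/etc/init.d/irqbalance disable
#reboot and wait 3 mins then test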

the reboot is only needed to undo the previous settings... if you copy what the defaults are you can just run the script manually and reset each default each time...
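A rough sketch of the "copy the defaults" idea, so each test can start from a known state without a reboot. The helper names and the /tmp/defaults directory are made up for illustration, and IRQ 38 is only an example number:

```shell
#!/bin/sh
# save_setting FILE BACKUP_DIR
# Records the current value of a /proc or /sys knob, once per file.
save_setting() {
    mkdir -p "$2"
    # Skip files that were already recorded so the true default is kept
    grep -qF -- "$1 " "$2/defaults.txt" 2>/dev/null && return 0
    printf '%s %s\n' "$1" "$(cat "$1")" >> "$2/defaults.txt"
}

# restore_settings BACKUP_DIR
# Writes every recorded value back to the file it came from.
restore_settings() {
    while read -r path value; do
        printf '%s' "$value" > "$path"
    done < "$1/defaults.txt"
}

# Example (as root, before changing anything):
#   save_setting /proc/irq/38/smp_affinity /tmp/defaults
#   save_setting /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq /tmp/defaults
# ...run a test with new values, then:
#   restore_settings /tmp/defaults
```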

for testing...

  • watch cat /proc/interrupts
  • something like this htoprc -RC htoprc-cpu-w-kernel-sortcpu
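To fill in the interrupt number for the script above, something along these lines should list the IRQs that belong to eth0 and their current affinity masks. irqs_for is a made-up helper, and the optional second argument exists only so it can be tried on a saved copy of /proc/interrupts:

```shell
#!/bin/sh
# irqs_for IFACE [FILE]
# Prints the IRQ numbers whose /proc/interrupts line mentions IFACE.
irqs_for() {
    awk -v dev="$1" 'index($0, dev) { sub(/:$/, "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Show the current affinity mask of every eth0 interrupt
if [ -r /proc/interrupts ]; then
    for irq in $(irqs_for eth0); do
        echo "IRQ $irq affinity mask: $(cat "/proc/irq/$irq/smp_affinity" 2>/dev/null)"
    done
fi
```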

Okay. I'll give it a try then.

Should I leave anything in particular for a Queue Setup Script? Or just leave SQM disabled for the moment?

yeah... best to disable sqm for now...

Fair enough then. I'll see how the default tweak works for now and add or remove from the original file to test.

Thanks again wulfy!

might be a pain in the behindo... but (lol) next thing i'd be doing after moving one of the eth0 interrupts (if needed) is;

  • swapping eth0(make it lan) and eth1(try different nics)
  • irqbalance ( put exit 0 at the top of your script no need to manually move the interrupt if irqbalance is enabled )
  • upping the base frequency on the governor echo -n "1000000" > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq

i'd be pretty surprised if all combinations of the above don't get you up to 900Mb...
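A small sketch of the governor tweak that also reads the value back, assuming the cpufreq layout from the path above. set_min_freq is a made-up name; the value is in kHz, so 1000000 is 1 GHz:

```shell
#!/bin/sh
# set_min_freq KHZ [CPUFREQ_DIR]
# Raises scaling_min_freq on every cpufreq policy and prints the result.
set_min_freq() {
    dir="${2:-/sys/devices/system/cpu/cpufreq}"
    for policy in "$dir"/policy*; do
        # Skip anything that is missing or not writable (e.g. not root)
        [ -w "$policy/scaling_min_freq" ] || continue
        printf '%s' "$1" > "$policy/scaling_min_freq"
        echo "$(basename "$policy"): min freq now $(cat "$policy/scaling_min_freq") kHz"
    done
}

# Example (as root):
#   set_min_freq 1000000
```

Reading the value back matters because the kernel may clamp a request that falls outside the allowed range.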

kmod-usb-net-rtl8150

nice... one more driver for the build... thanks!

Okay, so far it seems I've made some progress.

Setting both min and max frequency to 2000000 and then performing a speedtest leads to an increase in speed of about 20Mbps with irqbalance enabled, and no difference with it disabled.

Also, htop reports both dnsmasq and ksoftirqd/1 hitting core 0 hard, and the latter hitting core 1 as well. As you say, this might have more to do with interrupt handling than anything.

I'll read up a little through RHEL's materials for IRQ, first time dealing with it actually, so if you have any pointers I'll be more than thankful.

Otherwise, I'll keep you posted. Thanks!

not really... but running top then pressing c will give you a nice per cpu network interrupt percentage... (while running your speedtest)

CPU0:  0.0% usr  0.0% sys  0.0% nic  100% idle  0.0% io  0.0% irq  0.0% sirq
CPU1:  0.0% usr  0.0% sys  0.0% nic 99.7% idle  0.0% io  0.0% irq  0.2% sirq
CPU2:  0.0% usr  0.0% sys  1.2% nic 98.7% idle  0.0% io  0.0% irq  0.0% sirq
CPU3:  0.0% usr  0.1% sys  0.1% nic 99.6% idle  0.0% io  0.0% irq  0.0% sirq
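If it's easier to log than to watch, roughly the same sirq figure can be derived from two /proc/stat snapshots taken while the speedtest runs. This is only a sketch; sirq_pct and the snapshot file names are made up:

```shell
#!/bin/sh
# sirq_pct BEFORE AFTER
# Prints, per CPU, the share of time spent in softirq between two
# /proc/stat snapshots. Columns after the cpuN label are:
# user nice system idle iowait irq softirq steal (guest time is part of user)
sirq_pct() {
    awk '
        /^cpu[0-9]/ {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (NR == FNR) { t0[$1] = total; s0[$1] = $8; next }
            dt = total - t0[$1]
            printf "%s %.1f%% sirq\n", $1, (dt > 0 ? 100 * ($8 - s0[$1]) / dt : 0)
        }' "$1" "$2"
}

# Example, run while the speedtest is in progress:
#   cat /proc/stat > /tmp/stat.a; sleep 10; cat /proc/stat > /tmp/stat.b
#   sirq_pct /tmp/stat.a /tmp/stat.b
```

A core sitting near 100% sirq while the others idle would point at interrupt placement rather than raw CPU speed.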

it's probably wise to swap that nic and/or swap wan<>lan before doing too much more... or just do some general web searching about that nic and what other people say about the linux driver...

That's plenty already!
Looks like this will be a multithreading exercise then.

I'll give it a shot. Thank you!

sorry, I missed you were asking specifically re:sqm...

i'm probably not the best person to ask... I don't even use overheads... or any other fancy settings... just tweak the speeds down until it works (-2 to 5%)... then make sure that value is lower than mean actual obtainable bandwidth over time...

that said... on really high speeds you might experiment with trying 'ack-filter'... ( a separate thread would be good for your tuning once you get the baseline bandwidth to a predictable value)...

I think I may have also read that fq_codel (or no sqm), not cake, is preferred for super high bandwidth connections...

(that said... the low(er) results from the AP are likely application load on the AP itself or some constraint thereafter)

I personally wouldn't be too concerned with such a value if it's consistent... and nothing (any cpu core) is maxed out... ( but I suspect we are not there yet )

Sir, I want to keep the settings in the /etc/sysctl.d folder and run certain commands on first boot after an upgrade/update.

you can create a regular file there and add it to sysupgrade.conf as discussed i.e.;

echo 'net.ipv6.route.max_size = 24576' > /etc/sysctl.d/19-mysysctl
echo '/etc/sysctl.d/19-mysysctl' >> /etc/sysupgrade.conf

if they are truly only for firstboot... just create your script in /etc/custom/firstboot/999-something and add it to /etc/sysupgrade.conf

if for everyboot do the same in /etc/custom/everyboot/999-myscript and add to /etc/sysupgrade.conf
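Since every one of these ends with appending a path to /etc/sysupgrade.conf, a tiny wrapper can keep repeated runs from piling up duplicate lines. keep_on_upgrade is a made-up helper, not something that ships with the build:

```shell
#!/bin/sh
# keep_on_upgrade PATH [CONF]
# Adds PATH to the sysupgrade keep list unless that exact line is already there.
keep_on_upgrade() {
    conf="${2:-/etc/sysupgrade.conf}"
    grep -qxF -- "$1" "$conf" 2>/dev/null || echo "$1" >> "$conf"
}

# Example:
#   keep_on_upgrade /etc/sysctl.d/19-mysysctl
#   keep_on_upgrade /etc/custom/firstboot/999-something
```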

all depends what the commands do really... 999 means late... after all the other stuff...


for even more complicated stuff (more packages?) you could also use your own imagebuilder...

http://rpi4.wulfy23.info/ib/

Thank you for the explanation. I need a dependency package that isn't available yet in OpenWrt, so I need to make sure the services run well after upgrading.

in this case... it may be better if you track "release" aka 21.02 and not the other builds...

it will mean you have to update way less (you'll be offered far fewer updates anyway)...

but some packages may not be compatible / available on 21.02... (most should be ok)

Hey there wulfy! Status update.

It seems enabling IRQBalance and setting affinity to cores 2 and 3 worked correctly for normal operation (commands below)

echo -n 3 > /proc/irq/38/smp_affinity
echo -n 2 > /proc/irq/39/smp_affinity
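For reference, smp_affinity takes a hexadecimal CPU bitmask (bit 0 is CPU0), so the value written is not a core number. A small sketch for building the mask from a list of cores counted from zero; cpu_mask is a made-up helper name:

```shell
#!/bin/sh
# cpu_mask CPU...
# Prints the hex smp_affinity mask with one bit set per listed CPU.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 1      # prints 2
cpu_mask 0 1    # prints 3
cpu_mask 2 3    # prints c
```

By that rule, the 3 and 2 written above select CPU0+CPU1 and CPU1 respectively; cores 2 and 3 (counting from zero) would be mask c.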

And set the governor to performance using the following command

echo -n performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

All bundled up in /bin/toaster.sh as my custom perftweaks.sh you helped me create yesterday.

I also added the following to /boot/config.txt to overclock to 2.0GHz

# Overclocking to 2Ghz
over_voltage=6
arm_freq=2000

After all that, it seems I can reliably run a speedtest from any device in the network without maxing out any CPU core, topping out around 87% on core 1 on average, as long as SQM is disabled. With SQM enabled, using cake as the qdisc and simple.qos as the Queue Setup Script, the core maxes out at 100%; speed isn't reduced, but latency only drops a little.

Given that SQM aside the configuration seems stable, should I aim for further fine-tuning, or look somewhere else along the chain?

Thanks for everything!

good news... sounds like you are on the limit / have enough wiggle room to tune things to run cake if you really want to...

what i've done in this case... is to ensure any services that may use cpu are pinned to core 4(or 3 if you count from zero)... you can see examples with taskset maybe in rpi-perftweaks.sh if not... ask and I may be able to advise... but these are just for a 'little' wiggle room on a single core and can backfire...

i.e.

taskset-aarch64 -apc 2,3 $thispid

( good candidates are collectd, nlbwmon, uhttpd etc. )
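A minimal sketch of applying that to several services at once, assuming pidof is available and the taskset binary is named as in the line above. pin_services and the TASKSET override variable are made up for illustration:

```shell
#!/bin/sh
# pin_services CPULIST NAME...
# Pins every running process (and all its threads) of each named
# service to the given CPU list.
pin_services() {
    cpus="$1"
    shift
    for name in "$@"; do
        # pidof prints nothing when the service is not running
        for pid in $(pidof "$name"); do
            "${TASKSET:-taskset-aarch64}" -apc "$cpus" "$pid"
        done
    done
}

# Example (as root):
#   pin_services 2,3 collectd nlbwmon uhttpd
```

The pinning does not survive a service restart, so it would need to run again from the perftweaks script or similar.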

other than that... i'd create that new thread regarding sqm for more professional advice re: tuning cake/fq_codel/other for on-the-limit-cpu/high bandwidth scenarios...
