Rpi4 < $(community_build)

That's fair, your solution of using user submitted scripts as a sort of 'selectable profile' is a very pragmatic approach that anyone can learn to adjust.

I'm working my way through your script now. I appreciate that you commented it, but it's pretty hard going for a newb like me; I'm getting there though. I saw there's a bunch of commented-out code in perftweaks that would implement CPU affinity, and now I'm wondering why you disabled it?

Is there an overview guide of how your changes hang together? E.g where your scripts start in the boot process and how they are strung together. If that's a big ask please ignore.

The performance of your build is huge compared to the default snapshot builds I started with, I'm trying to understand what changes you've made purely from a performance standpoint. Is that all in perftweaks?


thanks for the questions...

pretty much as above... spent around a month of my life on this... and in the end there;

  • were not enough skilled testers to provide the level of feedback / testing / future improvements
  • the support burden was massive, lots of questions about why ABC is not doing XYZ ( with little background information given )
  • as above, performance and performance optimization testing is a fickle thing... give 10 cooks a lasagne recipe and you will get 10 different things...

so I got a bit burnt out with it and said to myself: 'turn it all off except some reliable general tweaks, and allow the users to make their own changes'

it's not absolute, and I do hope to return one day to my better scripts, but for the medium term separate submitted user scripts are the most manageable way to offer best performance for everyone without a 10 fold factor in questions and problems compared to solutions and feedback offered...


it is all very basic... and i'll even fixup anything fancy if you want to submit/suggest something without the fancy variables and hooks... all tweaks boil down to around 10 simple commands...

almost everything in my build gets called out of rc.local > rc.custom > elsewhere

you can find almost everything with fgrep

fgrep -r rpi-perftweaks /etc/custom

there is one KEY element... this script is called in the background after a sleep of around 200 seconds... this is to allow for all services to startup and settle, otherwise RENICE and TASKSET may have no process to work on...
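
A minimal sketch of that delayed background start, for anyone following along. The helper name `run_delayed` and the script path in the usage comment are assumptions for illustration, not the build's actual code; only the "sleep around 200 seconds, then run in the background" idea comes from the post above.

```sh
#!/bin/sh
# Hypothetical helper: run a command in the background after a delay,
# so boot is not blocked while the services start up and settle.
run_delayed() {
	delay="$1"; shift
	( sleep "$delay"; "$@" ) &
}

# Assumed usage from rc.custom (the path is a placeholder):
# run_delayed 200 /etc/custom/rpi-perftweaks.sh
```

The subshell plus `&` means the caller returns straight away, and renice / taskset only run once the delay has passed and the target processes should exist.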

and because you went through the hassle of providing some quality observations... I dug up an old reference sample of the user configurable tweaks script for you, if interested

note: this does not discover eth0 irq numbers like the current one, so it likely won't work without fixes... useful for the servicecpuadjust() (taskset) you were asking about...


i don't think its really that major (but admittedly I have not really compared for a looong time)... has been discussed before in this thread... will link it here when I have some time...

essentially;

  • some teency config.txt changes or whatever
  • some sysctl's (but I don't think they really do much or at least I don't have a great understanding of what they really do... just seemed sensible / worth chucking in)
  • 2-5 days looking at the governor and limiting the threshold at which it throttles back cpu power... it's very aggressive on these devices... the min frequency (scaling_min_freq) is set to maybe half of the frequency range as part of this
  • perftweaks (cpu affinity, process affinity, process renice, packet steering) all on or off or with minor variances depending on how far you go back / which build revision you look at
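
One possible reading of "half of the frequency range", sketched as shell. This is an assumption about the intent, not the build's actual logic; `mid_freq` is a hypothetical helper, and the sysfs files in the usage comment are the standard cpufreq ones.

```sh
#!/bin/sh
# Hypothetical helper: midpoint of a frequency range, values in kHz.
mid_freq() {
	echo $(( ($1 + $2) / 2 ))
}

# Assumed usage on the router:
# p=/sys/devices/system/cpu/cpufreq/policy0
# mid_freq "$(cat $p/cpuinfo_min_freq)" "$(cat $p/cpuinfo_max_freq)" > $p/scaling_min_freq
```

For example, `mid_freq 600000 1500000` prints 1050000.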

Thanks Wulfy, that's an excellent breakdown and allows me to dig in further. I've already started editing a forked copy of your perftweaks script as suggested and dumbed down your comments a little so that I can understand them :slight_smile:

I fully appreciate how this would burn you out, 1800 comments in this thread alone!

And don't play down your hard work, the latest official build ran like cr@p for me and there's practically no optimisation guidance to be found.

I was about to give up and build an x86 OPNsense box when I stumbled across your work - it took me 10 minutes to get it running with gigabit throughput, after I wasted half a day trying and failing on the latest official build!


Lazy guide to achieving 1Gbps throughput on this amazing community build.

This is basically a note to my future self when I forget everything :slight_smile:

  • Use a Realtek USB3 adaptor for WAN (way lower CPU usage than other chipsets I tested)
  • Disable packet steering
  • Uninstall nlbwmon (it eats CPU time)
  • Manually set affinity for eth0 to its own CPU core (or maybe try enabling IRQsteering - YMMV)
  • A 2GB Rpi4 is more than enough, extra memory won't improve anything.
  • QoS was unnecessary for me, I'm getting A+ on all of the buffer bloat tests.
  • Overclocking is entirely unnecessary with nlbwmon disabled.
  • I've enabled DoH, Adblock (with XXL lists) and Wireguard server - no issues

That's literally all I needed to do in order to get line speed with around 40% utilisation across all 4 cores! Impressive work @anon50098793 - you made this easy.


interesting findings... great advice: over ~650Mb/s, just trash nlbwmon &&|| luci statistics... the bursting messes up cpu utilisation at those levels, as you found...

great you don't need SQM... and the packet steering is sort of tied to that AFAIK... so users of SQM (over around 550Mb/s) would be advised to definitely use packet steering

TASKSET="$(command -v taskset-aarch64)"

# pin the monitoring daemons to core 3 and the web ui to core 2
# (-a: all threads of the pid, -p: existing pid, -c: cpu list)
for thispid in $(pidof nlbwmon); do
	$TASKSET -apc 3 $thispid >/dev/null 2>&1
done
for thispid in $(pidof collectd); do
	$TASKSET -apc 3 $thispid >/dev/null 2>&1
done
for thispid in $(pidof uhttpd); do
	$TASKSET -apc 2 $thispid >/dev/null 2>&1
done

# print the irq numbers whose /proc/interrupts line mentions $1
findRUPT() {
	fgrep "${1}" /proc/interrupts | sed 's|^ *||' | cut -d':' -f1 | \
		tr -s '\n' ' '
}

eth0INTs="$(findRUPT eth0)"
tRU=
if [ -n "$eth0INTs" ]; then
	# smp_affinity takes a hex cpu bitmask: 1 = core 0, 2 = core 1
	for tRU in $eth0INTs; do
		coreSET=${coreSET:-1}
		echo -n ${coreSET} > /proc/irq/$tRU/smp_affinity
		coreSET=$((coreSET + 1))
	done
fi

# xps_cpus / rps_cpus are hex cpu bitmasks as well
#would be good if you can test all 'c' and all 'f' and all '0' here also without SQM (can test with also but mostly interested without)
echo -n 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo -n 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo -n 4 > /sys/class/net/eth0/queues/tx-3/xps_cpus
echo -n 2 > /sys/class/net/eth0/queues/tx-4/xps_cpus
echo -n 7 > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo -n 7 > /sys/class/net/eth1/queues/rx-0/rps_cpus

# raise the minimum cpu frequency and make ondemand ramp up earlier
# and hold the higher frequency for longer
echo -n "1100000" > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo -n 21 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold && sleep 2
echo -n 5 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

I tested your script and it worked as expected as regards setting affinity, much cleaner than my hardcoding eth0 to core 1.
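
For anyone wanting to check that the affinity really was applied, a rough sketch follows. `show_irq_affinity` is a hypothetical helper, not part of the build; the two optional arguments exist only so it can be tried against sample files, and on the router the defaults (/proc/interrupts and /proc/irq) apply.

```sh
#!/bin/sh
# Hypothetical check: print the smp_affinity mask of every irq whose
# /proc/interrupts line mentions the given interface.
show_irq_affinity() {
	iface="$1"
	ints="${2:-/proc/interrupts}"
	irqdir="${3:-/proc/irq}"
	grep -F "$iface" "$ints" | cut -d: -f1 | tr -d ' ' | while read -r irq; do
		echo "irq $irq mask $(cat "$irqdir/$irq/smp_affinity")"
	done
}

# Assumed usage on the router:
# show_irq_affinity eth0
```

Each mask printed is a hex cpu bitmask, so 1 means core 0 and 2 means core 1.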

I also tried setting the queues with c / f / 0s - all achieved 1Gbps except all 0s - which dropped throughput down to ~600Mbps.
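
For reference, the values written to xps_cpus and rps_cpus are hex cpu bitmasks, which is what 'c', 'f' and '0' stand for here. A small sketch to decode them (the helper name `mask_to_cpus` is made up for illustration):

```sh
#!/bin/sh
# Hypothetical helper: list which cpu cores a hex mask selects.
mask_to_cpus() {
	mask=$((0x$1))
	cpu=0
	out=""
	while [ "$mask" -gt 0 ]; do
		if [ $((mask & 1)) -eq 1 ]; then
			out="${out:+$out }$cpu"
		fi
		mask=$((mask >> 1))
		cpu=$((cpu + 1))
	done
	echo "${out:-none}"
}

mask_to_cpus c   # prints: 2 3
mask_to_cpus f   # prints: 0 1 2 3
mask_to_cpus 0   # prints: none
```

So 'c' allows cores 2 and 3, 'f' allows all four cores, and '0' selects no cores at all.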

I re-installed and enabled nlbwmon and ran all of the tests again with the same results (there's a margin of error of around 3%, which I'm attributing to the general upstream test servers).

So fixing nlbwmon to core 3 was all that was required to re-enable it with no loss of performance at 1Gbps speeds! That's a nice result.


thankyou very much for your tests...

in honour of your efforts I will re-introduce/introduce a '~1Gbs(tba)' parameter that can be easily set in future builds

AND put the perftweaks script on github should anyone wish to make PR

PERFTWEAKS_Gbs=1

Glad to make a tiny contribution.

Looking forward to testing out the next build!

Perftweaks="Inhonourofsubzero" has a nice ring to it :joy:


@swanson r18436 ( master @ 'stable' || 'current' ) now contains a fix for the onboard wifi

for this reason I pushed it down to stable, if anyone else does not need the onboard wifi then no real reason to update (from r18370)...

suppose I may as well generate a newer 'release'/21.02.1 build... if anyone is on 21.02.1_1.0.10-x same thing... if you don't need the onboard wifi then no real reason to update also...


I can't access LuCI, and when I logged in over ssh I got this log:

Wed Dec 29 16:53:44 2021 mmc0: Got data interrupt 0x00000002 even though no data operation was in progress.
Wed Dec 29 16:53:44 2021 mmc0: Got data interrupt 0x00000002 even though no data operation was in progress.
Wed Dec 29 16:53:44 2021 mmc0: Got data interrupt 0x00000002 even though no data operation was in progress.

Then after a reboot I can't access ssh or LuCI either.
This is with version rpi-4_21.02.1_1.0.10-3_r16325_extra_release_update


bugger... i'd suggest possibly using a new/different sdcard with a factory image (if you have a backup from the updatecheck bar you could restore that ... or from a linux pc you may be able to copy all the files from /etc/config/ from one sdcard to the other)
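
A rough sketch of that copy on a linux pc, assuming the root partitions of both sdcards are already mounted. The mount points in the usage comment and the helper name `copy_config` are placeholders, not anything from the build.

```sh
#!/bin/sh
# Hypothetical helper: copy the /etc/config directory from the root
# filesystem of the old card to the root filesystem of the new card.
# Both arguments are mount points on the linux pc.
copy_config() {
	old_root="$1"
	new_root="$2"
	[ -d "$old_root/etc/config" ] || {
		echo "no etc/config under $old_root" >&2
		return 1
	}
	mkdir -p "$new_root/etc/config"
	# the trailing /. copies the directory contents, hidden files included
	cp -a "$old_root/etc/config/." "$new_root/etc/config/"
}

# Assumed usage (run as root, placeholder mount points):
# copy_config /mnt/oldcard /mnt/newcard
```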

while there is a thing or two that comes to mind recently that may be related... based on this until someone else reports the same messages... we are best to assume it's an isolated disk issue

avoiding for the future could be stuff like;

  • better sdcard
  • cooler case
  • avoiding power loss
  • double checking nothing heavy is writing to mmc
  • maybe using a powered usb-ssd-dock for the OS instead of mmc

or could just be one of those random things there is no control over...
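
On the "nothing heavy is writing to mmc" point, one way to get a rough idea is to sample the sectors-written counter in /proc/diskstats twice. A minimal sketch; `sectors_written` is a hypothetical helper, `mmcblk0` is assumed to be the sdcard's device name, and the optional second argument is only there so the function can be tried against a sample file.

```sh
#!/bin/sh
# Hypothetical helper: print the sectors-written counter for a block
# device, which is the 10th field of its line in /proc/diskstats.
sectors_written() {
	dev="$1"
	stats="${2:-/proc/diskstats}"
	awk -v d="$dev" '$3 == d { print $10 }' "$stats"
}

# Assumed usage on the router: sample twice and compare
# a=$(sectors_written mmcblk0); sleep 60; b=$(sectors_written mmcblk0)
# echo "sectors written in 60s: $((b - a))"
```

A counter that keeps climbing quickly while the router is idle would suggest some package is logging or caching to the card.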


Thank you for the advice. Sadly I don't run Linux on any of my devices.

This sdcard is an old one from two years ago; I used to reflash it frequently before I started using your build.


any usbstick can also be used if you have a free port and no spare mmc card


Woahh that's good. Is it just plugged into usb 3?
Found a 64GB sdcard in my bag lol


yup, either should work

(the only catch is depending on the boot order it will attempt the mmc(first) by default... so long as there is nothing in there or the mmc has no OS it should then try USB i think... also adds a good 10 seconds before it boots too)

for microsd's (if anyone is getting replacements) i'd recommend either of these examples (price is AUD so multiply by 0.6 or something for USD) and these sizes are around the current sweet spot... for sdcards be very wary of ebay and stuff... worth paying double from a good store if there is no other option...

Samsung 64GB evo plus ~ approx 7$US?
Samsung 32GB PRO Endurance ~ approx 12$US?

very cheap now! think i paid at least double that a year ago...

(I hear pretty good things about the sandisk equivalents extreme class10? but too many counterfeit sandisks going around for me personally to trust getting one online - sorry sandisk)


just flashed, and I can't load the network section; it fails with this error.

error:
firewall.getZoneColorStyle is not a function

is it an issue with the argon theme? I tried other themes and they are normal


yeah... another user reported this and switched to bootstrap... i've not been able to see it yet...

could be related to how fast you open luci on firstboot maybe...(then again... the words in the error point more towards some sort of style change i've not updated for or something)

i'll keep trying to reproduce it especially since it's been reported twice, thankyou!


you can also try this before the next upgrade;

ARGONVER="2.2.9"

to get the newer argon code... but that has some style issues I need to fix also... but may be of interest / worth a try...

manual replace from raw package without upgrade would be something like;

opkg remove luci-theme-argon
opkg install /etc/custom/luci/themes/argon/2.2.9/*.ipk
stable uptodate: 3.5.331-7  twicedaily[refresh]  [backup]  [tty]

upgraded successfully.


One more thing I'd like to report:
since the last two upgrades the IPTV stream no longer stops after a few minutes; it is working fine.


Can confirm @SubZero's settings for a 1000/50 connection. I always figured I needed packet steering, but the network is snappier without it. I run IRQBalance instead of manually setting affinity, mostly because I haven’t worked out how to set it manually or get it working. I do however slightly overclock to 1800MHz. Maybe it was just the speed tests, but I was maxing out at 870Mbps and now can get 920Mbps out of it. Irrelevant really, because my wireless doesn’t get me more than 700, but we like to tinker!

Anyway, well done.