I don't know if going through 800000 matters. I've seen it mentioned somewhere that you can't go from 384 straight to max; you have to ramp up through 800. It doesn't hurt anything to have it there.
My temps are about 1-2 °C higher with these performance settings. That's noise, really.
Why not just run performance governor at max freq all the time and not search for the perfect settings anymore? Heat is not a problem on the R7800 but do monitor it: cut -c1-2 /sys/devices/virtual/thermal/*/temp
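If you want the zone names next to the readings, here is a minimal sketch. It assumes the temp files report millidegrees Celsius (which is what the cut -c1-2 trick above relies on) and that the zones are named thermal_zone*. The function name show_temps is made up for this example.

```shell
#!/bin/sh
# Print each thermal zone with its temperature in whole degrees Celsius.
# Assumption: the "temp" files hold millidegrees Celsius.
# show_temps is a hypothetical helper name, not an OpenWrt command.
show_temps() {
    base=${1:-/sys/devices/virtual/thermal}
    for z in "$base"/thermal_zone*; do
        [ -r "$z/temp" ] || continue
        # Skip zones whose sensor cannot be read right now.
        t=$(cat "$z/temp" 2>/dev/null) || continue
        [ -n "$t" ] || continue
        echo "${z##*/}: $((t / 1000)) C"
    done
}

show_temps
```

Run it a few times under load, or wrap it in a loop with sleep, to see whether the governor choice changes anything.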
With an aggressive ondemand up_threshold value you will pretty much spend most time at max frequency anyway, bouncing between min and max frequencies.
I don't think many realize what up_threshold means: it is not the load at which the governor steps up to the next frequency, but the load at which it jumps all the way to max frequency. I tried it with min=800 MHz, max=1.7 GHz and up_threshold=20, and it just flip-flops between 800 MHz and 1.7 GHz; the frequencies in between (1.0 and 1.4 GHz) are never used.
up_threshold
If the estimated CPU load is above this value (in percent), the governor will set the frequency to the maximum value allowed for the policy. Otherwise, the selected frequency will be proportional to the estimated CPU load.
(useful to also read the explanation for sampling_down_factor)
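To make the quoted description concrete, here is a rough model of the decision in shell arithmetic. The proportional formula min + load * (max - min) / 100 is my reading of how ondemand computes the target, so treat it as an assumption rather than the exact kernel behaviour. The sketch also ignores the final rounding to a frequency the hardware actually offers and any clamping to scaling_min_freq. The function name ondemand_target is made up, 1700000 is a round stand-in for "1.7", and 384000 is the "384" mentioned earlier in the thread.

```shell
#!/bin/sh
# Simplified model of the ondemand decision described above.
# Arguments: load (percent), up_threshold (percent), min and max frequency (kHz).
# ondemand_target is a hypothetical name; the proportional formula is an assumption.
ondemand_target() {
    load=$1; up=$2; min_f=$3; max_f=$4
    if [ "$load" -gt "$up" ]; then
        # Load above the threshold: go straight to the maximum.
        echo "$max_f"
    else
        # Otherwise pick a frequency proportional to the load.
        echo $(( min_f + load * (max_f - min_f) / 100 ))
    fi
}

ondemand_target 30 20 800000 1700000   # prints 1700000
ondemand_target 50 75 800000 1700000   # prints 1250000
ondemand_target 20 20 384000 1700000   # prints 647200
```

With up_threshold=20 everything above 20% load lands on max, and the proportional branch only covers the bottom fifth of the range. If that range starts from the hardware minimum rather than scaling_min_freq (I have not verified this), the low-load targets fall below 800000, which would fit the flip-flop between 800 and 1.7.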
Can anyone tell me where to check to figure out why my R7800 restarts randomly? For the third time it has restarted at the most inopportune time (conference call, middle of game, etc.). I am running hnyman's build - trying different "tuning" that people post, but the same thing always happens. I have looked in /var/log and found nothing. I looked at the kernel and system logs in the GUI and found nothing. Where do I go to see what is causing these crashes? Anyone know?
Serial console output during the actual crash. To prevent flash wear, logs are written to RAM, so they do not survive a reboot.
P.S. One known/suspected reason for occasional crashes has been jumbo frames from other devices. There was apparently a bug in the kernel driver for fixed ethernet, but that should have been fixed before kernel 5.4.
Okay, so not really an easy way to figure it out...
I just updated from r13628-870588b6eb-20200625 to r13881-bae4204e34-20200718.
I'll see if by some miracle that fixes anything.
to the local startup commands as I had read that it fixes some issues with crashing.
It did greatly reduce the number of transitions the CPUs were doing.
I thought that had fixed it, but the random crashes still persist.
The next thing I might try is changing the governor to performance. This basically just runs at full clock speed, correct?
kernel: add patch that adds support for running threaded NAPI poll functions
This helps on workloads with CPU intensive poll functions (e.g. 802.11) on multicore systems
For some drivers (especially 802.11 drivers), doing a lot of work in the NAPI poll function does not perform well. Since NAPI poll is bound to the CPU it was scheduled from, we can easily end up with a few very busy CPUs spending most of their time in softirq/ksoftirqd and some idle ones.
Introduce threaded NAPI for such drivers based on a workqueue. The API is the same except for using netif_threaded_napi_add instead of netif_napi_add.
In my tests with mt76 on MT7621 using threaded NAPI + a thread for tx scheduling improves LAN->WLAN bridging throughput by 10-50%. Throughput without threaded NAPI is wildly inconsistent, depending on the CPU that runs the tx scheduling thread.
With threaded NAPI it seems stable and consistent (and higher than the best results I got without it).
I suspect this patch might just make the firmware crash faster if there are any undiagnosed race conditions hanging around. I'd love to see if it makes a difference with the R7800 though.
Well so far 9.5 days on the above and no crashes.
Whatever was going on causing the crashes seems to be gone now.
I also found this and put it in my local startup - that might have helped... not sure:
# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.
# min scaling frequency: set to 800MHz because of L2 cache issues
echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_min_freq
sleep 1
# ondemand governor
echo ondemand > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo ondemand > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
echo 100000 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate
echo 75 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
# schedutil governor
#echo schedutil > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
#echo schedutil > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
#echo 100000 > /sys/devices/system/cpu/cpufreq/schedutil/rate_limit_us
# performance governor
#echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
#echo performance > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
# Reset stats
echo 1 > /sys/devices/system/cpu/cpufreq/policy0/stats/reset
echo 1 > /sys/devices/system/cpu/cpufreq/policy1/stats/reset
exit 0
I believe ondemand was the default anyway, but when I checked, the sampling rate was much, much lower (if I remember correctly), so maybe these tweaks helped too? If anyone thinks one of these settings is wrong or could be improved, I'm always happy to hear it!
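For anyone wanting to check what is actually in effect after boot, a small sketch that dumps the governor and limits per policy, plus the ondemand tunables if that directory exists. The function name show_cpufreq is made up. The paths are the same ones the startup script writes to, plus scaling_max_freq and scaling_cur_freq, which I am assuming are present on this build.

```shell
#!/bin/sh
# Print the active cpufreq settings for each policy and the ondemand tunables.
# show_cpufreq is a hypothetical helper name; pass a different base directory to test it.
show_cpufreq() {
    base=${1:-/sys/devices/system/cpu/cpufreq}
    for p in "$base"/policy*; do
        [ -d "$p" ] || continue
        gov=$(cat "$p/scaling_governor" 2>/dev/null)
        min=$(cat "$p/scaling_min_freq" 2>/dev/null)
        max=$(cat "$p/scaling_max_freq" 2>/dev/null)
        cur=$(cat "$p/scaling_cur_freq" 2>/dev/null)
        echo "${p##*/}: governor=$gov min=$min max=$max cur=$cur"
    done
    # The ondemand directory only exists while that governor is in use.
    for f in "$base"/ondemand/*; do
        [ -f "$f" ] || continue
        echo "ondemand ${f##*/}=$(cat "$f" 2>/dev/null)"
    done
}

show_cpufreq
```

Running it before and after the startup script makes it easy to compare against the defaults.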
Thanks as always for the community support!
DeadEnd
One thing to consider: using 800 MHz as your minimum frequency, you’ll probably never see a frequency increase with an up_threshold of 75. This will hurt performance at max 5 GHz wifi speeds or as you approach gigabit wired speeds.
Most people will use a more aggressive up_threshold with 800 MHz as their minimum. 20-35 is pretty popular. Personally I use the more aggressive end (90%+ of the time the router is at 800 MHz, and only a fraction of the time it ramps up to max frequency). This is what my startup settings are:
I posted those settings so I can comment on them based on my use case and observations
up_threshold 75 works quite well if you want to use pretty much all frequencies between 800 MHz and 1.7 GHz and have a gradual ramp up. I use mine in AP mode with most services turned off, and it does get to 1.7 GHz when laptops back up over wifi.
If you go aggressive (20-35), you will pretty much just jump from 800 MHz straight to 1.7 GHz. You don't use 1.0 or 1.4 GHz, because up_threshold is the load at which to go all the way up to max, not to the next frequency. At least in my case, when I tried aggressive settings I never saw 1.4 GHz being used. Not that it really matters, because ...
I use the performance governor, locked at 1.7 GHz. I don't see much benefit in transitioning between frequencies a lot. This is not a phone, you're not trying to save battery.
FWIW, I noticed that schedutil is not aggressive at all: with an 800 MHz min freq, I rarely see it move up to higher frequencies.
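To see which frequencies a given setting really uses, the stats directory next to the reset file from the startup script should also have a time_in_state file, with one "frequency time" pair per line. A minimal sketch that turns it into percentages follows. The function name freq_residency is made up, and I am assuming the file is present on your build.

```shell
#!/bin/sh
# Show the share of time spent at each frequency, from a time_in_state file.
# Each input line is "<frequency in kHz> <time spent at that frequency>".
# freq_residency is a hypothetical helper name.
freq_residency() {
    awk '{ f[NR] = $1; t[NR] = $2; total += $2 }
         END {
             if (total > 0)
                 for (i = 1; i <= NR; i++)
                     printf "%s kHz %.1f%%\n", f[i], 100 * t[i] / total
         }' "$1"
}

for p in /sys/devices/system/cpu/cpufreq/policy*/stats/time_in_state; do
    [ -r "$p" ] || continue
    echo "$p"
    freq_residency "$p"
done
```

Reset the stats, let the router run under normal use for a while, then run this. If 1.0 and 1.4 show close to 0%, the governor is just flipping between min and max.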