R7800 cache scaling issue

in theory with my patchset you should be able to run with the ondemand governor and without changing the scaling_max_freq at all. btw the max freq is 1725000, not 1750000. i'd be interested in seeing your results with the ondemand governor.

I don't change scaling_min_freq either, but at the moment i'm not sure whether the QSDK comment about Krait L2/idle frequencies is a true hardware restriction or a workaround for the L2 scaling breaking. so you may want to leave the scaling_min_freq at 800MHz (or even 600MHz, which is the next bin down).
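For anyone who wants to try the settings discussed above, here is a minimal sketch. It assumes the standard cpufreq sysfs layout (policy directories under /sys/devices/system/cpu/cpufreq) and values in kHz. The function name set_cpufreq and the CPUFREQ_ROOT variable are made up for illustration; they are not part of the patchset.

```shell
#!/bin/sh
# Sketch only: set the governor and the minimum frequency on every cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override so the function can be tried on a scratch directory.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

set_cpufreq() {
    governor="$1"   # e.g. ondemand
    min_khz="$2"    # e.g. 800000 (cpufreq sysfs values are in kHz)
    for policy in "$CPUFREQ_ROOT"/policy*; do
        [ -d "$policy" ] || continue
        echo "$governor" > "$policy/scaling_governor"
        echo "$min_khz" > "$policy/scaling_min_freq"
    done
}

# Usage on the router (as root); scaling_max_freq is left untouched:
#   set_cpufreq ondemand 800000
```

Whether 800000 or 600000 is the safer floor is exactly the open question in the post above, so treat the value as something to experiment with.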

ok, will try

Scaling works well. Even at full download speed it scales down sometimes.

hum...i'm sort of in two minds about this. digging further through the existing patch series on master, it seems that a lot of this stuff is meant to be handled by that patch series. in particular, it looks like the pull-down to the idle clock for L2 prior to setting the intended frequency ought to be handled, at least at first glance.

as far as the L2 frequencies go, it has no support for defining the required voltages for the L2 cache, nor does it take into account that the L2 scales differently from the cpu frequency. it assumes that the L2 will just run at the same frequency as the CPU, which contradicts https://source.codeaurora.org/quic/qsdk/oss/kernel/linux-msm/tree/arch/arm/boot/dts/qcom-ipq8064.dtsi?h=NHSS.QSDK.10.0

so on the one hand i could try and work out why the existing patch series doesn't work and perhaps find a smaller fix. or i could persist with deleting it and replacing it with something that i know works. the existing patch series is from 2015 and has never been merged upstream as far as i can tell, though i think some variant of it resurfaced this month.

anything that brings more performance to the table is welcome!!

any updates on this?
I have been running the 7800 with your patch with no issues at all.

um i have refined it further on my branch but i wouldn't expect it to be any faster. i added some locking etc that was in the original patches submitted to the upstream kernel (i assume they knew what they were doing) and fixed the L2 voltage not correctly scaling down sometimes.

i was planning to tidy it up and address the code comments from the earlier PRs, and submit it for actual merging to master, but i haven't got around to that yet.

EDIT: re my earlier concerns, it seems like the existing patch set on openwrt master does not entirely work correctly, and may be fixed in a later patchset submitted to the upstream kernel (but only partially merged). long story short, i don't think it's worth trying to 'fix' the patch set on master, i'll just continue with the current approach of removing it and just adding what is needed to set the L2 rates/voltages directly.

cool, will try your latest patch.
Can you teach me to create a patch that I can apply with patch -p1 like the first time I tried?

sure, github has a pretty nice interface for this, though it's perhaps not that obvious to find. this gives you a comparison view of openwrt/master to the pr2280 branch on my fork:

and if you want to see it in patch form, you just add a .patch on the end:

https://github.com/openwrt/openwrt/compare/master...facboy:pr2280.patch
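A rough sketch of fetching and applying it, assuming curl and patch are available on the build host. The helper name compare_patch_url and the output filename pr2280.patch are just illustrative choices.

```shell
#!/bin/sh
# Sketch only: build a GitHub compare URL in patch form,
# i.e. https://github.com/<repo>/compare/<base>...<head>.patch
compare_patch_url() {
    repo="$1"; base="$2"; head="$3"
    echo "https://github.com/${repo}/compare/${base}...${head}.patch"
}

# Usage from the root of an OpenWrt checkout (needs network access):
#   url=$(compare_patch_url openwrt/openwrt master facboy:pr2280)
#   curl -L -o pr2280.patch "$url"
#   patch -p1 --dry-run < pr2280.patch   # check that it applies cleanly first
#   patch -p1 < pr2280.patch             # then apply for real
```

The dry run is optional, but it saves cleaning up a half-applied patch if the tree has moved on since the branch was made.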

fyi i had a bug in this pr which i've just fixed: it incorrectly reported in the syslog that it failed to scale the L2 clock every time the clock changed, which added a small amount of overhead.

trying your patch on kernel 4.19... compiling right now

Is this patch not causing higher CPU usage this time?

@facboy anyway now that i'm looking at the patch... i understand why it got rejected by openwrt... if you are just adding a plain driver file and modifying the generic cpufreq-dt file... then just add a new specific plain driver... IMHO adding specific code to generic files is plain wrong


have this error... spammed like hell

Fri Sep 20 15:09:19 2019 kern.err kernel: [   66.044240] s1a: unsupportable voltage range: 1050000-0uV
Fri Sep 20 15:09:19 2019 kern.err kernel: [   66.044293] s1a: unsupportable voltage range: 1050000-0uV
Fri Sep 20 15:09:19 2019 kern.err kernel: [   66.048636] cpufreq_dt: failed to scale l2_regulator: -22
Fri Sep 20 15:09:19 2019 kern.err kernel: [   66.054195] cpufreq: __target_index: Failed to change cpu frequency: -22

the patch improves memory performance significantly; I tested it using the same software as @facboy.

Does it translate into a higher throughput?

not with cake, or if it does, then not by a noticeable amount.

What did improve throughput dramatically was this:
for file in /sys/class/net/*
do
    # 3 is a hex CPU bitmask: cores 0 and 1
    echo 3 > "$file/queues/rx-0/rps_cpus"
    echo 3 > "$file/queues/tx-0/xps_cpus"
done
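
The 3 written in the loop above is a hexadecimal CPU bitmask (bit 0 is CPU0, bit 1 is CPU1), so on the dual-core R7800 it lets both cores handle receive and transmit steering. A small sketch for working out the mask for other core combinations; the helper name cpu_mask is made up for illustration.

```shell
#!/bin/sh
# Sketch only: turn a list of CPU numbers into the hex bitmask
# format that rps_cpus and xps_cpus expect.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
    done
    printf '%x\n' "$mask"
}

# Usage:
#   cpu_mask 0 1    # both cores of a dual-core SoC
#   cpu_mask 1      # second core only
```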

that looks like it doesn't apply cleanly to 4.19. this is really a 4.14 patch tbh, it just deletes a bunch of stuff that i think has now been merged to 4.19 and replaces it. would need to look at vanilla 4.19 and work out what to change again.

i don't really know how to swap the cpufreq-dt driver to a specific driver for ipq806x...quite a lot of cpufreq-dt is still used. i don't really want to copy/paste the whole thing. i was planning to move all the L2 stuff to a specific driver, but still call it from cpufreq-dt (like the fab_scaling atm).

obviously i reworked it to apply on 4.19 and it does compile fine

The only problem is this... also, is the fix you posted about L2 scaling included in your repo?

sorry what i meant is that it doesn't 'work' properly on 4.19, i'm sure you fixed it to compile :). from what i remember a lot of stuff around krait L2 and the voltage management has changed in 4.19, and the patch set on 4.14 is an 'old' version that was never applied. i assumed that a completely different patch would be needed on 4.19, so i gave up trying to work out what was going on with the 4.14 patch set and just continued with dissent's approach of deleting and replacing it.

you mean the fix so that it transitions to the "correct" L2 speed? yes that is in my repo. i remember seeing references to it in one of the patch sets i think, it talks about switching the PLL on L2 transition to the secondary or something (which is 384MHz).

ok can confirm the L2 scaling is still busted on 4.19 after building your PR. i will look into a fix when i have time.
