R7500v2 kernel 4.19 test

up 3+ days on this build with no issues.

USB, thermal sensors, cpuidle (spc and wfi idle states), extended overlay, and ath10k-ct-htt are all functioning without errors on my r7500v2.

It appears there is an issue with the cpuidle spc idle state on the r7800. Based on @Ansuel's reports, the cpuidle wfi idle state appears to function on kernel 4.19 (as it did on kernel 4.14).

For 4.19, I made a change to my "ipq806x: k419 cpuidle fix" patch that will enable the wfi idle state, enable some cpuidle sysfs status info (not enabled on kernel 4.14), and disable the spc idle state for all devices dependent on qcom-ipq8064.dtsi (as it was on kernel 4.14).

However for the r7500v2, the spc idle state is enabled (via a status = "okay"; line added to the qcom-ipq8064-r7500v2.dts file).
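
To confirm which setting a running image actually ended up with, the live device tree can be inspected under /proc/device-tree. A minimal sketch, assuming the idle state node sits at cpus/idle-states/spc (that path is an assumption and may differ in the patched dtsi) and that a node without a status property counts as enabled; dt_node_status is just an illustrative helper name:

```shell
# Report the status of a device tree node directory.
# Property files under /proc/device-tree are NUL-terminated, so strip the NUL.
dt_node_status() {
    node="$1"
    if [ ! -d "$node" ]; then
        echo "missing"
    elif [ -r "$node/status" ]; then
        tr -d '\0' < "$node/status"
        echo
    else
        # No status property: the kernel treats the node as available.
        echo "okay (no status property)"
    fi
}
```

Possible usage on the router (node path hypothetical): dt_node_status /proc/device-tree/cpus/idle-states/spc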

As long as there is interest and assistance from an r7800 user, I'll help to see if the spc idle state can/should be enabled on the r7800. That said, I think it would be better to test the usb, thermal sensor, and cpuidle (wfi only) changes more broadly on the r7800 without enabling the spc idle state.

While I am ambivalent about having kernel 4.19 put into master, I know others would like to see it there and I think this may be the quickest way to achieve that.

Once I've got my next image built, loaded, and tested for a few hours, I'll post again.

I'm still reading about operating-points-v2. This is related to the cpufreq driver/system, is of interest to me on its own, and I plan to spend time on it; however, I suspect it will not help with the spc idle state for the r7800 - one can always hope tho.

IMHO ipq806x should be pushed to master at least as testing_kernel, like it was done for mvebu...

As we fixed everything that was broken (except spc, which was broken before anyway)

confirm.

confirm. ( 2 cpu init issue > cpu1 < set = crash )

~
0046-cpufreq-qcom-independent-core-clocks.patch?
~

so I see opp data in some dts files (I found some on the dd-wrt source code site for the r7500v2 if I decide to try...)

but I'm not sure it's needed. if I follow, the 0046*patch above seems to indicate that ipq8064 has independent clocks and the opp data is not needed, which may explain why it's not there now

EDIT: then again after reviewing the patched cpufreq-dt.c maybe they are still used... back to reading for now.

fine, but I'm still interested if something like this is relevant for the nss freq scaling issue

just don't want to bother quarky when I've got nothing to contribute - plus I'm not ready to work on that yet

A clue from prior efforts. The openwrt forum posts make more sense to me now. Perhaps "not relevant" is relevant

on 8065;

opp-shared: Indicates that device nodes using this OPP Table Node's phandle
  switch their DVFS state together, i.e. they share clock/voltage/current lines.
  Missing property means devices have independent clock/voltage/current lines,
  but they share OPP tables.

stopped the board crashing (with SPC nodes enabled )... although still without working idle... it's a big hint. ( I suspect current patch-set does not handle multi-cpu w shared clocks adequately ~at least on 4.19~ )...

i.e. check psci init...

[    5.911869] cpuidle: enable-method property 'psci' found operations
[    5.917253] cpuidle: enable-method property 'psci' found operations

[    6.016381] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    6.016417] pgd = (ptrval)
[    6.023529] [00000000] *pgd=00000000
[    6.026064] Internal error: Oops: 80000005 [#1] SMP ARM
[    6.044022] PC is at   (null)

code executes, without error until it addresses a memory address I'm assuming was supposed to be for the second cpu. but there is nothing there...

with your better fix of SPC disabled... board does not crash AND idle works. so, without a re-write of the underlying stuff... there may be little point investigating much further as what you've done seems to be adequate, for most....

patch 0046 about independent clock could be the cause?

i was checking the cpufreq patch and i noticed that it's possible we have an old patch

The v12 version of the series has 2 additional patches that read the opp table from the soc

https://patchwork.kernel.org/cover/10565391/

that looks right... stuff like;

	for_each_possible_cpu(cpu) {
		if (IS_ERR_OR_NULL(tbl2[cpu]))

seems to be what was missing... nice find...

so we have the original codeaurora patch but not the revised one from the kernel guys

the hard part is to find THE LAST one (and i think it just got rejected and never merged)


Ok that patch has been merged in linux-next so we need to find the merge...

some alternatives to think about and try later if your current line of effort stalls:

Documentation is sparse or nonexistent for these "suggestions", so most of this falls under "try it and see"

The idle driver has "knobs" to adjust in the dts file, i.e.:

entry-latency-us = <150>;
exit-latency-us = <200>;
min-residency-us = <2000>;
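
Whether a change to these values made it into the running kernel can be checked from sysfs. A rough sketch, assuming the standard cpuidle sysfs attributes latency and residency are present on this build; show_idle_params is just an illustrative helper name. From a reading of dt_idle_states.c it appears that, when wakeup-latency-us is absent, the reported latency is entry + exit latency, so the values above would probably read back as latency 350 and residency 2000 (untested here).

```shell
# Print exit latency and target residency (both in microseconds) for each
# idle state of one CPU. $1 = sysfs CPU root, $2 = cpu name (default cpu0).
show_idle_params() {
    root="${1:-/sys/devices/system/cpu}"
    cpu="${2:-cpu0}"
    for state in "$root/$cpu"/cpuidle/state[0-9]*; do
        # Skip the unexpanded glob when no states exist.
        [ -r "$state/name" ] || continue
        name=$(cat "$state/name")
        # Print "?" when an attribute is not exposed by this kernel.
        latency=$(cat "$state/latency" 2>/dev/null || echo '?')
        residency=$(cat "$state/residency" 2>/dev/null || echo '?')
        printf '%s %s latency_us=%s residency_us=%s\n' \
            "${state##*/}" "$name" "$latency" "$residency"
    done
}
```

Calling show_idle_params with no arguments reads cpu0 from /sys/devices/system/cpu; pass cpu1 as the second argument for the other core.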

There is an (undocumented?) different cpu "bring up" mechanism specified in the qcom-ipq8064.dtsi

cpu0: cpu@0 {
	compatible = "qcom,krait";
	enable-method = "qcom,kpss-acc-v1";

I tried enable-method = "qcom,kpss-acc-v2"; early on, but I did not observe a change

Lastly, perhaps the "SPM register data for 8064" hard coded into spm.c needs adjusting for 8065...

/* SPM register data for 8064 */
static const struct spm_reg_data spm_reg_8064_cpu = {
        .reg_offset = spm_reg_offset_v1_1,
        .spm_cfg = 0x1F,
        .pmic_dly = 0x02020004,
        .pmic_data[0] = 0x0084009C,
        .pmic_data[1] = 0x00A4001C,
        .seq = { 0x03, 0x0F, 0x00, 0x24, 0x54, 0x10, 0x09, 0x03, 0x01,
                0x10, 0x54, 0x30, 0x0C, 0x24, 0x30, 0x0F },
        .start_index[PM_SLEEP_MODE_STBY] = 0,
        .start_index[PM_SLEEP_MODE_SPC] = 2,
};

If I can understand why it's this way for 8064 and how 8065 is different, this might be possible. I think this is a long shot tho.

HTH

we need ipq8065 documentation to find these values and check if they changed.

up 3 days and no issues...

root@OpenWrt:~# uname -a
Linux OpenWrt 4.19.69 #0 SMP Fri Sep 6 19:49:31 2019 armv7l GNU/Linux
root@OpenWrt:~# uptime
 17:50:16 up 3 days, 48 min,  load average: 0.07, 0.04, 0.00
root@OpenWrt:~# cat /sys/devices/virtual/thermal/thermal_zone0/temp
47640
root@OpenWrt:~# cat /sys/devices/virtual/thermal/thermal_zone9/temp
50383
root@OpenWrt:~# cat /sys/devices/system/cpu/cpuidle/current_driver
arm_idle
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/name
WFI
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/usage
148883071
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu1/cpuidle/state0/name
WFI
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu1/cpuidle/state0/usage
91957390
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/name
spc
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu1/cpuidle/state1/name
spc
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/usage
28458872
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu1/cpuidle/state1/usage
13888733
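
For other testers who want to report the same counters without typing each cat by hand, the per-state reads above can be looped. A minimal sketch, assuming the usual cpuidle sysfs layout shown in the transcript; dump_cpuidle is a hypothetical helper name, not part of any OpenWrt package:

```shell
# Print the name and usage count of every cpuidle state of every CPU.
# $1 is the sysfs CPU root; defaults to the standard location.
dump_cpuidle() {
    root="${1:-/sys/devices/system/cpu}"
    for state in "$root"/cpu[0-9]*/cpuidle/state[0-9]*; do
        # Skip the unexpanded glob when cpuidle is not available.
        [ -r "$state/name" ] || continue
        cpu="${state%/cpuidle/*}"   # strip the trailing /cpuidle/stateN
        cpu="${cpu##*/}"            # keep only the cpuN component
        printf '%s %s %s usage=%s\n' \
            "$cpu" "${state##*/}" "$(cat "$state/name")" "$(cat "$state/usage")"
    done
}
```

Running dump_cpuidle with no arguments prints one line per CPU and idle state, which is easier to paste into a post than the individual reads.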

me too still waiting @anon50098793 for a patch

hmm...do you think this patch set is related to the L2 issues too? it seems i've been looking at the old patch set when trying to work out why it doesn't properly set the clocks on master: R7800 cache scaling issue

perhaps with the V12 version it will just work...up to a point.

problem is that it got merged upstream... so we need to just backport it

and i can't find the commit to the kernel...

The patch set was split over clk-next and cpufreq-next i think.

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=1578968f77e6759afe4882dc20882639eacb2977 merged from clk-qcom-krait https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=77612720a2362230af726baa4149c40ec7a7fb05

and i think cpufreq is here:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=f75d2accca7785657311653c125bb22f342dc5d9

i'm not entirely sure about the last patch https://patchwork.kernel.org/patch/10565455/ as it looks like only some (if any) of it got applied. perhaps

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=57f2f8b4aa0c6b41a284da82bfa40dc3b2abe9a5

up 6+ days and no issues

i'm starting to think we need to make a pr... or we won't test this any further. i mean... only 3-4 guys are testing 4.19 on ipq.....

well, for starters, here is my current patch to fix up patches-4.19 ... looking at it now... i don't think it's right... especially the 701.... i can take a better look if nobody already has something better....

basically 067 and 701 won't apply... without some mods...

chunkeey patches-4.19 fixup alpha

idle-tsense-common-fixups-dupe

dts-spc-7500only-alpha < @anon98444528 test?
