R7500v2 kernel 4.19 test

if you don't mind me asking...

where/how did you pick up on the dts extra entry?

is the ENXIO generated in 4.19 spm.c as @Ansuel suggests? i.e. it is going there but should not be

that's the way I read it....

either something is not present in the dts/i or something lower down somewhere like an arm cpu.c ain't talking nice...

cat arch/arm/boot/dts/qcom/* | grep -C12 SPC
grep -C10 idle-state
etc...
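
A slightly more targeted version of the searches above, as a sketch: grepping recursively reports each hit with its file name and line number, which `cat ... | grep` loses. The function name `dts_grep` and the default directory are assumptions for illustration; point it at wherever the dts/dtsi files live in your tree. It relies on GNU grep's `--include` option.

```shell
#!/bin/sh
# dts_grep PATTERN [DIR] [CONTEXT]
# Recursively search .dts/.dtsi files under DIR for PATTERN and print
# CONTEXT lines around each hit, prefixed with file name and line number.
# DIR defaults to arch/arm/boot/dts (an assumption; adjust for your tree).
dts_grep() {
    pattern=$1
    dir=${2:-arch/arm/boot/dts}
    context=${3:-10}
    grep -rn -C "$context" --include='*.dts' --include='*.dtsi' \
        -e "$pattern" "$dir"
}

# Example calls (run from the top of a kernel tree):
#   dts_grep SPC
#   dts_grep idle-state arch/arm/boot/dts 12
```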

IMHO the spm is not probed / initialized and cpuidle failed because of this

It may have but it's likely more a symptom than a root cause. In other words, if the code flow gets to spm.c for r7500v2 or r7800, cpuidle will fail (because it is supposed to fail).

Looking for an answer to why "0x40000002" (powerdown) and not say "0x00000001" (retention) (see appendix A.2) eventually got me to

"and also need to specify the arm,psci-suspend-param property for each idle state."

Great, the requirements for the dts(i) file changed. Even better, there is code to print a warning (in drivers/firmware/psci.c - in psci_dt_cpu_init_idle) if arm,psci-suspend-param is missing (apparently these are sent to @anon50098793 instead of my kern log... or I just need to learn how to turn on pr_warn in addition to pr_err).
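
On whether pr_warn output is visible: pr_warn logs at level 4 (KERN_WARNING) and pr_err at level 3 (KERN_ERR). A message is echoed to the console only if its level is numerically lower than the console loglevel, which is the first number in /proc/sys/kernel/printk. Either way it should still land in the ring buffer that dmesg reads. A minimal sketch of that check follows; the function name and the optional file argument are made up for illustration.

```shell
#!/bin/sh
# warn_on_console [FILE]
# Report whether KERN_WARNING (level 4) messages reach the console.
# FILE defaults to /proc/sys/kernel/printk; its first field is the
# current console loglevel. A message prints when level < console loglevel.
warn_on_console() {
    file=${1:-/proc/sys/kernel/printk}
    read -r console_level _rest < "$file" || return 2
    if [ "$console_level" -gt 4 ]; then
        echo "yes (console loglevel $console_level)"
    else
        echo "no (console loglevel $console_level)"
        return 1
    fi
}

# To raise the console loglevel so warnings show up (needs root):
#   dmesg -n 5
# Independent of the console, logged messages stay in the ring buffer:
#   dmesg | grep -i psci
```

For messages printed during early boot, the `loglevel=` kernel command-line parameter sets the same console loglevel.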

so not a bug? a long winded way to say you are right to look at the dts file? idk, still need a time when the router isn't being used so I can try it.

So you are telling me they changed the cpuidle requirement in the DTS without updating the documentation?


Anyway, trying a build... now we have to solve the problem with tsense...


@anon50098793 do we have to enable some kernel config flags? i still have the error with cpuidle

idk, perhaps a better way to express my current attempt at understanding is: how the kernel responds to the data/configuration specified in the DTS files is changing and the quoted link in my previous post is a start of the documentation.

BTW check this out (under The idle invocation hierarchy) as I think it's a map of how cpuidle used to be initiated.

Based on the changes for usb and cpuidle, I'd search the linux arm mailing list and include keywords about the dts. Also keywords like "flatten dt" may help...

if you want to do a quick test try adding

entry-method = "psci";

in idle-states (in addition to @anon50098793's changes) and

enable-method = "psci";

for each cpu right above cpu-idle-states = <&CPU_SPC>;

i.e. "optional" in the idle-states documentation:

An idle-states node defines the following properties:

- entry-method
	Value type: <stringlist>
	Usage and definition depend on ARM architecture version.
		# On ARM v8 64-bit this property is required and must
		  be:
		   - "psci"
		# On ARM 32-bit systems this property is optional

might mean include it if you want to use psci or leave it out if you want to use something else...

4.19.56 working fine, no new issues observed.

Still no progress on cpuidle but an update below for anyone interested.

While I think psci is the upcoming "mechanism" for cpu idle control, I'm uncertain if attempting to implement it here is the right solution. It looks like doing so would be a substantial change (likely more than the changes to the dts mentioned above), I'm uncertain if the 4.19 kernel is ready for it with ipq806x, and I'm uncertain if the ipq806x "firmware" supports it (but I think it's likely).

For now, I'm going back to focusing on cpuidle-arm.c, spm.c, and the dts system. It looks like 4.14 cpuidle-arm.c anticipated an -ENXIO error and was coded to ignore it (and still function?). In 4.19 cpuidle-arm.c, I tried:

	if (ret) {
		pr_err("CPU %d failed to init idle CPU ops\n", cpu);
		ret = ret == -ENXIO ? 0 : ret;
		// goto out_kfree_drv;
	}

but that resulted in a boot loop. Since I'm only interested in a "try it and see" approach if it gives me some indication about what the problem is, this result sent me back to reading about the dts system and kernel with an eye to tracing/troubleshooting/debugging techniques. There is lots of information out there on these topics so I expect it to take me some time.

~ 7 days up on kernel 4.19.56 - no new issues, worked well.

just built and installed 4.19.57, so far so good.

been busy so nothing to report about thermal sensors or cpuidle.

same busy for exams...

no crash or problems

Example for tftpboot initramfs images via tty serial on the r7500v2 (i.e. for testing an initramfs image loaded to ram without having to flash an image to nand):

This example is based on the instructions provided by @quarky for the r7800 here.

Prerequisites:

  1. Compile and install the usb to tty device drivers (in this case provided by hiletgo):
$ make
$ sudo cp cp210x.ko /lib/modules/`uname -r`/kernel/drivers/usb/serial
$ sudo insmod /lib/modules/`uname -r`/kernel/drivers/usb/serial/usbserial.ko
$ sudo insmod /lib/modules/`uname -r`/kernel/drivers/usb/serial/cp210x.ko
  2. Install the tftp server "tftpd-hpa" and "screen" on ubuntu. If using the "ufw" firewall, allow tftp (udp port 69):
$ sudo apt update
$ sudo apt install screen tftp-hpa tftpd-hpa
$ sudo service tftpd-hpa status
$ sudo service tftpd-hpa start
$ sudo ufw allow tftp
  3. Copy an initramfs image to the tftpd directory. In my case, I've already built, flashed (via the r7800 TFTP method here which works perfectly on the r7500v2) and tested an openwrt "factory" image on the router; however, it is likely not necessary to have an openwrt image already flashed to nand to tftpboot an initramfs from ram. I used the initramfs automatically generated during the build of a known working image to make sure the initramfs image will work.
$ sudo cp ~/openwrt/bin/targets/ipq806x/generic/openwrt-ipq806x-netgear_r7500v2-initramfs-uImage /var/lib/tftpboot/
  4. Connect the computer ethernet port to a LAN port on the router, then set up and enable the computer ethernet with a static ipv4 address/netmask: 192.168.1.10/24, gateway: 192.168.1.1. Connect the usb tty serial device to the router and computer (a usb extension cable to the usb tty adapter works for me).
  5. Start a screen session. I use:
$ sudo screen -h 1000 -L -Logfile ~/r7500v2-`date +"%Y%m%d-%H%M"`.log /dev/ttyUSB0 115200
  6. Power on the router.
  7. At this point you should see the router booting up in the screen session started above. If not, you'll need to troubleshoot the usb tty serial setup - it took me a few tries... When you see something similar to:
U-Boot 2012.07 [local,local] (May 29 2015 - 19:03:53)

U-boot 2012.07 dni1 V1.5 for DNI HW ID: 29764958 NOR flash 0MB NAND flash 128MB RAM 512MB 1st Radio 3x3 2nd Radio 4x4
smem ram ptable found: ver: 0 len: 5
DRAM:  491 MiB
NAND:  SF: Unsupported manufacturer 00
ipq_spi: SPI Flash not found (bus/cs/speed/mode) = (0/0/48000000/0)
128 MiB
MMC:   
*** Warning - bad CRC, using default environment

PCI0 Link Intialized
PCI1 Link Intialized
In:    serial
Out:   serial
Err:   serial
 131072 bytes read: OK
cdp: get part failed for 0:HLOS
Net:   MAC1 addr:XX:XX:XX:XX:XX:XX
athrs17_reg_init: complete
athrs17_vlan_config ...done
S17c init  done
MAC2 addr:XX:XX:XX:XX:XX:XX
eth0, eth1
Hit any key to stop autoboot:  1
(IPQ) # 

Interrupt U-Boot by pressing any key (in the serial console) when prompted. You have only 2-3 seconds before U-Boot proceeds to boot from the NAND flash.

  8. Find the ram load address to use with the uboot "tftpboot" command from the uboot command "printenv":
(IPQ) # printenv
baudrate=115200
bootargs=console=ttyHSL1,115200n8
bootcmd=sleep 2;   nmrp;  if loadn_dniimg 0 0x1480000 0x44000000 && chk_dniimg 0x44000000; then bootipq2; else fw_recovery; fi
bootdelay=2
config_ubi_prepare=mtdparts default; ubi part dnidata
ethact=eth0
ipaddr=192.168.1.1
language_ubi_prepare=mtdparts default; ubi part language
loadaddr=0x42000000
machid=1260
modelid=R7500v2
serverip=192.168.1.10
stderr=serial
stdin=serial
stdout=serial
updateloader=ipq_nand sbl && nand erase 0x00c80000 0x00580000 && imgaddr=0x42000000 && source $imgaddr:script

Environment size: 585/262140 bytes

In my case, the ram load address is "44000000" and is shown in the "bootcmd" environment variable as 0x44000000. Using the loadaddr environment variable "0x42000000" did not work (I think this is the nand address to load an image from).

  9. Test that you can see the tftp server from uboot:
(IPQ) # ping 192.168.1.10
Using eth1 device
host 192.168.1.10 is alive
  10. Load the initramfs image via "tftpboot":
(IPQ) # tftpboot 44000000 192.168.1.10:openwrt-ipq806x-netgear_r7500v2-initramfs-uImage
Using eth1 device
TFTP from server 192.168.1.10; our IP address is 192.168.1.1
Filename 'openwrt-ipq806x-netgear_r7500v2-initramfs-uImage'.
Load address: 0x44000000
Loading: #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         ###############################################################
done
Bytes transferred = 5683228 (56b81c hex)
  11. Boot the image:
(IPQ) # bootm
   Image Name:   ARM OpenWrt Linux-4.19.57
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    5683164 Bytes = 5.4 MiB
   Load Address: 42208000
   Entry Point:  42208000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK 
mtdparts variable not set, see 'help mtdparts'
no partitions defined

defaults:
mtdids  : nand0=msm_nand
mtdparts: mtdparts=msm_nand:3584K@0x7980000(language),3M@0x7d00000(dnidata)
info: "mtdparts" not set
Using machid 0x1260 from environment

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.19.57 (n@E6410) (gcc version 7.4.0 (OpenWrt GCC 7.4.0 r10459+4-1174b94bc9)) #0 SMP Tue Jul 9 23:20:21 2019
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d 
...

When the kernel log messages stop, hit any key to get a console and go from there. Note, I could not see my "overlay" files from my previous nand flashed openwrt image; however, typing "reboot" from the serial console and not interrupting uboot as described above got me back to my nand flashed environment. Nice.
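
A quick sanity check on the transfer in the transcript above: the legacy uImage format puts a 64-byte header in front of the payload, so "Bytes transferred" from tftpboot should equal the "Data Size" reported by bootm plus 64, and should also match the size of the file on the tftp server. The helper name below is made up for illustration.

```shell
#!/bin/sh
# Size of the legacy uImage header in bytes.
UIMAGE_HEADER_BYTES=64

# uimage_payload_size TOTAL_BYTES
# Print the payload size expected in bootm's "Data Size" line for a
# legacy uImage file of TOTAL_BYTES bytes.
uimage_payload_size() {
    echo $(( $1 - UIMAGE_HEADER_BYTES ))
}

# Values taken from the U-Boot transcript:
printf 'transferred: %d bytes = 0x%x\n' 5683228 5683228
printf 'payload:     %d bytes\n' "$(uimage_payload_size 5683228)"
```

This prints 0x56b81c and 5683164, matching the "56b81c hex" and "5683164 Bytes" lines in the transcript, so the whole image arrived.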

a picture of the r7500v2 with tty usb connection...

That is correct: rx/tx are always labeled from the point of view of the individual devices, meaning the r7500v2 listens on its rx pin for transmissions from the tx pin of your connector - and transmits back over its tx pin, which is listened to by the rx pin of your USB2serial adapter.

cpuidle update: it helps to be able to see boot messages via the serial console and to quickly compile, load and test kernel images.

I confirmed what @Ansuel observed in prior posts. For kernel 4.19 in spm.c, spm_dev_probe is never called (or pr_err fails silently from here) and qcom_cpuidle_init will always return -ENXIO. However, state_count does get to 2, so it looks like the purpose of calling qcom_cpuidle_init is to set fns and per_cpu(qcom_idle_ops, cpu) = fns.

What's new (for me) is that this also happens in 4.14.

So I'm back to trying to understand why 4.14 calls ret = cpuidle_register_driver(drv) before skipping the ENXIO error and 4.19 will exit on ENXIO before calling this function... @Ansuel observed that calling this function earlier results in a boot loop and my own attempts at skipping over the error also result in a boot loop... seems broken to me and I don't think the answer is in this patch set (but who am I to judge).

My other clues are:
qcom-ipq8064.dtsi:

saw0: regulator@2089000 {
	compatible = "qcom,saw2", "syscon";

"syscon" (system controller?) does not show up in the qcom-ipq8064.dtsi vanilla kernel sources - so I'll try to understand why it's here...

and

@Ansuel's suggestion to look at tcsr

yes all is correct

tcsr is the only one that looks for compatible syscon, so I think syscon = tcsr. BUT STILL... in tcsr there is nothing about cpuidle... it's all about usb initialization

IMO the following does nothing more than continue to demonstrate my lack of understanding... but it does get

cat /sys/devices/system/cpu/cpuidle/current_driver

to return "arm_idle" on 4.19 after booting.

I made the following hackish edits to 4.19 cpuidle-arm.c to make it "work" like 4.14:

	/*
	 * Allow the initialization to continue for other CPUs, if the reported
	 * failure is a HW misconfiguration/breakage (-ENXIO).
	 */
	if (ret) {
		//pr_err("CPU %d failed to init idle CPU ops\n", cpu);
		ret = ret == -ENXIO ? 0 : ret;
		//goto out_kfree_drv;
	}

	ret = cpuidle_register_driver(drv);
	if (ret) {
		if (ret != -EBUSY)
			pr_err("Failed to register cpuidle driver\n");
		goto out_kfree_drv;
	}
	/*
	 * dev = kzalloc(sizeof(*dev), GFP_KERNEL);
	 * if (!dev) {
	 *	ret = -ENOMEM;
	 *	goto out_unregister_drv;
	 * }
	 * dev->cpu = cpu;
	 *
	 * ret = cpuidle_register_device(dev);
	 * if (ret) {
	 *	pr_err("Failed to register cpuidle device for CPU %d\n",
	 *	       cpu);
	 *	goto out_kfree_dev;
	 * }
	 */
	return 0;

I'm away until Monday. I'd really like to get some kind of spec sheet for ipq8064. Are "spm" and "saw" even used?

EDIT: reference this and this regarding cpuidle, qcom-scm driver (not relevant?), and the transition from 4.4 to 4.9.

This way the driver gets registered but not the device... So it was actually broken in 4.14? Is there some way to test cpuidle in 4.14?

that's how i see edits/patches evolving in the linux-arm mailing list...

idk but I think it's possible. Still contemplating how to test cpuidle. Also, I'd like to know if it's useful - it's my understanding that having arm cpuidle should use less power and enhance "performance" but I have not seen this proven.
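
On how to test cpuidle: one low-effort check, sketched here and untested on the r7500v2, is to read the per-state counters that the cpuidle core exposes in sysfs. If the `usage` and `time` values for a state keep growing between two reads, the CPU is actually entering that state. If no state directories exist at all, that likely means no cpuidle device is registered for that CPU, even when current_driver reports a driver. The sysfs layout (cpuN/cpuidle/stateM/name, usage, time) is the standard one; the function name and the optional root argument are only for illustration.

```shell
#!/bin/sh
# cpuidle_stats [ROOT]
# Print "cpu state name usage time_us" for every cpuidle state found
# under ROOT (default: /sys/devices/system/cpu).
cpuidle_stats() {
    root=${1:-/sys/devices/system/cpu}
    found=0
    for state in "$root"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -d "$state" ] || continue
        found=1
        cpu=${state%/cpuidle/*}   # strip "/cpuidle/stateM"
        cpu=${cpu##*/}            # keep only "cpuN"
        echo "$cpu ${state##*/} $(cat "$state/name") $(cat "$state/usage") $(cat "$state/time")"
    done
    if [ "$found" -eq 0 ]; then
        echo "no cpuidle states found under $root"
        return 1
    fi
}

# Typical use on the router: take two samples a few seconds apart
# and compare the usage/time columns.
#   cpuidle_stats; sleep 10; cpuidle_stats
```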

Before I go for the weekend, I'm creating a cleaned up patch and will post a link to it in my "k419" github branch.

EDIT: a less hackish patch on github here.

A power meter should be sufficient.
Our arm cpu doesn't support a suspend mode that actually disables cores, but it does support disabling some parts to save power, so...
About performance and cpu scaling, we have another driver. Anyway, will try to check the tsense problem tonight.

You're pretty close now I think @anon98444528

I managed to squeeze out these, although they are likely just "missing stuff" related;

[    1.593968] Speed bin: 0
[    1.598317] PVS bin: 5
[    1.602797] DT idle-states: Parsing idle state node /cpus/idle-states/spc failed with err -19
[    1.807413] cpufreq: cpufreq_online: CPU1: Running at unlisted freq: 387500 KHz
[    1.812917] cpufreq: cpufreq_online: CPU1: Unlisted initial frequency changed to: 600000 KHz
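
For reading these logs: kernel error codes are negated errno values, so "err -19" above is -ENODEV ("No such device"), while the -ENXIO discussed earlier in the thread is -6 ("No such device or address"). A tiny lookup for the codes that show up in this thread, with values as defined for Linux; the function name is made up for illustration.

```shell
#!/bin/sh
# errno_name CODE
# Map a (possibly negative) kernel return code to its errno name.
# Only the codes mentioned in this thread are covered.
errno_name() {
    case ${1#-} in
        6)  echo ENXIO ;;
        12) echo ENOMEM ;;
        16) echo EBUSY ;;
        19) echo ENODEV ;;
        *)  echo "unknown ($1)" ;;
    esac
}

errno_name -19
```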

Interestingly, I independently also wound up at "syscon" as you did....

Was essentially trying various compatible, saw, and acc options from;
https://linux-arm-kernel.infradead.narkive.com/roIsCrjt/patch-v9-3-9-arm-dts-qcom-add-power-controller-device-node-for-8974-krait-cpus

(It seemed that maybe "saw" needs to be changed to saw-vXX; perhaps plain saw is not 4.19 capable?)