[PR] ipq806x: kernel 5.10 bump code propose

Yes, C2600 is a dumb AP connected to the R7800 as a main router.

I can but please confirm that you want me to test these commands on kernel 5.10+DSA. Both routers are now on kernel 5.10+swconfig so I'd need to change the firmware.

EDIT: I just now saw that you made a few more changes to the code. I'll upgrade to latest master + your new commits and will let you know.

Yes, I always change both.

This is the result that I get for ethtool --set-eee lan1 eee off on the C2600, on all ports. Didn't test on the R7800 yet.

eee unmodified, ignoring

And this is ethtool --show-eee lan1:

EEE Settings for lan1:
        EEE status: disabled
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  Not reported
        Link partner advertised EEE link modes:  Not reported

So, EEE is off by default. If I do ethtool --set-eee lan1 eee on, then I get:

EEE Settings for lan1:
        EEE status: enabled - inactive
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  Not reported
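When comparing several ports or several routers, the `ethtool --show-eee` dumps above can be reduced to one line per interface. A minimal sketch, assuming the output format shown in this post; `eee_summary` is a made-up helper name, not part of ethtool:

```shell
# Sketch: print "<iface>: <EEE status>" from `ethtool --show-eee` output on stdin.
# eee_summary is a hypothetical helper; it assumes the output layout quoted above.
eee_summary() {
    awk '
        # Header line, e.g. "EEE Settings for lan1:" -> remember the interface name
        /^EEE [Ss]ettings for / { iface = $4; sub(/:$/, "", iface) }
        # Status line, e.g. "        EEE status: enabled - inactive"
        /EEE status:/ {
            sub(/^[ \t]*EEE status:[ \t]*/, "")
            print iface ": " $0
        }
    '
}
```

Usage would be something like `ethtool --show-eee lan1 | eee_summary`, repeated for each port.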

EDIT:

Scrap the above. I now upgraded both the C2600 (dumb AP) and the R7800 (main router) at my home as well as my R7800 at my office (main router) with your latest code (kernel 5.10+DSA) on top of latest master. EEE is on by default on all routers.

My office R7800 is behaving well and the logs are clean. Speed and latency are good. I have a few ethernet connections plugged into it (file servers and a switch) and there are no port drops.

At my house, the C2600 is working ok and logs are clean. On the R7800, however, there were lots of these every 3-5 secs, right after the router booted:

[ 194.496394] qca8k 37000000.mdio-mii:10 lan3: Link is Down
[ 194.496498] br-lan: port 3(lan3) entered disabled state
[ 196.577023] qca8k 37000000.mdio-mii:10 lan3: Link is Up - 100Mbps/Full - flow control rx/tx
[ 196.577061] br-lan: port 3(lan3) entered blocking state
[ 196.577076] br-lan: port 3(lan3) entered forwarding state

I'm not sure what is connected to port 3 (I still have to sort out the cabling...). After I saw these port drops, I turned off EEE on port 3 of the R7800 and the port drops stopped. Maybe there is some device connected to port 3 that cannot handle EEE?
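To see which ports are flapping and how often, the "Link is Down" lines can be counted per port. A rough sketch, assuming the kernel log format quoted above; `link_flaps` is a hypothetical helper name:

```shell
# Sketch: count "Link is Down" events per port from kernel log text on stdin.
# link_flaps is a hypothetical helper; it assumes lines like
#   [ 194.496394] qca8k 37000000.mdio-mii:10 lan3: Link is Down
link_flaps() {
    awk '
        /Link is Down/ {
            # The port name is the field right before "Link" and ends with ":"
            for (i = 2; i <= NF; i++)
                if ($i == "Link") {
                    port = $(i - 1)
                    sub(/:$/, "", port)
                    count[port]++
                }
        }
        END { for (p in count) print p, count[p] }
    ' | sort
}
```

Usage would be `dmesg | link_flaps`, which prints one line per port with its number of link-down events.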

So, other than the port 3 drops above, everything seems to be working well. I'll check which device is connected to port 3 when I get back home later today and will report any other stability issues.

Thank you for the nice work!

EDIT 2: Port disconnects seem to have been solved by turning off EEE on the problematic ports. But I'm seeing problems with some devices failing to reconnect to wifi after roaming between routers. I did some googling and it seems that other platforms had similar problems with DSA.

Nicely done. I know mvebu is a totally different target, but kernel 5.10 is running great on the WRT3200ACM too.

@hnyman I tried your latest build R7800-master-r16186-bf4aa0c6a2-20210314-1025-factory.img. I flashed the factory.img directly using the TFTP method instead of going through sysupgrade. My R7800 still boot-loops. I even disabled Energy Efficient Ethernet on my Windows 10 laptop (connected to the LAN4 port), with no devices connected to any of the other ports, and it still boot-looped.

I tried MASTER SNAPSHOT (not 21.02) r16218-662ceebc4c factory.img via TFTP, and it works fine. I don't know if there is something related to DSA or kernel 5.10 that doesn't work on my R7800. I cannot connect to the R7800 through ethernet or wifi.

I don't think my existing config would have been retained when flashing through TFTP so it should be whatever default config the flash image had or generated on first boot.

If you want any logs/info from my R7800, please let me know. I have access to SSH, SFTP and Luci, but no JTAG or Serial access. I am currently running 21.02-SNAPSHOT r15893-f82e7e96a0.

My current R7800 swconfig setup is

WAN port: Arris S33 Cable Modem
LAN1 - Raspberry Pi 4 Model B - Main Router running Openwrt 21.02-SNAPSHOT
LAN2 - Nothing
LAN3 - Windows 10 Laptop (lan_2 aka guest network)
LAN4 - Windows 10 / Linux Laptop (lan_1 aka main local network)
2.4 GHz WiFi - Disabled
5 GHz WiFi - lan_1 local network

I am using VLAN tagging on LAN1 to separate the "internet" (eth0.1001), "lan_1" (eth1.2001) and "lan_2" (eth1.2002) networks and sending them to my RPi4B. I am using my R7800 only as an Ethernet switch and WiFi access point.

EDIT: Re-formatted post to avoid confusion.

Please differentiate between two things:

  • does your R7800 boot-loop with the default settings?
  • or does it boot-loop with your personal config, with added VLAN settings?

Serial access would be crucial, as otherwise you do not see the kernel log at the crash time, so it is pure guessing what causes it. It is pretty easy to enable in R7800. See Netgear R7800 exploration (IPQ8065, QCA9984) - #2 by hnyman

And @Ansuel is actually the person doing the hard work here with kernel 5.10.

Personally I do not use VLANs, so no comments regarding that part. The build has worked for me.

Default config. TFTP recovery flash erases all config on the device, so it's whatever default config the flashed build has or generates during firstboot. It boot-loops immediately after flashing. I do not have an ethernet or wifi connection to the device and therefore cannot log in to SSH or Luci to change the config in the first place.

I have edited my original post to clarify that my setup (that I mentioned) is what I am currently using in swconfig.

Yeah, I got confused by your kernel logs (that had VLANs), but they are from 21.02, and do not have much relevance for debugging your problems with 5.10.

I currently do not have a USB-TTL serial cable. I will try to obtain boot logs for the DSA build during the weekend.

@Ansuel and @hnyman Test-DSA-kernel510-master-r16293-310b7f76e8-20210321 Boot Log

Power Adapter is
Netgear P/N 332-10762-01;
Model No. MU42-3120350-A1 (U.S.A. pin);
Input: AC 100-240V, 50/60Hz, 1.5A;
Output: DC 12V, 3.5A

as i thought, it's a voltage problem... now what i need to understand is: is your cpu too good or too bad?

Is that a question for me or are you just thinking out loud?

So I have a (short) opportunity to test tomorrow but I ran into a compile issue:

...
FATAL ERROR: Couldn't open "/home/sn/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/linux-5.10.23/arch/arm/boot/dts/qcom-ipq8064-g10.dtb": No such file or directory
mkimage: Can't open /home/sn/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/asrock_g10-fit-uImage.itb.new.tmp: No such file or directory
make[5]: *** [Makefile:431: /home/sn/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/asrock_g10-fit-uImage.itb] Error 255
make[5]: *** Waiting for unfinished jobs....
...

not sure atm if I did something silly. I consider this an "aggressive" build attempt as I just used my normal AP config (with a fair amount of packages/kmods) using your PR on top of latest master. I can make it simpler...

To get ready for the build, I did:

cd ~/openwrt
make distclean
git checkout master
git fetch upstream
git rebase upstream/master
git fetch upstream pull/3954/head:k510
git checkout k510
git rebase master
...

followed by my normal build script. All of which went without error until the build.

I'll see if I can't figure out what's up but a hint would be appreciated as I haven't done this in a while.

EDIT: I'm still not sure why I continue to get this error during building, but after a few other attempts, I checked ~/openwrt/bin/targets/ipq806x/generic and the r7500v2 images are there. I missed my window for today, but I think I can at least load it later this week and post a dmesg if there is anything interesting. Re-config for DSA looks relatively straightforward even for my network VLANs...

both...
long story:
there are 7 ranges the cpu can work in, based on how good the chip is.
So we need to understand if your router requires more voltage or less voltage...
a quick test would be to do a speed test and check the values here

cat /sys/kernel/debug/regulator/regulator_summary

This is in 21.02-SNAPSHOT r15915-31bca5f256 build from downloads.openwrt.org. I took multiple outputs while the speed test was running, as I was not sure if the values change or remain constant.

EDIT: I am also running my R7800 with the following in /etc/rc.local

echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo performance > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor

ok now i need your idle voltage...
just take a few regulator summaries and we should be able to find your pvs
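Taking several summaries by hand is easy to get wrong, so the captures could be scripted. A minimal sketch, assuming the debugfs path quoted earlier in the thread; `snapshot_file` and its defaults (5 samples, 2 seconds apart) are made up for illustration:

```shell
# Sketch: take N snapshots of a file, INTERVAL seconds apart, each under a header.
# snapshot_file is a hypothetical helper; the default path is the one quoted in the thread.
snapshot_file() {
    file=${1:-/sys/kernel/debug/regulator/regulator_summary}
    n=${2:-5}
    interval=${3:-2}
    i=1
    while [ "$i" -le "$n" ]; do
        echo "=== sample $i ==="
        cat "$file"
        # Sleep between samples, but not after the last one
        [ "$i" -lt "$n" ] && sleep "$interval"
        i=$((i + 1))
    done
}
```

Running `snapshot_file > idle.txt` once while idle and once during a speed test would give two captures that can be compared side by side.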

I am currently having only a few websites open (but no videos) and not using the full download/upload bandwidth. I don't know if that counts as idle. I checked the regulator_summary again and the values are same as before. I have set both the cpu cores to 'performance' governor. I don't know if that is keeping the regulator values constant.

Also the current values are shown as 0mA. Does it mean the current values are actually in uA (micro Amps) that is rounded off to 0 mA, or is it actually 0 mA (zero current usage)? Current value being zero signifies open circuit. I am confused.

pls revert the gov to the original value or the voltage is not scaled.
about the 0mA: the regulator doesn't communicate the current used. the system just tells the regulator what voltage should be used.
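Reverting the governor amounts to writing the original value back to the same sysfs files that the rc.local lines above write to. A small sketch, assuming those `policy*/scaling_governor` paths; `set_governor` is a hypothetical helper, and the governor to restore is whatever the build used before the change:

```shell
# Sketch: write a governor name into every cpufreq policy under a root directory.
# set_governor is a hypothetical helper; the default root matches the rc.local lines above.
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu/cpufreq}
    for f in "$root"/policy*/scaling_governor; do
        # Skip the unexpanded pattern if no policy directory exists
        [ -e "$f" ] || continue
        echo "$gov" > "$f"
    done
}
```

For example, `set_governor ondemand` would apply to both cores at once. The `echo performance` lines in /etc/rc.local would also need to be removed, or the setting comes back on the next boot.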

I updated the same gist with captures both during Idle, during Speed Test and then Idle again, with ondemand governor (default).

it seems you have a pvs3 while I have pvs4.. strange situation. i wonder if there is a problem with setting the voltage at the wrong time?