Netgear R7800 exploration (IPQ8065, QCA9984)

Found something really strange...
MDIO register 31 (reg_offset 31) needs an extra delay.
I noticed this because the mdio code still occasionally fails to detect a port (on the order of once every few days).
So I investigated the register read and write operations along with their return values...
I noticed that right after any write to reg_offset 31, the very next read returns garbage... (and unluckily the next read is always the port status). A write to reg_offset 31 needs 16 more usec on top of the default delay of 8 (or 10... I still need to work out whether it's safe to lower the default delay to 8).

Using the extra delay for reg 31 fixes the problem...
To test all of this I simply repeated each read operation 5 times and checked that the value was the same every time... To my surprise, the value was correct and identical for every write/read except the read right after a write to offset 31. (After the normal init, offsets 30 and 31 are the ones commonly used.)
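
The repeat-and-compare check described above can be sketched as a small shell helper. This is only a user-space illustration of the idea, not the actual driver change: `read_stable` is a hypothetical name, and whatever command is passed to it stands in for the real register read, which presumably happened inside the mdio code.

```shell
# read_stable COUNT CMD [ARGS...]
# Run CMD COUNT times and compare every result with the first one.
# Prints the value and returns 0 if all reads agree; returns 1 on the
# first mismatch (reported on stderr) and 2 if CMD itself fails.
read_stable() {
    count=$1
    shift
    first=$("$@") || return 2
    i=1
    while [ "$i" -lt "$count" ]; do
        cur=$("$@") || return 2
        if [ "$cur" != "$first" ]; then
            echo "read $((i + 1)) differs: '$cur' vs '$first'" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "$first"
}
```

Calling it as `read_stable 5 <read command>` mirrors the 5 repeated reads; a read that returns garbage once, like the one after a write to offset 31, shows up as a mismatch.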

From a very early test (8 usec delay + the extra delay for offset 31) it appears to work normally, and I no longer get random port drops at startup.

Anyway, the major problem with the random port dropping (leaving aside the lost packets) is this: if, for example, you have a PPPoE session and many services that are reloaded on a WAN reset (unbound, adblock, banip, hetunnel), the system gets overloaded and the wifi freezes with lots of swba overrun errors, because the system can't keep up with the packets from the wifi... Sometimes this can even crash the wifi firmware, so a simple 3-4 sec outage turns into 30-60 sec of downtime.

Building from latest master breaks the R7800 for me, seems to crash or something.
Could be that I hit this... I must have pulled the src just before that commit.
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=09fbc79bf6dc6365018944e1e0629b5f1c921790

hmm, I'm not sure if this is related but

An imagebuilder image for the r7500v2 built today had problems (reported here). In brief, the image won't build with the package block-mount, and if I remove that, the resulting image boots but is essentially non-functional. From the console, it looked like the device couldn't read its flash, and none of the prior configs/files were present.

Tftp flashing an image from late Jan. and restoring my configs got the device back.

Hi,

I've played a little bit with samba4 configurations and recently while doing "fdisk -l" putty returned this:

I refreshed the firmware, but the backup GPT table is corrupt. How did it become corrupt, and how can I fix it? Does anyone know?

And is the partition 1 error normal?

Thanks in advance for any help on this.

Never mind, fixed it with fdisk /dev/sda :)

I'm having trouble with samba4:

I followed this guide:

It got me as far as I could get. Could the problem be that I'm using Windows 8.1?

I also get this in Explorer:

Can someone help please?

Is anyone able to enable EEE (Energy Efficient Ethernet)? The switch should support it. I set enable_eee to 1 in swconfig, but devices that connect to the port are unable to negotiate EEE.

ipq806x is about to move from swconfig to DSA switch drivers (see the kernel v5.10 PR and thread), so it would probably be most useful if you could repeat your tests with DSA.

Edit: I would be rather surprised if anyone had tested EEE on the QCA8337 (or on most other devices typically running OpenWrt), but the DSA drivers should give you more insight via ethtool and iproute2 (EEE has been tested on the DSA-based realtek target).

Also, in theory it should be enabled by default, if I'm not wrong.

One example (router connected to an Askey modem with a Broadcom switch):

root@No-Lag-Router:~# ethtool --show-eee wan
EEE Settings for wan:
        EEE status: enabled - active
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full
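
To check this across several ports without reading the full output each time, the `EEE status` line can be pulled out of the `ethtool --show-eee` output. A minimal sketch, assuming the output format shown above; `eee_status` is a hypothetical helper name.

```shell
# eee_status: read `ethtool --show-eee <port>` output on stdin and print
# only the text after "EEE status:" (for example "enabled - active").
eee_status() {
    sed -n 's/^[[:space:]]*EEE status:[[:space:]]*//p'
}
```

Usage would look like `ethtool --show-eee wan | eee_status`. A port where negotiation failed should report something other than "enabled - active", though the exact wording depends on the ethtool version.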

Anyone know what's going on here?
I get a lot of messages like these

[1806716.506993] ath10k_pci 0000:01:00.0: fetch-ind: failed to lookup txq for peer_id 47 tid 0

Linux nighthawk 5.4.99 #0 SMP Sat Feb 20 18:23:45 2021 armv7l GNU/Linux
11:29:21 up 23 days, 15:45, load average: 0.00, 0.04, 0.01

FWIW, i don't see those on my r7500v2 (AP only).

Linux r7500v2 5.4.99 #0 SMP Mon Feb 22 08:04:04 2021 armv7l GNU/Linux
r7500v2 # uptime
 08:23:27 up 21 days, 13:33,  load average: 0.00, 0.02, 0.00

nice stability lately on master tho

Yeah, I get them too. I filed a bug, but it's unclear whether anyone looks at these bugs.
https://bugs.openwrt.org/index.php?do=details&task_id=3679

These too:

[162053.710365] ath10k_pci 0000:01:00.0: Invalid peer id 16 or peer stats buffer, peer: 00000000  sta: 00000000
[191856.628897] ath10k_pci 0000:01:00.0: received unexpected tx_fetch_ind event: in push mode

I also suffer from random disconnects:

What's the benefit of moving to DSA instead of switchdev? Is switchdev being deprecated in newer kernels?

You mean swconfig

Yeah, swconfig... it's a bit confusing because OpenWrt actually names swconfig as "switchdev".

Anyway, the benefits of DSA:

  • Drops custom code (that is very hacky)
  • Uses the mainline driver
  • DSA should, in theory, give better performance
  • Better documentation, instead of relying on mysterious hex values applied to the switch
  • DSA should, in theory, be easier to use, as every port is a separate interface
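
As an illustration of the last point: with DSA each switch port shows up as its own network interface, so link state can be read per port from sysfs. A minimal sketch; `port_link` is a hypothetical helper, its optional second argument exists only so it can be pointed at a different directory, and the port names in the usage note are assumptions that depend on the device.

```shell
# port_link PORT [SYSFS_NET_DIR]
# Print "up" if the interface reports carrier, "down" otherwise.
port_link() {
    dir=${2:-/sys/class/net}
    # carrier holds 1 when a link is detected. Reading it fails when the
    # interface is administratively down, which is reported as "down" too.
    if [ "$(cat "$dir/$1/carrier" 2>/dev/null)" = "1" ]; then
        echo up
    else
        echo down
    fi
}
```

For example: `for p in lan1 lan2 lan3 lan4 wan; do echo "$p: $(port_link "$p")"; done`.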

Any reason why /proc/cpuinfo won't show current CPU MHz?
It just prints BogoMIPS?

The contents of /proc/cpuinfo vary between architectures; mips and arm have never displayed the CPU frequency. And just for the avoidance of doubt, the bogomips value is indeed bogus. If you want to know the current frequency, run cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq.
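
The values in `scaling_cur_freq` are reported in kHz. A minimal sketch that prints them in MHz per CPU; `khz_to_mhz` and `show_cpu_mhz` are hypothetical helper names.

```shell
# khz_to_mhz KHZ: integer conversion from kHz to MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# show_cpu_mhz: print "cpuN: X MHz" for every CPU that exposes cpufreq.
show_cpu_mhz() {
    for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq; do
        # Skip CPUs without a readable cpufreq entry.
        khz=$(cat "$f" 2>/dev/null) || continue
        cpu=${f#/sys/devices/system/cpu/}   # strip the leading path
        cpu=${cpu%%/*}                      # keep only "cpuN"
        echo "$cpu: $(khz_to_mhz "$khz") MHz"
    done
}
```

Running `show_cpu_mhz` on the router prints one line per core.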

On the r7800 I get: 1725000

Here's (part of) a little script I use:

  echo ">>> CPU Freq Stats:"
  cat /sys/devices/system/cpu/cpufreq/policy*/stats/time_in_state
  echo ">>> Current Frequencies:"
  cat /sys/devices/system/cpu/cpufreq/policy*/*_cur_freq
  echo ">>> Temperatures:"
  cut -c1-2 /sys/devices/virtual/thermal/*/temp
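
One caveat with the `cut -c1-2` line: the thermal `temp` files hold millidegrees Celsius, so taking the first two characters only gives the right answer for readings from 10 to 99 degrees. A minimal sketch of an arithmetic alternative; `mdeg_to_c` and `show_temps` are hypothetical names, not part of the script above.

```shell
# mdeg_to_c MILLIDEG: convert millidegrees Celsius to whole degrees.
mdeg_to_c() {
    echo $(( $1 / 1000 ))
}

# show_temps: print one "<zone>: <degrees> C" line per thermal zone.
show_temps() {
    for f in /sys/devices/virtual/thermal/*/temp; do
        # Skip zones whose sensor cannot be read.
        t=$(cat "$f" 2>/dev/null) || continue
        zone=${f%/temp}     # drop the trailing "/temp"
        zone=${zone##*/}    # keep only the zone directory name
        echo "$zone: $(mdeg_to_c "$t") C"
    done
}
```

Replacing the `cut` line with a call to `show_temps` also labels each reading with its zone name.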