Netgear R7800 exploration (IPQ8065, QCA9984)

OpenWrt 21.02 will not have DSA for IPQ806x
kernel 5.10 and DSA will be in the next release, possibly 22.0x ...

Anyone seeing these errors in dmesg? They seem to have started recently, running OpenWrt 21.02-SNAPSHOT, r16268-750b966866:

[665492.586429] ath10k_pci 0000:01:00.0: failed to lookup txq for peer_id 109 tid 2

When this happens the client gets disconnected.

Hi, hope this isn't off topic.

I'm currently on the 19.07.7 r11306-c4a6851c72 release on my R7800 router. It's been quite nice so far. I noticed that a new 21.02.0 release came out. Before flashing, I wanted to make sure it's stable (I'll be going away for a while, so any issues will be hard for me to fix).

Should I stick to the official 21.x release or use another build (hnyman or Kong builds)? I ask because, on rare occasions when many devices are connected and streaming, there seem to be interruptions.

Thanks :smiley:

I had some issues with radio0 and radio1 (2.4GHz/5GHz) after sysupgrade from 19.07 to 21.02, which led to many of my devices not connecting wirelessly. I'm a basic user, still learning, and not sure if I messed something up when I initially chose WPA3 as encryption. I quickly noticed that some of my devices didn't support WPA3 and changed back to WPA2, but they still didn't manage to connect to the router, yet my PC and Android phone could. I tried several resets without keeping the configuration, but it just didn't work. Long story short, I ended up flashing the factory image via TFTP following the guide in this thread and it works flawlessly so far :grinning: :+1:

Just as an FYI, the upstream of ath10k-ct fixed the VHT160 mess. I submitted a pull request (https://github.com/openwrt/openwrt/pull/4639), but it's an easy change if you want to fix your builds in the meantime.

26 posts were split to a new topic: R7800 SQM performance

I run DSA + master (a build about 2 months old). Everything is very solid; however, my Chromecast stopped connecting to the internet. It's connected to the router over wifi but says it can't access the internet.
All other devices are working fine so I'm a bit puzzled why this happens.

The Chromecast is not discovered on the LAN either, so I can't cast to it but it responds to ping.

Anyone had this issue recently?

(I tried connecting it to another wifi network and it worked fine, so it's something related to OpenWrt)

UPDATE:
I think I found the issue: I block public DNS access in the firewall and have been doing so for a very long time. It seems the Chromecast is using its own settings for DNS, because it started to work when I disabled that rule... Must be some Chromecast FW upgrade.

i used to run DSA but went back to swconfig when i tried out the NSS builds. i've just recently gone back to something much closer to master (i got tired of rebasing the NSS patches in all the time, plus my link speed hardly needs it). do you find DSA to be faster or better in any other way?

i was going to wait until the patch set got merged but maybe i'll try it out.

It should be faster in theory but in reality it's by a small margin if I remember correctly (i did some testing a while back), @Ansuel made some more changes since the build I'm running tho' so things might have improved.

I'm running DSA now because we'll all end up running it eventually anyway.

The qca8k patches are merged upstream, so now only a small PR is required.

Two days ago I made r17855-a1939e7e37 test builds of both the normal swconfig version and the DSA version.

Ps. currently there are conflicts in PR4036 due to dts changes by https://github.com/openwrt/openwrt/commit/70c12d26ca6eb01a938feb38f89720d78df0ca6d
(new ipq8065 device XR500, updated sister to R7800)

they already merged that?


ok it's literally the last commit... so yes, i will have to refresh the pr. easy to do, as in theory the switch definition should be in the dtsi


@hnyman i rebased the pr with the new version :smiley:

@Ansuel

Not related to DSA, but maybe to the 5.10:

I have recorded the latency to the ISP nexthop both from R7800 itself and from a dumb AP (RT3200, Mediatek 7622) behind R7800. The latency is really small thanks to fiber/FTTH/something, so the small variations are quite visible. The interesting thing is that there seems to be a performance hit in master compared to 21.02.

The RT3200 has had a master build the whole time, but the R7800 ran my 21.02 community build a few times during the last month, and when looking at the ping chart at the RT3200, there is a noticeable 0.3 ms latency increase with master compared to 21.02.

Curiously, the difference is more clearly visible (and uniform) when monitored from the dumb AP that gets routed via the R7800. On the R7800 itself the impact is maybe 0.2 ms. Maybe there is a performance hit with both outgoing and incoming packets, so the routed traffic gets a doubled effect.

From RT3200, dumb AP:

From R7800 itself, connection to ISP:

I also had many issues with master vs 21.02.
Most of my VLANs on BATMAN-adv stopped working properly after the kernel 5.10 change (I have not poked around to see why that happened),
but I was able to go back to 21.02 and things returned to normal. I'm using a Compex WPQ8065 with some PCIe WLAN cards; the port is very similar to the R7800, so the issues could be related.

I've had the same issue a few times on older builds. Things that fix it for me:

  1. toggle igmp snooping on the lan network
  2. toggle spanning tree protocol on the lan network

I have no idea why that works, but it fixes the same issues with my Chromecast devices.
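
For anyone who prefers the shell over LuCI, the two toggles above can be sketched roughly like this. Treat it as an illustration under assumptions: `run` and `set_bridge_opt` are made-up helper names, and the uci section that holds the bridge options depends on the release (`network.lan` on 19.07-style configs, a `config device` section for `br-lan` such as `network.@device[0]` on 21.02 and later). Check `uci show network` before using it.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: set one bridge option and reload the network.
# $1: uci section holding the bridge options (assumption, see above)
# $2: option name (igmp_snooping or stp)
# $3: 0 to disable, 1 to enable
set_bridge_opt() {
    run uci set "$1.$2=$3"
    run uci commit network
    run /etc/init.d/network reload
}

# Example (not executed here):
#   set_bridge_opt network.lan igmp_snooping 1
#   set_bridge_opt network.lan stp 1
```

Setting `DRY_RUN=1` first prints the three commands instead of running them, which is a cheap way to check the section name before touching the config.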

I'm also running an R7800 with all DNS requests forwarded to stubby running locally (pointing to Cloudflare's DNS-over-TLS service). I'm using an iptables rule to redirect plaintext DNS traffic and block any other secure DNS ports / hosts.
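
Rules of that kind can be sketched roughly as follows. This is only an illustration under assumptions: an iptables-based firewall (as on 21.02), `br-lan` as the LAN bridge, and a local resolver listening on port 53 of the router. `run` and `dns_rules` are made-up names, and only DNS-over-TLS port 853 is rejected here; blocking DoH hosts would need a host list, which is left out.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: redirect plaintext DNS from LAN clients to the router
# and reject forwarded DNS-over-TLS.
# $1: LAN interface (br-lan is an assumption)
dns_rules() {
    for proto in udp tcp; do
        # Any LAN query to port 53, whatever its destination, lands on the router.
        run iptables -t nat -A PREROUTING -i "$1" -p "$proto" --dport 53 \
            -j REDIRECT --to-ports 53
    done
    # Clients trying DoT directly (TCP 853) get rejected.
    run iptables -A FORWARD -i "$1" -p tcp --dport 853 -j REJECT
}

# Example (not executed here): dns_rules br-lan
```

A redirect like this answers clients that hardcode a public resolver instead of dropping their queries, so they keep working while still going through the local resolver.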

The real problem with this kind of thing is understanding whether the latency increase comes from the target or from some generic patches. I mean, on the target side, in theory things should not have changed that much with 5.10.

By the way, ath10k can run in non-MSI mode, which lets you set IRQ affinity or use irqbalance:

cat /etc/modules.d/ath10k-ct
ath10k_pci irq_mode=1

then reboot

[   21.743862][  T231] ath10k_pci 0001:01:00.0: pci irq legacy oper_irq_mode 1 irq_mode 1 reset_mode 0

set affinity:

echo 2 > /proc/irq/36/smp_affinity
...
 36:     768949      16625     GIC-0  90 Level     ath10k_pci

Any possible drawbacks from not using MSI interrupts?

Oh... thanks for the info, might explain why some managed to get GIC-0 mode for ath10k_pci.
I stated before that it wasn't possible but some people said it was. I never got an explanation why that was, I thought it was because of different revisions of the router.

Some speculation...
From what I understand, MSI should lead to fewer lockups etc. However, I doubt that is much of a problem for a router like the R7800?! Also, the ARM architecture is a bit different.
If we control what CPU wifi uses, in theory it should/could lead to better performance if you do it right, at least according to my logic. :slight_smile:

Oops, forgot to mention that this was done on a TP-Link C2600, so your experience may vary.

Works on the R7800 too, I tried. But the IRQ may differ tho'
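
Since the IRQ number can differ per device, it can be looked up from /proc/interrupts rather than hardcoded. A minimal sketch: the helper names `find_irq`, `cpu_mask` and `pin_irq` are made up here, and it assumes the handler name is the last column of its /proc/interrupts line, as in the output quoted above.

```shell
#!/bin/sh
# Print the IRQ number(s) whose last column equals the given handler name.
# $1: handler name (e.g. ath10k_pci)
# $2: interrupts file, defaults to /proc/interrupts
find_irq() {
    awk -v name="$1" '$NF == name { sub(":", "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Convert a zero-based CPU index into the hex bitmask that smp_affinity expects
# (CPU0 -> 1, CPU1 -> 2).
cpu_mask() {
    printf '%x\n' "$((1 << $1))"
}

# Pin every IRQ of the named handler to a single CPU (needs root).
# $1: handler name, $2: zero-based CPU index
pin_irq() {
    for irq in $(find_irq "$1"); do
        cpu_mask "$2" > "/proc/irq/$irq/smp_affinity"
    done
}

# Example (not executed here): pin_irq ath10k_pci 1
```

With IRQ 36 as in the C2600 output, `pin_irq ath10k_pci 1` writes the same value as `echo 2 > /proc/irq/36/smp_affinity`.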