Netgear R7800 exploration (IPQ8065, QCA9984)

Understood, thanks.

I'm preparing to upgrade from 17.01.4 to 18.06. Because sysupgrade is not possible due to flash partition size changes, I have to do a "manual" sysupgrade.

Is this the rough outline of what I need to do? Any corrections?

  • Back up current configuration (such as with LuCI System -> Backup/Flash Firmware, Generate Archive) to another computer
  • Take a list of currently installed packages: opkg list-installed; save this on another computer
  • Download factory.img file for the release (such as netgear_r7800-squashfs-factory.img); verify its checksum
  • Boot up router in TFTP mode (see https://openwrt.org/toh/netgear/r7800), install new release image file, reboot into that image
  • Log in, reinstall packages (see list saved above) with opkg
  • Restore configuration files from saved backup, merging (where necessary) to incorporate release-to-release changes
    Hmm, maybe https://openwrt.org/docs/guide-user/installation/generic.sysupgrade has some hints, but is it stale (it seems targeted at 14.07 and 15.05)? It's also more about sysupgrade than about reinstalling and restoring configurations.
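
For the package step in the list above, a minimal sketch of how the saved list could be compared with what the fresh install already contains. The helper names and file paths are made up for illustration; the only assumption about opkg is that list-installed prints one "name - version" line per package.

```shell
#!/bin/sh
# Sketch only: helper names and file names are illustrative, not part of OpenWrt.

# Reduce saved "opkg list-installed" output ("name - version") to bare names.
pkg_names() {
    cut -d' ' -f1 "$1" | sort -u
}

# Print packages that are in the old list but not in the new one.
missing_packages() {
    old_names=$(mktemp)
    new_names=$(mktemp)
    pkg_names "$1" > "$old_names"
    pkg_names "$2" > "$new_names"
    # -v invert, -x whole-line match, -F fixed strings, -f patterns from file.
    # grep exits 1 when nothing is missing, so do not treat that as an error.
    grep -vxFf "$new_names" "$old_names" || true
    rm -f "$old_names" "$new_names"
}
```

Usage would be something like running "opkg list-installed > pkgs-old.txt" before the flash, "opkg list-installed > pkgs-new.txt" afterwards, and then "missing_packages pkgs-old.txt pkgs-new.txt". The output is a candidate list to review by hand, not something to feed blindly into opkg install, since many entries are dependencies or packages that were renamed between releases.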

Otherwise that looks ok, but you should let the new defaults take effect, e.g. in the network config. (18.06 and later create VLANs for both wan and lan, so eth0.2 and eth1.1 are the defaults.)

config switch
        option name 'switch0'
        option reset '1'
        option enable_vlan '1'

config switch_vlan
        option device 'switch0'
        option vlan '1'
        option ports '1 2 3 4 6t'

config switch_vlan
        option device 'switch0'
        option vlan '2'
        option ports '5 0t'

That also affects e.g. the SQM and BCP38 configs, and other places where interface names are used in package configs.

So although the configuration restore step is otherwise ok, I suggest that you handle /etc/config/network carefully, and also check network-related package configs separately.
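
One way to find those places is to search the restored config files for an interface name. This is only a sketch: the function name is made up, and the match rule (the name must not be directly followed or preceded by a letter, digit, underscore or dot, so that a search for eth0 does not also hit eth0.2) is my own assumption about what is useful here.

```shell
#!/bin/sh
# Sketch only: list config files that mention an exact interface name.
# The function name and the default directory are illustrative.
iface_refs() {
    name=$1
    dir=${2:-/etc/config}
    # Escape dots so that "eth0.2" is matched literally.
    pattern=$(printf '%s' "$name" | sed 's/\./\\./g')
    # -r recurse, -l print file names only, -E extended regex.
    grep -rlE -- "(^|[^[:alnum:]_.])${pattern}([^[:alnum:]_.]|\$)" "$dir" | sort
}
```

For example, "iface_refs eth0" would list the files that still use the bare name, and "iface_refs eth0.2" those that already use the new VLAN name.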

I jump between 18.06 and master and 17.01 all the time, and I do not think that there are any major hiccups.

(And as always, just install those packages that you actually want, and let them pull in all their dependencies.)

Updated everything, got it all working except for SQM with cake/piece_of_cake.
On 17.01.4, cake handled my 250Mbps/15Mbps service just fine, capped to 232750/15000.
With 18.06, I am getting only 86Mbps downloads with cake/piece_of_cake. With fq_codel/simple, and the same limits, I'm getting 222Mbps (both according to my ISP's test).
Did something go down the tubes with performance of cake/piece_of_cake in this release?

@hnyman can you update the first post with a current status that includes what is working and what isn't (on trunk)? For instance switch hardware support, the two offloading CPUs, hardware NAT acceleration, LEDs, networking and whatever else you think is worth noting.

my suggestion would be something like this:
Everything is working except:

  • hardware nat.
  • native switch qca8k
  • etc

hardware nat:

  • current trunk has software nat offloading support. hardware nat is unlikely to be implemented but x is working on it. patches here.

native switch qca8k:

  • ...

This way someone new coming to this thread will be able to have a quick overview of the current status.
I just got my R7800 yesterday to replace my old C7 and went through this entire thread, but in the end some things weren't clear to me.

Thanks a lot to you and everyone who contributed to adding support for this router to OpenWrt.

Thanks for the suggestion. I made some updates to the first message.

One question to all:
I have not tested with the Netgear OEM firmware if the current master & 18.06 (with 4 MB kernel space for kernel 4.14) can still be flashed with OEM GUI, or is TFTP needed? Has anybody tested recently?

I did just that yesterday without any problems.
Opened a just-bought R7800 box, browsed to the web interface at 192.168.1.1, and uploaded the trunk factory.img firmware file. The web interface gives a warning about flashing to an older version but allows the process to continue. It then flashes and reboots without any problems.

After enabling software NAT offloading, irqbalance, SQM (fq_codel/simple) and more aggressive CPU frequency scaling, my router crashed/rebooted twice in one day. (kids yelled, "hey what happened to the wifi?")
How/where can I configure something to get crash details to help diagnose the failure?
I can disable/enable the options in different combinations for diagnosis, but I'd like to make an educated guess about the likely cause so I can limit the number of unplanned outages.

Start with disabling flow-offloading for testing.
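
From the command line that could look roughly like the sketch below. It assumes the offloading switch is the flow_offloading option in the first "defaults" section of /etc/config/firewall, which is where 18.06 keeps it as far as I know; the function name is made up.

```shell
#!/bin/sh
# Sketch only: turn software flow offloading off via UCI.
# Assumes the option sits in the first "defaults" section of the firewall config.
disable_flow_offloading() {
    # Quoted so the shell does not treat [0] as a glob pattern.
    uci set "firewall.@defaults[0].flow_offloading=0"
    uci commit firewall
}
```

After calling disable_flow_offloading, the firewall still has to be restarted (/etc/init.d/firewall restart) for the change to take effect. Setting the value back to 1 re-enables it once testing is done.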

@hnyman, related to the updated first post, are you sure the freq scaling, etc. is working?
I see that blogic has closed dissent1's PR (#632), but I cannot find it in the GitHub logs.
Or was it all included when IPQ806x was moved to kernel 4.14 by R. Jangir?

CPU frequency scaling has worked since early 2017

EDIT:
The core part of CPU frequency scaling was already implemented in late 2016. See the first core commits here:
https://github.com/openwrt/openwrt/commits/b2135f3ba5f0ce3bac39620bcdfd46a0225727b5

and then in March 2017 I re-enabled the independent scaling of cores with
https://github.com/openwrt/openwrt/commit/3f9eadf599e7d44fe5c3e4c4652334dda4c6d88f#diff-3e975c9f8ed838b203874eac61a79f18

Is it normal for BogoMIPS to differ from one CPU to the other (26.95 and 56.15) when CPU scaling is disabled (by compiling the kernel with ondemand disabled and performance as the default governor)?

root@gateway:~# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 26.95
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 56.15
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Generic DT based system
Revision        : 0000
Serial          : 0000000000000000
root@gateway:~# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
1725000
root@gateway:~# cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq
1725000
root@gateway:~#

BogoMIPS really are bogus. There have been attempts to remove them completely; the only reason they had to be reintroduced was to keep old/broken userland 'working'.
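
A more direct check than BogoMIPS is to read the governor and the current frequency per core from sysfs, as the cat commands in the listing above already do for the frequency. A minimal sketch, with a made-up function name; the optional argument only exists so it can be pointed at a different directory tree.

```shell
#!/bin/sh
# Sketch only: print "cpuN governor frequency_in_kHz" for each core.
# The optional argument is an alternative base directory (illustrative).
show_cpufreq() {
    base=${1:-/sys/devices/system/cpu}
    for dir in "$base"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}   # strip the trailing /cpufreq
        cpu=${cpu##*/}        # keep only the cpuN part
        echo "$cpu $(cat "$dir/scaling_governor") $(cat "$dir/cpuinfo_cur_freq")"
    done
}
```

If both cores report the same governor and the same frequency, the differing BogoMIPS values are nothing to worry about.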

I decided to test the performance I get with cake - layered_cake. This is what I get on my Archer C7 v2 after flashing blogic's mac80211 patch build:


This is what I get on my R7800 with blogic's patch:


Both tests were done with only the internet connection set up and sqm configured (no wifi, apps, etc.)

I've always had a feeling that the speeds I get on my R7800 have been very wonky, and the Archer C7 v2 should definitely not get better speeds than the R7800 considering the difference in CPU power. Is it possible that ath79 provides enough of a performance boost to enable the Archer to outperform the R7800?

You might test tweaking the CPU frequency scaling ramp-up parameters in the kernel.
See

I tried:

echo 35 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
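
For repeated testing, the two writes above can be wrapped in a small helper. This is a sketch only: the function name is made up, and the existence check assumes the ondemand directory is only present while the ondemand governor is in use.

```shell
#!/bin/sh
# Sketch only: apply the two ondemand tunables shown above.
# Third argument is an alternative tunables directory (illustrative).
set_ondemand() {
    up=$1
    down=$2
    dir=${3:-/sys/devices/system/cpu/cpufreq/ondemand}
    # Bail out if the tunables are not there (e.g. another governor is active).
    [ -d "$dir" ] || { echo "ondemand tunables not found in $dir" >&2; return 1; }
    echo "$up" > "$dir/up_threshold"
    echo "$down" > "$dir/sampling_down_factor"
}
```

"set_ondemand 35 10" reproduces the values tried here. The settings do not survive a reboot, so to keep them the call would have to go somewhere like /etc/rc.local.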

The speeds improved, but they're still on the low side (I have a 500/500 line):


I also tried irqbalance together with the changes above without any improvement (a decrease in upload speed in fact):


root@telia:~# cat /proc/interrupts 
           CPU0       CPU1       
 16:      12515      26923     GIC-0  18 Edge      gp_timer
 18:         33          0     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Edge      ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 28:      17033        447     GIC-0  67 Edge      qcom-pcie-msi
 29:      20342       5607     GIC-0  89 Edge      qcom-pcie-msi
 30:     121442          0     GIC-0 202 Edge      adm_dma
 32:     391151        395     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 35:          0          0   PCI-MSI   0 Edge      aerdrv
 36:      17033        447   PCI-MSI   1 Edge      ath10k_pci
 68:          0          0   PCI-MSI   0 Edge      aerdrv
 69:      20342       5607   PCI-MSI   1 Edge      ath10k_pci
101:         12          0     GIC-0 184 Level     msm_serial0
102:          2          0   msmgpio   6 Edge      gpio-keys
103:          2          0   msmgpio  54 Edge      gpio-keys
104:          2          0   msmgpio  65 Edge      gpio-keys
105:          0          0     GIC-0 142 Level     xhci-hcd:usb1
106:          0          0     GIC-0 237 Level     xhci-hcd:usb3
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:       9839       5868  Rescheduling interrupts
IPI3:         33      33533  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:      11286       6782  IRQ work interrupts
IPI6:          0          0  completion interrupts
Err:          0
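
In the listing above, nearly all eth1 interrupts land on CPU0 (391151 vs 395). To compare that split before and after a change such as enabling irqbalance, the counts for one device can be summed with a small helper. A sketch only: the function name is made up, and it assumes a two-core layout like the one shown, with the CPU0 count in the second column and the CPU1 count in the third.

```shell
#!/bin/sh
# Sketch only: sum per-CPU interrupt counts for one device name.
# Assumes two cores: CPU0 count in column 2, CPU1 count in column 3.
irq_counts() {
    name=$1
    file=${2:-/proc/interrupts}
    awk -v name="$name" '
        # The device name is the last field; a device may own several IRQ lines.
        $NF == name { cpu0 += $2; cpu1 += $3 }
        END { printf "%d %d\n", cpu0, cpu1 }
    ' "$file"
}
```

"irq_counts eth1" prints the two totals for the wired interface, and "irq_counts ath10k_pci" adds up both radios. Taking the numbers before and after a speed test shows which core actually did the work.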

Just to add my experience - Today I bought an R7800, went to the device's 192.168.1.1 address, and used their web interface to flash 18.06.1 (the factory.img file). Worked perfectly. There is a warning about "downloading to an older firmware" and you just click through that.

Thanks all.

I've been part of this thread for a while now and have really enjoyed learning and even helping some.

I recently moved to Michigan from Florida. In Michigan I now have a fiber connection (1000Mbps up and down) which is just incredible. However, I have now realized the limitation of our version of OpenWrt for the R7800. I cannot break 300Mbps over WiFi no matter what I do.

I switched to the Netgear stock firmware and easily hit 900Mbps.
I tried dd-wrt and easily hit 800Mbps.

So, I really want to help solve this for OpenWrt and I know that we have gotten close a few times. Do we know where the problem currently resides? Where is a good place to start looking for repairs in this area? Is it the Atheros firmware? I also realize that the number 1 contributor to this version of OpenWrt (@hnyman) doesn't have a super highspeed connection to test with so we might need some other peeps to step up if they can help.

Let's solve this if we can!

Thanks all.

The vendor firmware offloads most of the firewalling, NATing and routing to two dedicated 800 MHz NSS/NPU cores. Support for these (hardware flow-offloading) is not available in OpenWrt so far, and the main SoC hits its limits around 350-400 Mbit/s.

In contrast, mvebu does have the performance to route at 1 Gbit/s line speed in software, while mt7621 has hardware flow-offloading support available to keep up with the easy cases.

What is dd-wrt doing to get past this?