R7800 performance

Okay, it crashed again just now, but I could not find anything specific in the log files at the date and time the device crashed. There is only something else from during the night, which does not seem to be related. Could someone take a look? Thanks again, and sorry for being annoying :(

Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.606952] print_req_error: I/O error, dev mtdblock0, sector 0
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.607647] print_req_error: I/O error, dev mtdblock0, sector 8
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.612338] print_req_error: I/O error, dev mtdblock0, sector 16
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.618362] print_req_error: I/O error, dev mtdblock0, sector 24
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.633682] print_req_error: I/O error, dev mtdblock0, sector 0
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.633703] Buffer I/O error on dev mtdblock0, logical block 0, async page read
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.718530] print_req_error: I/O error, dev mtdblock0, sector 0
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.718556] Buffer I/O error on dev mtdblock0, logical block 0, async page read
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.723808] print_req_error: I/O error, dev mtdblock1, sector 0
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.730956] print_req_error: I/O error, dev mtdblock1, sector 8
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.736929] print_req_error: I/O error, dev mtdblock1, sector 16
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.742758] print_req_error: I/O error, dev mtdblock1, sector 24
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.753909] Buffer I/O error on dev mtdblock1, logical block 0, async page read
Mon Nov 12 00:45:56 2018 kern.err kernel: [   13.794426] Buffer I/O error on dev mtdblock1, logical block 0, async page read

Mon Nov 12 00:45:56 2018 kern.info kernel: [   26.355073] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
Mon Nov 12 00:45:56 2018 kern.warn kernel: [   26.527894] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
Mon Nov 12 00:45:56 2018 kern.warn kernel: [   26.527934] ath10k_pci 0001:01:00.0: Falling back to user helper
Mon Nov 12 00:45:56 2018 kern.err kernel: [   26.743892] firmware ath10k!QCA9984!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
Mon Nov 12 00:45:56 2018 kern.info kernel: [   26.744216] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
Mon Nov 12 00:45:56 2018 kern.info kernel: [   26.751760] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
Mon Nov 12 00:45:56 2018 kern.info kernel: [   26.764625] ath10k_pci 0001:01:00.0: firmware ver 10.4-3.5.3-00053 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps crc32 4c56a386
Mon Nov 12 00:45:56 2018 kern.info kernel: [   29.038913] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 dd6d039c
Mon Nov 12 00:45:56 2018 kern.info kernel: [   34.948468] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038016] ath: EEPROM regdomain: 0x0
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038031] ath: EEPROM indicates default country code should be used
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038041] ath: doing EEPROM country->regdmn map search
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038060] ath: country maps to regdmn code: 0x3a
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038073] ath: Country alpha2 being used: US
Mon Nov 12 00:45:56 2018 kern.debug kernel: [   35.038084] ath: Regpair used: 0x3a

[   17.488180] NET: Registered protocol family 24
[   17.492263] PPTP driver version 0.8.5
[   17.510314] ath10k_pci 0000:01:00.0: assign IRQ: got 67
[   17.510917] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[   17.510996] ath10k_pci 0000:01:00.0: enabling bus mastering
[   17.511443] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   17.691347] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
[   17.691392] ath10k_pci 0000:01:00.0: Falling back to user helper
[   17.720674] firmware ath10k!QCA9984!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[   18.078298] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   18.078349] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[   18.091880] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.5.3-00053 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps crc32 4c56a386
[   20.366121] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 dd6d039c
[   26.254157] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[   26.347933] ath: EEPROM regdomain: 0x0
[   26.347948] ath: EEPROM indicates default country code should be used
[   26.347959] ath: doing EEPROM country->regdmn map search
[   26.347977] ath: country maps to regdmn code: 0x3a
[   26.347991] ath: Country alpha2 being used: US
[   26.348002] ath: Regpair used: 0x3a
[   26.353226] ath10k_pci 0001:01:00.0: assign IRQ: got 100
[   26.354259] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[   26.354406] ath10k_pci 0001:01:00.0: enabling bus mastering
[   26.355073] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   26.527894] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
[   26.527934] ath10k_pci 0001:01:00.0: Falling back to user helper
[   26.743892] firmware ath10k!QCA9984!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[   26.744216] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   26.751760] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[   26.764625] ath10k_pci 0001:01:00.0: firmware ver 10.4-3.5.3-00053 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps crc32 4c56a386
[   29.038913] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 dd6d039c
[   34.948468] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[   35.038016] ath: EEPROM regdomain: 0x0
[   35.038031] ath: EEPROM indicates default country code should be used
[   35.038041] ath: doing EEPROM country->regdmn map search
[   35.038060] ath: country maps to regdmn code: 0x3a
[   35.038073] ath: Country alpha2 being used: US
[   35.038084] ath: Regpair used: 0x3a
[   35.046985] kmodloader: done loading kernel modules from /etc/modules.d/*
[   39.360354] print_req_error: 14 callbacks suppressed
[   39.360361] print_req_error: I/O error, dev mtdblock0, sector 0
[   39.365157] print_req_error: I/O error, dev mtdblock0, sector 8
[   39.370565] print_req_error: I/O error, dev mtdblock0, sector 16
[   39.376509] print_req_error: I/O error, dev mtdblock0, sector 24
[   39.383225] print_req_error: I/O error, dev mtdblock0, sector 0
[   39.388182] Buffer I/O error on dev mtdblock0, logical block 0, async page read
[   39.399839] print_req_error: I/O error, dev mtdblock0, sector 0
[   39.401113] Buffer I/O error on dev mtdblock0, logical block 0, async page read
[   39.409406] print_req_error: I/O error, dev mtdblock1, sector 0
[   39.414918] print_req_error: I/O error, dev mtdblock1, sector 8
[   39.420708] print_req_error: I/O error, dev mtdblock1, sector 16
[   39.426678] print_req_error: I/O error, dev mtdblock1, sector 24
[   39.433121] Buffer I/O error on dev mtdblock1, logical block 0, async page read
[   39.441850] Buffer I/O error on dev mtdblock1, logical block 0, async page read
[   40.434402] Generic PHY fixed-0:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=fixed-0:01, irq=POLL)
[   40.435384] dwmac1000: Master AXI performs any burst length
[   40.443307] ipq806x-gmac-dwmac 37400000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[   40.448944] ipq806x-gmac-dwmac 37400000.ethernet eth1: registered PTP clock
[   40.457941] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   40.469290] br-lan: port 1(eth1.1) entered blocking state
[   40.470466] br-lan: port 1(eth1.1) entered disabled state
[   40.476309] device eth1.1 entered promiscuous mode
[   40.481315] device eth1 entered promiscuous mode
[   40.488717] IPv6: ADDRCONF(NETDEV_UP): br-lan: link is not ready
[   40.507627] Generic PHY fixed-0:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=fixed-0:00, irq=POLL)
[   40.508599] dwmac1000: Master AXI performs any burst length
[   40.516729] ipq806x-gmac-dwmac 37200000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   40.522073] ipq806x-gmac-dwmac 37200000.ethernet eth0: registered PTP clock
[   40.531627] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   40.541388] IPv6: ADDRCONF(NETDEV_UP): eth0.2: link is not ready
[   41.513944] ipq806x-gmac-dwmac 37400000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[   41.529860] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   41.530200] br-lan: port 1(eth1.1) entered blocking state
[   41.535027] br-lan: port 1(eth1.1) entered forwarding state
[   41.577173] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[   41.593907] ipq806x-gmac-dwmac 37200000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[   41.593970] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   41.614916] IPv6: ADDRCONF(NETDEV_CHANGE): eth0.2: link becomes ready
[   48.338796] IPv6: ADDRCONF(NETDEV_UP): wlan1: link is not ready
[   54.450130] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   54.455951] br-lan: port 2(wlan1) entered blocking state
[   54.456002] br-lan: port 2(wlan1) entered disabled state
[   54.460992] device wlan1 entered promiscuous mode
[   54.476939] br-lan: port 3(wlan0) entered blocking state
[   54.476965] br-lan: port 3(wlan0) entered disabled state
[   54.481487] device wlan0 entered promiscuous mode
[   54.852974] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[   54.853239] br-lan: port 2(wlan1) entered blocking state
[   54.858540] br-lan: port 2(wlan1) entered forwarding state
[   55.234089] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   55.234420] br-lan: port 3(wlan0) entered blocking state
[   55.239561] br-lan: port 3(wlan0) entered forwarding state

Update:

The current version has been stable for 4-5 days, so I will not change anything for now :)

Firmware Version OpenWrt SNAPSHOT r8450-a95bef0579 / LuCI Master (git-18.316.62517-99a696d)
Kernel Version 4.14.79
Uptime 4d 15h 39m 35s

So I am running build OpenWrt-r8463-59ff8687c7 with irqbalance in /etc/rc.local, but the processor statistics show that most tasks are still processed by the first (default?) core:

[Screenshot: OpenWrt - Processor - LuCI, 2018-11-17 16:46:55]

Is there something wrong?

What does /proc/interrupts look like in that case? Are the interrupts still being handled by one core?

           CPU0       CPU1
 16:      41612     111819     GIC-0  18 Edge      gp_timer
 18:         33          0     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Edge      ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 28:     180811      18578     GIC-0  67 Edge      qcom-pcie-msi
 29:     188100      85549     GIC-0  89 Edge      qcom-pcie-msi
 30:     205667         25     GIC-0 202 Edge      adm_dma
 31:       3798      11330     GIC-0 255 Level     eth0
 32:        131         26     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 35:          0          0   PCI-MSI   0 Edge      aerdrv
 36:     180811      18578   PCI-MSI   1 Edge      ath10k_pci
 68:          0          0   PCI-MSI   0 Edge      aerdrv
 69:     188100      85549   PCI-MSI   1 Edge      ath10k_pci
101:         10          0     GIC-0 184 Level     msm_serial0
102:          2          0   msmgpio   6 Edge      gpio-keys
103:          2          0   msmgpio  54 Edge      gpio-keys
104:          2          0   msmgpio  65 Edge      gpio-keys
105:          0          0     GIC-0 142 Level     xhci-hcd:usb1
106:          0          0     GIC-0 237 Level     xhci-hcd:usb3
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:      35221      75618  Rescheduling interrupts
IPI3:         36      20041  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:      49216      85866  IRQ work interrupts
IPI6:          0          0  completion interrupts
Err:          0

A cursory conclusion would be that the irqs are still not being balanced properly, with core #0 receiving the bulk of interrupts from the devices that are most active.
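To put a number on that imbalance, here is a minimal sketch that computes each CPU's share of every busy IRQ from /proc/interrupts text. It is only an illustration: the function name per_cpu_share and the SAMPLE excerpt (a few rows copied from the output above) are my own choices, and it can be run on a PC against pasted output if the router has no Python installed.

```python
def per_cpu_share(text):
    """Map "<irq> <name>" to the fraction of interrupts handled by each CPU."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shares = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip rows without one counter per CPU (e.g. the "Err:" row).
        if len(fields) < n_cpus or not all(f.isdigit() for f in fields[:n_cpus]):
            continue
        counts = [int(f) for f in fields[:n_cpus]]
        total = sum(counts)
        if total == 0:
            continue  # idle IRQ, nothing to balance
        shares[f"{label.strip()} {fields[-1]}"] = [c / total for c in counts]
    return shares


# Excerpt of the /proc/interrupts output posted above.
SAMPLE = """\
           CPU0       CPU1
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 28:     180811      18578     GIC-0  67 Edge      qcom-pcie-msi
 30:     205667         25     GIC-0 202 Edge      adm_dma
 31:       3798      11330     GIC-0 255 Level     eth0
Err:          0
"""

if __name__ == "__main__":
    for name, share in per_cpu_share(SAMPLE).items():
        print(name, " ".join(f"{s:.1%}" for s in share))
```

On the figures above, adm_dma and the first qcom-pcie-msi line land almost entirely on CPU0, while eth0 is mostly on CPU1.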

I am in the middle of debugging a similar situation myself.

It would seem so. Hopefully, someone will provide a fix soon.

It's hard to argue with the raw figures you have in your rrd graphs - but I wonder whether a lot of this isn't architecture specific.

Looking at a couple of ARM-based boards, I'm seeing similar imbalances in the IRQs raised, where most of the device IRQs are raised against a single core. However, in those cases it looks like at least some of the work is being rescheduled later (presumably via the rescheduling interrupts).

I suppose the test would be to attempt to actually max out one of the cores - and make sure the load was redistributed properly at that point - certainly it looks like at low levels there's a supervisory function assumed by the first core.

FYI: @cannesahs has a few interesting notes to help improve latency here: Netgear R7800 exploration (IPQ8065, QCA9984)

Today I tested a Linksys EA8500 (same hardware as the R7800, but with a 1.4 GHz CPU instead of your 1.7 GHz),
with OpenWrt 18.06.1,
default settings + light tuning (net buffers).
Static address, NAT, 2 PCs (the 1st on the LAN, the 2nd on the WAN),
port forwarding for the iperf port (for tests in both directions).
A few simple firewall rules (for ssh, ipsec).

Tests: iperf and ftp (the ftp test uses passive mode).

Iperf (tcp, 2 streams, 250K buffers):
(WAN to LAN; roughly the same for LAN to WAN)

without software offloading,
default settings for CPU governor & power management :
540-560 Mbits/sec. (~70-80 %sirq)

with software offloading,
default settings for CPU governor & power management :
635-650 Mbits/sec. (but less %sirq)

without any software offloading,
optimized settings for CPU governor & power management :
870-900 Mbits/sec. (~50-65 %sirq)

And 900 Mbits/sec. isn't 100% load; the router may manage more (for example, in duplex).

ftp, 1 stream, without any software offloading, WAN<->LAN (both directions),
default settings for CPU governor & power management :
65-70 Mbytes/sec.
optimized settings for CPU governor & power management :
95-103 Mbytes/sec.

Next test: routing disabled, WiFi AP only; speed is limited by the WAN link to the other router (100/100 Mbits).
Even under this light load, download to the wifi client was ~90 in both cases, but upload was worse with the default CPU governor settings.

Optimized settings for CPU governor & power management:

  1. Settings for the ondemand governor:
     35 for /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
     (with up_threshold = 30 or 40 I did not detect any difference.)
     10 for /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

  2. Or set the performance governor; no additional settings are needed.
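The two options above could be applied roughly as in the sketch below. The ondemand paths and values are the ones quoted in this post; the per-CPU scaling_governor files are the standard Linux cpufreq sysfs interface. The function names and the root parameter are my own (root exists only so the code can be tried against a scratch directory), so treat this as an assumption-laden sketch rather than a tested recipe for the EA8500 or R7800. On a real router it needs root, and the same writes can be done with echo from /etc/rc.local.

```python
import os

# Values quoted in the post above; paths are relative to the filesystem root.
ONDEMAND_TUNABLES = {
    "sys/devices/system/cpu/cpufreq/ondemand/up_threshold": "35",
    "sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor": "10",
}


def set_governor(name, root="/"):
    """Write `name` to every cpuN/cpufreq/scaling_governor; return paths written."""
    cpu_dir = os.path.join(root, "sys/devices/system/cpu")
    written = []
    for entry in sorted(os.listdir(cpu_dir)):
        # Only cpu0, cpu1, ...; skip siblings such as cpufreq/ or cpuidle/.
        if not (entry.startswith("cpu") and entry[3:].isdigit()):
            continue
        path = os.path.join(cpu_dir, entry, "cpufreq", "scaling_governor")
        if os.path.exists(path):
            with open(path, "w") as f:
                f.write(name)
            written.append(path)
    return written


def tune_ondemand(root="/"):
    """Option 1: select ondemand, then apply the tunables quoted above."""
    set_governor("ondemand", root)
    for rel_path, value in ONDEMAND_TUNABLES.items():
        with open(os.path.join(root, rel_path), "w") as f:
            f.write(value)
```

Option 2 is then just set_governor("performance"). The governor is selected before the tunables are written because the ondemand directory is normally only present while that governor is active.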

Conclusion:
The CPU governor & power management settings are
very important for this CPU on high-speed links.
The default CPU governor & power management settings in OpenWrt 18.06.1 (and 17.01.xx too) are very poor for any router based on IPQ80xx or any other CPU with advanced power & frequency management.
Yes, I know that these values are also used in many other "stock kernels", but they are not suited to routers & firewalls (nor to many other specialized devices)!

Changing from one SoC/CPU frequency to another, or from a power-saving sleep state (e.g. C1E on the x86 Intel side) to active, takes ages compared to the time data transfer needs.

Try leaving only two available frequencies for ondemand to pick from, if you really want to use it.

Under heavy-ish or moderate load it doesn't matter so much, as the CPU will stay active; but in "normal" use there is enough time between computing / IRQ wake-ups to put the CPU into power saving, and then waking up for the first packet of a burst takes too long. Ondemand tuning and the performance governor can reduce the "issue", but won't make it go away fully.

And latencies also matter for getting full line speed on a single TCP connection if the TCP window isn't big enough.
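To illustrate the window/latency point: a single TCP connection cannot move more than one window of data per round trip, so its ceiling is roughly window size divided by RTT. The sketch below just does that arithmetic. The 64 KiB window and the 1 ms / 1.5 ms round-trip times are made-up illustrative numbers, not measurements from any router in this thread.

```python
def max_tcp_throughput_mbps(window_bytes, rtt_seconds):
    """Upper bound for one TCP connection: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


if __name__ == "__main__":
    window = 64 * 1024  # hypothetical 64 KiB window
    # Hypothetical RTTs: 1 ms, and 1 ms plus 0.5 ms of extra wake-up latency.
    for rtt in (0.001, 0.0015):
        print(f"RTT {rtt * 1000:.1f} ms -> "
              f"{max_tcp_throughput_mbps(window, rtt):.0f} Mbit/s ceiling")
```

With these example numbers, half a millisecond of added latency lowers the ceiling from about 524 Mbit/s to about 350 Mbit/s, which is why wake-up delays show up in single-stream tests unless the window is large.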

Yes, imho that's the reason to enable "performance" mode.

It's only microseconds for some packets,
yet frequency tuning and the performance governor give me a 1.5-2x difference on real traffic.

" without software offloading,
default settings for CPU governor & power management :
540-560 Mbits/sec. (~70-80 %sirq)

without any software offloading,
optimized settings for CPU governor & power management :
870-900 Mbits/sec. (~50-65 %sirq)"

Could someone please explain if it's feasible to use the ondemand scheduler in combination with setting the scaling governor as well as setting the maximum cpu frequency as advised here: Netgear R7800 exploration (IPQ8065, QCA9984)

I'm having a hard time breaking 100 Mbps through NAT with the two R7800s that I have. Right after I reboot the router, I get about 600-700 Mbps. However, as soon as the router has been up for a minute, the performance drops below 100 Mbps. I've tried with a Linksys WRT3200ACM and consistently get 900 Mbps, but of course the R7800 radios work way better than the WRT ones, so I'd like to stick with the R7800.

I see people on here complaining of "bad" performance around 300 Mbps, which I would love to get consistently. Does anyone have suggestions on what I might be doing wrong? Some more context:

  • OpenWrt 18.06.4 r7808-ef686b7292
  • iperf3 to the router from the LAN gets 900 Mbps, so it seems NAT related
  • iperf3 from the router to the WAN gets terrible performance (though if I plug in a computer or a linksys router, the WAN gives fantastic performance)
  • I have 10 firewall rules and 4 port forwards
  • 4 static routes to /20 IPv4 prefixes for other routers on the LAN side, and typically ~20 devices showing in arp -a
  • In steady state conntrack -L | wc -l ranges between 150-400 connections, about half look like DNS (port 53).
  • Just two firewall zones (lan and wan)
  • Top does not show much CPU load even when saturating NAT at ~93 Mbps.
  • I've tried all three of "Software based offloading" checked, both Software and hardware checked, and none checked, and it doesn't seem to make much difference to performance or CPU utilization.

Thanks for any suggestions.
Under the firewall menu, I have something that says "Routing/NAT Offloading", but no check box. Next to it reads "Experimental feature not fully compatible with QoS/SQM". Given that I have near line-rate upstream, I really don't care about QoS, but don't see any place to disable it.

Any suggestions on how I might at least get to 300 Mbps or whatever people on here consider the bad performance? I have played with CPU governors and such, but as expected that stuff doesn't matter because something else must be going on as my CPU shows plenty of idle time.

Did you try 19.07? There was a performance affecting bug fixed back in May.
You would have to compile 19.07 this time.

I was able to get ~750 Mbps without SQM.

I tried the snapshot and 18.06.4, and neither can break 100 Mbps. How do I get 19.07?

If the snapshot was from within the last couple of months, then it is the same as 19.07.

Looks very suspicious, as this is almost the maximum for a 100 Mbps link: are you sure all your devices are negotiating a 1 Gbps connection and not 100 Mbps? Maybe a bad cable/port?
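For reference, this is the arithmetic behind "almost max for a 100 Mbps link", as a small sketch. It assumes plain Ethernet framing with a 1500-byte MTU, IPv4, and no VLAN tag or PPPoE overhead; the function name and defaults are my own.

```python
def tcp_goodput_mbps(link_mbps, mtu=1500, tcp_options=0):
    """Best-case TCP payload rate on an Ethernet link of the given speed."""
    payload = mtu - 20 - 20 - tcp_options  # minus IPv4 and TCP headers
    # On the wire: frame header (14) + FCS (4) + preamble (8) + inter-frame gap (12).
    on_wire = mtu + 14 + 4 + 8 + 12
    return link_mbps * payload / on_wire


if __name__ == "__main__":
    print(f"100 Mbps link, no TCP options: {tcp_goodput_mbps(100):.1f} Mbps")
    print(f"100 Mbps link, TCP timestamps: {tcp_goodput_mbps(100, tcp_options=12):.1f} Mbps")
    print(f"1 Gbps link, TCP timestamps:   {tcp_goodput_mbps(1000, tcp_options=12):.1f} Mbps")
```

A 100 Mbps link tops out at roughly 94-95 Mbps of TCP payload, so a steady ~93 Mbps is about what a link that negotiated 100 Mbps somewhere in the path would look like.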

I can share my build with you if you want to try: it includes some perf optimizations that are not a part of the default image.

Oh yes, I just tried https://downloads.openwrt.org/snapshots/targets/ipq806x/generic/openwrt-ipq806x-netgear_r7800-squashfs-factory.img tonight after reading your messages. (It was strange because that build is missing luci, but it was routing packets okay at the 100 Mbps rate.)

Seems very unlikely, because A) I can briefly get much faster right after rebooting the router, and B) I can get 900 Mbps with a linksys router. Oh and also I own two R7800s and two linksys routers, so the common factor is always the R7800.

So the only thing I can think is that my ISP could be throttling based on MAC address or something.

I haven't yet done a controlled experiment, so what I could do is set up one of my R7800s NATed to my local network instead of trying to benchmark through my ISP.

How do you connect to ISP? PPPOE?

Also, if you have a gigabit switch, place it between your r7800 and the ISP modem.

My ISP installed some kind of box with fiber coming in on one side and an Ethernet port. I connected that Ethernet port to a 1Gig switch, and the R7800's WAN port plugs into that switch. The devices plugged into the switch (including the R7800) get public IP addresses from the ISP via DHCP.

That's already how it's configured (so I can bypass the R7800's poor NAT performance...)