Build for Netgear R7800

@hnyman surprise! Your r3498 build comes with the nlbwmon monitor :smiley: So nice! Thanks for that!

lede-r4773-12930fc045-20170831-rxring-buffersize-pcie-test

I made a test build that contains all three fixes under discussion:

Do you have a reference to the earlier report of a MAC problem? I couldn't find any posts with R7800 and MAC errors.

If the problem were present in all the latest trunk builds and affected all users, I guess there would have been more reports by now.

Well, at the time I tracked it down to the Wifi cipher configuration. Using Force TKIP (which I actually had configured by mistake) I had the mentioned MAC errors, consistently. With AES it seems to work mostly fine (I mean the wifi still craps up every now and then but remote connections work fine and I don't get the MAC errors anymore).

Also, before noticing the wifi configured as Force TKIP, I had managed to make it work by disabling some of the checksumming offload features (can't remember exactly which) on the interface with ethtool as shown on this page:
http://docs.gz.ro/node/282
This made the MAC errors disappear and I could keep my remote connections and transfers. However the performance was abysmal.

Following @LuisGC's idea about using ethtool, I was able to overcome the MAC error problem with the r4773 build.
Install ethtool (opkg install ethtool) and experiment with the br-lan interface, since it is the central point for most of the other interfaces.

First list how this interface is configured before changing anything:

root@R7800RT1:~# ethtool -k br-lan

Features for br-lan:
rx-checksumming: off [fixed]
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
	tx-tcp-ecn-segmentation: on
	tx-tcp-mangleid-segmentation: on
	tx-tcp6-segmentation: on
udp-fragmentation-offload: off [requested on]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [requested on]
tx-fcoe-segmentation: off [requested on]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [requested on]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

Notice how most of the tx settings are "on". Try setting them "off".

root@R7800RT1:~# ethtool --offload br-lan tx off

Actual changes:
tx-checksumming: off
	tx-checksum-ip-generic: off
tcp-segmentation-offload: off
	tx-tcp-segmentation: off [requested on]
	tx-tcp-ecn-segmentation: off [requested on]
	tx-tcp-mangleid-segmentation: off [requested on]
	tx-tcp6-segmentation: off [requested on]

Now re-run the "tree /" test (which fails every time for me on r4xxx builds).

root@R7800RT1:~# tree /
The entire tree is displayed followed by
12302 directories, 61309 files

Although I took a "shotgun" approach, so far these changes haven't had much impact on performance, but could possibly be refined.
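To refine the shotgun approach, one option is to list only the features that are currently "on" and not marked [fixed], then switch them off one at a time and re-run the test after each change. A minimal sketch: the helper name list_changeable_on is made up for illustration, and it assumes that "ethtool -K" accepts the long feature names printed by "ethtool -k".

```shell
#!/bin/sh
# list_changeable_on (hypothetical helper): read "ethtool -k <iface>" output
# on stdin and print every feature that is "on" and not marked [fixed].
list_changeable_on() {
    awk -F': *' '$2 ~ /^on/ && $2 !~ /\[fixed\]/ {
        gsub(/^[ \t]+/, "", $1)   # strip the indentation of sub-features
        print $1
    }'
}

# Example use on the router (not executed here):
#   ethtool -k br-lan | list_changeable_on
#   ethtool -K br-lan tx-tcp-segmentation off   # then re-run the "tree /" test
```

Turning features off individually should show which one actually triggers the MAC errors, so the rest can stay enabled.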

Unfortunately, the ethtool changes won't survive a router power-cycle or reboot.
To fix that, in LuCI go to System => Startup and insert the ethtool command in the box under Local Startup.

# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.
ethtool --offload br-lan tx off
exit 0

These changes should hold unless the router's firmware is overwritten; in that case ethtool will need to be re-installed.
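If the startup entry survives a firmware overwrite but the ethtool package does not, the bare command would simply fail at boot. A guarded variant could look like the sketch below. The function name disable_tx_offload is my own invention, not something shipped with LEDE.

```shell
#!/bin/sh
# disable_tx_offload (hypothetical helper): turn off TX offloads on the given
# interface, but only when the ethtool binary is actually installed.
disable_tx_offload() {
    iface="$1"
    if command -v ethtool >/dev/null 2>&1; then
        ethtool --offload "$iface" tx off
    else
        # ethtool is missing, e.g. after a firmware overwrite
        echo "ethtool not installed; skipping offload change on $iface" >&2
        return 1
    fi
}
```

In Local Startup this would be called as "disable_tx_offload br-lan" just before the "exit 0" line.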

That's the good news; now for the bad news which has nothing to do with ethtool.
Has anyone noticed that the r4xxx builds (4.9 kernel) have poor Wi-Fi performance? Install iperf3 (opkg install iperf3) and test against a client running as an iperf3 server (iperf3 -s) on the 5 GHz radio; I used one of my octa-core HP laptops. The router is still on r4773. The router antennas are 4 feet from the laptop's Wi-Fi dongle.

root@R7800RT1:~# iperf3 -c 192.168.16.121

Connecting to host 192.168.16.121, port 5201
[  5] local 192.168.16.1 port 44032 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.25 MBytes  27.2 Mbits/sec   45   14.1 KBytes       
[  5]   1.00-2.00   sec  3.15 MBytes  26.4 Mbits/sec   50   11.3 KBytes       
[  5]   2.00-3.00   sec  3.29 MBytes  27.6 Mbits/sec   48   29.7 KBytes       
[  5]   3.00-4.00   sec  3.76 MBytes  31.5 Mbits/sec   42   7.07 KBytes       
[  5]   4.00-5.00   sec  3.37 MBytes  28.3 Mbits/sec   49   7.07 KBytes       
[  5]   5.00-6.00   sec  2.73 MBytes  22.9 Mbits/sec   42   8.48 KBytes       
[  5]   6.00-7.00   sec  2.49 MBytes  20.9 Mbits/sec   44   7.07 KBytes       
[  5]   7.00-8.00   sec  2.86 MBytes  24.0 Mbits/sec   42   14.1 KBytes       
[  5]   8.00-9.00   sec  3.36 MBytes  28.1 Mbits/sec   46   7.07 KBytes       
[  5]   9.00-10.00  sec  2.98 MBytes  25.0 Mbits/sec   35   9.90 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  31.2 MBytes  26.2 Mbits/sec  443             sender
[  5]   0.00-10.00  sec  31.1 MBytes  26.1 Mbits/sec                  receiver

iperf Done.

Notice the low transfer and the high retransmit count (the Retr column).

What tcp congestion algorithms are on this router?

root@R7800RT1:~# sysctl -a 2>/dev/null|grep congestion
net.ipv4.tcp_allowed_congestion_control = cubic reno
net.ipv4.tcp_available_congestion_control = cubic reno
net.ipv4.tcp_congestion_control = cubic

So "cubic" is the default; try "reno".

root@R7800RT1:~# iperf3 -c 192.168.16.121 -C reno

Connecting to host 192.168.16.121, port 5201
[  5] local 192.168.16.1 port 44040 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.99 MBytes  33.5 Mbits/sec   54   15.6 KBytes       
[  5]   1.00-2.00   sec  3.73 MBytes  31.3 Mbits/sec   49   9.90 KBytes       
[  5]   2.00-3.00   sec  3.73 MBytes  31.3 Mbits/sec   55   11.3 KBytes       
[  5]   3.00-4.00   sec  3.85 MBytes  32.3 Mbits/sec   47   11.3 KBytes       
[  5]   4.00-5.00   sec  3.36 MBytes  28.1 Mbits/sec   54   8.48 KBytes       
[  5]   5.00-6.00   sec  3.42 MBytes  28.7 Mbits/sec   52   24.0 KBytes       
[  5]   6.00-7.00   sec  3.54 MBytes  29.7 Mbits/sec   51   15.6 KBytes       
[  5]   7.00-8.00   sec  3.17 MBytes  26.6 Mbits/sec   58   2.83 KBytes       
[  5]   8.00-9.00   sec  3.79 MBytes  31.8 Mbits/sec   54   18.4 KBytes       
[  5]   9.00-10.00  sec  3.73 MBytes  31.3 Mbits/sec   50   7.07 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  36.3 MBytes  30.5 Mbits/sec  524             sender
[  5]   0.00-10.00  sec  36.1 MBytes  30.3 Mbits/sec                  receiver

iperf Done.

Higher transfer, but also a higher retransmit count.

Now try a wired connection.

root@R7800RT1:~# iperf3 -c 192.168.16.196

Connecting to host 192.168.16.196, port 5201
[  5] local 192.168.16.1 port 38042 connected to 192.168.16.196 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   101 MBytes   847 Mbits/sec    0    595 KBytes       
[  5]   1.00-2.01   sec   110 MBytes   922 Mbits/sec    0    622 KBytes       
[  5]   2.01-3.01   sec  97.5 MBytes   813 Mbits/sec    0    822 KBytes       
[  5]   3.01-4.01   sec   101 MBytes   855 Mbits/sec    0    822 KBytes       
[  5]   4.01-5.00   sec   101 MBytes   852 Mbits/sec    0    933 KBytes       
[  5]   5.00-6.01   sec   112 MBytes   936 Mbits/sec    0    933 KBytes       
[  5]   6.01-7.00   sec   101 MBytes   855 Mbits/sec    0    984 KBytes       
[  5]   7.00-8.00   sec   100 MBytes   842 Mbits/sec    0    984 KBytes       
[  5]   8.00-9.01   sec  78.8 MBytes   654 Mbits/sec    0   1.07 MBytes       
[  5]   9.01-10.00  sec   109 MBytes   920 Mbits/sec    0   1.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1013 MBytes   849 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1013 MBytes   849 Mbits/sec                  receiver

iperf Done. <== Very good.

===========================================================================

Re-install r3498 build (4.4 kernel) and iperf3.

root@R7800RT1:~# cat /etc/os-release|grep -i rel
LEDE_RELEASE="LEDE Reboot 17.01-SNAPSHOT r3498-dc8392f6a1"

root@R7800RT1:~# uname -mrvos
Linux 4.4.83 #0 SMP Sun Aug 27 16:31:45 2017 armv7l GNU/Linux

root@R7800RT1:~# iperf3 -c 192.168.16.121

Connecting to host 192.168.16.121, port 5201
[  4] local 192.168.16.1 port 42076 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  17.3 MBytes   145 Mbits/sec    0    411 KBytes       
[  4]   1.00-2.00   sec  19.1 MBytes   161 Mbits/sec    0    544 KBytes       
[  4]   2.00-3.00   sec  17.8 MBytes   149 Mbits/sec    0    632 KBytes       
[  4]   3.00-4.00   sec  18.3 MBytes   154 Mbits/sec    0    740 KBytes       
[  4]   4.00-5.00   sec  17.5 MBytes   146 Mbits/sec    0    740 KBytes       
[  4]   5.00-6.00   sec  18.1 MBytes   152 Mbits/sec    0    740 KBytes       
[  4]   6.00-7.00   sec  16.7 MBytes   140 Mbits/sec    0    861 KBytes       
[  4]   7.00-8.01   sec  18.7 MBytes   156 Mbits/sec    0   1.05 MBytes       
[  4]   8.01-9.00   sec  17.9 MBytes   151 Mbits/sec    0   1.22 MBytes       
[  4]   9.00-10.00  sec  17.0 MBytes   142 Mbits/sec    0   1.29 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   178 MBytes   150 Mbits/sec    0             sender
[  4]   0.00-10.00  sec   176 MBytes   147 Mbits/sec                  receiver

iperf Done.

Now that's more like it.

Try the "reno" algorithm

root@R7800RT1:~# iperf3 -c 192.168.16.121 -C reno

Connecting to host 192.168.16.121, port 5201
[  4] local 192.168.16.1 port 42080 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  22.0 MBytes   184 Mbits/sec    0   2.76 MBytes       
[  4]   1.00-2.01   sec  19.7 MBytes   165 Mbits/sec    0   3.14 MBytes       
[  4]   2.01-3.00   sec  20.0 MBytes   168 Mbits/sec    0   4.96 MBytes       
[  4]   3.00-4.05   sec  20.0 MBytes   160 Mbits/sec    0   6.01 MBytes       
[  4]   4.05-5.00   sec  19.4 MBytes   172 Mbits/sec    0   6.01 MBytes       
[  4]   5.00-6.00   sec  19.9 MBytes   167 Mbits/sec    0   6.01 MBytes       
[  4]   6.00-7.00   sec  19.5 MBytes   164 Mbits/sec    0   6.01 MBytes       
[  4]   7.00-8.00   sec  18.0 MBytes   151 Mbits/sec    0   6.01 MBytes       
[  4]   8.00-9.00   sec  19.5 MBytes   163 Mbits/sec    0   6.01 MBytes       
[  4]   9.00-10.00  sec  19.9 MBytes   167 Mbits/sec    0   6.01 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec    0             sender
[  4]   0.00-10.00  sec   195 MBytes   163 Mbits/sec                  receiver

iperf Done.

Here "reno" has better transfer than the default "cubic".

Now try a wired connection.

root@R7800RT1:~# iperf3 -c 192.168.16.196

Connecting to host 192.168.16.196, port 5201
[  4] local 192.168.16.1 port 52298 connected to 192.168.16.196 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   112 MBytes   942 Mbits/sec    0    478 KBytes       
[  4]   1.00-2.00   sec   111 MBytes   935 Mbits/sec    0    478 KBytes       
[  4]   2.00-3.00   sec   107 MBytes   897 Mbits/sec    0    499 KBytes       
[  4]   3.00-4.00   sec   112 MBytes   942 Mbits/sec    0    499 KBytes       
[  4]   4.00-5.00   sec   107 MBytes   895 Mbits/sec    0    522 KBytes       
[  4]   5.00-6.00   sec   112 MBytes   941 Mbits/sec    0    522 KBytes       
[  4]   6.00-7.00   sec   106 MBytes   892 Mbits/sec    0    522 KBytes       
[  4]   7.00-8.01   sec   108 MBytes   894 Mbits/sec    0    522 KBytes       
[  4]   8.01-9.00   sec   106 MBytes   897 Mbits/sec    0    588 KBytes       
[  4]   9.00-10.00  sec   112 MBytes   941 Mbits/sec    0    588 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.07 GBytes   918 Mbits/sec    0             sender
[  4]   0.00-10.00  sec  1.07 GBytes   916 Mbits/sec                  receiver
 
iperf Done. <== Very good.

=========================================================================

These are my results, but YMMV.
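For comparing runs like these side by side, the sender summary line can be pulled out of each iperf3 output. A small sketch, assuming the text layout shown in the outputs above (other iperf3 versions may format the summary differently). The function name is only for illustration.

```shell
#!/bin/sh
# iperf3_sender_summary (hypothetical helper): read iperf3 client output on
# stdin and print bitrate, unit and retransmit count from the "sender" line.
# Fields are counted from the end of the line, so the width of the [ID]
# column does not matter.
iperf3_sender_summary() {
    awk '$NF == "sender" { print $(NF-3), $(NF-2), "retr=" $(NF-1) }'
}

# Example use (not executed here):
#   iperf3 -c 192.168.16.121 | iperf3_sender_summary
```

Fed the first r4773 Wi-Fi run above, this prints "26.2 Mbits/sec retr=443".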

@hnyman
Didn't mean to hijack this thread. I should have posted to "Netgear R7800 Exploration". Don't know how to move the post.

  • Magnetron1.1

It is great that somebody tries to debug this and find the root cause of the performance issues troubling some of the users.

But yes, please continue the discussion in the exploration thread, as there is nothing build-specific in my build.

I made a copy of Magnetron1.1's awesome post over in the other thread so that the investigation can continue over there.

Is anyone testing the latest @hnyman's build that includes all 3 fixes?

I want to try this, but this issue is making me stay away until it's fixed!
Or I can use an older build.

I flashed r4786 yesterday evening. No errors in the logs so far after 12 hours. 5 GHz performance seems a little low in comparison to r3498, but I have no measurements to back up this claim and have not tested systematically.

I am seeing the same.
All builds after r3498 have lower 5 GHz speed.
I had to go back to r3498.

Please just note that r3498 is from the 17.01 stable branch, meaning that it has a different, older code base for quite a few components.

Does that "all builds after r3498" claim also hold true for the newer 17.01 build r3506?

I guess that you actually mean that the current builds from the master have lower performance than the 17.01 builds.

Oh, OK.
I have not tried r3506.
When I get time I'll try it.
Thanks.

@hnyman
Hi hnyman,
if I want to compile for the R7800 myself, where should I get the source code? From you, or from https://github.com/lede-project/source ?

thanks

I provide no full source, so in any case you download the main part of the source from the LEDE source repo.

You can build a basic image for the R7800 with that.

If you want my build's modifications (like patches, included packages and script modifications), download and apply the patches that I provide with each release. There is a link to the instructions in message #2 of this thread.

Thank you, hnyman!

But I'm not sure about "Build firmware with hnscripts/updateNmake.sh".
Does that mean running hnscripts/updateNmake.sh instead of "make -j 3"?
Or running hnscripts/updateNmake.sh first and then "make -j 3"?

And by the way, if I want to use the fastpath patch, should I apply the fastpath patch first or your stuff first?

Yes. Please read the scripts...
At the end, updateNmake calls the script parallelcompile.sh, which in turn calls make.

My build scripts heavily automate the updating/compilation process.

Yes.
You have to apply dissent1's fastpath patch separately.

OK, I got it: apply dissent1's first.
I always appreciate your help!

@hnyman
Something is wrong with fastpath (applying it to official LEDE 17.01 is OK).
Here is the log:

Installing package 'sipp' from telephony
Installing package 'siproxd' from telephony
Installing package 'yate' from telephony
jack@jack-R478-R429:~/OwrtLEDE$ make menuconfig
make: *** No rule to make target 'menuconfig'. Stop.
jack@jack-R478-R429:~/OwrtLEDE$ make menuconfig
make: *** No rule to make target 'menuconfig'. Stop.
jack@jack-R478-R429:~/OwrtLEDE$ cd lede1701
jack@jack-R478-R429:~/OwrtLEDE/lede1701$ make menuconfig
Collecting package info: done
configuration written to .config

*** End of the configuration.
*** Execute 'make' to start the build or try 'make help'.

jack@jack-R478-R429:~/OwrtLEDE/lede1701$ git am *.patch
fatal: Dirty index: cannot apply patches (dirty: .config.init .gitignore files/etc/Compile_info.txt files/etc/applyHNsettings.sh files/etc/checksettings.sh files/etc/config/fstab files/etc/hotplug.d/ntp/20-ntpd-logger files/etc/lan-repeater.sh files/etc/saveHNsettings.sh hnscripts/copyPackages2tmp.sh hnscripts/createbuildinfo.sh hnscripts/kernelcompile.sh hnscripts/mountNcopy.sh hnscripts/newBuildroot.sh hnscripts/parallelcompile.sh hnscripts/singlecompile.sh hnscripts/timestampVersion.sh hnscripts/updateNmake.sh package/base-files/Makefile package/base-files/files/bin/config_generate package/base-files/files/etc/rc.button/reset package/firmware/ath10k-firmware/Makefile package/kernel/mac80211/patches/012-increase-bmi-timeout.patch package/kernel/mac80211/patches/013-log-when-longer-bmi-cmds-happen.patch package/kernel/mac80211/patches/014-add-BMI-parameters-to-fix-calibration-from-DT-pre-cal.patch package/kernel/mac80211/patches/936-ath10k-fix-otp-failure-result.patch package/kernel/mac80211/patches/936-ath10k_skip_otp_check.patch package/network/ipv6/6in4/files/6in4.sh package/network/services/dnsmasq/files/dhcp.conf package/network/services/hostapd/files/hostapd.sh package/network/services/hostapd/files/wps-hotplug.sh package/utils/busybox/patches/310-save-history-in-tmp.patch target/linux/generic/patches-4.4/036-net_sched-avoid-too-many-hrtimer_start-calls.patch target/linux/generic/patches-4.4/037-sched-place-state-next_sched-and-gso_skb-in-same-cac.patch target/linux/generic/patches-4.4/038-net_sched-generalize-bulk-dequeue.patch target/linux/generic/patches-4.4/662-use_fq_codel_by_default.patch target/linux/generic/patches-4.4/663-remove_pfifo_fast.patch target/linux/ipq806x/base-files/etc/board.d/01_leds target/linux/ipq806x/base-files/etc/hotplug.d/firmware/11-ath10k-caldata)
jack@jack-R478-R429:~/OwrtLEDE/lede1701$

Please take the fastpath discussion to the fastpath thread. It has nothing to do with my build specifically...

After reading more carefully: you have a git problem. You are trying to use "git am" while the index is dirty, meaning the changes already made have not been committed yet.

You could git commit those changes first so that "git am" works, or you could use "patch" instead of "git am":
patch -p 1 -i patchfile.patch
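The two routes can be sketched roughly as below. This is only an illustration: the helper names are made up, and it assumes that "Dirty index" means there are staged changes that differ from HEAD.

```shell
#!/bin/sh
# index_is_clean: succeed when nothing is staged relative to HEAD,
# which is the condition "git am" insists on.
index_is_clean() {
    git diff --cached --quiet
}

# apply_patches (hypothetical helper): use "git am" when the index is clean,
# otherwise fall back to plain "patch" for each file given as an argument.
apply_patches() {
    if index_is_clean; then
        git am "$@"
    else
        for p in "$@"; do
            patch -p1 -i "$p" || return 1
        done
    fi
}

# Example use inside the buildroot (not executed here):
#   apply_patches *.patch
```

Committing the staged changes first (git commit) keeps the history cleaner, while the "patch" fallback leaves everything as uncommitted working-tree changes.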

I tried it just now and it works. Awesome!
Thank you, Mr. hnyman!