Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

Building from scratch is the cleanest approach, but that should work too.

For at least the last three builds I have used (3/15, 4/03 and 4/11), I have noticed some strangeness with the 2.4GHz radio. I can connect from my phone, but it initially says 'no internet' and, though connected, I cannot reach anything on my LAN. I have other devices on the LAN (RPis) that connect and appear to have connectivity. I've tried rebooting the phone to no avail. This morning (running 4/14) the RPi went down (no internet) and I tried to reach it. The router showed it still had an IP, but I could not connect to it. I 'disconnected' the device and rebooted the radio, and when it reconnected, it started working again.

Looking through the log, I saw this:

Tue Apr 20 06:06:52 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory
Tue Apr 20 06:06:52 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: No buffer space available

This happened at about the time it lost connectivity. Does this point to anything?

Also, I think this may be a phone problem and not a router/firmware problem, but I thought I'd mention it in case anyone here has any insight. My child has a Motorola e5. It used to connect to the 5GHz radio, but for a few months it has not. I have been using @ACwifidude builds since September 2020 or so. It used to connect, but now it only connects to the 2.4GHz radio, and with the flakiness of the 2.4GHz radio, I get a lot of static from the child...

Should I try the other driver (non-ct)? If so, can anyone give me a step by step on how to try it?

The 2.4GHz radio is seriously hampered. I woke up this morning and noticed that all 2.4GHz devices were offline. The log shows this again:

Thu Apr 22 19:03:59 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory
Thu Apr 22 19:03:59 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: No buffer space available
...
Thu Apr 22 21:38:20 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory
Thu Apr 22 21:38:20 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: No buffer space available
...
Fri Apr 23 01:49:21 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: I/O error
Fri Apr 23 01:49:21 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: I/O error
...
Fri Apr 23 01:58:22 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory
Fri Apr 23 01:58:22 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: No buffer space available
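
To line these errors up against the times the wifi dropped, it can help to collapse the log into counts and timestamps. The following is only a rough Python sketch, meant to be run on a PC against a saved copy of the log. The regular expression is an assumption based on the lines quoted in this thread, and the names `LOG_LINE`, `summarize` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Shape of a log line as quoted in this thread, e.g.
# "Thu Apr 22 19:03:59 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory"
# This pattern is an assumption derived from those lines, not an official format.
LOG_LINE = re.compile(
    r"^\s*(?P<stamp>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) "
    r"(?P<level>\S+) (?P<proc>[\w.-]+)\[\d+\]: (?P<msg>.*)$"
)


def summarize(lines, proc="nlbwmon"):
    """Count messages from one process and list its distinct timestamps in order."""
    counts = Counter()
    stamps = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("proc") != proc:
            continue  # skip other daemons and lines that do not parse
        counts[match.group("msg")] += 1
        if match.group("stamp") not in stamps:
            stamps.append(match.group("stamp"))
    return counts, stamps


# Lines copied from the posts above; the dnsmasq line shows the filtering.
SAMPLE = [
    "Thu Apr 22 17:59:55 2021 daemon.err dnsmasq[5177]: bad address at /tmp/hosts/dhcp.cfg01411c line 16",
    "Thu Apr 22 19:03:59 2021 daemon.err nlbwmon[4419]: Netlink receive failure: Out of memory",
    "Thu Apr 22 19:03:59 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: No buffer space available",
    "Fri Apr 23 01:49:21 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: I/O error",
    "Fri Apr 23 01:49:21 2021 daemon.err nlbwmon[4419]: Unable to dump conntrack: I/O error",
]

if __name__ == "__main__":
    counts, stamps = summarize(SAMPLE)
    for msg, n in counts.most_common():
        print(n, msg)
    print("error bursts at:", stamps)
```

The list of timestamps can then be compared with the times clients went offline. A match would only suggest a correlation, not a cause.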

Also, I have a MAC address that I want the router to ignore and not provide an address to. I list its IP as 'ignore' yet I see this in the log:
Thu Apr 22 17:59:55 2021 daemon.err dnsmasq[5177]: bad address at /tmp/hosts/dhcp.cfg01411c line 16
That line of the dhcp.cfg01411c file is:
ignore Unknown.lan
From the hints in the LuCI UI, this appears to be correct. I have no quotes around it, as LuCI will not let me add them. Is this correct?
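
For what it's worth, hosts-format files expect each entry to start with an IP address, so a line beginning with the literal word `ignore` would plausibly be what dnsmasq is calling a "bad address". Here is a small Python sketch of that kind of check. It only approximates the idea and is not dnsmasq's actual parser; the function name `bad_address_lines` and the `192.168.1.10 rpi.lan` sample entry are made up for illustration.

```python
import ipaddress


def bad_address_lines(text):
    """Return (line_number, line) pairs whose first field is not an IP address."""
    bad = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        first_field = line.split()[0]
        try:
            ipaddress.ip_address(first_field)
        except ValueError:
            bad.append((number, raw))
    return bad


# First entry is a hypothetical valid one; the second is the line quoted above.
SAMPLE_HOSTS = "192.168.1.10 rpi.lan\nignore Unknown.lan\n"

if __name__ == "__main__":
    for number, line in bad_address_lines(SAMPLE_HOSTS):
        print(f"line {number}: {line!r} does not start with an IP address")
```

If that is what is happening, the message is probably harmless noise from the generated hosts file, but that is a guess.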

Also, I still would like to know if I can/should try the non-ct ath10k driver and how to do that.

I don't think @ACwifidude builds the non-ct driver.

I stay true to the defaults in master. You can swap to non-ct with the command in the second post or you could build with non-ct. See if changing helps for your clients.

Here is your answer but no NSS:

All builds are with the default ath10k-ct wifi driver. The mainline ath10k wifi driver is being offered as a downloadable .ipk in the download directory of each build.

That only replaces the firmware.

Maybe you can build the non-ct driver as a module so people can just install the ipk if they want to use non-ct driver.

After 10 days of uptime, with the NSS cores set at 800MHz and the ondemand governor set to an 800MHz minimum frequency, the router rebooted during a large rsync operation (from laptop / wifi -> server / lan).

I didn't get the kernel panic, but the last lines of console-ramoops-0 are:

[957667.702309] ath10k_pci 0000:01:00.0: wmi command 36967 timeout, restarting hardware
[957668.314909] ath10k_pci 0000:01:00.0: failed to send pdev bss chan info request
[957668.315287] ath10k_pci 0000:01:00.0: failed to set beacon mode for vdev 0: -108
[957668.321032] ath10k_pci 0000:01:00.0: failed to set dtim period for vdev 0: -108
[957668.328836] ath10k_pci 0000:01:00.0: failed to recalculate rts/cts prot for vdev 0: -108
[957668.336132] ath10k_pci 0000:01:00.0: failed to set cts protection for vdev 0: -108
[957668.344203] ath10k_pci 0000:01:00.0: failed to set preamble for vdev 0: -108
[957668.351669] ath10k_pci 0000:01:00.0: failed to set mgmt tx rate -108
[957671.381293] ath10k_pci 0001:01:00.0: bss channel survey timed out
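
A quick way to see which radio is failing and with what code is to tally the "failed to ..." lines per PCI device. The Python sketch below does that for a saved copy of the log. It assumes the trailing -108 is a negated Linux errno; on Linux, errno 108 is ESHUTDOWN, which would fit a device that is in the middle of a restart. The regular expression and the names `count_failures` and `describe` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the trailing number is a negated Linux errno.
# Only the one code seen in this log is listed (Linux value of ESHUTDOWN).
LINUX_ERRNO_NAMES = {108: "ESHUTDOWN"}

# Matches e.g. "ath10k_pci 0000:01:00.0: failed to set preamble for vdev 0: -108"
# and also the variant without a colon before the code.
FAILED = re.compile(
    r"ath10k_pci (?P<dev>[0-9a-f:.]+): failed to (?P<what>.+?):? (?P<code>-\d+)\s*$"
)


def count_failures(lines):
    """Count failed operations per (PCI device, positive errno value)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[(match.group("dev"), -int(match.group("code")))] += 1
    return counts


def describe(code):
    """Return the errno name if it is in the small table above."""
    return LINUX_ERRNO_NAMES.get(code, f"errno {code}")


# Lines copied from the log above.
SAMPLE = [
    "[957668.315287] ath10k_pci 0000:01:00.0: failed to set beacon mode for vdev 0: -108",
    "[957668.351669] ath10k_pci 0000:01:00.0: failed to set mgmt tx rate -108",
    "[957671.381293] ath10k_pci 0001:01:00.0: bss channel survey timed out",
]

if __name__ == "__main__":
    for (dev, code), n in count_failures(SAMPLE).items():
        print(f"{dev}: {n} failures with {describe(code)}")
```

In this log every -108 comes from 0000:01:00.0, right after the "restarting hardware" line for the same device, while 0001:01:00.0 only reports a survey timeout.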

Not sure if this triggered the reset (I'm running the non-ct firmware to have access to the 160MHz channels).

I doubt that the problem is the wifi driver, or at least not the wifi driver alone. My router is stable with NSS as long as I don't use wired torrenting, for example. If I turn on torrenting, it restarts within hours. Do multiple connections overflow memory? I don't know. I don't understand anything about this topic, so I'm sorry I can't help.

Have you been monitoring active connections on the status page? How many connections do you see when you torrent? Do you also use samba?

Current builds are not optimized for torrent usage; it is possible that the unit does not have enough reserved memory configured. I can add a few changes to the next build, if you want to test?

Connections do not usually exceed 1000 in total. The torrent client is very limited, as it is a Synology NAS. I am willing to try any modification, no problem.

Problems are only with NSS.

I noticed that we have not added the min freq setting in the NSS build; this is very likely the cause of the sporadic problems.

Who can solve this issue?
Please do something on this

if the nss cores are at 800,800,800, but this was talked about and I don't think there is a patch for it, at least on r7800.

My knowledge is that of a one-year-old next to yours, Kong. The only thing I have provided is an efficient method of blocking and breaking NSS builds.

Ok, I'm confused about what is going on and how I can help. I posted a log snippet of an 'out of memory' error which seems to correspond to when I had wifi connectivity issues. Does it point to anything which could cause wifi traffic problems? When this happened, it disconnected then quickly reconnected (I was on 5GHz). I can provide more info.
I am not doing anything strenuous on the router (no mounted file systems, VPN, samba, torrents or anything). I can start trying older NSS images to see if it was better at some point in the past, as this does seem like a new development. The wired connections all seem fine. I have not noted any traffic issues or anything on the wired side. The 2.4GHz radio seems to be the most flaky. If I try to connect with my Moto One phone, it says 'Connected, No Internet' on the 2.4GHz radio but connects fine on the 5GHz. This is with the 4/11 image:
OpenWrt SNAPSHOT r16494+17-dcdafbfc1a / LuCI Master git-21.092.22207-70a7490
For my next trial, I will go back to the Feb 7 bin just to see if the 2.4GHz radio will work from my phone.

Edit: Update. I figured out why it would not connect to the internet from my phone. The phone was using a 'randomized MAC' when connecting on 2.4GHz. I removed the blocked MAC and now I get internet on both 5GHz and 2.4GHz.
This still does not resolve the question of why I am getting an out of memory message in the log and why all 2.4GHz devices were offline earlier in the week.

New NSS builds have been uploaded. Check if they fix the torrent issue.

I will test it during the day and report back in the next few days. Thanks, Kong. Out of curiosity, what changed?

By the way, I recently switched to a non-ct build. It might fix your issue. But after the initial flashing you need to reboot one more time, as there is currently a small startup timing issue when it creates the calibration file for the first time, which causes the 5GHz radio to be deactivated right after flashing. Once the file is created there is no issue anymore.

The low CPU frequencies should be disabled directly in the DTS, but that is probably overkill.

I used to have a patch for tweaking the other parameters in my build, but recently switched to an init script which enables setting the low limit.

And ansuel has made a PR about it.
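
For anyone wondering what "setting the low limit" amounts to: Linux cpufreq exposes a `scaling_min_freq` file per CPU under sysfs, with the value in kHz. The sketch below shows the idea in Python purely for illustration; it is not the init script or the PR mentioned above (on the router this would normally be a shell init script). The function name `set_min_freq` is made up, and 800000 kHz is assumed to correspond to the 800MHz minimum discussed in this thread.

```python
from pathlib import Path


def set_min_freq(khz, sysfs_root="/sys/devices/system/cpu"):
    """Write khz to every cpuN/cpufreq/scaling_min_freq under sysfs_root.

    Returns the list of files written. Needs root on a real system.
    """
    written = []
    for cpufreq_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        target = cpufreq_dir / "scaling_min_freq"
        if target.exists():
            target.write_text(f"{khz}\n")
            written.append(str(target))
    return written


if __name__ == "__main__":
    # Assumption: 800000 kHz (800MHz) is a frequency the platform offers.
    for path in set_min_freq(800000):
        print("set minimum frequency in", path)
```

The `sysfs_root` parameter is there only so the function can be tried against a scratch directory before touching the real sysfs tree.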
