Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

Just to note: after 18 hours of uptime, I realized UPnP was missing, so I installed luci-app-upnp and started the service. A few minutes later, the router rebooted.
So UPnP seems to be the main culprit here. Can anyone try this on their routers?

Isn't UPnP somewhat of a security risk? I was taught to explicitly assign things on my network rather than letting them assign themselves.

IGDv2 is fine AFAIK; those breaches were with v1.
It's easier for me: I use about 7 programs and a server that make use of UPnP.

My understanding is that it is better to build the images with what you need 'baked' in than to add it later. It may not be the cause of your problem, though. I used to add minidlna to the builds afterwards, but now I build my own images with everything needed, as per @ACwifidude's suggestion.

what's new? :slight_smile:

@ACwifidude - I would really like y'all to try a very simple patch and see if you can "feel" a difference: reducing NAPI_POLL_WAIT to 8, from its present 64, across your build. Multi-core A53s, in particular, can context switch fast enough to do this well, and it seemed to be better to do less work, more often.

Interesting. I wonder if this can be applied generally...

Depends on what you mean by general. x86 - probably not. A72 - no idea. In a virtualized environment, no. But on a bare-metal quad-core A53 it works for me. We've been trying to shorten the latencies in wifi (see link above), and it turns out that NAPI wasn't polling often enough. I think. So... could use more testers.

Sounds interesting. I’ll be busy at work this week so I won’t have time to test or build - but would love to encourage some testing to see the results!

If anyone tweaks their custom build - post some results!

How exactly do we apply that patch? Do we have to edit a file before compiling?

I must say that with the work that has collectively been done so far by all involved it already "feels" pretty darn good. Love the commitment by the way.

Are there particular points of measurement you're interested in? Or would you settle for our praises and high spirits?

Which NSS build is currently more stable, version 22 or version 21?

@pattagghiu, @D43m0n

Correct, you can simply add my NSS repo to your feeds.conf. My repository has a copy of it already. You can also use the base r7800-diffconfig config as a starting point.

And then just the usual prepping commands.

cp r7800-diffconfig .config
make menuconfig
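
For anyone following along, the prep steps above might look roughly like this as a script. It is only a sketch: the feed name `nss` and the URL are placeholders (the actual repo address is not given in this post), and `add_feed` is a hypothetical helper, not part of the OpenWrt tooling. `./scripts/feeds` and `make defconfig` are the usual OpenWrt build-tree commands.

```shell
#!/bin/sh
# add_feed FILE NAME URL
# Hypothetical helper: append a "src-git" feed line to FILE unless a feed
# called NAME is already listed there.
add_feed() {
    file=$1; name=$2; url=$3
    # feeds.conf overrides feeds.conf.default, so start from the defaults
    # instead of ending up with a feeds.conf that only lists one feed.
    if [ ! -f "$file" ] && [ -f "$file.default" ]; then
        cp "$file.default" "$file"
    fi
    if [ -f "$file" ] && grep -q "^src-git[[:space:]]\{1,\}$name[[:space:]]" "$file"; then
        return 0
    fi
    printf 'src-git %s %s\n' "$name" "$url" >> "$file"
}

# Typical use from the top of an OpenWrt checkout (placeholder URL):
#   add_feed feeds.conf nss https://example.invalid/nss-packages.git
#   ./scripts/feeds update -a && ./scripts/feeds install -a
#   cp r7800-diffconfig .config
#   make defconfig     # expand the diffconfig into a full .config
#   make menuconfig
```

Running `make defconfig` after copying a diffconfig fills in all the options the diffconfig leaves out, so menuconfig then starts from a complete configuration.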

NAPI_POLL_WAIT to 8, from its present 64

Complete shot in the dark, but is it the following? I was unable to find any variable like that in the backports package or in any existing patches.

--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -67,7 +67,7 @@
 #define ATH10K_KEEPALIVE_MAX_UNRESPONSIVE 3900

 /* NAPI poll budget */
-#define ATH10K_NAPI_BUDGET      64
+#define ATH10K_NAPI_BUDGET      8
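
If you want to try a change like this in your own build, one way is to save it as a patch file and drop it into the directory OpenWrt applies patches from before compiling. The sketch below assumes a 22.03-era tree where ath10k patches live under package/kernel/mac80211/patches/ath10k/; the helper name, the 999 prefix and the patch file name are made up for illustration, so check your own tree's layout first.

```shell
#!/bin/sh
# install_patch PATCH DEST_DIR [PREFIX]
# Hypothetical helper: copy PATCH into DEST_DIR as "<PREFIX>-<name>" so that
# it sorts after the patches already there (patches apply in name order).
install_patch() {
    patch=$1; dest=$2; prefix=${3:-999}
    [ -f "$patch" ] || { echo "no such patch: $patch" >&2; return 1; }
    [ -d "$dest" ] || { echo "no such directory: $dest" >&2; return 1; }
    cp "$patch" "$dest/$prefix-$(basename "$patch")"
}

# Example, from the top of the OpenWrt checkout (paths are assumptions):
#   install_patch ~/ath10k-napi-budget.patch package/kernel/mac80211/patches/ath10k
#   make package/kernel/mac80211/clean
#   make package/kernel/mac80211/compile V=s
```

If the patch fails to apply, the compile step stops with the rejected hunk, which is a quick way to find out whether the context lines match your backports version.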

I was unaware that ath10k was not inheriting NAPI_POLL_WEIGHT! Good find! Try that!

Explains a lot. Our original work for the ath10k in 2016 had no NAPI support in it at all. IMHO it simply can't interrupt often enough for it to matter; all NAPI does is bulk up operations for far, far too long... but it got added anyway by someone else much later for cargo cult reasons.

So there's a separate patch for NAPI_POLL_WEIGHT, which will service the ethernet driver more often, @amteza?

I too am very happy and relieved that all of us, working together, have made this release of OpenWrt the best ever. Now what? :slight_smile:

Abstractly, my goal - oft expressed - is to make wifi capable of FPS gaming, and one day cloud gaming, even with several other heavy users on the link. We've got new benchmarks now - like Apple's "responsiveness" test and iperf's new "bounceback" test - that can show the minimum possible latency a given wifi link can achieve, which was about 21000 RPM, or 350 RPS, with no load. Until a few patches for the mt76 on the other thread went by, we were seeing 350 RPM under load in many scenarios, such as mesh, and ath10k is still doing poorly here.

In an ideal pre-wifi7 environment, getting 100 RPS under load seems achievable, if we rip ever more latencies out.
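
To put those numbers in more familiar terms: RPM here is round trips per minute, so the average time per round trip is 60000 / RPM milliseconds. A quick sketch of the conversion (the function name is just for illustration):

```shell
#!/bin/sh
# rpm_to_ms RPM: print the average time per round trip in milliseconds
rpm_to_ms() {
    awk -v rpm="$1" 'BEGIN { printf "%.2f\n", 60000 / rpm }'
}

rpm_to_ms 21000   # unloaded link: 350 round trips per second
rpm_to_ms 350     # under load, as seen in many scenarios
rpm_to_ms 6000    # the 100 RPS goal
```

That works out to about 2.86 ms per round trip unloaded, 171.43 ms under load, and 10 ms for the 100 RPS target.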

As for wanting to try a few patches on the NSS build: you have plenty of CPU left over due to all the offloads, which makes it easier to isolate wtf the ath10k is doing so badly. Maybe.

Here you go:

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2440,7 +2440,7 @@ static inline void *netdev_priv(const st
 /* Default NAPI poll() weight
  * Device drivers are strongly advised to not use bigger value
  */
-#define NAPI_POLL_WEIGHT 64
+#define NAPI_POLL_WEIGHT 8

 /**
  *	netif_napi_add - initialize a NAPI context

It might be interesting to go with this one too:

--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -514,7 +514,7 @@ struct sta_info *sta_info_alloc(struct i
 	sta->sta.max_rc_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_BA;

 	sta->cparams.ce_threshold = CODEL_DISABLED_THRESHOLD;
-	sta->cparams.target = MS2TIME(20);
+	sta->cparams.target = MS2TIME(8);
 	sta->cparams.interval = MS2TIME(100);
 	sta->cparams.ecn = true;

@@ -2548,15 +2548,9 @@ static void sta_update_codel_params(stru
 	if (!sta->sdata->local->ops->wake_tx_queue)
 		return;

-	if (thr && thr < STA_SLOW_THRESHOLD * sta->local->num_sta) {
-		sta->cparams.target = MS2TIME(50);
-		sta->cparams.interval = MS2TIME(300);
-		sta->cparams.ecn = false;
-	} else {
-		sta->cparams.target = MS2TIME(20);
-		sta->cparams.interval = MS2TIME(100);
-		sta->cparams.ecn = true;
-	}
+	sta->cparams.target = MS2TIME(8);
+	sta->cparams.interval = MS2TIME(100);
+	sta->cparams.ecn = true;
 }

 void ieee80211_sta_set_expected_throughput(struct ieee80211_sta *pubsta,
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1564,7 +1564,7 @@ int ieee80211_txq_setup_flows(struct iee

 	codel_params_init(&local->cparams);
 	local->cparams.interval = MS2TIME(100);
-	local->cparams.target = MS2TIME(20);
+	local->cparams.target = MS2TIME(8);
 	local->cparams.ecn = true;

 	local->cvars = kcalloc(fq->flows_cnt, sizeof(local->cvars[0]),
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -2842,11 +2842,11 @@ enum wiphy_params_flags {
 #define IEEE80211_DEFAULT_AIRTIME_WEIGHT	256

 /* The per TXQ device queue limit in airtime */
-#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_L	5000
-#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_H	12000
+#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_L	2000
+#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_H	4000

 /* The per interface airtime threshold to switch to lower queue limit */
-#define IEEE80211_AQL_THRESHOLD			24000
+#define IEEE80211_AQL_THRESHOLD			8000

 /**
  * struct cfg80211_pmksa - PMK Security Association
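
The AQL queue limits in the last hunk can also be tried at runtime before baking them into a build, since mac80211 exposes them in debugfs. A rough sketch, assuming debugfs is mounted at /sys/kernel/debug, that your driver actually uses AQL, and that the radio is phy0; `set_aql_limits` is a hypothetical helper, not an existing tool.

```shell
#!/bin/sh
# set_aql_limits LOW HIGH [PHY_DIR]
# Hypothetical helper: write "<ac> <low> <high>" (airtime in microseconds)
# to aql_txq_limit once for each of the four access categories (0-3).
set_aql_limits() {
    low=$1; high=$2; phy=${3:-/sys/kernel/debug/ieee80211/phy0}
    [ -e "$phy/aql_txq_limit" ] || { echo "no aql_txq_limit in $phy" >&2; return 1; }
    for ac in 0 1 2 3; do
        echo "$ac $low $high" > "$phy/aql_txq_limit" || return 1
    done
}

# Example (as root on the router), matching the values in the patch:
#   set_aql_limits 2000 4000
#   cat /sys/kernel/debug/ieee80211/phy0/aql_txq_limit
```

Values set this way do not survive a reboot, which makes it a low-risk way to compare settings before committing to a patched build.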

If you use the performance CPU governor on 22.03, or clamp the CPU frequency so your router always runs at the same frequency, there's really no difference. There have been some WiFi improvements to 22.03 recently, so give that a shot. But remember to lock the CPU frequency to a specific value.
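
For reference, pinning the governor or clamping the frequency can be done through the standard cpufreq sysfs files. A minimal sketch, assuming the usual /sys/devices/system/cpu/cpufreq/policy* layout; the helper names are made up, and the frequency you pick has to be one your SoC supports (see scaling_available_frequencies in the same directory).

```shell
#!/bin/sh
# set_governor GOVERNOR [CPUFREQ_DIR]: apply GOVERNOR to every cpufreq policy
set_governor() {
    gov=$1; root=${2:-/sys/devices/system/cpu/cpufreq}
    for p in "$root"/policy*; do
        [ -d "$p" ] || continue
        echo "$gov" > "$p/scaling_governor" || return 1
    done
}

# clamp_freq KHZ [CPUFREQ_DIR]: pin min and max to the same frequency (kHz)
clamp_freq() {
    khz=$1; root=${2:-/sys/devices/system/cpu/cpufreq}
    for p in "$root"/policy*; do
        [ -d "$p" ] || continue
        # First attempt may be refused if KHZ is below the current minimum,
        # so ignore its result and set max again once min has been moved.
        echo "$khz" > "$p/scaling_max_freq" 2>/dev/null || true
        echo "$khz" > "$p/scaling_min_freq" || return 1
        echo "$khz" > "$p/scaling_max_freq" || return 1
    done
}

# Examples (as root on the router):
#   set_governor performance
#   clamp_freq 1725000    # 1725 MHz expressed in kHz
```

Either setting is lost on reboot unless it is reapplied at startup.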

Here's an update of recent commits for 22.03.

And here's one for master.

Specifically some mac80211 fixes have been committed.

Current 22.03 and master with kernel 5.10 stay stable for over a week when the CPU is clamped to one specific frequency. At the moment I'm testing kernel 5.15 with patches from @Ansuel for stability, letting my R7800's CPUs roam freely between 600 MHz and 1725 MHz; this build is without NSS. After a week of testing and reporting back, I'll give it a go with @qosmio's work on kernel 5.15 with NSS. Hopefully Ansuel's PR will be accepted into master by then; otherwise I'll add it to my private build. If someone could put the NAPI_POLL_WEIGHT and mac80211 CoDel/AQL patches that @amteza posted above into a GitHub repository, I'll add those too.