Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

Cool, so it is indeed a blob that does the hardware magic. Has it happened in the past that OpenWrt developers went to a manufacturer and received such code to develop further? Now that NSS is deprecated, maybe the chances are better?
The start post suggests compatibility with other builds. Would it be safe to keep my settings when upgrading from hnyman's master-r17319? I'm not sure when a previous version is considered "old" :wink:

Well, it's not really deprecated. It's more that applying it no longer makes much sense now that mainstream processors (x64) are better. But yes, there have been multiple instances where developers approached Qualcomm about sharing the source, and they came to nothing.
Another note: OpenWrt doesn't have a static set of developers. Basically anyone who contributes to the source/firmware is part of the OpenWrt team, and that includes not only the people who add code but also people with good knowledge of OpenWrt.
However, there are people overseeing everything, making sure it all works and is easy to adapt to new versions of OpenWrt. That is the same group that did not merge the NSS repos into the main releases.

Regarding compatibility when crossing from one repo to another, I have no clue. You'll have to ask @hnyman or go through his thread to find out whether you can sysupgrade to any firmware without issues, but usually you can sysupgrade to any version unless the OpenWrt release notes say otherwise.
Another note: I have no clue how @hnyman's repos work, as I run the NBG6817, which is oddly similar to the R7800 hahaha!

I'm going to look at it since I want to switch to 5.10 too; it's too much trouble maintaining 5.4 for NSS in master.
So sit tight and watch out for commits in my repo.

Will do. I've been troubleshooting gmac and building a 5.4 => 5.10 patch.

This is what I have so far for the patch I've added to the gmac package patch folder (hope it helps you get started):

--- a/ipq806x/nss_gmac_dev.c
+++ b/ipq806x/nss_gmac_dev.c
@@ -1585,7 +1585,7 @@ 
	}

	/* ioremap addresses */
-	gmacdev->mac_base = ioremap_nocache(reg_base,
+	gmacdev->mac_base = ioremap(reg_base,
						      NSS_GMAC_REG_BLOCK_LEN);
	if (!gmacdev->mac_base) {
		netdev_dbg(netdev, "ioremap fail.\n");
--- a/ipq806x/nss_gmac_ctrl.c
+++ b/ipq806x/nss_gmac_ctrl.c
@@ -984,7 +984,7 @@ 
	
	of_property_read_u32(np, "qcom,aux-clk-freq", &gmacdev->aux_clk_freq);

-	gmaccfg->phy_mii_type = of_get_phy_mode(np);
+	gmaccfg->phy_mii_type = of_get_phy_mode(np, 0);
	netdev->irq = irq_of_parse_and_map(np, 0);
	if (netdev->irq == NO_IRQ) {
		pr_err("%s: Can't map interrupt\n", np->name);
@@ -1061,26 +1061,26 @@ 
	ctx.msm_clk_ctl_enabled = true;
 #endif

-	ctx.nss_base = (uint8_t *)ioremap_nocache(res_nss_base.start,
+	ctx.nss_base = (uint8_t *)ioremap(res_nss_base.start,
						  resource_size(&res_nss_base));
	if (!ctx.nss_base) {
		pr_info("Error mapping NSS GMAC registers\n");
		ret = -EIO;
		goto nss_gmac_cmn_init_fail;
	}
	pr_debug("%s: NSS base ioremap OK.\n", __func__);

-	ctx.qsgmii_base = (uint32_t *)ioremap_nocache(res_qsgmii_base.start,
+	ctx.qsgmii_base = (uint32_t *)ioremap(res_qsgmii_base.start,
					      resource_size(&res_qsgmii_base));
	if (!ctx.qsgmii_base) {
		pr_info("Error mapping QSGMII registers\n");
		ret = -EIO;
		goto nss_gmac_qsgmii_map_err;
	}
	pr_debug("%s: QSGMII base ioremap OK, vaddr = 0x%p\n",
						__func__, ctx.qsgmii_base);

-	ctx.clk_ctl_base = (uint32_t *)ioremap_nocache(res_clk_ctl_base.start,
+	ctx.clk_ctl_base = (uint32_t *)ioremap(res_clk_ctl_base.start,
				       resource_size(&res_clk_ctl_base));
	if (!ctx.clk_ctl_base) {
		pr_info("Error mapping Clk control registers\n");
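
A note on the of_get_phy_mode() hunk above, as it may be worth double-checking: as far as I can tell from the 5.10 headers (include/linux/of_net.h), the helper now returns 0 or a negative errno and writes the mode through its second argument, which is a phy_interface_t pointer. Passing 0 there would hand it a NULL pointer, and the return value would no longer be the mode. Below is a minimal user-space sketch of the calling pattern. The enum, struct device_node, struct gmac_cfg and gmac_read_phy_mode() are hypothetical stand-ins so that it compiles outside the kernel; only the shape of of_get_phy_mode() is meant to mirror the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins; the real types live in kernel headers. */
typedef enum {
	PHY_INTERFACE_MODE_NA,
	PHY_INTERFACE_MODE_RGMII,
	PHY_INTERFACE_MODE_SGMII,
} phy_interface_t;

struct device_node {
	const char *phy_mode;	/* mock of the "phy-mode" DT property */
};

/*
 * Mock with the same shape as the newer kernel helper: status in the
 * return value, mode through the pointer. The mock returns -ENODEV for
 * anything it does not recognise.
 */
static int of_get_phy_mode(struct device_node *np, phy_interface_t *interface)
{
	*interface = PHY_INTERFACE_MODE_NA;	/* written first, so NULL is unsafe */
	if (!np || !np->phy_mode)
		return -ENODEV;
	if (!strcmp(np->phy_mode, "rgmii"))
		*interface = PHY_INTERFACE_MODE_RGMII;
	else if (!strcmp(np->phy_mode, "sgmii"))
		*interface = PHY_INTERFACE_MODE_SGMII;
	else
		return -ENODEV;
	return 0;
}

/* Hypothetical stand-in for the driver's config struct. */
struct gmac_cfg {
	unsigned int phy_mii_type;
};

/* Calling pattern: read into a local, copy only on success. */
static int gmac_read_phy_mode(struct device_node *np, struct gmac_cfg *cfg)
{
	phy_interface_t mode;
	int ret = of_get_phy_mode(np, &mode);

	if (ret)
		return ret;	/* leave cfg untouched on failure */
	cfg->phy_mii_type = mode;
	return 0;
}
```

In the driver itself this would presumably become a local phy_interface_t plus an error check around the existing assignment; how phy_mii_type is declared in nss_gmac is not shown in the thread, so the cast-free copy above is an assumption.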

I've cleared most of the compile errors. The patch is not finished yet. I'm still getting this one error in the build log that I haven't found a solution for yet (I haven't looked at it in depth):

/home/HTPC/OpenWRT/NSSMaster2/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-gmac-9b74deef/ipq806x/nss_gmac_ctrl.c:910:27: error: initialization of 'void (*)(struct net_device *, unsigned int)' from incompatible pointer type 'void (*)(struct net_device *)' [-Werror=incompatible-pointer-types]
  910 |         .ndo_tx_timeout = &nss_gmac_tx_timeout,
      |                           ^
/home/HTPC/OpenWRT/NSSMaster2/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/qca-nss-gmac-9b74deef/ipq806x/nss_gmac_ctrl.c:910:27: note: (near initialization for 'nss_gmac_netdev_ops.ndo_tx_timeout')

This is the section of the nss_gmac_ctrl.c file that needs editing (where the error came from):

/**
 * Netdevice operations
 */
static const struct net_device_ops nss_gmac_netdev_ops = {
	.ndo_open = &nss_gmac_open,
	.ndo_stop = &nss_gmac_close,
	.ndo_start_xmit = &nss_gmac_xmit_frames,
	.ndo_get_stats64 = &nss_gmac_get_stats64,
	.ndo_set_mac_address = &nss_gmac_set_mac_address,
	.ndo_validate_addr = &eth_validate_addr,
	.ndo_change_mtu = &nss_gmac_change_mtu,
	.ndo_do_ioctl = &nss_gmac_do_ioctl,
	.ndo_tx_timeout = &nss_gmac_tx_timeout,
	.ndo_set_rx_mode = &nss_gmac_set_rx_mode,
	.ndo_set_features = &nss_gmac_set_features,
};

I also just re-added 999-01-Revert-ARM-dma-mapping-remove-dmac_clean_range-and-d.patch to bring back dmac_clean_range for nss-drv. I'm compiling right now and will see how far it gets.

Didn't they add a new argument to tx timeout?

unsigned int txqueue
as in:
(*ndo_tx_timeout) (struct net_device *dev, unsigned int txqueue);

Little overview: https://elixir.bootlin.com/linux/v5.10.82/source/include/linux/netdevice.h
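
To make the prototype change concrete, here is a minimal user-space sketch. struct net_device and struct net_device_ops below are cut-down, hypothetical stand-ins (the real ones are in the netdevice.h linked above), and the handler body is a placeholder rather than the real nss_gmac code. The point is only that the handler has to accept the extra txqueue argument for the initializer to type-check.

```c
/* Hypothetical cut-down stand-ins so this compiles outside the kernel. */
struct net_device {
	const char *name;
	unsigned int tx_timeouts;	/* mock counter, not a kernel field */
	unsigned int last_txqueue;	/* mock field, not a kernel field */
};

struct net_device_ops {
	/* Newer prototype: the index of the stalled queue is passed in. */
	void (*ndo_tx_timeout)(struct net_device *dev, unsigned int txqueue);
};

/*
 * Handler with the updated signature. A driver that does not care
 * which queue stalled can simply ignore txqueue.
 */
static void nss_gmac_tx_timeout(struct net_device *netdev, unsigned int txqueue)
{
	netdev->tx_timeouts++;
	netdev->last_txqueue = txqueue;
}

static const struct net_device_ops nss_gmac_netdev_ops = {
	.ndo_tx_timeout = &nss_gmac_tx_timeout,
};
```

With the old one-argument handler, the same initializer produces the incompatible-pointer-types error quoted earlier in the thread.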

I hope you didn't waste too much time in investigating, look at my post before yours and follow the link to my commit :innocent:

I'm still patching, lots of work. Probably have to continue tomorrow.

Yeah, I noticed it later. For some odd reason this website doesn't want to update properly. Maybe I have a plugin/add-on of some sort preventing updates once I'm on the page? Anyway, I did see your commits and it looks like it's heading in the right direction!

Now that I think about it, I'm fairly sure we could even patch it for 5.15 if we wanted to.

I have a dumb question. Does applying nss_fqcodel to the ethernet interface, without shaping, work?

I'm fairly sure that nss_fqcodel is practically the same as regular fq_codel; the only difference is that it is offloaded to the NSS core(s?) rather than run on the CPU. I never really went full geek mode on shaping, because there are way too many factors that can cause bufferbloat etc., plus there isn't much network load in this household besides mine, so shaping would be almost useless.

But I do wonder: what would the benefits be of running nss_fqcodel without shaping?
Let's say we put 0 for ingress/egress and hope that means unlimited.
What would it do? Would it give full speed? Would buffers be handled better than without it?
Many questions, little knowledge. I guess it requires some testing/benchmarking?

Little note:
I found another site that is good to test on and somewhat accurate, it also has a lot of details/information regarding download/upload speed, ping(latency) and jitter.

OpenWrt runs fq_codel on all interfaces. So, let's say you have 3 radios and a 1 Gbit internet connection, together capable of driving the internal LAN past 2 Gbit, but the LAN is only 1 Gbit... so... where do the packets go?

I worry, especially with the upcoming shift to 2.5 Gbit interfaces, that we're back to big dumb FIFOs on the 1 Gbit one, instead of shedding and spreading the load. My hope was that by making fq_codel ubiquitous we'd always see zero latency for sparse flows plus a few us of serialization/routing delay, no more than 5 ms for saturating ones, and minimal packet loss (even none, with ECN). But that bright shiny future... needs a little testing.

Just recently updated to the latest Stable from 11/06 and am seeing some stuff in the System log that I have not seen before. I get this periodically in the System log:

Wed Dec  1 06:13:17 2021 daemon.err dnsmasq[11938]: failed to load names from /tmp/hosts/dhcp.cfg01411c: Permission denied
Wed Dec  1 06:13:17 2021 daemon.info dnsmasq-dhcp[11938]: read /etc/ethers - 0 addresses
Wed Dec  1 06:13:37 2021 daemon.err nlbwmon[5303]: Netlink receive failure: Out of memory
Wed Dec  1 06:13:37 2021 daemon.err nlbwmon[5303]: Unable to dump conntrack: No buffer space available

and some stuff in the kernel log I have not noticed in a while:

[  443.590479] ath10k_pci 0000:01:00.0: Invalid VHT mcs 15 peer stats
[  554.318594] ath10k_pci 0001:01:00.0: htt tx: fixing invalid VHT TX rate code 0xff
[ 1743.347994] ath10k_pci 0001:01:00.0: wmi: fixing invalid VHT TX rate code 0xff
[18265.496010] ath10k_pci 0000:01:00.0: Invalid peer id 6 or peer stats buffer, peer: 00000000  sta: 00000000

I have seen this before, and it is why I switched to the non-CT version for a while. Is any of this anything to worry about?

Keith

Updated the 21.02 build with all the latest 21.02 commits. There have been tons of mac80211 and hostapd fixes.

5.10 kernel is getting close. Hopefully this weekend will have a working build.

You really are a machine. I think you are South American, so you will understand jajaja. Gracias!!

@ACwifidude remember that IPv6 will cause a kernel panic. Did you test that? I had to revert a patch to fix the panic (the problem is in ECM IPv6).

Haven’t tested it yet. For the kernel 5.10 build I removed most of the features (qdisc, crypto, etc.) and simplified the package set to just what's needed for hardware offloading. It had some conflicts with some packages during the compile (not NSS packages), so I’ll have to pull the log this weekend and figure that out.

If you have some suggested fixes or patches I’d love to compare with any repos you have available.

My 5.10 build looks exactly like Kong’s.

Is this regarding wifi or wired?
What I expect is that packets get scheduled whenever traffic would go over a limit given to the driver. At least I hope the current system does that, and does it efficiently. If not, I guess we'll have to look into it, as I see a lot of potential in such a system.

As for 2.5 Gbit interfaces, I wouldn't have a clue how they would affect the NSS core(s).
I know they can handle wired gigabit fairly easily without too much trouble internally.
External WAN speeds depend on too many factors and complexities to be considered stable, because in some countries they depend heavily on the wall-to-hub cable and the hub gear. For example, in my case the wall-to-hub cable is copper, but my ISP claims that everything from the hub outward is fiber.
As far as my knowledge of cables goes, copper is very slow and is doomed by power loss over distance. As I understand it, loss of power means inefficiency, which adds loads of latency, and I consider that problematic in any use case.

As for wifi, I have too few devices in the house that rely on it: just 2 phones and 2 tablets.
Only 1 tablet is constantly active, watching films/videos, and it sits extremely close to the router (the source), which should cause little to no trouble in that use case.
However, I have been thinking about making the whole household wifi-dependent as a future project, but current technology is too far behind wired in terms of accuracy. That would only change if some technology magically appeared that lets wifi beat fiber latency and speeds, which I don't see happening unless quantum networking gets a breakthrough.
Also, wifi is doomed by a lot of factors and complexities, such as signal congestion from other routers nearby and the occasional radar sweep from the government and airports.

That error message comes from nlbwmon. You can raise its buffer with option netlink_buffer_size '1048576' in /etc/config/nlbwmon, but you also need to raise the corresponding sysctl (net.core.rmem_max=1048576 in, e.g., /etc/sysctl.d/12-nlbwmon.conf).
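
Spelled out as file contents, the two settings would look roughly like this. The values and paths are the ones given above; placing the option inside the existing "config nlbwmon" section is an assumption based on the stock config layout.

```
# /etc/config/nlbwmon  (inside the existing "config nlbwmon" section)
	option netlink_buffer_size '1048576'

# /etc/sysctl.d/12-nlbwmon.conf
net.core.rmem_max=1048576
```

nlbwmon needs a restart to pick up the new buffer size. The sysctl file should be applied at the next boot, and running sysctl -p on that file should load it right away.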

You should try netdata, very good looking app for any kind of monitoring!