Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

There's a script, /etc/sysctl.d/qca-nss-ecm.conf, that sets net.bridge.bridge-nf-call-iptables=1. If you set it to 0 instead, promisc mode on br-lan isn't needed. I just tried it....
/etc/init.d/qca-nss-ecm also sets the above values, so it's a tricky one to get past...

From what I can tell, nss is somehow hooked in to netfilter, the sysctl setting enables netfilter on the bridge allowing bridged traffic to be accelerated. physdev-is-bridged apparently matches traffic that's being bridged, so the rule is added to ensure having enabled netfilter on the bridge, we don't inadvertently block bridged traffic...
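
To compare setups, it may help to print the two settings discussed above side by side. This is only a minimal sketch: the function names are made up for illustration, the optional path argument exists only so the function can be tried off-router, and the promisc check assumes the usual IFF_PROMISC bit (0x100) in the value of /sys/class/net/br-lan/flags.

```shell
#!/bin/sh
# Print the current bridge-netfilter setting, or "unavailable" if the
# sysctl file does not exist (for example, bridge netfilter not loaded).
# The optional argument overrides the path (hypothetical helper).
bridge_nf_state() (
    file="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unavailable"
    fi
)

# Print "on" or "off" for a flags value such as the contents of
# /sys/class/net/br-lan/flags. IFF_PROMISC is bit 0x100.
promisc_state() (
    flags="$1"
    if [ $(( $flags & 0x100 )) -ne 0 ]; then
        echo on
    else
        echo off
    fi
)

# On the router (not run here):
#   bridge_nf_state
#   promisc_state "$(cat /sys/class/net/br-lan/flags)"
```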

Your observation that the physdev rule isn't required matches my own, most likely because there are other rules in the forward chain that allow the traffic. I also agree, the scripts do look a bit weird :slight_smile:

It does sound like iptables is still blocking your port forwards and additional rules are required to allow the traffic in the forward chain. physdev-is-bridged apparently matches traffic being bridged, which I don't think WAN-to-LAN traffic would be; however, it's still seen on the bridge interface and therefore blocked with the sysctl setting set. I have no idea why I'm not seeing the problem, there must be something different between our setups...

Well... I think one reason it doesn't work is that net.bridge.bridge-nf-call-iptables=1 is set, but without it the router restarts after 3 minutes. But you and I have the same issue.
Router is stable for me too without promisc mode except that port forwards can't be reached from LAN (ethernet).

What's weird is that port forwards work fine on wlan... I have no special rules for wlan devices.
We'll see how it behaves with nft... perhaps things changed. I'm waiting for the recent fw4 commits.

Completely agree with you. Still wonderful firmware; I've been using it for years on my old 1043nd and other devices and it's been perfect. I trust you!! I'm already at my limits on these subjects.

Well, I did a rebase of this build so I got the fw4 commits.
Using external port forwards from LAN still doesn't work, I had to set br-lan to promisc mode for them to work.

However, no crash yet... I also set scaling_min_freq to 800000, which I had before too, but I'm not sure it was actually applied because I noticed that my rc.local got wiped somehow.
Cross your fingers... uptime 1d4h, it should have rebooted by now, let's see if it survives the night.
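
For anyone wanting to re-apply the minimum frequency after rc.local gets wiped, something along these lines could go back into it. It is a sketch only: the function name is hypothetical, the second argument exists just so it can be tried against a scratch directory, and it assumes the standard cpufreq sysfs layout under /sys/devices/system/cpu.

```shell
#!/bin/sh
# Write a minimum CPU frequency (in kHz) to every CPU that exposes a
# writable scaling_min_freq file. The second argument overrides the
# sysfs base directory (hypothetical helper, for testing only).
set_min_freq() (
    khz="$1"
    base="${2:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_min_freq; do
        # Skip CPUs without cpufreq support (or an unmatched glob).
        [ -w "$f" ] || continue
        echo "$khz" > "$f"
    done
)

# In /etc/rc.local on the router (not run here):
#   set_min_freq 800000
```

Reading the files back with cat afterwards is a quick way to confirm the value actually stuck.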

Btw, I had a look at the /etc/init.d/qca-nss-drv script once again... it's not correct from what I can tell, it sets the nss queues wrongly.

Acceleration seems to work:

# cat /sys/kernel/debug/ecm/ecm_nss_ipv4/udp_accelerated_count
6

# cat /sys/kernel/debug/ecm/ecm_nss_ipv4/udp_accelerated_count
50
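
The two reads above can be wrapped in a small helper so the counters are easy to poll while testing. This is a rough sketch: the function name is made up, and the optional directory argument is only there so it can be tried without debugfs.

```shell
#!/bin/sh
# Print the TCP and UDP accelerated-connection counters exposed by ECM.
# The optional argument overrides the debugfs directory (hypothetical
# helper, useful for trying the function off-router).
ecm_accel_counts() (
    dir="${1:-/sys/kernel/debug/ecm/ecm_nss_ipv4}"
    for proto in tcp udp; do
        f="$dir/${proto}_accelerated_count"
        if [ -r "$f" ]; then
            printf '%s=%s\n' "$proto" "$(cat "$f")"
        fi
    done
)

# On the router (not run here):
#   ecm_accel_counts
```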

EDIT:
It survived the night, still going this morning...

Hello, I am new here

I have an EA8500 with official OpenWrt 21.02.1 build, and I flashed build EA8500-20220127-Stable2012NSS-sysupgrade.bin via Luci web interface
I didn't keep any settings

Now the router boots fine, but I can't connect to it via Ethernet or WLAN.
Any idea?

The LED just stays on, not flashing.

My EA8500 had the same problem; you have to modify the driver and rebuild:

or use the newest v5.10 repo to build.

Thanks!
I finally realized that there's a hardware button to switch on WLAN, and luckily I got out of this terrible situation.

Of course I should make my own build, based on my requirements.

Not sure if you've seen the FW4 / NFTables tips and tricks threads, but nft monitor in conjunction with the nftrace function looks really useful. Given how odd your problem is, I'm not sure if it's going to help or not, but it'll certainly tell you where the traffic is going from an nft perspective.
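
A rough sketch of how nftrace could be switched on for one forwarded port follows. The table name trace_debug, the hook priority and the example port are all assumptions picked for illustration, not anything taken from fw4; the helper only prints a ruleset so it can be reviewed before being loaded.

```shell
#!/bin/sh
# Print a throwaway nftables ruleset that marks TCP packets to the
# given destination port for tracing. Table name and priority are
# hypothetical choices for this example.
trace_ruleset() (
    port="$1"
    cat <<EOF
table inet trace_debug {
    chain prerouting {
        type filter hook prerouting priority -350; policy accept;
        tcp dport $port meta nftrace set 1
    }
}
EOF
)

# On the router (not run here):
#   trace_ruleset 8080 | nft -f -       # load the tracing table
#   nft monitor trace                   # watch which chains/rules match
#   nft delete table inet trace_debug   # clean up afterwards
```

Keeping the trace rule in its own table means it can be removed in one go without touching the ruleset that fw4 generated.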

qca-nss-drv pushes interrupts for nss_queue1 over to CPU1 in my case, but what's confusing me is why I have nss_queue1 listed twice in /proc/interrupts. The script itself looks pretty generic and therefore able to handle devices with more CPUs and queues than we have, like the ax3600, which also has its nss_queues listed twice in interrupts, so I guess there's a good reason for it...

True, I also have that, however the counts are 0 for IRQ43.

 41:          0    6027959     GIC-0 264 Level     nss_queue1
 43:          0          0     GIC-0 265 Level     nss_queue1

Any ideas @quarky? Is it the same for you?

Same for me. From what I understand of the driver's code, the second queue is for the 2nd NSS core, which is mainly for the crypto engine. Since it's not used, it'll be zero.

The name of the queue (being the same) is likely just sloppy programming.
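
To make the comparison between routers easier, the nss_queue lines can be pulled out of /proc/interrupts and the per-CPU counts summed per IRQ. A minimal sketch, with a made-up function name and an optional file argument that is only there for trying it on saved output:

```shell
#!/bin/sh
# For every /proc/interrupts line whose last field starts with
# "nss_queue", print: IRQ number, queue name, total count over all CPUs.
nss_irq_counts() (
    file="${1:-/proc/interrupts}"
    awk '$NF ~ /^nss_queue/ {
        irq = $1
        sub(/:$/, "", irq)
        total = 0
        # Per-CPU counters are the numeric fields right after the IRQ.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
            total += $i
        printf "%s %s %d\n", irq, $NF, total
    }' "$file"
)

# On the router (not run here):
#   nss_irq_counts
```

A total of 0 on the second line would match the explanation above that the second NSS core's queue is simply unused.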

@GloooM, @KONG (or anyone else with an ea8500): the current 21.02 and master .dts changes are the same. It seems like the master settings work. Which settings work for you on 21.02 (it seems like they need to be different)?

21.02:


--- b/arch/arm/boot/dts/qcom-ipq8064-ea8500.dts
+++ a/arch/arm/boot/dts/qcom-ipq8064-ea8500.dts
@@ -107,15 +107,15 @@	
 };

 &gmac1 {
-	qcom,phy_mdio_addr = <4>;
-	qcom,poll_required = <1>;
-	qcom,rgmii_delay = <0>;
+	qcom,phy-mdio-addr = <0>;
+	qcom,poll-required = <0>;
+	qcom,rgmii-delay = <1>;
	qcom,emulation = <0>;
 };

 /* LAN */
 &gmac2 {
-	qcom,phy_mdio_addr = <0>;	/* none */
+	qcom,phy-mdio-addr = <4>;
	qcom,poll_required = <0>;	/* no polling */
	qcom,rgmii_delay = <0>;
	qcom,emulation = <0>;

master:


--- b/arch/arm/boot/dts/qcom-ipq8064-ea8500.dts
+++ a/arch/arm/boot/dts/qcom-ipq8064-ea8500.dts
@@ -113,15 +113,15 @@	
 };

 &gmac1 {
-	qcom,phy_mdio_addr = <4>;
-	qcom,poll_required = <1>;
-	qcom,rgmii_delay = <0>;
+	qcom,phy-mdio-addr = <0>;
+	qcom,poll-required = <0>;
+	qcom,rgmii-delay = <1>;
	qcom,emulation = <0>;
 };

 /* LAN */
 &gmac2 {
-	qcom,phy_mdio_addr = <0>;	/* none */
+	qcom,phy-mdio-addr = <4>;
	qcom,poll_required = <0>;	/* no polling */
	qcom,rgmii_delay = <0>;
	qcom,emulation = <0>;

Hi @ACwifidude, I cannot test all builds because my ea8500 is my everyday ISP router; I have only tested the v5.10 build.

On my ea8500 (Hong Kong version) with the NSS build, the following config is a must:

 &gmac1 {
	qcom,phy-mdio-addr = <0>;
...
 /* LAN */
 &gmac2 {
	qcom,phy-mdio-addr = <4>;
...

So the above 21.02 and master should be ok for me, but not tested.

BTW, I have noticed in OpenWrt official build, 21.02.1, the ea8500 driver is:

 &gmac1 {
	qcom,phy-mdio-addr = <4>;
...
 /* LAN */
 &gmac2 {
	qcom,phy-mdio-addr = <0>;
...

Using this config, the official build also works on my unit. I do not know why.
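
Since the two layouts only differ in which gmac gets which MDIO address, a quick way to compare DTS files between builds is to print just that property per node. This is a sketch with a hypothetical function name; it accepts both the hyphen and underscore spellings because both appear in this thread, and it expects a plain .dts file rather than a patch.

```shell
#!/bin/sh
# Print "<node> <addr>" for every phy-mdio-addr property found inside
# a &gmacN { ... }; block of the given DTS file.
gmac_mdio_addrs() (
    awk '
        $1 ~ /^&gmac[0-9]+$/ { node = substr($1, 2); next }
        $1 == "};"           { node = "" }
        node != "" && /qcom,phy[-_]mdio[-_]addr/ {
            # Extract the number between the angle brackets.
            if (match($0, /<[0-9]+>/))
                print node, substr($0, RSTART + 1, RLENGTH - 2)
        }
    ' "$1"
)

# Example (file name from the diffs above, path relative to the kernel tree):
#   gmac_mdio_addrs arch/arm/boot/dts/qcom-ipq8064-ea8500.dts
```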

I noticed I get these when using the 2.4GHz interface/radio...
"wlan1: NSS TX failed with error: NSS_TX_FAILURE_TOO_SHORT"

I don't get them on the 5GHz interface tho'. I think I asked about this before but I don't think I ever got an answer.

Yep on 5.10, it is different.

I'm still testing stability with promisc mode disabled. I'm at 8 days already and everything is stable and very fast, with Wi-Fi at maximum speed and no disconnections. To test, disable these options on both Wi-Fi radios:

Enable key reinstallation (KRACK) countermeasures

and

Disassociate On Low Acknowledgment

What makes me curious is why almost no one else sees failures or instability in promisc mode. I imagine that some device on my wired network causes some kind of incompatibility with the router.

how about the tests with firewall4?

My router finally rebooted with br-lan in promisc mode and fw4.
However, it rebooted when I fiddled with the firewall, so I might have caused it myself.

I'm curious why promisc mode makes the router run like a normal build where hairpin forwards work.
I know for a fact that net.bridge.bridge-nf-call-iptables=1 causes promisc mode to be needed, but there's got to be a firewall rule that fixes hairpin forwards.

Perhaps someone can figure out a solution using this as guide:

A really alarming discovery I've just made with the current master build from 20220206 with firewall4:
If I uncheck a firewall traffic rule and hit Save & Apply, the rule remains active and the port is left open.
Can anyone try the same and confirm it?
This is the current content of /etc/config/firewall:

config rule
	option name 'ssh-wan'
	list proto 'tcp'
	option src 'wan'
	option dest_port '22'
	option target 'ACCEPT'
	option enabled '0'

I'm still able to connect from the outside world.

I noticed that too. I don't think firewall4 reloads properly from LuCI; you have to do it from the shell with fw4 reload.
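
To double-check after a reload, it can help to list which rules the config says are disabled and then look for them in the live ruleset. The helper below is only a sketch with a made-up name; it parses the UCI file with awk rather than using uci, and the final grep assumes the rule name shows up somewhere in the nft output, which I haven't verified for every rule type.

```shell
#!/bin/sh
# Print the name of every "config rule" section that has
# option enabled '0' in the given firewall config file.
disabled_rules() (
    file="${1:-/etc/config/firewall}"
    awk -v q="'" '
        function emit() { if (in_rule && disabled) print name }
        function unquote(s) { gsub(q, "", s); gsub(/"/, "", s); return s }
        $1 == "config" {
            emit()
            in_rule = ($2 == "rule"); name = "(unnamed)"; disabled = 0
            next
        }
        in_rule && $1 == "option" && $2 == "name" {
            v = $0
            sub(/^[ \t]*option[ \t]+name[ \t]+/, "", v)
            name = unquote(v)
        }
        in_rule && $1 == "option" && $2 == "enabled" && unquote($3) == "0" {
            disabled = 1
        }
        END { emit() }
    ' "$file"
)

# On the router (not run here):
#   fw4 reload
#   for r in $(disabled_rules); do nft list ruleset | grep -i "$r"; done
```

If the grep still prints something for a disabled rule after fw4 reload, the rule really is still loaded.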

Thanks.
Your suggestion works around the issue for now.

I tried to make a build for the EA8500 and it failed every time, saying my uImage is too big.
In diffconfig, I tried deleting almost all add-ons but it didn't work, still "too big".

Then I looked in target/linux/ipq806x/image/generic.mk and found:

define Device/linksys_ea8500
	$(call Device/LegacyImage)
	DEVICE_VENDOR := Linksys
	DEVICE_MODEL := EA8500
	SOC := qcom-ipq8064
	PAGESIZE := 2048
	BLOCKSIZE := 128k
	KERNEL_SIZE := 3072k

Then I changed KERNEL_SIZE to 3200k and the build succeeded.

Will this brick my router? Or should I just proceed?
I don't think I can change the partition table, so how could I shrink the kernel size instead?
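
One way to sanity-check this before flashing is to compare the built kernel size against both the KERNEL_SIZE value and the size of the kernel partition reported in /proc/mtd on the running router. The helpers below are a sketch: the function names are made up, and the partition name "kernel" in the usage comment is a guess that should be checked against the actual /proc/mtd output.

```shell
#!/bin/sh
# Convert a size such as 3072k or 4m to bytes; plain numbers pass through.
size_to_bytes() (
    v="$1"
    case "$v" in
        *k) echo $(( ${v%k} * 1024 )) ;;
        *m) echo $(( ${v%m} * 1024 * 1024 )) ;;
        *)  echo "$v" ;;
    esac
)

# Succeed if an image of the given byte size fits within the limit.
fits_limit() (
    [ "$1" -le "$(size_to_bytes "$2")" ]
)

# Print the size in bytes of the named MTD partition. /proc/mtd lists
# sizes as hex in its second column. Second argument overrides the file.
mtd_size_bytes() (
    name="$1"
    file="${2:-/proc/mtd}"
    hex=$(awk -v n="\"$name\"" '$4 == n { print $2 }' "$file")
    if [ -n "$hex" ]; then
        echo $(( 0x$hex ))
    else
        return 1
    fi
)

# On the router (not run here; partition name is an assumption):
#   mtd_size_bytes kernel
```

For reference, 3072k is 3145728 bytes and 3200k is 3276800 bytes, so raising KERNEL_SIZE only helps if the partition on flash is at least that large.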