IPQ807X NSS Build

Cheers will do.

Thanks for posting your config file. This helps as we can see which kernel modules you have selected. Will try these out in a new build :+1:

I think the way to go is to provide a "default" config with the kernel modules that work, or that we think work. That way it can be tested by the community, including people like me who don't know too much about this, to confirm whether it is stable or has bugs. I think this default config should also include luci and the full wpad version (so that users with a mesh AP can upgrade without losing connectivity). The rest is better kept as in default OpenWrt.

On the other hand, this thread can be used to discuss other modules to add for advanced users; once tested by those users, they can be added to the "default" config.

What do you think? Is this a good approach?

PS: I've added a link to this thread in the wiki, to see if more people are interested.

Excellent idea ... though I suggest adding all of the nss modules (apart from ipsecmgr, which is breaking at the moment). I feel it is better to cover all use cases than to pick just a few, like the mesh support you mentioned, or pppoe, etc.

I am seeing a ridiculously high number of retries when using NSS with the 2.5G port running at 2.5G, while at 1G it's fine. The CPU is basically not loaded at all.

Note that this is without ECM through NAT. Has somebody else seen this as well?

I built your repo today using the .config you posted.
Looks like the OC is only cosmetic; the CPU is not actually overclocked:

root@X2:~# mhz
count=330570 us50=11841 us250=59218 diff=47377 cpu_MHz=1395.487

coremark gives the same result as openwrt snapshot:

root@X2:~# coremark
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 12799
Total time (secs): 12.799000
Iterations/Sec   : 4687.866240
Iterations       : 60000

Thanks for all your work.
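
As a quick sanity check on the mhz and coremark outputs above, the score can be recomputed from the raw numbers and normalised by the measured clock, which makes builds at different clock speeds easier to compare. This is only a minimal sketch: the helper names are made up for illustration and are not part of mhz or coremark.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# coremark_score ITERATIONS SECONDS -> Iterations/Sec
coremark_score() {
    awk -v it="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", it / secs }'
}

# coremark_per_mhz SCORE MHZ -> score normalised by the measured clock
coremark_per_mhz() {
    awk -v score="$1" -v mhz="$2" 'BEGIN { printf "%.2f\n", score / mhz }'
}

# Figures taken from the outputs posted above
coremark_score 60000 12.799
coremark_per_mhz 4687.866240 1395.487
```

If the overclock were real, the measured MHz and the raw score would both be expected to rise while the per-MHz figure stayed roughly the same.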

That's a great idea. All the modules should be fixed now, at least in my testing. I've added a configure script (borrowing a lot of it from @Qosmio's QSDK11.2 nss-packages) to simplify enabling / disabling them, and also to try to prevent reverse / circular dependency issues, as was the case with ipsecmgr. I am enabling the following on my build, all confirmed to load and work on the AX3600:

CONFIG_PACKAGE_kmod-qca-nss-drv=y
CONFIG_PACKAGE_kmod-qca-nss-drv-bridge-mgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-clmapmgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-dtlsmgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-eogremgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-gre=y
# CONFIG_PACKAGE_kmod-qca-nss-drv-ipsecmgr is not set
CONFIG_PACKAGE_kmod-qca-nss-drv-l2tpv2=y
CONFIG_PACKAGE_kmod-qca-nss-drv-lag-mgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-map-t=y
CONFIG_PACKAGE_kmod-qca-nss-drv-match=y
CONFIG_PACKAGE_kmod-qca-nss-drv-netlink=y
# CONFIG_PACKAGE_kmod-qca-nss-drv-ovpn-link is not set
# CONFIG_PACKAGE_kmod-qca-nss-drv-ovpn-mgr is not set
CONFIG_PACKAGE_kmod-qca-nss-drv-pppoe=y
CONFIG_PACKAGE_kmod-qca-nss-drv-pptp=y
CONFIG_PACKAGE_kmod-qca-nss-drv-pvxlanmgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-tlsmgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-tun6rd=y
CONFIG_PACKAGE_kmod-qca-nss-drv-tunipip6=y
CONFIG_PACKAGE_kmod-qca-nss-drv-vlan-mgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-vxlanmgr=y
CONFIG_PACKAGE_kmod-qca-nss-drv-igs=y
CONFIG_PACKAGE_kmod-qca-nss-drv-qdisc=y
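
To compare a local build against the list above, the enabled NSS modules can be pulled out of a .config. A minimal sketch: `list_nss_modules` is a hypothetical helper name, and it assumes the standard `CONFIG_PACKAGE_<name>=y` format shown above (packages selected as `=m` are deliberately ignored).

```shell
#!/bin/sh
# list_nss_modules [CONFIG]: hypothetical helper; prints the qca-nss-drv
# kmods that are built into the image (=y). Defaults to ./.config.
list_nss_modules() {
    cfg="${1:-.config}"
    # Disabled packages are comments ("# ... is not set"), so they
    # never match the anchored pattern below.
    sed -n 's/^CONFIG_PACKAGE_\(kmod-qca-nss-drv[^=]*\)=y$/\1/p' "$cfg"
}
```

Run from the build root, `list_nss_modules .config` prints one module name per line.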

Ah interesting, that's a great find, thanks for the heads-up. I had a feeling that OC patch was way too good to be true; jumping from 1.4GHz to 2.2GHz seemed unrealistic. I'll remove it with the next openwrt sync.

I have now tested this, after the latest changes. I don't think qdisc was working properly (missing NSS-DRV symbols and hard dependency on qca-nss-drv-igs, both popped up now).

# Enable modules and start virtual interface
/etc/init.d/qca-nss-mirred start
insmod nss-ifb nss_dev_name=wan
ip link set up nssifb

# Shape ingress traffic to 900 Mbit with chained NSSFQ_CODEL
tc qdisc add dev nssifb root handle 1: nsstbl rate 900Mbit burst 1Mb
tc qdisc add dev nssifb parent 1: handle 10: nssfq_codel limit 10240 flows 1024 quantum 1514 target 5ms interval 100ms set_default

# Shape egress traffic to 500 Mbit with chained NSSFQ_CODEL
tc qdisc add dev wan root handle 1: nsstbl rate 500Mbit burst 1Mb
tc qdisc add dev wan parent 1: handle 10: nssfq_codel limit 10240 flows 1024 quantum 1514 target 5ms interval 100ms set_default

From the testing so far, it also applies to the WireGuard tunnel interfaces behind the WAN. It has obliterated buffer bloat and the random latency jumps, with no visible performance loss, so I think it's a win.
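
The ingress and egress commands above differ only in device and rate, so they can be wrapped in a small helper. A minimal sketch: `nss_shaper_cmds` is a hypothetical name, it only prints the commands rather than running them (pipe the output to `sh` to apply), and the qdisc parameters are copied unchanged from the commands above.

```shell
#!/bin/sh
# nss_shaper_cmds DEV RATE: hypothetical helper that prints (does not run)
# the nsstbl + nssfq_codel pair for one device.
nss_shaper_cmds() {
    dev="$1"
    rate="$2"
    echo "tc qdisc add dev $dev root handle 1: nsstbl rate $rate burst 1Mb"
    echo "tc qdisc add dev $dev parent 1: handle 10: nssfq_codel limit 10240 flows 1024 quantum 1514 target 5ms interval 100ms set_default"
}

nss_shaper_cmds nssifb 900Mbit   # ingress, via the NSS IFB device
nss_shaper_cmds wan 500Mbit      # egress
```

Afterwards, `tc -s qdisc show dev wan` lists the attached qdiscs together with their counters.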

It'd be great if anyone interested could try patching sqm-scripts with https://github.com/ricsc/sqm-scripts/commit/c824f6bca679aebc656fdaad8ebec6e11663b665 and the luci-app-sqm package, our ipq806x friends have done most of the hard work. If everything works, I'll add it to the repo afterwards.

It's nice to see that nss is working well. In addition, if the CPU can really run at 2.2GHz, it means we get the performance of Pro1210 at the price of Pro600, which sounds good! A few days ago I compiled your last commit and added ricsc's sqm-scripts patch, but it didn't work. Clearly there is still some work to do before luci-app-sqm can be used to set up QoS: for example, adjusting the sqm-scripts code to support creating the queues, and implementing multiple nssifb devices for those who use mwan3. Next, I will compile new firmware from your latest commit and try to make sqm-scripts work.

Here's the patch for sqm-scripts. You will need to edit /usr/lib/sqm/nss.qos and change the interface, and then use luci-sqm to configure this interface.

https://pastebin.com/raw/GgZD9syS

Note that I also changed START in /etc/init.d/sqm from 50 to 90, as it was starting too early:

[   18.960048] __nss_qdisc_init[2192]:parent (65536) and TC_H_ROOT (-1))
[   18.960070] __nss_qdisc_init[2193]:root->ops->owner (0000000000000000) and THIS_MODULE (ffffffc001044cc0))
[   18.960103] __nss_qdisc_init[2194]:NSS qdisc ffffff8002086400 (type 1) used along with non-nss qdiscs, or the interface is currently down
[   19.023053] __nss_qdisc_init[2192]:parent (65536) and TC_H_ROOT (-1))
[   19.023110] __nss_qdisc_init[2193]:root->ops->owner (0000000000000000) and THIS_MODULE (ffffffc001044cc0))
[   19.034538] __nss_qdisc_init[2194]:NSS qdisc ffffff8004201000 (type 1) used along with non-nss qdiscs, or the interface is currently down
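
The START change mentioned above can be scripted. A minimal sketch, assuming the init script contains a plain `START=50` line as described; `set_init_start` is a hypothetical helper. On OpenWrt the boot order comes from the `/etc/rc.d/S<NN>name` symlink, so the service has to be disabled and enabled again for the new value to take effect.

```shell
#!/bin/sh
# set_init_start FILE NEW: hypothetical helper that rewrites the START=
# line of an OpenWrt init script to NEW.
set_init_start() {
    file="$1"
    new="$2"
    sed -i "s/^START=[0-9]*\$/START=$new/" "$file"
}

# On the router (path as given in the post above):
#   /etc/init.d/sqm disable
#   set_init_start /etc/init.d/sqm 90
#   /etc/init.d/sqm enable
```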

EDIT - I suggest starting sqm manually ... this happens even with START=90, and it then crashes if I type something like 'tc' or 'ifconfig':

[   78.849339] rcu: INFO: rcu_preempt self-detected stall on CPU
[   78.860668] rcu:     1-....: (1 GPs behind) idle=aef/1/0x4000000000000002 softirq=3979/3980 fqs=13077
[   78.866401]  (t=60014 jiffies g=5389 q=3317)
[   78.875246] Task dump for CPU 1:
[   78.879758] task:tc              state:R  running task     stack:    0 pid: 3455 ppid:  3173 flags:0x0000000a
[   78.882984] Call trace:
[   78.892775]  dump_backtrace+0x0/0x17c
[   78.895036]  show_stack+0x18/0x40
[   78.898853]  sched_show_task+0x148/0x174
[   78.902153]  dump_cpu_task+0x44/0x58
[   78.906146]  rcu_dump_cpu_stacks+0xe4/0x128
[   78.909706]  rcu_sched_clock_irq+0x980/0xbf0
[   78.913614]  update_process_times+0x9c/0x11c
[   78.918126]  tick_sched_timer+0x58/0xd0
[   78.922379]  __hrtimer_run_queues+0x138/0x1dc
[   78.925940]  hrtimer_interrupt+0xe8/0x244
[   78.930453]  arch_timer_handler_virt+0x34/0x4c
[   78.934446]  handle_percpu_devid_irq+0x84/0x130
[   78.938787]  handle_domain_irq+0x60/0x90
[   78.943214]  gic_handle_irq+0x54/0xe0
[   78.947379]  call_on_irq_stack+0x28/0x54
[   78.950939]  do_interrupt_handler+0x54/0x60
[   78.954931]  el1_interrupt+0x30/0x70
[   78.958837]  el1h_64_irq_handler+0x18/0x24
[   78.962657]  el1h_64_irq+0x78/0x7c
[   78.966562]  queued_spin_lock_slowpath+0x1a0/0x2e0
[   78.969951]  nss_qdisc_stats_qdisc_detach+0x24b4/0x2d44 [qca_nss_qdisc]
[   78.974728]  nss_ppe_all_queue_enable_hybrid+0x50/0x60 [qca_nss_qdisc]
[   78.981238]  nss_qdisc_node_attach+0xb4/0x24c [qca_nss_qdisc]
[   78.987835]  nss_qdisc_stats_qdisc_detach+0x768/0x2d44 [qca_nss_qdisc]
[   78.993652]  qdisc_graft+0xb4/0x620
[   79.000069]  tc_modify_qdisc+0x48c/0x71c
[   79.003456]  rtnetlink_rcv_msg+0x110/0x334
[   79.007622]  netlink_rcv_skb+0x5c/0x130
[   79.011528]  rtnetlink_rcv+0x18/0x2c
[   79.015260]  netlink_unicast+0x184/0x280
[   79.019080]  netlink_sendmsg+0x1a0/0x3dc
[   79.022988]  ____sys_sendmsg+0x288/0x2c0
[   79.026893]  ___sys_sendmsg+0x84/0xf0
[   79.030800]  __sys_sendmsg+0x48/0xb0
[   79.034358]  __arm64_sys_sendmsg+0x24/0x30
[   79.038004]  invoke_syscall.constprop.0+0x5c/0x104
[   79.041912]  do_el0_svc+0x6c/0x15c
[   79.046684]  el0_svc+0x18/0x54
[   79.050068]  el0t_64_sync_handler+0xe8/0x114
[   79.053109]  el0t_64_sync+0x184/0x188

Hello, is Wi-Fi offload also included in this NSS build?

You have the option to compile nss-drv with:
CONFIG_NSS_DRV_WIFI_ENABLE
CONFIG_NSS_DRV_WIFI_EXT_VDEV_ENABLE
CONFIG_NSS_DRV_WIFI_MESH_ENABLE

Running ecm_dump.sh, I can see the wifi interface and clients ... so I would say it is there ...

I managed to build a working firmware for the ax3600 using the .config file provided by bitthief. I can also change options in menuconfig and it all works.

I also tried to build a version for the ax9000 by changing the target in menuconfig. The build succeeds, but wifi does not work on the ax9000. Evidently changing the target in menuconfig is not enough to adapt the config for other targets.

[   12.444193] ath11k c000000.wifi: ipq8074 hw2.0
[   12.444227] ath11k c000000.wifi: FW memory mode: 0
[   12.447800] remoteproc remoteproc0: powering up cd00000.q6v5_wcss
[   12.452480] remoteproc remoteproc0: Booting fw image IPQ8074/q6_fw.mdt, size 668
[   13.385340] remoteproc remoteproc0: remote processor cd00000.q6v5_wcss is now up
[   13.387027] ath11k c000000.wifi: qmi ignore invalid mem req type 3
[   13.387208] xt_DNETMAP: CONFIG_NF_NAT is not available in your kernel, hence this module cannot function.
[   13.392328] ath11k c000000.wifi: chip_id 0x0 chip_family 0x0 board_id 0xff soc_id 0xffffffff
[   13.407503] ath11k c000000.wifi: fw_version 0x250a04a5 fw_build_timestamp 2021-12-20 07:09 fw_build_id WLAN.HK.2.5.0.1-01208-QCAHKSWPL_SILICONZ-1
[   13.427484] kmodloader: 1 module could not be probed
[   13.429073] kmodloader: - xt_DNETMAP - 0
[   13.447831] ath11k c000000.wifi: failed to fetch board data for bus=ahb,qmi-chip-id=0,qmi-board-id=255,variant=Xiaomi-AX9000 from ath11k/IPQ8074/hw2.0/board-2.bin
[   13.447888] ath11k c000000.wifi: failed to fetch board data for bus=ahb,qmi-chip-id=0,qmi-board-id=255 from ath11k/IPQ8074/hw2.0/board-2.bin
[   13.461399] ath11k c000000.wifi: failed to fetch board.bin from IPQ8074/hw2.0
[   13.474086] ath11k c000000.wifi: qmi failed to fetch board file: -12
[   13.481063] ath11k c000000.wifi: failed to load board data file: -12
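
To see which board variant ath11k was looking for, the relevant lines can be filtered out of the kernel log. A minimal sketch: `board_variant_from_log` is a hypothetical helper, and the message format it matches is taken from the log above, so it may not match other ath11k versions.

```shell
#!/bin/sh
# board_variant_from_log: hypothetical helper; reads kernel log lines on
# stdin and prints the variant= value of each failed board-data lookup.
board_variant_from_log() {
    sed -n 's/.*failed to fetch board data for .*variant=\([^ ]*\) from .*/\1/p'
}
```

On the router this would be used as `dmesg | board_variant_from_log`; for the log above it prints `Xiaomi-AX9000`.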

Check firmware - ath10k overrides

You may need to select the correct target; it is probably still on the ax3600 board file.

Thanks. As I stated, I used "make menuconfig" and selected the ax9000 as the target. How do I verify/change the board file used?

From make menuconfig select "firmware", then select "ath10k Board-Specific Overrides".

Thanks. That made the needed change in .config file.

I was about to manually edit the .config. This caught my eye:

CONFIG_TARGET_ipq807x_generic_DEVICE_xiaomi_ax9000=y
CONFIG_TARGET_PROFILE="DEVICE_xiaomi_ax9000"
CONFIG_DEFAULT_ipq-wifi-xiaomi_ax9000=y
# CONFIG_PACKAGE_ipq-wifi-xiaomi_ax9000 is not set

After changing the override all looks good. Let's wait for the build to finish. What if I choose multiple targets, should I choose multiple overrides accordingly? Newbie learning :wink:

I enabled these in my build. Download performance seems to have improved, but I'm still seeing significant CPU usage when running a speedtest over wifi.
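
The .config excerpt above suggests what went wrong: the board-file package for the device is listed but not selected. A quick check before building, as a minimal sketch; `ipq_wifi_selected` is a hypothetical helper, and the `ipq-wifi-<device>` package naming is taken from the excerpt above.

```shell
#!/bin/sh
# ipq_wifi_selected CONFIG DEVICE: hypothetical helper; succeeds only if
# the ipq-wifi board-file package for DEVICE is built into the image (=y).
ipq_wifi_selected() {
    grep -q "^CONFIG_PACKAGE_ipq-wifi-$2=y\$" "$1"
}
```

For example, `ipq_wifi_selected .config xiaomi_ax9000 || echo "board file package not selected"` run from the build root.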

Hi everyone! I finished compiling bitthief's latest commit yesterday. There seems to be a conflict between nss-ecm and tproxy. When I turn on nss-ecm and tproxy at the same time, IPv6 traffic going through tproxy is redirected to the router itself: whenever I open a website that goes through tproxy, the luci login page is displayed instead. This did not happen in previously compiled versions. I spent a day looking for the cause but found no clue, and even recompiled from the git repository, but it didn't get better. For certain reasons, tproxy is needed in my country to open some well-known websites (such as Google). Can you help me analyze this, or give me some ideas?

mac80211 may need hw NSS wifi offload support added.
I found some commits: https://github.com/coolsnowwolf/lede/package/kernel/mac80211

I guess you are using openclash to proxy some traffic on LAN devices. I encountered the same problem, and then I switched to sing-box with homeproxy. Although it is relatively new and has many unsolved problems, it solves the problem you mentioned. I don't think this is an ecm problem. You can check your firewall rules.

@dimfish have you tried to build it again? With the config provided by bitthief it seems to work. It would be great if we had a "standard" build to use. Thanks!