Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

@ACwifidude You may want to consider adding this patch. One of the patch files fixes ECM's bridge hairpin mode:

Seems like it's already included but under a different name and slightly larger.

However I wonder what sets br_is_hairpin_enabled?

You can set a bridge port's hairpin mode from an SSH console. The default is off. For example, for the bridge br-lan with eth1.1 as one of its bridge ports, the hairpin_mode sysfs entry can be found here:

/sys/class/net/br-lan/brif/eth1.1

You can also set it using the ip command. ip -d link show eth1.1 should show all the configured parameters. IIRC, the ip utility changes the same setting that the /sys/class/net/ sysfs entry exposes.
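
For anyone following along, here is a minimal sketch of the sysfs route described above. The helper names and the SYSFS_NET override are made up for illustration (the override only exists so the helpers can be pointed at a scratch directory); on the router the path defaults to /sys/class/net.

```shell
#!/bin/sh
# Sketch only: read or change a bridge port's hairpin flag through sysfs.
# SYSFS_NET is a hypothetical override; unset, it falls back to /sys/class/net.

hairpin_file() {
    # $1 = bridge (e.g. br-lan), $2 = bridge port (e.g. eth1.1)
    printf '%s/%s/brif/%s/hairpin_mode\n' "${SYSFS_NET:-/sys/class/net}" "$1" "$2"
}

hairpin_get() {
    cat "$(hairpin_file "$1" "$2")"
}

hairpin_set() {
    # $3 must be 0 (off) or 1 (on); anything else is rejected
    case "$3" in
        0|1) echo "$3" > "$(hairpin_file "$1" "$2")" ;;
        *)   echo "hairpin_set: value must be 0 or 1" >&2; return 1 ;;
    esac
}

# Example on the router (as root):
#   hairpin_get br-lan eth1.1
#   hairpin_set br-lan eth1.1 1
# iproute2 equivalent, assuming the bridge utility is installed:
#   bridge link set dev eth1.1 hairpin on
```

The setting does not persist across reboots, so it would have to be reapplied from something like /etc/rc.local.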

I have the same problem as you; could you explain the steps to me? I'm currently at 12 days of uptime without promiscuous mode.

We're onto something, hairpin always worked from wlan. So I checked:

# cat /sys/class/net/br-lan/brif/wlan0/hairpin_mode
1

Hairpin is enabled for wlan by default...

But it isn't enabled for eth1.1, and setting it to 1 didn't help, unless something needs to be reloaded?

# cat /sys/class/net/br-lan/brif/eth1.1/hairpin_mode
0

Nothing in /etc seems to set hairpin_mode to 1, so something must set it by default for wlan but not for eth1.1.
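
To compare all ports at once rather than one cat at a time, something like the following should do. It is just a sketch: list_hairpin and the SYSFS_NET override are hypothetical names, not part of any OpenWrt tooling.

```shell
#!/bin/sh
# Sketch only: print "<port> <hairpin_mode>" for every port of a bridge.
# SYSFS_NET is a hypothetical override; unset, it falls back to /sys/class/net.

list_hairpin() {
    # $1 = bridge name, e.g. br-lan
    base="${SYSFS_NET:-/sys/class/net}/$1/brif"
    for f in "$base"/*/hairpin_mode; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        port=${f%/hairpin_mode}          # strip the file name
        port=${port##*/}                 # keep only the port directory name
        printf '%s %s\n' "$port" "$(cat "$f")"
    done
}

# Example on the router:
#   list_hairpin br-lan
# To look for anything in /etc that touches the flag:
#   grep -r hairpin /etc
```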

Hairpin needs to be on for wlan interfaces, as Wi-Fi clients typically need to communicate among themselves. IIRC, the drivers set it automatically.

Edit: From your posts it looks like you want to hairpin your WAN interface instead, as shown below?

client 1 -> eth1.1/wlan -> bridge -> wan (hairpin) -> bridge -> eth1.1/wlan -> client 1

client 1 -> eth1.1 -> bridge -> wan (hairpin) -> bridge -> eth1.1 -> client 1

Indeed, and the above works fine in non-NSS builds.
However, setting br-lan to promisc seems to fix it for NSS, but it also seems to cause problems with reboots, as @xeonpj mentioned.

EDIT:
I edited the above a bit: coming from a client through wlan does work, but coming from a client through eth1.1, as my edit shows, does not...

Did you UCI update the packages? Or did you rebuild fresh?

I rebased and built fresh.

@ACwifidude I've cloned your repo and checked out the openwrt-21.02-nss-qsdk11.0 branch. Then I made some changes to the diffconfig: I added and removed some packages, but that's it. I then ran the following, as you described in your 2nd post:

./scripts/feeds update -a && ./scripts/feeds install -a && cp diffconfig .config && make defconfig && ./scripts/getver.sh

When running make -j1 V=s on my low-end NAS I get compile errors:

make[7]: Entering directory '/home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0/find'
gcc -DHAVE_CONFIG_H -I. -I..  -I../gl/lib -I../lib -I../gl/lib -DLOCALEDIR=\"/home/dude/openwrt/openwrt/staging_dir/host/share/locale\" -I/home/dude/openwrt/openwrt/staging_dir/host/include   -O2 -I/home/dude/openwrt/openwrt/staging_dir/host/include  -MT ftsfind.o -MD -MP -MF .deps/ftsfind.Tpo -c -o ftsfind.o ftsfind.c
mv -f .deps/ftsfind.Tpo .deps/ftsfind.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I../gl/lib -I../lib -I../gl/lib -DLOCALEDIR=\"/home/dude/openwrt/openwrt/staging_dir/host/share/locale\" -I/home/dude/openwrt/openwrt/staging_dir/host/include   -O2 -I/home/dude/openwrt/openwrt/staging_dir/host/include  -MT finddata.o -MD -MP -MF .deps/finddata.Tpo -c -o finddata.o finddata.c
mv -f .deps/finddata.Tpo .deps/finddata.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I../gl/lib -I../lib -I../gl/lib -DLOCALEDIR=\"/home/dude/openwrt/openwrt/staging_dir/host/share/locale\" -I/home/dude/openwrt/openwrt/staging_dir/host/include   -O2 -I/home/dude/openwrt/openwrt/staging_dir/host/include  -MT fstype.o -MD -MP -MF .deps/fstype.Tpo -c -o fstype.o fstype.c
mv -f .deps/fstype.Tpo .deps/fstype.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I../gl/lib -I../lib -I../gl/lib -DLOCALEDIR=\"/home/dude/openwrt/openwrt/staging_dir/host/share/locale\" -I/home/dude/openwrt/openwrt/staging_dir/host/include   -O2 -I/home/dude/openwrt/openwrt/staging_dir/host/include  -MT parser.o -MD -MP -MF .deps/parser.Tpo -c -o parser.o parser.c
parser.c: In function 'parse_user':
parser.c:75:20: error: expected expression before ')' token
   75 | # define endpwent ()
      |                    ^
parser.c:2463:7: note: in expansion of macro 'endpwent'
 2463 |       endpwent ();
      |       ^~~~~~~~
make[7]: *** [Makefile:1945: parser.o] Error 1
make[7]: Leaving directory '/home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0/find'
make[6]: *** [Makefile:2008: all-recursive] Error 1
make[6]: Leaving directory '/home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0/find'
make[5]: *** [Makefile:2109: all-recursive] Error 1
make[5]: Leaving directory '/home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0'
make[4]: *** [Makefile:2048: all] Error 2
make[4]: Leaving directory '/home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0'
make[3]: *** [Makefile:28: /home/dude/openwrt/openwrt/build_dir/host/findutils-4.7.0/.built] Error 2
make[3]: Leaving directory '/home/dude/openwrt/openwrt/tools/findutils'
time: tools/findutils/compile#34.43#5.03#40.68
    ERROR: tools/findutils failed to build.
make[2]: *** [tools/Makefile:159: tools/findutils/compile] Error 1
make[2]: Leaving directory '/home/dude/openwrt/openwrt'
make[1]: *** [tools/Makefile:155: /home/dude/openwrt/openwrt/staging_dir/host/stamp/.tools_compile_yyyyyynnyyynyyyyyynyynnyyyynyyyyyyyyyyyyyyyynynnyyyyyyy] Error 2
make[1]: Leaving directory '/home/dude/openwrt/openwrt'
make: *** [/home/dude/openwrt/openwrt/include/toplevel.mk:230: world] Error 2

Do you have any idea what I can do? I didn't rebase anything BTW.
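
With V=s output this long, it can help to save the log and pull out just the failure lines. A rough sketch, assuming the build output was saved to a file (build.log is only an example name, and the two helper functions are made up for illustration):

```shell
#!/bin/sh
# Sketch only: find the interesting lines in a saved OpenWrt build log.

first_compiler_error() {
    # $1 = log file; prints the first "error:" line with its line number
    grep -n 'error:' "$1" | head -n 1
}

build_errors() {
    # $1 = log file; prints compiler errors, make failures and the
    # "failed to build" marker printed by the build system
    grep -n -E 'error:|\*\*\* \[.*\] Error [0-9]+|failed to build' "$1"
}

# Example:
#   make -j1 V=s 2>&1 | tee build.log
#   first_compiler_error build.log
# To retry only the failing host tool after updating the tree:
#   make tools/findutils/clean
#   make tools/findutils/compile V=s
```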

My router is still up and running today without any reboot...
You can try setting this in /etc/rc.local and see if it helps; either I'm just lucky right now or it actually does work. There's no need to try it now unless you want to. If you do, just run the same commands in a shell; you don't have to reboot. But no guarantees! :slight_smile:

# use ondemand and lock the frequencies so it never goes below 800 MHz
echo ondemand > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo ondemand > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
echo 1725000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 1725000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_min_freq

# set br-lan to promisc mode on boot
ifconfig br-lan promisc
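
The same cpufreq settings can be written as a loop over the policy directories, plus a small check that they stuck. This is only a sketch: set_cpufreq, show_cpufreq and the CPUFREQ_ROOT override are hypothetical names, and the override is there purely so the functions can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: apply one governor and min/max frequency to every cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override; unset, the real sysfs path is used.

set_cpufreq() {
    # $1 = governor, $2 = min frequency in kHz, $3 = max frequency in kHz
    root="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        [ -d "$p" ] || continue
        echo "$1" > "$p/scaling_governor"
        echo "$3" > "$p/scaling_max_freq"   # max first, as in the commands above
        echo "$2" > "$p/scaling_min_freq"
    done
}

show_cpufreq() {
    # prints "<policy> <governor> <min> <max>" per policy
    root="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        [ -d "$p" ] || continue
        printf '%s %s %s %s\n' "${p##*/}" \
            "$(cat "$p/scaling_governor")" \
            "$(cat "$p/scaling_min_freq")" \
            "$(cat "$p/scaling_max_freq")"
    done
}

# Example on the router (as root):
#   set_cpufreq ondemand 800000 1725000
#   show_cpufreq
```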

My router has already restarted, after not even 2 days. With your startup configuration, br-lan promisc is enabled and it reboots.

I’d rerun the update and install commands. It looks like there is a typo in the code for that package that needs to be fixed. Usually these bugs (especially if they cause the package to fail to compile) are fixed by the package developer within hours to days.

Other updates:

Rebased master with all the latest firewall 4 updates. Should fix a couple firewall bugs. Fixed the mac80211 patch to be compatible with the latest master additions.

For anyone having issues with nlbwmon (this is an OpenWrt-wide thing, not an NSS-specific issue):

The build has double the buffer size. Make sure you set your buffer size in /etc/config/nlbwmon to:


option netlink_buffer_size 1048576

If that is not enough for your needs, @KONG is using a larger amount that is working for him:


sysctl -w net.core.rmem_max=1573376
sysctl -w net.core.wmem_max=1573376

And don't forget to change /etc/config/nlbwmon:


option netlink_buffer_size 1573376
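
A quick sanity check for the pairing above, as a sketch. It assumes the kernel caps a socket's receive buffer at net.core.rmem_max, which would be why the sysctl and the nlbwmon option are raised together. The function name is made up, and the uci section index in the comments is an assumption about a default /etc/config/nlbwmon.

```shell
#!/bin/sh
# Sketch only: check that a wanted netlink_buffer_size fits under rmem_max.

nlbwmon_buffer_ok() {
    # $1 = wanted netlink_buffer_size in bytes
    # $2 = rmem_max to compare against (optional; defaults to the live sysctl)
    want=$1
    have=${2:-$(sysctl -n net.core.rmem_max)}
    if [ "$want" -le "$have" ]; then
        echo "ok: $want <= rmem_max $have"
    else
        echo "raise net.core.rmem_max to at least $want (currently $have)"
        return 1
    fi
}

# Example on the router:
#   nlbwmon_buffer_ok 1573376
# Setting the option with uci instead of editing the file, assuming the
# config has a single "nlbwmon" section:
#   uci set nlbwmon.@nlbwmon[0].netlink_buffer_size='1573376'
#   uci commit nlbwmon
#   /etc/init.d/nlbwmon restart
```

Note that sysctl -w does not survive a reboot, so the two rmem_max/wmem_max values would also need to go into /etc/sysctl.conf or similar.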

Could you build an image for the G10 during the next compilation?

I updated to it immediately after you posted. Firewall4 is definitely running better. THANK YOU as always!!! @ACwifidude

Hi, back again. I have finally got my 900/110 line installed.

I'm seeing some strange behaviour on my network at the moment. I decided to invest in 2 more R7800s, so I now have 3 in total.

I am connected to the ONT via PPPoE, with the 2 additional R7800s connected as DHCP clients.

One is giving me excellent speeds and one is giving me 80/20. The thing is, they are not showing in the client list on my main router, which is strange, as they were when I was connected to my 70/16 line. All I have done is change the PPPoE credentials.

Obviously I have rebooted all 3 routers and checked all settings, but nothing has changed. This has literally been installed for a couple of hours, so I still have much testing to do.

Test with QOS:

Test without QOS:

Oh I am on the stable build btw

I'm sorry to hear that. Yes, my router rebooted today too after I upgraded the firmware. I have no idea why it stayed alive for 4 days before...

It's really demotivating, but hey, it works really well without that enabled. So thanks for that, and thanks to you for trying to fix promisc; in the end, you'll get it. Greetings and encouragement.

Well, I noticed I needed to set /sys/class/net/br-lan/brif/eth1.1/hairpin_mode to 1 for another issue... that for sure makes the router reboot if br-lan is set to promisc mode.
So if anyone wants to debug, go ahead :slight_smile:

Having pstore active doesn't help in this case, /sys/fs/pstore/ is empty. :frowning:
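
For checking after each unexpected reboot, a small sketch that lists whatever pstore kept, or says so when the directory is empty. pstore_records is a made-up helper name; the directory defaults to the /sys/fs/pstore path mentioned above.

```shell
#!/bin/sh
# Sketch only: list pstore records, or report that there are none.

pstore_records() {
    # $1 = pstore directory (optional; defaults to /sys/fs/pstore)
    dir="${1:-/sys/fs/pstore}"
    count=0
    for f in "$dir"/*; do
        [ -e "$f" ] || continue      # glob matched nothing: directory is empty
        count=$((count + 1))
        echo "${f##*/}"
    done
    if [ "$count" -eq 0 ]; then
        echo "no pstore records in $dir" >&2
        return 1
    fi
}

# Example on the router, e.g. from /etc/rc.local, copying records somewhere
# that persists (the destination is only an example):
#   pstore_records && cp /sys/fs/pstore/* /root/
```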

Did you enable pstore for console messages? I had a random reboot earlier this week and didn't get any panic/oops logged, but it did capture the log below, which I think indicates the NSS firmware crashed and it decided to reboot the device for me...

That one was caused by me setting scaling_min_freq to the minimum permitted value, with the reboot occurring ~2 days later. Since capping the min freq at 600000, the reboots are normally caused by me updating firmware rather than being a "normal" occurrence, but obviously everyone has different workloads and clients; a lot of mine are connected via dumb APs rather than to the main router...

Summary
[334464.821769] NSS core 0 signal COREDUMP COMPLETE 4000
[334464.821858]
[334464.821858] 69d42857: Starting NSS-FW logbuffer dump for core 0
[334464.825865] 69d42857: Warn: trap[813]: Trap on CHIP ID 00050000
[334464.833417] 69d42857: Warn: trap[620]: Trapped: TRAP_TD(00000004) DCAPT(3C000080)
[334464.839504] 69d42857: Warn: trap[645]: Trapped: Thread: 2, reason: 00000020, PC: 4002F30C, previous PC: 4002F308
[334464.846944] 69d42857: Warn: trap[594]: A0_3: 588B9B50 402301C0 3F01ABF8 588B9B52
[334464.857316] 69d42857: Warn: trap[594]: A4_7: 588B9B52 40052304 3F01ABF8 3F00AEF0
[334464.864719] 69d42857: Warn: trap[599]: D0_3: 00000026 00000009 00000001 588B9B40
[334464.872185] 69d42857: Warn: trap[599]: D4_7: 00060000 00000026 000003F4 00000009
[334464.879575] 69d42857: Warn: trap[599]: D8_11: 406AAE20 4C5220BC 55A001D0 00000000
[334464.887118] 69d42857: Warn: trap[599]: D12_15: 00000000 00000000 00D84001 00005805
[334464.894583] 69d42857: Warn: trap[649]: Thread_2 has non-recoverable trap
[334464.906955] NSS core 1 signal COREDUMP COMPLETE 4000
[334464.909015]
[334464.909015] f517ac2e: Starting NSS-FW logbuffer dump for core 1
[334464.914175] Kernel panic - not syncing: NSS FW coredump: bringing system down
[334464.921507] CPU1: stopping
[334464.928703] CPU: 1 PID: 1045 Comm: logd Not tainted 5.10.100 #0
[334464.931384] Hardware name: Generic DT based system
[334464.937666] [<c030ebb8>] (unwind_backtrace) from [<c030a820>] (show_stack+0x14/0x20)
[334464.942266] [<c030a820>] (show_stack) from [<c0679638>] (dump_stack+0x94/0xa8)
[334464.950246] [<c0679638>] (dump_stack) from [<c030d7b0>] (do_handle_IPI+0x140/0x184)
[334464.957359] [<c030d7b0>] (do_handle_IPI) from [<c030d810>] (ipi_handler+0x1c/0x2c)
[334464.965344] [<c030d810>] (ipi_handler) from [<c0373c28>] (__handle_domain_irq+0x90/0xf4)
[334464.972728] [<c0373c28>] (__handle_domain_irq) from [<c0694c70>] (gic_handle_irq+0x90/0xb8)
[334464.981061] [<c0694c70>] (gic_handle_irq) from [<c0300e90>] (__irq_usr+0x50/0x80)
[334464.989549] Exception stack(0xc61bffb0 to 0xc61bfff8)
[334464.996939] ffa0:                                     004b3130 b6e46b20 00000098 00000098
[334465.002080] ffc0: b6efd0b3 b6e075d0 b6e0762c 00000039 004b31ec 00000000 b6ce9080 b6e075d0
[334465.010317] ffe0: 004b2f1c befdcd88 004a1c20 b6e602bc 60000010 ffffffff
[334465.242752] Rebooting in 3 seconds..
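
To tell this kind of reboot apart from others, the marker lines seen in the dump above can be filtered out of a captured log. A sketch only: the function name is made up, and the patterns are taken from this one log, so other firmware versions may word things differently.

```shell
#!/bin/sh
# Sketch only: pick the NSS firmware crash markers out of a kernel log.

nss_crash_summary() {
    # Reads the file(s) given as arguments, or stdin when none are given.
    grep -E 'NSS core [0-9]+ signal COREDUMP|non-recoverable trap|Kernel panic' "$@"
}

# Example:
#   dmesg | nss_crash_summary
#   nss_crash_summary /sys/fs/pstore/*      # after a reboot, if pstore kept anything
```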