Tue Feb 1 22:05:16 2022 daemon.err nlbwmon[9914]: The netlink receive buffer size of 2097152 bytes will be capped to 180224 bytes
Tue Feb 1 22:05:16 2022 daemon.err nlbwmon[9914]: by the kernel. The net.core.rmem_max sysctl limit needs to be raised to
Tue Feb 1 22:05:16 2022 daemon.err nlbwmon[9914]: at least 2097152 in order to sucessfully set the desired receive buffer size!
Yes, it is capped. I will change rmem_max to 524288 and set nlbwmon back to its default.
Indeed, there is no point in setting it to 5xxxx or higher if the kernel doesn't allow it.
I know that Kong's build had 1048576 in the nlbwmon config, but I don't know if the Kong kernel allows that.
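For anyone who wants to check this on their own box, here is a minimal sketch of the capping described in the log. It assumes a simplified model where the kernel grants the smaller of the requested size and net.core.rmem_max; the helper name effective_rcvbuf is made up for illustration and is not part of nlbwmon.

```shell
#!/bin/sh
# Simplified model (assumption): granted size = min(requested, rmem_max).
# effective_rcvbuf is a hypothetical helper, not an nlbwmon function.
effective_rcvbuf() {
    requested=$1
    limit=$2
    if [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}

# Numbers from the log above: 2097152 requested, 180224 allowed.
effective_rcvbuf 2097152 180224

# Current limit on this machine (0 if the file cannot be read).
rmem_max=$(cat /proc/sys/net/core/rmem_max 2>/dev/null || echo 0)
echo "net.core.rmem_max is currently $rmem_max"

# To raise the limit (run as root; not executed by this sketch):
#   sysctl -w net.core.rmem_max=524288
# To keep it across reboots, add this line to /etc/sysctl.conf:
#   net.core.rmem_max=524288
```

If the value printed for rmem_max is lower than the buffer size nlbwmon asks for, the warning from the log should be expected again after a restart.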
It has everything I need, but I think it's a bit overkill for me. I might give it a try if SQM keeps giving me trouble, and I will post my experiences when I do.
I enabled SQM again on the 18683 to see if it's stable...
Another crash / reboot with the 18683 firewall build and SQM (cake/layer_cake) enabled. I do think it logged the crash at /sys/fs/pstore/dmesg-ramoops-1:
For me it's hard to read what the cause is, as I am not deep enough into the OpenWrt system, but it might be readable to some of you?
Maybe useful information, as I run quite a few services behind the router: cat /proc/sys/net/netfilter/nf_conntrack_count reports 6271
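The count on its own says little without the table size, so here is a small sketch that puts it next to nf_conntrack_max. The helper name conntrack_pct is made up for illustration, and the 16384 in the usage note is only an example value, not the limit from this router.

```shell
#!/bin/sh
# Percentage of the conntrack table in use (integer, rounded down).
# conntrack_pct is a hypothetical helper name.
conntrack_pct() {
    count=$1
    max=$2
    if [ "$max" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( count * 100 / max ))
}

count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    count=$(cat "$count_file")
    max=$(cat "$max_file")
    echo "conntrack: $count of $max entries ($(conntrack_pct "$count" "$max")%)"
else
    echo "conntrack counters not available on this system"
fi
```

As an example, conntrack_pct 6271 16384 prints 38, so with a table of that (assumed) size the 6271 connections would be well below the limit.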
From the panic log, CPU0's LR is at __krait_mux_set_sel. This looks very similar to the panic I encountered recently. I posted the issue here:
It seems that on 5.4 and 5.10 the ipq806x CPU does not like changing CPU frequency. Maybe this issue has been there for the ipq806x all along and is just more pronounced with more recent kernels.
Things I notice: the web interface feels a lot snappier, but as a downside the temperatures went up a few degrees in all zones (which makes sense, of course).
Edit: I checked my CPU load when using the full bandwidth and decided to decrease the clock speed (and temperature) a bit by setting the core frequency to 1.4 GHz instead of 1.725 GHz.
I can confirm in the graphs that the clock speed is now fixed at 1.4 GHz.
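For reference, a fixed clock like this is usually done through the cpufreq sysfs files. The sketch below is a dry run that only prints the writes, assuming the standard policy directories under /sys/devices/system/cpu/cpufreq; the helper name freq_cmds is made up, and 1400000 (kHz, so 1.4 GHz) is only valid if it shows up in scaling_available_frequencies on your device.

```shell
#!/bin/sh
# Print (not run) the sysfs writes that pin one cpufreq policy to a range.
# Values are in kHz, so 1.4 GHz is 1400000.
# freq_cmds is a hypothetical helper name.
freq_cmds() {
    policy_dir=$1
    min_khz=$2
    max_khz=$3
    # Depending on the kernel and the current limits, the order of
    # these two writes may need to be swapped.
    echo "echo $max_khz > $policy_dir/scaling_max_freq"
    echo "echo $min_khz > $policy_dir/scaling_min_freq"
}

# Dry run over every policy the kernel exposes (none on some systems).
for p in /sys/devices/system/cpu/cpufreq/policy*; do
    [ -d "$p" ] || continue
    freq_cmds "$p" 1400000 1400000
done
```

Setting min and max to the same value pins the clock; passing a lower min (for example 800000) and leaving max alone only raises the floor. The printed lines can be pasted into a root shell once they look right.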
It also makes sense to do some tests: packet steering enabled, software offloading & IRQ balance disabled:
100/100 loaded connection core 0 load with SQM disabled: 40%
100/100 loaded connection core 0 load with SQM layer_cake enabled: 81%
100/100 loaded connection core 0 load with SQM piece_of_cake enabled: 84%
Enabling software flow offloading in the UI and IRQ balance in /etc/config/irqbalance didn't change the results, probably because SQM is enabled?
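If someone wants to repeat the core 0 load numbers above without a graphing tool, a rough way is to sample /proc/stat twice and compare. This is only a sketch of one way to do it, not necessarily how the figures above were measured; the helper names stat_totals and cpu_busy_pct are made up.

```shell
#!/bin/sh
# Print "total idle" jiffies for one "cpuN ..." line of /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal
stat_totals() {
    set -- $1                   # intentional word splitting
    shift                       # drop the "cpuN" label
    total=$(( $1 + $2 + $3 + $4 + ${5:-0} + ${6:-0} + ${7:-0} + ${8:-0} ))
    idle=$(( $4 + ${5:-0} ))    # idle + iowait count as not busy
    echo "$total $idle"
}

# Busy percentage (integer) between two samples of the same cpuN line.
cpu_busy_pct() {
    set -- $(stat_totals "$1") $(stat_totals "$2")
    d_total=$(( $3 - $1 ))
    d_idle=$(( $4 - $2 ))
    if [ "$d_total" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( (d_total - d_idle) * 100 / d_total ))
}

# Sample core 0 over one second, if /proc/stat is available.
before=$(grep '^cpu0 ' /proc/stat 2>/dev/null)
sleep 1
after=$(grep '^cpu0 ' /proc/stat 2>/dev/null)
if [ -n "$before" ] && [ -n "$after" ]; then
    echo "core 0 busy: $(cpu_busy_pct "$before" "$after")%"
fi
```

Softirq time is counted as busy here, which matters on a router because most packet processing (including SQM) shows up there rather than as user or system time.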
I have been using these in my local start-up tab for over a year, and haven't had a crash/reboot since. The main culprit for me was the frequency switching to lowest freq. All of the values were gathered from around this forum.
See! There is definitely some problem with the scaling, and it always crashes right after the mux code (which I'm 100% sure is called by the krait notifier for the safe parent) with a random error, like this one for the virtual page fault...
Forcing the system to max freq should remove any instability caused by this problem; the system would then crash only because of a defect in the chip or a bad power supply.
The fact is that the regulators are not meant to work at max voltage 100% of the time, and in the long run this causes crashes due to overheating or power supply spikes (or even grid problems).
Back in the old 5.4 days all worked well because we didn't have a CPU freq driver, and the CPU and cache were set to the lower value of 800 MHz, so the CPU wasn't that sensitive to voltage changes or to problems from the regulator overheating...
This would also explain why the system is more stable with the NSS core: the CPU is used less, so there are fewer CPU freq changes and less load on the regulators...
Yes that makes a lot of sense.
I think a higher minimum scaling freq might help, or a fixed one at 1.4 GHz if you don't have extreme bandwidth and want to use SQM. I am going to test both for a few days.
If you do have a high-bandwidth connection, you probably want NSS, or set the core fixed at 1.7 GHz and hope for the best.
not to be that guy, but we've been on 5.4 (and then 5.10) for well over a year now? knock on wood, i haven't experienced this instability we're talking about. my current uptime is 93 days.
i also pin the min frequency to 800 MHz. i seem to recall there was a comment in the qsdk code (or in the patch series at some point) that talked about there being a bug with scaling below 800 MHz.