nlbwmon hangs after a few days of use

That's way too low, and the R7800 has loads of memory. I use the following settings and have no issues across the multiple configs/places that run my builds:

In /etc/sysctl.conf:

net.core.rmem_max=3146752
net.core.wmem_max=3146752

In /etc/config/nlbwmon:
option netlink_buffer_size '2097152'
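A quick sanity check for the two settings above, as a minimal sketch. It assumes the size nlbwmon requests is subject to net.core.rmem_max in the same way as an ordinary SO_RCVBUF request; whether a given nlbwmon version handles the limit differently is not something I have verified. The function name is made up for illustration.

```shell
# Hypothetical helper: warn when the buffer nlbwmon asks for is larger
# than the kernel's per-socket receive limit.
# $1 = requested netlink_buffer_size, $2 = net.core.rmem_max (both in bytes)
check_netlink_buffer() {
    if [ "$1" -gt "$2" ]; then
        echo "too large: $1 > rmem_max $2"
        return 1
    fi
    echo "ok: $1 <= rmem_max $2"
}

# On the router this could be called as follows (assuming the anonymous
# 'nlbwmon' section in /etc/config/nlbwmon):
#   check_netlink_buffer \
#       "$(uci -q get nlbwmon.@nlbwmon[0].netlink_buffer_size)" \
#       "$(cat /proc/sys/net/core/rmem_max)"
```

With the values above (2097152 requested, 3146752 allowed) the check passes.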


About how many days (or GB of traffic) are we talking about here? Out of curiosity I have set up nlbwmon in a "generic" custom-built image, 21.02.3, running on a low-resource device (64MB RAM), and it has been up for 3 days already without issues. I cannot see any significant loss of free RAM.

Thanks, I have put:

net.core.rmem_default=1048576
net.core.wmem_default=1048576
net.core.rmem_max=1048576
net.core.wmem_max=1048576

and

option netlink_buffer_size '1048576'

into the respective files. Not a single error in syslog for 12h. I will monitor how it goes.

Restarting the service solves the issue until it gets stuck again. Every time I change the config files I reboot the router.

As far as I know, the R7800 and the WRT3200ACM both have 512MB of RAM. I will keep increasing those values gradually until the problem is resolved.

To be honest, I did not take notes on that. From now on, after any change to the above parameters I will delete the DB, so I will know how much traffic was recorded before the service gets stuck, and I will try to measure the time between errors in syslog.
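For tracking the errors, something along these lines might help. It is only a sketch: the function name is made up, and the pattern assumes the messages look like the usual "Out of memory" / "No buffer space available" lines that nlbwmon writes to syslog.

```shell
# Hypothetical helper: read syslog lines on stdin and print how many
# nlbwmon buffer errors they contain.
count_nlbwmon_errors() {
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    grep -c -E 'nlbwmon\[[0-9]+\]: .*(Out of memory|No buffer space available)' || true
}

# On the router:
#   logread | count_nlbwmon_errors
```

Running it before and after a parameter change gives a rough idea of whether the errors have stopped; the timestamps of the matching lines show how far apart they are.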

Still hangs after increasing the kernel parameters?

Here is what I have:
Fam   Host ( MAC )              Layer7  Conn.     > Downld. ( > Pkts. )  Upload ( Pkts. )
IPv4  192.168.8.221 (xx:yy:zz)  HTTPS   285.53 K  13.08 GB ( 11.05 M)    710.95 MB ( 6.88 M)
It has been running for about a week now with no hanging and no OOMs. This is a "standard" custom build, 21.02.3, on a small ATH79-based router with 64MB RAM. The only special thing: no firewall and no opkg support; I use a few simple iptables rules instead.

FWIW, sysctl supports subdirectories on OpenWrt as well, so you can drop it clean in there:

# cat /etc/sysctl.d/99-nlbwmon.conf 
net.core.wmem_max=67108864
net.core.rmem_max=67108864

Keeping custom settings in separate files makes it easy to see later why you added them.

You probably need to add it to /etc/sysupgrade.conf though, so it won't get wiped during the next upgrade?

Yeah. Or include it in env/files/ if you roll your own.
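For the sysupgrade.conf route, a small sketch that adds the path only once. The function name is hypothetical; the only assumption about the format is that /etc/sysupgrade.conf lists one path per line.

```shell
# Append a path to a keep-list unless it is already listed.
# $1 = path to preserve, $2 = keep-list (defaults to /etc/sysupgrade.conf)
keep_on_sysupgrade() {
    list="${2:-/etc/sysupgrade.conf}"
    # -x: match the whole line, -F: treat the path as a fixed string
    grep -qxF "$1" "$list" 2>/dev/null || echo "$1" >> "$list"
}

# On the router:
#   keep_on_sysupgrade /etc/sysctl.d/99-nlbwmon.conf
```

Calling it twice leaves a single entry, so it is safe to run from a setup script.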


You can also add a comment in /etc/sysctl.conf:

# cat /etc/sysctl.conf
# Defaults are configured in /etc/sysctl.d/* and can be customized in this file

# nlbwmon
net.core.rmem_max = 3146752
net.core.wmem_max = 3146752

I wonder if we could simply bump those values from the nlbwmon init script, and whether that would have implications elsewhere in the system.
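To make the idea concrete, here is a rough sketch of what such a bump could look like. This is not what the shipped init script does; the helper name and the 1048576 figure are only for illustration. It raises the limit when it is lower and never lowers a value someone has already increased.

```shell
# Hypothetical helper, not part of the shipped nlbwmon init script:
# raise a numeric sysctl file to at least $2, never lowering it.
# $1 = file under /proc/sys, $2 = minimum value in bytes
ensure_min() {
    cur=$(cat "$1")
    if [ "$cur" -lt "$2" ]; then
        echo "$2" > "$1"
    fi
}

# For example, before the daemon starts:
#   ensure_min /proc/sys/net/core/rmem_max 1048576
```

Since net.core.rmem_max is system-wide, every other socket on the router would be allowed to request the larger buffer too, which is exactly the open question about side effects.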


After increasing the parameters to 1MB, nlbwmon did not get stuck for almost 3 days of continuous running.


What would be the advantage of changing those parameters to such high values? I obviously still don't get what those parameters do, even after reading about them.

On a router it is rather unlikely that slightly increasing this value causes any harm, since a router usually does not run a lot of applications that open multiple sockets to communicate with other services. The *_max values are the upper limit for a single socket's buffer; the *_default values are what a socket gets if the application does not specify a size for its buffer via socket options.

It would be different if you ran, for example, a torrent client on the router, which opens hundreds of connections without specifying any buffer sizes for receive/send. Then you risk running out of memory quickly if the app did not specify any buffer sizes. I'm not sure if unbound sets socket buffers, but if you don't use the socket reuse option on unbound it could cause an issue, as unbound can open a lot of connections for DNS.

Thus, in my opinion, unless something stupid is done on the router it should not be an issue. And the current values would already cause issues if you had apps running that open a few hundred sockets without properly setting receive/send buffers.
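To put a number on the "hundreds of sockets" case, a back-of-the-envelope sketch. The function name and the 300-socket figure are made up for illustration, and it is a worst case: buffers only hold memory while data is actually queued, so real usage is normally far lower.

```shell
# Rough upper bound in MiB if every socket filled both buffers completely.
# $1 = number of sockets, $2 = receive buffer bytes, $3 = send buffer bytes
worst_case_mib() {
    echo $(( $1 * ($2 + $3) / 1048576 ))
}

# 300 sockets with 1 MiB receive and 1 MiB send buffers each
worst_case_mib 300 1048576 1048576   # prints 600
```

That is more than the 512MB these routers have, which is why it only matters for applications that open many sockets and rely on large defaults.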

Thanks a lot for the explanation, which I actually understood. This makes much more sense to me now.

I have managed to run my router for 7 days without a restart and I can confirm that

net.core.rmem_default=1048576
net.core.wmem_default=1048576
net.core.rmem_max=1048576
net.core.wmem_max=1048576

works for me. No more errors in syslog and nlbwmon is still counting.

nlbwmon stopped counting again after 6 days of uptime. I am increasing all values to 8388608.

I also have this problem. I had bumped net.core.rmem_max to 524288 but still got:

Tue Nov 29 12:42:27 2022 daemon.err nlbwmon[3573]: Netlink receive failure: Out of memory
Tue Nov 29 12:42:27 2022 daemon.err nlbwmon[3573]: Unable to dump conntrack: No buffer space available

I've now increased it to 1048576 and will monitor.