nlbwmon hangs after a few days of use

Hi all,

I have installed the nlbwmon app using opkg install nlbwmon luci-app-nlbwmon.

The problem is that it works for a few days and then stops collecting data. The administration panel still shows data, but only up to the point where it got stuck. Restarting the service helps: the app then updates the accounting period until it hangs again. How can I start troubleshooting this to see what causes it?

Anything showing in syslog?

How do you view syslog?

logread -e nlbwmon is best, but you can also view it in LuCI under Status > System Log.
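
To narrow the log down further, here is a minimal sketch that keeps only the nlbwmon lines logged at daemon.err level. The function names are made up for illustration, and it assumes the log line layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical helpers: both read syslog text on stdin.

# Keep only nlbwmon lines logged at daemon.err level.
nlbwmon_errors() {
    grep 'daemon\.err nlbwmon'
}

# Print how many such lines there are.
count_nlbwmon_errors() {
    grep -c 'daemon\.err nlbwmon'
}

# On the router (not run here):
#   logread -e nlbwmon | nlbwmon_errors
#   logread -e nlbwmon | count_nlbwmon_errors
```

Reloads caused by ifup events are logged as user.notice, so they are filtered out.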

Thank you for the hint.

I get this:

Tue Oct  4 10:14:17 2022 daemon.err nlbwmon[2586]: The netlink receive buffer size of 524288 bytes will be capped to 180224 bytes
Tue Oct  4 10:14:17 2022 daemon.err nlbwmon[2586]: by the kernel. The net.core.rmem_max sysctl limit needs to be raised to
Tue Oct  4 10:14:17 2022 daemon.err nlbwmon[2586]: at least 524288 in order to sucessfully set the desired receive buffer size!
Tue Oct  4 10:14:17 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of lan (br-lan)
Tue Oct  4 10:14:18 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of loopback (lo)
Tue Oct  4 10:14:21 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of wan (wan)

Do you think those daemon.err events might be related to the nlbwmon hangs?

Not sure but you should add this to /etc/rc.local in any case:

sysctl -w net.core.rmem_max=524288

Will add that for sure.

Could you please explain why I should add this line? I would like to understand the situation.

nlbwmon wants more than the default, you would have to ask @jow who wrote it!

I think I asked the wrong question.

Could you please explain what this command actually does?

Google net.core.rmem_max or read https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/tuning_and_optimizing_red_hat_enterprise_linux_for_oracle_9i_and_10g_databases/sect-oracle_9i_and_10g_tuning_guide-adjusting_network_settings-changing_network_kernel_settings

sysctl shows and sets kernel parameters. sysctl -w net.core.rmem_max=524288 raises the kernel's maximum socket receive buffer size, which is currently 180224 bytes, to the 524288 bytes that nlbwmon requests. You should also add sysctl -w net.core.rmem_default=524288 to set the default size. Before making permanent changes, you might want to run each of these commands by hand and see if the errors disappear.

You could also place these settings in a sysctl.conf file, as plain key=value lines without the sysctl -w prefix. The init script incorporates that file on startup and runs well before rc.local.
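
A sysctl.conf file holds plain key=value lines rather than sysctl -w commands. As a rough sketch, a small helper (hypothetical name) can append each setting only when that exact line is missing, so running it twice does not duplicate entries. The path /etc/sysctl.conf is assumed from this thread.

```shell
#!/bin/sh
# Hypothetical helper: append "key=value" to a sysctl config file
# unless that exact line is already present.
add_sysctl_line() {
    key=$1 value=$2 file=$3
    grep -qxF "$key=$value" "$file" 2>/dev/null ||
        printf '%s\n' "$key=$value" >> "$file"
}

# On the router (not run here):
#   add_sysctl_line net.core.rmem_max 524288 /etc/sysctl.conf
#   add_sysctl_line net.core.rmem_default 524288 /etc/sysctl.conf
#   sysctl -p /etc/sysctl.conf    # load the file now instead of rebooting
```

The helper only matches exact lines, so an older line with a different value for the same key is left in place and should be cleaned up by hand.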

You can see the kernel values used by sysctl for the net.core values by running sysctl -a | grep net.core.
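
The same values can also be read directly from /proc/sys/net/core/. A minimal sketch, with made-up function names, that checks whether the current limit is below the 524288 bytes from the log message:

```shell
#!/bin/sh
# Hypothetical helpers for comparing a kernel limit with a wanted size.

# Succeeds (exit status 0) when the current value is below the wanted one.
needs_raise() {
    [ "$1" -lt "$2" ]
}

# Print one net.core value from /proc; prints nothing if it cannot be read.
read_core() {
    cat "/proc/sys/net/core/$1" 2>/dev/null
}

# On the router (not run here):
#   cur=$(read_core rmem_max)
#   needs_raise "$cur" 524288 && echo "rmem_max=$cur is below 524288"
```

Running this after a reboot is a quick way to confirm that a permanent setting was actually applied.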

I have no idea why nlbwmon uses the default values, considering there are many threads on this topic with nlbwmon. That's above my head and I haven't pursued it, but I'm pretty sure there is a valid reason.

Tue Oct  4 10:14:17 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of lan (br-lan)
Tue Oct  4 10:14:18 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of loopback (lo)
Tue Oct  4 10:14:21 2022 user.notice nlbwmon: Reloading nlbwmon due to ifup of wan (wan)

This is a normal response from nlbwmon. It uses a hotplug script to trigger a reload of nlbwmon whenever an ifup of a specified interface is detected.
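
For illustration only, the decision such a hotplug script makes could look like the sketch below. This is not the script shipped with nlbwmon: the function name and the monitored interface list are assumptions. OpenWrt does pass ACTION, INTERFACE and DEVICE to iface hotplug scripts through the environment.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual nlbwmon hotplug script.
# Returns 0 when the event is an "ifup" of one of the monitored interfaces.
should_reload() {
    action=$1 iface=$2 monitored=$3
    [ "$action" = "ifup" ] || return 1
    for m in $monitored; do
        [ "$m" = "$iface" ] && return 0
    done
    return 1
}

# Inside a hotplug script (not run here):
#   should_reload "$ACTION" "$INTERFACE" "lan wan loopback" &&
#       logger -t nlbwmon "Reloading nlbwmon due to $ACTION of $INTERFACE ($DEVICE)"
```

That would explain why the three reload messages appear in a row at boot, one per interface coming up.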

Thank you.

I have removed the first command from rc.local and added both commands to the /etc/sysctl.conf file.

I will test it for a couple of days. Hopefully it will solve my issue.

I really appreciate your help!

Hi All,

nlbwmon still hangs. I get this in the syslog:

Fri Oct  7 01:56:14 2022 daemon.err nlbwmon[2540]: Netlink receive failure: Out of memory
Fri Oct  7 01:56:14 2022 daemon.err nlbwmon[2540]: Unable to dump conntrack: No buffer space available
Fri Oct  7 07:40:49 2022 daemon.err nlbwmon[2540]: Netlink receive failure: Out of memory
Fri Oct  7 07:40:49 2022 daemon.err nlbwmon[2540]: Unable to dump conntrack: No buffer space available

I am using a Linksys WRT3200ACM. I have a 32GB flash drive connected, where I store the nlbwmon database. This is the output of df -h:

/dev/sda                 28.2G      1.9M     28.1G   0% /mnt/usb

So almost empty.

Any clues what might be causing this?

That's probably an OOM error, not disk space...

So a router with 512MB of RAM is not enough for nlbwmon?

It should be. What else have you got running on it?

I have vnstat, an OpenVPN client, statistics and, recently, nlbwmon. The free command shows that more than half of the memory is available:

              total        used        free      shared  buff/cache   available
Mem:         507964      175080      305276        1192       27608      293928
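
To put a number on that, here is a small sketch (hypothetical function name) that turns the Mem: row into a percentage. It assumes the seven-column layout shown above, where the last column is "available".

```shell
#!/bin/sh
# Hypothetical helper: reads `free` output on stdin and prints the
# "available" column as a whole-number percentage of "total" (truncated).
mem_available_pct() {
    awk '/^Mem:/ { printf "%d\n", $7 * 100 / $2 }'
}

# On the router (not run here):
#   free | mem_available_pct
```

For the figures above this gives 57, so plain RAM exhaustion looks unlikely at the moment the snapshot was taken.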

OK, so another kernel param, that sounds fair.

Did you restart the service?