Nlbwmon corrupts database at start of new month

I'm running OpenWrt 23.05.3 r23809-234f1a2efa / LuCI openwrt-23.05 branch git-24.086.45142-09d5a38 on an x86-64 VM.

The instance is configured with 256 MB of RAM. LuCI currently shows 102.21 MB free, so I don't think I'm starving it of RAM.

I've had a problem where nlbwmon corrupts the month's file at the start of the next month. This is May's file:

-rw-r-----    1 root     root         73728 Jun  1 00:00 20240501.db.gz

Here is the same file from a backup taken on May 30:

-rw-r-----    1 root     root        167030 May 30 23:30 20240501.db.gz

Note that the final file is less than half the size of the backup copy, so it has been truncated. And, as suspected:

root@guardhouse:~/bad# gunzip 20240501.db.gz
gunzip: unexpected end of file
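To test every archive in the database directory at once instead of one file at a time, a loop along these lines may help. It is only a sketch: `check_db_dir` is a made-up helper name, not part of nlbwmon, and it assumes the `gunzip` on the box accepts `-c` (both the BusyBox and GNU versions do).

```shell
#!/bin/sh
# check_db_dir DIR: print the path of every *.db.gz in DIR that fails to decompress.
# Returns 0 if all files are intact, 1 if at least one is damaged.
# (check_db_dir is a hypothetical helper name, not part of nlbwmon.)
check_db_dir() {
    dir="$1"
    bad=0
    for f in "$dir"/*.db.gz; do
        # Skip the unexpanded pattern when the directory has no matching files
        [ -e "$f" ] || continue
        # Decompress to /dev/null; a truncated archive makes gunzip exit non-zero
        if ! gunzip -c "$f" >/dev/null 2>&1; then
            echo "$f"
            bad=1
        fi
    done
    return "$bad"
}
```

Calling it as `check_db_dir /usr/share/nlbwmon/db` would list any damaged month files in the configured database directory without modifying them.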

This has happened to me repeatedly.

If I don't remember to take a full backup near the end of the month, I'll lose the whole month's worth of data.

This is the file from the backup, uncompressed:

-rw-r-----    1 root     root        678136 Jun 16 00:37 20240501.db

So it is fairly large (about 662 KiB uncompressed).

Oddly enough, "Compress database" was turned off in the configuration, yet the files are clearly still being compressed.

Here's the config:

config nlbwmon
        option refresh_interval '30s'
        option database_interval '1'
        option protocol_database '/usr/share/nlbwmon/protocols'
        option commit_interval '15m'
        list local_network '192.168.0.0/16'
        list local_network '172.16.0.0/12'
        list local_network '10.0.0.0/8'
        list local_network 'lan'
        option database_limit '0'
        option database_generations '0'
        option netlink_buffer_size '1572864'
        option database_directory '/usr/share/nlbwmon/db'

Does anyone else have this problem? Any suggestions?

Are there any errors related to nlbwmon in your system log?

Not that I've noticed. I'll watch more closely this month.

I'm beginning to wonder if this is just filesystem corruption when I reboot the instance. I'm going to use '/usr/libexec/nlbwmon-action backup' and copy the backup file off the box regularly as a workaround.
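For the regular copy, a plain file-level snapshot might look something like the sketch below. It deliberately copies the `.db.gz` files directly rather than calling `/usr/libexec/nlbwmon-action backup`, since I'm not assuming anything about that script's output format. The function name `snapshot_db` and the destination path in the usage note are hypothetical.

```shell
#!/bin/sh
# snapshot_db SRC DEST: copy each intact *.db.gz from SRC into DEST/<YYYYMMDD>/.
# Archives that fail to decompress are skipped and reported on stderr.
# (snapshot_db is a hypothetical helper name, not part of nlbwmon.)
snapshot_db() {
    src="$1"
    dest="$2/$(date +%Y%m%d)"
    mkdir -p "$dest" || return 1
    for f in "$src"/*.db.gz; do
        # Skip the unexpanded pattern when SRC has no matching files
        [ -e "$f" ] || continue
        copy="$dest/$(basename "$f").tmp"
        # Copy first, then test the copy, in case the original changes mid-check
        cp "$f" "$copy" || continue
        if gunzip -c "$copy" >/dev/null 2>&1; then
            mv "$copy" "$dest/$(basename "$f")"
        else
            echo "skipped damaged file: $f" >&2
            rm -f "$copy"
        fi
    done
    return 0
}
```

Because each run writes into a directory named after the current date, a snapshot taken after the month rollover cannot overwrite the last good copy from the day before. It could be run from cron with something like `snapshot_db /usr/share/nlbwmon/db /mnt/backup/nlbwmon`, where `/mnt/backup/nlbwmon` is just a placeholder, and the dated directories then copied off the box.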

If that's the case: I had it set to store the db in memory and used an init script to back up and restore from NAND on reboot. I modified sysupgrade to execute said init script at update/upgrade time.

It worked well enough, as I recall. It's been a couple of years and I've since deleted the details, but I remember it was easy enough to script.

Well, it wasn't corruption on reboot. The corruption has happened, and I haven't rebooted.
What happens is that the new month starts out large, and the old month is corrupted:

-rw-r-----    1 root     root        237568 Jul  1 00:00 20240601.db.gz
-rw-r-----    1 root     root        253896 Jul  1 02:00 20240701.db.gz

The 20240601.db.gz file cannot be unzipped. And the 20240701 file is already huge (for a month that's been going for 2 hours).

I stopped nlbwmon, copied the 07 file on top of 06, and then started it up again. That didn't work: nlbwmon shows no data at all at that point (the apparently corrupt 20240701 file seems to prevent it from reading anything).

I ended up having to restore a backup to get things working again.

This stinks.

I'm running x86-64 in a KVM VM. I don't know if that's part of the problem.