Save luci-statistics/collectd across reboot

user674574 · October 3, 2020, 3:55pm

Excellent, my friend. I am quite new to the OpenWrt and Linux world, so I hope it's OK if I run the script by you with my modifications, to make sure it's correct.

- Copy and paste the script into a file on OpenWrt at /etc/init.d/rrdbackup
- chmod +x /etc/init.d/rrdbackup to make it executable
- /etc/init.d/rrdbackup enable (to enable it as a service?)
- /etc/init.d/rrdbackup start (I'm not sure why this step is needed; if you want to explain, I would appreciate it. In my head this starts the script, which isn't supposed to happen until reboot?)
- (optional) add the backup target directory to /etc/sysupgrade.conf for the backup to be preserved across sysupgrades

The script:

#!/bin/sh /etc/rc.common

# OpenWrt /etc/init.d/ script to backup and restore the rrd (collectd) database, to preserve data across reboots
#
#
# howto:
# - upload this file as /etc/init.d/rrdbackup
# - (optional) adjust BACKUP_DIR below to point to a different target directory for the backup (e.g., a USB drive)
# - # chmod +x /etc/init.d/rrdbackup
# - # /etc/init.d/rrdbackup enable
# - # /etc/init.d/rrdbackup start
# - (optional) for periodic backups insert into crontab:
#      0 */6 * * * /etc/init.d/rrdbackup backup
#   adjust interval to own taste (example above backs up every 6 hours)
# - (optional) add the backup target directory to /etc/sysupgrade.conf for the backup to be preserved across sysupgrades

# queue this init script to start (i.e., restore) right before collectd starts (80)
# and stop (i.e., backup) right after collectd stops (10):
START=79
STOP=11

# add commands "backup" to manually backup database (e.g. from cronjob)
# and "restore" to restore database (undocumented, should not be needed in regular use)
EXTRA_COMMANDS="backup restore"
EXTRA_HELP="	backup  backup current rrd database"

# directories and files
# RRD_DIR needs to be relative to root, i.e. no slash in front (to mitigate tar "leading '/'" warning)
RRD_DIR=tmp/rrd
BACKUP_DIR=/etc/luci_statistics
BACKUP_FILE=stats.tar.gz

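backup() {
	# build the archive in a temp file first, then move it into place
	local TMP_FILE
	TMP_FILE=$(mktemp -u)
	tar -czf "$TMP_FILE" -C / "$RRD_DIR"
	mkdir -p "$BACKUP_DIR"
	mv "$TMP_FILE" "$BACKUP_DIR/$BACKUP_FILE"
}

restore() {
	# nothing to restore on the very first boot, which is not an error
	if [ -f "$BACKUP_DIR/$BACKUP_FILE" ]; then
		tar -xzf "$BACKUP_DIR/$BACKUP_FILE" -C /
	fi
}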

start() {
	restore
}

stop() {
	backup
}
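
To see why RRD_DIR is kept relative and how the two tar calls round-trip, here is a minimal sketch that can be run on any Linux machine without touching a router. FAKE_ROOT and the dummy load.rrd file are made up for the illustration; only the two tar invocations mirror the script above.

```shell
#!/bin/sh
# Round-trip sketch in a throwaway directory.
# FAKE_ROOT is a hypothetical stand-in for "/" so nothing real is touched.
FAKE_ROOT=$(mktemp -d)
RRD_DIR=tmp/rrd                       # relative path, as in the script
BACKUP_FILE="$FAKE_ROOT/stats.tar.gz"

# pretend collectd wrote one database file
mkdir -p "$FAKE_ROOT/$RRD_DIR/router/load"
echo "dummy" > "$FAKE_ROOT/$RRD_DIR/router/load/load.rrd"

# backup: archive tmp/rrd relative to the root
tar -czf "$BACKUP_FILE" -C "$FAKE_ROOT" "$RRD_DIR"

# simulate a reboot wiping the RAM-backed /tmp
rm -rf "$FAKE_ROOT/tmp"

# restore: unpack relative to the same root
tar -xzf "$BACKUP_FILE" -C "$FAKE_ROOT"

RESTORED=$(cat "$FAKE_ROOT/$RRD_DIR/router/load/load.rrd")
echo "restored content: $RESTORED"
```

Because the archive stores the path as tmp/rrd/..., extracting with -C / on the router puts the files back under /tmp/rrd.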

The portion of the script I don't quite understand is:

# add commands "backup" to manually backup database (e.g. from cronjob)
# and "restore" to restore database (undocumented, should not be needed in regular use)
EXTRA_COMMANDS="backup restore"
EXTRA_HELP="	backup  backup current rrd database"

Thank you

EDIT: I tried it on a VM first and it worked, so I did it on the live router. Wow, it's great how easy it was!

mbo2o · October 3, 2020, 4:25pm

start() { restore; }

Start will restore your previous backup at boot time or when run manually.

takimata · October 3, 2020, 4:43pm

EXTRA_COMMANDS registers the additional "backup" and "restore" commands, so you can call /etc/init.d/rrdbackup backup from a cronjob to "manually" trigger a backup.

EXTRA_HELP is just that, an additional line added to the script to be displayed if you run it without any parameters. (One probably will never need "restore" in regular operation, it's there, but "undocumented.")

As for the manual start step: this is not unique to this script. After you enable an init.d script, it is registered to run at startup/shutdown, but it has not been started yet. So you usually start it once, manually.
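
As a rough illustration of what registering extra commands means, here is a simplified stand-in for the dispatching that /etc/rc.common performs. It is not the real rc.common code (which also handles enable, disable, restart and more); the dispatch function and the echo bodies are invented for the sketch.

```shell
#!/bin/sh
# Simplified, hypothetical imitation of how an init script's
# first argument is mapped to a shell function of the same name.
EXTRA_COMMANDS="backup restore"

start()   { echo "start -> restore"; }
stop()    { echo "stop -> backup"; }
backup()  { echo "backing up"; }
restore() { echo "restoring"; }

dispatch() {
	# accept the standard actions plus whatever EXTRA_COMMANDS lists
	for cmd in start stop $EXTRA_COMMANDS; do
		if [ "$1" = "$cmd" ]; then
			"$cmd"
			return 0
		fi
	done
	echo "unknown command: $1" >&2
	return 1
}

dispatch backup    # what the cron job would trigger
dispatch start     # what happens at boot, or when started by hand
```

Without "backup" in EXTRA_COMMANDS, the cron call would be rejected even though the function exists in the file.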

P.S.: Even if it's not "large" by any means, and pretty straightforward, the script is still a lot larger than it actually needs to be. I tend to be a little bit more verbose in documentation, I do not compress commands down to the bare minimum, and I have some things configurable through variables even if they don't need to be (e.g., there is no real reason to have /tmp/rrd in a variable.) This is for the benefit of other people trying to understand it, and also for my own sanity when I need to revisit it after some time and try to understand my own code.

user674574 · October 3, 2020, 5:57pm

Thank you @takimata @mbo2o @hnyman @krazeh

user674574 · October 3, 2020, 6:05pm

You guys wouldn't happen to have a rough estimate of the amount of space this will take up?

hnyman · October 3, 2020, 6:20pm

RRD databases have fixed sizes, as they operate in round-robin fashion.

Typically LuCI stats only take a few hundred kilobytes, but that depends on the number of plugins you enable.

Check with du /tmp/rrd
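
A small wrapper around that check, as a sketch: rrd_usage_kb is a made-up helper name, and /tmp/rrd is only the default DataDir discussed in this thread.

```shell
#!/bin/sh
# Print the disk usage of a directory in kilobytes; defaults to /tmp/rrd.
# rrd_usage_kb is a hypothetical helper, not part of OpenWrt.
rrd_usage_kb() {
	dir="${1:-/tmp/rrd}"
	[ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
	# du prints "<kilobytes><TAB><path>"; keep only the number
	du -sk "$dir" | cut -f1
}

# On the router, report the live database size if it exists
if [ -d /tmp/rrd ]; then
	echo "rrd data: $(rrd_usage_kb) kB"
fi
```

The compressed backup can be checked separately with ls -l on the backup file, e.g. /etc/luci_statistics/stats.tar.gz when BACKUP_DIR is left at the script's default.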

user674574 · October 3, 2020, 6:23pm

Ok. So that means, if a backup is not made in X time, the database will overwrite itself?

takimata · October 3, 2020, 6:33pm

Not quite. It will (by default) keep five databases:

- for the last hour, one value every collectd interval ... those are the raw values, also known as PDP (primary data point)
- for the last day, one value every 600 seconds ... from here on down, these are CDPs (consolidated data points), i.e. averages of the "last hour" values, i.e. it averages 10 values of "the last hour" into one value
- for the last week, one value every 4200 seconds (roughly one per hour), again averages from the "last day" values, and so on ... you get the point
- for one month, one value every 18600 seconds (roughly one per 5 hours)
- for one year, one value every 219600 seconds (roughly one per 2.5 days)

This means that the longer the timeframe, the lower the detail on the values. But you will only lose data after one year. (And I believe this can even be set to be a higher timeframe.)

As for the backup: it very much depends on what you are actually logging in collectd; the more values you log, the more space it takes. It's not a huge amount, though, and it's gzipped on top. 20, 30, 40 kB, something like that.

hnyman · October 3, 2020, 6:35pm

It keeps separate data series for the different time periods. The shortest interval gets constantly overwritten, but its data gets summarised into the next longer data series.

Hour, day, week, month, year are the data periods. (144 items in each)

hnyman · October 3, 2020, 6:42pm

Actually, 30 seconds is the default interval in LuCI stats.

Otherwise, that is a quite correct and nice explanation.

takimata · October 3, 2020, 6:52pm

You are correct, I forgot that I set my collectd interval to 60 seconds and was going off the rrdtool dump output. Thanks.
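
The figures in the list above can be reproduced with plain division, shown here as a sketch. ROWS=144 and INTERVAL=30 are taken from this thread; row_step is a made-up helper, and the shortest (one hour) series is left out because it stores the raw values at the collectd interval.

```shell
#!/bin/sh
# Assumed values: ROWS as in RRARows, INTERVAL as the LuCI default
# mentioned in this thread. Adjust both to match your own setup.
ROWS=144
INTERVAL=30

# seconds covered by one stored row = timespan / rows
row_step() {
	echo $(( $1 / ROWS ))
}

# day, week, month (31 days), year (366 days) in seconds
for span in 86400 604800 2678400 31622400; do
	step=$(row_step "$span")
	echo "timespan ${span}s: one row per ${step}s, about $(( step / INTERVAL )) raw samples per row"
done
```

With a 60 second interval the day series works out to 10 raw samples per row, which is where the "averages 10 values" figure came from.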

user674574 · October 3, 2020, 8:23pm

I appreciate the help, everyone.

tmomas · October 4, 2020, 8:20am

Thumbs up for this attitude!

user674574 · October 4, 2020, 9:45am

I guess this is what decides this?

LoadPlugin rrdtool
<Plugin rrdtool>
        DataDir "/tmp/rrd"
        RRARows 144
        RRASingle true
        RRATimespan 3600
        RRATimespan 86400
        RRATimespan 604800
        RRATimespan 2678400
        RRATimespan 31622400
</Plugin>

hnyman · October 4, 2020, 9:57am

Yep, that is the data scale.

That section also defines that there are at least 144 data points (rows) in each series. (It was only 100 points earlier, but I increased that to 144 in April to provide more granularity and to decrease the frequency of mis-selected "best data series for the period".)

There is a separate definition of the data interval, 30 seconds.

user674574 · October 4, 2020, 9:58am

Are you the creator of collectd?
EDIT: Increasing this value will make the files bigger and give more detail, and that's it?

hnyman · October 4, 2020, 10:59am

No, but I maintain the collectd package in OpenWrt and also the LuCI stats app. And the definitions discussed here come from LuCI.

github.com/openwrt/luci

luci-app-statistics: modify default amount of data items in RRD

committed 06:43AM - 18 May 20 UTC

hnyman

Increase the default number of data items in the RRD database from 100 to 144. That leads to better matching summarising/averaging moments between day & week and week & month at the averaging intervals: 30sec, 10min, 70 min, 5h10min, 2d13h

Previous 100 led too easily to situations, where the longer period's more scarce data gets selected for displaying in the graph. That could happen if the longer period's last data point was stored more recently than the last data item in the originally required period. (E.g. if the last "week data item" was more recent than the last "day data item", the week data was used for the day chart.)

(Note: this change only applies in a live router if the RRD database is empty. E.g after reboot or after emptying the RRD database dir.)

Reference to discussion at #4065

Signed-off-by: Hannu Nyman <hannu.nyman@iki.fi>

So the increased amount increases granularity/details and helps to avoid picking the wrong data series in LuCI display. Read details from that commit message.

user674574 · October 5, 2020, 9:20am

Would setting it to like 300 be feasible/doable/reasonable?

hnyman · October 5, 2020, 9:45am

Sure.
The original "100" was defined years ago, when routers had very little RAM and flash.
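
To get a feel for what a different RRARows value changes, this sketch compares the day series for 100, 144 and 300 rows using plain division. seconds_per_row is a made-up helper, and the real step chosen by collectd may differ slightly, presumably because a row has to cover a whole number of collection intervals.

```shell
#!/bin/sh
# seconds covered by one stored row for a given timespan and RRARows value
# (plain integer division; a hypothetical helper for illustration only)
seconds_per_row() {
	echo $(( $1 / $2 ))
}

DAY=86400
for rows in 100 144 300; do
	echo "RRARows $rows: one day-series row per $(seconds_per_row "$DAY" "$rows")s"
done
```

More rows mean a finer step in every series and proportionally larger files. Per the note in the commit message, a changed value only takes effect for databases that are created fresh.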

user674574 · October 6, 2020, 2:08pm

Great, thanks.
There is a file /etc/config/luci_statistics that also contains a bunch of settings. In there is:

config statistics 'collectd_rrdtool'
        option enable '1'
        option DataDir '/tmp/rrd'
        option RRARows '144'
        option RRASingle '1'
        option RRATimespans '1hour 1day 1week 1month 1year'

Does this also need to be changed?
Thanks for the great replies and work.

EDIT: What would be the impact of changing:

config statistics 'collectd'
        option Interval '15'

to a lower value? Each series would still only contain 144 data points ("RRARows"), so there would be no harm in lowering the interval?