Save luci-statistics/collectd across reboot

Excellent, my friend. I am quite new to the OpenWrt and Linux world, so I hope it's OK if I run the script by you with my modifications, to make sure it's correct.

  1. Copy and paste that script into a file on OpenWrt called rrdbackup, located at /etc/init.d/rrdbackup
  2. chmod +x /etc/init.d/rrdbackup to make it executable
  3. /etc/init.d/rrdbackup enable (to enable it as a service?)
  4. /etc/init.d/rrdbackup start (not sure why this step is needed; if you want to explain, I would appreciate it. In my head this starts the script, which isn't supposed to happen until reboot?)
    • (optional) add the backup target directory to /etc/sysupgrade.conf for the backup to be preserved across sysupgrades
  5. the script
#!/bin/sh /etc/rc.common

# OpenWrt /etc/init.d/ script to backup and restore the rrd (collectd) database, to preserve data across reboots
#
#
# howto:
# - upload this file as /etc/init.d/rrdbackup
# - (optional) adjust BACKUP_DIR below to point to a different target directory for the backup (e.g., a USB drive)
# - # chmod +x /etc/init.d/rrdbackup
# - # /etc/init.d/rrdbackup enable
# - # /etc/init.d/rrdbackup start
# - (optional) for periodic backups insert into crontab:
#      0 */6 * * * /etc/init.d/rrdbackup backup
#   adjust interval to own taste (example above backs up every 6 hours)
# - (optional) add the backup target directory to /etc/sysupgrade.conf for the backup to be preserved across sysupgrades

# queue this init script to start (i.e., restore) right before collectd starts (80)
# and stop (i.e., backup) right after collectd stops (10):
START=79
STOP=11

# add commands "backup" to manually backup database (e.g. from cronjob)
# and "restore" to restore database (undocumented, should not be needed in regular use)
EXTRA_COMMANDS="backup restore"
EXTRA_HELP="	backup  backup current rrd database"

# directories and files
# RRD_DIR needs to be relative to root, i.e. no slash in front (to mitigate tar "leading '/'" warning)
RRD_DIR=tmp/rrd 
BACKUP_DIR=/etc/luci_statistics
BACKUP_FILE=stats.tar.gz

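backup() {
	# build the archive in a temporary file first, then move it into place
	local TMP_FILE="$(mktemp -u)"
	tar -czf "$TMP_FILE" -C / "$RRD_DIR" || { rm -f "$TMP_FILE"; return 1; }
	mkdir -p "$BACKUP_DIR"
	mv "$TMP_FILE" "$BACKUP_DIR/$BACKUP_FILE"
}

restore() {
	# no backup yet (e.g. first run) is not an error
	[ -f "$BACKUP_DIR/$BACKUP_FILE" ] || return 0
	tar -xzf "$BACKUP_DIR/$BACKUP_FILE" -C /
}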

start() {
	restore
}

stop() {
	backup
}

The portion of the script I don't quite understand is:

# add commands "backup" to manually backup database (e.g. from cronjob)
# and "restore" to restore database (undocumented, should not be needed in regular use)
EXTRA_COMMANDS="backup restore"
EXTRA_HELP="	backup  backup current rrd database"

Thank you

EDIT: I tried it on a VM first and it worked, so I did it on the live router. WOW, it's great how easy it was!

start() { restore; }

Start will restore your previous backup at boot time or when run manually
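The backup/restore pair boils down to a tar round trip. Here is a minimal sketch of that round trip in a scratch directory, so it can be tried without touching a router. `ROOT` is a hypothetical stand-in for `/` (the real script uses `-C /`), and the `router/load/load.rrd` file is made-up sample data.

```shell
# Scratch directory standing in for "/" (assumption; the real script uses -C /)
ROOT=$(mktemp -d)
RRD_DIR=tmp/rrd
BACKUP="$ROOT/etc/luci_statistics/stats.tar.gz"

# Create some fake RRD data and the backup target directory
mkdir -p "$ROOT/$RRD_DIR/router/load" "$ROOT/etc/luci_statistics"
echo "sample" > "$ROOT/$RRD_DIR/router/load/load.rrd"

# backup: archive the directory by its path relative to the root
tar -czf "$BACKUP" -C "$ROOT" "$RRD_DIR"

# Simulate a reboot wiping the live data
rm -rf "$ROOT/$RRD_DIR"

# restore: unpack relative to the same root
[ -f "$BACKUP" ] && tar -xzf "$BACKUP" -C "$ROOT"

cat "$ROOT/$RRD_DIR/router/load/load.rrd"
```

Because the archive stores the path relative to the root, unpacking with the same `-C` target puts the files back where they came from.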

EXTRA_COMMANDS registers the additional "backup" and "restore" commands, so you can call /etc/init.d/rrdbackup backup from a cronjob to "manually" trigger a backup.

EXTRA_HELP is just that: an additional line of help text, displayed when you run the script without any parameters. (One probably will never need "restore" in regular operation. It's there, but "undocumented.")
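For the cronjob case, a small sketch that adds the six-hourly entry from the script's comments to a crontab file only if it is not already there. The helper name `add_backup_cron` is made up for illustration, and `/etc/crontabs/root` is assumed to be root's crontab, as on a stock OpenWrt install.

```shell
# Append the periodic backup entry to a crontab file, once.
# add_backup_cron is a hypothetical helper; the default path is an assumption.
add_backup_cron() {
	crontab_file="${1:-/etc/crontabs/root}"
	entry='0 */6 * * * /etc/init.d/rrdbackup backup'
	# -F: match the entry literally, -q: no output
	grep -qF "$entry" "$crontab_file" 2>/dev/null || echo "$entry" >> "$crontab_file"
}

# On the router one would then run:
#   add_backup_cron && /etc/init.d/cron restart
```

Running it twice leaves a single entry, so it is safe to re-run.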

As for step 4: this is not unique to this script. Enabling an init.d script registers it to run at startup/shutdown, but does not start it. So you usually start it once, manually.
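To see the difference between "enabled" and "started", one can look at what enabling actually registers. This is a sketch assuming the usual OpenWrt layout, where `enable` creates `S<START>` and `K<STOP>` symlinks in /etc/rc.d; `check_enabled` is a hypothetical helper name.

```shell
# Report whether an init script is registered for boot/shutdown.
# check_enabled is a hypothetical helper; paths assume a stock OpenWrt layout.
check_enabled() {
	name="$1"
	if [ ! -x "/etc/init.d/$name" ]; then
		echo "$name is not installed on this system"
		return 0
	fi
	# with START=79 and STOP=11 this should list S79rrdbackup and K11rrdbackup
	ls /etc/rc.d/ | grep "$name" || true
	# rc.common's "enabled" command returns 0 when the script is registered
	if "/etc/init.d/$name" enabled; then
		echo "$name is enabled"
	else
		echo "$name is not enabled"
	fi
}

check_enabled rrdbackup
```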

P.S.: Even if it's not "large" by any means, and pretty straightforward, the script is still a lot larger than it actually needs to be. I tend to be a little bit more verbose in documentation, I do not compress commands down to the bare minimum, and I have some things configurable through variables even if they don't need to be (e.g., there is no real reason to have /tmp/rrd in a variable.) This is for the benefit of other people trying to understand it, and also for my own sanity when I need to revisit it after some time and try to understand my own code.

Thank you @takimata @mbo2o @hnyman @krazeh

You guys wouldn't happen to have a rough estimate of the amount of space this will take up?

RRD databases have fixed sizes, as they operate in round-robin fashion.

Typically LuCI stats only take a few hundred kilobytes, but that depends on the number of plugins you enable.

Check with du /tmp/rrd
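A slightly fuller version of that check, as a sketch: it reports the size of the live RRD directory and, if one exists, the backup archive. `rrd_usage` is a made-up helper name; the default paths are the ones used in the script in this thread.

```shell
# Print the size of the RRD directory (in kB) and the backup archive, if present.
# rrd_usage is a hypothetical helper; defaults match the script in this thread.
rrd_usage() {
	rrd_dir="${1:-/tmp/rrd}"
	backup="${2:-/etc/luci_statistics/stats.tar.gz}"
	[ -d "$rrd_dir" ] && du -sk "$rrd_dir"
	[ -f "$backup" ] && ls -l "$backup"
	return 0
}

rrd_usage
```

Comparing the two numbers shows how much the gzip compression saves on your particular set of plugins.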

Ok. So that means, if a backup is not made in X time, the database will overwrite itself?

Not quite. It will (by default) keep five data series in each database:

  • for the last hour, one value every collectd interval ... those are the raw values, also known as PDP (primary data point)
  • for the last day, one value every 600 seconds ... from here on down, these are CDPs (consolidated data points), i.e. averages of the "last hour" values, i.e. it averages 10 values of "the last hour" into one value
  • for the last week, one value every 4200 seconds (roughly one per hour), again averages from the "last day" values, and so on ... you get the point
  • for one month, one value every 18600 seconds (roughly one per 5 hours)
  • for one year, one value every 219600 seconds (roughly one per 2.5 days)

This means that the longer the timeframe, the lower the detail on the values. But you will only lose data after one year. (And I believe this can even be set to be a higher timeframe.)
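The step sizes in the list above are just the timespan divided by the number of rows. A quick sketch of that arithmetic, assuming 144 rows per series and timespans of one hour, day, week, 31-day month and 366-day year in seconds:

```shell
# Seconds covered by one row = timespan / rows (144 rows is an assumption here)
ROWS=144
rra_steps() {
	for span in 3600 86400 604800 2678400 31622400; do
		echo "$span $((span / ROWS))"
	done
}

rra_steps
# 3600 25
# 86400 600
# 604800 4200
# 2678400 18600
# 31622400 219600
```

The 25 seconds for the hourly series is only nominal, since that series stores the raw values at the collectd interval.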

The backup? It very much depends on what you will actually be logging in collectd, the more values you will be logging, the more space. It's not a huge amount, though, and it's gzipped on top. 20, 30, 40 kB, something like that.

It keeps separate data series for the different time periods. The shortest interval gets constantly overwritten, but its data gets summarised into the next longer data series.

Hour, day, week, month, year are the data periods. (144 items in each)

Actually, 30 seconds is the default interval in LuCI stats.

Otherwise quite correct, nice explanation.

You are correct, I forgot that I set my collectd interval to 60 seconds and was going off the rrdtool dump output. Thanks.
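The interval changes how many raw values go into each consolidated value, not the spacing of the consolidated values themselves. A small sketch of that arithmetic for the 600-second "last day" rows (`samples_per_row` is a made-up helper name):

```shell
# Raw samples averaged into one consolidated row = row step / collectd interval
samples_per_row() {
	step="$1"
	interval="$2"
	echo $((step / interval))
}

samples_per_row 600 60   # prints 10 (interval of 60 seconds)
samples_per_row 600 30   # prints 20 (LuCI default of 30 seconds)
```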

Appreciate the help, everyone.

Thumbs up for this attitude!

I guess this is what decides this?

LoadPlugin rrdtool
<Plugin rrdtool>
        DataDir "/tmp/rrd"
        RRARows 144
        RRASingle true
        RRATimespan 3600
        RRATimespan 86400
        RRATimespan 604800
        RRATimespan 2678400
        RRATimespan 31622400

Yep, that is the data scale.

That section also defines that there are at least 144 data points (rows) in each series. (It was only 100 points earlier, but I increased that to 144 in April to provide more granularity and to decrease the frequency of mis-selected "best data series for the period".)

There is a separate definition of the data interval, 30 seconds.

Are you the creator of collectd?
EDIT: Increasing this value will make the files bigger and give more detail, that's it?

No, but I maintain the collectd package in OpenWrt and also the LuCI stats app. And the definitions discussed here come from LuCI.

So the increased row count increases granularity/detail and helps to avoid picking the wrong data series in the LuCI display. The details are in that commit message.

Would setting it to like 300 be feasible/doable/reasonable?

Sure.
The original "100" was defined years ago, when routers had really tiny RAM and flash.

Great, thanks.
There is also a file, /etc/config/luci_statistics, that contains a bunch of settings. In there is:

config statistics 'collectd_rrdtool'
        option enable '1'
        option DataDir '/tmp/rrd'
        option RRARows '144'
        option RRASingle '1'
        option RRATimespans '1hour 1day 1week 1month 1year'

Does this also need to be changed?
Thanks for the great replies and work.
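For reference, a sketch of how the value in that file could be changed from the shell with `uci`, addressing the option as `luci_statistics.collectd_rrdtool.RRARows` (config file, section name, option, as shown in the snippet above). The helper name `set_rra_rows` is made up, and restarting via /etc/init.d/luci_statistics is an assumption about how the new setting gets applied. Existing .rrd files may keep their old layout until they are recreated.

```shell
# Set RRARows in /etc/config/luci_statistics via uci.
# set_rra_rows is a hypothetical helper; the restart step is an assumption.
set_rra_rows() {
	rows="$1"
	if ! command -v uci >/dev/null 2>&1; then
		echo "uci not found; run this on the router"
		return 0
	fi
	uci set luci_statistics.collectd_rrdtool.RRARows="$rows"
	uci commit luci_statistics
	/etc/init.d/luci_statistics restart
}

# On the router: set_rra_rows 300
```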

EDIT: What would be the impact of changing:

config statistics 'collectd'
        option Interval '15'

to a lower value? Each series would still only contain 144 data points (RRARows), so there would be no harm in lowering the interval?