I'm looking for options for storing data to disk in an efficient way. I need some sort of object or key-value storage that doesn't eat a lot of resources and minimizes writes. I don't need it to be human-readable, so it can be binary.
Are there any C libraries that handle this? If not, is there any other software for OpenWrt that I could look at? I'm trying to avoid JSON as it isn't very efficient, and I would rather not have to build a database from scratch.
It might be useful to know what type of data you're asking about: is it just a list of key/value pairs, or is it a more complex dataset? Do you need to simply store it for later, or will it be actively read and manipulated? What is the source of the data? And where will it be stored? (What type of device are you using, and is the storage internal, external like a USB stick, or remote such as over the network?)
I'm looking for a persistent version of blobmsg. I know that I could use JSON, but JSON adds a lot of overhead that I don't think is necessary.
Data-wise, I'm looking to store some objects that aren't more than a few KB at worst.
I know this doesn't answer your questions and my knowledge is mostly from reading about the topic, but I'll add to the points that @psherman provided.
Modern storage devices, filesystems, kernel buffers, etc. can abstract your application's writes, so granular control of writes at the application layer may not be directly reflected on the storage hardware. I'm referring to things like flash wear leveling, write block size, and buffers that hold small updates until a larger unit can be written. Some filesystems provide data compression, so you don't have to handle that at the application level. For smaller key:value sequences it could be better to put the data in the file name, so it lives in the filesystem directory and no storage blocks are allocated to the files at all. There are lots of options, and knowing all your requirements and system specifications matters.
It may only be practical to try optimizing for a specific target, such as an all-in-one router with raw flash vs an x86 system with an ssd.
Honestly, I really like the way dnsmasq stores data. I think I'll probably just do tab-separated columns, since that seems simple and lightweight to implement. From a write perspective, I'll just make my application write changes on a timer instead of whenever a change happens. I'm also doing data validation in the application, so any corrupt data will just be thrown out.
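For anyone reading later, here is a minimal sketch of what a tab-separated store with "throw out corrupt lines" validation might look like in C. The `struct kv` layout, the field sizes, and the `kv_save`/`kv_load` names are made up for illustration and are not from dnsmasq or any OpenWrt library. It assumes values never contain a tab or newline. The save path writes a temporary file and then calls `rename()`, which on POSIX filesystems replaces the old file atomically, so a crash mid-write should leave the previous version intact.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical record layout: one "key<TAB>value" pair per line. */
struct kv {
    char key[32];
    char value[64];
};

/* Write all records to "<path>.tmp", flush to disk, then rename over path.
 * Returns 0 on success, -1 on failure. */
int kv_save(const char *path, const struct kv *recs, size_t n)
{
    char tmp[256];
    FILE *f;

    assert(path != NULL);
    if (snprintf(tmp, sizeof(tmp), "%s.tmp", path) >= (int)sizeof(tmp))
        return -1;                      /* path too long for our buffer */
    f = fopen(tmp, "w");
    if (!f)
        return -1;
    for (size_t i = 0; i < n; i++)
        fprintf(f, "%s\t%s\n", recs[i].key, recs[i].value);
    /* Push data out of stdio and kernel buffers before the rename. */
    if (fflush(f) != 0 || fsync(fileno(f)) != 0) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path) == 0 ? 0 : -1;
}

/* Read up to max records; lines that do not parse as "key<TAB>value"
 * are skipped. Returns the number of valid records loaded. */
size_t kv_load(const char *path, struct kv *recs, size_t max)
{
    char line[128];
    size_t n = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return 0;
    while (n < max && fgets(line, sizeof(line), f)) {
        /* Key: up to 31 non-tab chars. Value: up to 63 non-newline chars. */
        if (sscanf(line, "%31[^\t]\t%63[^\n]",
                   recs[n].key, recs[n].value) == 2)
            n++;
    }
    fclose(f);
    return n;
}
```

Rewriting the whole file each time is only reasonable because the data set is a few KB. The temporary file has to live on the same filesystem as the target for `rename()` to be atomic.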
Thanks for the response
How frequent are your writes?
The writes are triggered by a user, so anything from infrequent to a bunch of changes all at once. I'm looking to solve the problem of frequent writes by caching changes and then only writing when either the changes slow down or a set time has elapsed, such as every 10 minutes.
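The "write when things go quiet, or after 10 minutes at the latest" policy can be sketched as a small piece of state that the application polls from whatever periodic timer it already has. This is only an illustration: the `flush_state` struct, the function names, and the 5-second quiet period are assumptions, and only the 10-minute cap comes from the post above. The current time is passed in as a parameter so the logic can be tested without waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define QUIET_SECS   5      /* assumed: flush after 5 s with no new changes */
#define MAX_AGE_SECS 600    /* flush at the latest 10 min after first change */

/* Hypothetical bookkeeping for unsaved changes held in memory. */
struct flush_state {
    bool   dirty;           /* true if there are unsaved changes */
    time_t first_change;    /* time of the oldest unsaved change */
    time_t last_change;     /* time of the newest unsaved change */
};

/* Call whenever the in-memory data is modified. */
void note_change(struct flush_state *s, time_t now)
{
    assert(s != NULL);
    if (!s->dirty) {
        s->dirty = true;
        s->first_change = now;
    }
    s->last_change = now;
}

/* Call from a periodic timer; true means "write to disk now". */
bool should_flush(const struct flush_state *s, time_t now)
{
    if (!s->dirty)
        return false;
    return (now - s->last_change >= QUIET_SECS) ||
           (now - s->first_change >= MAX_AGE_SECS);
}

/* Call after a successful write. */
void mark_flushed(struct flush_state *s)
{
    s->dirty = false;
}
```

With this, a burst of edits results in one write a few seconds after the burst ends, and a steady trickle of edits still gets written every 10 minutes. On a router without a real-time clock, a monotonic time source (for example `clock_gettime` with `CLOCK_MONOTONIC`) is probably a safer input than wall-clock time, since the wall clock can jump when NTP syncs.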
Do you know how much f2fs caches writes? It might be fine if I write tens of bytes at a time, since kernel caching may be enough to keep a stream of small changes from each hitting the disk. My big concern is that small writes in the middle of a file would trigger the entire file to be rewritten to disk.
There is still a lot of critical information missing:
- What is the device that is running OpenWrt? This matters because of the storage options and the longevity of said storage.
- The amount of space you anticipate using is also critical...
"A few kb at worst" doesn't tell us everything, as we don't know if that's a few kb of total data storage, or a few kb per write/transaction appended to one or more data files that will grow in size over time.
Frequent writes could easily burn out the flash storage in an embedded device. They may also kill USB or SD card storage, but those are easily replaceable (unlike the flash inside an all-in-one router). OTOH, if you're writing to a spinning-rust drive or a real SSD, that becomes much less of a concern.
I think ultimately I need to just do some testing. No theoretical model can really show how things work in practice.