Hi all, I wanted to share a side project I've been developing lately: a library of functions that implement arrays in POSIX shell.
Some of you may have seen an early version some time ago. Since then I've rewritten it with a completely different (and much faster) algorithm, and added lots of new functions.
I wouldn't recommend using this with arrays larger than a few hundred elements, but within those limits I think the performance is not bad, and for small arrays it's comparable to Bash array performance on an x86 CPU. On my 11-year-old mips24 router CPU it's more than an order of magnitude slower than on my x86 CPU, but still very usable with a 100-element array.
The functions can be used either by sourcing the script, or by cherry-picking what you need.
I'm still making minor tweaks to the code from time to time but I think it's close to feature-complete and it passes a ton of tests I wrote for it.
If you find any bugs, please let me know. And I would be interested to hear what you think of the implementation.
Update: simplified some of the logic, fixed a few edge case bugs, implemented some additional functions and significantly optimized performance for complex workloads.
FYI: OpenWrt has a jshn library that can create arrays and maps, and save them to or load them from JSON.
It's not a pure shell solution because it also relies on a binary, so it will work only on OpenWrt unless the binary is made available on other platforms.
I suppose it may be useful in some cases, but I doubt it's a good solution for routine shell coding where arrays may be needed. Loading an external binary from shell code is orders of magnitude slower than running native shell code. My implementation avoids external binaries as much as possible, so the only case where they are called is when the array needs to be sorted.
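To illustrate that array-like storage is possible with shell builtins only, here is a minimal sketch. It is not the library's API or its algorithm: the helper names `arr_set` and `arr_get` and the `_arr_<name>_<index>` variable naming scheme are assumptions made up for this example. Each element simply lives in its own shell variable, and `eval` is used to build the variable name.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only (not the library's functions).
# Element <index> of array <name> is stored in the variable _arr_<name>_<index>.

# arr_set <array_name> <index> <value>
arr_set() {
    # Name and index end up inside eval, so validate them first.
    case "$1" in ''|*[!A-Za-z0-9_]*) return 1 ;; esac
    case "$2" in ''|*[!0-9]*) return 1 ;; esac
    # \$3 is expanded by eval itself, so the value is never re-parsed as code.
    eval "_arr_${1}_${2}=\$3"
}

# arr_get <array_name> <index> <output_var>
arr_get() {
    case "$1" in ''|*[!A-Za-z0-9_]*) return 1 ;; esac
    case "$2" in ''|*[!0-9]*) return 1 ;; esac
    case "$3" in ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;; esac
    # Write the result into a caller-supplied variable to avoid a subshell.
    eval "$3=\$_arr_${1}_${2}"
}

arr_set fruits 0 "apple pie"
arr_get fruits 0 item
printf '%s\n' "$item"   # prints: apple pie
```

Returning the value through an output variable rather than through `$(...)` is a deliberate choice in this sketch: command substitution forks a subshell, which costs time on a slow CPU even though no external binary is involved.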
jshn is used in almost all OpenWrt scripts, so there is a high chance you already have it installed. Arrays and maps are implemented in shell, and the jshn binary is only used to parse JSON and print the shell statements that create an array. So technically speaking, if you don't parse or save JSON, the binary won't be called.
The only reason not to use it is if you want the script to be portable to other platforms where jshn is not present.
Please take a look at it; at the very least you may find something useful.
I'm sure jshn is a useful tool, but that doesn't make it "the tool" for every task. Even if a binary is called only once in a script, each call comes at a price in terms of performance. I can definitely see how this is justified when parsing JSON. Personally I just use grep and/or sed for that, because it's faster, the piece of information my scripts care about in the JSON file is typically specific and easy to extract, and I'd rather avoid extra dependencies. But when you need functionality that's easy to implement in native shell code, for me it's not justified. YMMV.
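As a rough illustration of the grep/sed approach mentioned above, here is a minimal sketch that pulls a string value for a known key out of JSON text with a single sed call. The function name `json_str_value` is hypothetical and these are not the exact commands from my scripts. It assumes the key and its value sit on one line, the value is a plain string without escaped quotes, and the key appears at most once per line.

```sh
#!/bin/sh
# Hypothetical helper: json_str_value <key>
# Reads JSON text on stdin and prints the string value of "key"
# for every input line where it is found.
json_str_value() {
    # The key is pasted into the sed expression, so allow only safe characters.
    case "$1" in ''|*[!A-Za-z0-9_]*) return 1 ;; esac
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

printf '%s\n' '{"name": "router", "ip":"192.168.1.1"}' | json_str_value ip
# prints: 192.168.1.1
```

This is not a JSON parser, and it will break on nested or unusually formatted input. It only works when the field of interest is specific and predictable, which is the situation described above.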
As an example, another side project I've worked on recently detects local networks on a machine and aggregates the subnets it finds. The initial implementation took about 5 seconds to complete the aggregation part with just 3 subnets (on my router CPU). The current version takes 0.4 seconds, and that includes detection, aggregation of 7 subnets and validation. What changed in the code? Mostly replacing calls to external binaries with native shell implementations.
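To give an idea of what this kind of replacement looks like, here is a small sketch (not code from that project). It splits a CIDR string into address and prefix length using only parameter expansion, where a more obvious version would pipe the string through `cut` twice. The function name `split_cidr` and the result variables `cidr_addr` and `cidr_len` are made up for this example.

```sh
#!/bin/sh
# External-binary version, for comparison (two forks plus two cut processes):
#   addr=$(printf '%s\n' "$cidr" | cut -d/ -f1)
#   len=$(printf '%s\n' "$cidr" | cut -d/ -f2)

# Hypothetical native version: split_cidr <addr/len>
# Sets cidr_addr and cidr_len; returns 1 on malformed input.
split_cidr() {
    # Require exactly one "/" with text on both sides.
    case "$1" in
        */*/*) return 1 ;;
        ?*/?*) ;;
        *) return 1 ;;
    esac
    cidr_addr=${1%/*}   # remove the shortest suffix matching "/*"
    cidr_len=${1#*/}    # remove the shortest prefix matching "*/"
    # The prefix length must consist of digits only.
    case "$cidr_len" in *[!0-9]*) return 1 ;; esac
}

split_cidr 192.168.1.0/24 && printf '%s %s\n' "$cidr_addr" "$cidr_len"
# prints: 192.168.1.0 24
```

The native version runs entirely inside the shell process, so nothing is forked or executed. In a loop over many subnets that difference adds up, especially on a slow router CPU.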