This is an interesting idea. A few things that come to mind:
Since these lists will be public and the format is generic, anyone will be able to use them, not just adblock-lean. So perhaps a separate repo makes sense.
Perhaps we could maintain pre-processed, compressed versions of the lists which we currently support with short identifiers. This would cover Hagezi, oisd and Steven Black. The selection of oisd and Steven Black lists is quite limited (at least the selection supported by adblock-lean with short identifiers). Hagezi has a much wider selection of supported lists, so that may be a bit of a processing burden. It probably makes sense to only process a subset of those.
I don't know much about GitHub automation, but I would imagine that it should be possible to do all of the processing directly via that automation.
Regarding a separate repository, perhaps it is worth considering whether this is mainly to enhance adblock-lean or to provide a new thing in and of itself. Some of the processing might be somewhat adblock-lean-esque, e.g. duplicate removal and sanity checks. I wonder what would happen if something did not pass a sanity check? Would we simply skip the update of the compressed blocklist then?
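To make the "skip the update on a failed sanity check" question a bit more concrete, here is a rough sketch of how such a step could look. This is not how adblock-lean does it; the function names, the `MIN_ENTRIES` threshold and the simplified domain pattern are all assumptions made up for illustration, and the input is assumed to be a plain one-domain-per-line list.

```python
import gzip
import re

# Hypothetical threshold, chosen only for illustration.
MIN_ENTRIES = 3

# Simplified domain pattern (an assumption, not a full RFC-compliant check):
# at least two dot-separated labels of letters, digits, '_' or '-'.
DOMAIN_RE = re.compile(r"^(?=.{1,253}$)[a-z0-9_-]{1,63}(\.[a-z0-9_-]{1,63})+$")


def clean_domains(raw_text):
    """Strip comments and blank lines, lowercase, and drop duplicates (order kept)."""
    seen = set()
    result = []
    for line in raw_text.splitlines():
        entry = line.split("#", 1)[0].strip().lower()
        if entry and entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


def passes_sanity_check(domains):
    """Reject lists that are suspiciously short or contain malformed entries."""
    if len(domains) < MIN_ENTRIES:
        return False
    return all(DOMAIN_RE.match(d) for d in domains)


def build_compressed_list(raw_text, previous=None):
    """Return gzip-compressed cleaned list, or `previous` if the check fails."""
    domains = clean_domains(raw_text)
    if not passes_sanity_check(domains):
        # Skip the update: keep serving the last known-good compressed list.
        return previous
    return gzip.compress(("\n".join(domains) + "\n").encode("utf-8"))
```

With something like this, a broken upstream download (say, an HTML error page instead of a domain list) would leave the previously published compressed list in place rather than replacing it.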
How are you fixed for processing/bandwidth? I have theoretically unlimited bandwidth, albeit only over 4G, and the thin client.
there can be a copyright on databases, so just processing data from other sources may be a legally hot topic (or at least 'undesired' by its owners)
you can quickly run into rate limiting (or hard blocking), if your data scraping is considered to be too aggressive (or if you have too many users pulling data from you)