While waiting for a connection in the Vienna airport, I thought to touch on another topic. The faster_processing branch is now almost 2150 lines. Some of it are empty lines, lots of comments. Some lines count (probably nearly 100) could be decreased by simply condensing these structures
if [ condition ]
then
[dosomething]
fi
for i in a b c
do
[dosomething]
done
To these
if [ condition ]; then
[dosomething]
fi
for i in a b c; do
[dosomething]
done
This would be nice. However the script would still be too long to comfortably navigate.
So I'd suggest splitting a library out of it. In particular, the 2 biggest things we have are processing-related and config-related. Probably splitting out the config-related functions (around 560 lines including spaces and comments) into a separate library would make sense. The processing code is actually less than that at about 450 lines (including everything again) but still pretty big. So also makes sense to consider splitting it into its own library.
I know that this doesn't exactly fit with "one script, super easy install" philosophy but I suppose that this philosophy was devised way before anyone knew about the path this project actually took. And I'd suggest to consider adjusting it at this point.
Maybe not with the upcoming release but eventually.
I joined my father in his vacation on some fairly random beach in Montenegro xD. Nothing too exciting to tell (except almost missing the plane, twice), but I enjoyed it.
I like that backup of config into /tmp, as that will automatically disappear after the next reboot, but still gives you a window to find it still if needed (although it's a pretty simple config, should be easy to recreate from scratch anyway).
Also I see that flag file "new_urls_format" so the process will only happen once:
On the first run with the new script , when using gen_config this happened. Is this related to the new location maybe?
root@OpenWrt:~# service adblock-lean gen_config
Generating new default config for adblock-lean.
Error: Failed to validate the new config.
But running adblock-lean start did fix that error, after following the y/n prompts. Then gen_config worked on each subsequent run.
These two should probs be moved do /etc also do you think?
When starting with dnsmasq lists, and no flag file "new_urls_format", that second prompt Convert known dnsmasq-formatted list URLs which are currently configured as "raw domains" to "raw domain" list URLs? might not be needed? As when I hit yes on that, I end up with no blocklists, and if I hit no, then I get an invalid config:
root@OpenWrt:~# service adblock-lean start
NOTE: adblock-lean now supports both dnsmasq-formatted lists and "raw domain" lists.
The "raw domain" lists are functionally identical but their filesize is smaller and they are faster to process.
Hagezi and OISD dnsmasq-formatted list URLs can be automatically converted to "raw domain" list URLs.
If you are using other dnsmasq-formatted lists, they will be moved into separate config options and continue to work,
but you may want to look for their raw domain equivalents (sometimes called "wildcard domains").
Current "raw domains" URL config entries:
blocklist_urls=""
blocklist_ipv4_urls=""
allowlist_urls=""
Current "dnsmasq" URL config entries:
dnsmasq_blocklist_urls=""
dnsmasq_blocklist_ipv4_urls="https://nsfw.oisd.nl/dnsmasq2 https://raw.githubusercontent.com/hagezi/dns-blocklists/main/dnsmasq/pro.txt"
dnsmasq_allowlist_urls=""
Convert known dnsmasq-formatted list URLs which are currently configured as "raw domains" to "raw domain" list URLs?
(you will be asked to confirm before writing the config)
y|n: y
Updated "raw domain" URL config entries:
blocklist_urls=""
blocklist_ipv4_urls=""
allowlist_urls=""
Updated dnsmasq URL config entries:
dnsmasq_blocklist_urls=""
dnsmasq_blocklist_ipv4_urls=""
dnsmasq_allowlist_urls=""
Confirm URLs update in config?
y|n:
When generating default config, the default paths for local will get changed to /etc/. Also when converting existing config, and the files /root/adblock-lean/*list do not exist, the paths will be changed. Only if you already have these files in /root/, the paths will be migrated to the new config. I think this is the least trouble for the user and for us.
Hmm, I think I created another bug - for some reason when the dnsmasq_blocklist_ipv4_urls config entry doesn't exist, wrong URLs end up in that entry.
Edit: nah, I just accidentally put the URLs in the wrong entry to begin with
Edit2: ah, I just copied the entries from your config, and that's where the trouble started. Those URLs are supposed to be in dnsmasq_blocklist_urls rather than
in dnsmasq_blocklist_ipv4_urls.
My bad, I did a few runs and must have pasted in the the wrong config line at least one. But even when placing into the correct dnsmasq_blocklist_urls config, the same issue occured:
However, tried your updated git and while not the the same issue as before, it also doesn't seem to convert dnsmasq to raw lists:
My config still contains: dnsmasq_blocklist_urls="https://big.oisd.nl/dnsmasq2 https://raw.githubusercontent.com/hagezi/dns-blocklists/main/dnsmasq/pro.txt"
root@OpenWrt:~# service adblock-lean start
NOTE: adblock-lean now supports both dnsmasq-formatted lists and "raw domain" lists.
The "raw domain" lists are functionally identical but their filesize is smaller and they are faster to process.
Hagezi and OISD dnsmasq-formatted list URLs can be automatically converted to "raw domain" list URLs.
If you are using other dnsmasq-formatted lists, they will be moved into separate config options and continue to work,
but you may want to look for their raw domain equivalents (sometimes called "wildcard domains").
Current "raw domains" URL config entries:
blocklist_urls=""
blocklist_ipv4_urls=""
allowlist_urls=""
Current "dnsmasq" URL config entries:
dnsmasq_blocklist_urls="https://big.oisd.nl/dnsmasq2 https://raw.githubusercontent.com/hagezi/dns-blocklists/main/dnsmasq/pro.txt"
dnsmasq_blocklist_ipv4_urls=""
dnsmasq_allowlist_urls=""
Convert known dnsmasq-formatted list URLs which are currently configured as "raw domains" to "raw domain" list URLs?
(you will be asked to confirm before writing the config)
y|n: y
Updated "raw domain" URL config entries:
blocklist_urls=""
blocklist_ipv4_urls=""
allowlist_urls=""
Updated dnsmasq URL config entries:
dnsmasq_blocklist_urls="https://big.oisd.nl/dnsmasq2 https://raw.githubusercontent.com/hagezi/dns-blocklists/main/dnsmasq/pro.txt"
dnsmasq_blocklist_ipv4_urls=""
dnsmasq_allowlist_urls=""
Confirm URLs update in config?
y|n: y
Old config file was saved as /tmp/adblock-lean_config.old.
Saving new config file to '/etc/adblock-lean/config'.
Started adblock-lean.
gawk detected so using gawk for fast (sub)domain match removal.
coreutils-sort detected so sort will be fast.
No existing compressed or uncompressed blocklist identified.
Found local allowlist. Sanitizing.
Successfully processed allowlist (source file size: 1 B, sanitized line count: 0).
Not using any allowlist for blocklist processing.
Found local blocklist. Sanitizing and compressing.
Successfully processed blocklist (source file size: 1 B, sanitized line count: 0).
Starting dnsmasq blocklist part(s) download.
Downloading, checking and sanitizing dnsmasq blocklist part from: https://big.oisd.nl/dnsmasq2.
The algorithm is only intended to convert URLs in the non-dnsmasq entries. My assumption is that if someone decided to put the dnsmasq URLs in the dnsmasq config entries then they want to keep these URLs as is. I suppose it's better to not offer conversion when there is nothing to convert, but that would require some additional logic, and since for the typical user the URL conversion will only run once when they don't have anything in the dnsmasq entries yet, and have dnsmasq URLs in the non-dnsmasq entries, I'm not sure adding more logic is justified.
Going back a week here, but I think while adblock-lean is still primarily installed via command line, best to keep as one file.
But as you mentioned there are some line consolidation options to reduce line count. Also, was some of this code inteneded to be temporary? Ie after x months, can assume that most people have migrated config and/or lists, and those parts can be removed from the main branch?
Yep, the temporary things are currently: some small part of the parse_config() function intended for config migration from initial format to v2 (or higher), which was added in August, and now URLs conversion code and the config file migration logic. It is all marked in the code.
Edit: found a bug with oisd URLs conversion, fixed in the current revision.