Adblock support thread

Yep, it looks like what I want is the /etc/init.d/adblock report functionality. Sadly opkg is installing 3.5.5-3, and I don't think I'm feeling up for trying to get an out-of-band release working on my router.

As a workaround to my particular problem, is it possible to have adblock ignore a list of IPs or MAC addresses on my LAN completely, so the traffic from those locations isn't restricted?

Good day to everybody!

Using AdBlock with OpenWrt 18.06.2 I faced a strange problem: sometimes after refreshing the lists, the adb_list.overall file contains reversed domain names (see below) :scream: Of course, blocking does not work in this case. Please help me figure out the reason.

...
server=/com.microsoft.watson/
server=/com.microsoft.telemetry.ppe.watson/
server=/com.microsoft.telemetry.watson/
server=/com.microsoft.data.vortex.web/
server=/com.microsoft.telemetry.df.wes/
server=/com.microsoft.ipv6.win10/
server=/com.microsoft.ipv6.win1710/
server=/com.microsoft.ipv6.win8/
server=/net.msedge.www/
...

Please forgive a question that may be simplistic. I use Roku to stream, and would like to stream the PBS channel. In order to do so, I need to exclude pbs.org pages from being affected by adblock (in other words, not the pbs.org page itself, but those links on that page). I don't think this is as simple as adding pbs.org and www.pbs.org to the whitelist (at least, this approach didn't work). Is there a way to configure adblock so any site linked to on a pbs.org site is no longer subject to adblock? It's a little irritating to turn adblock off each time I want to stream PBS.

Most probably an OOM situation where the adblock processing dies in the middle of nowhere ... reduce the number of blocklist sources you use.
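To check whether memory really is the limit, a minimal sketch along these lines could be run before or during a list refresh. It assumes a Linux kernel that reports MemAvailable in /proc/meminfo; the function names and any threshold you pass in are illustrative and not part of adblock.

```shell
#!/bin/sh
# Print the MemAvailable value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; the optional path argument only eases testing.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if at least $1 kB are available.
# The threshold is the caller's own guess, not a value taken from adblock.
enough_memory() {
    avail="$(mem_available_kb "$2")"
    [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}
```

For example, `enough_memory 65536 || echo "less than 64MB available"` gives a quick warning before kicking off a refresh.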

Whenever you whitelist a domain like 'pbs.org', all the stuff coming from that domain is accessible, e.g. all deep links related to that domain. However, in your case it looks like the videos are coming from a separate content provider ... please whitelist 'jwplatform.com', 'jwpcdn.com', 'ga.video.cdn.pbs.org' and 'prd.jwpltx.com' as well ... re-run adblock and check again.
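As a rough sketch of that step, the helper below appends domains to a whitelist file and skips entries that are already present. The function name is made up for illustration, and the path /etc/adblock/adblock.whitelist in the comment is an assumption about where your adblock version keeps its whitelist; check your own config.

```shell
#!/bin/sh
# Append each domain to a whitelist file unless it is already listed.
# Usage: whitelist_add FILE DOMAIN...
whitelist_add() {
    file="$1"; shift
    touch "$file"
    for domain in "$@"; do
        # -x: match the whole line, -F: fixed string (dots are literal), -q: quiet
        grep -qxF "$domain" "$file" || printf '%s\n' "$domain" >> "$file"
    done
}

# On the router (whitelist path is an assumption, see above):
#   whitelist_add /etc/adblock/adblock.whitelist jwplatform.com jwpcdn.com \
#       ga.video.cdn.pbs.org prd.jwpltx.com
#   /etc/init.d/adblock restart
```

Afterwards, `/etc/init.d/adblock query <domain>` can be used to see how a given domain is handled.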

dibdot,
I'm afraid that didn't work. Just to make sure, I restarted adblock with /etc/init.d/adblock restart and then did a query with /etc/init.d/adblock query ga.video.cdn.pbs.org to make sure it was on the whitelist. And unfortunately when I try to stream a video it still aborts to the Roku home page.

I found out I needed to add pubads.g.doubleclick.net to the whitelist and that did the trick. Unfortunately I'm not thrilled about adding an ad site to my whitelist in order to stream PBS.

daemon.err adblock.sh[7046]: sort: write failed: /tmp/sortiIbgLa: No space left on device

I am using a router with 256MB RAM and I get that message in syslog about three minutes after adblock starts. My blacklist is ~850K entries or ~480K after deduplication.

I was watching RAM usage and free space under /tmp and there was still over 50MB left when that message was logged. Is that what is expected to happen with 256MB RAM, or am I doing something wrong?

UPDATE: adb_list.overall is under /root/, so adblock should have access to over 100MB of RAM...

Maybe you've reached the limits of the busybox sort implementation with such a huge list; retry with the full sort package ('coreutils-sort').
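If it is unclear which sort is actually first in PATH, a small check like the one below may help. It relies on GNU tools printing "GNU" in the first line of their `--version` output. Anything else (including BusyBox applets, which generally do not accept that option) is reported as "other". The function names are illustrative.

```shell
#!/bin/sh
# Classify the first line of "<tool> --version" output.
flavor_from_banner() {
    case "$1" in
        *GNU*) echo gnu ;;
        *)     echo other ;;
    esac
}

# Report the flavor of the tool found first in PATH, e.g. "tool_flavor sort".
tool_flavor() {
    flavor_from_banner "$("$1" --version 2>&1 | head -n 1)"
}

tool_flavor sort
```

If this prints "other", installing the full package with `opkg install coreutils-sort` and running the check again should show whether the GNU version is now being picked up.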

You were right. I increased the memory (my OpenWrt is a VM) to 192MB and everything went back to normal. Thank you :slight_smile:

That is the one I am using. I have not tried the busybox implementation assuming it is worse.

@dibdot
https://ransomwaretracker.abuse.ch/downloads/RW_DOMBL.txt now redirects to https://ransomwaretracker.abuse.ch/byebye.php (it has been discontinued.)
And http://someonewhocares.org/hosts/hosts now redirects to https://someonewhocares.org/hosts/hosts.

Using curl as download utility, I would end up with errors in adb_list.overall (see screenshot.) This would send Unbound into a crash loop for me.

Disabling the ransomwaretracker block list and either making sure whocares uses HTTPS or adding the -L parameter to curl fixed my problem, but just letting you know.
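For illustration, here is a hedged sketch of a download step that follows redirects and rejects responses that look like an HTML page instead of a host list. The function names are hypothetical and the HTML check is only a heuristic. This is not how adblock itself downloads its sources.

```shell
#!/bin/sh
# Succeed if the file looks like an HTML page rather than a host list.
# Heuristic only: looks for a doctype or <html> tag in the first few lines.
looks_like_html() {
    head -n 5 "$1" | grep -qiE '<!doctype html|<html'
}

# Download URL ($1) to FILE ($2).
# -L follows redirects, -f fails on HTTP errors, -sS is quiet but shows errors.
fetch_list() {
    curl -fsSL -o "$2" "$1" && ! looks_like_html "$2"
}
```

With `-L` alone, a discontinued list that redirects to an HTML goodbye page would still be downloaded successfully, which is why the sketch adds the content check.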

Many thanks, will be fixed in adblock 3.8.13 (see https://github.com/openwrt/packages/pull/10747)!

Please retry with adblock 3.8.13 (see https://github.com/openwrt/packages/pull/10747), where you can set the sort temp directory: set 'adb_sorttmpdir' in the extra section of the adblock config accordingly. This is quite experimental and not exposed in LuCI yet.
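What such a setting boils down to can be sketched with sort's `-T` option, which tells GNU sort where to put its temporary files. The function name, the uci section name, and the example path below are assumptions for illustration. Only the option name 'adb_sorttmpdir' comes from the post above.

```shell
#!/bin/sh
# Sort and de-duplicate FILE ($2), placing sort's temporary files in DIR ($1).
# -T is honoured by GNU sort (coreutils-sort); other implementations may ignore it.
sort_unique_in() {
    mkdir -p "$1" && sort -u -T "$1" "$2"
}

# Assumed way to set the adblock option (section name and path are guesses):
#   uci set adblock.extra.adb_sorttmpdir='/mnt/usb/sorttmp'
#   uci commit adblock
```

The directory should be on storage with enough free space for the temporary files; `df -k <dir>` shows what is available.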

I cannot; my router does not have that much storage. I am gonna try de-duping the list during the firmware build process.

I have noticed that the files tmp.add.whitelist & tmp.raw.whitelist are not removed until the very end, while only tmp.rem.whitelist is actually used in the last stage. Can those two be removed earlier to free up some RAM?

Also, the whitelist file is using a pattern like

"^api\.segment\.io|\.api\.segment\.io",

but would not

"^(|.*\.)api\.segment\.io"

do the same thing while making the whitelist file much smaller?

Could a missing $ (to check for the end of the line) at the end of the pattern cause some minor issues? Line api.segment.io.blah should not be filtered out in the example below.

cat 1.txt
api.segment.io
x.api.segment.io
api.segment.io.blah
blah

cat tmp.rem.whitelist 
^api\.segment\.io\|\.api\.segment\.io

grep -vf tmp.rem.whitelist 1.txt 
blah

UPDATE: I have just compared the results, and ~2K domains that should have been blacklisted ended up being excluded. Here are examples:

Whitelisted: microsoft.com
Removed from blacklist: safety.microsoft.com.ruqem.yq7flcfpxhylyajsqc.trade

Whitelisted: apple.com
Removed from blacklist: apple.computersoftwaresecurityinstall.xyz
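To make the anchoring issue concrete, here is a minimal sketch of a pattern that matches a whitelisted domain and its subdomains but nothing that merely starts with it. The helper name is made up, and this is one possible pattern, not necessarily the one adblock ends up using.

```shell
#!/bin/sh
# Build an extended regex matching DOMAIN and its subdomains, anchored at both ends.
whitelist_regex() {
    # Escape the dots in the domain, then wrap it as ^(.*\.)?domain$
    printf '^(.*\\.)?%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}

# Same input as 1.txt above; only the first two lines should be filtered out.
printf '%s\n' api.segment.io x.api.segment.io api.segment.io.blah blah |
    grep -Ev "$(whitelist_regex api.segment.io)"
# prints:
#   api.segment.io.blah
#   blah
```

With the trailing `$`, a whitelist entry such as microsoft.com no longer removes safety.microsoft.com.ruqem.yq7flcfpxhylyajsqc.trade from the blacklist, because that name does not end in microsoft.com.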

Sigh, another day another bug ... :wink:
Many thanks, the whitelist issues will be fixed with this PR https://github.com/openwrt/packages/pull/10758

The PR only removes one whitelist tmp file: is the other one still required?

Thx for a quick fix. I will test it the moment the packages are built.

In the meantime, would you consider using --compress-program=/bin/gzip for the sort operation to compress the temp files?
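A rough sketch of what I mean is below. It assumes a sort that accepts `--compress-program` (GNU sort from coreutils-sort does) and falls back to a plain sort otherwise. The function name is illustrative, and gzip is looked up via PATH rather than hard-coded to /bin/gzip.

```shell
#!/bin/sh
# Sort and de-duplicate the given files, asking sort to gzip its temporary
# files when the installed sort supports --compress-program.
sort_unique_compressed() {
    if sort --compress-program=gzip </dev/null >/dev/null 2>&1; then
        sort -u --compress-program=gzip "$@"
    else
        sort -u "$@"
    fi
}
```

Compression only comes into play once sort actually spills to temporary files, so small inputs behave exactly like a plain `sort -u`.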

My final blacklist is around 511K lines and there is no chance it will ever get smaller, so I decided to do some major performance benchmarking. The router is a GL-B1300 (Atheros IPQ4028, quad-core ARM, 717MHz) with 256MB RAM, and it is currently taking ~1 hour to prepare the list.

3.8.14 has already dropped the processing time from ~65 minutes to ~58 minutes (probably due to a simpler regex).

For the tests below I made a copy of the Adblock work directory and put all the files under /tmp/test1.

Then I compared egrep from BusyBox against grep -E from the grep package and redirected all output to /dev/null to eliminate the impact of the filesystem, etc. The difference is 30m vs 30s.

time /bin/egrep -vf /tmp/test1/tmp.rem.whitelist /tmp/test1/adb_list.overall | /usr/bin/awk '{print "server=/"$0"/"; }' > /dev/null
real	31m 8.40s
user	30m 49.43s
sys	0m 13.96s

vs

time /usr/bin/grep -Evf /tmp/test1/tmp.rem.whitelist /tmp/test1/adb_list.overall | /usr/bin/awk '{print "server=/"$0"/"; }' > /dev/null 
real	0m 33.63s
user	0m 22.75s
sys	0m 1.10s

The next issue is the writes. The script is using ">" and ">>" and those are writing every single line (one at a time) to flash, which is slow and even more so if the router is using JFFS2 (a compressed filesystem). Any basic buffering here should improve the throughput tremendously, so I compared "> ./t" against "| tee ./t >/dev/null": 37m vs 3m. BTW, writes to files under /tmp would also improve a lot if the redirects were replaced with tee or tee -a.

time /usr/bin/grep -Evf /tmp/test1/tmp.rem.whitelist /tmp/test1/adb_list.overall | /usr/bin/awk '{print "server=/"$0"/"; }' > ./t
real	37m 3.74s
user	0m 23.00s
sys	0m 1.01s

vs

time /usr/bin/grep -Evf /tmp/test1/tmp.rem.whitelist /tmp/test1/adb_list.overall | /usr/bin/awk '{print "server=/"$0"/"; }' | tee t >/dev/null
real	3m 22.62s
user	0m 23.26s
sys	0m 1.19s

@dibdot Do you mind reviewing these results? In this particular use case, the processing time could be dropped from ~1 hour down to ~5 minutes for a massive blacklist. What is also important is that the buffering will minimize the writes to flash thus extending its life.

As a quick all-in test, I modified one line in adblock.sh and the time to prepare adb_list.overall immediately dropped from 58m down to only 7m (I think one or so more minutes can be shaved off by using buffering via tee for temporary files).

diff adblock.sh /usr/bin/adblock.sh 
674c674
< 				egrep -vf "${adb_tmpdir}/tmp.rem.whitelist" "${adb_tmpdir}/${adb_dnsfile}" | eval "${adb_dnsdeny}" >> "${adb_dnsdir}/${adb_dnsfile}"
---
> 				grep -Evf "${adb_tmpdir}/tmp.rem.whitelist" "${adb_tmpdir}/${adb_dnsfile}" | eval "${adb_dnsdeny}" | tee -a "${adb_dnsdir}/${adb_dnsfile}" >/dev/null

I have not noticed any changes in the memory utilization (there is 256MB of RAM on this router).

UPDATE: I did a few more tests by installing the coreutils-tee, gzip, and tar packages one at a time, and they made no difference performance-wise. So only the two changes below provide a major performance boost:

  1. grep & grep -E (from grep) instead of grep & egrep (from BusyBox)
  2. tee & tee -a instead of > & >>
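Put together, the two changes look roughly like the sketch below. The function name and argument order are made up, and the awk command is the one from my benchmarks rather than the script's `eval "${adb_dnsdeny}"` step. The timings above were measured on my router only, so results elsewhere may differ.

```shell
#!/bin/sh
# Filter BLOCKLIST ($2) against the regexes in WHITELIST_RE ($1), wrap each
# remaining domain as "server=/domain/", and append the result to OUT ($3).
# Uses grep -E instead of egrep, and tee -a instead of a ">>" redirect.
build_dns_list() {
    grep -Evf "$1" "$2" |
        awk '{ print "server=/" $0 "/" }' |
        tee -a "$3" >/dev/null
}
```

For the /tmp/test1 copy used above, that would be `build_dns_list /tmp/test1/tmp.rem.whitelist /tmp/test1/adb_list.overall ./t`.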