Optimization of adblock-lean

Until now I hadn't fully worked through @spence's script to see exactly what it did and in my mind I conflated two things:

  • dnsmasq log file rotation; and
  • processing logged queries.

This is because for some time I have been wondering about ways to facilitate adblock-lean processing queries as they are generated.

@spence, now that I've digested your script, which addresses log file rotation, it looks good to me. Nice work and I think you've had fun with this.

When it receives SIGUSR2 and it is logging direct to a file (see --log-facility ) dnsmasq will close and reopen the log file. Note that during this operation, dnsmasq will not be running as root. When it first creates the logfile dnsmasq changes the ownership of the file to the non-root user it will run as. Logrotate should be configured to create a new log file with the ownership which matches the existing one before sending SIGUSR2. If TCP DNS queries are in progress, the old logfile will remain open in child processes which are handling TCP queries and may continue to be written. There is a limit of 150 seconds, after which all existing TCP processes will have expired: for this reason, it is not wise to configure logfile compression for logfiles which have just been rotated. Using logrotate, the required options are create and delaycompress.

@spence I have a suggestion for your script.

At the moment once you've sent SIGUSR2 you then move the old logfile to a different file system, which I think could be problematic given these trailing queries described in the helpful dnsmasq manual excerpt reproduced above that you identified. It seems that ideally dnsmasq is provided with a full 150 seconds for any old queries to get written to the old log file at the new location. It is my understanding that your initial mv within /tmp is perfectly okay - the old queries will still get appended to the same file despite the move. However, the mv to a different file system (e.g. USB stick) is not okay because the writes will not follow through.

Thankfully this would be mostly resolved by simply adding a 150 second sleep after sending the USR2 signal and before archiving the renamed log file in order to give dnsmasq sufficient time to complete its logging of any outstanding queries.

Now I believe ideally you would actually wait 150 seconds from the point of time that dnsmasq actually receives the signal, rather than 150 seconds from sending it. But I don't think there's an easy way to do that except perhaps to send the signal, wait for some kind of log entry in the system log to appear, then wait 150 seconds.

But honestly I think simply waiting 180 seconds would suffice - the signal in reality is not going to take more than 30 seconds to get processed.

Then after the 180 seconds you can safely move the old log file to your external archive on the USB stick.

You could, of course, just use logrotate and its builtin scripting functionality:

   prerotate
       script
   endscript
          The script is executed before the log file and its old logs
          are rotated and only if the log will actually be rotated.
          These directives may only appear inside a log file
          definition.  Normally, the absolute path to the log file is
          passed as the first argument to the script.  If
          sharedscripts is specified, the whole pattern is passed to
          the script.  See also postrotate and the SCRIPTS section.
          See sharedscripts and nosharedscripts for error handling.

But that's obviously way less fun than writing a custom script and as a bonus we lose the package dependency.

By the way, I discovered earlier that it's possible to setup rclone to mount cloud storage on /tmp. One could leverage that to send dnsmasq log files to Google Drive or OneDrive.

2 Likes

I think you are interpreting the manual correctly. Props for figuring this out!

I concur, signals are not guaranteed to be delivered immediately. However 30 seconds seems like an overkill. IMO if it takes longer than 5 seconds then something is very wrong with the system and at that point hanging dnsmasq log writes are the least of the worries :slight_smile:

From the somewhat convoluted description of how dnsmasq handles USR2, I'm getting the impression that using dnsmasq's PID to check when the log file has been released may be the wrong approach. So perhaps the correct way to go is, as @Lynx suggests:

  1. Send USR2
  2. Move the logfile to another path in /tmp
  3. Wait some number of seconds > 150 (I would say that 155 is enough)
  4. Move the file to its final destination

That said, given the irritating delay of 150 seconds, maybe the awk approach is better, simply because then we are in complete control of the output file. Although this is a more complex solution.

Since the discussion continued, I'll ask my question before my own research.

What I've seen so far:

with dnsmasq configured to log to /var/run/dnsmasq/dnsmasq_log
dnsmasqPID=$(pidof "$dnsmasqProg")

Before file rotation:
ls -l /proc/$dnsmasqPID/fd/ | grep "/var/run/dnsmasq/"
returns only one line that shows fd 64 is a link to the log file and what it points to matches the dnsmasq config.:
l-wx------ 1 root root 64 Mar 27 06:00 46 -> /var/run/dnsmasq/dnsmasq_log

After file rotation the ls command returns only one line and fd 64 points to the renamed file:
l-wx------ 1 root root 64 Mar 27 06:00 46 -> /var/run/dnsmasq/dnsmasq_log.2026-03-27T154402
and /var/run/dnsmasq/dnsmasq_log does not exist. In testing, I've waited over a minute, even doing manual dns queries and the fd sticks to the renamed file as I expect it to.

After kill -s USR2 "$dnsmasqPID":
the ls command returns only one line and fd 64 points to the configure log path:
l-wx------ 1 root root 64 Mar 27 06:00 46 -> /var/run/dnsmasq/dnsmasq_log
and the file exists. There is no other fd open for a file in the /var/run/dnsmasq/ directory.

With no dns queries to log, I've seen the log file empty at that point. With a dns query, done by the script, I see those queries in the "new" file without waiting (no sleep 1).

Questions:
Q1: After signal USR2, If checking shows only one dnsmasq fd open in /var/run/dnsmasq/, and it is no longer the renamed log file, is that a valid test that it is closed?

Q2: Since the "new" log file is created by dnsmasq after signal USR2 and not created by the script, is:
[ -e "/var/run/dnsmasq/dnsmasq_log" ]
a valid test that dnsmasq closed the old file and is strictly using the new one, even if there are pending dns queries per the man page?

Reasoning:
I'm thinking that dnsmasq would keep the renamed "old" file open for the pending queries without opening a second fd for the configured log path
rather than
re-opening the renamed path that it is not configured for, presumably after already opening the configured path for new queries.

Thoughts:
I think I'd prefer the "-e" test in while loop that logs an error after 155 sleep 1 . I'd have to think about how to handle the log file in that state, or attempt to mv it to the archive anyway. I'm assuming that something is very wrong at that point and maybe restarting dnsmasq is in order. I think the most of the time the "-e" test will be true on the first loop iteration on a lightly used dns system and normal external response times. The while loop takes care of those pesky slow responses. :-)

Thanks @antonk for pointing out the race condition situation.

Given this quote:

If TCP DNS queries are in progress, the old logfile will remain open in child processes which are handling TCP queries and may continue to be written.

I'm not sure. It specifically mentions 'child processes'. I've never seen any dnsmasq child processes but that doesn't mean that they never exist. If they do sometimes exist (even briefly) then checking the open fd's for the the main process doesn't provide a decisive answer.

If the manual is using the word 'processes' loosely and what it means is 'threads' then perhaps the answer to this question is 'yes'. To figure this out, we would probably need to review the code...

Given the above quote, if I had to guess, my guess would be that under certain conditions, some dnsmasq 'child processes' (whatever they are) will be writing to the new file while others are still writing to the old file. So IMO the existence of a new file does not guarantee that the old file has been completely released.

I hope so. And thanks.

Heh - yes I agree that 30 seconds is surely totally overkill. I was thinking that @spence only needs to run his script every few hours or so such that 30 seconds wouldn't matter and it gave a nice round three minute wait. That's the only reason I went for the 30.

I think rather:

  • Move the logfile to another path in /tmp
  • Send USR2
  • Wait some number of seconds > 150 (I would say that 155 is enough)
  • Move the file to its final destination

I know that that seems odd, but my understanding is that the mv whilst dnsmasq is running will keep the leftover writes writing to the temporary new log file path (confirmed by @spense's foray into the labyrinth of file descriptors), and then the USR2 causes dnsmasq to open and start writing to a new log file.

What @spence is doing right now is:

  • Move the logfile to another path in /tmp
  • Send USR2
  • Move the file to its final destination (USB stick)

But this breaks because of the lack of the wait before moving to the file to a new file system (which breaks the chain from dnsmasq during the 150 second safety window for logging the trailing queries).

I would say drilling this deep is best left for academic interest and respecting the dnsmasq manual excerpt that you helpfully identified and just adding the 155 second wait should be enough.

1 Like

Thanks for indulging me with these questions.
I had not thought about there being other process IDs to check under for the fd filesystem.

I'll update the script to do a couple basic checks and wait 155 seconds before moving the tmp file to a different filesystem.

The following is just a bit more info and a thought or two...

Per the dnsmasq manpage, Each tcp connection forks a process, with a configured limit on concurrent processes (default=20). I generated a stats report by sending SIGUSR1 and it shows zero tcp connection done (historical, not just current) since the last SIGUSR1. I use udp for my queries and dnsmasq matches the inbound protocol for outbound connections for forwarding requests. I'll put the delay in the script to cover the tcp use case.

Thanks @Lynx for your thoughts. Waiting 155 seconds is my plan.

I edited the earlier post with the revised script and added a couple notes at the top.

Time to setup a GitHub account and repository for your new project methinks!

1 Like