Stubby DNS over TLS using dnsmasq-full for DNSSEC & caching

Ok I just figured out something. After rebooting the router, DNSSEC doesn't work and there are no files in /tmp/stubby. That situation persists indefinitely.
That is until I manually do /etc/init.d/stubby restart
Then the files magically appear and it passes the DNSSEC tests. So my earlier success was due to the stubby restart, not the change of config file.
So the question is: how can I fix this so DNSSEC starts working from the get-go and doesn't need a stubby restart?

Ok, that explains a lot.

Is the service (stubby) running after booting the router? (use 'ps' to check this)

Did you do "/etc/init.d/stubby enable" after installing stubby?
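
For anyone following along, the two checks above can be scripted roughly like this. It is only a sketch: the helper name is_enabled_at_boot is made up for illustration, and it relies on OpenWrt creating an /etc/rc.d/S<NN><name> symlink when a service is enabled.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if an enabled-at-boot link for the
# service exists. On OpenWrt, "/etc/init.d/<name> enable" creates a
# symlink named /etc/rc.d/S<NN><name>.
is_enabled_at_boot() {
    name="$1"
    rcdir="${2:-/etc/rc.d}"
    for link in "$rcdir"/S[0-9]*"$name"; do
        # If the glob matched nothing, the literal pattern fails both tests.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

# Example use on the router:
#   is_enabled_at_boot stubby && echo "enabled at boot" || echo "NOT enabled"
#   ps | grep "[s]tubby"    # lists the stubby process if it is running
```

The [s]tubby pattern is just a common trick to stop grep from matching its own process in the ps output.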

You can always put "/etc/init.d/stubby restart" in your startup script (rc.local) as a last resort, but the problem must be somewhere else.

Although, if the stubby service isn't enabled (and hence not running at boot) and you have set DHCP correctly (to just use stubby via dnsmasq), you shouldn't even be able to load web pages without stubby running.

So either it's not running and the clients are using a second DNS server (maybe IPv6?) that does not go through stubby, or it is running at boot but for some reason it's not recreating the key dir and downloading the keys for DNSSEC validation.

Due to the nature of DNSSEC validation, for it to be reliable you have to guarantee that all DNS requests from your clients go through DNSSEC-compliant DNS servers using a DNSSEC-aware and enabled resolver (stubby). A domain that fails DNSSEC validation can return a SERVFAIL message, which makes the client try another DNS server if there is one configured (via DHCP). If that server does not do DNSSEC validation, it will return the IP address, thus circumventing the validation. If there isn't another DNS server configured, then the SERVFAIL message (it comes without an IP) works like an NXDOMAIN (non-existent domain), preventing connection to the domain. Personally I would prefer that invalid domains always returned an authoritative NXDOMAIN, which would prevent the client from looking for a response from other DNS servers.
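
A rough way to see which status the local resolver actually returns is to query a deliberately mis-signed domain and read the status field from the dig header. This is a sketch under assumptions: dig is not part of a default OpenWrt install and has to be added separately, the helper name dig_status is made up, the resolver is assumed to listen on 127.0.0.1, and dnssec-failed.org is used as an example of an intentionally broken test domain.

```shell
#!/bin/sh
# Hypothetical helper: read dig output on stdin and print the response
# status from the header line, e.g. NOERROR, SERVFAIL or NXDOMAIN.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Example use (assumes dig is installed and a resolver on 127.0.0.1):
#   dig @127.0.0.1 dnssec-failed.org | dig_status
#   dig @127.0.0.1 openwrt.org | dig_status
```

If validation is in effect, the first query would be expected to report SERVFAIL while an ordinary domain reports NOERROR. Running the same queries from a client machine, without the @127.0.0.1 part, shows whether the client is quietly falling back to a non-validating server.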

Sorry I had to wait a while before I could do a reboot.
I can confirm stubby is running on reboot (via ps), but no DNSSEC or files in /tmp/stubby

I'm willing to go with the band-aid fix of putting stubby restart in my startup script, but I'm also interested in finding out the actual cause of failure.

Quick and dirty solution, add /etc/init.d/stubby restart to rc.local (startup).

It might have something to do with the startup order of services and not having internet connectivity when stubby starts.
It could also be related to the fact that the keys are saved in /tmp, which gets wiped on reboot, while stubby expects them to be there. That doesn't make complete sense to me though, because if that were the case, restarting the service wouldn't solve it, I believe.
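
One way to look into the startup-order idea is to compare the START values in the init scripts, since OpenWrt starts services in ascending START order. A minimal sketch, with a made-up helper name (start_priority) and no claim about what values your scripts actually contain:

```shell
#!/bin/sh
# Hypothetical helper: print the START priority declared in an OpenWrt
# init script (a line of the form START=NN). Lower numbers start earlier.
start_priority() {
    sed -n 's/^START=\([0-9]*\).*/\1/p' "$1" | head -n 1
}

# Example use on the router:
#   start_priority /etc/init.d/stubby
#   start_priority /etc/init.d/network
```

Keep in mind that a later START value only means the script is launched later. It does not by itself guarantee that the WAN link is up by then.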

Many thanks for all the suggestions @Specimen
If you can think of any experiments to perform (if you are interested in finding out the root cause) I am willing to try them.
We can test the keys hypothesis maybe by storing them in a more permanent directory (if you can suggest one)? Like you say it's probably not that, but would be good to eliminate that as a possibility.
Connectivity on startup makes most sense to me.

Another thing I've realized is that stubby is run under user stubby (not user root), and so won't have write permissions for the default directory in /root, which explains why you need to set the appdata_dir to something like /tmp where it does have write permissions in order for the certificates to download and for DNSSEC to work.
Is there a good directory location which is both permanent and writeable by user stubby?

Edit: Ok I just added a /stubby directory under the root directory with write permissions. Stubby now retains the certs on reboot and DNSSEC works from the get-go after a reboot. Job done! Many thanks!
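
For reference, the fix can be sketched like this. The helper name make_appdata_dir is made up, and the directory path and the user name stubby in the example are assumptions to adapt to your own setup.

```shell
#!/bin/sh
# Hypothetical helper: create a persistent directory and hand it to the
# unprivileged user the daemon runs as, so it can store its files there.
make_appdata_dir() {
    dir="$1"
    owner="$2"
    mkdir -p "$dir" || return 1
    chown "$owner" "$dir" || return 1
    # Only the owning user needs access to the stored files.
    chmod 700 "$dir"
}

# Example use on the router (as root; path and user name are assumptions):
#   make_appdata_dir /stubby stubby
# then point stubby at it in its config file:
#   appdata_dir: "/stubby"
```

After changing appdata_dir, restart stubby once so the files are fetched into the new location.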

Dear Specimen,
I edited the tutorial as you directed. Forgive my mistakes - mostly they come from running UNBOUND with STUBBY.
Peace,
directnupe

Dear Caveat,
Hello and I hope that you are well. Specimen answered a few of your questions. As for your first question i.e.:
1. Any reason you set padding size to 256? (dnsprivacy.org seem to recommend 128).
See here: https://edns0-padding.org/
All I can tell you is to look at my other guide / tutorial, Adding DNS-Over-TLS support to OpenWrt (LEDE) with Unbound, and especially here: https://dnsprivacy.org/wiki/display/DP/DNS+Privacy+Clients#DNSPrivacyClients-Unbound
Look especially at the section "Unbound/Stubby combination", where "Stubby config" indicates
tls_query_padding_blocksize: 256 - in short, it is what it is and this is the correct setting.
I believe that you are looking at an old guide. The DNS-over-TLS servers set their specifications, and Stubby must match the specifications configured on the servers.

Peace,

directnupe

Many thanks for the reply @directnupe. You are right, the Stubby config used to recommend a blocksize of 256:

# EDNS0 option to pad the size of the DNS query to the given blocksize
# 256 is currently recommended by 
# https://tools.ietf.org/html/draft-ietf-dprive-padding-policy-01
tls_query_padding_blocksize: 256

I got the above from here: https://github.com/getdnsapi/stubby/issues/13

But the more recent version of stubby that I have says:

# EDNS0 option to pad the size of the DNS query to the given blocksize
# 128 is currently recommended by
# https://tools.ietf.org/html/draft-ietf-dprive-padding-policy-03
tls_query_padding_blocksize: 128

and 128 is the default on the (more recent) version of stubby I have. I guess it's not a big deal if both have been recommended by stubby at various times.

Padding with a block size of 128 bytes on the query side, and 468 bytes on the response side was considered the optimum trade-off between defender and attacker cost.

So, yeah, definitely 128. 256 was actually never recommended: version 1 of the draft cited doesn't actually recommend 256. In that chapter ("Block Length Padding") it says the recommendation is still TO DO, and later versions up to 6, the latest one, always recommend 128.

They talk about cost because the higher the padding the slower the whole process is, so there is a trade-off.
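
To make the trade-off concrete: block-length padding rounds each message up to the next multiple of the block size, so a bigger block means more bytes on the wire per query. The arithmetic can be sketched as below. The helper name padded_size is made up, and the calculation only illustrates the rounding. It ignores details such as the space taken by the padding option itself.

```shell
#!/bin/sh
# Hypothetical helper: round a message length up to the next multiple
# of the padding block size.
padded_size() {
    len="$1"
    block="$2"
    echo $(( (len + block - 1) / block * block ))
}

# Example:
#   padded_size 45 128     # prints 128
#   padded_size 45 256     # prints 256
#   padded_size 300 128    # prints 384
#   padded_size 300 256    # prints 512
```

So a typical short query costs twice as many bytes with a block size of 256 as with 128, for what the draft authors judged to be little extra protection.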

Dear Specimen,
Thanks for this - I never realized that this is the recommended method. So are you saying that we should use:
- digest: "sha128"
in the default STUBBY configuration ? If so, upon your reply and confirmation - then I will change my tutorials to reflect this.
Just get back to me and let me know. I have learned much from you and I appreciate your knowledge and insights - as we all should and do. Thanks again.
In Peace,
directnupe

I'm talking about
tls_query_padding_blocksize: 128

Dear Specimen,
OK - got you. Thanks - should we list this option in the Stubby config?

directnupe

We already do.

Dear Specimen,
I changed all my tutorials to include:
tls_query_padding_blocksize: 128
I had this set to "256", which was not in keeping with the correct setting. So thanks for setting me straight on getting this value set correctly. Again - I thank you and appreciate your expertise. Again - I am still on a learning curve and I am thankful for your patience and knowledge in bringing us all up to speed. God Bless.
In Peace,

directnupe

PS - I really do not want to spread bad advice in my guides, so I truly thank you personally and on behalf of the others who trust my tutorials.

@Specimen I have an interesting problem! I have a folder for the stubby certs so that DNSSEC works when the router restarts, but I also want the belt-and-braces approach of adding /etc/init.d/stubby restart to rc.local as you suggested earlier.
Problem is it doesn't work!
rc.local runs ok, but the line /etc/init.d/stubby restart doesn't seem to have the desired effect. When I run rc.local directly as a script everything works. It's only when it is run as part of boot that it doesn't seem to pull in the certificate files.
I've even put in a delay (wait 180) in there to make sure other stuff was done first, but to no avail. Any ideas? Much appreciated!

The command you want is 'sleep' not 'wait'.

I suggest:

sleep 15
/etc/init.d/stubby restart

I think 15 seconds should be enough before executing /etc/init.d/stubby restart.

@Specimen, apologies I meant sleep not wait! It doesn't seem to do anything. I don't think there is an internet connection when the rc.local script is executed. And if I put a really long wait in there (say 120 seconds), it just seems to delay the point at which the router connects to my ISP. Any advice / ideas? Many thanks!
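
One thing that may be worth trying, offered as an untested sketch: rc.local runs as part of the boot sequence, so a plain sleep there holds up whatever comes after it, which would fit the delay described above. Putting the wait and the restart into a background subshell avoids blocking, and polling for connectivity avoids guessing a fixed delay. The helper name wait_for, the probe address 9.9.9.9 and the retry numbers are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command repeatedly until it succeeds
# or the attempts run out.
# Usage: wait_for <tries> <delay-seconds> <command> [args...]
wait_for() {
    tries="$1"
    delay="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Possible line for rc.local (before "exit 0"), run in the background so
# the rest of the boot is not held up. The probe address is an assumption:
#   ( wait_for 30 2 ping -c 1 -W 2 9.9.9.9 && /etc/init.d/stubby restart ) &
```

An alternative, not shown here, is a script in /etc/hotplug.d/iface/ that restarts stubby when $ACTION is ifup and $INTERFACE is wan, which ties the restart to the WAN coming up instead of to a timer.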