So I have what I think is a pretty standard setup: a bunch of ports forwarded, and DDNS set up with a hostname so I can access my services (for example Jellyfin and Home Assistant) from the outside even if my WAN IP changes.
My problem is that if the internet connection goes down, I can't use my DDNS hostname to access my services. E.g., I might need to turn on a light with Home Assistant, which will access myrouter.duckdns.org/homeassistant, or I might want to watch Jellyfin at myrouter.duckdns.org/jellyfin.
But DuckDNS still points to my OLD WAN address from before the internet went down, and DNS isn't reachable anyway.
I could set a static dnsmasq host, but what IP do I set it to? I can't set it to my WAN IP (it's down), and I can't set it to my router LAN IP (it doesn't have the port forwards). I could set it to my home assistant LAN IP, but some networks might not have access to that LAN address (would have access to WAN) and also then if my jellyfin is on another machine it won't work.
Sounds like you are talking about a split DNS situation where internal clients should use an internal IP address whereas external clients should use an external IP address, just in case your WAN IP is offline and having internal clients connect to the external IP address is not possible.
If I am right: I wouldn't do that, not at all. Split DNS always stabs you in the back. An application starts, resolves the host name to an IP, the network changes, the IP becomes unreachable, but the client just doesn't re-resolve the hostname. It's totally up to the client application to decide.
I'd go for a VPN setup where external clients connect e.g. via WireGuard. This way you can always use the internal IP addresses and you never need application-specific port forwardings.
I've been running such a setup at home for many years without problems. I'm not even sure what the issue you've raised is supposed to apply to? Is it internal clients? In which case your second solution of a VPN and using internal addresses would still be affected? Or is it external clients? Which would equally suffer the problem of the external IP being down or the VPN not being up if the DDNS has changed and the application doesn't recheck (pretty sure this can be an issue with wireguard).
VPNs are not an option for at least a couple of reasons:
I'd like my friends to be able to connect, and to use services from normal machines.
Mobile OSes only support one VPN at a time and I don't want to use that "slot" just for local access
Also, don't you have a chicken-egg problem with the wireguard? How does it find the VPN to connect to without the DDNS? Then I need split DNS for the VPN to connect anyway?
Well, from an aesthetic point of view it would be nice if the clients didn't know they were on a network that's any different from the rest of the internet. It should work the same if they are on my wifi or free wifi.
I can move past that if I can get split DNS to work, but I still have the problem that the port forwards/firewall rules are on the WAN address, and with the WAN down due to the dropped PPPoE connection, there's no IP address to target. And the WAN is dynamic while it's up anyway.
I have a bunch of VLANs that don't usually access local IPs in my trusted network, so I was hoping that I could define the rules once, e.g. allow them internet access, and they can access only local stuff that has port forwards, by virtue of the WAN port forwards counting as internet. But now I guess I have to create manual allow rules for each thing that is also port forwarded.
I was also hoping from a privacy perspective, things that scan local address ranges on those networks wouldn't get any hits, the only thing exposed would be on a WAN address range, so they'd never be able to localize themselves by proximity to other networks/machines. Paranoid I know, but it was a thought.
Another problem this creates is I can't have different external and internal port mapping.
For example, I might have 3 devices I might ssh into:
22 -> 22 on Router
2222 --> 22 on Server1
4444 --> 22 on Server2
With port forwarding, and using non split-dns, this just works.
With split DNS I have to have completely different configurations for internal and external. Or change ports on each device individually (if it even allows that, cameras for example are unlikely to let you change the admin port)
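To make the mismatch concrete, here is a rough Python sketch of the three SSH forwards above. The LAN addresses are made up for illustration, and `target` is just a hypothetical helper, not anything the router runs:

```python
# External port -> (internal host, internal port), following the SSH example.
# The 192.168.1.x addresses are invented placeholders.
FORWARDS = {
    22: ("192.168.1.1", 22),      # Router
    2222: ("192.168.1.10", 22),   # Server1
    4444: ("192.168.1.11", 22),   # Server2
}


def target(external_port, inside_lan):
    """Return the (host, port) a client effectively has to dial."""
    if inside_lan:
        # Split DNS: no NAT rewrite happens, so the client must know the
        # real LAN host and the real port of the service.
        return FORWARDS[external_port]
    # From outside: one public name, and the router rewrites the port.
    return ("myrouter.duckdns.org", external_port)


print(target(2222, inside_lan=False))
print(target(2222, inside_lan=True))
```

The two calls return different host/port pairs for the same service, which is exactly the "two configurations" problem.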
You can buy a domain or run DDNS on the servers and use it with IPv6 exclusively, so each server will get a unique domain name and IPv6, use a tunnel broker if necessary.
Then both local and remote clients can access the services seamlessly resolving their domain to the same IPv6 and using the same port as long as firewall on the router is properly configured to accept the relevant transit connections.
This way you can safely forget about port forwarding and split DNS at the cost of IPv6 connectivity becoming essential.
It all depends on the client applications running on non-stationary devices to support/expect this.
From a pure DNS perspective, you need at least an A record (or CNAME record) with a small TTL. If you go with, e.g., a 10-minute TTL, your non-stationary devices are, at the time of roaming, (statistically) 5 minutes into the current refresh and 5 minutes away from the next. So right at the time of roaming, access is not possible for (statistically speaking) 5 minutes. So you don't go with a TTL of 10 minutes but 60 seconds, which reduces the potential break to (statistically) 30 seconds. But if you have tiny and fast requests ("GET /lightbulb/status, If-None-Match: fullbrightness" resulting in "304 Not Modified + no response content", which translates to "I'm expecting the light is on at full brightness, give me the status if it is not, but don't waste bandwidth if it is") that can be served within a low two-digit number of milliseconds, you're doubling the response time, and so halving the performance, if you make the client add a DNS round trip for every such request.
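The arithmetic, as a small Python sketch. It assumes the roaming moment falls at a uniformly random point inside the TTL window, and that a DNS round trip costs about as much as the request itself; both function names and the 20 ms figures are only illustrative:

```python
def expected_stale_seconds(ttl_seconds):
    """Average time until a cached record expires after roaming.

    Assumes roaming happens at a uniformly random point in the TTL window,
    so on average half of the TTL is still left.
    """
    return ttl_seconds / 2


def slowdown_factor(request_ms, dns_ms):
    """How much slower a request gets if every request also does a DNS lookup."""
    return (request_ms + dns_ms) / request_ms


print(expected_stale_seconds(600))  # 10 minute TTL
print(expected_stale_seconds(60))   # 60 second TTL
print(slowdown_factor(20, 20))      # 20 ms request plus 20 ms lookup
```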
That's assuming your client application does the lookup when connecting as well as when reconnecting. I don't know details about your clients, and there's a real chance this already is the case and fine because that will be the default with every naive high-level implementation.
But I know some software developers love over-optimizing in places they have no business messing around with, trying to save 3 CPU cycles and 10 bytes of network traffic by doing the address resolution once per application launch and caching the result no matter the TTL.
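For comparison, this is roughly what the well-behaved variant looks like in Python: the name is resolved again on every (re)connect instead of once per launch. `connect_fresh` is a hypothetical helper name; the standard library's `socket.create_connection` already behaves this way, and the OS resolver may still cache answers according to the TTL.

```python
import socket


def connect_fresh(host, port, timeout=5.0):
    """Open a TCP connection, resolving the host name on every call."""
    last_error = None
    # getaddrinfo runs each time, so a changed DNS answer is picked up on
    # the next reconnect (subject to whatever the OS resolver caches).
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses returned for " + host)
```

A client that calls something like this in its reconnect path is not affected by the stale-IP problem beyond the TTL; one that stores the IP at startup is.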
If you're fine with your setup, roll with it. There's a real chance you never encounter the problems I am talking about, because when it comes to a single-family smart home setup you would need to leave the house while actively watching your light bulb status.
But generally speaking, and speaking from my experience as a professional software developer for 20 years now, I'd take a VPN configuration passing a public IP address through a VPN tunnel over a split DNS setup every day.
You still could use wireguard for your friends to connect and only give them access to a single IP. No real problem with firewall rules.
But if you can only have one VPN at a time and you already run another tunnel, that might be the reason for you to not follow this approach.
Regarding the chicken-egg situation you describe, I guess that's not a problem at all.
You are either up and running. Then your DDNS will be up to date too. Let's say it's 60 seconds behind after you reconnect. But WireGuard doesn't constantly re-check the DNS when traffic is going through. WireGuard will only need to re-evaluate the DNS when the heartbeat fails. So that's not a "once per request" penalty but a "once per tunnel" one.
Or you're down. But this means your WAN connection is not working. Nothing you can do on your side to fix this except call your ISP and wait for the technician. But at least your internal devices can happily access your internal servers, since they don't need a WAN uplink to access private IP addresses from private IP addresses. So only external connections are affected, and those depend on the WAN uplink coming up anyway, no matter the IP change.
Only externally. But given a dynamic DNS service the TTL is likely to be pretty low anyway. And everything else in that paragraph is already the situation the OP will be in with their current setup (which they don't appear to have a particular issue with).
Home Assistant, media servers, mail servers, web hosting (an all round mish-mash). For those which use a client app, outside of a brief drop out when leaving home wifi and switching to mobile signal, I've never had an issue. For websites, even less so.
I don't need to, but I could.
If I didn't want others to access some of the services I'm hosting I might agree with you, but a split-DNS setup is far less hassle (IMO) than sorting out VPN connections for everyone (and then the inevitable troubleshooting when it doesn't work, or stops working, for a myriad of reasons).
As a sysadmin, I totally get the benefits of WireGuard: it's a nice security boundary, reduces the number of firewall rules I need to manage, etc.
But for onboarding normal people (eg my wife) it's a disaster.
If I use Port forwards, I share a link to a website with a user/pass and now they can turn on the light.
With WireGuard, they have to generate a key, install a system app, put up with a permanent VPN status icon ("your network may be monitored"), and never use another VPN provider... Cue jokes of "how many system administrators does it take to turn on a light bulb"...
Working back from what the perfect situation would look like:
Router starts up
Has a fall back wan address on a non-routable subnet until the wan comes up
Caches that with Dnsmasq (small time out) until the wan comes up, ddns updates the Dnsmasq host name
Router loses wan:
Continues with the old wan address assigned to wan interface. Doesn't remove it because pppoe is down
Dnsmasq caches the wan address until ddns tells it something different.
Maybe I could achieve this with DDNS scripts/interface change scripts? I'm just surprised this isn't something with a ready-made solution; it seems like a problem all home-lab people would have.
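The name-resolution half of that wish list could be sketched like this in Python. It is only the decision logic a hypothetical interface-change script might use; the fallback address is a placeholder from the documentation range, and the function names are invented. The output uses dnsmasq's `address=/name/ip` option. It does not deal with keeping the old address on the WAN interface or keeping the port forwards alive, which is the harder part.

```python
# Placeholder fallback (TEST-NET-1 documentation range), standing in for
# the "non-routable subnet" address used before the WAN has ever come up.
FALLBACK_IP = "192.0.2.1"


def pick_address(current_wan, last_known_wan, fallback=FALLBACK_IP):
    """Prefer the live WAN IP, then the last one seen, then the fallback."""
    return current_wan or last_known_wan or fallback


def dnsmasq_line(hostname, ip):
    """Build a dnsmasq config line that answers the name locally."""
    return f"address=/{hostname}/{ip}"


# WAN is down (no current address) but we remember the previous one:
print(dnsmasq_line("myrouter.duckdns.org", pick_address(None, "203.0.113.7")))
```

A script along these lines would have to write the line to a dnsmasq config snippet and reload dnsmasq whenever the WAN state changes.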
$admin prints out the QR code with the connection details, $wife downloads/ opens the official wireguard app, and uses it to scan the QR code, 5s later the phone/ tablet is connected.
It's quite easy to send only specific traffic through wireguard -- specifically when you're dealing with RFC1918 addresses to your lan. I don't know if the Android may still trigger an alert like you're suggesting, but the battery impact would be negligible especially if most traffic is not going through the tunnel.
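As an illustration of that split, here is a small Python check of which destinations would be routed into the tunnel if the peer's AllowedIPs were set to the three RFC1918 ranges. That choice is an assumption for the example; in practice you would more likely list only your own LAN subnet. `goes_through_tunnel` is a hypothetical helper, not part of WireGuard.

```python
import ipaddress

# The three RFC1918 private ranges, used here as the assumed AllowedIPs.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def goes_through_tunnel(dest, allowed=RFC1918):
    """True if the destination falls inside one of the allowed networks."""
    addr = ipaddress.ip_address(dest)
    return any(addr in net for net in allowed)


print(goes_through_tunnel("192.168.1.10"))  # LAN service
print(goes_through_tunnel("8.8.8.8"))       # ordinary internet traffic
```

Everything that returns False keeps using the phone's normal connection, which is why the battery and bandwidth cost stays small.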
It'd probably be a little easier to advise if we had a bit more knowledge about your setup.
Some configs from the router would be a good starting point. Please copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:
Your problem is not that the DDNS name points to the (old) WAN address when the internet is down, but that your internet access is down! Meaning nobody can reach you externally anyhow!
So why bother whether friends can access externally with or without a VPN if you don't have any external connection at all?
If you still want to use services internally via the public domain name, then split DNS is the only solution.
Once the internet is back, the DDNS service will refresh the IP address to the new WAN IP, so your forwarding will work again. And by the way, having split DNS is better for your internal clients as well, as they will not loop through the internet just to access a service next to them.
In fact, aside from not being reachable from the outside, when the internet is down, you no longer have a (valid/reachable) wan address (depending on the details of how the link goes down, your router may or may not see a link-down event, though, so it is possible that a DHCP lease may remain in place for the wan).
I also agree here.
I understand that the complication that the OP sees here is that the port forwarding won't work anymore...
A simple solution for that: change server 1 such that it is actually listening on port 2222 instead of 22. The external forward can then be 2222 --> 2222. And then do the same for server 2 and 4444. Combined with split DNS, that should solve the issue.