IPsec tunnel just dies after some time and cannot be brought back up with 'ipsec up'

Hi everyone,

I'm running OpenWrt 24.10.5 with a strongSwan IPsec IKEv2 site-to-site tunnel between two routers. The tunnel establishes fine initially, but after some time it simply dies and I cannot bring it back up.

Environment

  • OpenWrt 24.10.5
  • Kernel: Linux 6.6.119 aarch64
  • strongSwan 5.9.14-r8 (full package set installed)
  • Authentication: PSK, IKEv2, AES-256, SHA-256, DH group 14

The problem

The tunnel runs fine after initial setup. After an unpredictable amount of time — sometimes minutes, sometimes longer — the tunnel drops. When I try to bring it back:

ipsec up bb2-tun

It just hangs, retransmitting indefinitely with no response from the peer:

initiating IKE_SA bb2-tun[1] to 10.0.0.32
sending packet: from 10.0.0.31[500] to 10.0.0.32[500] (1144 bytes)
retransmit 1 of request with message ID 0
sending packet: from 10.0.0.31[500] to 10.0.0.32[500] (1144 bytes)
retransmit 2 of request with message ID 0
...
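
For reference, the retransmit lines above follow strongSwan's exponential backoff. Below is a minimal sketch of that schedule, assuming the documented defaults (retransmit_timeout 4.0, retransmit_base 1.8, retransmit_tries 5) are unchanged on this build. The function name retransmit_schedule is made up for illustration.

```shell
#!/bin/sh
# Sketch: print when each IKE retransmit is sent and when charon gives up.
# Assumes the documented backoff: timeout * base^n for n = 0..tries.
retransmit_schedule() {
    # $1 = retransmit_timeout (s), $2 = retransmit_base, $3 = retransmit_tries
    awk -v t="$1" -v b="$2" -v n="$3" 'BEGIN {
        total = 0
        for (i = 0; i <= n; i++) {
            total += t * (b ^ i)        # wait before the next action
            if (i < n)
                printf "retransmit %d after %.0fs\n", i + 1, total
            else
                printf "give up after %.0fs\n", total
        }
    }'
}

retransmit_schedule 4.0 1.8 5
```

If those defaults apply, an unanswered request is abandoned after roughly 165 seconds. Lowering charon.retransmit_tries or charon.retransmit_timeout in strongswan.conf would shorten that window.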

Basic connectivity between the two routers is fine (ping works), yet ipsec statusall shows the connection is defined but no Security Associations are up:

Connections:
     bb2-tun:  %any...10.0.0.32  IKEv2
     bb2-tun:   child:  192.168.101.0/24 === 192.168.102.0/24 TUNNEL
Security Associations (0 up, 0 connecting):
  none

Root cause I suspect

After investigation, I believe the actual problem is not the tunnel negotiation itself. Instead, the charon daemon on one or both sides is silently dying or getting into a broken state, and /etc/init.d/ipsec restart does not properly revive it.

When the daemon is in this broken state, the following stale files remain in /var/run/:

/var/run/charon.ctl
/var/run/charon.pid
/var/run/charon.vici
/var/run/charon.xml
/var/run/charon.wlst
/var/run/charon.dck
/var/run/starter.charon.pid

When start is called again, the starter sees these files and skips launching the daemon entirely:

charon is already running (/var/run/charon.pid exists) -- skipping daemon start
starter is already running (/var/run/starter.charon.pid exists) -- no fork done

So from the outside it looks like the tunnel simply won't come back up, but the real issue is that charon isn't actually running at all. The starter just assumes it is because of the leftover files.
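
One way to confirm this suspicion is to check whether the PID recorded in each file still belongs to a live process. This is a minimal sketch, not something shipped with the strongSwan package; the function name pid_file_is_stale is made up for illustration, and the two paths are the ones listed above.

```shell
#!/bin/sh
# Sketch: return 0 if a PID file exists but no live process matches it.
pid_file_is_stale() {
    pid_file="$1"
    [ -f "$pid_file" ] || return 1      # no file, so nothing is stale
    pid="$(cat "$pid_file" 2>/dev/null)"
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;        # empty or non-numeric content counts as stale
    esac
    # kill -0 sends no signal; it only tests whether the PID exists
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}

for f in /var/run/charon.pid /var/run/starter.charon.pid; do
    if pid_file_is_stale "$f"; then
        echo "stale: $f"
    fi
done
```

Note that a live PID only proves some process has that number. After a PID is reused it may no longer be charon, so this check can miss a stale file but should not flag a healthy one when run as root.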

Current workaround

I found that manually cleaning the lock files and forcing the starter up brings the tunnel back:

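# Kill any leftover charon/starter processes
killall -9 charon starter 2>/dev/null
sleep 2
# Remove the stale PID files and sockets
rm -f /var/run/charon.* /var/run/starter.*
/etc/init.d/ipsec start
sleep 1
# Remove the files again, then launch the starter by hand
rm -f /var/run/charon.* /var/run/starter.*
sleep 1
/usr/lib/ipsec/starter --daemon charon --nofork &
# Give charon time to start before initiating the tunnel
sleep 10
ipsec up bb2-tun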

This works and the tunnel comes back up cleanly. But obviously this is a terrible long-term solution — I can't be running this manually every time the tunnel drops, especially in a production environment.

Is there a native procd way to ensure these stale lock files are automatically cleared if the service crashes, or is this a known issue with the strongSwan package in 24.10?

Any help is appreciated.

It's been 15 years since I last had to deal with IPsec, but some things still apply:
Check the timeout values and maybe even lower them. The defaults can be very conservative.
A cron job that checks the tunnel state is a plus; if the connection hangs, let a script handle restarting the interfaces.
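
The cron job idea could look roughly like the sketch below. It assumes that ipsec status prints a line of the form "bb2-tun[1]: ESTABLISHED ..." for a live IKE SA, and that the recovery steps live in a separate script. The path /root/ipsec-recover.sh, the --run flag, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; CONN and RECOVER are assumptions.
CONN="bb2-tun"
RECOVER="/root/ipsec-recover.sh"    # hypothetical script holding the recovery steps

# Read `ipsec status` output on stdin; succeed if the named
# connection has an ESTABLISHED IKE SA line.
tunnel_is_up() {
    grep -q "^[[:space:]]*$1\[[0-9][0-9]*\]: ESTABLISHED"
}

check_tunnel() {
    if ipsec status "$CONN" 2>/dev/null | tunnel_is_up "$CONN"; then
        return 0
    fi
    logger -t ipsec-watchdog "$CONN has no established IKE SA, running recovery"
    "$RECOVER"
}

# Only act when called explicitly, e.g. from cron with --run
if [ "${1:-}" = "--run" ]; then
    check_tunnel
fi
```

A crontab entry that calls this script with --run every minute or so would then trigger recovery only when the IKE SA is actually missing.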

Just for context: are both routers running OpenWrt with strongSwan? I'm asking because many vendor IPsec implementations, even Dell or FortiGate, can be utterly shit. Sorry for the rant.

In addition, it might help if you share your IPsec and network configs with private details removed.

Thanks for your response, I will try what you said.

Answering your question:

Yes, both routers are running OpenWrt 24.10.5 with strongSwan 5.9.14-r8 on both sides. Same firmware, same package versions.

Both routers — /etc/config/ipsec

Router A:

config ipsec
    option enabled '1'

config remote 'router_b'
    option enabled '1'
    option gateway '10.0.0.X'
    option pre_shared_key 'REDACTED'
    option authentication_method 'psk'
    option exchange_mode 'ikev2'
    option dpd_action 'restart'
    option dpd_delay '30'
    list p1_proposal 'p1'
    list tunnel 'tunnel'

config p1_proposal 'p1'
    option encryption_algorithm 'aes256'
    option hash_algorithm 'sha256'
    option dh_group '14'

config tunnel 'tunnel'
    option local_subnet '192.168.A.0/24'
    option remote_subnet '192.168.B.0/24'
    option p2_proposal 'p2'

config p2_proposal 'p2'
    option encryption_algorithm 'aes256'
    option authentication_algorithm 'sha256'
    option pfs_group '14'

Router B is identical with IPs and subnets swapped.

One thing I wondered: could IPsec be "washed" nowadays?

Please define "washed". Washed away, as in replaced? If so, then yes, of course, please use WireGuard!

It seems most of it relies on implicit defaults. I would cross-check the UCI defaults against the defaults in the upstream strongSwan documentation, just to be sure what could be tuned... But as both sides are Linux with the same strongSwan version... Hmm, the next step would be checking the strongSwan issue tracker...

This makes me curious.
As I asked before, could you also share the network config, or at least the relevant parts?

And maybe a brief description of both sides. I see both using 192.168.x as their local network.

Maybe the addresses from 10/8 are the addresses inside the tunnel, but what about the outer connection? The interconnection details are missing.
Thanks.

Again, thanks for your answer!

Washed means outdated. I've already done a WireGuard tunnel and it worked, but we want to test every possibility.

Both routers are on the same transport network. There is no internet involved; it is just a private LAN-to-LAN setup where both WAN interfaces are on the same L2/L3 segment.

[Router A] eth1: 10.0.0.31/24 ──── (transport network 10.0.0.0/24) ──── eth1: 10.0.0.32/24 [Router B]
     |                                                                                |
  br-lan                                                                          br-lan
192.168.101.0/24                                                            192.168.102.0/24

Router A relevant interfaces:

eth1  (WAN/transport): 10.0.0.31/24
br-lan (LAN):          192.168.101.1/24

Router B relevant interfaces:

eth1  (WAN/transport): 10.0.0.32/24
br-lan (LAN):          192.168.102.1/24

Here is the /etc/config/ipsec of router A (router B's is identical, with IPs and subnets swapped and different names as well):

config ipsec
    option enabled '1'

config remote 'bb2'
    option enabled '1'
    option gateway '10.0.0.32'
    option pre_shared_key ''
    option authentication_method 'psk'
    option exchange_mode 'ikev2'
    option dpd_action 'restart'
    option dpd_delay '30'
    list p1_proposal 'p1'
    list tunnel 'tun'

config p1_proposal 'p1'
    option encryption_algorithm 'aes256'
    option hash_algorithm 'sha256'
    option dh_group '14'

config tunnel 'tun'
    option local_subnet '192.168.101.0/24'
    option remote_subnet '192.168.102.0/24'
    option p2_proposal 'p2'

config p2_proposal 'p2'
    option encryption_algorithm 'aes256'
    option authentication_algorithm 'sha256'
    option pfs_group '14'

The goal is to encrypt all traffic between 192.168.101.0/24 and 192.168.102.0/24 as it crosses the transport network. Both routers can ping each other's WAN IPs at all times, so basic connectivity is never lost. The issue is purely that the IPsec tunnel dies and I can't fix it.

If IPsec is not a hard requirement, then just skip it and go straight to WireGuard.

It's 1000x easier and more lightweight.
Either it works or it doesn't. There's no half or quarter state as with IPsec.

Also, dynamic routing using OSPF or BGP is far easier than with IPsec, IMHO.

Unfortunately, IPsec is a hard requirement for this project, so I need to make it work. As I said, we are evaluating multiple protocols (VXLAN, GRE, WireGuard, and IPsec) to benchmark their setup, stability and limits. So far, only IPsec is giving me these headaches. Thank you anyway. I'll keep trying, and if I can't find a native fix, I'll just use the workaround script I created. Have a nice day!