Policy Based Routing - Strict enforcement

Hello @stangri! I have run into a situation where strict enforcement doesn't seem to do what it says, at least not as I understand it. I have a VPN tunnel that is used by only one device (10.0.2.5), and not for all of its traffic. If something happens to the tunnel and it stops working, there is no default gateway in table 202. However, the default gateway still exists in the main routing table. I would expect that the device's traffic would then not be routed through the main WAN interface, but it is.

I tried toggling the strict enforcement setting, but the resulting configuration did not change.

Strict enforcement on

pbr 0.9.3-9 running on OpenWrt 21.02.1.
============================================================
Dnsmasq version 2.85  Copyright (c) 2000-2021 Simon Kelley
Compile time options: IPv6 GNU-getopt no-DBus UBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth cryptohash DNSSEC no-ID loop-detect inotify dumpfile
============================================================
Routes/IP Rules
default         XXX.XXX.XXX.XXX 0.0.0.0         UG    10     0        0 pppoe-wan
default         *               0.0.0.0         U     90     0        0 tun2

IPv4 Table 201: default via XXX.XXX.XXX.XXX dev pppoe-wan 
10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 201 Rules:
30000:	from all fwmark 0x10000/0xff0000 lookup wan

IPv4 Table 202: 10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 202 Rules:
29999:	from all fwmark 0x20000/0xff0000 lookup proton

IPv4 Table 203: unreachable default 
10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 203 Rules:
29998:	from all fwmark 0x30000/0xff0000 lookup wwan
============================================================
Mangle IP Table: PREROUTING
-N PBR_PREROUTING
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_ignore_dst_net_cfg016ff5 dst -m comment --comment rockpi_local -c 68389 5589527 -j RETURN
-A PBR_PREROUTING -s 10.0.2.5/32 -p tcp -m set --match-set pbr_wan_dst_ip_cfg026ff5 dst -m multiport --dports 443 -m comment --comment letsencrypt -c 0 0 -g PBR_MARK0x010000
-A PBR_PREROUTING -s 10.0.2.5/32 -p tcp -m multiport --sports 80,443 -m comment --comment rockpi_http_s -c 1852 580366 -g PBR_MARK0x010000
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_proton_dst_net_cfg056ff5 dst -m comment --comment rockpi_proton1 -c 4433703 1734505750 -g PBR_MARK0x020000
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_proton_dst_net_cfg066ff5 dst -m comment --comment rockpi_proton1 -c 781906 856046604 -g PBR_MARK0x020000
============================================================
Mangle IP Table: OUTPUT
-N PBR_OUTPUT
-A PBR_OUTPUT -m set --match-set pbr_wan_dst_ip_cfg036ff5 dst -m comment --comment henet1 -c 25089 2863967 -g PBR_MARK0x010000
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x010000
-N PBR_MARK0x010000
-A PBR_MARK0x010000 -c 27028 3465105 -j MARK --set-xmark 0x10000/0xff0000
-A PBR_MARK0x010000 -c 27028 3465105 -j RETURN
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x020000
-N PBR_MARK0x020000
-A PBR_MARK0x020000 -c 5217649 2590740573 -j MARK --set-xmark 0x20000/0xff0000
-A PBR_MARK0x020000 -c 5217649 2590740573 -j RETURN
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x030000
-N PBR_MARK0x030000
-A PBR_MARK0x030000 -c 0 0 -j MARK --set-xmark 0x30000/0xff0000
-A PBR_MARK0x030000 -c 0 0 -j RETURN
============================================================
Current ipsets

create pbr_proton_dst_net_cfg056ff5 hash:net family inet hashsize 1024 maxelem 65536 comment
add pbr_proton_dst_net_cfg056ff5 0.0.0.0/1 comment "rockpi proton1: 0.0.0.0/1"
create pbr_proton_dst_net_cfg066ff5 hash:net family inet hashsize 1024 maxelem 65536 comment
add pbr_proton_dst_net_cfg066ff5 128.0.0.0/1 comment "rockpi proton1: 128.0.0.0/1"
============================================================
DNSMASQ ipsets
ipset=/letsencrypt.org/pbr_wan_dst_ip_cfg026ff5 # letsencrypt: letsencrypt.org
============================================================

Strict enforcement off

pbr 0.9.3-9 running on OpenWrt 21.02.1.
============================================================
Dnsmasq version 2.85  Copyright (c) 2000-2021 Simon Kelley
Compile time options: IPv6 GNU-getopt no-DBus UBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth cryptohash DNSSEC no-ID loop-detect inotify dumpfile
============================================================
Routes/IP Rules
default         xxx.xxx.xxx.xxx 0.0.0.0         UG    10     0        0 pppoe-wan
default         *               0.0.0.0         U     90     0        0 tun2

IPv4 Table 201: default via xxx.xxx.xxx.xxx dev pppoe-wan 
10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 201 Rules:
30000:	from all fwmark 0x10000/0xff0000 lookup wan

IPv4 Table 202: 10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 202 Rules:
29999:	from all fwmark 0x20000/0xff0000 lookup proton

IPv4 Table 203: unreachable default 
10.0.2.0/24 dev eth0.4 proto static scope link metric 11 
10.0.3.0/24 via 10.0.10.3 dev roadwarrior proto zebra metric 20 
10.0.10.0/24 dev roadwarrior proto kernel scope link src 10.0.10.1 
10.0.20.4/30 dev elvetias proto kernel scope link src 10.0.20.5 
172.20.0.0/24 dev eth0.2 proto kernel scope link src 172.20.0.1 
172.30.30.0/24 dev eth0.3 proto kernel scope link src 172.30.30.1 
192.168.1.0/24 via 10.0.20.6 dev elvetias proto zebra metric 20 
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 
IPv4 Table 203 Rules:
29998:	from all fwmark 0x30000/0xff0000 lookup wwan
============================================================
Mangle IP Table: PREROUTING
-N PBR_PREROUTING
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_ignore_dst_net_cfg016ff5 dst -m comment --comment rockpi_local -c 0 0 -j RETURN
-A PBR_PREROUTING -s 10.0.2.5/32 -p tcp -m set --match-set pbr_wan_dst_ip_cfg026ff5 dst -m multiport --dports 443 -m comment --comment letsencrypt -c 0 0 -g PBR_MARK0x010000
-A PBR_PREROUTING -s 10.0.2.5/32 -p tcp -m multiport --sports 80,443 -m comment --comment rockpi_http_s -c 0 0 -g PBR_MARK0x010000
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_proton_dst_net_cfg056ff5 dst -m comment --comment rockpi_proton1 -c 6 975 -g PBR_MARK0x020000
-A PBR_PREROUTING -s 10.0.2.5/32 -m set --match-set pbr_proton_dst_net_cfg066ff5 dst -m comment --comment rockpi_proton1 -c 5 827 -g PBR_MARK0x020000
============================================================
Mangle IP Table: OUTPUT
-N PBR_OUTPUT
-A PBR_OUTPUT -m set --match-set pbr_wan_dst_ip_cfg036ff5 dst -m comment --comment henet1 -c 0 0 -g PBR_MARK0x010000
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x010000
-N PBR_MARK0x010000
-A PBR_MARK0x010000 -c 27041 3466330 -j MARK --set-xmark 0x10000/0xff0000
-A PBR_MARK0x010000 -c 27041 3466330 -j RETURN
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x020000
-N PBR_MARK0x020000
-A PBR_MARK0x020000 -c 5218092 2590785205 -j MARK --set-xmark 0x20000/0xff0000
-A PBR_MARK0x020000 -c 5218092 2590785205 -j RETURN
============================================================
Mangle IP Table MARK Chain: PBR_MARK0x030000
-N PBR_MARK0x030000
-A PBR_MARK0x030000 -c 0 0 -j MARK --set-xmark 0x30000/0xff0000
-A PBR_MARK0x030000 -c 0 0 -j RETURN
============================================================
Current ipsets

create pbr_proton_dst_net_cfg056ff5 hash:net family inet hashsize 1024 maxelem 65536 comment
add pbr_proton_dst_net_cfg056ff5 0.0.0.0/1 comment "rockpi proton1: 0.0.0.0/1"
create pbr_proton_dst_net_cfg066ff5 hash:net family inet hashsize 1024 maxelem 65536 comment
add pbr_proton_dst_net_cfg066ff5 128.0.0.0/1 comment "rockpi proton1: 128.0.0.0/1"
============================================================
DNSMASQ ipsets
ipset=/letsencrypt.org/pbr_wan_dst_ip_cfg026ff5 # letsencrypt: letsencrypt.org
============================================================

PBR

package pbr

config policy
        option name 'rockpi local'
        option src_addr '10.0.2.5/32'
        option interface 'ignore'
        option dest_addr '10.0.0.0/19'

config policy
        option interface 'wan'
        option name 'letsencrypt'
        option src_addr '10.0.2.5/32'
        option dest_addr 'letsencrypt.org'
        option dest_port '443'
        option proto 'tcp'

config policy
        option interface 'wan'
        option dest_addr '216.66.86.122'
        option proto 'all'
        option chain 'OUTPUT'
        option name 'henet1'

config policy
        option interface 'wan'
        option name 'rockpi http/s'
        option src_addr '10.0.2.5/32'
        option src_port '80 443'
        option proto 'tcp'

config policy
        option src_addr '10.0.2.5/32'
        option interface 'proton'
        option dest_addr '0.0.0.0/1'
        option name 'rockpi proton1'

config policy
        option name 'rockpi proton1'
        option src_addr '10.0.2.5/32'
        option dest_addr '128.0.0.0/1'
        option interface 'proton'

config pbr 'config'
        option src_ipset '0'
        option resolver_ipset 'dnsmasq.ipset'
        option ipv6_enabled '0'
        option iptables_rule_option 'append'
        option procd_reload_delay '1'
        option webui_enable_column '1'
        option webui_protocol_column '1'
        option webui_chain_column '1'
        option webui_show_ignore_target '1'
        option webui_sorting '1'
        list webui_supported_protocol 'tcp'
        list webui_supported_protocol 'udp'
        list webui_supported_protocol 'tcp udp'
        list webui_supported_protocol 'icmp'
        list webui_supported_protocol 'all'
        option enabled '1'
        list ignored_interface 'vpnserver'
        list ignored_interface 'wgserver'
        list ignored_interface 'elvetias'
        list ignored_interface 'roadwarrior'
        option dest_ipset '1'
        option boot_timeout '30'
        option verbosity '1'
        option strict_enforcement '1'

config include
        option enabled '0'
        option path '/usr/share/pbr/pbr.user.aws'

config include
        option enabled '0'
        option path '/usr/share/pbr/pbr.user.netflix'

I also tried disabling the first rule, which bypasses pbr, but that didn't help either.
Eventually, as a workaround, I had to block the device on the firewall. Is my understanding wrong that this should not happen?

What kind of tunnel is that? Does pbr get triggered when the interface goes down?

It is an OpenVPN tunnel. The interface doesn't go down, but it is stuck and doesn't route anything till I restart it.

And in that case traffic designated for the tunnel flows through WAN? That's definitely unexpected.

The strict enforcement only creates an "unreachable" route for tunnels that are down, to prevent traffic from leaking. In most cases that's what is needed.
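
For anyone following along, one way to see which case a table is in is to look at its default entry. The sketch below is a hypothetical helper (the name `default_route_state` is made up here and is not part of pbr) that reads raw `ip -4 route show table <N>` output on stdin and reports whether the table has a real default, an unreachable default, or neither. In the "neither" case a lookup for an internet destination finds nothing in that table and continues with the next ip rule, which normally ends in the main table.

```shell
# Hypothetical helper, not part of pbr: classify the default entry of a
# routing table from `ip -4 route show table <N>` output on stdin.
# Prints one of: default, unreachable, none.
default_route_state() {
    awk '
        # The first default-type entry seen decides the result.
        state == "" && $1 == "default"                        { state = "default" }
        state == "" && $1 == "unreachable" && $2 == "default" { state = "unreachable" }
        END {
            if (state == "") state = "none"
            print state
        }
    '
}
```

On the router this would be used as `ip -4 route show table 202 | default_route_state`. It expects the plain `ip` output, not the pbr support listing, which prefixes the first line of each table with "IPv4 Table NNN:".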

Yes!

I suppose if there is no alternative gateway in the routing table of the rule, then it will be blackholed anyway.
It occurred again last night. At least this time the firewall was blocking the traffic from flowing out of wan.
What I noticed was that this rule was missing from iptables:
-A PREROUTING -m mark --mark 0x0/0xff0000 -j PBR_PREROUTING
I tried to reproduce the issue by restarting some services, such as dnsmasq, adblock, banip, and pbr, but so far nothing has triggered it. I'll keep an eye on it, but at least now there is some indication of what might have gone wrong. Could it be that some restart script deletes this entry but does not recreate it?
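
To catch the moment the jump disappears, it may be easier to test for that one rule than to read the whole mangle table each time. The sketch below is a hypothetical helper (`has_pbr_jump` is a made-up name, not part of pbr) that reads a rule listing on stdin, for example from `iptables -t mangle -S PREROUTING`, and succeeds only if some rule in PREROUTING still jumps to PBR_PREROUTING.

```shell
# Hypothetical helper, not part of pbr: exit 0 if the rule listing on stdin
# contains a PREROUTING rule that jumps to PBR_PREROUTING, non-zero otherwise.
has_pbr_jump() {
    grep -Eq -- '^-A PREROUTING .*-j PBR_PREROUTING( |$)'
}
```

A possible use from cron, assuming `logger` is available on the router:

iptables -t mangle -S PREROUTING | has_pbr_jump || logger -t pbr-watch "PBR_PREROUTING jump is missing"

The timestamp of the log entry could then be compared with whatever else restarted around that time.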

The only thing it could be is the result of a single interface reload. If you happen to reproduce it using pbr's reload_interface command, please let me know.

/etc/init.d/pbr reload_interface did not make any difference. The rule is still there after the reload. I'll continue investigating and hope to catch the culprit.