I have an OpenWrt router (the issue happens on both a TP-Link Archer C7 v2 and a Netgear R7800). Flow offloading is enabled, but in a custom way: for censorship-circumvention reasons, I need to analyze the first few packets of each connection for known strings generated by censorware, so a connection is offloaded only after its first few packets have gone through the normal forwarding path:
iptables -A FORWARD -m comment --comment "!fw3: Traffic offloading (modified)" -m conntrack --ctstate RELATED,ESTABLISHED -m connbytes --connbytes 8 --connbytes-mode packets --connbytes-dir reply -j FLOWOFFLOAD
I have three networks: the usual WAN and LAN, and also LAB, which is just a separate VLAN without DHCP on the LAN side of the switch. There is no NAT between LAN and LAB, but there is a stateful firewall: hosts in the LAN can initiate connections to the LAB, but not the other way round.
I have noticed something strange: SSH connections from LAN to LAB break more often than I would expect.
The default /etc/sysctl.d/11-nf-conntrack.conf file has this line:
net.netfilter.nf_conntrack_tcp_timeout_established=7440
And indeed, the initial content of /proc/net/nf_conntrack shows something like this:
ipv4 2 tcp 6 7438 ESTABLISHED src=192.168.0.235 dst=10.0.0.2 sport=55142 dport=22 packets=3 bytes=164 src=10.0.0.2 dst=192.168.0.235 sport=22 dport=55142 packets=2 bytes=153 [ASSURED] mark=16128 zone=0 use=2
Then the [ASSURED] flag changes to [OFFLOAD], and the timer disappears:
ipv4 2 tcp 6 src=192.168.0.235 dst=10.0.0.2 sport=55142 dport=22 packets=30 bytes=1920 src=10.0.0.2 dst=192.168.0.235 sport=22 dport=55142 packets=20 bytes=2560 [OFFLOAD] mark=16128 zone=0 use=3
After some time (~30 seconds), [OFFLOAD] changes back to [ASSURED], but the countdown timer now starts at 120 instead of 7440:
ipv4 2 tcp 6 114 ESTABLISHED src=192.168.0.235 dst=10.0.0.2 sport=55142 dport=22 packets=48 bytes=3072 src=10.0.0.2 dst=192.168.0.235 sport=22 dport=55142 packets=32 bytes=4096 [ASSURED] mark=16128 zone=0 use=2
Then, after 2 minutes, the conntrack entry disappears. I would expect it to disappear after ~2 hours of inactivity, not 2 minutes.
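For anyone who wants to reproduce the observation, here is a minimal sketch of a helper that reduces each conntrack line to its timer and status flag. The function name ct_summary is made up for this post, and the parsing assumes the line layout shown in the samples above (timer in the fifth field when present, flag in square brackets).

```shell
# Hypothetical helper (not part of OpenWrt): summarize conntrack lines on stdin.
# Prints "<timeout> <flag>" per line; the timeout is "-" when the entry has
# no timer field, as in the [OFFLOAD] sample above.
ct_summary() {
    awk '{
        # Field 5 is the timer only when it is purely numeric.
        timeout = ($5 ~ /^[0-9]+$/) ? $5 : "-"
        flag = "-"
        # Remember the last bracketed token, e.g. [ASSURED] or [OFFLOAD].
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\[.*\]$/) flag = $i
        print timeout, flag
    }'
}

# Possible usage on the router, polling the SSH entry once per second:
# while sleep 1; do grep 'dport=22 ' /proc/net/nf_conntrack | ct_summary; done
```

Watching this output while an SSH session sits idle should make the transition between the offloaded and non-offloaded states, and the timer value right after it, easy to spot.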
Can anyone else confirm this behavior? Is it a bug?
Why doesn't the countdown use the nf_conntrack_tcp_timeout_established value when the connection leaves the [OFFLOAD] state? Where can I change the timeout that is applied after the connection leaves the [OFFLOAD] state?
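As a starting point for narrowing this down, the following sketch lists the timeout sysctls whose current value equals a given number, such as the 120 seen above. The function name find_timeouts_equal is made up for this post, and it assumes the conntrack sysctls are exposed under /proc/sys/net/netfilter. A match only shows that a sysctl happens to hold that value; it does not prove that the sysctl controls the post-offload timer.

```shell
# Hypothetical helper: print the names of timeout sysctl files whose value
# equals the first argument. The directory defaults to the netfilter sysctl
# tree (an assumption about where these values live on the router).
find_timeouts_equal() {
    value="$1"
    dir="${2:-/proc/sys/net/netfilter}"
    for f in "$dir"/*timeout*; do
        # Skip non-files, unreadable files, and the unexpanded glob.
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            continue
        fi
        if [ "$(cat "$f" 2>/dev/null)" = "$value" ]; then
            echo "${f##*/}"
        fi
    done
}

# Possible usage on the router:
# find_timeouts_equal 120
```

If none of the listed sysctls changes the observed behavior, the 120-second value is presumably set somewhere other than these tunables.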