IPS mode of snort3 is not dropping traffic

I haven't tried the conntrack queues yet, but I did try an nft netdev table with an ingress hook; it caused a kernel crash when snort started. Apparently, don't do that!

table netdev snort {
    chain ingress {
        type filter hook ingress devices = { eth0, br-lan } priority -500; policy accept;
        counter  queue flags bypass to 4-6
    }
}

Run snort:

Mon May 29 08:37:17 2023 kern.warn kernel: [138216.974169] ------------[ cut here ]------------
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.974680] WARNING: CPU: 0 PID: 5781 at nf_reinject+0x3f/0x1e0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.975211] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_queue nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_netlink nf_conntrack lzo iptable_mangle iptable_filter ipt_REJECT ipt_ECN ip_tables xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY x_tables slhc sch_cake r8169 nfnetlink_queue nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 lzo_rle lzo_decompress lzo_compress libcrc32c igc forcedeth e1000e crc_ccitt bnx2 sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.975244]  act_gact i2c_dev ixgbe e1000 amd_xgbe ifb mdio nls_utf8 ena crypto_acompress nls_iso8859_1 nls_cp437 igb vfat fat button_hotplug tg3 ptp realtek pps_core mii
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.981496] CPU: 0 PID: 5781 Comm: snort Tainted: G        W         5.15.112 #0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.982124] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 11/01/2019
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.982829] RIP: 0010:nf_reinject+0x3f/0x1e0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.983430] Code: 10 0f b6 57 31 4c 8b 77 10 48 8b 4f 50 0f b6 47 30 80 fa 07 74 43 80 fa 0a 0f 84 f7 00 00 00 80 fa 02 0f 84 de 00 00 00 0f 0b <0f> 0b 31 f6 4c 89 f7 e8 25 60 f6 ff 4c 89 e7 e8 dd f9 ff ff 4c 89
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.984832] RSP: 0018:ffffc90005243a08 EFLAGS: 00010206
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.985470] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff8273d880
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.986153] RDX: 0000000000000005 RSI: 0000000000000001 RDI: ffff88800856e680
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.986825] RBP: ffffc90005243a40 R08: 0000000000000000 R09: ffffc90005243b48
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.987499] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88800856e680
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.988169] R13: ffffffffa03110c0 R14: ffff888007f3b700 R15: ffffc90005243b48
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.988850] FS:  00007f64d4e89b30(0000) GS:ffff88801f000000(0000) knlGS:0000000000000000
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.989551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.990206] CR2: 00007f64cc13c000 CR3: 000000001f1ea000 CR4: 00000000003506f0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.990898] Call Trace:
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.991516]  <TASK>
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.992116]  0xffffffffa023a060
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.992747]  0xffffffffa023b96c
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.993371]  nfnetlink_unicast+0x2ae/0xdee [nfnetlink]
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.994020]  ? __wake_up+0xe/0x20
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.994656]  ? nfnetlink_unicast+0xf0/0xdee [nfnetlink]
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.995325]  netlink_rcv_skb+0x52/0x100
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.995973]  nfnetlink_unicast+0xd1a/0xdee [nfnetlink]
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.996652]  ? __kmalloc_track_caller+0x48/0x440
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.997304]  ? _copy_from_iter+0x90/0x5f0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.997954]  netlink_unicast+0x1ff/0x2e0
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.998597]  netlink_sendmsg+0x21d/0x460
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.999226]  __sys_sendto+0x17f/0x190
Mon May 29 08:37:17 2023 kern.warn kernel: [138216.999869]  ? fput+0xe/0x20
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.000462]  ? __sys_recvmsg+0x62/0x90
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.001054]  ? irqentry_exit+0x1d/0x30
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.001655]  __x64_sys_sendto+0x1f/0x30
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.002246]  do_syscall_64+0x42/0x90
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.002853]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.003475] RIP: 0033:0x7f64d71dc399
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.004076] Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 c8 cf ff ff 41 54 b8 02 00 00 00 55 48 89 f5 be 00 88 08 00
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.005563] RSP: 002b:00007f64d4e628f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.006225] RAX: ffffffffffffffda RBX: 000000000000002c RCX: 00007f64d71dc399
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.006882] RDX: 0000000000000020 RSI: 00007f64d5d6d080 RDI: 0000000000000003
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.007530] RBP: 00007f64d4e89b30 R08: 00007f64d5400000 R09: 000000000000000c
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.008158] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.008772] R13: 00007f64d53eb7e8 R14: 0000000000000001 R15: 0000000000000000
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.009378]  </TASK>
Mon May 29 08:37:17 2023 kern.warn kernel: [138217.009884] ---[ end trace 57b675713d9ff314 ]---

My "make sure it works first" philosophy has meant that all data paths should be processed by the snort rules (i.e., everything that passes between any of router, WAN and LAN, in any direction). My assumption is that if an attacker compromises your router, then snort should still try to scan the router->somewhere packets. I'm not sure that's a productive avenue to follow, though; for the most part we assume the router is not going to be compromised (see the recent forum post on SELinux), so maybe that's the direction I should follow here.

If I discard my requirement that packets originating from the router itself be processed, then the obvious chain to use would be raw prerouting, as that precedes everything but defrag and hence should produce the best performance.
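
For reference, a minimal sketch of such a table (untested; -300 is the conventional raw priority, and the 4-6 queue range matches the netdev attempt above):

table inet snort {
    chain IPS {
        type filter hook prerouting priority -300; policy accept;
        counter queue flags bypass to 4-6
    }
}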

Side note: you should all change "flush table" to "delete table" in the setup script, as "flush" doesn't delete the chains, and if you then change a chain's hook, you'll get errors.

nft list tables | grep -q 'snort' && nft delete table inet snort

Exactly, that was also my thought; that's why I tried the prerouting hook, but it didn't work. Looking at it again, it might work if we pin it to raw prerouting specifically. I'll see if I can make it work. The nice thing is we only need to delete and recreate the rules without restarting snort, which saves time.

//edit/ I have tested it again with prerouting; the performance does not go over 50 Mbit. So it remains: the fastest is the forward hook, the safest is the postrouting hook, and everything else brings unacceptable performance.

//edit2/ It seems that with the nfq DAQ it is better to turn on gro and tso; only lro (large receive offload) must remain off.
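
If that holds, the corresponding ethtool calls would be something like this (eth0 assumed; repeat per port):

# With the nfq DAQ: GRO and TSO on; LRO stays off (often off and [fixed] anyway)
ethtool -K eth0 gro on tso on
ethtool -K eth0 lro off 2> /dev/null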

Well, forward should be faster since it's missing "half" of the packets that go through the input/output chains.

             /-> input -> output -\
-> prerouting                      postrouting ->
             \-> forward ---------/

Any malicious actions against the router itself will be missed if you only scan on the forward chain.
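
To also catch traffic to and from the router itself, you'd need queue rules on the input and output hooks as well as forward; a rough, untested sketch (same 4-6 queue range as elsewhere in the thread):

table inet snort {
    chain forward {
        type filter hook forward priority 0; policy accept;
        counter queue flags bypass to 4-6
    }
    chain input {
        type filter hook input priority 0; policy accept;
        counter queue flags bypass to 4-6
    }
    chain output {
        type filter hook output priority 0; policy accept;
        counter queue flags bypass to 4-6
    }
}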

Running these four commands from 192.168.1.121 (workstation on LAN):

ping -c4 192.168.1.121  # Ping "self"
ping -c4 10.1.1.20      # The router's WAN address
ping -c4 192.168.1.1    # The router's LAN address
ping -c4 8.8.8.8        # Forwarded out to the internet

Here's what snort reports:

Mon May 29 12:35:07 2023 user.notice snort-table: Chain set to input
Mon May 29 12:35:29 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 10.1.1.20
Mon May 29 12:35:42 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 192.168.1.1

Mon May 29 12:36:27 2023 user.notice snort-table: Chain set to forward
Mon May 29 12:36:44 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 8.8.8.8

Mon May 29 12:37:21 2023 user.notice snort-table: Chain set to prerouting
Mon May 29 12:37:30 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 10.1.1.20
Mon May 29 12:37:43 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 192.168.1.1
Mon May 29 12:37:59 2023 auth.info snort: [1:10000010:1] "TEST ALERT ICMP v4" {ICMP} 192.168.1.121 -> 8.8.8.8
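
The thread doesn't show the test rule itself, but a rule that produces those 1:10000010:1 alerts would look roughly like this:

alert icmp any any -> any any ( msg:"TEST ALERT ICMP v4"; sid:10000010; rev:1; )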

Yes, that is clear. The postrouting-hook snort queue works much faster with generic receive offload and tcp-segmentation-offload activated; with a single connection I now reach almost my full bandwidth of 100 Mbit. The problems that are the reason those options should be disabled should not exist with nfq, because it works differently from afpacket, which is what the article referred to.
But better also test with the reputation blacklist activated, to check whether the drop works correctly with HTTP/HTTPS.

I have tested something again: the fastest configuration is the postrouting hook with a priority of 300. There snort behaves as if it were bound with pcap to the WAN device, but it is able to block. I got the idea after reading at https://wiki.nftables.org/wiki-nftables/index.php/Netfilter_hooks#Priority_within_hook that the connection tracking helpers work there. The WAN device should be specified in the snort.conf ( variables = { 'device=eth0' }, ); otherwise, in my tests, I observed the strange behavior that traffic on one device was blocked but on another was not, or was only partially blocked. Here is the modified queue script:

#!/bin/sh

verbose=false

nft list tables | grep -q 'snort' && nft delete table inet snort  # delete (not flush) so a changed hook doesn't error

nft -f - <<TABLE
    table inet snort {
        chain IPS {
            type filter hook postrouting priority 300; policy accept;

            counter  queue flags bypass to 4-7

        }
    }
TABLE

$verbose && nft list table inet snort

exit 0

And here is a variant of the same script with room for exception rules:

#!/bin/sh

verbose=false

nft list tables | grep -q 'snort' && nft delete table inet snort  # delete (not flush) so a changed hook doesn't error

nft -f - <<TABLE
    table inet snort {
        chain IPS {
            type filter hook postrouting priority 300; policy accept;

            ct state invalid drop

            # Add accept or drop rules here to bypass snort, or to drop traffic
            # that snort should not see. Note that if NAT is enabled, snort will
            # only see the address of the outgoing device for outgoing traffic,
            # e.g. the WAN IP address for the WAN port, or the virtual adapter's
            # address if you are using a VPN.

            counter queue flags bypass to 4-6
        }
    }
TABLE

$verbose && nft list table inet snort

exit 0

I have slightly revised the script again; the snaplen is better set to 65531.

Are you guys still testing setups? I am happy to test out some stuff when ready.

@xxxx - I tested your script from one post above using the following. I got 165 Mbps download, whereas running without snort gives 950-1000 Mbps. Do I need to run any ethtool tweaks or modify anything?

My WAN NIC is eth1 by the way.

# ethtool -k eth1 | grep receive-offload
generic-receive-offload: on
large-receive-offload: off [fixed]

My /etc/snort/ok.lua

ips = {
  mode = 'inline',
  variables = default_variables,
  action_override = 'block',
  include = RULE_PATH .. '/snort.rules',
}

output.logdir = '/mnt/mmcblk0p3'

alert_fast = {
  file = true,
  packet = false,
}

normalizer = {
  tcp = {
    ips = true,
  }
}

file_policy = {
  enable_type = true,
  enable_signature = true,
  rules = {
    use = {
      verdict = 'log', enable_file_type = true, enable_file_signature = true
    }
  }
}

And:

# snort -c "/etc/snort/snort.lua" -i "4" -i "5" -i "6" -i "7" --daq-dir /usr/lib/daq --daq nfq -Q -z 4 -s 64000 --daq-var queue_maxlen=8192 --daq-var device=eth1 --tweaks ok

Also activate tso for both network ports (wan, lan): ethtool -K eth0 gro on tso on (likewise for eth1), and change the -s parameter (snaplen) from 64000 to 65531. But I'm afraid your CPU is too weak for much more throughput; I could only increase the throughput on one queue by about 25 percent this way.

No, I don't test anymore; it doesn't make sense to put the queue somewhere else, because the performance is worse, and I have tried all reasonable configurations. What does make sense, for example, is to exclude VPN traffic with an accept rule before the queue, or the traffic between your internal networks if you have several (snort also inspects the traffic between the internal networks). But these are adjustments that everyone must make for themselves.
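
Purely as an illustration (the interface name and subnets here are made up), such bypass rules would be inserted ahead of the queue rule, e.g.:

# Hypothetical bypasses, inserted at the top of the IPS chain:
nft insert rule inet snort IPS oifname "wg0" accept
nft insert rule inet snort IPS ip saddr 192.168.2.0/24 ip daddr 192.168.3.0/24 accept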

//edit I'm not sure your file_policy option makes sense, since snort only says yes or no to the passing traffic and can't manipulate anything itself.

@xxxx - I got x86 hardware based on an Intel N95 CPU and want to revisit this.

eth0 = LAN
eth1 = WAN

I compiled in kmod-nfnetlink-queue and kmod-nft-queue

What is the setting I need to tweak with ethtool?

# ethtool -k eth1 | grep receive-offload
generic-receive-offload: on
large-receive-offload: off [fixed]

I ran your script from post#36.

Then I ran:

snort -c "/etc/snort/snort.lua" -i "4" -i "5" -i "6" -i "7" --daq-dir /usr/lib/daq --daq nfq -Q -z 4 -s 65531 --daq-var queue_maxlen=8192 --daq-var device=eth1

I am getting my full download bandwidth of around 650-700 Mbps. The CPU is a quad core and one of them hits >99% during the download test (other 3 are 30-40%).

So this is a win!

Next question is how to adjust the snort3 package config files to mimic this without the command line.
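
Untested, but going by the snort3 daq module parameters, the command line above should map to something like this in local.lua (the -z thread count is a separate setting):

daq = {
  module_dirs = { '/usr/lib/daq' },
  modules = {
    {
      name = 'nfq',
      mode = 'inline',
      variables = { 'queue_maxlen=8192', 'device=eth1' },
    },
  },
  inputs = { '4', '5', '6', '7' },
  snaplen = 65531,
}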

My notes indicate that when using the nft queues (nfq DAQ), you don't need to change the NIC offload settings at all; it's only needed when you use the IP filter packet path (i.e., afpacket DAQ).

But just in case, here's what I've got:

wan=$(uci get network.wan.device)
ethtool -K "$wan" gro off lro off tso off 2> /dev/null

Where
tso = tcp-segmentation-offload
gro = generic-receive-offload
lro = large-receive-offload

I've been picking away at this, see post #15 for (still) current state of things... I'll be sure to bother you when I get something worth testing. :grin:


Sounds good, please let me know.

Generic receive offload (GRO) and TCP segmentation offload (TSO) should be on; large receive offload (LRO) must be off.
Entering the parameters in the config is not a good idea, because command-line parameters take priority and override the parameters in the config, not to mention errors caused by forgotten or mistyped characters. It is better to modify the service file; maybe you could extend it so that you only have to enter the options under /etc/config/snort, as in Efahl's suggestion.

Exactly, I think that's the plan all around. Put all the settings possible in the config, then have the init.d script generate configs and/or modify its internal command line, so all of the stuff we've been doing on these two big threads boils down to 1) change config (cli or LuCI if we get really energetic), 2) run /etc/init.d/snort start, and then general users won't have to go through all the discovery steps we've been discussing here.

There's also the part about downloading rules, which is how I got started on all of this. I took @darksky's original script from the wiki and have "tuned it up" so that you can set uci snort.snort.oinkcode="blah" and if your code is valid, the script downloads and "installs" (symlinks right now) your subscription rules, in a location set in local.lua, and if no oinkcode exists, then it does the same with the community rules. (The update-rules script also has a testing mode to generate a hard-coded set of rules that you can easily test with just ping.)

Those sound like nice additions. I am thinking that passing a more complex command line is a nice option as well as coding it up in a config file. We could just have something like a SnortArgs= in a file that would get expanded by the init script.
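
A sketch of how that could look in the procd init script (the extra_args option name is made up):

# In /etc/init.d/snort, inside start_service(): pull a free-form option
# from /etc/config/snort and append it to the generated command line.
config_load snort
config_get extra_args snort extra_args ""
procd_set_param command /usr/bin/snort -c /etc/snort/snort.lua $extra_args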

Are you sure about TSO on? For my WAN-facing eth1 it is off and performance is amazing.

generic-receive-offload: on
tx-tcp-segmentation: off
large-receive-offload: off [fixed]

EDIT: when I try to set tso to on, it does not seem to be allowed:

# ethtool -K eth1 tso on gro on lro on
Cannot change large-receive-offload
Could not change any device features

# ethtool -k eth1 | grep tcp          
tcp-segmentation-offload: off
	tx-tcp-segmentation: off [requested on]
	tx-tcp-ecn-segmentation: off [fixed]
	tx-tcp-mangleid-segmentation: off [requested on]
	tx-tcp6-segmentation: off [requested on]

I can't find my notes about the tso setting. I'm sure they're somewhere... As I recall, it was another "bug fix": fragmented packets were being passed to snort, which then tried to match on the disassembled pieces and would just pass the data without proper inspection. And, as far as I can tell, none of this is needed with the nfq DAQ, as it delivers fully assembled packets by virtue of the queue mechanism.

In any case, if the setting is [fixed] then it's apparently hard coded into the driver or hardware, so you won't be able to change it.

Here are some explicit references to ethtool use in afpacket docs: https://github.com/snort3/libdaq/blob/master/modules/afpacket/README.afpacket.md#interface-preparation

But no equivalent mention in the nfq docs: https://github.com/snort3/libdaq/blob/master/modules/nfq/README.nfq.md

And some old discussion I had stashed that does mention segmentation in various forms: https://s3.amazonaws.com/snort-org-site/production/document_files/files/000/000/067/original/packet-offloading-issues.pdf

Oh, and that means that you also need to jack up the snaplen value as high as possible, i.e., 64k, as mentioned in that above link to the nfq README. From the snort3 documentation:

• int daq.snaplen = 1518: set snap length (same as -s) { 0:65535 }

So, CLI would be -s 65535 and in config files the equivalent would be in the daq section of local.lua:

daq = {
  snaplen = 65535,
  ...
}

EDIT: I did this on an x86 VM that was configured with only(!) 512 MB of RAM, and it locked the VM on snort startup. I had to back it way down to get snort to start, but then set the VM to have 1 GB RAM and it was ok with the full 64k snap buffer.

Yes, that is good. The problem will be adding exception rules to the queue script via /etc/config/snort, because the source IP address sent to the queue changes depending on whether masquerading is enabled or disabled, so the exception rules to be created change as well. If masquerading is enabled, snort sees only the source IP address of the outgoing network card; with masquerading disabled, snort sees the original IP address. This is a problem in that you can enable masquerading between internal networks as well, so exception rules between internal networks would no longer work in one direction and would have to be changed. Which brings up the next problem: since the outgoing address is always one of OpenWrt's device addresses, e.g. 192.168.1.1, such a rule would also exclude everything that comes from 192.168.1.1.
Better use snaplen = 65531; if I remember correctly, it will skip packets that are too small to have content.