Open Source DPI and Network Intelligence Engine (Beta)

Hi lantis1008,

We have thought about creating tighter integration with netfilter for a couple of reasons:

  1. Speed - it would be able to react to traffic and make decisions a little faster
  2. Resources - it would require less overhead, important for typical OpenWrt hardware deployments

The big disadvantages: it would be a Linux-only feature, and it would slightly break the Unix philosophy.

Regardless, it's certainly technically possible and doesn't require too much development.

I've certainly been looking for a Layer7-ish replacement for a while for use in Gargoyle. I'd dabbled in nDPI (and the various forks that turn it into an NF extension), and while it was great, it was typically unstable and caused more crashes than it was worth.

If this was done (and I certainly would encourage it!), I'd also like to see the patterns/rules being pluggable. I know that some of the rules are quite complex and don't lend themselves to this concept easily, but it should certainly be possible for some of them.
This would allow rules to be updated easily, for example when an IP range changes for a match, rather than recompiling a new version.

Anyway, thanks for sharing. I'll certainly keep an eye on where you guys are heading.

Hi lantis1008,

In Netify, this has already been done to a certain extent. As you have noticed, the nDPI project has munged together the concepts of "protocols" and "applications". So a bandwidth graph might show totals like:

  1. HTTPS: 40%
  2. HTTP: 25%
  3. YouTube: 20%
  4. Facebook: 10%
  5. DNS: 5%

There are protocols (HTTPS, HTTP, DNS) and there are applications that run on top of those protocols (YouTube, Facebook). Mixing the two concepts was a bit problematic for us, so these are separated in Netify.

Yup! Detecting things like encrypted BitTorrent requires a proper engine (or "dissector" in the language of nDPI). These protocols need to be compiled into the DPI engine - it's hard to get around that. It's not too bad though, protocols are relatively slow moving targets - FTP in 2015 is the same FTP in 2020. Applications, on the other hand, are fast moving targets. Fortunately, there's no need to recompile netifyd code to make changes to application definitions -- the changes can be done via a configuration file.

For example, here's a sample of the Snapchat application definition:

host:"^snapchat.com$",host:".snapchat.com$"@netify.snapchat
host:"^sc-cdn.net$",host:".sc-cdn.net$"@netify.snapchat
host:"^sc-static.net$",host:".sc-static.net$"@netify.snapchat
etc.

And about a dozen others (here's the current list of Snapchat domains). Any DNS, HTTP, HTTPS/SNI, SSL certificate name, etc. that matches the domain (or IP address block) will get classified into the specified application.
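
To make the matching idea concrete, here is a minimal Python sketch of how definition lines like the ones above could be parsed and applied to a hostname. The rule grammar is inferred only from the three sample lines, and `parse_rule` and `classify` are hypothetical helper names; this is not how netifyd itself implements matching.

```python
import re

# Assumed rule shape, inferred only from the sample lines in this thread:
#   host:"<regex>",host:"<regex>"@<application tag>
HOST_PATTERN = re.compile(r'host:"([^"]*)"')

def parse_rule(line):
    """Split one definition line into (compiled host regexes, application tag)."""
    matchers, _, app = line.strip().rpartition("@")
    regexes = [re.compile(p) for p in HOST_PATTERN.findall(matchers)]
    return regexes, app

def classify(hostname, rules):
    """Return the tag of the first rule with a matching host regex, else None."""
    for regexes, app in rules:
        if any(rx.search(hostname) for rx in regexes):
            return app
    return None

SAMPLE = '''
host:"^snapchat.com$",host:".snapchat.com$"@netify.snapchat
host:"^sc-cdn.net$",host:".sc-cdn.net$"@netify.snapchat
'''
rules = [parse_rule(line) for line in SAMPLE.splitlines() if line.strip()]

print(classify("app.snapchat.com", rules))
print(classify("example.org", rules))
```

Because the rules are plain data, changing a domain list only means editing the text, which is the point being made about applications versus compiled-in protocols.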

You can see the list of detected applications shipped with OpenWrt in /etc/netify.d/netify-sink.conf. For Netify cloud customers, this list is updated on a regular basis. For non-cloud customers, we can also provide an updated list that could be pulled in via a cron job or some other way.

Trying to ascertain how secure the offsite transport is... after clicking "Learn More" several times on your product site... all I found was this...

  • User-provided data is 100% private, encrypted with a passphrase known only to you

Which leaves me wondering... Where would someone who was interested in using your product be able to find out how offsite traffic is secured? (Shouldn't this feature more prominently in your post / site?) Does the above statement refer to cloud-based on-disk encryption or transport only? Is the same key used for both?

Yes, we should have a landing page that describes all the offsite details. I'll add it to our TODOs.

Short version...

The netifyd agent is open source, so developers can certainly poke around the data that is sent to the cloud. The JSON payload example on the netifyd page gives you a rough idea though. The network metadata sent to the cloud can be further anonymized via some of the privacy features listed here: Netify Privacy.

The data is transported via HTTPS using a unique netifyd identifier. The encryption key, which is never sent to us, is used to encrypt the following user-provided data:

  • API resource keys
  • Owners (e.g. Dave Smith)
  • Groups (e.g. Sales)
  • Device Names (e.g. Pete's Mobile)

So encrypted user-provided data, along with the anonymized metadata, provides a path to remove traces of personally identifiable information. In addition, the data stored is not directly linked to a user's account - that's where the API resource keys come into play. Only the encryption key can establish the link between an account and the network metadata.

There are more technical details in the "Netify Privacy" link above, but that's the abridged version.

Oh. For large deployments, all of the infrastructure can be hosted on a private network... no public cloud required. For medium-size deployments, we also have the option of storing just the data on a private network.

From the source code of netify-fwa it seems that nftables are not supported?


Any plans to provide/package Netify Console tool for OpenWrt?

The console is currently available for ClearOS


Could not locate

netifyd.conf(5) man page for documentation

/etc/netifyd.conf refers to

See /usr/share/netifyd/netifyd.conf-sample for all possible options.

however, that file is absent.


The package description states:

These detections can be saved locally

Is this related to

[socket]
dump_established_flows = <enable to write all established flows to connecting client>
dump_unknown_flows = <enable to write unknown flows to connecting client>

or to

[netifyd]
json_save = <yes/no>

? If so how can the dump location be specified?


[flow_hash_cache]
save = <persistent/volatile/disabled> 
[dns_hint_cache] 
save = <persistent/volatile/disabled>

If persistent how can the dump path be specified?

kuhfufhrbuierf,

Thanks for the questions! I am the principal developer for the Netify Agent.

The Netify Console tool (as currently released) is a PHP application. The packaging for that was done only for ClearOS. There were no plans to release a package for other distributions/platforms because a new version is being designed in C++. The PHP version started out as a debugging tool and would need further development to become a full application with the features people need.

That being said, the PHP version is available here, and can be run from a cloned/manual install on any host that has a PHP interpreter. Netify Agent can then be configured to listen on a network socket (versus the default file socket), enabling remote Netify Console connections. TODO: At the moment, there is no privacy/authentication/encryption on this socket so some thought should be given to secure network access to it.

We don't package/include man pages or other documentation files for OpenWrt. Since it's an embedded platform, I thought that might be frowned upon. In hindsight, these files are so tiny compared to the rest of the image that we can include them in the next release if that's expected.

In the meantime, you can find the man pages and sample configuration here.

Both. Depending on your requirements. An established socket connection will stream real-time detections and other status information (JSON payloads) for applications that want to ingest a stream. The "dump_established_flows", when enabled, will send the connecting client the entire current state of the engine. It does not dump the current state to a file. For that, use "json_save".
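
For anyone who wants to ingest that stream, here is a minimal Python sketch of a reader. It assumes each payload arrives as one JSON object per line, which is an assumption about the framing rather than something stated in this thread; the `type` values in the demo data are made up for illustration, and `iter_payloads` is a hypothetical helper name.

```python
import io
import json

def iter_payloads(stream):
    """Yield each JSON object found on its own line of a text stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a complete JSON document
        if isinstance(payload, dict):
            yield payload

# Stand-in for a connected socket (demo data is hypothetical). With a real
# connection, socket.create_connection((host, port)).makefile("r") gives a
# similar line-iterable object.
demo = io.StringIO('{"type": "flow", "id": 1}\nnot json\n{"type": "agent_status"}\n')
types = [p.get("type") for p in iter_payloads(demo)]
print(types)
```

If the agent frames its payloads differently (for example with a length header), only the body of `iter_payloads` would need to change; the consumer loop stays the same.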

"json_save" will periodically (every 15 seconds by default) dump all new detections and all active flows to the file: sink-request.json

"dump_unknown_flows" is more of a debug function. It creates small pcap files (8 - 10 packets, configurable) for unidentified flows. When enabled, these files can be found in the volatile state directory as: nd-flow-xxxxxxxx.cap

This file is saved to the "volatile" state directory, which on OpenWrt is: /run/netifyd/

This path is currently compiled in and cannot be changed at runtime.

Again, this path is compiled into the executable and currently cannot be changed at runtime. For OpenWrt, the "persistent state path" is: /etc/netify.d/

kuhfufhrbuierf,

I missed this first question. You are correct... we will be circling back to add support for it. I spent the last development time on implementing support for BSD PF, which consumed considerable time. NFTables is on the roadmap next.

Thank you for the pointer and explanation. Perhaps you would consider packaging it for OpenWrt, reckon it would add value?


That somewhat lessens the appeal/attraction of the app. Whilst it may make sense for devices that only feature storage prone to extensive wear by intensive disk writes, e.g. NAND flash, there are also devices supported by OpenWrt that feature support for installation of SSD/USB drives.

With only the volatile storage option any dump will vanish during a power cycle.

It seems a bit heavy (dependency-wise) to run on an embedded router. Netify Console works over the network so most users run it on their laptop/desktop or other server to view real-time flow data from their embedded gateway. That being said, the rewrite in C++ that is underway would run beautifully on tiny systems so we certainly can release that as a separate OpenWrt package.

We will consider making these paths configurable at run-time for the next release.

A moot point considering this file is overwritten every 15 seconds. You would have only lost the last 15 seconds of activity. If you want to keep the history, you would have to select and copy the data you're interested in, regardless of the update file's storage location.

If a complete history was to be kept, a simple shell script could be written to copy the file to a permanent SSD/USB storage location every 15 seconds.
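
As a rough illustration of that idea (written in Python rather than shell), the sketch below copies the file to a timestamped name on every interval. The destination directory is a hypothetical mount point, and `snapshot` and `archive_forever` are illustrative names, not part of the agent.

```python
import shutil
import time
from pathlib import Path

def snapshot(src, dest_dir, stamp=None):
    """Copy src into dest_dir under a timestamped name and return the new path."""
    src, dest_dir = Path(src), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%dT%H%M%S")
    target = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, target)
    return target

def archive_forever(src, dest_dir, interval=15):
    """Take a snapshot every `interval` seconds until interrupted."""
    while True:
        if Path(src).exists():
            snapshot(src, dest_dir)
        time.sleep(interval)

# Example call (the destination path is hypothetical):
# archive_forever("/run/netifyd/sink-request.json", "/mnt/usb/netify-history")
```

Note that a copy taken while the agent is mid-write could be incomplete, and one file every 15 seconds adds up quickly, so some pruning or filtering would be needed in practice.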

Thanks for pointing that out, it was not clear.


I suppose that is what is being offered, aside from other features, through the cloud-based paid subscription service.

Thanks for this and glad that the agent is licensed under a Free Software license.

I have just connected to the socket to look at the stream and it makes me happy to know that there are immense possibilities to write applications that take decisions on the device itself.

I have a question though, it seems to auto-detect WAN interface which I suppose is to merge the flows from LAN and WAN on the analysis server to identify end-to-end client-to-server flow, instead of considering them 2 separate flows. Now, what happens in the multi-wan scenario?

We extensively use MWAN3 with 2 to 4 WANs. Does the agent support multiple -E options?

Thanks in advance.

Hi Netify!

I've deployed the agent and was sending JSON to an ELK stack over a TCP socket in a couple of minutes! (Actually it might have been hours, but I'm also new to the ELK stack...)
I'm currently registered to your cloud service in "trial mode", and your dashboards are amazing! So many insights on my family's devices/Internet usage :smile:
One comment though: as an advanced/nerd home user I don't think I would pay for the basic plan monthly fee beyond trial... It would be good to offer a free plan with much limited features or limited to a single site? Self-hosting might be also an option?
Alternatively I would try to build up nice dashboards on Kibana, but do you provide somewhere a full description of the JSON fields and a list/lookup for protocols and applications identifiers?
Thanks and great job!

First, sorry for the inexcusable delay responding to your message! I have notifications enabled for this thread but for some reason I'm not receiving emails for replies.

Second, thank-you for the feedback and kind words! Glad to hear you like what you see so far...

Regarding Multi-WAN; yes you can specify an "unlimited" number of LAN (internal) and WAN (external) interfaces. The latest source code in the OpenWrt packages repository also has updates that let you configure the Agent by editing /etc/config/netifyd, but I'm not sure if it will allow multiple interfaces (I didn't write that code).

The easiest way to test custom options is to edit: /etc/init.d/netifyd and disable "auto-detect" mode. You can then specify your full command-line in the NETIFY_OPTS variable (in quotes), such as:

NETIFY_OPTS="-I br-lan -E eth0 -E eth1"
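
For setups where the interface list is generated (for example from an mwan3 configuration), a small helper can assemble that string. This Python sketch uses only the -I and -E flags shown above; `build_netify_opts` is a hypothetical name and the interface names are just examples.

```python
def build_netify_opts(lan_ifaces, wan_ifaces):
    """Build the option string: one -I per LAN interface, one -E per WAN interface."""
    parts = [f"-I {name}" for name in lan_ifaces]
    parts += [f"-E {name}" for name in wan_ifaces]
    return " ".join(parts)

opts = build_netify_opts(["br-lan"], ["eth0", "eth1"])
print(f'NETIFY_OPTS="{opts}"')
```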

Greetings!

Thanks for the positive feedback -- much appreciated!

Regarding a "free" (or very low-cost) plan for home use -- I hear what you're saying and I've been a proponent of introducing such an option, but I have no updates on that yet.

We have some limited documentation, but it lacks field-by-field descriptions. However it still may have some information that you find useful. There are also links in there to other applications that use the JSON stream which you could refer to for example implementations.

The link is here: Netify Agent JSON v1.90

Of course, if there are some specific questions you have regarding the structure, I would be happy to answer them privately.

Thanks Darryl for the explanations.
I was able to capture the "protocols" type message

I don't know if this thread is still active but I have two more questions:

  1. Do you also support application categories like streaming video, FPS games, etc.?

  2. I see an example NFA configuration here which allows me to specify which protocols and applications to block or prioritize but I also wanted to know if I could use NFA to select which WAN to use (with mwan3) for a given stream based on Netify classifications?

I just found the answer to #1 -- you do have application categories such as "Streaming Media".

Is there a complete list of the application categories somewhere? Thanks!

Hey bizzbyster.

The Netify Firewall Agent (NFA) should be considered a skeleton for integrating applications/protocols (Layer 7 stuff) into a platform. Basically, there's a stream of metadata coming out of Netify, and NFA takes that information and munges it into various firewall/QoS engines (VyOS, pfSense, etc). We haven't built support for OpenWrt yet, but it would take less than a day to get a v1.0 completed.

As for routing policies for multiwan, you are getting into the "first packet" problem that some of our SD-WAN vendors had to solve. Tricky! Deep packet inspection can take up to 10-ish packets to get enough data to do the detection. For example, the HTTPS SNI hostname is seen in packet #4. However, a routing decision for multiwan needs to be done on the first packet. There are a few ways around it.

  1. Quick and dirty. Let the first initial conversation (e.g. the first Netflix payload) pass through without a WAN application policy, but cache the results. Subsequent conversations would then follow the WAN policy. Lots of leakages.

  2. DNS hinting. Most Internet requests start with an initial DNS request. Netify can pre-populate a lookup table before the first application (e.g. HTTPS/QUIC) packet arrives. There's still some leakage, but it's better.

  3. Other voodoo that engineers come up with.
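
The DNS hinting approach (option 2) can be sketched in a few lines of Python. This is only an illustration of the idea, not Netify's implementation: `DnsHintCache`, `pick_wan`, `toy_classify`, the `netify.netflix` tag, and the WAN names are all hypothetical, and the IP addresses come from the documentation ranges.

```python
class DnsHintCache:
    """Remember which application each resolved IP address belongs to."""

    def __init__(self):
        self._app_by_ip = {}

    def learn(self, hostname, addresses, classify):
        """Record a DNS answer: tag every returned IP with the hostname's app."""
        app = classify(hostname)
        if app is not None:
            for ip in addresses:
                self._app_by_ip[ip] = app

    def lookup(self, ip):
        """Return the application hinted for this IP, or None if unseen."""
        return self._app_by_ip.get(ip)

def pick_wan(dst_ip, hints, policy, default_wan):
    """Choose a WAN for the first packet of a flow using only its destination IP."""
    return policy.get(hints.lookup(dst_ip), default_wan)

def toy_classify(hostname):
    # Hypothetical stand-in for a real hostname-to-application lookup.
    return "netify.netflix" if hostname.endswith("netflix.com") else None

hints = DnsHintCache()
hints.learn("www.netflix.com", ["198.51.100.7"], toy_classify)
policy = {"netify.netflix": "wan2"}

print(pick_wan("198.51.100.7", hints, policy, "wan1"))
print(pick_wan("203.0.113.9", hints, policy, "wan1"))
```

The second lookup shows the leakage mentioned above: a destination that was never seen in a DNS answer (cached elsewhere, hard-coded, or shared between applications) falls through to the default WAN.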

And to answer your other question, the list of applications and categories is here:

The "Adult" apps aren't listed, but you can also grab the full dataset via the public API:

https://informatics.netify.ai/api/v1/lookup/applications?settings_limit=1000

The categories are in the "data_options" part of the API payload (though some might be deprecated).
