The Poor Man's (or Woman's) Intrusion Detection System

Jon Foster 8a33f5fdd8 Break analyzer out of iptraffic & tests Added tests for testing address wild card matching and a test jig to put tests in. I moved the core of the log analyzer out of iptraffic and into data.o for use in multiple tools.		2 years ago


Poor Man’s IDS

(Yes, women can use it too.)

The goal of this project is to keep an eye on the requests going in and out of my network onto the Internet (iNet). This is made necessary for two reasons:

1. By looking for unusual activity I can get a heads up about unwanted software or even “spy hardware” on my systems, i.e. “Detect Intrusions”.
2. Almost all software nowadays, especially that created by gigabuck giants, makes requests out onto the iNet that I did not ask for and don’t want happening. But even Mozilla makes network traffic I didn’t ask for and don’t want.

So this tool is my way of “watching the watchers”. This is not a plug-n-play tool that magically grants the user a “suit of invulnerability”. But it is a tool for those looking for more insight into their iNet traffic either due to security concerns or curiosity.

This software is in the very early stages. Right now it just combines data from the logs of a couple of different software packages. I’ve already set up many blocks for traffic I don’t want happening, like g00gle Analytics, some ad servers, ...

For the curious I’ll post my current block lists in this repository from time to time. But be WARNED that it’s likely to break your iNet experience if you use them. I’m a cyber-rebel at heart and tend to take an “if it’s doing something I don’t want, I have no use for it” approach. Meaning I’d rather not use a site / program if it violates my concerns, rather than just “go with the flow”. And I will likely discover I’m breaking stuff I actually want. For example, I realized I’ve broken my ability to post comments on HaD, but I was curious about what “Server X” did.

So! If you’re not “faint of heart” come join me on my adventure in iNet security exploration.

Phase 1 - General Setup & Operation

In general I feel it’s necessary to have a real-world idea of what you’re dealing with before diving in and writing software to deal with what you think you’re dealing with. So the basic plan is simple:

1. Set up my network routing devices, which are running Linux, to log DNS queries and network connections. Specifically, anything that is a new unrelated packet gets logged.
2. Collect the logs on a machine, with backups on alternate machines.
3. Run the logs through a filter to combine the DNS query data with the packet data.
4. Analyze results to determine phase 2 needs.

In this phase I’m jumping the gun a bit, but cyber-thugs are actively beating on all of our doors even as I’m getting my stuff together, and there are certain kinds of traffic I know I want to put an end to. In that light I’ve already spotted interesting servers that I’m now blocking. And I’ve beefed up the firewall in front of my mail server with more permanent blocks for obvious MTA bashers.

In this phase I’m using the most excellent “dnsmasq” to log the queries and block host names I don’t want being accessed. I do this by assigning the bogus address “127.0.0.255” to the names in my server’s “hosts” file, which is used by “dnsmasq” to answer DNS queries. That address is a valid “localhost” address, so requests to it fail immediately unless you put an HTTP(S) server on it. And I’m sure you can imagine those possibilities.

The other source of information and block capabilities is “iptables” / “ip6tables”. I added rules to log “new” packets. The parsing software is what this repository is about right now.

Note on routers

I’m using Netgear appliances for WiFi and the first tier firewall connected to my iNet connection. Since a lot of modern ISP connection devices provide their own idea of security I have to turn off their firewall stuff and configure them in a transparent bridging configuration.

I use Netgear because they actively provide tool chains for select models of their equipment and encourage people to load their own software mods on them. Have a look at My Open Router. On my devices I have settled on Shibby’s Tomato. It’s very compact, extremely flexible and seems to have everything I’ve needed when I’ve needed it. As an example, my dnsmasq & iptables setup for my gateway router required no changes to the firmware. I just put some of the lesser used config pages to use.

The future

The goal is an active alert system. This system should provide immediate feedback on unknown connections allowing the user to either grant or deny access and maintain the appropriate block lists.

But as “reality” exposes itself things are likely to change.

The CODE

I’m using C++ to write this. I’m targeting C++98 at this time. Although I think C++11 defines the minimum viable version of C++, there are places where it’s still not available, which is most likely to happen with oddball tool chains like those provided by Netgear. But even my Raspberry Pi 2B doesn’t support it. Well... maybe with an OS upgrade... In the end I’m hoping to support the broadest range of Linux powered platforms.

I’m doing all of my development with Linux. It may be possible to do something similar, maybe with very little modification, on BSD based platforms. I don’t have them and therefore won’t personally be working on it. But, unless it makes things really ugly, I’m not opposed to contributions on that front.

Installation & use

This is still VERY crude and incomplete.

1. Type “make” to compile. Hopefully it compiles for you.
2. Create a configuration file. See “Config File” section below.
3. Run iptraffic .... See “Command Line” section below.
4. Review the output.

Config File

The config file is a sloppy INI style file. It’s broken into sections that start with a header, which is a line containing a name surrounded by square brackets ( [...] ). The content of a section stretches to the next header and contains either name / value pairs or specifically formatted text data.

Remarks are lines that start with a pound sign (#). I am not using the MS style semicolon (;). Blank lines are ignored.

Currently there are two sections allowed in the file:

us: lists the IP address range(s) that are used inside your network (us). Every address that does not match is considered outside your network (them). This is used to determine the direction of a connection (in/out).

Each line is a single entry. The entries can be in IPv4 or IPv6 notation and should be in a minimalistic format. Basically this means no leading zeros. For IPv6 make sure to use :: as appropriate. Address prefixes can be used to define networks and they should end with either “.” (IPv4) or “:” (IPv6).

Example: “192.168.1.” matches the 192.168.1.0/24 network.

ignores: a set of TSV records that define what traffic you are OK with happening. Matching traffic is ignored and not reported in the output.

TSV records contain the following fields separated by tabs:
1. Our / internal address. Follows same wild card specification as described in “us” above.
2. Our port number. 0 matches all.
3. Outside / their IP address. Follows same wild card specification as described in “us” above.
4. Their port number. 0 matches all.
5. Domain name associated with the destination. “*” matches all.
6. IP protocol used (must be all caps):
  - ICMP: pings, trace route, ... Port #s match the ICMP message “type”
  - TCP: the most common network connection (HTTP(s), FTP, ...)
  - UDP: Connectionless IP protocol, used for DNS, NTP, ...
7. Incoming (1) / Outgoing (0) connection.

A network connection that matches the filter defined by the fields will be ignored and not reported to the output. In this phase this is basically how we define what’s “OK” traffic. It’s assumed blocking is handled in the firewall and therefore blocked traffic does not show in the logs.

Example:
```
*	0	*	443	hackaday.com	TCP	0
```
Ignores a connection from any internal address to a remote address for “hackaday.com” using the remote port of 443 in the TCP protocol. Basically this ignores (marks OK): https://hackaday.com/. BUT other connections generated by the content from that connection will still show in the output. So this means things like g00gle analytics, CDNs, external JavaScript and CSS servers, ... will still be reported.
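
Here is a rough sketch of how a record like that could be parsed and tested against a connection. It is not lifted from data.cpp: `Record`, `addrMatches`, `parseRule` and `ruleMatches` are hypothetical names, and comparing the domain field by exact string equality is my simplifying assumption (the real DNS matching may well be smarter).

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// "*" matches anything, a pattern ending in '.' or ':' is a network
// prefix, and anything else must match the address exactly.
bool addrMatches(const std::string &pattern, const std::string &addr) {
    if (pattern == "*") return true;
    if (pattern.empty()) return false;
    char last = pattern[pattern.size() - 1];
    if (last == '.' || last == ':')
        return addr.compare(0, pattern.size(), pattern) == 0;
    return pattern == addr;
}

// Hypothetical record, used both for an ignore rule and for a logged
// connection. Field order follows the TSV layout described above.
struct Record {
    std::string ourAddr;
    int ourPort;
    std::string theirAddr;
    int theirPort;
    std::string domain;
    std::string proto;   // "ICMP", "TCP" or "UDP"
    bool incoming;       // true = 1 (incoming), false = 0 (outgoing)
};

// Parse one tab-separated line. Returns false unless it has 7 fields.
bool parseRule(const std::string &line, Record &r) {
    std::vector<std::string> f;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '\t'))
        f.push_back(field);
    if (f.size() != 7) return false;
    r.ourAddr = f[0];
    r.ourPort = std::atoi(f[1].c_str());
    r.theirAddr = f[2];
    r.theirPort = std::atoi(f[3].c_str());
    r.domain = f[4];
    r.proto = f[5];
    r.incoming = (f[6] == "1");
    return true;
}

// True when every field of the rule accepts the connection.
// Assumption: the domain is compared by exact string equality.
bool ruleMatches(const Record &rule, const Record &conn) {
    return addrMatches(rule.ourAddr, conn.ourAddr)
        && (rule.ourPort == 0 || rule.ourPort == conn.ourPort)
        && addrMatches(rule.theirAddr, conn.theirAddr)
        && (rule.theirPort == 0 || rule.theirPort == conn.theirPort)
        && (rule.domain == "*" || rule.domain == conn.domain)
        && rule.proto == conn.proto
        && rule.incoming == conn.incoming;
}
```

A connection would be dropped from the output as soon as any one rule in the ignores list matches it.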

Obviously in normal use the list of OK (ignored) content will get quite long. Future tools will handle this better.

iptraffic Command Line

The iptraffic command uses the following syntax:

iptraffic -c {config file} [-o {output log}] [{input log} [ ... ]]

{config file}: must be specified and for best results must contain at least the [us] section with your internal addresses. See “Config File” above. Otherwise the “direction” indication is meaningless and it gets harder to write ignores.
{output log}: if supplied, is the file name to write the aggregated and filtered content to. If the “-o” option is left off, output is written to stdout, in UNIX tradition.
{input log}: One or more input log files can be specified on the command line. They will be processed in the order specified and resulting data written to the specified output or stdout.
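
For illustration, the syntax above could be handled with a small hand-rolled loop like the one below. This is only a sketch: `Options` and `parseArgs` are hypothetical names, and cli.cpp may do it quite differently (with getopt, for example).

```cpp
#include <string>
#include <vector>

// Hypothetical holder for the parsed command line.
struct Options {
    std::string configFile;           // required, from -c
    std::string outputFile;           // from -o; empty means stdout
    std::vector<std::string> inputs;  // kept in command-line order
};

// Returns false if -c is missing or an option lacks its argument.
bool parseArgs(int argc, const char *const argv[], Options &opt) {
    for (int i = 1; i < argc; ++i) {
        std::string arg(argv[i]);
        if (arg == "-c" || arg == "-o") {
            if (i + 1 >= argc)
                return false;  // option given without a value
            if (arg == "-c")
                opt.configFile = argv[++i];
            else
                opt.outputFile = argv[++i];
        } else {
            // Anything else is treated as an input log file name.
            opt.inputs.push_back(arg);
        }
    }
    return !opt.configFile.empty();
}
```

Keeping the input names in a vector preserves the order they were typed in, which matters because the logs are processed in that order.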

NOTE: It’s assumed that the content of an input log file is in chronological order (pretty typical) and that the files are specified on the command line in chronological order. Data collected from the earlier logs will be used to satisfy needs in the later logs. This affects how DNS names (queries) are mapped to connection addresses. It’s assumed that the DNS query is answered before the resulting connection is made. This is the natural order of things.
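
The ordering assumption is easiest to see with a toy version of the lookup. The sketch below is not the actual mapping algorithm; `DnsCache` is a hypothetical name and "last answer wins" is my simplification. The same object would be kept alive across all input files, so answers learned from an earlier log still resolve connections in a later one.

```cpp
#include <map>
#include <string>

// Hypothetical lookup table: resolved address -> most recently seen name.
class DnsCache {
public:
    // Record a DNS answer found in the log.
    void learn(const std::string &addr, const std::string &name) {
        names_[addr] = name;  // a later answer replaces an earlier one
    }

    // Name for a connection's remote address, or "" if no matching
    // query has been seen so far.
    std::string lookup(const std::string &addr) const {
        std::map<std::string, std::string>::const_iterator it =
            names_.find(addr);
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> names_;
};
```

If a connection line shows up before its DNS answer (logs out of order), `lookup` returns an empty string and the connection is reported without a name.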

This program is mainly intended as a test jig, to test both the concept and the DNS to connection mapping algorithms, which still need work.

iptraffic Output

Here’s a sample of an unexpected connection I found being made:

Jun 15 02:06:28 192.0.2.75 -> 74.125.195.188 TCP[5228] mobile-gtalk.l.google.com

This shows my tablet calling home to g00gle for something. The internal address was changed to one in the 192.0.2.0/24 “documentation” IPv4 network reserved by RFC 5737. This line consists of the following information in order:

Date / time stamp of the iptables log entry
The us (internal) address, assuming your config file is correct.
Direction of the connection. Here “->” means “192.0.2.75 called them”.
The them (external) address, assuming your config file is correct.
IP protocol & port of the destination
The DNS name if a matching DNS query was logged prior to the connection getting logged.
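
Put together, a line in that layout could be produced by something like the sketch below. `formatLine` is a hypothetical helper, not a function from this repository, and the “<-” arrow for incoming connections is my assumption since the sample only shows an outgoing one.

```cpp
#include <sstream>
#include <string>

// Build one output line from the fields listed above.
// Assumption: incoming connections are shown with "<-".
std::string formatLine(const std::string &stamp,
                       const std::string &us,
                       bool incoming,
                       const std::string &them,
                       const std::string &proto,
                       int port,
                       const std::string &name) {
    std::ostringstream out;
    out << stamp << ' ' << us
        << (incoming ? " <- " : " -> ")
        << them << ' ' << proto << '[' << port << ']';
    // The DNS name is only present when a matching query was logged.
    if (!name.empty())
        out << ' ' << name;
    return out.str();
}
```

Called with the values from the sample, it reproduces that line; with an empty name it simply stops after the protocol and port.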
