Poor Man's IDS
==============

### (Yes Women can use it too)

The goal of this project is to keep an eye on the requests going in
and out of my network onto the Internet (iNet). This is made
necessary for two reasons:

1. By looking for unusual activity I can get a heads up about
   unwanted software or even "spy hardware" on my systems, i.e.
   "Detect Intrusions".

2. Almost all software nowadays, especially that created by gigabuck
   giants, makes requests out onto the iNet that I did not ask for
   and don't want happening. But even Mozilla makes network traffic I
   didn't ask for and don't want.

So this tool is my way of "watching the watchers". This is not a
plug-n-play tool that _magically_ grants the user a "suit of
invulnerability". But it is a tool for those looking for more insight
into their iNet traffic either due to security concerns or curiosity.

This software is in the very early stages. Right now it just combines
data from logs from a couple of different software packages. I've
already set up many blocks for traffic I don't want happening, like
g00gle Analytics, some ad servers, ...

For the curious I'll post my current block lists in this repository
from time to time. But be **WARNED** that they're likely to break your
iNet experience if you use them. I'm a cyber-rebel at heart and tend
to take an "if it's doing something I don't want, I have no use for
it" approach. Meaning I'd rather not use a site / program if it
violates my concerns, rather than just "go with the flow". And I will
likely discover I'm breaking stuff I actually want, like: I realized
I've broken my ability to post comments on
[HaD](https://hackaday.com/), but I was curious about what "Server X"
did.

So! If you're not "faint of heart" come join me on my adventure in
iNet security exploration.


Phase 1 - General Setup & Operation
-----------------------------------

In general I feel it's necessary to have a **real-world** idea of what
you're dealing with before diving in and writing software to deal
with what you _think_ you're dealing with. So the basic plan is simple:

1. Set up my network routing devices, which are running Linux, to log
   DNS queries and network connections. Specifically, anything that is
   a _new_ unrelated packet gets logged.

2. Collect the logs on a machine, with backups on alternate machines.

3. Run the logs through a filter to combine the DNS query data with
   the packet data.

4. Analyze results to determine phase 2 needs.

In this phase I'm jumping the gun a bit, but cyber-thugs are actively
beating on all of our doors, even as I'm getting my stuff together
and there are certain kinds of traffic I know I want to put an end
to. In that light I've already spotted interesting servers I'm
blocking. And I've beefed up my firewall to my mail server with more
permanent blocks from obvious MTA bashers.

In this phase I'm using the most excellent
"[dnsmasq](https://thekelleys.org.uk/dnsmasq/doc.html)" to log the
queries **and** block host names I don't want being accessed. I do
this by assigning the bogus address "127.0.0.255" to the names in my
server's "hosts" file, which is used by "dnsmasq" to answer DNS
queries. That address is a **valid** "localhost" address, so requests
to it fail **immediately** unless you put an HTTP(S) server on it.
And I'm sure you can imagine those possibilities.

The other source of information and block capabilities is "iptables"
/ "ip6tables". I added rules to log "new" packets. The parsing
software is what this repository is about right now.
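
As a rough illustration of what pulling data out of an iptables log
entry involves, here is a minimal C++98 sketch. It assumes the kernel's
usual `LOG` target layout of space separated `KEY=value` tokens
(`SRC=`, `DST=`, `PROTO=`, `SPT=`, `DPT=`). The `logField` helper is
hypothetical and is not the parser in this repository.

```cpp
#include <string>

// Hypothetical helper: return the value of a " KEY=value" token in an
// iptables LOG line, or an empty string if the key is not present.
// Assumes tokens are separated by single spaces, as the kernel LOG
// target normally writes them.
std::string logField(const std::string &line, const std::string &key)
{
    const std::string token = " " + key + "=";
    std::string::size_type start = line.find(token);
    if (start == std::string::npos)
        return "";
    start += token.size();

    // The value runs up to the next space or the end of the line.
    std::string::size_type end = line.find(' ', start);
    if (end == std::string::npos)
        end = line.size();
    return line.substr(start, end - start);
}
```

Calling `logField(line, "DST")` and `logField(line, "DPT")` on each
logged packet would give the remote address and port that later get
paired with a DNS name.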


Note on routers
---------------

I'm using Netgear appliances for WiFi and the first tier firewall
connected to my iNet connection. Since a lot of modern ISP connection
devices provide their own idea of security I have to turn off their
firewall stuff and configure them in a transparent bridging
configuration.

I use Netgear because they actively provide tool chains for select
models of their equipment and encourage people to load their own
software mods on them. Have a look at
[My Open Router](https://www.myopenrouter.com/). On my devices I
have settled on [Shibby's Tomato](https://tomato.groov.pl/). It's very
compact, extremely flexible and seems to have everything I've needed
when I've needed it. As an example my dnsmasq & iptables setup for my
gateway router required no changes to the firmware. I just put some
of the lesser used config pages to use.


The future
----------

The goal is an active alert system. This system should provide
immediate feedback on unknown connections allowing the user to either
grant or deny access and maintain the appropriate block lists.

But as "reality" exposes itself things are likely to change.


The CODE
--------

I'm using C++ to write this. I'm targeting C++98 at this time.
Although I think C++11 defines the minimum viable version of C++
there are places that it's still not available, which is most likely
to happen with oddball tool chains like those provided by Netgear.
But even my RaspberryPI 2B doesn't support it. Well... maybe with an
OS upgrade... In the end I'm hoping to support the broadest range of
Linux powered platforms.

I'm doing all of my development with Linux. It may be possible to do
something similar, maybe with very little modification, on BSD based
platforms. I don't have them and therefore won't personally be working
on it. But, unless it makes things really ugly, I'm not opposed to
contributions on that front.


Installation & use
------------------

This is still **VERY** crude and incomplete.

1. Type "make" to compile. Hopefully it compiles for you.

2. Create a configuration file. See "Config File" section below.

3. Run `iptraffic ...`. See "Command Line" section below.

4. Review the output.


Config File
-----------

The config file is a sloppy INI style file. It's broken into sections
that start with a header, which is a line containing a name
surrounded by square brackets ( [...] ). The content of the section
stretches to the next header and will either contain name / value
pairs or specifically formatted text data.

Remarks are lines that start with a pound sign (#). I am not using
the MS style semicolon (;). Blank lines are ignored.

Currently there are two sections allowed in the file:

 - **us:** which lists the IP address range(s) that are used inside
   your network (us). Every address not matching is considered
   outside your network (them). This is used to determine direction
   of connection (in/out).

    Each line is a single entry. The entries can be in IPv4 or 6
    notation and should be in a minimalistic format. Basically this
    means no leading zeros. For IPv6 make sure to use :: as
    appropriate. Address prefixes can be used to define networks and
    they should end with either "." (IPv4) or ":" (IPv6).

    Example: "192.168.1." matches the 192.168.1.0/24 network.

 - **ignores:** is a set of TSV records that define what traffic you
   are OK with happening. Matching traffic is ignored and not
   reported in the output.

    TSV records contain the following fields separated by tabs:

    1. Our / internal address. Follows same wild card specification as
       described in "us" above.
    2. Our port number. 0 matches all.
    3. Outside / their IP address. Follows same wild card
       specification as described in "us" above.
    4. Their port number. Zero matches all.
    5. Domain name associated with the destination. "*" matches all.
    6. IP protocol used (must be all caps):

        - **ICMP:** pings, trace route, ... Port #s match the ICMP
          message "type"
        - **TCP:** the most common network connection (HTTP(s), FTP, ...)
        - **UDP:** Connectionless IP protocol, used for DNS, NTP, ...

    7. Incoming (1) / Outgoing (0) connection.

    A network connection that matches the filter defined by the
    fields will be ignored and not reported to the output. In this
    phase this is basically how we define what's "OK" traffic. It's
    assumed blocking is handled in the firewall and therefore not
    showing in the logs.

    example:

    ```
    *	0	*	443	hackaday.com	TCP	0
    ```

    Ignores a connection from any internal address to a remote
    address for "hackaday.com" using the remote port of 443 in the
    TCP protocol. Basically this ignores (marks OK):
    `https://hackaday.com/`. **BUT** other connections generated by
    the content from that connection will still show in the output.
    So this means things like g00gle analytics, CDNs, external
    JavaScript and CSS servers, ... will still be reported.

    Obviously in normal use the list of OK (ignored) content will get
    quite long. Future tools will handle this better.
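
To make the matching rules above concrete, here is a minimal C++98
sketch of parsing one `ignores` record and testing a connection
against it. It is my reading of the field descriptions, not the
matcher in this repository: the `Flow` struct and the `addrMatches`,
`parseRule` and `ruleMatches` functions are hypothetical names, and
treating `*` as "any address" is inferred from the example record.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Used both for an ignore rule and for an observed connection.
struct Flow {
    std::string ourAddr, theirAddr, domain, proto;
    int ourPort, theirPort;
    bool incoming;
};

// "*" matches anything, a pattern ending in '.' or ':' is a network
// prefix, anything else must match the address exactly.
bool addrMatches(const std::string &pattern, const std::string &addr)
{
    if (pattern == "*")
        return true;
    if (pattern.empty())
        return false;
    const char last = pattern[pattern.size() - 1];
    if (last == '.' || last == ':')
        return addr.compare(0, pattern.size(), pattern) == 0;
    return addr == pattern;
}

// Parse one tab separated record; returns false if it is malformed.
bool parseRule(const std::string &line, Flow &rule)
{
    std::vector<std::string> f;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '\t'))
        f.push_back(field);
    if (f.size() != 7)
        return false;
    rule.ourAddr   = f[0];
    rule.ourPort   = std::atoi(f[1].c_str());
    rule.theirAddr = f[2];
    rule.theirPort = std::atoi(f[3].c_str());
    rule.domain    = f[4];
    rule.proto     = f[5];
    rule.incoming  = (f[6] == "1");
    return true;
}

// True when the connection should be ignored under this rule.
bool ruleMatches(const Flow &rule, const Flow &c)
{
    return addrMatches(rule.ourAddr, c.ourAddr)
        && (rule.ourPort == 0 || rule.ourPort == c.ourPort)
        && addrMatches(rule.theirAddr, c.theirAddr)
        && (rule.theirPort == 0 || rule.theirPort == c.theirPort)
        && (rule.domain == "*" || rule.domain == c.domain)
        && rule.proto == c.proto
        && rule.incoming == c.incoming;
}
```

The same `addrMatches` check could also serve the "us" section: an
address that matches none of the listed prefixes would be treated as
"them".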


iptraffic Command Line
----------------------

The iptraffic command uses the following syntax:

```
iptraffic -c {config file} [-o {output log}] [{input log} [ ... ]]
```

- **{config file}:** must be specified and for best results must
  contain at least the \[us\] section with your internal addresses.
  See "Config File" above. Otherwise the "direction" indication is
  meaningless and it gets harder to write ignores.

- **{output log}:** if supplied is the file name to write the
  aggregated and filtered content to. If the "-o" option is left off
  output is written to stdout, in UNIX tradition.

- **{input log}:** One or more input log files can be specified on
  the command line. They will be processed in the order specified and
  resulting data written to the specified output or stdout.
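
For readers who want to see the syntax above as code, here is a
minimal C++98 sketch of an argument parser that accepts it. It is an
illustration only: the `Options` struct and `parseArgs` function are
hypothetical and may not reflect how `iptraffic` actually handles its
arguments.

```cpp
#include <string>
#include <vector>

struct Options {
    std::string config;               // -c, required
    std::string output;               // -o, empty means stdout
    std::vector<std::string> inputs;  // input logs, in the order given
};

// Returns false when -c is missing or an option lacks its argument.
bool parseArgs(int argc, const char *const argv[], Options &opts)
{
    for (int i = 1; i < argc; ++i) {
        const std::string arg(argv[i]);
        if (arg == "-c" || arg == "-o") {
            if (i + 1 >= argc)
                return false;
            (arg == "-c" ? opts.config : opts.output) = argv[++i];
        } else {
            // Anything else is taken as an input log file name.
            opts.inputs.push_back(arg);
        }
    }
    return !opts.config.empty();
}
```

Input files are kept in command line order, since the order matters
for how the logs are processed.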

**NOTE:** It's assumed that the content of an input log file is in
chronological order (pretty typical) and that the files are specified
on the command line in chronological order. Data collected from the
earlier logs will be used to satisfy needs in the later logs. This
affects how DNS names (queries) are mapped to connection addresses.
It's assumed that the DNS query is answered before the resulting
connection is made. This is the natural order of things.
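
The ordering assumption can be pictured with a small lookup table that
is filled from DNS answers and consulted when a connection shows up.
The following C++98 sketch is one simple way to do that, not the
mapping algorithm used in this repository. It assumes dnsmasq query
logging produces lines of the form `... reply <name> is <address>`,
and the `DnsMap` class is a hypothetical name.

```cpp
#include <map>
#include <string>

// Remembers the most recent host name that resolved to each address.
// Log lines must be fed in chronological order.
class DnsMap {
public:
    // Record a line of the assumed form "... reply <name> is <addr>".
    // Returns true if the line carried an address answer.
    bool addLogLine(const std::string &line)
    {
        const std::string kind(" reply ");
        std::string::size_type start = line.find(kind);
        if (start == std::string::npos)
            return false;
        start += kind.size();
        const std::string::size_type is = line.find(" is ", start);
        if (is == std::string::npos)
            return false;
        const std::string name = line.substr(start, is - start);
        const std::string addr = line.substr(is + 4);
        // Skip non-address answers such as NXDOMAIN or <CNAME>.
        if (addr.empty() ||
            addr.find_first_not_of("0123456789abcdefABCDEF.:")
                != std::string::npos)
            return false;
        names_[addr] = name;
        return true;
    }

    // Name last seen for this address, or "" if none was logged yet.
    std::string nameFor(const std::string &addr) const
    {
        std::map<std::string, std::string>::const_iterator it =
            names_.find(addr);
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> names_;
};
```

A connection logged before its DNS answer finds no name, which is why
the order of the input matters. When several names resolve to the same
address this sketch keeps only the latest one, which is one of the
ways such a mapping can guess wrong.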

This program is mainly intended as a test _jig_, to test both the
concept and the DNS-to-connection mapping algorithms, which still
need work.


iptraffic Output
----------------

Here's a sample of an unexpected connection I found being made:

```
Jun 15 02:06:28 192.0.2.75 -> 74.125.195.188 TCP[5228] mobile-gtalk.l.google.com
```

This shows my tablet calling home to g00gle for something. The
internal address was changed to one from the documentation IPv4
network 192.0.2.0/24 reserved by RFC 5737. This line consists of the
following information in order:

 1. Date / time stamp of the iptables log entry
 2. The _us_ (internal) address, assuming your config file is correct.
 3. Direction of the connection. "192.0.2.75 called _them_".
 4. The _them_ (external) address, assuming your config file is
    correct.
 5. IP protocol & port of the destination
 6. The DNS name if a matching DNS query was logged prior to the
    connection getting logged.