
PFSense, Netflow and ELK w/geoip

Several months ago I started working with the ELK stack (Elasticsearch, Logstash, Kibana) for use with Bluecoat proxy logs. I have been running pfsense at home for quite some time and decided it would be nice to pull some data out of it, and why not with netflow. As with everything else, there are pieces of this scattered all over the interwebs, but nothing that pulled it all together for me to use.

Here is a simple breakdown of the steps.

Configure pfsense to pass flow data
install java
install elasticsearch – research optimizations for single node.
install kibana 4
install logstash
install nginx
configure logstash
configure kibana – secure
configure nginx
configure elasticsearch – secure
configure mappings for flow
build dashboards in kibana

Here is the base setup.

Debian 8.1 64bit running on ESXi
– 2 vCPUs
– 8GB Ram
– 60G Storage

A. Setup PFSense to collect and pass flow data

Install the softflowd package that is available for pfsense.

[Screenshot: installing the softflowd package in pfsense]

Go to Services -> softflowd and set the Interface, the Host (the IP of the ELK box), and the Port (9995, which will be configured later in the logstash config).

[Screenshot: softflowd settings in pfsense]
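
Before moving on, it is worth confirming that flow packets actually reach the ELK box. A quick check with tcpdump (eth0 is an assumption, use your interface; softflowd only exports flows as they expire, so give it a minute):

  sudo tcpdump -ni eth0 udp port 9995 -c 5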

B. Install and configure ELK – a good chunk of this, with modifications, was taken from this DigitalOcean article – https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-4-on-ubuntu-14-04

1. Install Java (latest version)

Add repo list and update –

 sudo add-apt-repository -y ppa:webupd8team/java && sudo apt-get update
 

Install java8

 sudo apt-get -y install oracle-java8-installer
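
A quick check that the Oracle JVM ended up on the path:

 java -version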
 

2. Install Elasticsearch NOTE: I just used the latest (1.6.0) as of the time of writing.

Download GPG key for apt.

    wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
 

Add to repo list and update.

    echo 'deb http://packages.elasticsearch.org/elasticsearch/1.6/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list && sudo apt-get update
 

Install

    sudo apt-get -y install elasticsearch=1.6.0
 

Edit /etc/elasticsearch/elasticsearch.yml and set network.host: localhost
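
The relevant line in the yml ends up looking like this (it ships commented out; uncomment and edit it):

  # /etc/elasticsearch/elasticsearch.yml
  # bind only to localhost so elasticsearch is not exposed directly; nginx fronts everything
  network.host: localhost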

3. Install Kibana 4 NOTE: Latest as of this writing is 4.1.1.

  wget https://download.elasticsearch.org/kibana/kibana/kibana-4.1.1-linux-x64.tar.gz
  tar -zxvf kibana-4.1.1-linux-x64.tar.gz

Move the code over to someplace that seems normal

sudo mkdir -p /opt/kibana && sudo cp -R ~/kibana-4*/* /opt/kibana/

Go ahead and grab this init script (hosted as a gist) that will set Kibana up as a service.

cd /etc/init.d && sudo wget https://gist.githubusercontent.com/thisismitch/8b15ac909aed214ad04a/raw/bce61d85643c2dcdfbc2728c55a41dab444dca20/kibana4
sudo chmod +x /etc/init.d/kibana4

At this point edit the kibana config in /opt/kibana/config/kibana.yml and set host: “localhost”
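
For reference, the only change in kibana.yml is the host binding; the port stays at the default 5601 that nginx will proxy to:

  # /opt/kibana/config/kibana.yml
  # listen on localhost only; nginx proxies external requests to port 5601
  host: "localhost"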

4. Install Logstash NOTE: Latest as of this writing is 1.5.2.

Add repo and update.

  echo 'deb http://packages.elasticsearch.org/logstash/1.5/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash.list && sudo apt-get update

Install Logstash

  sudo apt-get install logstash
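
The logstash config in step 7 uses the netflow codec. If your logstash build does not already bundle it, it can be added with the plugin tool (the /opt/logstash path assumes the Debian package layout):

  sudo /opt/logstash/bin/plugin install logstash-codec-netflow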

5. Install nginx

We will use nginx as a reverse proxy to get to the ELK instance. Kibana and Elasticsearch do not offer much in the way of security, so we have to lock them down to being accessible only from localhost.

apache2-utils is only necessary if htpasswd is desired to help secure the kibana instance (a sketch of that is shown after the nginx config in the next step).

  sudo apt-get install nginx apache2-utils

6. Configure nginx reverse proxy

Replace /etc/nginx/sites-available/default with the following.

server {
    listen 80;

    server_name <servername>;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;        
    }
}
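
If you installed apache2-utils, basic auth can be layered on top of this proxy. A minimal sketch – the kibanaadmin user and the htpasswd file path are just examples, pick your own:

  # create the password file with one user (you will be prompted for a password)
  sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

Then add these two lines inside the location / block above and reload nginx once it is running:

      auth_basic "Restricted Access";
      auth_basic_user_file /etc/nginx/htpasswd.users;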

7. Configure Logstash to accept traffic and create geoip data.

First get the geoip data; it is updated monthly, so a cron job is probably a good idea (a sketch of one follows the download command below). This provides the lookup database for logstash.

cd /etc/logstash && sudo curl -O "http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz" && sudo gunzip GeoLiteCity.dat.gz
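
Since MaxMind refreshes the database monthly, a small script in /etc/cron.monthly is one way to keep it current. A sketch – the file name is arbitrary and it needs to be executable:

  #!/bin/sh
  # /etc/cron.monthly/geoip-update – refresh the GeoLite City database for logstash
  cd /etc/logstash && \
    curl -sO "http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz" && \
    gunzip -f GeoLiteCity.dat.gz

Logstash loads the database at startup, so restart it after an update to pick up the new file.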

Create /etc/logstash/conf.d/logstash-netflow.conf and place the following in the file.

input {
   udp {
     port => 9995
     codec => netflow
     type => netflow
   }
}
filter {
  geoip {
    source => "[netflow][ipv4_src_addr]"
    target => "src_geoip"
    database => "/etc/logstash/GeoLiteCity.dat"
  }
  geoip {
    source => "[netflow][ipv4_dst_addr]"
    target => "dst_geoip"
    database => "/etc/logstash/GeoLiteCity.dat"
  }
}
output {
  stdout { codec => rubydebug }
  if ( [host] =~ "10\.100\.1\.1" ) {
    elasticsearch {
      index => "logstash_netflow-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  } else {
    elasticsearch {
      index => "logstash-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  }
}

Parts of this are for future growth: the [host] =~ conditional routes flows coming from the pfsense box (10.100.1.1 here, adjust for your network) into their own index, while anything else falls into the default logstash index.
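
If you want to sanity-check the file before starting anything, logstash can validate it with the --configtest flag (path again assumes the Debian package layout):

  sudo /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/logstash-netflow.conf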

8. Start everything up and set it to start on boot

    sudo systemctl enable elasticsearch.service
    sudo systemctl enable logstash.service
    sudo systemctl enable kibana4.service
    sudo systemctl enable nginx.service
    sudo systemctl start elasticsearch.service
    sudo systemctl start logstash.service
    sudo systemctl start kibana4.service
    sudo systemctl start nginx.service
 

Everything should now be working. Check your logs (/var/log/logstash/logstash.log and /var/log/elasticsearch/elasticsearch.log) – once data gets pushed into logstash, ES should create an index under /var/lib/elasticsearch/elasticsearch/nodes/0/indices.
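
Rather than digging around the filesystem, you can also ask elasticsearch directly; the _cat API lists whatever indices exist:

  curl 'http://localhost:9200/_cat/indices?v'

You should see a logstash_netflow-YYYY.MM.dd index appear once flows start arriving.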

Now we just need to connect to Kibana and configure a few things.

9. Create geo_point mappings

Once we are sure that data is flowing, we have to create static mappings for the geo-location data that logstash creates. ES will do dynamic mappings, but Kibana won't see them correctly to build a map of the data. The template below worked on my setup, but it took me a while to get it right.

This will apply the "logstash_netflow_template" template to any index named "logstash_netflow*". Our index is date based (logstash_netflow-YYYY.MM.DD), so this will cover every index created going forward.

curl -XPUT http://localhost:9200/_template/logstash_netflow_template -d '
{
  "template" : "logstash_netflow*",
  "mappings" : {
    "netflow" : {
      "properties" : {
        "dst_geoip" : {
          "properties" : {
            "location" : {"type":"geo_point"}
          }
        },
        "src_geoip" : {
          "properties" : {
            "location" : {"type":"geo_point"}
          }
        }
      }
    }
  }
}'
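
To confirm the template registered, it can be pulled back out:

  curl 'http://localhost:9200/_template/logstash_netflow_template?pretty'

Keep in mind a template only applies to indices created after it is added; if a logstash_netflow index already exists with dynamic mappings, delete it (or wait for the next day's index) so the geo_point mapping takes effect.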

10. That should be it for configs; the rest can be done inside Kibana.

Go to Kibana through the nginx proxy (http://<servername>/) – once it loads, go to Settings -> Indices.

Create a new index pattern: select "Use event times to create index names" and set the index name or pattern to [logstash_netflow-]YYYY.MM.DD. You should get a pattern match if your index was created properly. Select a Time-field name of "@timestamp", then just hit "Create".

[Screenshot: Kibana index pattern settings]

I will save the rest of the Kibana setup for another post, or for the reader to struggle through. Here are some shots of my dashboard as it sits, using the data collected above.

[Screenshots: Kibana dashboard panels built from the collected netflow data]

There is a lot of performance tuning for ELK that I will talk about in another post. This setup will work for a small installation; it is running at my home with average usage of about 1,000 events/min. In a larger environment you will likely need to separate your collectors (logstash) from your elasticsearch instances and tune your ES. Rule of thumb: set ES_HEAP_SIZE to half of your system RAM, but never more than 32g. If you need more than that, spin up a second instance of ES.
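
For reference, with the Debian package the heap is set in /etc/default/elasticsearch; a minimal sketch for this 8GB VM, following the rule of thumb above:

  # /etc/default/elasticsearch
  # half of the 8GB on this box; never exceed ~32g
  ES_HEAP_SIZE=4g

Restart elasticsearch after changing it.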

4 responses to “PFSense, Netflow and ELK w/geoip”

  1. jg3 2015/07/17 at 19:13

    Hey, cool stuff. A couple of questions:
    – why use softflowd instead of pfflowd?
    – Can ELK differentiate traffic’s origin? You show TopN Src/Dst ports, can the system instead show TopN Dst ports for traffic coming in/going out?
    – In this set-up you have only the one flow generation point. If there were another in the path (eg, the switch just below my pfsense) would all the flows crossing the firewall show up twice, or would the system recognize they were part of the same flow and collapse them into one .. thing?
    – In my little network I use NAT for outbound traffic. How would that be handled here?

    Thanks!

    • greptrick 2015/07/17 at 21:19

      There is no particular reason I picked softflowd, the only real difference is how the two collect the traffic.
      – In short if I understand what you are asking, yes. You can just filter by source address. i.e., if source address is your net range then it is outbound and list only those ports.
      – I actually have 2 generation points but coming from the same physical device. Flows would show up twice, but you can either tag the flow by the device they are coming from, or you could send them to separate indexes thus separating them logically but you can query them together or separate in Kibana.
      – I NAT as well, I collect flows on the WAN and LAN side.

  2. whommbat 2016/03/11 at 23:16

    I’m so close to getting this working… I’m having an issue getting nginx starting…. When I try to create the new index, I get “Pattern does not match any existing indices” – and I’m not seeing any data in /var/lib/elasticsearch/elasticsearch/nodes/0/indicies yet. How far away am i?
