Connection records in a database?

I want to stuff connection records into a relational database (likely postgres). Has anyone done this?

My first shot will be to write a simple Python process that tails the conn.* log file and inserts records. I'm wondering if there is a more elegant way to collect and insert connection records?
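Roughly what I have in mind, as a first cut (a minimal sketch assuming psycopg2; the column names and field positions below are illustrative, not necessarily the real conn.log layout):

#!/usr/bin/env python
# Sketch: tail conn.log and insert each record individually.
# Assumes a bro_connections table whose columns match the fields
# pulled out below (the positions are illustrative).
import time
import psycopg2

db = psycopg2.connect(host="nimisrv", dbname="nimi_dev")
cur = db.cursor()

log = open("spool/bro/conn.log")
log.seek(0, 2)  # start at end of file, like tail -f
while True:
    line = log.readline()
    if not line:
        time.sleep(0.1)
        continue
    fields = line.split()
    cur.execute("INSERT INTO bro_connections (ts, duration, orig_h, resp_h)"
                " VALUES (%s, %s, %s, %s)", fields[:4])
    db.commit()  # one commit per record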

As far as motivation, at FNAL we have an issue tracking system which includes email notification. I would like to use Bro to find 'issues' and then create an event in the issue tracking system. The tracking system workflow will resolve a local IP address into a specific machine, find the registered user(s), and send a notification email (informational, warning, critical). It would be useful if this email contained a list of recent connections for the system. This would help the recipient understand what recent computer use caused the network activity that triggered the issue. Hence, having recent connections in a database would be helpful.

I think the Time Machine might be too much. Currently I'm thinking of saving a small time period, say a rolling week's worth of connections (or whatever fits). I've previously used Splunk (http://www.splunk.com) to suck in connection records for later searches. This worked; however, Splunk introduced a delay in retrieval that caused problems formatting the notification email.

Thanks,
Randy Reitz
Fermilab

I want to stuff connection records into a relational database (likely
postgres). Has anyone done this?

I don't push my connection records, but I'm pushing a number of my other logs into postgres.

My first shot will be to write a simple Python process that tails the
conn.* log file and inserts records. I'm wondering if there is a more
elegant way to collect and insert connection records?

I have a threaded Ruby script that uses the "COPY FROM" technique to push blocks of rows into the database. It's still early and messy, but it does work fairly well and it keeps up with a brisk pace of inserts.
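The same batching idea in Python, for anyone who wants a starting point (a rough sketch using psycopg2's copy_from; my script is Ruby, and the table and column names here are made up):

import io
import psycopg2

BATCH = 1000  # rows per COPY block; tune to taste

db = psycopg2.connect(dbname="bro")  # illustrative connection
cur = db.cursor()
buf = io.StringIO()
count = 0

def add_row(fields):
    # Buffer one row in COPY's tab-separated text format.
    global count
    buf.write("\t".join(fields) + "\n")
    count += 1
    if count >= BATCH:
        flush()

def flush():
    # Push the whole buffered block in one COPY FROM, then reset.
    global buf, count
    buf.seek(0)
    cur.copy_from(buf, "bro_connections",
                  columns=("ts", "duration", "orig_h", "resp_h"))
    db.commit()
    buf = io.StringIO()
    count = 0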

I'm going to get started on a C or C++ application soon that will use Broccoli to listen to some event which would be intended for database logging. You would have to run a Bro script that would throw the database logging event for each connection, but that should be fairly easy to write. We'll see how far I make it with that. :-)

   .Seth

Seth Hall wrote:

I'm going to get started on a C or C++ application soon that will use Broccoli to listen to some event which would be intended for database logging.

Hi Seth,
    I've got one written already, if you're interested I can send you the source.

    Steve

Please! I actually just wrote one which is getting close to working, but I'd be happy to see your implementation.

   .Seth

Hi,

I have written a similar program in C. It imports over 2 million connection log lines in about 20 minutes. Other scripted methods, such as Perl, appear to take more time, CPU, and RAM, which is why I chose C.

It parses logs (conn.log only right now) from Bro and puts the contents into MySQL.

The code is autoconf'ed, so you might want to give it a try. I also include the SQL table layout I used.

I have the code up here: https://sourceforge.net/projects/bro-tools/

HTH

Cheers!
--Christopher

I'm not seeing any files there.

   .Seth

Hi Seth,

My error. I have associated the file with the release at: http://sourceforge.net/projects/bro-tools/.

HTH

Cheers!
--Christopher

Seth Hall wrote:

My first shot will be to write a simple Python process that tails the
conn.* log file and inserts records. I'm wondering if there is a more
elegant way to collect and insert connection records?

I have something[1] similar, written late last year, which parses Bro
logs and inserts the data into PostgreSQL[2]. I also have an extremely
alpha version of the web frontend, written in PHP with the Symfony framework.

I stopped working on it (mainly due to work commitments) after realizing
that the best way to do it is by using Broccoli, which up until now I
haven't gotten around to doing.

I'm going to get started on a C or C++ application soon that will use
Broccoli to listen to some event which would be intended for database
logging. You would have to run a Bro script that would throw the
database logging event for each connection, but that should be fairly
easy to write. We'll see how far I make it with that. :-)

Keep us updated!


--mel

[1] http://security.org.my/brologs2db.rb
[2] http://security.org.my/brodb.sql.txt

Randy,

Can you or anyone else add details on your experiences using Bro with
Splunk? I'm considering pairing the two.

Thank you,

Richard

I want to stuff connection records into a relational database (likely
postgres). Has anyone done this?

Note, we have a significant research project underway for exporting Bro
events into a high-performance database for purposes of both forensics and
real-time detection of previously described activity. We describe the
vision in our recent HotSec paper:

  Principles for Developing Comprehensive Network Visibility

The underlying technology is partially implemented, but won't be ready
for use by others for a good while.

    Vern

I have something[1] similar, written late last year, which parses Bro
logs and inserts the data into PostgreSQL[2]. I also have an extremely
alpha version of the web frontend, written in PHP with the Symfony framework.

Nice! I'd be interested to take a look at it. I've been working on something similar recently.

I checked out your log importer too, but I noticed that you're doing individual inserts for each record. In my testing, doing individual inserts doesn't scale for high data rates; the database can't insert data quickly enough. I have been using the COPY [1] method for inserting data in batches, and it turns out that even at high data rates the database can keep up just fine.

I'm going to get started on a C or C++ application soon that will use
Broccoli to listen to some event which would be intended for database
logging. You would have to run a Bro script that would throw the
database logging event for each connection, but that should be fairly
easy to write. We'll see how far I make it with that. :-)

Keep us updated!

On Friday, I got an initial version of my C++ database logger functioning. :-) Here's how it will work...

In your Bro scripts, you'll call something like this (a record field and the value assigned to it don't have to share the same name)...
   event db_log("http_logs", [$orig_h=orig_h, $resp_h=resp_h, $resp_p=resp_p, $method=method, $url=url]);

The database logger will listen for the db_log event and dynamically create the following SQL query...
   COPY http_logs (orig_h, resp_h, resp_p, method, url) FROM STDIN

Every time the db_log event is called for that table, it sends another row of data to the database. Once a certain number of rows have been pushed, it ends the COPY query, and all of the data pushed so far is inserted. The COPY query is then executed again and the cycle repeats.
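In Python terms, the cycle looks roughly like this (a sketch only; the real logger is C++ against libpq's COPY functions [1], and the batch size and connection details here are made up):

import io
import psycopg2

BATCH = 500  # rows pushed before ending the COPY; arbitrary
db = psycopg2.connect(dbname="bro")  # illustrative connection
cur = db.cursor()
pending = {}  # COPY query -> buffered rows

def db_log(table, record):
    # Build the COPY query dynamically from the record's field names,
    # buffer the row, and flush the block once BATCH rows accumulate.
    cols, vals = zip(*sorted(record.items()))
    query = "COPY %s (%s) FROM STDIN" % (table, ", ".join(cols))
    rows = pending.setdefault(query, [])
    rows.append("\t".join(str(v) for v in vals))
    if len(rows) >= BATCH:
        cur.copy_expert(query, io.StringIO("\n".join(rows) + "\n"))
        db.commit()  # COPY ends when the buffer is exhausted; commit the block
        del pending[query]

# e.g. db_log("http_logs", {"orig_h": "10.0.0.1", "resp_h": "10.0.0.2",
#                           "resp_p": 80, "method": "GET", "url": "/"})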

For any data you want to insert into a database, all you have to do is make sure that your database has the necessary fields in it, then throw the proper db_log event. I'll be releasing the code under the BSD license as soon as I get a few more features added to it.

   .Seth

[1] PostgreSQL documentation, "Functions Associated with the COPY Command": https://www.postgresql.org/docs/current/libpq-copy.html

I like this approach!

Robin

Yes, individual inserts don't work!

Here is how quickly the conn.log file grows on my Bro installation...

[brother@dtmb ~]$ s=$(wc -l spool/bro/conn.log | awk '{print $1}'); while true; do sleep 10;s1=$(wc -l spool/bro/conn.log | awk '{print $1}');printf "%d\n" $((s1 - s));s=$s1;done
4750
4728
4565
4243
4926
4379
^C

Looks like conn.log is adding ~450 connections per second. Here is what happens with a Python script that tails conn.log and inserts each record into a Postgres DB...

[brother@dtmb ~]$ l=$(echo "select count(*) from bro_connections" | psql -h nimisrv nimi_dev | awk '/^ [0-9]/ { print $1}');while true;do sleep 10;n=$(echo "select count(*) from bro_connections" | psql -h nimisrv nimi_dev | awk '/^ [0-9]/ { print $1}');printf "%d\n" $((n-l));l=$n;done
1756
1625
1631
1667
1670
1838
^C

Maybe ~160 records per second. Not even close.

It's always nice to know what not to do.

Randy