Couple elasticsearch questions

Hey all,

A few questions:

1. Is there a proper way to set which logs to send to elasticsearch that I can use in local.bro instead of modifying logs-to-elasticsearch.bro? I am assuming that logs-to-elasticsearch.bro might change in future versions of bro.
2. The docs say to add @load tuning/logs-to-elasticsearch in local.bro...how can I send bro data to a remote elasticsearch server instead?
3. And lastly, as I look at the Brownian demo, I see that all the fields are correctly laid out...was this done with Brownian, or with elasticsearch itself?

I'm trying to get bro data into logstash directly, instead of using log files. Thanks for any insight.

James

Hey all,

A few questions:

1. Is there a proper way to set which logs to send to elasticsearch
that I can use in local.bro instead of modifying
logs-to-elasticsearch.bro? I am assuming that logs-to-elasticsearch.bro
might change in future versions of bro.

Yes, you should just create your own .bro file and take what you need
from logs-to-elasticsearch.bro.
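
For example, a minimal standalone file might look something like this (a sketch that assumes the ES writer plugin is compiled in and exposes Log::WRITER_ELASTICSEARCH; double-check the identifiers against logs-to-elasticsearch.bro on your version):

event bro_init() &priority=-5
  {
  # Attach the ElasticSearch writer to just the streams you care about.
  Log::add_filter(Conn::LOG, [$name = "conn-es",
                              $writer = Log::WRITER_ELASTICSEARCH]);
  Log::add_filter(HTTP::LOG, [$name = "http-es",
                              $writer = Log::WRITER_ELASTICSEARCH]);
  }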

2. The docs say to add @load tuning/logs-to-elasticsearch in
local.bro...how can I send bro data to a remote elasticsearch server
instead?

redef LogElasticSearch::server_host = "...";

3. And lastly, as I look at the Brownian demo, I see that all the
fields are correctly laid out...was this done with Brownian, or with
elasticsearch itself?

No idea... Vlad would know :slight_smile:

I'm trying to get bro data into logstash directly, instead of using
log files. Thanks for any insight.

Keep in mind that if communication between Bro and ES fails, you
might have a very bad time.

1. Is there a proper way to set which logs to send to elasticsearch
that I can use in local.bro instead of modifying
logs-to-elasticsearch.bro?

Yes, there are settings that you can change. In local.bro, you can do this...

@load tuning/logs-to-elasticsearch
redef LogElasticSearch::send_logs += {
  Conn::LOG,
  HTTP::LOG
};

That will only send the conn.log and http.log to ElasticSearch.
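
If I'm remembering the script's exports right, there's also the inverse: an exclusion set that sends everything except the streams you list (check logs-to-elasticsearch.bro to confirm the name).

@load tuning/logs-to-elasticsearch
redef LogElasticSearch::excluded_log_ids += {
  Communication::LOG
};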

2. The docs say to add @load tuning/logs-to-elasticsearch in
local.bro...how can I send bro data to a remote elasticsearch server
instead?

redef LogElasticSearch::server_host = "1.2.3.4";
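
There are a couple of related knobs as well; the option names below are what I believe the ES writer of this vintage uses, so verify them against your version:

redef LogElasticSearch::server_port = 9200;    # port of the remote ES node
redef LogElasticSearch::index_prefix = "bro";  # prefix for the created indexes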

3. And lastly, as I look at the Brownian demo, I see that all the
fields are correctly laid out...was this done with Brownian, or with
elasticsearch itself?

Could you explain what you mean by "correctly laid out"?

I'm trying to get bro data into logstash directly, instead of using
log files. Thanks for any insight.

Cool! With the current mechanism, you could encounter overload situations that cause Bro's memory footprint to grow until you run out of memory. We're slowly working on extensions to the ES writer to make it write to a disk-backed queuing system so that things should remain more stable over time. I'm interested to hear about any experiences you have with this, though.

  .Seth

Thanks for the responses, gents...they do help. So, for example: I currently have snort going to logstash. In order to match fields, I have this:

filter {
         grok {
                 match => [ "message", "%{SYSLOGTIMESTAMP:date} %{IPORHOST:device} %{WORD:snort}\[%{INT:snort_pid}\]\: \[%{INT:gid}\:%{INT:sid}\:%{INT:rev}\] %{DATA:ids_alert} \[Classification\: %{DATA:ids_classification}\] \[Priority\: %{INT:ids_priority}\] \{%{WORD:proto}\} %{IP:ids_src_ip}\:%{INT:ids_src_port} \-\> %{IP:ids_dst_ip}\:%{INT:ids_dst_port}" ]
         }
}

to match:

Jul 23 09:44:46 gateway snort[13205]: [1:2500084:3305] ET COMPROMISED Known Compromised or Hostile Host Traffic TCP group 43 [Classification: Misc Attack] [Priority: 2] {TCP} 61.174.51.229:6000 -> x.x.x.x:22

I'm guessing I'm going to have to create something like the above grok for each bro log file...which is going to be a hoot :wink: I was hoping that work was already done somewhere...and I think I had it working for conn.log at one point; I posted it here some time ago. Thanks again...after looking at the Brownian source, I think I'm going to have to just bite the bullet and generate the grok lines.

James

I've done most of them using grok and custom patterns; conn.log is below. I'm using logstash to read the log files, process them, and insert into elasticsearch, then using kibana as a web front end.

       grok {
         match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<service>(.*?))\t(?<duration>(.*?))\t(?<orig_bytes>(.*?))\t(?<resp_bytes>(.*?))\t(?<conn_state>(.*?))\t(?<local_orig>(.*?))\t(?<missed_bytes>(.*?))\t(?<history>(.*?))\t(?<orig_pkts>(.*?))\t(?<orig_ip_bytes>(.*?))\t(?<resp_pkts>(.*?))\t(?<resp_ip_bytes>(.*?))\t(?<tunnel_parents>(.*))" ]
       }
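
The custom patterns part is just grok's patterns_dir mechanism. A sketch, with a made-up pattern name and file path:

       # /etc/logstash/patterns/bro -- hypothetical pattern file containing:
       # BRO_FIELD [^\t]*

       grok {
         patterns_dir => "/etc/logstash/patterns"
         match => [ "message", "%{BRO_FIELD:ts}\t%{BRO_FIELD:uid}\t%{GREEDYDATA:rest}" ]
       }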

Hey, thanks Chris, that's a big help...if you'd be willing to share any of the others, that would be excellent as well. I have to admit I'm fairly excited to see a dashboard that shows me things like "show me snort ids hits AND firewall hits AND connection tracking" :slight_smile:

James

Are you saying that you're going to have to do this because you don't want Bro to write directly to ElasticSearch?

  .Seth

Negative. In order to get Logstash/Kibana to identify fields, grok patterns are what's used. I guess that's the question for me...does Bro dump the data raw into elasticsearch? If it does, then I'll need to include a grok line in my logstash config to parse out the data for each type of log that bro generates. I hope that makes sense...thanks Seth.

James

Bro converts the data to JSON and then writes it to elasticsearch using ES's bulk interface. But it does a "fire and forget," so it doesn't confirm that the data was actually accepted.
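
Roughly speaking, each batch is a standard _bulk payload: an action line followed by the JSON document. Something along these lines (the index/type names and field values here are purely illustrative; the real index name depends on your prefix and rotation settings):

{ "index": { "_index": "bro-201407241200", "_type": "conn" } }
{ "ts": 1406210165.795, "uid": "CHhAvVGS1DHFjwGM9", "id.orig_h": "10.0.0.1", "id.resp_h": "93.184.216.34", "proto": "tcp" }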

I wrote an AMQPRiver writer a while back that allows you to leverage an ElasticSearch River; it provided a higher level of reliability for data ingestion, but I haven't touched it since I wrote it a few months back.

Bro will write the logs directly into elasticsearch (with the fields separated and named correctly). You don't need logstash at all. The only difference is that in your kibana config, you'll need to use slightly different index names. I'm hoping that this is something we'll have more guidance on at some point. I definitely recognize that more cleanup needs to be done on this code to make it more resilient and to make it easier to get to an end result.

  .Seth

How does Bro handle indexes within ES? Does it rotate indexes, or does it write to one extremely large index with TTLs?

Right now we're handling indexes with Bro log rotation. The logs-to-elasticsearch script sets a log rotation interval of three hours, so you'll have a new index created every three hours. Bro is also not doing anything to clean up old indexes, so you'll have to do that on your own.
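
If three hours doesn't suit you, the interval should be redef-able like the other options (name assumed from the tuning script, so verify it there):

@load tuning/logs-to-elasticsearch
redef LogElasticSearch::rotation_interval = 24hr;

Expired indexes can then be dropped on a schedule with Elasticsearch's normal delete-index API (a curl -XDELETE against the old index names from cron should work fine).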

  .Seth

I'm far from being an expert, but I think that using the built-in grok patterns and/or being more specific with the regex syntax will result in better logstash performance.

For example, I'm profiling performance for some of our dns grok parsing patterns:

match => [ "message", "(?<bro_event_time>[0-9.]{14})[0-9]+\t%{IP:dns_requester}\s(?<dns_query_src_port>[0-9]{1,5})\t%{IP:dns_server}\s(?<dns_query_dst_port>[0-9]{1,5})\t%{WORD:dns_query_proto}\t(?<dns_query_transid>[0-9]+)\t%{HOSTNAME:dns_query}\t%{NOTSPACE:dns_query_class}\t(?<dns_query_type>[A-Za-z0-9-*]+)\t(?<dns_query_result>[A-Z*]+)\t(?<dns_authoritative_answer>[TF])\t(?<dns_recursion_desired>[TF])\t(?<dns_recursion_available>[TF])\t%{GREEDYDATA:dns_response}" ]

I'm also sure there are more efficient ways to write it than what I did. The odd parsing of the timestamp is because we use logstash to rewrite event times where possible, using the actual event time with the date filter:

date {
  match => [ "bro_event_time", "UNIX" ]
}
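
To make the "be more specific" advice concrete, here's what the first few conn.log fields might look like with anchoring and built-in patterns instead of lazy (.*?) captures (a sketch; the field names are just examples):

grok {
  match => [ "message", "^%{NUMBER:ts}\t%{NOTSPACE:uid}\t(?:%{IP:orig_h}|-)\t%{INT:orig_p}\t(?:%{IP:resp_h}|-)\t%{INT:resp_p}\t%{GREEDYDATA:rest}" ]
}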

Just my .02.

Take a look at

http://brostash.herokuapp.com/

-Mike

Yea that's a thing of beauty...thank you!

James

Thanks MK...that does help...this has been an interesting day of
discovery.

James

Confirming that this works like a champ. My testing here is using Logstash with its built-in Kibana, and a separate instance of Elasticsearch, since there's more going in than just Bro. In fact, the whole idea is to tie in bro, snort, and syslogs. With bro going direct to elasticsearch, there's nothing to really configure, save making sure your Kibana index is set to _all. Kibana also allows you to tweak the timestamp so the original unix time shows up as, for example, 2014-07-24T12:16:05.795-06:00. My next step will be to get snort and firewall logs in...ironically, the Bro portion has been the easiest :slight_smile: Thanks for the work on this!

James