Optimizing running Bro from pcaps / advantage of cluster mode

Hello!

In contrast to the normal use case, I run Bro mostly from pcaps. When
huge amounts of data (~20 TB) have to be processed, Bro in standalone
mode becomes a real bottleneck, so I thought about using Bro's cluster
mode.

In the past I thought the Bro workers would communicate with each
other, so that when, for example, one worker sees the upstream direction
and another the downstream direction, they would combine the information
into one log. Seth told me at BroCon that Bro needs to be fed complete
streams. To do this, some kind of load balancer is needed in front of
Bro.
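
To illustrate what "complete streams" means for a load balancer, here is a minimal sketch of a symmetric flow hash: both directions of a connection map to the same worker. The function name `flow_worker`, the addresses, and the worker count are made up for illustration; this is not how any particular load balancer implements it.

```python
import hashlib
from typing import Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


def flow_worker(src: Endpoint, dst: Endpoint, proto: str, n_workers: int) -> int:
    """Map a flow to a worker index; both directions give the same index."""
    # Sort the two endpoints so that (A -> B) and (B -> A) build the same key.
    lo, hi = sorted([src, dst])
    key = f"{proto}|{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}".encode()
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


# Hypothetical client and server; 4 workers is an arbitrary choice.
client = ("10.0.0.5", 51234)
server = ("192.0.2.10", 80)

upstream = flow_worker(client, server, "tcp", 4)
downstream = flow_worker(server, client, "tcp", 4)
print(upstream == downstream)
```

Without the sorting step, the two directions would generally hash to different workers and each worker would see only half of the connection.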

If I need to split the flows with a load balancer anyway, is there any
advantage to running Bro in cluster mode at all? I do not need any
shared data such as tables. Are there any parsers which combine the
information seen by different workers in different flows?

If cluster mode has no added value in my case, I could just
load-balance my pcaps across independent Bro instances, which would make
my setup much simpler.
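
A rough sketch of the "independent instances" idea, assuming each pcap already contains complete flows: start one `bro -r <pcap>` process per file, each in its own working directory, since Bro writes its logs to the current directory. The helper names (`build_jobs`, `run_job`, `run_all`), the output layout, and the parallelism of 4 are assumptions for illustration only.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Tuple

Job = Tuple[List[str], Path]  # (command line, working directory)


def build_jobs(pcaps: List[str], out_root: str) -> List[Job]:
    """Build one `bro -r` command per pcap, each with its own log directory."""
    jobs = []
    for pcap in pcaps:
        pcap_path = Path(pcap).resolve()
        # Logs go to <out_root>/<pcap name without extension>/
        out_dir = Path(out_root) / pcap_path.stem
        jobs.append((["bro", "-r", str(pcap_path)], out_dir))
    return jobs


def run_job(job: Job) -> int:
    """Run one Bro instance and return its exit code."""
    cmd, out_dir = job
    out_dir.mkdir(parents=True, exist_ok=True)
    return subprocess.run(cmd, cwd=out_dir).returncode


def run_all(pcaps: List[str], out_root: str, parallel: int = 4) -> List[int]:
    """Run up to `parallel` Bro instances at a time."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(run_job, build_jobs(pcaps, out_root)))
```

Calling `run_all(["a.pcap", "b.pcap"], "logs")` would leave one set of logs under `logs/a` and another under `logs/b`. Two pcaps with the same file name in different directories would collide in this layout, and the per-file logs still have to be consolidated afterwards.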

Have a nice weekend!

Franky

Frank,

I would argue that using Bro’s cluster configuration ends up making it a lot easier for you in the long run.

  1. To start, you only have one logger node so all of your logs will be in one place and you don’t have to worry about trying to consolidate them later.

  2. broctl provides an easy way to check the status of all of your nodes without having to write anything custom.

  3. Sync’ing all of your bro binaries and policies across all workers is also done for you.

  4. I question not needing to have shared tables, but I also don’t know your environment and your end goals. That’s how most of the scan detection scripts work: by counting the number of anomalies over time across all of your traffic. If an attacker scans you ten times and those scans are split across ten Bro nodes that aren’t communicating with each other, you may miss it. A lot of the malware detection policies also look for an inbound connection and then a separate outbound connection.
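
The scan example in point 4 can be sketched as a plain counting problem. This is only an illustration of why split counts miss a threshold, not how Bro's scan detection scripts are written; the threshold of 5, the addresses, and the function name `flagged` are made up.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Probe = Tuple[str, str]  # (scanner address, probed target)


def flagged(probes: Iterable[Probe], threshold: int) -> Set[str]:
    """Return the sources with at least `threshold` probes."""
    counts = Counter(src for src, _target in probes)
    return {src for src, n in counts.items() if n >= threshold}


THRESHOLD = 5  # hypothetical detection threshold

# Ten probes from one source, one landing on each of ten workers.
per_worker = [[("203.0.113.7", f"10.0.0.{i}")] for i in range(10)]

# Each worker counts on its own: nobody sees more than one probe.
isolated = set().union(*(flagged(p, THRESHOLD) for p in per_worker))

# Counts combined across all workers: the source crosses the threshold.
combined = flagged([probe for p in per_worker for probe in p], THRESHOLD)

print(isolated)  # set()
print(combined)  # {'203.0.113.7'}
```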

Also, using broctl puts you in the same place as a lot of other installations, so it’s easier for people on this list to help troubleshoot.

-Dop

Hi Mike,

thanks for your reply!

> I would argue that using Bro's cluster configuration ends up making
> it a lot easier for you in the long run.
>
> 1) To start, you only have one logger node so all of your logs will
> be in one place and you don't have to worry about trying to
> consolidate them later.

This is true, but you could also argue that you might get better
throughput if multiple loggers write to, for example, a cluster of
Elasticsearch or Kafka servers.

> 4) I question not needing to have shared tables, but I also don't
> know your environment and your end goals. That's how most of the
> scan detection scripts work: by counting the number of anomalies over
> time across all of your traffic. If an attacker scans you ten times
> and those scans are split across ten Bro nodes that aren't
> communicating with each other, you may miss it. A lot of the malware
> detection policies also look for an inbound connection and then a
> separate outbound connection.

I should clarify that I run Bro mainly as a source of metadata about
pcaps. As all the data is from the past, scan detection is not a
priority.

> Also, using broctl puts you in the same place as a lot of other
> installations so it's easier for people on this list to help
> troubleshoot.

That's a good point.

My original question still stands: Are there any parsers which combine
the information seen by different workers in different flows?

Martin

> My original question still stands: Are there any parsers which combine
> the information seen by different workers in different flows?

Policies, yes. Parsers, I'm not sure, but I don't believe so.

-Dop

Yes, FTP (the control and data channels are separate connections). Also, there are some scripts that take global views of activity to create derived logs (which may not matter so much in your use case).
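
To make the FTP case concrete: the control and data channels have different port pairs, so a load balancer that hashes on addresses and ports can send them to different workers. The sketch below is only an illustration under assumed values (4 workers, made-up addresses and port range, a hypothetical `worker_for` hash), not a description of any specific load balancer.

```python
import hashlib
from typing import Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


def worker_for(src: Endpoint, dst: Endpoint, n_workers: int) -> int:
    """Symmetric hash over addresses and ports, reduced to a worker index."""
    lo, hi = sorted([src, dst])
    key = f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:8], "big") % n_workers


N_WORKERS = 4  # arbitrary
client, server = "10.0.0.5", "192.0.2.10"

# FTP control channel: client ephemeral port to server port 21.
control = worker_for((client, 50000), (server, 21), N_WORKERS)

# Passive-mode data channels: a different server port for each transfer.
data_ports = range(30000, 31000)
split = sum(
    worker_for((client, 50001), (server, port), N_WORKERS) != control
    for port in data_ports
)
print(f"{split} of {len(data_ports)} data connections land on a different worker")
```

Hashing on the address pair alone would keep all connections between the same two hosts on one worker, at the cost of a less even distribution.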

   .Seth