Standalone vs cluster

This appears to have been discussed in 2009, so I thought I might re-ask to see if anything has changed, and to add a follow-on question/clarification. I don't see any further discussion when searching the archives.

If using a single box to run bro, is there any advantage to running cluster mode (all localhost) rather than standalone?

The previous answer was: no reason to do so, with additional clarification that a) if you’re thinking of eventually migrating to cluster mode, getting the configuration correct will be the least of your trouble and b) unless you want to take advantage of multiple cores.

The latter point is why I am posing the question again: on a 12-core box, for example, how does one (and should one) take advantage of these cores? The last I have seen is that a) Bro is single-threaded and b) the rule of thumb is 80Mbps/core. If that is so, am I at risk of dropping data on the floor if I don't specifically configure more workers?

Say I can expect to see 500 Mbps peak, with occasional sustained load of say 300 Mbps.

To accommodate this traffic load, should six workers be defined all on localhost? Or does a single localhost worker (the default in standalone, right?) already utilize the cores to achieve the desired performance?

Thanks for your suggestions
Clark

You're going to want to run it as a cluster, even if it's all on one box.
80Mbps/core seems low nowadays, although it depends on your CPUs. We're easily handling loads[0] in the 3-4Gbps range on 16 workers, 4 proxies, and a manager (all on the same 20 core box). My CPUs are E5-2687W v3 @ 3.10GHz. Pin your processes and you should be ok. But yes, if the load is too much, then you'll drop traffic. Enable the capture loss script and graph its output to get an idea.

[0] Asterisk: two of the workers drop 5-10% of their traffic, more than the other 14, because their CPUs sit at 100%; the load follows the workers, and I've given up trying to figure that one out for now. I'm assuming it's some prolonged traffic and/or some weird hashing on my network card, an Endace DAG 9.2X2.
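For reference, enabling the capture loss script amounts to a single @load (a sketch; this assumes a BroControl deployment where you add site policy to local.bro):

```bro
# local.bro -- estimate packet loss by watching for gaps in TCP streams.
# Results go to capture_loss.log; the percent_lost column is the one to graph.
@load misc/capture-loss
```

Each worker reports its own loss estimate, so a single overloaded worker (like the two in the footnote above) shows up clearly in the per-peer rows.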

Mike

I run all of my single boxes as clusters. This is how you get it to scale locally, so that you can take full advantage of all the cores on the box. The number of workers really depends on the volume and types of traffic; start with six and see how it does.
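As a sketch, a one-box cluster with six workers looks something like the node.cfg below. The interface name is an assumption (substitute your capture interface), and lb_method=pf_ring requires a pf_ring-enabled build; the pinning line is optional but matches the earlier advice:

```ini
# node.cfg -- everything on localhost: manager, proxy, six worker processes
[manager]
type=manager
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
# assumption: eth0 is your capture interface
interface=eth0
# split the stream across processes instead of duplicating it per worker
lb_method=pf_ring
# spawns six processes (worker-1-1 through worker-1-6)
lb_procs=6
# optional: pin workers to specific cores, leaving some for manager/proxy
pin_cpus=2,3,4,5,6,7
```

Without a load-balancing method, each additional worker section on the same interface would see a full copy of the traffic rather than a share of it, which is why the lb_method/lb_procs approach is the one to use on a single box.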

Thanks

Mike

If using a single box to run bro, is there any advantage to running cluster mode (all localhost) rather than standalone?

Simple answer here: you almost never want to run standalone.

The previous answer was: no reason to do so, with additional clarification that a) if you're thinking of eventually migrating to cluster mode, getting the configuration correct will be the least of your trouble and b) unless you want to take advantage of multiple cores.

The latter point is why I am posing the question again: on a 12-core box, for example, how does one (and should one) take advantage of these cores? The last I have seen is that a) Bro is single-threaded and b) the rule of thumb is 80Mbps/core. If that is so, am I at risk of dropping data on the floor if I don't specifically configure more workers?

That rule of thumb was actually created for this box:
  http://www.amazon.com/Dell-Computer-Professional-Extremely-Operation/dp/B002Q6ZTZM

I don't recommend using those anymore (or ever), but the first production Bro cluster was running on a big stack of those because I got them for free. :-) That documentation needs to be updated at some point, but generally these days, with modern hardware, people will see ~200-250Mbps per core, although it's possible to make it run faster.

To accommodate this traffic load, should six workers be defined all on localhost? Or does a single localhost worker (the default in standalone, right?) already utilize the cores to achieve the desired performance?

Did you read the load balancing documentation?
  https://www.bro.org/documentation/load-balancing.html

It's a bit out of date and unfortunately only includes directions for load balancing with pf_ring, but it should point you in the right direction; I'll see if I can update it with a second mechanism soon. We're also working on adding another on-host load-balancing mechanism, which we think should be really flexible and nice.
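Once node.cfg is set up, the usual BroControl cycle applies (a sketch; assumes broctl is installed and managing the box):

```console
broctl check       # validate the configuration and scripts first
broctl install     # push the new node.cfg / policy to the spool
broctl start
broctl netstats    # per-worker recv/drop counters from the capture layer
broctl top         # CPU/memory per process, to spot a pegged worker
```

netstats plus the capture loss log is usually enough to tell whether six workers are keeping up or whether you need to add more (or re-pin them).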

  .Seth