Running a Bro cluster diskless?

Background:

With some guidance from Seth, Baylor is jumping into Bro in a 'timidly aggressive' (should I trademark that?) fashion. We are currently working to build a Bro cluster that can analyze up to 2Gb/s of traffic. We'll have about 900Mb/s of capacity in each direction once the upgrades to our exit are complete, with our real aggregate traffic measuring significantly below the 1.8Gb/s maximum.

We have purchased six systems and a switch: one front-end system to run Click!, four worker systems, and a manager system. One private network will connect the front-end system to the workers, and another will connect the workers to the manager system.
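As a rough sizing check on the figures above, the per-worker load can be sketched as follows. This assumes the front end splits traffic perfectly evenly across the four workers, which is an idealization (flow-based balancing is rarely that even), and the function name is made up for illustration.

```python
def per_worker_mbps(total_mbps: float, workers: int) -> float:
    """Idealized even split of traffic across workers.

    Assumption: perfectly even distribution; real balancing will be lumpier.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return total_mbps / workers


if __name__ == "__main__":
    # 2Gb/s design target and 1.8Gb/s stated maximum, four workers.
    for label, total in [("design target", 2000.0), ("stated maximum", 1800.0)]:
        print(f"{label}: {per_worker_mbps(total, 4):.0f} Mb/s per worker")
```

Under that assumption each worker would need to keep up with roughly 450 to 500Mb/s at peak, so some headroom for uneven distribution is worth planning for.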

I have a history of running diskless HPC systems with JessWulf [1], and I hope and plan to do the same with our Bro configuration. Simply put, JessWulf is an RPM-based toolkit/guide for running RPM-based Linux distributions in a master/node cluster environment, where all nodes are diskless.

I hope to use the 'manager' server as the master and the worker servers as the nodes in a JessWulf cluster to ease configuration and management. I will certainly have a small local ramdisk as well as local hard drives for non-persistent scratch space as needed.

Now, for the question(s):

Does anyone have experience running Bro diskless like this already? What are the common problems unique to this configuration, where will I likely want to leverage the local scratch space, and is this absolutely the wrong way to run a Bro cluster?

Thanks for any help,

-- KS

[1] - https://wiki.uis.georgetown.edu/display/CCF/JessWulf+-+A+Diskless+Beowulf+Cluster+Toolkit

Keith Schoenefeld
Information Security Analyst
Baylor University

I hope to use the 'manager' server as the master and the worker servers as the nodes in a JessWulf cluster to ease configuration and management. I will certainly have a small local ramdisk as well as local hard drives for non-persistent scratch space as needed.

You will want local disk space for the directory where you have Bro installed; I usually use /bro on clusters. The remote.log file is still kept locally on each worker and proxy node, and the Bro binary is copied to each node when you run the "install" command.
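One way to confirm that the install directory really sits on local disk on a diskless Linux node is to look it up in /proc/mounts. The following is a minimal sketch and not part of Bro or BroControl: the function names and the set of filesystem types treated as non-local are assumptions made for illustration, and /bro is simply the path mentioned above.

```python
import os
import posixpath

# Filesystem types treated as "not local disk" here (an assumed, incomplete list).
NON_LOCAL_FSTYPES = {"nfs", "nfs4", "tmpfs", "ramfs", "rootfs"}


def fstype_for(path: str, mounts_text: str) -> str:
    """Return the fs type of the longest mount point that contains path.

    mounts_text is the content of /proc/mounts. Mount points containing
    escaped characters (such as \\040 for a space) are not handled.
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point = fields[1].rstrip("/") or "/"
        fstype = fields[2]
        inside = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point + "/")
        )
        # ">=" lets a later mount on the same mount point override an earlier one.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def is_local_disk(path: str, mounts_text: str) -> bool:
    """True if path is on a filesystem type not in NON_LOCAL_FSTYPES."""
    fstype = fstype_for(path, mounts_text)
    return fstype != "unknown" and fstype not in NON_LOCAL_FSTYPES


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            mounts = f.read()
        print("/bro is on:", fstype_for("/bro", mounts))
        print("local disk:", is_local_disk("/bro", mounts))
```

Running this on each worker before the "install" step would show whether /bro landed on a real disk or on the NFS or memory-backed root.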

There is a BroControl setting named "HaveNFS", which is described here:
  http://svn.icir.org/bro/releases/release_1_5/bro/aux/broctl/README.html#_questions_and_answers

Does anyone have experience running Bro diskless like this already? What are the common problems unique to this configuration, where will I likely want to leverage the local scratch space, and is this absolutely the wrong way to run a Bro cluster?

OSU ran very similarly to that for a very long time. I suppose it was actually how the first production cluster (with BroControl at least) was done, but we backed away from it a bit due to all of the problems I was encountering. In the hands of a more experienced cluster admin, I expect the results would be much better. :)

I think you'd probably be fine with this deployment scenario.

  .Seth

I come from an HPC background and have used Beowulf/Warewulf and Perceus cluster software for scientific computing. That is Linux software and we prefer FreeBSD for our Bro clusters, but I’ve been laying the groundwork for a memory file system Bro cluster (with local scratch). Our next cluster will likely be built this way. If you go down this path and want to compare notes, let me know.