So, slightly off-topic, but since several of you and I seem to be going through this, would anyone be willing to collaborate on a paper/presentation to submit for BroCon 2015 that details the various methodologies folks are using to capture at X rate?
We're somewhere around 5-6Gbps average but burst as high as 9.
I've been through many iterations to get the "perfect" recipe and it might prove useful to others.
However, there are many different options on the network and system side, so there are probably a few "perfect" recipes depending on budget and equipment.
Thoughts?
Cheers,
Harry
I second it, and will be able to provide some docs/tuning/configs
regarding what we are currently doing.
scott
Bear in mind that there is a limit of 32 applications (bro
workers/slaves) that can attach to a single cluster ID with
the pf_ring dna/zc drivers. Or you can get really crafty and
bounce traffic from one ring to another interface/ring and have
up to 64 workers on a single box, provided you have the cores to
work with.
Looking at the current Intel chips, I'd say the 8-core high-clock
(3.3GHz+) procs are a good option for a quad-socket system build
that doesn't break the bank. That would give you 32 cores to pin
workers on at a nice high clock speed, which bro seems to
greatly appreciate. See the E5-2687W v2, E5-2667 v2, or E5-4627 v2,
some of which can turbo up to 4GHz for traffic spikes (if you
manage the power modes correctly!):
https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/08/05/how-to-maximise-cpu-performance-for-the-oracle-database-on-linux
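On the power-mode point, one thing that is easy to check on Linux is which cpufreq governor each core is using. The following is a minimal sketch, assuming the standard sysfs cpufreq layout; whether the governor is the specific setting meant above is an assumption, and the function name governor_counts is made up for illustration.

```python
from collections import Counter
from pathlib import Path


def governor_counts(cpu_root="/sys/devices/system/cpu"):
    """Count how many logical CPUs use each cpufreq scaling governor.

    Assumes the Linux sysfs layout cpuN/cpufreq/scaling_governor.
    Returns an empty Counter if cpufreq is not exposed on this host.
    """
    counts = Counter()
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        counts[gov_file.read_text().strip()] += 1
    return counts


if __name__ == "__main__":
    for governor, n in sorted(governor_counts().items()):
        print(f"{governor}: {n} CPUs")
```

A host where some cores report a power-saving governor would be a candidate for a closer look before blaming bro for loss during spikes.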
-Alex
For perspective, I currently have a bro cluster comprised of 3
physical hosts. The first host runs the manager and proxies, and
has storage to handle lots of bro logs and keep them for
several months; the other two are dedicated to workers with
relatively little storage. We have a hardware load-balancer to
distribute traffic as evenly as possible between the worker
nodes, and some effort has been made to limit having to process
really large uninteresting flows before they reach the cluster.
I looked at one of our typically busier blocks of time today
(10:00-14:00) and during that time the cluster was seeing an
average of 10Gbps of traffic with peaks as high as 15Gbps.
Looking at our traffic graphs and capstats showed each host
typically was seeing around 50% of that load, or around 5Gbps
on average. During this time we saw an average capture loss of
around 0.47%, with a max loss of 22.53%. During that same
time-frame I had 18 snapshots where individual workers reported
loss over 5%, and 2 over 10% out of 748. So, I'd say each host
is probably seeing about the same amount of traffic as you
have described, but loaded scripts etc may vary from your
configuration. We have 22 workers per host for a total of 44
workers, and I believe the capture loss script is sampling
traffic over 15 minute intervals by default, so there are
roughly 17 time slices for each worker. Here are some details
of how those nodes are configured in terms of hardware and
bro.
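As a quick aside before the hardware details, the loss figures above can be cross-checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in this message; the function name summarize_loss is made up for illustration.

```python
def summarize_loss(hosts, workers_per_host, snapshots, over_5, over_10):
    """Derive per-worker slice count and loss fractions from the quoted totals."""
    workers = hosts * workers_per_host
    return {
        "workers": workers,
        # Snapshots per worker over the observed window.
        "slices_per_worker": snapshots / workers,
        # Share of all snapshots that reported loss above each threshold.
        "pct_over_5": round(100 * over_5 / snapshots, 2),
        "pct_over_10": round(100 * over_10 / snapshots, 2),
    }


stats = summarize_loss(hosts=2, workers_per_host=22,
                       snapshots=748, over_5=18, over_10=2)
for key, value in stats.items():
    print(f"{key}: {value}")
```

With these inputs, 748 snapshots across 44 workers is 17 per worker, and about 2.4% of snapshots showed loss over 5%.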
2 worker hosts, each with:
  2x E5-2697 v2 (12 cores / 24 HT), 2.7GHz / 3.5GHz Turbo
  256GB RAM (probably overkill, but I used to have the manager and
    proxies running on one of the hosts and it skewed my memory use
    quite a bit)
  Intel X520-DA2 NIC
  Bro 2.3-7 (git master at the time I last updated)
  22 workers
  PF_RING 5.6.2 using DNA IXGBE drivers, and the pfdnacluster_master
    script
  CPUs pinned (used the OS to verify which core presented to the OS
    mapped to each physical core, to avoid mapping 2 workers to the
    same physical core, and didn't use the 1st core on each CPU)
  HT is not disabled on these hosts, and I'm still using the OS
    malloc.
Worker configs like this:
[worker-1]
type=worker
host=10.10.10.10
interface=dnacluster:21
lb_procs=22
lb_method=pf_ring
pin_cpus=2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
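The pinning approach described above (one worker per physical core, skipping the first core on each CPU) can be sketched roughly as follows. The topology used here is hypothetical: logical CPU numbering differs between machines, so the real socket/core mapping has to be read from the OS (for example from lscpu or /proc/cpuinfo), and the output will not necessarily match the pin_cpus line in this message. The function names are made up for illustration.

```python
from collections import defaultdict


def pick_pin_cpus(topology, skip_first_core=True):
    """Pick one logical CPU per physical core.

    topology maps logical CPU id -> (socket_id, core_id).
    Hyperthread siblings share a (socket_id, core_id) pair; only the
    lowest-numbered sibling of each core is kept. Optionally the first
    core on every socket is left free for the OS.
    """
    per_core = defaultdict(list)
    for cpu, (socket, core) in topology.items():
        per_core[(socket, core)].append(cpu)

    per_socket = defaultdict(list)
    for (socket, core), cpus in per_core.items():
        per_socket[socket].append((core, min(cpus)))

    picked = []
    for socket in sorted(per_socket):
        cores = sorted(per_socket[socket])
        if skip_first_core:
            cores = cores[1:]
        picked.extend(cpu for _, cpu in cores)
    return sorted(picked)


def example_topology(sockets=2, cores_per_socket=12):
    """Hypothetical numbering: first all physical cores, then HT siblings."""
    topo = {}
    physical = sockets * cores_per_socket
    for s in range(sockets):
        for c in range(cores_per_socket):
            first = s * cores_per_socket + c
            topo[first] = (s, c)             # first hardware thread
            topo[first + physical] = (s, c)  # its hyperthread sibling
    return topo


cpus = pick_pin_cpus(example_topology())
print("lb_procs=" + str(len(cpus)))
print("pin_cpus=" + ",".join(str(c) for c in cpus))
```

For a 2 x 12-core host this leaves 22 physical cores, which lines up with the 22 workers per host used here.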
I suspect the faster CPUs will handle bursty flows better such as when a