Using Bro in offline mode (pcap spooling)

Greetings!

I have seen previous posts about people wanting to run Bro against
full-packet-capture datastores and wanted to share a solution I have
found, some limitations I encountered, and a partial workaround, before
finally asking for clarification on cluster support for reading pcaps.

Solution for consuming pcaps continuously:

First, to deal with sessions spanning multiple capture files without
resorting to tcpreplay, I use this project from CIRCL:
https://github.com/CIRCL/pcapdj

It works by using Redis to maintain a queue of paths to pcaps for
sequential processing. The initial pcap is fed to a FIFO with its
header intact, but the stream is not terminated at the end of that
file, so the libpcap code keeps waiting for additional packets. The
next and subsequent pcaps have their headers removed, so to the
reading application this appears as one continuous packet stream.
There are additional details regarding how the queue is managed (you
have to authorize the reading of each subsequent file) that I won't go
into here, but I will add that I use inotify to add pcaps to the queue
once they are closed by the writing application.
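
Roughly, the wiring looks like this. This is a sketch, not a drop-in:
the pcapdj -b flag and the PCAPDJ_* queue/key names are as I remember
them from the pcapdj README, and all paths are made up, so verify
against the project docs before using any of it.

  # FIFO that pcapdj spools into and Bro reads from
  mkfifo /var/spool/bro.fifo
  pcapdj -b /var/spool/bro.fifo &

  # Bro sees the FIFO as one endless trace file
  bro -r /var/spool/bro.fifo local.bro &

  # Enqueue each capture file once the writer closes it
  inotifywait -m -e close_write --format '%w%f' /data/pcaps | \
  while read -r f; do
      redis-cli lpush PCAPDJ_IN_QUEUE "$f"
  done &

  # Blindly authorize whatever file pcapdj announces next
  # (in production you would vet the filename first)
  while sleep 1; do
      next=$(redis-cli get PCAPDJ_NEXT)
      [ -n "$next" ] && redis-cli set PCAPDJ_AUTH "$next"
  done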

<SNIP>

Limitations:

I went this route initially because I believed that processing pcaps
already present on these sensors (thanks to full packet capture) would
be more efficient, as it removes the real-time constraints associated
with live operation and analysis. This has held true on sensors
analyzing single or multiple 1Gbps links; however, on a deployment
where 2x10Gbps links and 1x1Gbps link are dumped using PF_RING multi
(eth4,eth5,eth6), I have run into a few issues.

1. Timing differences between the 10Gbps and 1Gbps links led to
packet-ordering problems that made me change the way the SPAN sessions
were set up; otherwise file extraction and other analyses did not work
properly. Specifically, whenever dealing with multiple capture links on
switches in the same zone (especially when 2 or more switches are
redundant), my practice is to set up an RX SPAN on every port except
those that interconnect the redundant chassis. This captures both
directions of the traffic and prevents duplication.

In this specific scenario the two 10Gbps links go to redundant Nexus
(think server-core) switches (cross-connected via port-channel) serving
multiple leaf switches where the servers live, plus an additional
port-channel to the redundant user-core switches. The user core has to
transit the server core to reach the WAN/Internet.

The 1Gbps link goes to a standalone switch that terminates the WAN
routers and firewalls. So all ports had RX copied except the
port-channels cross-connecting the Nexus switches and each of their
uplinks to the WAN switch.

To work around the timing issues and maintain fidelity of capture for
internet-bound traffic, I had to change that WAN switch to copy RX on
every port and implement ACLs in the Nexus SPAN setup to avoid getting
duplicate copies in the outbound direction for internet flows while
still capturing user->server and server->user traffic.

2. First, I have found that Bro is amazingly efficient provided you do
not get stupid with scripts, and so far for non-10Gbps environments
this spooling setup works great! However, in the 10G environments we
fall quite far behind (12-14 hours) using a single process during peak
usage periods. As the evening goes on and usage dies down we start to
catch up, but I only expect this to get worse as the organization
continues to grow. It's obvious to me that, due to the nature of
reading from a FIFO, my options for parallelization are fairly
limited, but I have found some somewhat kludgey solutions.

a. Use mbuffer to read from the single FIFO and duplicate the stream
to additional FIFOs, each read by an individual Bro instance. Then use
BPFs in local.bro, as detailed here (
http://ossectools.blogspot.com/2012/10/multi-node-bro-cluster-setup-howto.html
), to do load sharing based on the 2-tuple (see the sketch after this
list).

b. Use tcpsplit ( https://github.com/pmcgleenon/tcpsplit ) to read
from the single FIFO and write to a number of additional FIFOs, which
can then be read by multiple Bro instances. tcpsplit supports VLAN
headers and splits using either an LFU table, a hash on the 4-tuple (5
with VLAN), a hash on the 2-tuple, or any of the previous methods
using only the high-order 24 bits of the IP (split by /24). In testing
this was workable, but it lacks the elegance of Bro clustering and its
state and metadata sharing. The standalone nature of the Bro instances
would break things like SMTP URL click analysis and other cool
correlations we use.

c. DPDK has support for software virtual devices (
http://dpdk.org/doc/guides/nics/pcap_ring.html ) backed by pcap
(librte_pmd_ring, librte_pmd_pcap), which I have not tested but which
looks promising as a way to run a Bro cluster against pcaps.
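
To make option (a) concrete, here is a minimal sketch of a two-way
split. It assumes mbuffer accepts multiple -o outputs (I have mostly
seen that used for files/tapes, so verify it behaves with FIFOs) and
uses Bro's -f option to hand each instance a BPF capture filter. The
filter sums the source and destination addresses, so both directions
of a session land on the same instance; paths are made up.

  mkfifo /var/spool/bro0.fifo /var/spool/bro1.fifo

  # Duplicate the single pcapdj stream to both worker FIFOs
  mbuffer -i /var/spool/bro.fifo \
          -o /var/spool/bro0.fifo -o /var/spool/bro1.fifo &

  # Each instance keeps only "its" half of the 2-tuple space; the
  # expression is symmetric in src/dst, so flows are never split.
  # Caveat: as written, non-IP traffic (ARP, IPv6) matches neither.
  bro -r /var/spool/bro0.fifo -f '(ip[12:4] + ip[16:4]) & 1 = 0' local.bro &
  bro -r /var/spool/bro1.fifo -f '(ip[12:4] + ip[16:4]) & 1 = 1' local.bro &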

<SNIP>

Now, recently I was reading this list and came across
http://mailman.icsi.berkeley.edu/pipermail/bro/2014-September/007458.html
where Seth mentions using the process command in broctl. I wanted to
ask if that is still valid in a cluster environment, and if so, how is
the pcap distributed to the workers? I do recall someone else
mentioning using packet-bricks as a pcap broker, which at that time
was still in development....

Thanks, and sorry if this is all TL;DR.

The process command only runs the pcap through a single Bro instance,
so it's probably not what you need. There are more details on how it
works in the docs [1], for reference.
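
For example, to run one trace through your regular Bro configuration
(path made up):

  broctl process /data/pcaps/trace.pcap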

- Jon

[1] https://www.bro.org/sphinx/components/broctl/README.html#command-reference