"bro-cluster-in-a-box-setup" to "zeek-cluster-in-a-box-setup"?

Hello Zeek Community,

I am working on a project where Zeek has been deployed in two phases. During the first phase, some participants used the “https://github.com/ncsa/bro-cluster-in-a-box-setup” script to assist with and automate much of the installation process.

Since then we have entered the phase in our project where more participants have been added, CentOS 8 is preferred, and we are using Zeek 3.0.1.

I wonder whether any consideration has been given, or work done, to updating the bro-cluster-in-a-box script to work with the newer OS and Zeek version. Any information would be appreciated.

Thanks in advance,

Paul Sibley

Hi!

It shouldn’t be that hard to update to 3.x…

  • bro-pkg should be swapped out with the renamed zkg
  • the python2 references can likely be changed to 3
  • caf no longer needs to be installed separately
  • geoip and its databases need to be swapped out for the maxminddb versions, which might need a license
  • probably worth it to switch to af_packet from pf_ring… pf_ring was only used initially to easily support capturing directly from both halves of a tap, which might not be a requirement anymore.

My schedule is a bit crazy for the next week, but once I have some time to work on it I should be able to get things updated pretty quickly… There’s really not much to it.

At the Canarie workshop, Steve Smoot from Corelight suggested using pf_ring still. Any thoughts/comments on switching to af_packet? Advantages vs Disadvantages?

Regards,
Scott

Well, it was more nuanced. Or should have been. I think af_packet is generally better, but if you want 2 taps, then pf_ring.
But always happy to hear other people’s experience!
-s

There’s a law that if you say pf_ring and af_packet 3 times, Michal shows up.

I don’t see many (any?) reasons for using pf_ring, TBH, if you have a modern kernel or a decent network card (Mellanox, Intel, etc). And I still owe the community the article showing how to use af_packet correctly :confused:

The case where one has input from multiple taps going to multiple network ports is handled the same way by af_packet (if the interfaces are bonded or bridged) and by pf_ring. Neither of them buffers the data, processes it at L4, or deals with out-of-order packets, etc.

OOOH! You can bond two interfaces together and run af_packet on the bond0 interface? that works?!?

Somewhat of a tangent, but did af_packet in CentOS/RHEL 7 kernels ever solve the distribution of packets across multiple bro/zeek processes when observing IPv6 traffic?

I observed an issue a while back where, when watching traffic on an interface (bonded or not) with multiple bro/zeek processes, all processes would see the IPv6 traffic instead of only one process. IPv4 worked properly, but any network with IPv6 had some nasty logs because of duplication.

Would love to hear this confirmed with no performance issues.

Cheers,

JB



From: justin@corelight.com
Sent: February 5, 2020 5:26 PM
To: michalpurzynski1@gmail.com
Cc: Paul.Sibley@canarie.ca; zeek@zeek.org
Subject: Re: [Zeek] “bro-cluster-in-a-box-setup” to “zeek-cluster-in-a-box-setup”?

OOOH! You can bond two interfaces together and run af_packet on the bond0 interface? that works?!?

You can absolutely do this. We are using af_packet and bonded interfaces throughout the majority of our deployments (approximately 1800 sensors).

We decided on af_packet because it was included in recent kernels (recent at the time, 2 years ago). I can’t speak to non-Debian-based distros, but we haven’t seen any issues related to the use of af_packet.

-Justin

Sure, you can run af_packet on any device, including device-made-of-devices, any virtual and physical interface and a combination thereof. The whole af_packet mechanism (they call it “taps” internally) works on a higher level.
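
For anyone wanting to try the bonded setup discussed above, here is a minimal sketch in Python that renders a zeekctl node.cfg worker stanza pointing af_packet at bond0. It assumes the af_packet plugin is installed and that the option names (lb_method=custom, af_packet_fanout_id, af_packet_fanout_mode) match what the plugin's README documents; please check them against your installed version. The worker name, host, process count, and fanout id are hypothetical values chosen only for illustration.

```python
def render_worker_stanza(name, host, device, procs, fanout_id):
    """Return a node.cfg worker stanza that captures from `device` via af_packet.

    Option names are assumed from the af_packet plugin's zeekctl integration;
    every value passed in here is illustrative, not a recommendation.
    """
    lines = [
        f"[{name}]",
        "type=worker",
        f"host={host}",
        # The af_packet:: prefix selects the af_packet packet source.
        f"interface=af_packet::{device}",
        # 'custom' tells zeekctl that load balancing is done by the packet source.
        "lb_method=custom",
        f"lb_procs={procs}",
        # All processes sharing one capture device join the same fanout group.
        f"af_packet_fanout_id={fanout_id}",
        "af_packet_fanout_mode=AF_Packet::FANOUT_HASH",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical worker: 4 processes reading from the bonded interface bond0.
    print(render_worker_stanza("worker-1", "localhost", "bond0", 4, 23), end="")
```

FANOUT_HASH here is the kernel's flow-hash mode; the queue-based mode mentioned later in the thread needs matching NIC configuration and is not shown.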

Now let’s address the elephant in the room, shall we.

IPv4 is correctly hashed on relatively modern kernels (I believe RHEL 7.4 has a fix for that) - so you can use the cluster_flow mode.
IPv6 seems to have problems, sometimes - I can see it correctly hashed most of the time (but not always).

What we do in production is let the card hash packets by src + dst IP address (and never ports, because fragments don’t have port numbers), with a symmetric key, offloading disabled, the correct number of queues set, and cluster_qm.
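
To illustrate why hashing on the address pair alone keeps a flow together, here is a small Python sketch. It is not the NIC's actual RSS algorithm (hardware typically uses a Toeplitz hash with a configured key); SHA-256 over the sorted address pair is a stand-in chosen only to show the symmetry property. The function name and the worker count are hypothetical.

```python
import hashlib
import ipaddress


def worker_for(src_ip, dst_ip, n_workers):
    """Map a packet to a worker using only its two IP addresses.

    Sorting the addresses makes the result identical for both directions
    of a flow. Ports are deliberately ignored, so IP fragments (which carry
    no port numbers) land on the same worker as the rest of the flow.
    """
    a = ipaddress.ip_address(src_ip).packed
    b = ipaddress.ip_address(dst_ip).packed
    lo, hi = sorted((a, b))
    digest = hashlib.sha256(lo + hi).digest()
    return int.from_bytes(digest[:4], "big") % n_workers


if __name__ == "__main__":
    n = 8  # hypothetical number of worker processes / NIC queues
    forward = worker_for("10.0.0.1", "192.0.2.7", n)
    reverse = worker_for("192.0.2.7", "10.0.0.1", n)
    print(forward == reverse)  # both directions map to the same worker
    print(worker_for("2001:db8::1", "2001:db8::2", n)
          == worker_for("2001:db8::2", "2001:db8::1", n))
```

The trade-off of leaving ports out is coarser balancing: all traffic between one pair of hosts goes to a single worker.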

If the community is interested I can have an article out in a week - just need to know if there’s someone who wants that?

+1 on the article.

-Justin

+1.

Regards,
C. L. Martinez

+1 on the article.

+1 on the article.

Give me a week, I already started working on it. I’ll be in touch with Amber to post it to the official Zeek blog (and only there).