Bro 2.5 Packet Drop Issue

Hello Everyone,

I am reaching out in the hope that someone can help us with an issue we are having after upgrading Bro from 2.4.1 to 2.5.x.

We have a system with 12 cores (3 GHz), 128 GB RAM, and a 10G NIC (Intel X520-SR2 10GbE dual-port), monitoring between 1.5 and 2.5 Gbps of traffic.

Bro 2.4.1 works great and periodically drops 2-5% of packets when traffic peaks at ~2.5 Gbps. However, when we upgrade to Bro 2.5.3 or 2.5.4 on the exact same system, the drops go up to 90%.

We are using CentOS 7 and have tried installing Bro and PF_RING from both RPM and source without any luck. I wonder if anyone has seen this issue and can give us some clues on how to resolve it.

Bro Node Conf:

[manager]
type=manager
host=localhost

#
[proxy-1]
type=proxy
host=localhost

#
[worker-1]
type=worker
host=localhost
interface=ens1f1
lb_method=pf_ring
lb_procs=11
pin_cpus=1,2,3,4,5,6,7,8,9,10,11

You're missing a logger process. Adding one will make the cluster run better:

[logger]
type=logger
host=localhost
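
For anyone who wants to sanity-check a node.cfg before deploying, here is a minimal Python sketch. It is not part of broctl. The function name check_node_cfg is hypothetical, and the rule that lb_procs should equal the number of pin_cpus entries is my own assumption about what a sensible config looks like, not something broctl enforces.

```python
import configparser

# node.cfg from this thread, with the suggested logger section added.
NODE_CFG = """
[manager]
type=manager
host=localhost

[logger]
type=logger
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
interface=ens1f1
lb_method=pf_ring
lb_procs=11
pin_cpus=1,2,3,4,5,6,7,8,9,10,11
"""


def check_node_cfg(text):
    """Return a list of warnings for a node.cfg passed in as a string.

    Hypothetical helper: the checks are illustrative assumptions only.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    warnings = []

    # Check 1: the cluster should define a logger node.
    types = [cfg[name].get("type", "") for name in cfg.sections()]
    if "logger" not in types:
        warnings.append("no logger node defined")

    # Check 2 (assumption): one pinned CPU per load-balanced worker process.
    for name in cfg.sections():
        node = cfg[name]
        if node.get("type") != "worker" or "lb_procs" not in node:
            continue
        procs = int(node["lb_procs"])
        cpus = [c for c in node.get("pin_cpus", "").split(",") if c.strip()]
        if cpus and len(cpus) != procs:
            warnings.append(f"{name}: lb_procs={procs} but {len(cpus)} pinned CPUs")
    return warnings


# An empty list means neither check raised a warning.
print(check_node_cfg(NODE_CFG))
```

To check a real file, read it with open(path).read() and pass the text to check_node_cfg.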

[root@bro-test ~]# cat /proc/net/pf_ring/info
PF_RING Version : 7.3.0 (unknown)
Total rings : 11

You should have 1, not 11.

Standard (non ZC) Options
Ring slots : 65534
Slot version : 17
Capture TX : No [RX only]
IP Defragment : No
Socket Mode : Standard
Cluster Fragment Queue : 0
Cluster Fragment Discard : 0

It looks like you are hitting the issue where Bro does not actually use pf_ring load balancing when it was installed from RPMs.
What you're effectively doing is running 11 workers that are all receiving 100% of the traffic, so you are doing 11 times the work.
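
To put rough numbers on that, here is a back-of-the-envelope Python sketch using the figures from this thread (2.5 Gbps peak, 11 workers). The function name worker_load is hypothetical, and the even split under working load balancing is an idealization.

```python
def worker_load(link_gbps, workers, balanced):
    """Return (per_worker_gbps, total_processed_gbps).

    balanced=True  : traffic is split across the workers (idealized as even).
    balanced=False : every worker sees a full copy of the traffic.
    """
    if balanced:
        per_worker = link_gbps / workers
        total = link_gbps
    else:
        per_worker = link_gbps
        total = link_gbps * workers
    return per_worker, total


peak_gbps = 2.5   # peak traffic reported in this thread
workers = 11      # lb_procs from node.cfg

per_ok, total_ok = worker_load(peak_gbps, workers, balanced=True)
per_bad, total_bad = worker_load(peak_gbps, workers, balanced=False)

print(f"balanced:   {per_ok:.2f} Gbps per worker, {total_ok:.1f} Gbps in total")
print(f"unbalanced: {per_bad:.2f} Gbps per worker, {total_bad:.1f} Gbps in total")
```

Without working load balancing, each worker has to keep up with the full 2.5 Gbps on its own, and the box as a whole processes 27.5 Gbps worth of packets instead of 2.5.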

You can further confirm that this is the problem you are having by running

  broctl config | grep -i clusterid

and seeing if the id is set to 0:

  pfringclusterid = 0

If so, edit /opt/bro/etc/broctl.cfg and add

  PFRINGClusterID = 11

then run broctl deploy to restart everything.
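
If you want to script that check, for example across several sensors, a minimal Python sketch could look like the following. It assumes the output of broctl config uses the "key = value" form shown above. The function name pfring_cluster_id is hypothetical, and the sample text stands in for real command output.

```python
def pfring_cluster_id(broctl_config_output):
    """Extract pfringclusterid from the text output of `broctl config`.

    Returns the id as an int, or None if the key is not present.
    Assumes one "key = value" pair per line, as shown in this thread.
    """
    for line in broctl_config_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "pfringclusterid":
            return int(value.strip())
    return None


# Sample text standing in for real output. On a sensor you could capture it
# with subprocess.run(["broctl", "config"], capture_output=True, text=True).
sample = "pfringclusterid = 0\n"

cluster_id = pfring_cluster_id(sample)
if cluster_id == 0:
    print("pfringclusterid is 0: set PFRINGClusterID in broctl.cfg")
elif cluster_id is None:
    print("pfringclusterid not found in the output")
else:
    print(f"pfringclusterid is {cluster_id}")
```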

This is already fixed and will not happen in Bro >= 2.6. It just keeps tripping people up on 2.5.x.

You should also look into switching to the native Bro pf_ring plugin or the Bro af_packet plugin, both of which are better choices than the pcap wrapper method.

Thank you so much Justin, the solution worked. We were literally troubleshooting for more than a month and did not find anything online.

Jawad Rajput