Bro Cluster Dropped Packets

Is there any way to determine the cause of dropped packets? I’m running Bro Cluster (2.2) on a single machine with 1 manager, 1 proxy and 10 workers. The total number of workers is much less than the number of CPUs in this machine (system load doesn’t usually get higher than 2, and individual worker processes hover at around 30-40% CPU utilization). The machine has PF_RING and the related Ethernet drivers installed. When I look at netstats, there are always some dropped packets. The occasional dropped packet isn’t usually a cause for concern, but some workers show large numbers of dropped packets. I’d like to know which part of the process is the bottleneck that is causing packets to be dropped.

The documentation mentions that broctl cron logs stats but doesn’t mention where they’re located (didn’t see anything in spool that looked like cluster runtime stats) or how to view the data.

Anyone have any ideas?

Hello MK,

Would you happen to be running PF_RING 5.6.2? If so, you might want to join in on this thread on the ntop-misc list:

To speak more directly to the question you asked, you can certainly look at the stats from ifconfig to see if your card is dropping packets (something I’m seeing with the above issue), and you can also look at the stats in /proc/net/pf_ring/${PID_FROM_EACH_BRO_WORKER}* . I’m not sure where any Bro-specific stats may be kept…
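As a rough sketch of how those per-worker files could be checked in bulk, here is a small Python helper that reads "Key : Value" lines and pulls out two counters. The counter names "Tot Packets" and "Tot Pkt Lost" are an assumption about what your PF_RING version prints, and the function names are made up for illustration, so check a file by hand first and adjust the keys if they differ.

```python
import glob


def parse_pf_ring_stats(text):
    """Turn 'Key : Value' lines from a PF_RING /proc file into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key/value pairs
        stats[key.strip()] = value.strip()
    return stats


def drop_counters(stats, total_key="Tot Packets", lost_key="Tot Pkt Lost"):
    """Return (total, lost) as ints; the key names are assumed, not guaranteed."""
    return int(stats.get(total_key, 0)), int(stats.get(lost_key, 0))


def report(pid, proc_dir="/proc/net/pf_ring"):
    """Print the counters for every ring file belonging to one worker PID."""
    for path in sorted(glob.glob(f"{proc_dir}/{pid}*")):
        with open(path) as fh:
            total, lost = drop_counters(parse_pf_ring_stats(fh.read()))
        print(f"{path}: total={total} lost={lost}")
```

Calling report() with each worker's PID (taken from ps or broctl) would give one line per ring, which makes it easier to see whether the loss is concentrated on a few workers.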



Generally with network monitoring you're going to have some degree of dropped packets even on the most appropriately scaled systems. What you generally want to do is fight to keep the percentage of dropped packets as consistently low as possible. Also, when you're using things like PF_RING that do odd things with NIC buffers, you have to be very wary of the stats reported by the NIC. Even in the best of cases those stats aren't very trustworthy.

What we generally recommend for our users is to run the misc/capture-loss script. You can load it by adding this line to local.bro (and then doing install followed by restart in broctl):

@load misc/capture-loss

This will create a capture-loss.log file that is written to every 15 minutes (by default). It tells you your apparent packet loss, measured by watching for data segments in TCP streams that were acked but never seen. This can be confusing for people at times, because it will also measure traffic loss happening upstream in your network. Here is a blog post where someone had packet loss happening on a network device before the packets were even sent to their box running Bro:
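For reading capture-loss.log itself, a minimal Python sketch along these lines can list the intervals where a worker reported noticeable loss. It assumes the standard tab-separated Bro log layout with a #fields header and the columns peer and percent_lost; the function names and the default threshold are placeholders for illustration, not anything shipped with Bro.

```python
def parse_bro_log(text):
    """Parse a tab-separated Bro log into a list of dicts keyed by the #fields header."""
    fields = []
    rows = []
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # drop the '#fields' token itself
        elif not line or line.startswith("#"):
            continue  # other header/footer lines such as #separator, #types, #close
        else:
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows


def lossy_intervals(rows, threshold=1.0):
    """Return (peer, percent_lost) for rows above the threshold (same units as the log)."""
    result = []
    for row in rows:
        pct = float(row["percent_lost"])
        if pct > threshold:
            result.append((row["peer"], pct))
    return result


if __name__ == "__main__":
    import sys

    # Usage: python capture_loss_report.py /path/to/capture-loss.log
    with open(sys.argv[1]) as fh:
        for peer, pct in lossy_intervals(parse_bro_log(fh.read())):
            print(f"{peer}: {pct}")
```

If the same peers show up repeatedly, that points at those workers or their share of the traffic. If every peer shows similar loss, the upstream network is worth a look as well.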