broctl unable to find peers

I’m running Bro 2.4.1, and ./broctl status hangs on ‘Getting peer status …’. When I run the same command for the manager, any of the proxies, or any individual worker, it completes without issue. Has anybody seen this before?

This is a 5-node cluster (1 manager, 4 sensors) running on Ubuntu 14.04. I am in the process of upgrading to 2.5, but first I’m adding 2 sensor machines (bringing the cluster to 7 nodes) because we sorely need the additional processing power. After the upgrade to 2.5 I will add another node and move the logger function onto it, making it an 8-node cluster.

Here’s an example: ./broctl status gets killed after about 3 1/2 minutes. The per-node commands that follow each return a status successfully, but the Peers column shows “???”.

$ time ./broctl status || time ./broctl status manager;time for proxy in {1..5}; do ./broctl status proxy-${proxy}; done;for svr in {1..4}; do for instance in {1..20}; do ./broctl status worker-${svr}-${instance}; done; done

removing stale lock
Getting process status …
Getting peer status …
Killed

real 3m35.233s
user 0m0.126s
sys 0m0.119s

waiting for lock (owned by PID 22222) …
Getting process status …
Getting peer status …
Name     Type     Host     Status   Pid    Peers  Started
manager  manager  A.B.C.D  running  11111  ???    18 Dec 03:24:38

Jon

You likely have iptables enabled on your hosts and it is preventing broctl from connecting to bro on the workers.

https://www.bro.org/sphinx/components/broctl/README.html#bro-communication
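
One quick way to test this from the manager is to attempt a plain TCP connection to each node's Bro port. The sketch below uses only Python's standard socket module. The function name can_connect is made up for illustration, and the host and port you pass in must come from your own node configuration. A successful connect only shows that nothing blocks the port at the TCP level, not that Bro answers correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts (filtered ports),
        # and host names that do not resolve.
        return False
    sock.close()
    return True
```

Calling it once per node (with each node's host and port) and printing the result gives a rough picture of which nodes are reachable; a call that takes the full timeout before returning False usually points at a silently dropped packet rather than a refused connection.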

I’ve tested with iptables stopped and have the same issue. We do typically run with iptables up but have openings for all the required communication as far as I’m aware. This additional context may be helpful:

$ ./broctl status

Getting process status …
Getting peer status …
Killed
$ Traceback (most recent call last):
  File "", line 1, in
  File "", line 23, in
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
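
The traceback ends in json.loads, which raises this ValueError whenever it is handed text that is not valid JSON, including an empty string. A minimal sketch of that failure mode follows, assuming (the output above does not confirm it) that the helper received nothing to decode because the command was killed first. parse_helper_output is a hypothetical name, not a broctl function.

```python
import json


def parse_helper_output(text):
    """Return the decoded JSON value, or None if text is empty or not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # On Python 3 the error is json.JSONDecodeError, which is a
        # subclass of ValueError, so this handler covers both versions.
        return None


print(parse_helper_output(""))              # empty input cannot be decoded
print(parse_helper_output('{"peers": 3}'))  # valid JSON is decoded normally
```

If that assumption holds, the traceback is a symptom of the earlier hang rather than a separate problem.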

Jon

Are you sure? That's always what this is.

If you run tcpdump at the same time you should see the manager try (and probably fail) to connect to the other nodes.

It's probably working when you do one at a time because only one has to time out instead of all of them.
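
To make that reasoning concrete: if the peers were queried one after another, the worst-case wait would grow with the number of unreachable nodes. The node counts below come from the loops in the original command (1 manager, 5 proxies, 4 x 20 workers). The 10-second per-node timeout is purely a hypothetical figure for illustration, and whether broctl really queries nodes sequentially is an assumption.

```python
def worst_case_wait(num_nodes, per_node_timeout):
    """Upper bound in seconds when every node is tried in turn and times out."""
    return num_nodes * per_node_timeout


# Node counts taken from the loops in the original command.
NUM_NODES = 1 + 5 + 4 * 20  # manager + proxies + workers

# Hypothetical per-node timeout; the real value is not stated in this thread.
TIMEOUT_SECONDS = 10

single = worst_case_wait(1, TIMEOUT_SECONDS)
everything = worst_case_wait(NUM_NODES, TIMEOUT_SECONDS)
print("one node: %d s, all nodes: %d s" % (single, everything))
```

Under these assumptions a single-node query fails fast, while a full status run could wait many minutes before giving up.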

I could be wrong, but I don’t think that’s the issue. tcpdump -nn -i ${interface} "dst net ${Worker_Subnet}/24 and src host ${Manager}" shows plenty of valid traffic between the manager and the cluster members, and everything else in the cluster appears to be functioning normally.

I modified the iptables rules to allow all TCP ports between members of the cluster, restarted iptables, verified the new rules were in effect on all systems, and tested ./broctl status again, but it failed the same way as before.

Jon

What happens if you run "broctl peerstatus"? (after starting
the cluster, of course)

I get a similar failure with broctl peerstatus when the cluster is up. It sits for a few minutes then kills itself.

$ time ./broctl peerstatus
Killed

real 6m48.594s
user 0m0.102s
sys 0m0.111s

I added a logging rule to iptables so that packets are logged right before they would be dropped, but after reviewing the log over a 10-minute period I couldn’t find any dropped traffic from members of my Bro cluster. While logging was on I ran several ./broctl commands, both targeting a single node (./broctl status worker-1-1) and the whole cluster (./broctl status and ./broctl peerstatus).
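
Reviewing such a log by hand is error-prone, so a small filter can help. The sketch below scans log lines for a LOG prefix and reports any whose SRC= address belongs to a cluster member. The prefix "BRO-DROP: " and the function name are hypothetical; only the SRC=/DST=/DPT= field layout is the usual iptables LOG format.

```python
import re

# Key=value fields as written by the iptables LOG target.
FIELD_RE = re.compile(r"\b(SRC|DST|PROTO|DPT)=(\S+)")


def dropped_from_cluster(lines, cluster_ips, prefix="BRO-DROP: "):
    """Yield (src, dst, dport) for logged packets whose source is a cluster member."""
    members = set(cluster_ips)
    for line in lines:
        if prefix not in line:
            continue  # not written by our (hypothetical) LOG rule
        fields = dict(FIELD_RE.findall(line))
        if fields.get("SRC") in members:
            yield fields["SRC"], fields.get("DST"), fields.get("DPT")
```

It accepts any iterable of lines, so an open log file can be passed directly. On Ubuntu the kernel log is often /var/log/kern.log, but that depends on the syslog configuration. An empty result after exercising broctl would support the conclusion that the firewall is not dropping cluster traffic.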

Jon

One simple workaround for the status command being too slow is to
edit your etc/broctl.cfg file and look for the option
"StatusCmdShowAll". Change it to this:

StatusCmdShowAll = 0

However, this doesn't solve the problem of Bro processes
not being able to communicate with each other.

Awesome, thank you.

So, I worked with Justin on IRC and we did find this:

$ ./broctl print foo worker-1-1

worker-1-1 <error: cannot connect to WORKER:47767>

However, when I ran tcpdump on WORKER I saw a clean connection setup, data transfer, and teardown from the manager. I also turned logging on for the manager’s iptables, ran ./broctl status assuming it would hit the manager first, and I didn’t see any DROPs or REJECTs that would be relevant (looking at eth0, 127.0.0.1, and 127.0.1.1).

Per Justin’s suggestion I’m going to look into enabling debugging in broccoli tomorrow.

Jon