Possible Bro Cluster communication issue?

Hello,

Another Bro newbie here. I'm having an odd issue getting my Bro 2.2 (release) cluster working properly. I have two physical hosts: the first runs the manager, proxy, and some workers, and the second runs several workers. After running broctl install and broctl start, the workers spin up on both hosts. However, the workers on host 2 don't seem to be reliably reporting back to the manager or connecting to the proxy.

I confirmed that the processes were running on both hosts and that ssh sessions were established between the two hosts. However, broctl status only showed peers for workers on the same host as the manager and fewer peers than expected for the proxy (about as many as there were workers on host 1), and broctl netstat didn't return any results for the workers on the second host.

At some point during my first run the proxy crashed, and after restarting everything I had the same results minus the proxy crash. Interestingly enough, broctl capstats did return results for both hosts, showing a relatively even workload of about 3Gbps each. Also, I didn't find any logs other than stderr and stdout on the second host in /bro/log or /bro/spool. Any thoughts?

Regards,

Did you check if there's a firewall running on either host?
If so, you could try turning it off temporarily to see if that resolves the problem.

Both hosts are running host-based FWs, but disabling them doesn't appear to make a difference in the behavior. I can ssh between hosts just fine as the bro user with key-based auth, and broctl seems to open an ssh session per worker between the two hosts; those sessions appear to stay established throughout. Does all the communication happen over those ssh sessions, or are there other types of connections between the manager/proxy and the workers?

Actually, it was the firewall. I also had a secondary problem: the proxy was constantly crashing due to a lack of system resources, so it didn't initially appear that disabling the firewall relieved the communication problem. I didn't recall seeing any FW considerations beyond ssh in the documentation, but I did eventually find an external document at https://gist.github.com/grigorescu/3776670, and a quick netstat allowed me to confirm the ports on my hosts. Thanks for the help!
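
For anyone hitting the same symptom, a quick way to tell whether a host firewall is blocking the Bro processes (as opposed to ssh) is to try a plain TCP connection to each listening port from the other host. The following is only a minimal sketch, not part of broctl: the function names are made up for illustration, and the host and port values you pass in are assumptions that should be replaced with whatever netstat reports on your own hosts.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port in `ports` to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, running check_ports("host2", [p1, p2]) on the manager host, with the hostname and ports taken from netstat on the worker host, returns a dict such as {p1: True, p2: False}. A False result for a port that netstat shows as listening suggests something in between, such as a host firewall, is dropping or rejecting the connection.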

Which Linux distro (and which version) are you using? And were
you using the default FW settings? Also, were you able to
determine why the proxy was crashing? If so, how did
you resolve the problem?

We're running RHEL 6.4 (2.6.32-358.6.2.el6.x86_64). We had our own fairly restrictive rule set on the hosts and simply didn't have the ports open, as I didn't see them in the particular documentation I was referencing on the Bro site. I knew from the documentation that Bro needed to be able to SSH between hosts, but I didn't know that the manager, proxy, and workers were also listening on specific ports, or what those ports were.

As for the proxy crashing, I think the issue was simply running too many workers on the same host as the proxy and manager. I started my learning by running a single host with manager/proxy/workers and gradually ramping up the worker count, then added the second host with just workers. So I suspect I just pushed it too far and needed to free up some system resources on that first host running the manager/proxy. Ideally I think I'd like to run the manager and proxy on their own system (as others have suggested). For testing purposes I simply disabled the workers on that host, made sure the proxy didn't crash, and then observed the behavior of the host firewalls to see if they were blocking anything. So it was mostly ignorance and misconfiguration on my part.

Regards,

Gary Faulkner
UW Madison
Office of Campus Information Security
608-262-8591