Hello
I've finished the long process of merging all sensors into one large
cluster. To my surprise, every time I enable all of them and run
"broctl deploy", all workers start, and so do the proxies, but the
manager dies right away.
This cluster has almost 200 workers, 9 servers, and between 8 and 16
proxies (I tried both 8 and 16; it didn't change anything).
I have lots of traffic, lots of connections, lots of everything. My
guess is that the manager can't keep up with the amount of logs it is
expected to generate, and it gives up.
The manager and proxies run on a server dedicated just to them: 64GB RAM,
16 physical cores, and a dedicated network for the cluster traffic.
Now, when I divide the cluster more or less in half (4 nodes enabled,
5 disabled), everything is stable.
What can I do? I'd like to help debug, if that's a bug I'm running into.
Hm. Maybe I should do something about the MySQL traffic.
Here is the amount of logs with 4 sensors enabled (almost exactly an
hour's worth; I'm about 2 minutes from rotation):
total 24G
-rw-rw-r-- 1 bro bro 84K Sep 1 01:57 capture_loss.log
-rw-rw-r-- 1 bro bro 5.6M Sep 1 01:58 communication.log
-rw-rw-r-- 1 bro bro 1.2G Sep 1 01:58 conn.log
-rw-rw-r-- 1 bro bro 476M Sep 1 01:58 conn-noise.log
-rw-rw-r-- 1 bro bro 1.8M Sep 1 01:58 dhcp.log
-rw-rw-r-- 1 bro bro 309M Sep 1 01:58 dns.log
-rw-rw-r-- 1 bro bro 609M Sep 1 01:58 dns-noise.log
-rw-rw-r-- 1 bro bro 115K Sep 1 01:58 dpd.log
-rw-rw-r-- 1 bro bro 1.5G Sep 1 01:58 files.log
-rw-rw-r-- 1 bro bro 1.5G Sep 1 01:58 http.log
-rw-rw-r-- 1 bro bro 65M Sep 1 01:58 http-noise.log
-rw-rw-r-- 1 bro bro 1.6M Sep 1 01:58 intel.log
-rw-rw-r-- 1 bro bro 37K Sep 1 01:54 intel-noise.log
-rw-rw-r-- 1 bro bro 68K Sep 1 01:58 irc.log
-rw-rw-r-- 1 bro bro 7.8M Sep 1 01:58 kerberos.log
-rw-rw-r-- 1 bro bro 566K Sep 1 01:58 known_certs.log
-rw-rw-r-- 1 bro bro 41K Sep 1 01:58 known_devices.log
-rw-rw-r-- 1 bro bro 244K Sep 1 01:58 known_hosts.log
-rw-rw-r-- 1 bro bro 330K Sep 1 01:58 known_services.log
-rw-rw-r-- 1 bro bro 4.9G Sep 1 01:58 mysql.log
-rw-rw-r-- 1 bro bro 636K Sep 1 01:58 notice.log
-rw-rw-r-- 1 bro bro 6.0K Sep 1 01:58 pe.log
-rw-rw-r-- 1 bro bro 559 Sep 1 01:31 reporter.log
-rw-rw-r-- 1 bro bro 168K Sep 1 01:57 sip.log
-rw-rw-r-- 1 bro bro 12M Sep 1 01:58 smtp.log
-rw-rw-r-- 1 bro bro 25M Sep 1 01:58 snmp.log
-rw-rw-r-- 1 bro bro 73M Sep 1 01:58 software.log
-rw-rw-r-- 1 bro bro 3.9M Sep 1 01:58 ssh.log
-rw-rw-r-- 1 bro bro 23K Sep 1 01:57 sslcipherstat_log1.log
-rw-rw-r-- 1 bro bro 766K Sep 1 01:58 sslcipherstat_log2.log
-rw-rw-r-- 1 bro bro 783M Sep 1 01:58 ssl.log
-rw-rw-r-- 1 bro bro 17K Sep 1 01:57 sslprotostat_log1.log
-rw-rw-r-- 1 bro bro 773K Sep 1 01:58 sslprotostat_log2.log
-rw-rw-r-- 1 bro bro 492 Sep 1 00:12 stderr.log
-rw-rw-r-- 1 bro bro 188 Sep 1 00:12 stdout.log
-rw-rw-r-- 1 bro bro 7.8K Sep 1 01:12 subnet.log
-rw-rw-r-- 1 bro bro 3.9G Sep 1 01:58 syslog.log
-rw-rw-r-- 1 bro bro 1.6M Sep 1 01:58 tunnel.log
-rw-rw-r-- 1 bro bro 46M Sep 1 01:58 weird.log
-rw-rw-r-- 1 bro bro 1.6G Sep 1 01:58 x509.log
-rw-rw-r-- 1 bro bro 683 Sep 1 01:31 xss.log
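To see which logs dominate that listing, here is a rough Python sketch. It uses only the larger files copied from the listing above, and the extrapolation to all 9 nodes assumes log volume scales linearly with the number of enabled nodes, which is only a guess. The helper names are made up for this example.

```python
# Sizes copied from the listing above (larger files only; the small ones
# do not change a rough estimate).
LISTING = """\
1.2G conn.log
476M conn-noise.log
309M dns.log
609M dns-noise.log
1.5G files.log
1.5G http.log
4.9G mysql.log
783M ssl.log
3.9G syslog.log
1.6G x509.log
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert an `ls -h` style size such as '4.9G' or '559' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def parse_listing(listing):
    """Map each log file name to its size in bytes."""
    sizes = {}
    for line in listing.splitlines():
        size, name = line.split()
        sizes[name] = parse_size(size)
    return sizes


sizes = parse_listing(LISTING)
total = sum(sizes.values())

# Largest logs first, with their share of the summed volume.
ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
for name, size in ranked:
    print(f"{name:16s} {size / UNITS['G']:6.2f} GiB  {100 * size / total:5.1f}%")

# ASSUMPTION: volume grows linearly from 4 enabled nodes to all 9.
ENABLED_NODES, ALL_NODES = 4, 9
estimate = total * ALL_NODES / ENABLED_NODES
print(f"approx. hourly volume with all nodes: {estimate / UNITS['G']:.1f} GiB")
```

mysql.log and syslog.log come out on top here, so those two would be the first candidates for filtering if log volume on the manager turns out to be the problem.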
The logs from the crash aren't really helpful:
cat post-terminate-2015-09-01-00-12-15-61637-crash/.crash-diag.log
Bro 2.4
Linux 3.19.0-26-generic
==== No reporter.log
==== stderr.log
warning in /opt/bro/share/bro/brozilla/./intel-dns.bro, line 99:
deprecated (join_string_array)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro,
line 176: multiple initializations for index (10.248.75.6)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro,
line 176: multiple initializations for index (10.248.75.7)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro,
line 177: multiple initializations for index (10.248.22.1)
==== stdout.log
max memory size (kbytes, -m) unlimited
data seg size (kbytes, -d) unlimited
virtual memory (kbytes, -v) unlimited
core file size (blocks, -c) unlimited
==== .cmdline
-U .status -p broctl -p broctl-live -p local -p nsmserver1-manager
local.bro broctl base/frameworks/cluster local-manager.bro broctl/auto
==== .env_vars
PATH=/opt/bro/bin:/opt/bro/share/broctl/scripts:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
BROPATH=/nsm/bro/spool/installed-scripts-do-not-touch/site::/nsm/bro/spool/installed-scripts-do-not-touch/auto:/opt/bro/share/bro:/opt/bro/share/bro/policy:/opt/bro/share/bro/site
CLUSTER_NODE=nsmserver1-manager
==== .status
TERMINATED [atexit]
==== No prof.log
==== No packet_filter.log
==== No loaded_scripts.log
bro@nsmserver1:/nsm/bro/spool/tmp$