We’re trying to understand manager memory requirements when the intel framework is in use. We have been seeing multiple manager crashes per day when using the framework on a low-bandwidth (less than 1 Gbps) CentOS 6 machine running a production Bro 2.4.1 cluster. The crashes happen because the manager exhausts its tcmalloc heap limit of 16G, as reported in its stderr.log. We removed the heap limit on an idle (no network traffic) Bro 2.4.1 test system and found that the parent VSize reported by “broctl top manager” went to 27G for an intel input file of 18K unique Intel::DOMAIN items. It remained at 27G after many cycles of replacing the input file with 18K new unique items.
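For anyone who wants to reproduce the test, an input file of that shape could be generated with something like the Python sketch below. It is illustrative only and not the script we ran: the function names, the "test-feed" source label, the example.com domain pattern and the output file name are all hypothetical. It assumes the standard tab-separated intel file layout with the indicator, indicator_type and meta.source columns.

```python
import uuid


def make_intel_lines(count, source="test-feed"):
    """Build intel file lines: one header plus `count` unique domain items.

    The source label and the example.com pattern are placeholders.
    """
    lines = ["#fields\tindicator\tindicator_type\tmeta.source"]
    for _ in range(count):
        # A random UUID label keeps every domain unique across runs.
        domain = "{}.example.com".format(uuid.uuid4().hex)
        lines.append("\t".join([domain, "Intel::DOMAIN", source]))
    return lines


def write_intel_file(path, count=18000):
    """Write (or overwrite) an intel file containing `count` items."""
    with open(path, "w") as f:
        f.write("\n".join(make_intel_lines(count)) + "\n")


if __name__ == "__main__":
    # Hypothetical file name; rerun to replace the file with new items.
    write_intel_file("intel-test.dat")
```

Rerunning it overwrites the file with a completely new set of items, which corresponds to one replacement cycle as described above.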
Restoring the heap limit and attaching gdb to the manager on the test system shows a malloc failure backtrace that comes out of RemoteSerializer::SendCall (). We commented out the conditional that invokes “event Intel::new_item(item)” in base/frameworks/intel/main.bro to disable remote synchronization with the workers, and the huge VSize disappeared.
We then built Bro from master (version 2.5-569) and retested. The manager VSize is much lower, but is still about 15G.
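To compare runs without relying on broctl, the virtual size can also be logged directly from /proc on Linux. The sketch below is a rough illustration and not something we ran: the helper names are hypothetical, and the VmSize field it reads should roughly correspond to the VSize column that “broctl top” shows, though we have not checked that the two match exactly.

```python
def parse_vmsize_kb(status_text):
    """Return the VmSize value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            # The line looks like "VmSize:   28311552 kB".
            return int(line.split()[1])
    return None


def read_vmsize_kb(pid):
    """Read the current virtual size of a running process (Linux only)."""
    with open("/proc/{}/status".format(pid)) as f:
        return parse_vmsize_kb(f.read())
```

Calling read_vmsize_kb with the manager's parent PID after each input file replacement would give a simple growth curve for each Bro version.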
Any advice on how to proceed with further diagnostics to hopefully rein in the manager memory requirements for intel? It doesn’t appear at first blush that upgrading Bro will fix it, at least not entirely, and we’re reluctant to upgrade the production system without fully understanding the problem.