Remember to double check your DNS resolver configuration

I’ve been troubleshooting an issue where all of the workers on a single node would grow in memory until they were OOM-killed. The troubleshooting spanned multiple days, and I only found the cause with some help from Justin and a thread on the issue tracker (https://bro-tracker.atlassian.net/browse/BIT-1482).

Keep in mind that when you are using the MHR script (enabled by default) or the notary script, your Bro workers are performing a LOT of DNS lookups. In my case I was using both. lookup_host_txt and lookup_host never return if the worker node can’t reach a DNS server, so when your DNS resolvers are misconfigured, each new DNS query appears to leave behind a new thread that never goes away.
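For anyone who wants to rule this out quickly, here is a rough Python sketch of a resolver check to run on the worker node. It is not part of Bro: the function names are made up for illustration, and it assumes the resolvers are listed in a resolv.conf-style file and answer plain UDP queries on port 53. It sends one A query for a test name to each nameserver and reports whether any reply came back within the timeout.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID, used to match the reply to our query


def parse_nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":  # skip blanks and comment lines
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def build_query(name, query_id=QUERY_ID):
    """Build a minimal DNS query packet asking for the A record of name."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def resolver_responds(server, name="example.com", timeout=2.0):
    """Return True if server sends back any reply to our query in time."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    return reply[:2] == struct.pack("!H", QUERY_ID)


def check_resolvers(path="/etc/resolv.conf"):
    """Map each configured nameserver to whether it answered."""
    with open(path) as handle:
        servers = parse_nameservers(handle.read())
    return {server: resolver_responds(server) for server in servers}
```

Calling check_resolvers() on the node returns a dict of nameserver address to True/False. A False only means no reply arrived over UDP within the timeout, so treat it as a hint to look closer rather than proof of a misconfiguration.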

Did evidence of this show up in stats.log? There are some fields in there that track how many DNS lookups Bro is actively performing.

  .Seth
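
To check stats.log for this, something like the following Python sketch could pull one column out of a Bro tab-separated log. It relies only on the "#fields" header line that Bro writes at the top of its logs. The helper name is made up, and the column name used in the example (active_dns_requests) is an assumption: check the "#fields" line in your own stats.log for the actual DNS-related field names in your version.

```python
def read_bro_field(log_text, field):
    """Return (ts, peer, value) tuples for one column of a Bro TSV log.

    Column names are taken from the '#fields' header line; value is None
    if the requested field is not present in the log.
    """
    columns = []
    rows = []
    for line in log_text.splitlines():
        if line.startswith("#fields"):
            columns = line.split("\t")[1:]
        elif line and not line.startswith("#") and columns:
            record = dict(zip(columns, line.split("\t")))
            rows.append((record.get("ts"), record.get("peer"), record.get(field)))
    return rows


# Example usage; the field name here is an assumption, see the note above:
# with open("stats.log") as handle:
#     for ts, peer, value in read_bro_field(handle.read(), "active_dns_requests"):
#         print(ts, peer, value)
```

If the lookups really never return, you would expect the active count for the affected workers to keep climbing instead of settling back down.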