Possible memory leak in logger process?

redbaron · December 13, 2021, 6:15am

Hi,

We have a Zeek node that sees high traffic volumes on working days. Due to our internal network configuration, certain endpoints generate a lot of connections to our internal DNS servers (our DNS does not resolve any external domains, and certain applications keep repeating their DNS requests at astronomical rates). The node is a 16-core, 128GB VM and we use the ASCII logger.

We have observed that under high load (~40k writes/s), the logger process starts lagging behind and its memory usage goes up. Once the machine is using >60% of its memory, Zeek starts dropping packets and overall performance degrades. The only solution is to restart the Zeek process.

My understanding is that the logger buffers the unwritten lines in memory, which is why its memory usage goes up.

To work around this, I split the output files so that all connections to the DNS servers and all DNS requests for high-velocity domains are logged to separate files (conn-noise.log and dns-noise.log). These two files account for roughly 80% of the disk usage under the current directory (e.g. after 30 minutes the current directory uses 4.9G, of which these two files use 4.0G). By doing this, I hoped that any lag would be limited to these two files and that I would lose less data on a restart. Also, by using separate threads for the heavily written files, I hoped to get better performance. The idea has partially worked: lag for the other files is generally low now, although we still need to restart Zeek if memory usage goes beyond 55%.
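For illustration, the routing rule described above can be sketched as follows. This is plain Python rather than the Zeek script actually deployed, and the DNS server addresses, the domain list, and the function names are all hypothetical placeholders. It only shows how a record would be assigned to the noise file or the regular file.

```python
# Hypothetical placeholders: the real DNS server addresses and noisy
# domains are not given in this thread.
DNS_SERVERS = {"10.0.0.53", "10.0.1.53"}
NOISY_DOMAINS = {"example.com"}


def is_noisy_domain(query):
    """True if the query is a noisy domain or one of its subdomains."""
    q = query.rstrip(".").lower()
    return any(q == d or q.endswith("." + d) for d in NOISY_DOMAINS)


def conn_log_path(resp_h):
    """Pick the log name for a connection based on its responder address."""
    return "conn-noise" if resp_h in DNS_SERVERS else "conn"


def dns_log_path(query):
    """Pick the log name for a DNS request based on the queried name."""
    return "dns-noise" if is_noisy_domain(query) else "dns"
```

The suffix match is done on a leading dot so that a name such as notexample.com is not mistaken for a subdomain of example.com.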

The problem is that the logger's memory usage does not decrease on its own when the load drops (e.g. at night). For example, if Zeek was using 40G of memory on Friday evening and dns-noise was showing a lag of 1800 seconds, the memory usage on Monday morning is still 40G even though the lag is only around 1 second. Has anyone experienced anything similar? I am running Zeek 4.1.1.
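One way to estimate the write lag of a log file is to compare the timestamp of its newest record with the wall clock. The sketch below is not necessarily how the lag figures above were obtained. It assumes the default ASCII (TSV) format, with header lines starting with `#` and `ts` as the first column in epoch seconds. The function names are made up for this example.

```python
import time


def last_record_ts(lines):
    """Return the ts of the last data line in a Zeek ASCII log, or None."""
    ts = None
    for line in lines:
        # Skip blank lines and the '#'-prefixed header/footer lines.
        if not line.strip() or line.startswith("#"):
            continue
        try:
            # Assumption: ts is the first tab-separated field.
            ts = float(line.split("\t", 1)[0])
        except ValueError:
            continue
    return ts


def write_lag_seconds(lines, now=None):
    """Seconds between the newest logged record and 'now' (wall clock)."""
    ts = last_record_ts(lines)
    if ts is None:
        return None
    if now is None:
        now = time.time()
    return max(0.0, now - ts)
```

It can be called on an open file, e.g. `write_lag_seconds(open("dns-noise.log"))`. This reads the whole file, so for multi-gigabyte logs it would be better to pass only the tail of the file.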

Thanks,
Dheeraj

Tim_Wojtulewicz · December 13, 2021, 5:48pm

We also have another report of the same in https://github.com/zeek/zeek/issues/1856. Is it possible for you to rebuild with jemalloc support and run the jemalloc profiling plugin on your logger node? That should give more information about what’s causing the bloat. We can use that issue to discuss more in depth what’s going on with it, if that’s easier than email.

Tim

redbaron · December 14, 2021, 5:03am

Thanks for the pointer Tim.

I will try to run jemalloc profiling and post back on the GitHub issue.

Dheeraj

Aashish_Sharma1 · December 14, 2021, 5:21am

Dheeraj,

What's the OS on the host running the logger node?

I've seen the same issue of a bloating logger node with tcmalloc on FreeBSD. Mine
crashes after 180+GB, which takes a couple of weeks.

Since last week I have been running with jemalloc and things seem better, but
let's see; I may be speaking too soon here.

(On a side note)

I've been trying jemalloc and hit a few hiccups related to building Zeek with
jemalloc on FreeBSD:

1) A fix for building Zeek + jemalloc + FreeBSD: https://github.com/zeek/zeek/pull/1878

and,

2) A fix for building jemalloc itself on FreeBSD with --enable-profiling

We (Craig Leres) have put out a patch for that as well.

(2) is mostly needed so that I can build Zeek against a jemalloc built with
--enable-profiling and run Justin's zeekctl jemalloc profiler.

Aashish

redbaron · December 14, 2021, 5:38am

Hi Aashish,

The OS is CentOS 7 and we haven’t enabled tcmalloc/jemalloc during the build. All processes (logger, workers, proxy, manager) run on the same machine. We have four more “sensors” with a similar setup but somewhat less traffic, and they do not exhibit this problem.

Dheeraj
