Question on using dir.zeek

arotnemer · June 15, 2023, 12:26pm

I am a new Zeek user. We have an application that continually populates a directory with PCAP files, which are then processed by our forensic tools. We’d like to point Zeek at a directory where it would find its input PCAP files and process them as they come in. The closest I have seen on the Zeek websites is on these pages:

https://github.com/zeek/zeek/blob/master/scripts/base/utils/dir.zeek

@load base/utils/exec
@load base/frameworks/reporter
@load base/utils/paths

module Dir;

export {
	## The default interval this module checks for files in directories when
	## using the :zeek:see:`Dir::monitor` function.
	option polling_interval = 30sec;

	## Register a directory to monitor with a callback that is called
	## every time a previously unseen file is seen.  If a file is deleted
	## and seen to be gone, then the file is available for being seen again
	## in the future.
	##
	## dir: The directory to monitor for files.
	##
	## callback: Callback that gets executed with each file name
	##           that is found.  Filenames are provided with the full path.


and
https://docs.zeek.org/en/master/scripts/base/utils/dir.zeek.html
Are there examples of how I can run Zeek so that it continually monitors, say, the directory “/home/alan/pcaps” and processes pcap files as they are placed there?

Many Thanks,
Alan

Christian · June 16, 2023, 11:39pm

Welcome Alan,

Great question! There’s currently no (or at least no reasonably user-friendly) way to do this fully in Zeek, because processing a new pcap file requires launching a new Zeek process. I suggest you script this outside of Zeek. A few suggestions:

You could use something like the inotifywait tool to inform you when a file has been written to your input directory, and then launch Zeek on that pcap. This has the benefit of allowing you to adjust the invocation each time, for example to create logs in per-pcap output directories.
If you know that the pcaps are chronologically sequential, you could use the same monitoring approach but have Zeek read packets from stdin, continuously writing the new packets into it. This has the benefit of Zeek running continuously, so you can maintain state across your pcaps. You might be able to build this out further with something like mergecap to help protect against out-of-order timestamps.
If in addition to the previous point the timing information in the pcaps does not matter (which I suspect is unlikely), you could replay the arriving pcaps onto a network interface that is watched by a local cluster of Zeek processes, to parallelize the processing.
There’s a Zeek package that provides support to submit pcaps to Zeek over TCP, which depending on your use case might also be helpful.
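As a rough illustration of the first suggestion, here is a minimal Python sketch that polls the input directory instead of using inotifywait and launches one `zeek -r` process per new pcap, each in its own output directory. The function names, the `*.pcap` glob, the "settle" heuristic for skipping files that may still be being written, and the output location are all assumptions made for illustration and are not part of Zeek.

```python
import subprocess
import time
from pathlib import Path


def find_new_pcaps(directory, seen, settle_seconds=5.0):
    """Return unseen *.pcap files that have not been modified recently."""
    now = time.time()
    ready = []
    for path in sorted(Path(directory).glob("*.pcap")):
        if path in seen:
            continue
        # Heuristic (an assumption): a recently modified file may still be
        # mid-write, so leave it for a later polling round.
        if now - path.stat().st_mtime < settle_seconds:
            continue
        ready.append(path)
    return ready


def process_pcap(pcap, out_root, runner=subprocess.run):
    """Run `zeek -r <pcap>` from a per-pcap directory that receives the logs."""
    out_dir = Path(out_root) / pcap.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    # Zeek writes its logs into the current working directory, hence cwd.
    # check=True raises if Zeek exits with a non-zero status.
    runner(["zeek", "-r", str(pcap.resolve())], cwd=out_dir, check=True)
    return out_dir


def watch(directory, out_root, poll_seconds=10.0):
    """Poll forever, launching one Zeek process per newly arrived pcap."""
    seen = set()
    while True:
        for pcap in find_new_pcaps(directory, seen):
            process_pcap(pcap, out_root)
            seen.add(pcap)
        time.sleep(poll_seconds)
```

Calling `watch("/home/alan/pcaps", "/home/alan/zeek-logs")` (the second path is hypothetical) would then write each pcap's logs to a subdirectory named after the pcap. Because every pcap gets a fresh Zeek process, no state is carried across files; that is the trade-off against the stdin approach in the second suggestion.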

It’d be interesting to hear more about your solution, or if none of the above is workable for you. Expanding pcap support in the direction you’re requesting is something we’re currently looking into.

Best,
Christian
