Zeek - Use-Case-Based File Extraction

Hi all,

I’ve recently been working on file carving/extraction based on a few use cases.

Namely:
During a match with the Intel Framework on a FILE_HASH, I want to extract the file.
During a match with the Intel Framework on a DOMAIN and ADDR, I want to extract the file.

See code below.

Yet every time I run it, I get the error message:
Analyzer Files::ANALYZER_EXTRACT not added successfully to file …

This occurs when I try to attach the extraction analyzer from the file_hash event.
Within earlier events such as file_new and file_sniff, however, files can be extracted.
I’d like to hand the hash from the file_hash event over to the Intel framework via Intel::Seen($indicator=hash).

A few questions:

  • Is it possible to extract a file during an Intel::match event?
  • If yes, how would I go about this?
  • Is there a simple way to hand over the hash, originating tx_host and domain to the Intel framework and extract a file after a match?

Looking forward to your reply.

Kind regards,
Bart

{CODE}

@load base/frameworks/intel
@load base/files/extract

## Redefine to the desired extraction path.
global path = "/home/zintern/EXTRACTED/temp/";

##Redefine to desired IoC .dat file
redef Intel::read_files += {fmt("%s/otx.dat", @DIR)};

# When a new file is seen, attach the hash analyzers.
event file_new(f: fa_file)
    {
    Files::add_analyzer(f, Files::ANALYZER_MD5);
    Files::add_analyzer(f, Files::ANALYZER_SHA1);
    Files::add_analyzer(f, Files::ANALYZER_SHA256);
    }

# When a file hash has been computed, hand it to the Intel framework.
event file_hash(f: fa_file, kind: string, hash: string)
    {
    local seen = Intel::Seen($indicator=hash,
                             $indicator_type=Intel::FILE_HASH,
                             $f=f,
                             $where=Files::IN_HASH);

    Intel::seen(seen);
    }

# When a match has been found between the seen traffic and the indicators in otx.dat.
event Intel::match(s: Intel::Seen, items: set[Intel::Item])
    {
    if ( s$indicator_type == Intel::FILE_HASH && s?$f )
        {
        local fname = fmt("%s%s-%s", path, s$f$source, s$f$id);
        Files::add_analyzer(s$f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
        }
    }

Hi Bart,

> A few questions:
> - Is it possible to extract a file during an Intel::match event?
> ...

Usually the match comes too late to attach the file analyzer that handles extraction. Furthermore, in a cluster setup it's triggered on the manager. The simplest way to get files for intel hits is to extract all files and preserve only the ones that triggered a hit (for the poor man's approach, see https://github.com/J-Gras/intel-extensions/blob/master/scripts/preserve_files.bro).
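To illustrate the "extract everything, keep only the hits" idea, here is a minimal sketch. It is not the linked script. The names hit_files, cleanup_grace and cleanup_extracted are made up for this example, and it assumes a standalone (non-cluster) Zeek where Intel::match is raised in the same process that extracts the file. It also assumes hashes are handed to Intel::seen elsewhere, as in the file_hash handler of the original post. Because Intel::match can be queued after file_state_remove, the cleanup is delayed by a grace period rather than done immediately.

```zeek
@load base/frameworks/intel
@load base/files/extract

# Hypothetical state for this sketch: IDs of files that produced an intel hit.
global hit_files: set[string] &create_expire=10min;

# Hypothetical grace period that gives Intel::match time to arrive.
const cleanup_grace = 30sec &redef;

# Hypothetical helper event that decides whether to keep an extracted file.
global cleanup_extracted: event(fuid: string, path: string);

event file_new(f: fa_file)
    {
    # Attach the extraction analyzer up front, while it can still be added.
    local fname = fmt("%s-%s", f$source, f$id);
    Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
    }

event Intel::match(s: Intel::Seen, items: set[Intel::Item])
    {
    # Remember which file triggered the hit.
    if ( s?$f )
        add hit_files[s$f$id];
    else if ( s?$fuid )
        add hit_files[s$fuid];
    }

event file_state_remove(f: fa_file)
    {
    if ( ! f?$info || ! f$info?$extracted )
        return;

    # The extracted file lives under FileExtract::prefix.
    local path = build_path_compressed(FileExtract::prefix, f$info$extracted);
    schedule cleanup_grace { cleanup_extracted(f$id, path) };
    }

event cleanup_extracted(fuid: string, path: string)
    {
    if ( fuid in hit_files )
        delete hit_files[fuid];  # Hit: keep the file on disk.
    else
        unlink(path);            # No hit: remove the extracted file.
    }
```

In a cluster the hit would have to be relayed from the manager back to the worker that holds the file, which this sketch does not attempt.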

Jan

Hi Jan,

Thank you for the clarification!
I should’ve known that a file cannot be extracted after its hash has been calculated.
Calculating the hash in the first place requires analysing the file in its entirety.
That means by the time the hash is available, the analysis is likely at the end of the data stream.

The partial solution of extracting first and verifying later might be overkill on a network where thousands of files are downloaded.
Restricting extraction to particular protocols, such as HTTP only, would reduce the computational load.
I’ll have to try your suggested method, thank you for the link!

I was wondering whether the use case of extracting after an intel hit on Intel::DOMAIN and Intel::ADDR might still work.
My assumption here is that the time between the file_new event and Intel::match might be small enough not to make a difference,
as long as Intel::seen is called immediately during the file_new event (this might cause some data loss).
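One way this could be approached is to remember connections that already produced an ADDR or DOMAIN hit and attach the extraction analyzer to any file that later shows up on them. The sketch below is untested and rests on several assumptions: a standalone (non-cluster) setup, the stock "seen" policy scripts feeding addresses and domains into the Intel framework, and the hit arriving before the file starts. The set flagged_conns is a name invented for this example. A hit raised in the same packet that starts the file (for example from an HTTP Host header) may still arrive too late.

```zeek
@load base/frameworks/intel
@load base/files/extract
@load frameworks/intel/seen

# Hypothetical state for this sketch: uids of connections with an ADDR/DOMAIN hit.
global flagged_conns: set[string] &create_expire=1hr;

event Intel::match(s: Intel::Seen, items: set[Intel::Item])
    {
    if ( s$indicator_type != Intel::ADDR && s$indicator_type != Intel::DOMAIN )
        return;

    # Remember the connection on which the indicator was seen.
    if ( s?$conn )
        add flagged_conns[s$conn$uid];
    else if ( s?$uid )
        add flagged_conns[s$uid];
    }

event file_over_new_connection(f: fa_file, c: connection, is_orig: bool)
    {
    if ( c$uid !in flagged_conns )
        return;

    # The file is just starting, so the extraction analyzer can still be added.
    local fname = fmt("%s-%s", f$source, f$id);
    Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
    }
```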

I have one more question, if you or anyone else has time:

  • I’d like to compare the tx_hosts seen for a file with Intel::ADDR indicators. How would I go about this, given that tx_hosts is a set? (I’m still learning Bro.)

Kind regards,
Bart
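For the tx_hosts question, a minimal sketch of one possible approach: loop over the set and pass each address to Intel::seen through the $host field, which the Intel framework treats as an Intel::ADDR indicator. The choice of the file_over_new_connection event and of Intel::IN_ANYWHERE as the location are assumptions made for this example. The handler runs at default priority, on the assumption that the base files framework has already filled in tx_hosts for this connection by then.

```zeek
@load base/frameworks/intel
@load base/frameworks/files

event file_over_new_connection(f: fa_file, c: connection, is_orig: bool)
    {
    if ( ! f?$info )
        return;

    # tx_hosts is a set[addr]; a for loop visits each member in turn.
    for ( host in f$info$tx_hosts )
        Intel::seen([$host=host, $f=f, $conn=c, $where=Intel::IN_ANYWHERE]);
    }
```

Any resulting Intel::match then carries s$f, so the file that the address belongs to can be identified in the match handler.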