Hello again,
I was hoping to get some guidance on how to best use Bro to process email files. My end goal is to strip out inbound email attachments, identify the file type, then run a distinct set of external tools against them. Each file type would have a different set or order of tools.
I will without a doubt eventually incorporate “http-ext-identified-files.sig” instead of what I am currently using, but I am having trouble determining where to integrate the logic for handling each file type. As it currently works, I am saving off every PDF and Word doc, which would be unnecessary if I used Bro to call the external tools and evaluate the results.
Current logic (with this method, the external tools are run against the dump directories by cron, independently of Bro):
# if the hot flag is set, dump the MIME-decoded attachment to its own file for analysis
if ( session$entity_is_hot )
	{
	if ( session$entity_filename == hot_pdf_attachment_filenames )
		{
		# build the filename out of MD5, length and filename
		hot_attachment_dumpname = fmt("dumped_pdf_files/%s:%d:%s", session$content_hash, length, session$entity_filename);
		}
	if ( session$entity_filename == hot_word_attachment_filenames )
		{
		hot_attachment_dumpname = fmt("dumped_doc_files/%s:%d:%s", session$content_hash, length, session$entity_filename);
		}
	# get a raw filehandle (note open() instead of open_log_file()), write the data out, and be sure to close the fh
	hot_attachment_dump_fh = open( hot_attachment_dumpname );
	write_file(hot_attachment_dump_fh, data);
	close(hot_attachment_dump_fh);
	}
What I would like to be able to do:
if ( session$entity_filename == hot_pdf_attachment_filenames )
	{
	hot_attachment_dumpname = fmt("dumped_pdf_files/%d:%s", length, session$entity_filename);
	hot_attachment_dump_fh = open( hot_attachment_dumpname );
	write_file(hot_attachment_dump_fh, data);
	close(hot_attachment_dump_fh); # close the fh so the file is complete before it is scanned
	scan_pdf_file(hot_attachment_dumpname); # call the external tools
	}
scan_pdf_file would call something like this:
scanpdf.py (which would run clamscan, pdfid.py, cymruMHR, ssdeep, etc.). The PDF Python script would pass the results back to Bro for handling:
if ( result == bad )
	{
	alert
	}
else
	{
	delete file, carry on or log results somewhere then delete file
	}
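To make the scanpdf.py idea a bit more concrete, a wrapper along these lines is roughly what is meant. This is only a sketch under assumptions: the function name scan_file, the PDF_TOOLS table, and the "ok"/"bad"/"error" verdict strings are all made up for illustration, and the only tool behavior it relies on is clamscan's exit-code convention (0 = clean, 1 = infected). Tools such as pdfid.py or ssdeep report through their output rather than their exit code, so they would need their own parsing and are left out here.

```python
import subprocess
import sys

# Hypothetical tool table: (label, argv prefix). The file path is appended
# to each argv. Only clamscan is listed because its exit codes carry the
# verdict (0 = clean, 1 = infected); other tools would need output parsing.
PDF_TOOLS = [
    ("clamscan", ["clamscan", "--no-summary"]),
]

def scan_file(path, tools=PDF_TOOLS, timeout=60):
    """Run each tool against path and return (verdict, details).

    verdict is "bad" if any tool exits with 1, "error" if a tool could not
    be run or exited with another non-zero code, and "ok" otherwise.
    """
    verdict = "ok"
    details = []
    for label, argv in tools:
        try:
            proc = subprocess.run(argv + [path], capture_output=True,
                                  text=True, timeout=timeout)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Tool missing, not executable, or hung: record it and move on.
            details.append((label, None, str(exc)))
            if verdict == "ok":
                verdict = "error"
            continue
        details.append((label, proc.returncode, proc.stdout.strip()))
        if proc.returncode == 1:
            verdict = "bad"      # a "bad" verdict always wins
        elif proc.returncode != 0 and verdict == "ok":
            verdict = "error"
    return verdict, details
```

The script could then print the verdict (or encode it in its own exit status) so that whatever launches it can decide between raising an alert and deleting the file. How that result gets back into Bro is exactly the open question here.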
The scan for Office docs would be similar, but use ‘OfficeMalScanner’ instead of pdfid.py and pdf-parser.py. If I get this to work, I would like to do something very similar with HTTP files.
How can I call the external tools? Is this the right place to be doing this?
I read in Robin’s ‘Advanced Scripting’ presentation from the 2009 workshop about injecting external information, but I am still confused about how to do the alternative.
I would be surprised if this capability doesn’t already exist, and I suppose I might be going about this all wrong. I would just prefer to incorporate the file scans in Bro rather than running them completely independently. If I wasn’t clear or am completely out in left field, feel free to be honest. I won’t be offended.
Thanks in advance!
Will