An assist with file extraction

Hey all,

The topic pretty much says it. I’ve done a fair amount of reading trying to determine the best way to extract file attachments from SMTP traffic, but most of the information I’ve found relates to older versions of Bro. Can someone point me to a resource that will work with the current version of Bro? Thank you.

James

For 2.3.2 (current release) you’ll want to use the event file_new.

Note that in 2.3.2 if you are extracting based on mime_type (most people do) you will want to verify that the field exists before you actually use it.

For master, which is what you are likely referring to… you’ll want the event file_mime_type.

Well here’s what I have:

global ext_map: table[string] of string = {
	["application/x-dosexec"] = "exe",
	["application/zip"] = "zip",
	["application/msword"] = "doc",
};

event file_new(f: fa_file)
	{
	# Only look at files carried over SMTP.
	if ( f$source != "SMTP" )
		return;

	# mime_type is optional in 2.3.2, so check that it exists before using it.
	if ( ! f?$mime_type || f$mime_type !in ext_map )
		return;

	local ext = ext_map[f$mime_type];

	local fname = fmt("%s-%s.%s", f$source, f$id, ext);
	Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
	}

This appears to function OK. Office documents in the XML format end up as zips, which is fine by me. Can anyone see anything glaringly wrong with this? Also, I have Bro log files zipped and rotated at midnight. Is there a way to include the extract_files directory in that rotation or, even better, have the extracted files go into a directory named something like /mnt/backup/extract_files/04-16-16 that changes per day? Thank you.

James

This appears to function ok....Office doc XML format end up as zips, which is fine by me.

This will be fixed in 2.4. New XML Office files will be identified as:

application/vnd.openxmlformats-officedocument.presentationml.presentation
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.openxmlformats-officedocument.wordprocessingml.document
and...
application/vnd.openxmlformats-officedocument in case a better option wasn’t discovered. And, yes, those are the *actual* mime types for MS Office documents.

Also...I have bro log files zipped and rotated at midnight..is there a way to include the extract_files directory in that rotation, or, even better, have the extracted files go into a directory name with say something like /mnt/backup/extract_files/04-16-16 and change per day?

Please feel free to file a ticket. That would be a nice trick. :)
  http://tracker.bro.org

  .Seth

I will file...looks like I'll have to 'roll my own' for the archiving. Thank you.

James

And one last bit: could I theoretically redef the extract_files prefix?

share/bro/base/files/extract/main.bro: const prefix = "./extract_files/" &redef;

I could always symlink that directory to a different drive but eh....the more I can shove into the script the better. Thanks again.

James
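
For reference, since the constant quoted above is declared with &redef, overriding it from a site script such as local.bro should be enough. A minimal sketch, assuming the constant is exported from the FileExtract module (as in the stock base/files/extract/main.bro) and using the /mnt/backup/extract_files path mentioned earlier in the thread purely as a placeholder:

```bro
# Assumption: prefix lives in the FileExtract module, as in the stock
# base/files/extract/main.bro shipped with 2.3.x.
# The path is only an example; point it at whatever directory you want
# extracted files written to. The trailing slash mirrors the default value.
redef FileExtract::prefix = "/mnt/backup/extract_files/";
```

This only changes the base directory once at startup. It does not give you a per-day directory, so the daily rotation would still need something outside the script.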