extract jar files from HTTP stream

Hello,

Is there a tutorial for version 2.2 of Bro? I’d like to understand how I can write my own scripts to support extraction of various file types, such as jar. So far I have tried adding “application/jar” (it was logged to /nsm/bro/logs/current/files.log) as a MIME type to the /opt/bro/share/bro/file-extraction/extract.bro file, but it seems I have to do something else too, as this change is not capturing files to the /nsm/bro/extracted/ directory.

I guess it’s not as simple as that :wink: I should also mention that I am using Bro as installed within the Security Onion distro. I posted this question there too, but got a reply with links for version 2.1, which is not compatible with 2.2.

Hello,

Check the second example under 'Adding Analysis' for a start in file
extraction.
http://www.bro.org/sphinx/frameworks/file-analysis.html#adding-analysis

Also, not sure how it maps to Sec Onion, but there is
$PREFIX/share/bro/base/files/extract/main.bro from a source install.
Might be your 'extract.bro'? I don't see that file name in either the 2.1 or 2.2
source trees.

I used the code below to do something similar. There's probably a more
elegant or efficient solution, but it seems to be working as expected,
given the limited testing I've done.

# define file extraction filters
const match_file_source = /HTTP/ |
              /IRC/ |
              /IRC_DATA/ |
              /FTP/ |
              /FTP_DATA/ &redef;

const match_file_mime = /text\/x-perl/ |
              /text\/x-msdos-batch/ |
              /text\/x-java/ |
              /application\/x-gzip/ |
              /application\/x-bzip2/ |
              /application\/x-dosexec/ |
              /application\/zip/ |
              /application\/jar/ |
              /application\/x-tar/ |
              /application\/x-archive/ |
              /application\/mac-binhex40/ |
              /application\/x-java-keystore/ |
              /application\/x-java-jce-keystore/ |
              /application\/x-executable/ |
              /application\/javascript/ &redef;

# add analyzer to file_new event
event file_new(f: fa_file)
    {
    if ( f?$mime_type &&
         match_file_source in f$source &&
         match_file_mime in f$mime_type )
        Files::add_analyzer(f, Files::ANALYZER_EXTRACT);
    }

Thanks,

Shane

As an aside, you might also add application/zip to your file extract
(Shane has it in their list) as JAR files are also Zip files.

Thanks Shane. Could you please write step-by-step instructions on where I should put your code? I have no idea how to port it into my installation. Let’s assume I use ‘plain’ Bro (no Security Onion) installed in /opt/bro. What is the next step? BTW, I have both of these files (extract.bro and main.bro):

root@onion:~# ls -al /opt/bro/share/bro/base/files/extract/main.bro /opt/bro/share/bro/file-extraction/extract.bro
-rw-r--r-- 1 root root 2126 Nov 7 18:27 /opt/bro/share/bro/base/files/extract/main.bro
-rw-r--r-- 1 root root 572 Jan 1 12:26 /opt/bro/share/bro/file-extraction/extract.bro

Hi drum,

Start off with the following:

- edit /opt/bro/share/bro/file-extraction/extract.bro

- change the following line:
    if ( ! f?$mime_type || f$mime_type != "application/x-dosexec" )
to:
    if ( ! f?$mime_type || f$mime_type != "application/jar" )

- run the following:
sudo broctl install
sudo broctl restart

Bro should now be extracting jar files to /nsm/bro/extracted/.

Once you have that working, then you should be able to add in Shane's
match_file_mime to the same script to allow you to extract multiple
file types.

Thank you Doug, that worked. Actually, I ended up with the following (ugly) syntax:

root@onion:~# cat /opt/bro/share/bro/file-extraction/extract.bro
global ext_map: table[string] of string = {
    ["application/x-dosexec"] = "exe",
    ["text/plain"] = "txt",
    ["image/jpeg"] = "jpg",
    ["image/png"] = "png",
    ["text/html"] = "html",
} &default = "";

event file_new(f: fa_file)
    {
    #if ( ! f?$mime_type || f$mime_type != "application/x-dosexec" )
    if ( ! f?$mime_type || f$mime_type != "application/jar" )
        return;

    local ext = "";

    if ( f?$mime_type )
        ext = ext_map[f$mime_type];

    local fname = fmt("/nsm/bro/extracted/%s-%s.%s", f$source, f$id, ext);
    Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
    }

# define file extraction filters
const match_file_source = /HTTP/ |
              /IRC/ |
              /IRC_DATA/ |
              /FTP/ |
              /FTP_DATA/ &redef;

const match_file_mime = /text\/x-perl/ |
              /text\/x-msdos-batch/ |
              /text\/x-java/ |
              /application\/x-gzip/ |
              /application\/x-bzip2/ |
              /application\/x-dosexec/ |
              /application\/zip/ |
              /application\/jar/ |
              /application\/x-tar/ |
              /application\/x-archive/ |
              /application\/mac-binhex40/ |
              /application\/x-java-keystore/ |
              /application\/x-java-jce-keystore/ |
              /application\/x-executable/ |
              /application\/javascript/ &redef;

# add analyzer to file_new event
event file_new(f: fa_file)
    {
    if ( f?$mime_type &&
         match_file_source in f$source &&
         match_file_mime in f$mime_type )
        Files::add_analyzer(f, Files::ANALYZER_EXTRACT);
    }

and I bet it can be written better.
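
One possible tidier version, as a minimal sketch rather than a tested script: it keeps a single file_new handler, filters on a shortened copy of Shane's match_file_mime pattern, and adds ext_map entries for jar and zip so that extracted jar files get an extension. The jar and zip entries and the shortened pattern are assumptions made for illustration; everything else is taken from the file above.

```bro
# Sketch only: one handler instead of two.
# The "jar" and "zip" entries below are assumed additions, not part of the stock file.
global ext_map: table[string] of string = {
    ["application/x-dosexec"] = "exe",
    ["application/jar"] = "jar",
    ["application/zip"] = "zip",
} &default = "";

# Shortened copy of Shane's MIME filter; extend it as needed.
const match_file_mime = /application\/x-dosexec/ |
              /application\/zip/ |
              /application\/jar/ &redef;

event file_new(f: fa_file)
    {
    # Skip files with no detected MIME type or a type outside the filter.
    if ( ! f?$mime_type || ! ( match_file_mime in f$mime_type ) )
        return;

    # Types missing from ext_map fall back to the empty default extension.
    local ext = ext_map[f$mime_type];
    local fname = fmt("/nsm/bro/extracted/%s-%s.%s", f$source, f$id, ext);
    Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
    }
```

As with the earlier edit, this would presumably need sudo broctl install and sudo broctl restart before it takes effect.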

During this exercise I noticed that /nsm/bro/logs/current/files.log was not present. I found this via Google: https://groups.google.com/forum/#!topic/security-onion/r4eZWOegvsY and followed the suggestions. Indeed, the /nsm/bro/logs/current/communication.log file contained:

1388589086.005591 manager child - - - error can't bind to 0.0.0.0:47761, Address already in use

I used the lsof command to check which process was holding the port:

root@onion:/nsm/bro/logs/current# lsof -i:47761
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bro 12253 root 0u IPv4 300348 0t0 TCP *:47761 (LISTEN)
bro 12253 root 1u IPv6 300349 0t0 TCP *:47761 (LISTEN)

so I killed it. After running service nsm restart, everything seems to be working again (logs and file extraction; BTW, jar files are stored without the “jar” extension). But I can still see errors in communication.log:

root@onion:~# cat /nsm/bro/logs/current/communication.log |grep Address
1388589202.005024 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589204.006373 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589235.000845 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589233.001513 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589264.004692 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589266.005739 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589297.004983 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589295.005424 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589328.004598 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589326.005488 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589359.004987 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589357.004749 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589390.004760 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589388.004887 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589419.005759 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589421.005335 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589450.004988 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589452.005818 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589481.001524 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589483.001843 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589512.004547 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589514.004785 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589543.005441 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589545.004584 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589574.005125 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589576.005318 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589605.005628 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589607.004816 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589636.005317 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589638.005756 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589667.005455 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589669.005977 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589700.006115 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589698.004967 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589729.000811 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589731.012333 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589760.005435 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589762.005389 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589791.004834 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589793.005790 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589824.005289 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589822.004770 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use
1388589855.005452 onion-eth1-1 child - - - error can't bind to 0.0.0.0:47763, Address already in use
1388589853.006436 proxy child - - - error can't bind to 0.0.0.0:47762, Address already in use

Should I be worried about these errors? I mean, file extraction is working fine, but maybe some other service is not?

Check that your IP address is correct in /opt/bro/etc/node.cfg and
then run the following:
sudo broctl install
sudo reboot

If you continue to have issues, please start a new thread on the
Security Onion mailing list and we can troubleshoot further there.

Thanks,
Doug

Is there a way to name the extracted files based on what IP or domain name they originated from? In the event file_new(f: fa_file) section, I’m not able to access anything from f$conns to use for such naming. That would make the extracted files much more useful.

What do you mean that you aren't able to access anything from f$conns?

Generally, giving extracted files names like that is complicated because the file handling in Bro is separated from everything else. There are some strategic points where they tie together, but generally you have to be careful.

Since the best way is probably through concrete examples, I'll give one and we'll see if it sticks. If you want to name extracted files like HTTP_1.2.3.4:12345-5.6.7.8:80.resp.dat you can do this…

  https://gist.github.com/sethhall/8221401

This will only extract files over HTTP with these special file names. You can modify that script if you want it to behave differently.

One thing people ask a lot is whether you can extract files and name them by their SHA1 or MD5 hash. Generally this is possible, but it would need to be done once the file is completely extracted: you don't know the file hash at the beginning of the file, yet you need to give a filename to start writing the file into. In the normal case you would extract the file and then move it to its new filename (hopefully on the same file system).

Actually, I'll do one more small modification to the script to show you how to add the domain…

  https://gist.github.com/sethhall/8221692

One final thing to notice is that I've made both of these scripts only use the "special" filename for cases where a file is being received over HTTP. If the client sends data over HTTP or another protocol things will revert to the default filename.

There are also some small considerations being ignored in this example like single files transferred over multiple connections (which is possible in Bro).

  .Seth
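
For readers who cannot open the gists, the connection-based naming described above might look roughly like the following. This is a minimal sketch and not the contents of either gist; it assumes that Bro 2.2's fa_file record exposes is_orig and conns as optional fields, and it simply uses the first connection found in f$conns.

```bro
# Sketch only, not the code from the linked gists.
# Assumption: fa_file has optional is_orig and conns fields (Bro 2.2).
event file_new(f: fa_file)
    {
    # Only handle files received by the client over HTTP.
    if ( f$source != "HTTP" || ! f?$conns || ! f?$is_orig || f$is_orig )
        return;

    for ( cid in f$conns )
        {
        # Produces names in the HTTP_1.2.3.4:12345-5.6.7.8:80.resp.dat style.
        local fname = fmt("%s_%s:%d-%s:%d.resp.dat", f$source,
                          cid$orig_h, cid$orig_p, cid$resp_h, cid$resp_p);
        Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);

        # Use the first connection only; a file carried over several
        # connections is not handled specially in this sketch.
        break;
        }
    }
```

Unlike the behaviour described for the gists, this handler does not fall back to a default filename: files from other sources, or files sent by the client, are simply not extracted by it.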