Hello, everyone.
I'm new to Bro. I'm using the FAF (File Analysis Framework) to extract files of certain types from traffic to disk for further analysis.
But I have run into a problem that I can't explain:
- the file Bro extracts is one byte bigger than my original file
- or the file Bro extracts is the same size as my original file, but its MD5 value is different
Below are my test env, test steps and test results:
My test env
bro version:
- bro version 2.5-156
OS (32C 64G):
- CentOS Linux release 7.3.1611 (Core)
CPU model:
- Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
- CPU(s): 32
- CPU MHz: 2334.445
NIC:
- 03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
My test Bro script
event file_sniff(f: fa_file, meta: fa_metadata)
    {
    print "file sniff event by Myth";
    if ( meta?$mime_type )#&& hook FileExtraction::extract(f, meta) )
        {
        if ( meta$mime_type in mime_to_ext )
            {
            local fext = mime_to_ext[meta$mime_type];
            if ( fext == "txt" )
                {
                #print "txt";
                if ( f$source != "SMTP" )
                    {
                    #print "NOT SMTP";
                    return;
                    }
                }
            }
        else
            return;
        #fext = split_string(meta$mime_type, /\//)[1];
        local fname = fmt("%s%s-%s.%s", path, f$source, f$id, fext);
        # file path
        #print fname;
        Files::add_analyzer(f, Files::ANALYZER_MD5);
        Files::add_analyzer(f, Files::ANALYZER_SHA1);
        Files::add_analyzer(f, Files::ANALYZER_SHA256);
        Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname]);
        }
    }
My test steps
- generate the test file
[root@sensor ~]# dd if=/dev/urandom of=test.for.bro.txt bs=1024 count=512
[root@sensor ~]# tar -cvzf test.for.bro.tar.gz test.for.bro.txt
- original file size and MD5 value
[root@sensor ~]# ls -lt test.for.bro.tar.gz
-rw-r--r-- 1 root root 524608 Aug 7 13:59 test.for.bro.tar.gz
[root@sensor ~]# md5sum test.for.bro.tar.gz
6e755b5c0a7754c7066ca6db5f0f90ba test.for.bro.tar.gz
- start a test web server using Python
[root@sensor ~]# python -m SimpleHTTPServer 8998 > ws.log 2>&1
- start bro
[root@sensor myth]# /usr/local/bro/bin/bro -i eno1 -C bro-scripts/tophant.entrypoint.bro > myth.log 2>&1
- use ab to make lots of HTTP requests for the test file from another machine
[root@localhost ~]# ab -n 2000 -c 4 'http://10.0.81.54:8998/test.for.bro.tar.gz'
- results (after all requests are done)
5.1 number of requests processed by the web server
[root@sensor ~]# cat ws.log | grep test.for.bro | wc -l
2000
5.2 bro file_sniff event count
[root@sensor myth]# cat myth.log | grep "file sniff event by Myth" | wc -l
976
5.3 count of files extracted by Bro
[root@sensor sensor_files_by_myth]# ls | wc -l
973
5.4 count of files whose size differs from the original:
[root@sensor sensor_files_by_myth]# ls -lt | grep -v 524608 | wc -l
193
5.5 count of files with the same size as the original:
[root@sensor sensor_files_by_myth]# ls -lt | grep 524608 | wc -l
780
5.6 count of files with the same MD5 value as the original:
[root@sensor sensor_files_by_myth]# ls -lt | awk '{print $NF}' | xargs md5sum | grep 6e755b5c0a7754c7066ca6db5f0f90ba | wc -l
19
5.7 count of distinct MD5 values among files with the same size but a different MD5 (!!! NOTICE: every one of these files has a different MD5)
[root@sensor sensor_files_by_myth]# ls -lt | grep 524608 | awk '{print $NF}' | xargs md5sum | grep -v 6e755b5c0a7754c7066ca6db5f0f90ba | awk '{print $1}' | sort | uniq -c | wc -l
761
5.8 size distribution of the extracted files:
[root@sensor sensor_files_by_myth]# ls -lt | awk '{print $5}' | sort -rn | uniq -c
136 524609 <<<<<<<<<<<<<<< this is one byte bigger than my original test file !!!
780 524608
3 523990
3 522542
8 521094
1 520208
1 519646
2 518198
1 515302
1 513854
1 512968
1 512406
1 510958
1 509510
2 503718
1 502176
1 501384
1 497926
1 490296
1 488808
1 487040
1 486342
1 480550
1 473310
1 467518
1 464622
1 458830
1 453038
1 442902
1 441454
1 396566
1 382408
1 377742
1 358918
1 354574
1 318240
1 283312
1 263350
1 256110
1 250318
1 234952
1 189502
1 164886
1 79454
2 2710
1
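In case it helps anyone reproduce the counts in 5.4 to 5.7 without grepping `ls` output, here is a minimal sketch of the same comparison as two bash functions. The names `classify_extracted` and `first_diff_offset` are hypothetical helpers made up for this post, not part of Bro; they rely only on `wc`, `md5sum`, `cmp` and `awk`. The second function prints where an extracted file first diverges from the original, which might show whether the corruption starts at the same offset each time.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Bro): compare extracted files with the original.

# Usage: classify_extracted ORIGINAL_FILE EXTRACT_DIR
# Prints one summary line with four counters.
classify_extracted() {
    local orig=$1 dir=$2
    local orig_size orig_md5 size md5 f
    local identical=0 same_size_bad_md5=0 larger=0 smaller=0

    orig_size=$(( $(wc -c < "$orig") ))
    orig_md5=$(md5sum < "$orig" | awk '{print $1}')

    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip non-files and an unmatched glob
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$orig_size" ]; then
            larger=$((larger + 1))
        elif [ "$size" -lt "$orig_size" ]; then
            smaller=$((smaller + 1))
        else
            # Same size: only now is it worth hashing the file.
            md5=$(md5sum < "$f" | awk '{print $1}')
            if [ "$md5" = "$orig_md5" ]; then
                identical=$((identical + 1))
            else
                same_size_bad_md5=$((same_size_bad_md5 + 1))
            fi
        fi
    done

    echo "identical=$identical same_size_different_md5=$same_size_bad_md5 larger=$larger smaller=$smaller"
}

# Usage: first_diff_offset FILE_A FILE_B
# Prints the 1-based offset of the first differing byte, or nothing if
# no byte differs within the common length of the two files.
first_diff_offset() {
    cmp -l "$1" "$2" 2>/dev/null | awk 'NR == 1 { print $1; exit }'
}

# Example call, using the paths from my test (the extracted file name is a placeholder):
#   classify_extracted ~/test.for.bro.tar.gz ./sensor_files_by_myth
#   first_diff_offset ~/test.for.bro.tar.gz ./sensor_files_by_myth/SOME_EXTRACTED_FILE
```

Comparing sizes numerically avoids accidental matches that `grep 524608` could make on other columns of the `ls -lt` output.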
Thanks for reading this far. I hope someone can help me with this.
Myth