I am relatively new to Bro, so please excuse me if I missed the obvious solution.
I want to extract files downloaded via HTTP from a pcap file, but the files I download are never extracted completely.
They seem to be truncated at ~1 MB. My Bro script is quite simple:
Are there any other events I have to catch to get the complete file?
When I download a test file from [1] with a size of 3521964 bytes, only 960204 bytes are extracted. I checked with
Wireshark and tcpflow that the download was completely captured in the pcap.
I tested with Bro 2.3.2 and the current dev version from git.
Could you show me the files.log entry and the associated conn.log entry?
Your request for the logs is a valid one; I should have sent them in my initial mail.
I was also wondering why the correct size is in the logs. If data were missing, I would
at least have expected a warning or some missing_bytes.
I hope the logs are readable inline in the mail; attachments seem to be filtered.
Thanks for your mail. I will have a look at the examples. Regarding your hint about the extraction process:
I still doubt that the root of the problem lies here, because other tools successfully extract the
files from the same pcap.
In files.log, the value of total_bytes is just taken from the HTTP Content-Length header. Since the value of seen_bytes is less than total_bytes, you can suspect Bro didn’t see the full file for some reason. Do you have a weird.log containing any obvious clues? Else, I may need the original pcap to understand what went wrong.
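The seen_bytes versus total_bytes comparison described above can be automated. The following is only a sketch: it assumes the default tab-separated Bro log layout with a "#fields" header line, and the function name incomplete_files is made up for illustration.

```shell
# Hypothetical helper: list files.log entries where Bro saw fewer bytes
# than the total announced (for HTTP, the Content-Length header).
incomplete_files() {
  awk -F'\t' '
    # The "#fields" header names the columns; remember each position.
    $1 == "#fields" { for (i = 2; i <= NF; i++) col[$i] = i - 1; next }
    # Skip the remaining header and footer lines.
    /^#/ { next }
    # "-" marks an unset value, so compare only when total_bytes is set.
    $col["total_bytes"] != "-" && $col["seen_bytes"] + 0 < $col["total_bytes"] + 0 {
      print $col["fuid"], $col["seen_bytes"], $col["total_bytes"]
    }
  ' "$1"
}
```

Called as `incomplete_files files.log`, it prints the fuid, seen_bytes and total_bytes of every entry that looks truncated. It only reports the mismatch; the cause still has to be found elsewhere, for example in weird.log.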
The weird.log states some “above_hole_data_without_any_acks”, but why does it work with tcpflow?
Gathered the pcap: tcpdump -s0 -i eth0 -w download.pcap port http
checked if the file was completely captured with tcpflow:
tcpflow -FT -e http -r download.pcap
md5sums do match:
~/bro-liste$ md5sum 2015-04-01T07:45:00Z080.249.099.148.00080-192.168.002.103.42716-HTTPBODY-001.zip
~/bro-liste$ /usr/local/bro/bin/bro -r download.pcap extract.bro
1427874309.892545 warning in /usr/local/bro/share/bro/base/misc/find-checksum-offloading.bro, line 54: Your trace file likely has invalid TCP checksums, most likely from NIC checksum offloading.
You’ll have to address this problem to get the results you expect. See:
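A minimal sketch of one way to address it, assuming the invalid checksums really do come from NIC checksum offloading on the capturing host and the packets are otherwise intact: Bro discards packets with bad checksums by default, and the -C command-line option tells it to skip that validation. The paths and file names are the ones used earlier in this thread.

```shell
# -C (long form: --no-checksums) makes Bro ignore checksums, so packets
# whose checksums were left unfilled by offloading are not discarded.
/usr/local/bro/bin/bro -C -r download.pcap extract.bro
```

The same effect should be achievable from within a script by redefining the ignore_checksums option to T. Either way, this hides genuinely corrupted packets as well, so it is best reserved for traces where offloading is the known cause.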
The weird.log states some “above_hole_data_without_any_acks”
In this case, this seems to be just a side effect of the bad checksums, but in case you’re interested in how that type of situation can affect file extraction in Bro, there’s a discussion of how/why here: