Bad news: although it took just a simple modification to a copy of
"demux_conn()", I couldn't get it to work when writing to a single file
using the CONTENTS_BOTH flag.
Ah - I realized the key problem, which is that CONTENTS_BOTH is not in
fact a valid parameter for set_contents_file. The way that contents are
extracted from streams, it simply can't work. (The definition is lying
around because it's used internal to the event engine in a slightly
different context.)
Is there some reason why you want to have both directions in a single file?
If so, then the way to do it is by defining a tcp_contents handler that
writes out the contents directly to a file:
> Is there some reason why you want to have both directions in a single
> file?
The initial reason was twofold: to reduce the number of files, and to be
compatible with a prototype that had been developed. Clearly neither is
critical, and if the two-file approach is faster or more efficient, that
should be enough to trump the single-file approach, given the importance
of speed.
> Ah - I realized the key problem, which is that CONTENTS_BOTH is not in
> fact a valid parameter for set_contents_file. The way that contents are
> extracted from streams, it simply can't work. (The definition is lying
> around because it's used internal to the event engine in a slightly
> different context.)
I really don't want anyone spending more time on this, but:
a. do you think I just didn't look hard enough at the data for a single file
(i.e., it wasn't working when I thought it was),
or
b. would the single-file approach work "most of the time", or all of the
time, if a connection (such as HTTP_Conn or SMTP_Conn) added a contents
processor derived from TCP_ContentLine? That is, it would work if the
TCP_Connection's BuildEndpoints() method did something functionally
equivalent to:
    orig->AddContentsProcessor( new TCP_ContentLine(orig,1,0,1));
    resp->AddContentsProcessor( new TCP_ContentLine(resp,0,0,1));
Obviously "most of the time" isn't good enough, but it would explain why a
cursory check of the output looked valid.
Once again, even if the single file did work, if the two-file approach has a
measurable speed advantage we'll probably make the necessary changes to our
protocol handlers/applications.
> though this won't easily do the right thing in the presence of packet
> loss/retransmission.
In fact, tcp_contents won't be affected by packet loss/retransmission, and it always delivers contents in the order of TCP sequence numbers, because it is called after TCP reassembly in TCP_Contents::DeliverBlock(). However:
1) There can be content gaps if some packets are not captured by Bro. Gaps are reported by the content_gap event, but you can also detect them by comparing the <seq> parameter and the length of <contents> across successive tcp_contents events.
2) Also, if the connection is "skipped", the content afterwards won't reach tcp_contents. Some analyzers, e.g. Netbios/SSN, will automatically skip after seeing a content gap; the corresponding built-in function is:
    function skip_further_processing%(cid: conn_id%): bool
The same also applies to "TCP content files".
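
To make point 1) concrete, here is a minimal Python sketch (not Bro script, and not taken from the Bro sources) of detecting gaps from <seq> and the length of <contents>. It assumes that <seq> is the sequence number of the first byte of each chunk, that the chunks for one direction arrive in sequence order as described above, and that sequence-number wraparound can be ignored. The name find_gaps and the sample data are hypothetical.

```python
def find_gaps(chunks):
    """Return (start_seq, missing_bytes) for each hole between chunks.

    chunks: iterable of (seq, contents) pairs for ONE direction of a
    connection, already in sequence order (assumption, see lead-in).
    """
    gaps = []
    expected = None  # sequence number at which the next chunk should start
    for seq, contents in chunks:
        # A chunk starting beyond the expected offset means bytes are missing.
        if expected is not None and seq > expected:
            gaps.append((expected, seq - expected))
        expected = seq + len(contents)
    return gaps


# Hypothetical deliveries: the third chunk starts 8 bytes later than expected.
chunks = [(1, b"GET / HT"), (9, b"TP/1.1\r\n"), (25, b"Host: x\r\n")]
gaps = find_gaps(chunks)
```

Each direction needs its own call, since originator and responder have independent sequence spaces.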