Daily report and Byte Transfer Pairs

I have noticed that the Byte Transfer Pair information in the daily
report is sometimes wrong: the Local Bytes and Remote Bytes values can
be off by a very large factor from the actual traffic size.

These errors are artifacts of Bro's use of TCP sequence numbers to compute
connection sizes. An estimate can be far too large when a connection
carries malformed sequence numbers (especially in RST packets), or too
small when Bro misses the beginning or end of a connection and hence
doesn't compute a size at all. Missing an endpoint is in fact much more
likely for big (and thus long-lived) connections than for small ones.
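
To make both failure modes concrete, here is a minimal Python sketch of
sizing one direction of a connection from its sequence numbers. It is an
illustration only, not Bro's actual code: the function name
seq_size_estimate and its arguments are hypothetical, and it assumes the
size is taken as the difference between the SYN's sequence number and the
sequence number of the closing FIN or RST.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def seq_size_estimate(start_seq, end_seq):
    """Estimate payload bytes in one direction of a TCP connection.

    start_seq: sequence number of the SYN, or None if it was not seen.
    end_seq:   sequence number of the closing FIN or RST, or None if
               it was not seen.
    Returns None when either endpoint of the connection was missed.
    """
    if start_seq is None or end_seq is None:
        return None
    # The SYN consumes one sequence number, so payload starts at
    # start_seq + 1. The modulo handles sequence-number wraparound.
    return (end_seq - start_seq - 1) % SEQ_SPACE


# Well-formed connection: 500 payload bytes after an initial seq of 1000.
print(seq_size_estimate(1000, 1501))

# Same connection closed by an RST carrying a bogus sequence number of 0:
# the difference wraps around to nearly 4 GB.
print(seq_size_estimate(1000, 0))

# Beginning of the connection missed: no size can be computed.
print(seq_size_estimate(None, 1501))
```

A single bad sequence number in the closing packet is enough to turn a
500-byte connection into an apparent multi-gigabyte one, while a missed
SYN or FIN drops the connection from the byte totals entirely.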

We have a draft paper on incorporating random sampling into Bro's analysis.
This allows it to make more accurate estimates of traffic volume and also
the sizes of individual connections. One part of this is already available
in the Bro distribution using large-conns.bro. Another part (that does
overall traffic profiling) has not yet been integrated.
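
As a rough illustration of why sampling helps with volume estimates, the
Python sketch below uses generic independent packet sampling with inverse
probability scaling. This is a textbook technique chosen for illustration;
it is not taken from the draft paper or from large-conns.bro, and the
function name estimate_total_bytes and its parameters are hypothetical.

```python
import random


def estimate_total_bytes(packet_sizes, p, rng=None):
    """Estimate total traffic volume from a random sample of packets.

    packet_sizes: iterable of per-packet byte counts.
    p:            probability of sampling each packet, 0 < p <= 1.
    rng:          optional random.Random instance, for repeatable runs.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = rng or random.Random()
    # Keep each packet independently with probability p.
    sampled = sum(size for size in packet_sizes if rng.random() < p)
    # Scale the sampled bytes back up by 1/p to estimate the total.
    return sampled / p


# Synthetic packet sizes between 40 and 1500 bytes (made up for the demo).
sizes = [40 + (i * 37) % 1461 for i in range(100000)]
estimate = estimate_total_bytes(sizes, 0.1, random.Random(1))
print("actual bytes:   ", sum(sizes))
print("estimated bytes:", round(estimate))
```

Because the estimate is built from the sizes of the packets actually
seen rather than from a pair of sequence numbers, one malformed packet
or one missed connection endpoint cannot distort it by a large factor.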