Do you understand where that increase is coming from? Is it indeed
because Bro is doing additional reassembly work now? In other words,
it's not overhead incurred on traffic that doesn't require reassembly?
Roughly: the increase of “default_file_bof_buffer_size” from 1024 to 4096 bytes is significant. That affects all file analysis, not just files that need reassembling. This setting controls how much data is copied into a buffer for use with MIME type signature matching. IIRC, signature matching is a large portion of file analysis cost.
Average timings for 5 runs of `time bro -r ipv6.trace local "Site::local_nets={192.168.0.0/16}"`:
bro/master, default_file_bof_buffer_size=4096
avg real is 9.9484 seconds
avg sys is 0.718 seconds
avg user is 11.3786 seconds
bro/master, default_file_bof_buffer_size=1024
avg real is 9.356 seconds
avg sys is 0.6782 seconds
avg user is 10.9312 seconds
bro/6f2b8cb, default_file_bof_buffer_size=4096
avg real is 10.018 seconds
avg sys is 0.691 seconds
avg user is 11.4358 seconds
bro/6f2b8cb, default_file_bof_buffer_size=1024
avg real is 9.4856 seconds
avg sys is 0.7148 seconds
avg user is 11.1298 seconds
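For anyone who wants to repeat this kind of measurement, here is a minimal Python sketch of averaging real/user/sys time over several runs of a command. It is not necessarily how the numbers above were collected; the function name `average_timings` is made up for illustration, and the `resource` module it relies on is Unix-only.

```python
import resource
import subprocess
import time


def average_timings(cmd, runs=5):
    """Run cmd `runs` times and return mean real/user/sys seconds.

    Hypothetical helper for illustration; not part of Bro.
    """
    totals = {"real": 0.0, "user": 0.0, "sys": 0.0}
    for _ in range(runs):
        # CPU time accumulated so far by already-finished child processes.
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        totals["real"] += time.perf_counter() - start
        # The difference is the CPU time consumed by this run's child.
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        totals["user"] += after.ru_utime - before.ru_utime
        totals["sys"] += after.ru_stime - before.ru_stime
    return {name: total / runs for name, total in totals.items()}
```

It could be called as `average_timings(["bro", "-r", "ipv6.trace", "local", "Site::local_nets={192.168.0.0/16}"])`, once per build and buffer size.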
Interesting that for the same default_file_bof_buffer_size, the new version of Bro w/ file reassembly is actually better.
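To put numbers on that comparison, here is a small Python sketch that computes the relative change in average real time from the figures reported above. The values are copied from the timings; the helper name `pct_change` is just for illustration.

```python
# Average wall-clock ("real") seconds copied from the timings above,
# keyed by (build, default_file_bof_buffer_size).
REAL_SECONDS = {
    ("bro/master", 4096): 9.9484,
    ("bro/master", 1024): 9.356,
    ("bro/6f2b8cb", 4096): 10.018,
    ("bro/6f2b8cb", 1024): 9.4856,
}


def pct_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Effect of the larger buffer within each build.
for build in ("bro/master", "bro/6f2b8cb"):
    delta = pct_change(REAL_SECONDS[(build, 1024)], REAL_SECONDS[(build, 4096)])
    print(f"{build}: 1024 -> 4096 changes real time by {delta:+.1f}%")

# Effect of the build at a fixed buffer size.
for size in (4096, 1024):
    delta = pct_change(
        REAL_SECONDS[("bro/6f2b8cb", size)], REAL_SECONDS[("bro/master", size)]
    )
    print(f"buffer {size}: bro/master vs bro/6f2b8cb is {delta:+.1f}%")
```

On these averages the larger buffer costs roughly 6% of real time in either build, while bro/master is about 1% faster than bro/6f2b8cb at the same buffer size. With only 5 runs each and no variance reported, the smaller difference should be read with some caution.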
I suspect it’s because I modified the low level handling of files. The flow of chunks when they first enter the file analysis framework is quite different now.
I can confirm that: if I switch back to 1024, things actually get
faster than before for me, too. That is great: not only do we
understand what happened, but we actually improved things.