I have been experiencing hash misses, so to speak, with PE files because
seen_bytes falls short of total_bytes. Is this an indication of a
performance problem, where the sensor is overwhelmed and therefore cannot
parse the entire file?
E.g. I have a file that's 300832 bytes, for which seen_bytes consistently
matches total_bytes and a hash is then provided. Another file has
774200 total_bytes, but its seen_bytes usually falls short of
total_bytes (sometimes it does match).
Those numbers can be really tricky. If a protocol indicates how much data it's going to transfer or how big the file is, Bro will know the total_bytes. There are a number of cases where total_bytes isn't even known. It's also possible that Bro is tracking files that aren't even being transferred in their entirety. Over SMB, you will very frequently see portions of files transferred where Bro never even had an opportunity to see the whole file.
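To make the cases above concrete, here is a minimal Python sketch that sorts a files log record into one of three buckets. It assumes the record has already been read into a dict keyed by field name, with values as the raw strings from the log and "-" marking an unset field. The function name and the bucket labels are made up for illustration; they are not part of Bro.

```python
def classify_file(record):
    """Classify one files.log record (dict of field name -> raw string).

    Returns "total_unknown" when the protocol never announced a size,
    "partial" when fewer bytes were seen than announced, and
    "complete" otherwise. The labels are illustrative only.
    """
    seen = int(record["seen_bytes"])
    total = record.get("total_bytes", "-")
    if total == "-":
        # Bro leaves total_bytes unset when the size was never announced.
        return "total_unknown"
    if seen < int(total):
        return "partial"
    return "complete"
```

With the sizes from the question, a record with seen_bytes and total_bytes both at 300832 would come back as "complete", while a 774200-byte file with a smaller seen_bytes would come back as "partial".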
What may help next is to look at the conn log for the connections where you are seeing files transferred and check whether missed_bytes on that connection is greater than zero. That would tell you whether there was any packet loss in the connection, which could also cause bizarre behavior like what you're describing.
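That check can be sketched in Python roughly as follows. This assumes the default tab-separated log format with a "#fields" header line, and relies on the conn_uids and fuid fields in the files log and the uid and missed_bytes fields in the conn log. The helper names are hypothetical, and the sketch takes the log contents as strings rather than reading any particular path.

```python
def parse_bro_log(text):
    """Parse tab-separated Bro log text into a list of dicts.

    Field names come from the "#fields" header line; all other
    lines starting with "#" are treated as metadata and skipped.
    """
    fields, rows = [], []
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows


def to_count(value):
    """Convert a count field to int, treating unset ("-") as 0."""
    return 0 if value in ("-", "") else int(value)


def files_with_loss(files_text, conn_text):
    """Return (fuid, uid, missed_bytes) for files on lossy connections."""
    missed = {r["uid"]: to_count(r["missed_bytes"])
              for r in parse_bro_log(conn_text)}
    hits = []
    for f in parse_bro_log(files_text):
        # conn_uids is a set, written as a comma-separated list.
        for uid in f["conn_uids"].split(","):
            if missed.get(uid, 0) > 0:
                hits.append((f["fuid"], uid, missed[uid]))
    return hits
```

It could be called as files_with_loss(open("files.log").read(), open("conn.log").read()); any file that shows up in the result was carried over a connection where some bytes were missed.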
If you could provide a conn log entry and a files log entry where you are seeing the problem, that would be the fastest way to figure out what's happening (please just mask out IP addresses).