Hi,
Sorry if my questions have already been answered but it would be really helpful if anyone can provide information on the following.
1. Does Bro's capture_loss indicate that packets mirrored from a switch's SPAN/TAP port to the server running Bro are being dropped somewhere upstream in the mirroring process?
In our particular setting, "broctl netstats" reports zero packet drops, but capture_loss reports more than 40% loss. Does that imply that the server running Bro is not dropping any packets and that the packets are being dropped upstream? Bidirectional traffic is sent to the server running Bro using SPAN ports.
2. Is there a document that explains in detail how capture loss is computed?
It says "Reported loss is computed in terms of the number of “gap events” (ACKs for a sequence number that’s above a gap)."
What exactly is a gap event and how is the function call "get_gap_stats()" defined? The code in "capture-loss.bro" does not explain how acks and gaps can be used to estimate capture loss. Any detailed documentation would be useful.
Bro simply counts TCP ACKs for packets that it did not see in the first place. If it saw the ACK but not the original packet, there was capture loss.
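To make the counting concrete, here is a toy model in Python of one direction of a single TCP connection. It is only a sketch of the idea described above, not Bro's actual reassembler: the packet tuples, the function names, and the assumption that the reported percentage is the ratio of gap events to ACK events over the measurement interval are all illustrative.

```python
def count_ack_and_gap_events(packets):
    """Toy model: count ACKs, and ACKs that cover bytes the monitor never saw.

    packets is a list of ("data", seq, length) or ("ack", ack_no) tuples,
    in the order the monitor observed them (hypothetical format).
    """
    next_expected = 0   # first byte of the stream we have not yet seen
    pending = {}        # out-of-order segments: start -> end
    acks = gaps = 0

    def advance():
        # Consume buffered segments that are now contiguous with the stream.
        nonlocal next_expected
        for start in sorted(pending):
            if start > next_expected:
                break
            next_expected = max(next_expected, pending.pop(start))

    for pkt in packets:
        if pkt[0] == "data":
            _, seq, length = pkt
            pending[seq] = max(pending.get(seq, 0), seq + length)
            advance()
        else:
            _, ack_no = pkt
            acks += 1
            if ack_no > next_expected:
                # The receiver acknowledged data the monitor did not capture.
                gaps += 1
                next_expected = ack_no
                advance()
    return acks, gaps


def loss_percent(acks, gaps):
    """Assumed formula: gap events as a percentage of ACK events."""
    return 100.0 * gaps / acks if acks else 0.0


# Bytes 100-199 never reach the monitor, but the receiver ACKs them.
observed = [("data", 0, 100), ("ack", 100),
            ("data", 200, 100), ("ack", 200), ("ack", 300)]
acks, gaps = count_ack_and_gap_events(observed)
print(acks, gaps, round(loss_percent(acks, gaps), 1))
```

Note that a gap event only says the monitor missed the packet; it says nothing about whether the loss happened on the sensor or upstream of it.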
Thanks and regards,
Sourav Maji
Capture loss by itself is kind of a useless metric. When it's zero, that's great, but any number above a very small percentage just tells you there is a problem somewhere, not where it is.
It's kind of like a "Check engine" light.
You need to figure out where your loss is coming from. Analyzing the "missed_bytes" column in the conn.log will help.
If you install bro-doctor (ncsa/bro-doctor on GitHub) and run it:
bro-pkg install ncsa/bro-doctor
broctl doctor.bro
the "Checking what percentage of recent tcp connections show loss" section in the output will tell you what percentage of your recent connections is seeing loss.
The number of connections seeing loss can often be a better metric than the overall loss count itself. If that is also 40% then you are missing a lot of traffic. If it's 1%, you have a small number of broken connections.
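If you want a rough version of that per-connection number without bro-doctor, something like the following Python sketch can compute it straight from conn.log. It assumes the default tab-separated log format with a "#fields" header line and the standard "proto" and "missed_bytes" columns. The function name is made up for illustration, and this is not how bro-doctor itself does the check.

```python
def tcp_loss_by_connection(lines):
    """Return (tcp_conns, conns_with_loss, percent) from conn.log lines.

    Assumes Bro's default TSV log format: a '#fields' header naming the
    columns, tab-separated rows, and '-' for unset values.
    """
    fields = None
    total = lossy = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#") or fields is None:
            continue  # skip other header/footer lines and blanks
        row = dict(zip(fields, line.split("\t")))
        if row.get("proto") != "tcp":
            continue
        total += 1
        missed = row.get("missed_bytes", "0")
        if missed not in ("-", "") and int(missed) > 0:
            lossy += 1
    percent = 100.0 * lossy / total if total else 0.0
    return total, lossy, percent
```

To try it on a real log: with open("conn.log") as f: print(tcp_loss_by_connection(f)). A result near the overall capture_loss figure suggests loss spread across most traffic, while a much smaller one points at a few broken connections.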
A really good test (that I still haven't figured out how to add to bro-doctor) is to run something like this from somewhere on your network:
for x in $(seq 1 9); do echo -e 'GET / HTTP/1.1\r\nHost: www.bro.org\r\n\r\n' | socat - tcp-connect:www.bro.org:80,sp=2000$x,reuseaddr; sleep 1; done
Then see what bro logged using
cat conn.log | bro-cut id.orig_h id.orig_p id.resp_h id.resp_p orig_pkts resp_pkts missed_bytes history | fgrep 192.150.187.43
You should see 9 almost identical lines like
141.142.148.70 20001 192.150.187.43 80 6 4 0 ShADFadf
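Eyeballing nine lines is easy enough, but if you repeat the test often, a small script can do the comparison. This Python sketch assumes the eight-column layout of the sample line above (orig_h, orig_p, resp_h, resp_p, orig_pkts, resp_pkts, missed_bytes, history, with no timestamp column). The function name and the exact checks are illustrative choices, not part of bro-doctor.

```python
def check_repeat_test(lines, expected=9):
    """Return a list of problems found in the logged test connections."""
    problems = []
    rows = [ln.split() for ln in lines if ln.strip()]
    if len(rows) != expected:
        problems.append(f"expected {expected} connections, saw {len(rows)}")
    shapes = set()
    for row in rows:
        if len(row) < 8:
            problems.append(f"unparseable line: {' '.join(row)}")
            continue
        orig_p, orig_pkts, resp_pkts, missed, history = (
            row[1], row[4], row[5], row[6], row[7])
        if int(missed) != 0:
            problems.append(f"source port {orig_p}: missed_bytes={missed}")
        shapes.add((orig_pkts, resp_pkts, history))
    if len(shapes) > 1:
        # Identical requests should produce near-identical log entries.
        problems.append(f"connections differ: {sorted(shapes)}")
    return problems


good = [f"141.142.148.70 2000{x} 192.150.187.43 80 6 4 0 ShADFadf"
        for x in range(1, 10)]
print(check_repeat_test(good))
```

Missing lines mean whole connections were never seen, and differing packet counts or histories mean some connections were only partially captured.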