I don’t know how relevant this solution is to the case presented, but it might be worth a try. We run a Bro cluster of 4 worker nodes and a manager. We recently started seeing a lot of capture loss (>60%)
and tried some tuning of the interfaces:
We turned off Tx and Rx checksum offloading on the NIC, which reduced the lag between packets being captured by the interface and packets being processed by Bro.
Bro also validates checksums itself by default, so turning this off on the interface should not create any security issues.
$ sudo ethtool -K em1 rx off
large-receive-offload: off [requested on]
$ sudo ethtool -K em1 tx off
tx-tcp-segmentation: off [requested on]
tx-tcp6-segmentation: off [requested on]
$ sudo ethtool -K em1 sg off
generic-segmentation-offload: off [requested on]
$ sudo ethtool -K em1 tso off
$ sudo ethtool -K em1 gso off
$ sudo ethtool -K em1 gro off
This reduced the capture loss to below 1%, and the cluster has not seen any notable capture loss to date.
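
In case it helps, the same commands can be wrapped in a small script so they are easy to re-apply, since as far as I know settings made with ethtool -K do not survive a reboot. This is only a sketch: the names IFACE, DRY_RUN and disable_offloads are made up for illustration, em1 is our interface name and yours may differ, and the script defaults to just printing the commands. Run it as root with DRY_RUN=0 to actually apply them.

```sh
#!/bin/sh
# Sketch: disable the offload features listed above on one interface.
# IFACE, DRY_RUN and disable_offloads are hypothetical names for this example.
IFACE="${IFACE:-em1}"      # assumed interface name; change to match your NIC
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = run them

disable_offloads() {
    # $1 is the interface; the feature list mirrors the commands in this post
    for feature in rx tx sg tso gso gro; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ethtool -K $1 $feature off"
        else
            ethtool -K "$1" "$feature" off
        fi
    done
}

disable_offloads "$IFACE"

# When actually applying, show the resulting feature state for a manual check
if [ "$DRY_RUN" != "1" ]; then
    ethtool -k "$IFACE"
fi
```

Some drivers mark certain features as fixed and will not change them, so it is worth checking the ethtool -k output afterwards rather than assuming every setting took effect.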