No notice.log after switch upgrade/downgrade

Hello,

We are having some issues with the BRO cluster here @SLAC. I am kind of a noob with respect to BRO and the setup we have @SLAC. Please excuse me and my ignorance.

We have a Cisco 3k switch running in tap aggregation mode, and it also load-balances traffic to the BRO cluster. We tried to upgrade the switch to a newer NX-OS version, but we ran into problems and had to revert to the original version with the exact same configuration.

However, no notice.log has been generated since the upgrade/downgrade incident. In Splunk, the BRO traffic event counts have decreased 1/7th after the incident. I am sure I am missing something after the upgrade/downgrade, but I am unable to figure out what.

One of my colleagues suggested it might be related to an asymmetric flow of forward and reverse packets to the worker nodes, which would explain why BRO is failing to analyze the traffic. So I checked whether the load-balance symmetry command is configured on the switch, and it is. I also ran tcpdump on a bro-worker node, and the traffic I see there is communication with the bro-manager node.

I am planning to involve Cisco support tomorrow, and to capture traffic from the switchport to the Bro worker node to see if I can figure out what’s going on.

Any thoughts?

Thanks,

Andy

Sounds like you are on the right track. You can tell from conn.log entries whether you are getting asymmetric flow distribution. Instead of seeing a single connection between a and b with both orig_pkts and resp_pkts and a history like ShADadFf, you'll see two connections:

one from a to b with orig_pkts and no resp_pkts with a history of SAD.. (SAD IS BAD)
one from b to a with resp_pkts and no orig_pkts with a history of had..
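
A rough sketch of that check in Python, in case it helps. It assumes the default tab-separated conn.log with a "#fields" header line (not JSON output); the function names are made up for illustration. It counts TCP connections that have packets in both directions, only from the originator, or only from the responder.

```python
from collections import Counter


def read_bro_log(lines):
    """Yield one dict per record of a tab-separated Bro log.

    Column names are taken from the "#fields" header line, so the
    column order does not need to be hard-coded.
    """
    fields = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#") and fields is not None:
            yield dict(zip(fields, line.split("\t")))


def packet_count(value):
    """Convert a packet-count column to int; unset values ("-") count as 0."""
    return int(value) if value.isdigit() else 0


def summarize_directions(lines):
    """Count TCP connections by which side(s) had any packets logged."""
    counts = Counter()
    for record in read_bro_log(lines):
        if record.get("proto") != "tcp":
            continue
        orig = packet_count(record.get("orig_pkts", "-"))
        resp = packet_count(record.get("resp_pkts", "-"))
        if orig and resp:
            counts["both"] += 1
        elif orig:
            counts["orig_only"] += 1
        elif resp:
            counts["resp_only"] += 1
        else:
            counts["empty"] += 1
    return counts
```

Something like summarize_directions(open("conn.log")) gives the three totals. Some orig_only entries are normal (unanswered SYNs, for instance), so it is the proportion, and a comparison against logs from before the incident, that is interesting.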

What I would do is check for this, then reproduce it using tcpdump directly. That way you can take the evidence to Cisco and they can't blame Bro.
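
One possible way to do the tcpdump side, sketched in Python. It assumes you feed it the text output of something like "tcpdump -nn -r capture.pcap tcp" for IPv4 traffic captured on a single worker interface; the regular expression and function names are my own and only handle the usual "IP src.port > dst.port: Flags [...]" line layout.

```python
import re
from collections import defaultdict

# Matches tcpdump -nn text lines for TCP over IPv4, e.g.
# "12:00:00.000001 IP 10.0.0.1.40000 > 10.0.0.2.80: Flags [S], ..."
# Requiring "Flags [" keeps non-TCP lines (ICMP, UDP) from matching.
TCP_LINE = re.compile(r"\bIP (\S+)\.(\d+) > (\S+)\.(\d+): Flags \[")


def direction_counts(lines):
    """Map each TCP flow to the number of packets sent by each endpoint.

    A flow is keyed by its two (address, port) endpoints in sorted
    order, so both directions of a connection share one key.
    """
    flows = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = TCP_LINE.search(line)
        if match is None:
            continue
        src = (match.group(1), int(match.group(2)))
        dst = (match.group(3), int(match.group(4)))
        key = tuple(sorted((src, dst)))
        flows[key][src] += 1
    return flows


def one_sided_flows(flows):
    """Return the flows in which only one endpoint was ever seen sending."""
    return sorted(key for key, senders in flows.items() if len(senders) == 1)
```

If a large share of the flows seen on one worker come out one-sided, and especially flows with many packets rather than lone SYNs, that points at the load balancing rather than at Bro.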

Thanks Justin for your help.

I looked at conn.log and sorted the connections by increasing duration, i.e., with the longest connections at the end.
Here’s the snippet of connections at the end of that sort:

I do see that the packet counts in orig_pkts and resp_pkts are not equal, but symmetry does not seem to be broken, since conn.log records packets in both directions for these connections.

However, there are many small connections logged which do not have resp_pkts.
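
To get a better feel for those, one option is to tally conn_state and history for the TCP connections that have zero resp_pkts. Below is a minimal Python sketch, again assuming the default tab-separated conn.log with a "#fields" header; the function name is hypothetical. My understanding is that a bare "S" history (a SYN that never got an answer) is common for scans and unreachable hosts, while histories with originator A or D flags and no lowercase responder letters look more like the SAD pattern described earlier in the thread.

```python
from collections import Counter


def tally_unanswered(lines, top=10):
    """Tally (conn_state, history) for TCP connections with resp_pkts == 0.

    Returns the `top` most common pairs with their counts.
    """
    fields = None
    tally = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Column names come from the log's own header line.
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#") or fields is None:
            continue
        record = dict(zip(fields, line.split("\t")))
        if record.get("proto") != "tcp":
            continue
        if record.get("resp_pkts") == "0":
            tally[(record.get("conn_state"), record.get("history"))] += 1
    return tally.most_common(top)
```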

Thoughts?

Thanks,

Andy