Duplicate packets in Zeek

Hello. I recently discovered that our Zeek instance was generating duplicate events. We identified the problem by comparing zeek-conn logs with logs captured by our perimeter firewall. In the zeek-conn logs, we observed two distinct log events sharing the same Timestamp, Source IP, Destination IP, Source Port, and Destination Port, but with unique connection UIDs. Additionally, in the zeek-http logs, two events with partial field information were generated for the same zeek-conn entry and Timestamp. For example, the first event captured the HTTP Method, while the second event captured fields such as HTTP Connections Status and HTTP Content Length. Each of these events had a “-” (dash) for the fields that the other event was capturing. Similarly, zeek-dns logs were also capturing field values in chunks. For instance, one event would capture the query type while the other would capture the Domain. For each of these duplicated events, we noted a corresponding event in the zeek-conn logs, exacerbating the EPS spike.
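For anyone wanting to check their own logs for the same pattern, here is a rough Python sketch of the comparison described above. It assumes the default Zeek TSV log format with a #fields header line, and it only catches entries whose timestamps match exactly; the function names are made up for illustration.

```python
from collections import defaultdict


def read_zeek_tsv(lines):
    """Yield one dict per record from the lines of a Zeek TSV log."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The header row names the columns: "#fields<TAB>ts<TAB>uid<TAB>..."
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


def duplicate_conns(records):
    """Return {(ts, 5-tuple parts): [uids]} for keys seen under more than one UID."""
    uids_by_key = defaultdict(set)
    for r in records:
        key = (r["ts"], r["id.orig_h"], r["id.orig_p"],
               r["id.resp_h"], r["id.resp_p"])
        uids_by_key[key].add(r["uid"])
    return {k: sorted(v) for k, v in uids_by_key.items() if len(v) > 1}
```

To run it against a real log, open conn.log as a text file and pass the file object to read_zeek_tsv, then hand the result to duplicate_conns.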

Further investigation led me to a post where a Zeek community user talked about a similar issue, attributing it to misconfigurations in the AF_PACKET plugin’s “fanout mode” setting. This setting dictates the various modes for load balancing packets between a given set of network interfaces (if I understood it correctly). I am currently working to identify the correct configuration that ensures a single log entry for a specific event, rather than two logs with partial information/fields.

Any insights or leads on this matter would be greatly appreciated.

Hi there,

From the symptoms you’re describing, it sounds like you have multiple Zeek workers that are seeing only parts of a given flow. If you were seeing more consistent packet duplication, you’d see different failure modes (for example, most likely largely correct TCP logs, since its reassembler will abstract away much of the duplication). Do your conn logs indicate lots of gaps, or lots of retransmissions?
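A quick way to answer that question is to tally a few indicators from the conn log. The sketch below is only an illustration and rests on some assumptions: records are dicts of strings keyed by the conn.log field names history and missed_bytes, a "g" or "t" in history marks a content gap or retransmitted payload, and uppercase letters come from the originator while lowercase come from the responder. The function name is hypothetical.

```python
def summarize_conn_health(records):
    """Count conn records showing gaps, retransmissions, or one-sided history."""
    stats = {"total": 0, "gaps": 0, "retransmissions": 0, "one_sided": 0}
    for r in records:
        history = r.get("history", "-")
        missed = r.get("missed_bytes", "0")
        stats["total"] += 1
        # A gap shows up as missed bytes or as "g"/"G" in the history string.
        if (missed.isdigit() and int(missed) > 0) or "g" in history.lower():
            stats["gaps"] += 1
        # "t"/"T" marks a packet with retransmitted payload.
        if "t" in history.lower():
            stats["retransmissions"] += 1
        # History from only one direction suggests the worker saw half the flow.
        letters = [c for c in history if c.isalpha()]
        if letters and (all(c.isupper() for c in letters)
                        or all(c.islower() for c in letters)):
            stats["one_sided"] += 1
    return stats
```

A large one_sided share relative to total would fit the picture of workers each seeing only one direction of a flow, whereas plain duplication would more likely show up in the retransmissions count.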

Regarding AF_PACKET, the biggie to keep in mind is that a fanout group ID establishes a consistent set of packets for each of the workers in that group, so each can meaningfully process the packets. If you monitor multiple interfaces, you need to provide a unique fanout ID for the workers on a given interface.

In the unlikely event that you are running a very old kernel, you can take a look at this helper tool to confirm that your fanouts are working correctly: JustinAzoff/can-i-use-afpacket-fanout on GitHub, which validates whether AF_PACKET's PACKET_FANOUT_HASH is working properly. The documentation doesn’t spell it out, but you’re looking for a roughly even share of flows across the reported workers.

Best,
Christian

Thanks @Christian for the response. After some further troubleshooting, we learnt that our IXIA taps capture bi-directional traffic over individual TX/RX ports, with SFP+ transceiver output that goes to Zeek. We have two interfaces on our Zeek server, each accepting traffic from a monitoring port on the taps over RX ports. One interface receives tapped inbound traffic (RX) while the other receives tapped outbound traffic (TX) off our perimeter firewall.

This results in two separate events per transaction. For example, in the http logs, interface 1 captures fields such as method, host, uri, referrer, and UA, while interface 2 populates status_code and status_msg.

When I run Zeek in standalone mode, I do not see this problem and the logs are created correctly. Is there a way for us to avoid this and make sure Zeek generates combined logs?

Assuming I understand correctly that your server has two interfaces, each receiving part of the traffic that has to be merged, there is not really an easy way to fix this.

You are correct that you can run one Zeek binary that listens to both interfaces, and that this will probably happen to work more or less correctly, if timing works out well. However, this is not entirely guaranteed to work, and is probably not a setup you want to run in production.

I sadly don’t have a ready solution for you - in similar setups that I have seen in the past, there was additional hardware in front of the Zeek server that took care to correctly join traffic from the different monitored ports, making sure that the ordering of packets is correct.

Zeek itself has no built-in support for these use cases; the traffic has to be normalized before it is sent to Zeek.