Packet loss and compilation issues with PF_RING plugin on Zeek 8.0.6 (3.5 Gbps traffic)


Hi everyone,

I’m struggling to get Zeek 8.0.6 to work with PF_RING on a high-traffic sensor. I’m experiencing 80% packet loss at 3.5 Gbps, which is corrupting our file extraction (SMTP/Email).

System Environment:

  • OS: cat /etc/os-release → Ubuntu 22.04.3 LTS (Jammy Jellyfish)

  • Kernel: uname -a → Linux zeek 5.15.0-101-generic #111-Ubuntu SMP x86_64

  • Hardware: 16-core CPU / 64GB RAM.

  • Zeek Version: 8.0.6 (Compiled from source in /opt/zeek).

PF_RING Installation Status: I have PF_RING 8.x installed in standard paths, but Zeek is not recognizing it.

  • Binaries: /usr/local/bin/pfcount, /usr/local/sbin/pf_ringctl

  • Libraries: /usr/local/lib/libpfring.a, /usr/local/lib/libpcap.a (ntop version)

  • Headers: /usr/local/include/pfring.h

  • **Interfaces listed by PF_RING:**

    pf_ringcfg --list-interfaces
    Name: ens18 Driver: virtio_net RSS: 1 [Linux Driver]
    Name: ens16f0 Driver: igb RSS: 8 [Supported by ZC]
    Name: ens16f1 Driver: igb RSS: 8 [Supported by ZC]
    Name: ens16f2 Driver: igb RSS: 8 [Supported by ZC]
    Name: ens16f3 Driver: igb RSS: 8 [Supported by ZC]

    cat /proc/net/pf_ring/dev/ens16f0/info
    Name: ens16f0
    Index: 3
    Address: xxxxxxxxxx
    Polling Mode: NAPI
    Promisc: Disabled
    Type: Ethernet
    Family: Standard NIC
    bound sockets: 0

    TX Queues: 8
    RX Queues: 8
    My node.cfg:

    [logger]
    type=logger
    host=localhost

    [manager]
    type=manager
    host=localhost

    [proxy-1]
    type=proxy
    host=localhost

    [worker-1]
    type=worker
    host=localhost
    interface=ens16f0
    lb_method=pf_ring
    lb_procs=4
    pin_cpus=0,1,2,3
    lb_param=10

    [worker-2]
    type=worker
    host=localhost
    interface=ens16f1
    lb_method=pf_ring
    lb_procs=4
    pin_cpus=4,5,6,7
    lb_param=20

    [worker-3]
    type=worker
    host=localhost
    interface=ens16f2
    lb_method=pf_ring
    lb_procs=3
    pin_cpus=8,9,10
    lb_param=30

    [worker-4]
    type=worker
    host=localhost
    interface=ens16f3
    lb_method=pf_ring
    lb_procs=3
    pin_cpus=11,12,13
    lb_param=40

    capstats:

    Interface kpps mbps (10s average)

    localhost/ens16f0 182.5 909.7
    localhost/ens16f1 173.5 918.0
    localhost/ens16f2 129.8 718.5
    localhost/ens16f3 170.7 779.3

    Total 656.5 3325.5

    The Wall:

    1. Plugin 404: The repository https://github.com/ntop/zeek-plugin-pf_ring is returning a 404, and it’s no longer present in the PF_RING/userland source tree.

    2. zkg failure: Running zkg install zeek-plugin-pf_ring returns “package name not found in sources”.

    3. Core Build: Compiling Zeek with --with-pcap=/usr/local finishes successfully, but zeek -N does not list Zeek::PF_RING.

    Questions for the community:

    1. The “Best Practice” 2026 Question: For 3.5 Gbps on a modern Linux Kernel (5.15+), is PF_RING (non-ZC) still superior to AF_PACKET + Fanout? Or has AF_PACKET become the recommended non-paid path for these speeds?

    2. The Plugin: If PF_RING is still the way to go, where is the official source code for the Zeek 8 plugin located now?

    3. Kernel Bottleneck: Is there any known issue with the 5.15 kernel and Zeek’s file extraction that could be exacerbated by the current packet loss?

    Any help to break this loop and get the sensor back to zero-drop would be greatly appreciated.

Hey @puma2009,

I’m not a PF_RING expert, but I’d expect it to work. The idea is that Zeek is linked against a PF_RING-specific libpcap version, as you showed here:

Core Build: Compiling Zeek with --with-pcap=/usr/local finishes successfully, but zeek -N does not list Zeek::PF_RING.

I’m not aware of the need for a plugin, so Zeek::PF_RING isn’t expected to show up. There’s an older bro-pf_ring plugin, but that should not be necessary. Please check with ldd /opt/zeek/bin/zeek whether Zeek is dynamically linked against PF_RING’s specific libpcap.so file. If not, the cluster might currently be using the default libpcap, in which case load balancing between workers is not in effect, which can easily result in overload.
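
As a rough sketch of that check (assuming the PF_RING-aware libpcap lives under /usr/local/lib, as in your listing), the hypothetical helper `pcap_linkage` below reads `ldd` output and reports which libpcap the binary resolves to. One caveat: your listing only shows a static libpcap.a, and a statically linked libpcap does not appear in `ldd` at all, so a "no dynamic libpcap" result is inconclusive rather than bad. A path under /usr/local is also only a hint, not proof that the library is PF_RING-aware.

```shell
# Hypothetical helper: summarize which libpcap a binary is linked against.
# Reads `ldd` output on stdin and prints a single summary line.
pcap_linkage() {
    awk '
        /libpcap\.so/ { pcap = $3 }   # third field is the resolved path
        END {
            if (pcap == "")
                print "no dynamic libpcap found (may be statically linked)"
            else if (pcap ~ /^\/usr\/local\//)
                print "libpcap under /usr/local: " pcap
            else
                print "system libpcap: " pcap
        }'
}

# Usage on the sensor (not run here):
#   ldd /opt/zeek/bin/zeek | pcap_linkage
```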

Further, do you have PFRINGClusterType in your zeekctl.cfg? It defaults to 4-tuple, but inner-5-tuple or inner-6-tuple may be more reasonable today when dealing with tunneled traffic.
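
To see what is currently in effect, something like the following could work. It is only a sketch: `pfring_cluster_type` is a made-up helper name, the zeekctl.cfg path in the usage comment is a guess based on your /opt/zeek prefix, and matching the option name case-insensitively is my assumption.

```shell
# Hypothetical helper: print the PFRINGClusterType set in a zeekctl.cfg
# read from stdin, or the 4-tuple default mentioned above when it is unset.
pfring_cluster_type() {
    awk -F= '
        tolower($1) ~ /^[ \t]*pfringclustertype[ \t]*$/ {
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding whitespace
            val = v                          # last occurrence wins
        }
        END { print (val == "" ? "4-tuple" : val) }'
}

# Usage on the sensor (path is an assumption, not run here):
#   pfring_cluster_type < /opt/zeek/etc/zeekctl.cfg
```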

If you look at zeekctl top, are only one or two workers at 100% usage, or are all of them? In the former case, flows might end up on only a few processes. In the latter case, each process might be seeing all flows rather than just a portion.
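
If eyeballing the output gets tedious, a small filter can count the busy workers. This is a sketch that assumes the usual `zeekctl top` column order (Name, Type, Host, Pid, VSize, Rss, Cpu, Cmd); `hot_workers` and the 90% threshold are my own choices, so adjust them if your output differs.

```shell
# Hypothetical helper: count worker processes at or above a CPU threshold.
# stdin: `zeekctl top` output; $1: threshold in percent (default 90).
hot_workers() {
    awk -v thr="${1:-90}" '
        $2 == "worker" {
            total++
            cpu = $7
            sub(/%/, "", cpu)              # "100%" -> "100"
            if (cpu + 0 >= thr + 0) hot++
        }
        END { printf "%d of %d workers at or above %d%% CPU\n", hot, total, thr }'
}

# Usage on the sensor (not run here):
#   zeekctl top | hot_workers
```

A result like "2 of 14" would point at skewed flow distribution, while "14 of 14" would fit the case where every process sees all traffic.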

I’m a bit confused about the lb_param value in your node.cfg - I don’t think this exists. Is it something custom?

The “Best Practice” 2026 Question: For 3.5 Gbps on a modern Linux Kernel (5.15+), is PF_RING (non-ZC) still superior to AF_PACKET + Fanout? Or has AF_PACKET become the recommended non-paid path for these speeds?

The existing PF_RING setup should continue to work. Please double check the libpcap.so file in use by Zeek.

I do think AF_PACKET is easier to set up, so it might be easy for you to do a quick comparison.

With Zeek 8.0, AF_PACKET is builtin on Linux, so you could try:

[worker-1]
type=worker
host=localhost
interface=ens16f2
lb_method=af_packet
af_packet_fanout_id = 12
...

It might make sense to search for offloading and tuning advice related to AF_PACKET. Also, with multiple network cards on a single system, make sure you set af_packet_fanout_id to a unique value per interface and double-check the block/buffer size tunables.
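
To keep the fanout IDs distinct across your four ports, the worker stanzas could be generated rather than hand-edited. This is a sketch only: `gen_af_packet_workers` is a made-up helper, the base ID of 10 is arbitrary, lb_procs=4 is copied from your worker-1, and pin_cpus is left out, so you would add it back per worker.

```shell
# Hypothetical helper: emit one node.cfg worker stanza per interface,
# each with its own af_packet_fanout_id (11, 12, 13, ...).
gen_af_packet_workers() {
    n=1
    for ifc in "$@"; do
        printf '[worker-%d]\n' "$n"
        printf 'type=worker\nhost=localhost\n'
        printf 'interface=%s\n' "$ifc"
        printf 'lb_method=af_packet\nlb_procs=4\n'
        printf 'af_packet_fanout_id=%d\n\n' $((10 + n))
        n=$((n + 1))
    done
}

# Print stanzas for the four igb ports from the original post.
gen_af_packet_workers ens16f0 ens16f1 ens16f2 ens16f3
```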

3. Kernel Bottleneck: Is there any known issue with the 5.15 kernel and Zeek’s file extraction that could be exacerbated by the current packet loss?

This should be fine :crossed_fingers:

Any help to break this loop and get the sensor back to zero-drop would be greatly appreciated.

Hope this helps. Curious if you can figure out more. Please share if you determine the culprit!

Thanks,
Arne

@puma2009 - which PF_RING version are you using? An issue around poll/epoll surfaced ~2.5 years ago when upgrading to Zeek 6.0, and we reported it to PF_RING. It occurred with PF_RING 8.2.0, and the fix made it into PF_RING 8.8.0 or later.

Alternatively, a workaround on the Zeek side is to put the following into local.zeek to poll PF_RING more often:

# local.zeek - Use more aggressive polling to avoid polling issue with kqueue and PF_RING < 8.8.0.
redef io_poll_interval_live = 100;

So if you don’t yet have PF_RING 8.8.0 or later, try upgrading or see if the above redef helps.
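
For the version check itself, a minimal sketch: `pfring_at_least` is a hypothetical helper that compares two MAJOR.MINOR.PATCH strings numerically. I believe the installed version is shown in /proc/net/pf_ring/info, but please verify that on your system; the helper takes the version as an argument so it does not depend on that file's format.

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Both arguments are expected in MAJOR.MINOR.PATCH form.
pfring_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, ".")
        split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0   # newer at this position
            if (h[i] + 0 < w[i] + 0) exit 1   # older at this position
        }
        exit 0                                # all parts equal
    }'
}

# Example with the two versions mentioned above.
if pfring_at_least 8.2.0 8.8.0; then
    echo "8.2.0 already contains the fix"
else
    echo "8.2.0 predates the fix: upgrade or try the redef"
fi
```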

Please report back if any of these steps helps.