af_packet comparison to PF_RING ZC/DNA for Bro (in light of recent Suricata tuning paper)

After reading over the paper Michal and others worked on concerning tuning Suricata for best performance with AF_Packet, I'm wondering how af_packet performance compares to pf_ring DNA/ZC (with the commercially licensed drivers, not just vanilla), especially when it comes to Bro.

Is af_packet generally sufficient for Bro when it comes to monitoring 100G+ networks using a cluster of commodity servers with Intel X520 NICs?

Is the distro shipped driver for something like an up-to-date Ubuntu 16.04 (4.4 kernel) server sufficient or do you really need to compile the driver from source to enable some extended features, or to get a properly patched driver etc? I could see some benefits to just using the distro packaged driver and not having to compile the driver from scratch or rely on dkms when patching sensors. I've had this go very wrong a few times.

Are there any gotchas where running one or the other might be the better way to go? Examples (want to use some bro feature such as capstats, or want to see VLAN tags in Bro logs, something else is broken or not performing as expected)

Does af_packet or the Bro plugin for it have a way to deal with multiple NICs (one per NUMA node), sort of like how pf_ring has dnacluster and zbalance_ipc?

Feel free to share any other relevant considerations. I'm especially interested in things such as ease of management, performance, compatibility etc.

~Gary

Hi Gary,

> After reading over the paper Michal and others worked on concerning
> tuning Suricata for best performance with AF_Packet I'm wondering how
> af_packet performance compares to pf_ring DNA/ZC (with the commercially
> licensed drivers, not just vanilla) especially when it comes to Bro.

Unfortunately, I cannot provide any numbers. My main motivation for using
AF_Packet with Bro was ease of use. The PF_RING ZC drivers in particular
caused issues in my environment that I struggled to debug. Given the
extra cost of building the various pieces myself, I chose AF_Packet.

> Is af_packet generally sufficient for Bro when it comes to monitoring
> 100G+ networks using a cluster of commodity servers with Intel X520 NICs?

Good question. Someone should test this :)

> Is the distro shipped driver for something like an up-to-date Ubuntu
> 16.04 (4.4 kernel) server sufficient or do you really need to compile
> the driver from source to enable some extended features, or to get a
> properly patched driver etc? I could see some benefits to just using the
> distro packaged driver and not having to compile the driver from scratch
> or rely on dkms when patching sensors. I've had this go very wrong a few
> times.

For me (CentOS 7) the packaged driver worked well with AF_Packet. But if
you want to tune things for maximal performance, I would recommend using
the latest drivers. In that case, looking at the driver code from time
to time can also help to understand what's going on.

> Are there any gotchas where running one or the other might be the better
> way to go? Examples (want to use some bro feature such as capstats, or
> want to see VLAN tags in Bro logs, something else is broken or not
> performing as expected)

I haven't used capstats but if I remember correctly, it is kind of
deprecated as it relies on libpcap. One should be able to obtain the
same information from other sources.

VLAN tags are indeed an issue with AF_Packet. For consistency reasons,
the kernel extracts VLAN tags even if there is no hardware VLAN
offloading. In contrast to Bro, Suricata can handle this due to its
monolithic structure. Fixing this is something on my list.

Finally, one has to be careful regarding the kernel used. There is a bug
concerning AF_Packet's symmetric hashing that has been fixed in recent
kernels; see
https://bro-tracker.atlassian.net/browse/BIT-1575?focusedCommentId=29627#comment-29627

> Does af_packet or the Bro plugin for it have a way to deal with multiple
> NICs (one per numa node), sort of like how pf_ring has dnacluster and
> zbalance_ipc?

In theory configuring a set of workers per NUMA node using separate NICs
shouldn't be an issue. The only thing is that you won't get load
balancing across the NICs. I am not sure how well this works with
PF_RING, though.
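
To group workers by NUMA node, it helps to know which node each NIC is attached to and which CPUs belong to that node. The following is a minimal sketch that reads the standard Linux sysfs entries for this. The function names and the optional sysfs-root argument are illustrative assumptions, not part of Bro or the AF_Packet plugin.

```shell
#!/bin/bash
# Print the NUMA node a NIC is attached to, as reported by sysfs.
# A value of -1 means the kernel reports no NUMA affinity for the device.
# The optional second argument overrides the sysfs root (handy for testing).
nic_numa_node() {
    local iface="$1" sysfs="${2:-/sys}"
    cat "$sysfs/class/net/$iface/device/numa_node"
}

# Print the CPU list of a NUMA node in the kernel's range notation.
numa_node_cpus() {
    local node="$1" sysfs="${2:-/sys}"
    cat "$sysfs/devices/system/node/node$node/cpulist"
}
```

For example, `numa_node_cpus "$(nic_numa_node eth4)"` would list the CPUs local to eth4, which are the natural candidates for pinning the workers that read from that interface.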

> Feel free to share any other relevant considerations.

In addition to the VLAN stuff I have a couple of other things on my
list, which might allow some tuning. Unfortunately this list hasn't seen
much progress lately as I don't have access to a test setup. So there
might be room for improvement.

Jan

I'm trying AF_PACKET with Bro, but seem to be running a kernel and driver combo that doesn't appear to properly support symmetric hashing. I'm on Ubuntu 16.04 with kernel 4.4.0-59-generic. From what I can tell the patches should have been added around kernel 4.4.0-39 or so, but Justin's verification tool and Bro both seem to agree that it is broken on my system. I've tried with the OS supplied IXGBE driver (4.2.1-k) as well as compiling from scratch using a recent IXGBE directly from Intel (5.0.4). Is there a known working kernel and driver combo for Ubuntu 16.04, or are the necessary patches still not pushed into 16.04?

Thanks,

Gary

Have you disabled the hardware hash with ethtool? By default the kernel will use the card's hash, which is asymmetric.

You can verify it with ethtool -k <interface>.

Look for receive-hashing (rxhash); it should be off.
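
As a small sketch of that check, the helper below pulls the receive-hashing state out of `ethtool -k` output read from stdin. The function name `rxhash_state` is made up for illustration; only the `receive-hashing` line format printed by ethtool is assumed.

```shell
#!/bin/bash
# Print the state ("on" or "off") of the receive-hashing feature, reading
# `ethtool -k <interface>` output from stdin. Prints nothing if the
# feature line is absent. A trailing "[fixed]" marker is ignored.
rxhash_state() {
    awk -F': *' '$1 ~ /^[ \t]*receive-hashing$/ { split($2, f, " "); print f[1] }'
}
```

Usage would look like `ethtool -k eth4 | rxhash_state`, with eth4 standing in for the capture interface.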

https://github.com/pevma/SEPTun

It should show you how to prepare your system; you can ignore the core isolation and affinity parts for Bro.

I tried to follow the guide fairly closely and adapt for Bro with the
exception of BIOS level tuning (which I plan to investigate later).
rxhash is set to off. I was cpu pinning bro before, so I am continuing
to do so. Settings are below as well as a rough script I am tweaking to
load them.

For troubleshooting purposes I decided not to simplify the script with a
loop as I was running into some issues with command order (especially
with set_irq_affinity placement) as well as a couple unsupported options:

Features for eth4:

rx-checksumming: off
tx-checksumming: off
        tx-checksum-ipv4: off
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: off
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: off
        tx-scatter-gather: off
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
rx-vlan-offload: off
tx-vlan-offload: off
ntuple-filters: off
receive-hashing: off
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: on [fixed]
hw-tc-offload: off [fixed]
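
A listing like the one above is long enough that a feature left on is easy to miss. The following is a minimal sketch that filters `ethtool -k` output down to the features that are on and not marked [fixed], meaning ethtool -K could still change them. The function name is hypothetical.

```shell
#!/bin/bash
# List features that are "on" and not marked [fixed].
# Reads `ethtool -k <interface>` output on stdin and prints one
# feature name per line, with leading indentation removed.
changeable_on_features() {
    awk -F': *' 'NF >= 2 && $2 ~ /^on[ \t]*$/ { sub(/^[ \t]+/, "", $1); print $1 }'
}
```

Run against the eth4 listing above, this would print only rx-vlan-filter, since highdma and busy-poll are on but fixed.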

#!/bin/bash

#Unload any existing module and load with new parameters
rmmod ixgbe
modprobe ixgbe MQ=0,0,0,0 RSS=1,1,1,1 VMDQ=0,0,0,0 \
    InterruptThrottleRate=12500,12500,12500,12500 FCoE=0,0,0,0 LRO=0,0,0,0 \
    vxlan_rx=0,0,0,0
sleep 1

#Disable irqbalance to stop bouncing interrupts between cores
killall irqbalance
sleep 1

#Enable interfaces in promisc mode
ip link set eth4 promisc on arp off up
ip link set eth6 promisc on arp off up
sleep 1

#Disable IPv6 on interfaces
echo 1 > /proc/sys/net/ipv6/conf/eth4/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/eth6/disable_ipv6

#Enable Jumbo Frames (MTU of 9216 used on routers)
ip link set dev eth4 mtu 9216
ip link set dev eth6 mtu 9216

#Enforce a single RX queue
ethtool -L eth4 combined 1
ethtool -L eth6 combined 1

#Manage interrupts
ethtool -C eth4 adaptive-rx on rx-usecs 100
ethtool -C eth6 adaptive-rx on rx-usecs 100

#Lower the NIC ring descriptor size
ethtool -G eth4 rx 512
ethtool -G eth6 rx 512

#Disable pause frames
#ethtool -A eth4 autoneg off
#ethtool -A eth6 autoneg off
ethtool -A eth4 rx off tx off
ethtool -A eth6 rx off tx off

#Disable offloading features
ethtool -K eth4 rx off
ethtool -K eth4 tx off
ethtool -K eth4 tso off
ethtool -K eth4 ufo off
ethtool -K eth4 gso off
ethtool -K eth4 gro off
ethtool -K eth4 lro off
ethtool -K eth4 tx-nocache-copy off
ethtool -K eth4 rxhash off
ethtool -K eth4 ntuple off
ethtool -K eth4 sg off
ethtool -K eth4 txvlan off
ethtool -K eth4 rxvlan off
ethtool -K eth6 rx off
ethtool -K eth6 tx off
ethtool -K eth6 tso off
ethtool -K eth6 ufo off
ethtool -K eth6 gso off
ethtool -K eth6 gro off
ethtool -K eth6 lro off
ethtool -K eth6 tx-nocache-copy off
ethtool -K eth6 rxhash off
ethtool -K eth6 ntuple off
ethtool -K eth6 sg off
ethtool -K eth6 txvlan off
ethtool -K eth6 rxvlan off

#Set irq affinity
/bin/bash ./set_irq_affinity 2 eth4
/bin/bash ./set_irq_affinity 3 eth6
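
Once the command order is settled, the repeated ethtool -K lines could be folded into a loop. The following is a rough sketch under that assumption, not a replacement for the script above. The `disable_offloads` function and the `DRY_RUN` variable are illustrative names; the feature list is copied from the script.

```shell
#!/bin/bash
# Offload features to switch off, in the same order as the script above.
FEATURES="rx tx tso ufo gso gro lro tx-nocache-copy rxhash ntuple sg txvlan rxvlan"

# Disable every feature in FEATURES on each interface given as an argument.
# With DRY_RUN=1 the commands are only printed, so the order can be reviewed.
disable_offloads() {
    for iface in "$@"; do
        for feature in $FEATURES; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "ethtool -K $iface $feature off"
            else
                # Keep going if the driver rejects a feature it does not support.
                ethtool -K "$iface" "$feature" off ||
                    echo "warning: $feature not changed on $iface" >&2
            fi
        done
    done
}
```

Calling `DRY_RUN=1 disable_offloads eth4 eth6` prints the commands without touching the NICs; dropping DRY_RUN applies them and warns on any option the driver rejects.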