Myricom and Bro... show of hands for successful deployments on 10G links (with > 5Gbps)

Hi All,

So, I’m writing to hopefully get a show of hands from those of you out there who’ve deployed Myricom cards to capture packets on your 10G links.

I’ll start by saying that while the Myricom cards we have in place do a fine job of capturing, I’ve been unable to find the secret sauce that allows both capturing and writing to disk without dropping a significant number of packets, whether with bro, tcpdump, snort, or suricata.

For those of you out there using Myricom cards in conjunction with your favorite tools (bro, of course :wink: ), can you let me know what data rate your cards are seeing and what percentage of packets (assuming some) you are dropping?

If you aren’t dropping anything I’d love to know more about your setup! :slight_smile:

Cheers,
Harry

Hi Harry,

Can you expand on “allowing both capture and writing to disk?” Carnegie Mellon runs a Bro cluster with Myricom NICs, which works well. However, the manager is on a box that doesn’t have any workers on it (and thus doesn’t receive any traffic), so I haven’t had any I/O contention between network traffic and log writing. Is that what you’re referring to?
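In node.cfg terms, that split just means the manager and proxy entries point at a box with no capture interface; roughly like this, where the hostnames below are placeholders rather than our actual machines:

# Manager/proxy on a box that receives no traffic, so log writing
# never competes with packet capture for I/O.
[manager]
type=manager
host=logger.example.edu

[proxy-1]
type=proxy
host=logger.example.edu

# Workers on the capture boxes with the Myricom NICs.
[worker-1]
type=worker
host=sensor1.example.edu
interface=eth4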

We’re seeing about 16 Gbps and dropping < 1% (around 0.1% most of the time, I believe). That’s split up over 4 rather beefy boxes, though.

–Vlad

check your memory bandwidth:
http://www.ntop.org/pf_ring/not-all-servers-are-alike-with-dna/
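A quick way to sanity-check how your DIMMs are populated (one of the factors that determines memory bandwidth) is dmidecode, e.g.:

# List size, speed, and slot for each installed DIMM; unevenly
# populated channels show up as empty or mismatched slots.
sudo dmidecode --type memory | egrep -i 'size|speed|locator'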

Hi Vlad,

Absolutely. Sorry if that was vague or cryptic.

What I meant was that, using the Myricom test utilities, I can capture everything on the wire. These utilities don’t write to disk, so they only show that there isn’t an issue with NIC-to-memory transfers.

Once I fire up bro, one worker consistently pegs a core at 100% and I drop more than half of the packets. The drop rate isn’t as severe with tools like tcpdump, but I assume that’s down to the extra per-packet processing bro does.
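For anyone trying to reproduce this: the pegged core is easy to see by watching per-core utilization while bro is running, e.g.:

# Per-core CPU utilization, one-second samples (from the sysstat package);
# look for a single core sitting near 100%
mpstat -P ALL 1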

All of this is running on a Dell R710 with two 2.8GHz Xeon CPUs (6 cores each, HT disabled), 96GB of RAM, and two 700GB SSDs for data. We moved to the Dell specifically to test whether using SSDs gave a performance boost when writing to disk.
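As a rough check that the SSDs themselves aren’t the bottleneck, a simple direct-I/O write test gives a ballpark sustained-write number (the target path below is just an example):

# Write ~10GB with O_DIRECT, bypassing the page cache, then clean up
dd if=/dev/zero of=/data/ddtest bs=1M count=10000 oflag=direct
rm /data/ddtest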

We’re using the Myricom tools (/opt/snf/bin/myri_counters) to count dropped packets via the “SNF drop ring full” counter, which increments when the application (tcpdump, bro, etc.) is too slow pulling packets out of the ring buffer.

As an initial, memory-only test, we’ve run /opt/snf/bin/tests/snf_simple_recv and /opt/snf/bin/tests/snf_multi_recv. Both run without any drops, and their output shows an average of 7Gbps on the wire. Running either test for an extended period does not cause the “SNF drop ring full” counter to increment.
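For anyone who wants to repeat that check, the workflow is simply: start one of the SNF test receivers, leave it running against live traffic, and watch the drop counter. I’m invoking the test with no arguments here and assuming its defaults suit your board; see the SNF docs for board/ring options.

# Terminal 1: memory-only receive test shipped with the SNF driver
/opt/snf/bin/tests/snf_simple_recv

# Terminal 2: the counter should stay flat as long as the app keeps up
watch -n 5 "/opt/snf/bin/myri_counters | grep -i 'drop ring full'"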

/usr/local/bro/etc/node.cfg looks like the following (as you can see, we’re attempting to tweak performance via the various SNF env variables; we’ve noticed no difference using pin_cpus):

[manager]
type=manager
host=localhost
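(Only the manager entry made it into the post above. By “the various SNF env variables” I mean settings of roughly the following shape on the worker entry; every value below is illustrative, not what we actually run:)

[worker-1]
type=worker
host=localhost
interface=eth4
lb_method=myricom
lb_procs=8
# pinning worker processes to cores; no difference observed with or without this
pin_cpus=2,3,4,5,6,7,8,9
# extra SNF tuning passed through to the Myricom libpcap (value is arbitrary here)
env_vars=SNF_DATARING_SIZE=0x80000000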

Thanks, Kyle!

Very informative article. I’m installing numactl now and will test.

I do note that they say they are doing close to line rate with a Dell R710 so that’s promising :slight_smile:
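The checks I have in mind are along these lines, in case it’s useful to anyone following the thread (the interface name is a placeholder):

# NUMA layout: which CPUs and how much memory belong to each node
numactl --hardware

# Which NUMA node the capture NIC is attached to (-1 means no affinity reported)
cat /sys/class/net/eth4/device/numa_node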

Cheers,
Harry

That most likely means you are not using the Myricom API to capture packets. I’ve seen the symptoms you’re describing. Please send the output of:

ldd `which bro` | egrep -i '(myri|snf)'

Hi Michal,

I am indeed linked against the libpcap supplied by Myricom. As described, I see the SNF counters increasing when bro is running (via myri_counters). Bro also shows up in the myri_endpoint_info output.

ldd /usr/local/bro/bin/bro

linux-vdso.so.1 => (0x00007fffaabd1000)
libpcap.so.1 => /opt/snf/lib/libpcap.so.1 (0x00007f6c0ebf1000)

Thanks for the pointer about the shared rings. I’d mistakenly (?) believed the opposite and will consult the documentation.

Any other thoughts?

Cheers,
Harry