Question about the Bro manager writing data to Kafka

Hi Bro, I've run into a performance issue with the Bro manager writing data to Kafka. Can anyone help me, please?

System details:
Operating system: CentOS 7.2
CPU:

  • Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
  • CPU(s): 32
  • CPU MHz: 2334.445

Memory:

  • 64GB

Network interface:

  • 03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

Disks:

  • The operating system runs on an SSD.
  • The Kafka logs are written to a RAID 0 array of two HDDs.

Bro Cluster Config details:
[manager]
type=manager
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
interface=eno1
aux_scripts= -C
lb_method=pf_ring
lb_procs=15
pin_cpus=3,5,7,9,11,13,15,17,19,21,23,25,27,29,31

Kafka Config details:
1 broker
listeners=PLAINTEXT://10.0.81.60:9091
advertised.listeners=PLAINTEXT://10.0.81.60:9091
num.network.threads=60
num.io.threads=120
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=104857600
log.dirs=/data-kafka/kafka-logs
num.partitions=10
num.recovery.threads.per.data.dir=1
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.hours=5
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

Bro and Kafka are installed and running on a single machine.
Bro Kafka plugin: https://github.com/bro/bro-plugins/tree/master/kafka
librdkafka: librdkafka-0.9.4.tar.gz
Kafka: http://ftp.cuhk.edu.hk/pub/packages/apache.org/kafka/0.10.0.0/kafka_2.11-0.10.0.0.tgz

Bro loads custom scripts that reassemble the HTTP data (including the full request and response). The data format is as follows:
{"ts":1497353836.648655,"sip":"10.0.85.9","sport":60484,"dip":"10.0.81.48","dport":80,"protocol":"http","sensor":"10.0.81.60","worker":"worker-1-3","http.request.host":"10.0.81.48","http.request.uri":"/xab.html","http.request.method":"GET","http.request.body":"","http.request.body_len":0,"http.request.header_names":["HOST","USER-AGENT","ACCEPT"],"http.request.header_values":["10.0.81.48","curl/7.51.0","/"],"http.request.range_request":false,"http.request.username":"","http.request.password":"","http.response.status_code":200,"http.response.status_msg":"OK","http.response.body":"\u000a\u000atest\u000a\u000a\u000a","http.response.body_len":35,"http.response.header_names":["SERVER","DATE","CONTENT-TYPE","CONTENT-LENGTH","LAST-MODIFIED","CONNECTION","ETAG","ACCEPT-RANGES"],"http.response.header_values":["nginx/1.10.2","Tue, 13 Jun 2017 11:37:16 GMT","text/html","35","Tue, 13 Jun 2017 11:25:40 GMT","keep-alive","\u0022593fcbb4-23\u0022","bytes"]}

We define a constant to limit the HTTP response body length (for example, const http_max_body_len: count = 10240;).

Scenario:
The test configuration is as follows:

  • Average traffic captured by Bro: 700 Mbps
  • Size of a single HTTP response: 100 KB
  • HTTP response body limit: http.response.body=51200 (50 KB)

System performance during the test:

  • Average loopback traffic: 1 Gbps
  • Total disk writes: 100 MB/s
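
As a rough sanity check on the scenario above, the sketch below estimates how much HTTP body data per second would end up in the logs for a given body limit. It assumes (this is not stated in the post) that nearly all captured traffic consists of 100 KB HTTP response bodies, and it ignores headers and JSON escaping overhead. The function name is hypothetical.

```python
def logged_body_rate(capture_bits_per_s: float,
                     response_len: int,
                     body_limit: int) -> float:
    """Approximate bytes/s of response body data kept in the logs.

    Assumption: all captured traffic is HTTP response bodies of
    `response_len` bytes, each truncated to at most `body_limit` bytes.
    """
    capture_bytes_per_s = capture_bits_per_s / 8
    kept = min(body_limit, response_len)
    return capture_bytes_per_s * kept / response_len


CAPTURE = 700_000_000      # 700 Mbps, from the scenario
RESPONSE_LEN = 100 * 1024  # 100 KB per response, from the scenario

for limit in (51200, 5120):  # the 50 KB and 5 KB limits used in the tests
    rate = logged_body_rate(CAPTURE, RESPONSE_LEN, limit)
    print(f"limit={limit} bytes -> about {rate / 1e6:.1f} MB/s of body data")
```

Under these assumptions the 50 KB limit keeps about half of the captured bytes, and the 5 KB limit keeps about a tenth of that amount, which is at least consistent with the manager staying small in the 5 KB test.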

Test Results:
We used htop and watch /usr/local/bro/bin/broctl top to monitor the system and Bro. OS memory usage grows over time until memory is exhausted, at which point the operating system becomes unstable.
The Bro manager used more than 40 GB of memory, while each worker used only about 200 MB.

So we ran another test with the HTTP response body limit reduced to 5 KB in the Bro script. In that test, the Bro manager's memory usage stayed at around 130 MB.

In another test, we put 500 Mbps of load on the NIC, and the Bro manager used 4 GB of memory (with the HTTP response body limit set to 40960). When we stopped the test, the manager's memory usage did not drop; it stayed at 4 GB (we monitored it with vmstat during the test).

In summary, we assume that the Bro manager writes data to Kafka more slowly than the data is produced, which leads to the manager's high memory usage.
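
The assumption can be put in numbers with a very simple queue model: if data arrives faster than it is written out, the difference piles up in memory. The sketch below is only an illustration. The rates are hypothetical placeholders (the actual write rate to Kafka was not measured in the tests above), and the model does not explain why memory stays allocated after the load stops.

```python
def backlog_bytes(produce_bps: float, drain_bps: float, seconds: float) -> float:
    """Bytes queued after `seconds` when input outpaces output."""
    surplus = max(produce_bps - drain_bps, 0.0)
    return surplus * seconds


def seconds_until_full(produce_bps: float, drain_bps: float,
                       capacity_bytes: float) -> float:
    """Time until the backlog reaches `capacity_bytes` (inf if it never does)."""
    surplus = produce_bps - drain_bps
    if surplus <= 0:
        return float("inf")
    return capacity_bytes / surplus


MB = 1_000_000
GB = 1_000_000_000

# Hypothetical rates, chosen only to illustrate the arithmetic.
produce = 60 * MB  # bytes/s of log data arriving at the manager
drain = 50 * MB    # bytes/s actually written to Kafka

print(backlog_bytes(produce, drain, 600) / GB)      # backlog in GB after 10 minutes
print(seconds_until_full(produce, drain, 40 * GB))  # seconds to reach 40 GB
```

Even a modest shortfall in write rate is enough to fill tens of gigabytes within an hour or so, so measuring the actual rate of data reaching Kafka would be the way to confirm or rule out this explanation.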

Is this assumption correct? Or does the Bro manager have a performance problem when writing large amounts of data to Kafka? Or is our configuration incorrect? Please let me know if you have any recommendations. Thank you very much.

You're not running a logger process. Adding one will easily double the performance of your cluster. Add

[logger]
type=logger
host=localhost

to your node.cfg

If you install the Bro 2.5.1 beta, you can define two or more loggers:

[logger-1]
type=logger
host=localhost

[logger-2]
type=logger
host=localhost

(This is specifically intended for things like kafka)
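
For anyone who wants to script this change, here is a minimal sketch that appends the logger sections to the node.cfg from the original post. It is only an illustration: the helper name add_loggers is made up, the config text is copied from the post, and the script just produces the new file contents rather than touching a real installation.

```python
import configparser
import io

# node.cfg as posted in the original message.
ORIGINAL_NODE_CFG = """\
[manager]
type=manager
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
interface=eno1
aux_scripts= -C
lb_method=pf_ring
lb_procs=15
pin_cpus=3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
"""


def add_loggers(cfg_text: str, count: int = 1) -> str:
    """Return cfg_text with `count` logger sections appended (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # One logger is named "logger"; several are named logger-1, logger-2, ...
    names = ["logger"] if count == 1 else [f"logger-{i}" for i in range(1, count + 1)]
    for name in names:
        parser[name] = {"type": "logger", "host": "localhost"}
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(add_loggers(ORIGINAL_NODE_CFG, count=2))
```

The output would still need to be written to node.cfg and deployed with broctl in the usual way; more than one logger only applies to the 2.5.1 beta mentioned above.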

I am working on a newer version of the Kafka writer plugin (as a part of the Apache Metron project, which is where the plugin was initially created) which has support for sending to kerberized Kafka, some bug fixes, better debug logging, etc. It currently exists here, but I’m going to be turning it into a bro package and moving it here eventually (once it has more testing). If you’re willing to beta test a bit, perhaps it’s worth giving a shot, in addition to Justin’s comments?

Jon