Hi Bro, I've encountered a performance issue when the Bro manager writes data to Kafka. Can anyone help me, please?
System details:
Operating System: CentOS 7.2
CPU:
- Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
- CPU(s): 32
- CPU MHz: 2334.445
Memory: - 64GB
Network Interface: - 03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
Disks:
The operating system runs on an SSD.
The Kafka logs are written to a RAID0 array (two HDDs).
Bro Cluster Config details:
[manager]
type=manager
host=localhost
[proxy-1]
type=proxy
host=localhost
[worker-1]
type=worker
host=localhost
interface=eno1
aux_scripts= -C
lb_method=pf_ring
lb_procs=15
pin_cpus=3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Kafka Config details:
1 broker
listeners=PLAINTEXT://10.0.81.60:9091
advertised.listeners=PLAINTEXT://10.0.81.60:9091
num.network.threads=60
num.io.threads=120
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=104857600
log.dirs=/data-kafka/kafka-logs
num.partitions=10
num.recovery.threads.per.data.dir=1
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.hours=5
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
Bro and Kafka are installed and running on the same machine.
Bro Kafka plugin: https://github.com/bro/bro-plugins/tree/master/kafka
librdkafka: librdkafka-0.9.4.tar.gz
Kafka: http://ftp.cuhk.edu.hk/pub/packages/apache.org/kafka/0.10.0.0/kafka_2.11-0.10.0.0.tgz
Bro loads our custom scripts, which reassemble the HTTP data (including the full request and response). The data format is as below:
{"ts":1497353836.648655,"sip":"10.0.85.9","sport":60484,"dip":"10.0.81.48","dport":80,"protocol":"http","sensor":"10.0.81.60","worker":"worker-1-3","http.request.host":"10.0.81.48","http.request.uri":"/xab.html","http.request.method":"GET","http.request.body":"","http.request.body_len":0,"http.request.header_names":["HOST","USER-AGENT","ACCEPT"],"http.request.header_values":["10.0.81.48","curl/7.51.0","/"],"http.request.range_request":false,"http.request.username":"","http.request.password":"","http.response.status_code":200,"http.response.status_msg":"OK","http.response.body":"\u000a\u000atest\u000a\u000a\u000a","http.response.body_len":35,"http.response.header_names":["SERVER","DATE","CONTENT-TYPE","CONTENT-LENGTH","LAST-MODIFIED","CONNECTION","ETAG","ACCEPT-RANGES"],"http.response.header_values":["nginx/1.10.2","Tue, 13 Jun 2017 11:37:16 GMT","text/html","35","Tue, 13 Jun 2017 11:25:40 GMT","keep-alive","\u0022593fcbb4-23\u0022","bytes"]}
We set up a const variable to limit the HTTP response body length (for example: const http_max_body_len: count = 10240;).
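To make the effect of this limit concrete, here is a minimal Python sketch of the truncation we mean. It is only an illustration and not our actual Bro script; `truncate_body` and `HTTP_MAX_BODY_LEN` are hypothetical names that mirror the const above.

```python
# Illustration only: mirrors the Bro const http_max_body_len shown above.
HTTP_MAX_BODY_LEN = 10240  # bytes


def truncate_body(body: bytes, max_len: int = HTTP_MAX_BODY_LEN) -> bytes:
    """Keep at most max_len bytes of a reassembled HTTP body."""
    return body[:max_len]


if __name__ == "__main__":
    # A 100 KB response, as in the test scenario, is cut down to the limit.
    response = b"x" * (100 * 1024)
    print(len(truncate_body(response)))
```

A body shorter than the limit passes through unchanged, so only large responses are affected.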
Scenario:
The test configuration is as below:
- Average network traffic captured by Bro: 700 Mbps
- Size of a single HTTP response: 100 KB
- Limit on the HTTP response body length: http.response.body=51200 (50 KB)
The system performance status is as below:
- Average loopback network traffic: 1 Gbps
- Total disk write: 100 MB/s
Test Results:
We used htop and watch /usr/local/bro/bin/broctl top to monitor the system and Bro status. The OS memory usage grows over time until it fills up, and the operating system then becomes unstable.
The Bro manager used more than 40 GB of memory, while each worker used only about 200 MB.
So we performed another test in which we reduced the HTTP response length limit to 5 KB in the Bro script. In that test, the Bro manager memory usage stayed at around 130 MB.
In another test, we put 500 Mbps on the NIC and the Bro manager used 4 GB of memory (with an HTTP response length limit of 40960). However, when we stopped the performance test, the manager memory usage did not decrease; it stayed at 4 GB (we monitored it with vmstat during the test).
In summary, we assume that the rate at which the Bro manager writes data to Kafka is lower than the rate at which it receives the generated data, which leads to the high memory usage of the Bro manager.
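The assumption can be written as simple arithmetic: if data arrives faster than it is written out and nothing is dropped, the backlog held in memory grows linearly with time. Below is a rough Python sketch of that reasoning. The 60 MB/s and 50 MB/s rates are made-up placeholders, not measurements from our tests, and the function names are hypothetical.

```python
def backlog_bytes(produce_rate: float, drain_rate: float, seconds: float) -> float:
    """Bytes queued in memory after `seconds`, assuming nothing is dropped."""
    return max(0.0, (produce_rate - drain_rate) * seconds)


def seconds_to_fill(memory_bytes: float, produce_rate: float, drain_rate: float) -> float:
    """Time until the backlog reaches memory_bytes; infinite if the writer keeps up."""
    net_rate = produce_rate - drain_rate
    if net_rate <= 0:
        return float("inf")
    return memory_bytes / net_rate


if __name__ == "__main__":
    MB = 1024 * 1024
    GB = 1024 * MB
    produce = 60 * MB  # hypothetical: bytes/s arriving at the manager
    drain = 50 * MB    # hypothetical: bytes/s actually written to Kafka
    print("backlog after 10 min (GB):", backlog_bytes(produce, drain, 600) / GB)
    print("seconds until 40 GB:", seconds_to_fill(40 * GB, produce, drain))
```

If this model is right, lowering the body length limit reduces the arrival rate below the write rate, which would match the stable 130 MB we saw with the 5 KB limit.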
Is this understanding of the mechanism correct? Or does the Bro manager have a performance issue when writing a large amount of data to Kafka? Or is our configuration incorrect? Please kindly let me know if you have any recommendations. Thank you so much.