I have a few questions about improving Zeek's overall performance by compiling Zeek scripts to C++. As Dr. Vern mentioned in the post, recent Zeek releases (5.x) can compile Zeek scripts to C++ code and run them directly within Zeek (as described in the README guide on GitHub).
We did some initial testing, but the results were not as expected. Here are some test details:
We extended the default HTTP handling script in the base directory by adding a few more fields to the original record and extracting the bodies of HTTP requests and replies through events. We named the new extension script http-ext.zeek and put it in the site directory.
We load http-ext.zeek from the local.zeek script in the same directory.
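To make the setup concrete, a script of this kind could look roughly like the sketch below. It is only an illustration of the approach, not the actual http-ext.zeek: the field names request_body and response_body and the max_body_bytes cap are hypothetical. HTTP::Info and the http_entity_data event come from Zeek's base HTTP scripts and event set.

```zeek
@load base/protocols/http

module HTTP;

export {
	# Hypothetical extra fields added to the base HTTP::Info record.
	redef record Info += {
		request_body: string &optional &log;
		response_body: string &optional &log;
	};

	# Hypothetical soft cap on how many body bytes to keep per message.
	const max_body_bytes: count = 4096 &redef;
}

# Called for each chunk of HTTP entity (body) data in either direction.
event http_entity_data(c: connection, is_orig: bool, length: count, data: string)
	{
	if ( ! c?$http )
		return;

	if ( is_orig )
		{
		if ( ! c$http?$request_body )
			c$http$request_body = "";

		# Stop appending once the cap is reached; the last chunk may overshoot it.
		if ( |c$http$request_body| < max_body_bytes )
			c$http$request_body = c$http$request_body + data;
		}
	else
		{
		if ( ! c$http?$response_body )
			c$http$response_body = "";

		if ( |c$http$response_body| < max_body_bytes )
			c$http$response_body = c$http$response_body + data;
		}
	}
```

With a layout like this, local.zeek in the same site directory would pull the script in with a line such as @load ./http-ext.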
We followed the instructions and compiled the script using
zeek -O gen-C++ http-ext.zeek
The network traffic we used for the test contained a mix of HTTP and DNS, with a rate of ~1.5K connections per second (CPS) and an HTTP payload of ~3.9 KB.
We ran a single Zeek process using -O use-C++ local.zeek and analyzed the network traffic. Zeek reported no drops and had a CPU usage of ~52% according to the top command.
For comparison, we ran Zeek again under the same conditions, but with the -O ZAM option. Zeek again reported no drops and had a CPU usage of ~48%.
So the results show that ZAM optimization performs similarly to, or slightly better than, C++ compilation, which is contrary to what we expected. Based on our limited understanding, we thought that compiling the extended HTTP script into C++ code would bypass the Zeek script engine, and since over 90% of our test traffic was HTTP, we expected a significant performance improvement (1x% to 2x%). But that does not seem to be the case.
We are wondering whether there are any mistakes in our tests or in our understanding. We also have some additional questions about the compilation process.
Since http-ext.zeek is an extension of the default HTTP scripts, will the HTTP handling scripts in the base directory also be compiled to C++? If not, can we compile the HTTP handling scripts in the base directory as well and expect an improvement in the TCP-to-HTTP analysis pipeline?
We appreciate any help and are happy to provide more details about the tests if needed.