About the performance improvement by compiling zeek scripts to C++ code

mchen · February 8, 2023, 1:42pm

Hi everyone,

I have a few questions about improving the overall performance of Zeek by compiling Zeek scripts to C++. As Dr. Vern mentioned in the post, recent Zeek releases (5.x) have the ability to compile Zeek scripts to C++ code and run them directly within Zeek (as described in the README guide on GitHub).

We did some initial testing, but the results were not as expected. Here are some test details:

We extended the default HTTP handling script in the base directory by adding a few more fields to the original record and extracting the bodies of HTTP requests and replies through events. We named the new extension script http-ext.zeek and put it in the site directory.
We load http-ext.zeek from the local.zeek script in the same directory.
We followed the instructions and compiled the script using zeek -O gen-C++ http-ext.zeek.
The network traffic we used for the test contained a mix of HTTP and DNS, with a CPS of ~1.5K and an HTTP payload of ~3.9KB.
We ran a single Zeek process using -O use-C++ local.zeek and analyzed the network traffic. Zeek reported no drops and had a CPU usage of ~52% according to the top command.
For comparison, we ran Zeek again under the same conditions, but with the -O ZAM option. Zeek again reported no drops and had a CPU usage of ~48%.

However, the results show that ZAM optimization performs similarly to or better than C++ compilation, which is contrary to what we expected. Based on our limited understanding, we thought that compiling the extended HTTP script into C++ code would bypass the Zeek script engine, and since our test network traffic contained over 90% HTTP traffic, we expected a significant performance improvement (1x%~2x%). But that does not seem to be the case.
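To put the two CPU figures above side by side, here is a minimal sketch of the arithmetic. It uses only the ~52% and ~48% values reported in this post; the function name `relative_change` is made up for illustration. Keep in mind that these are single readings from top, and that no run without any optimization was reported, so the gap may well be within measurement noise.

```python
# CPU usage as reported in the post (approximate readings from top).
cpu_cpp = 52.0  # run with -O use-C++
cpu_zam = 48.0  # run with -O ZAM


def relative_change(new: float, baseline: float) -> float:
    """Percent change of `new` relative to `baseline` (hypothetical helper)."""
    return (new - baseline) / baseline * 100.0


# Positive means the C++ run used more CPU than the ZAM run.
diff = relative_change(cpu_cpp, cpu_zam)
print(f"C++ vs ZAM: {diff:+.1f}% CPU")
```

With these inputs the C++ run comes out roughly 8% higher in relative terms. Repeating each run several times and including an unoptimized baseline would show whether that difference is meaningful.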

We are wondering whether there are any mistakes in our tests or our understanding. We also have some additional questions about the compilation process.

Since http-ext.zeek is an extension of the default HTTP scripts, will the HTTP handling scripts in the base directory also be compiled to C++? If not, can we also compile the HTTP handling scripts in the base directory and expect an improvement in the TCP-to-HTTP analysis pipeline?

We appreciate any help and are happy to provide more details about the tests if needed.

Thank you!

Vern · February 9, 2023, 5:38am

If you’re active on Zeek Slack, let’s take the discussion there. There are a lot of specifics to go into that’ll be easier to do via Slack rather than here. However if that’s not an option, let me know and we can iterate here.

mchen · February 9, 2023, 8:45am

Sure, Dr. Vern, let’s discuss on Slack. Thanks!

eladsolomon · September 27, 2023, 8:27am

I would be happy if you could share the conclusions from your Slack discussion here.

mchen · September 28, 2023, 2:25am

Hello,

Dr. Vern has fixed several bugs in the optimization code, and I believe all of the fixes have been merged into the latest release of Zeek. The optimization mechanism now works out of the box.

You can find both the ZAM and C++ optimization guides at: https://github.com/zeek/zeek/tree/master/src/script_opt

Depending on the traffic model, both ZAM and C++ should provide performance improvements, although the extent of the improvement may vary. Note that you can only choose one of the optimization methods.

I wonder if @Vern has additional insights, or if there’s anything I’ve misunderstood.

Thank you.

Vern · September 28, 2023, 5:39am

What you sketch is correct. Those who are interested should note that script optimization remains experimental, and is not yet heavily tested, so bugs continue to turn up. It’d be great to hear from users who encounter problems - if possible, first confirming that they still manifest when running off of the latest version in GitHub, since I’m making frequent updates there.
