Usage instructions for the ZAM (Zeek Assembly Machine) feature in recent releases?

Hello everyone,

I noticed there seems to be a new feature named ZAM (Zeek Assembly Machine) that aims to improve the performance of the Zeek script engine, according to

Since we also need this kind of performance optimization, may I ask what the current status of this feature is, and whether it has been merged into recent releases of Zeek?

I searched the Zeek repository on GitHub, and according to ZAM script compilation, it seems the feature was merged in Zeek 5.1.0?

If that’s the case, do we need to take any extra steps to enable it? Or is it enabled by default, so that we can expect faster Zeek script execution on 5.1.0?

Thank you for your help!

ZAM is available in Zeek 5 by specifying -O ZAM on the command-line. (There are also environment variables for turning it on if you need that.) See src/script_opt/ZAM/README.md for particulars.
I welcome feedback about correctness and performance.

Note that it takes a number of seconds to compile the scripts, so if you try it out on small inputs it will take longer, not shorter.

(BTW, ZAM = Zeek Abstract Machine.)
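
To see the compile-time overhead mentioned above, one option is to run the same trace with and without -O ZAM and compare wall-clock time. This is only a minimal sketch for a POSIX shell: `ZEEK_BIN`, `compare_zam`, and the trace name are hypothetical, `-O ZAM` comes from the reply above, and `-r` is Zeek's usual flag for reading a pcap file.

```shell
# Path to the zeek binary (assumption: "zeek" is on PATH unless overridden).
ZEEK_BIN="${ZEEK_BIN:-zeek}"

# Run one trace twice, first with the default interpreter and then with ZAM,
# and print the elapsed wall-clock seconds for each run.
compare_zam() {
    pcap="$1"

    start=$(date +%s)
    "$ZEEK_BIN" -r "$pcap"
    echo "interpreter: $(( $(date +%s) - start ))s"

    start=$(date +%s)
    "$ZEEK_BIN" -O ZAM -r "$pcap"
    echo "ZAM: $(( $(date +%s) - start ))s"
}

# Usage (hypothetical trace file):
#   compare_zam large-trace.pcap
```

Each run writes its logs to the current directory, so the second run overwrites the first. On a small trace the ZAM run is expected to be the slower one, since the compile time is paid up front.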

Thank you for your kind reply and the correction! :smiley:

We’ll try it out.

Our environment currently uses zeekctl to set up a Zeek cluster. To enable ZAM, according to zeekctl#configuration, I assume we need to add the option -O ZAM to the zeekctl.cfg file as:

ZeekArgs = -O ZAM

In this case, will ZAM be enabled for all Zeek workers?

Thanks again!

Unfortunately I don’t use zeekctl so can’t comment about whether that’s the right tweak. Perhaps @Aashish_Sharma1 can opine.

Yes, put ZeekArgs=-O ZAM in your zeekctl.cfg. You can then use zeekctl to operate your cluster as usual, and ZAM will be active across all workers as well.
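
A change to zeekctl.cfg only takes effect once it is pushed out to the nodes. A rough sketch of doing that and then checking the result, assuming a POSIX shell on the manager host: `ZEEKCTL_BIN`, `deploy_zam`, and `show_zam_processes` are hypothetical names, and the `ps` check rests on the assumption that the ZeekArgs value shows up on each Zeek process's command line.

```shell
# Path to zeekctl (assumption: on PATH unless overridden).
ZEEKCTL_BIN="${ZEEKCTL_BIN:-zeekctl}"

# Push the edited zeekctl.cfg to all nodes and restart them.
# "deploy" checks the configuration, installs it, and restarts the cluster;
# "status" then lists the state of every node.
deploy_zam() {
    "$ZEEKCTL_BIN" deploy && "$ZEEKCTL_BIN" status
}

# List zeek processes on the local host whose command line includes -O ZAM.
# Returns non-zero if none are found.
show_zam_processes() {
    ps -eo args | grep '[z]eek' | grep -- '-O ZAM'
}

# Usage:
#   deploy_zam && show_zam_processes
```

On a multi-host cluster the `ps` check only covers the host it runs on, so it would need to be repeated on each worker host.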

Thanks for your reply! We’ll try it in our environment and see how it performs.

Hello everyone,

First, I would like to thank @Vern and @Aashish_Sharma1 again for your kind replies. We ran some initial performance tests with ZAM and would like to share the results.

In summary, we care more about cps (connections per second) than about raw traffic volume. Under the same test setup, enabling ZAM increased the cps that Zeek could sustain by roughly 20%.

Test details are below.

Physical server running Zeek:

  • CPU: 40-core Xeon @ 2.1 GHz
  • Memory: 256 GB
  • Network adapter: Intel X520 10 Gigabit network adapter

The traffic generator produces mixed network traffic consisting of:

  • (99%) HTTP (v4/v6), 4 KB payload
  • (1%) DNS (v4/v6)

The load is ~20K to 24K cps. Each TCP connection stays open for ~30 s, with ~800K TCP connections open in total and ~1 Gbps of total network traffic.

We port-mirror the generated traffic through a switch to the server’s 10 Gb network adapter, where it is processed by Zeek.

To capture the traffic we use PF_RING ZC and load-balance with zbalance_ipc to create a total of 31 RSS queues:

zbalance_ipc -i zc:ens192 -n 31 -m 1 -c 10 -g 1

Zeek configuration:

  • version 5.1.0
  • 31 workers (1 core per worker, with each worker processing traffic from one particular PF_RING RSS queue)
  • 1 logger, 1 manager, and 1 proxy process
  • ZAM disabled/enabled via the ZeekArgs setting mentioned above

With this setup, no third-party plugins or scripts installed, and the traffic generator running for about 1 hour, the results showed that:

  • without ZAM, Zeek can process ~20K cps without reporting dropped packets (according to the capture_loss and stats logs)
  • with ZAM on, Zeek can process ~24K cps with no (or very few) reported dropped packets

The conn, dns, files, and http logs were written normally throughout the test and contained correctly analyzed entries. Average CPU load on the workers was ~95%, and memory usage was ~90 GB.

We haven’t done thorough testing yet, but based on these initial results, ZAM does provide a significant performance improvement.
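
For reference, the ~20% figure follows directly from the two sustained rates. A small sketch of the arithmetic in POSIX shell (the function names `pct_gain` and `per_worker` are hypothetical, and the per-worker numbers assume the load is spread evenly over the 31 workers, which the test did not measure):

```shell
# Integer percentage increase from a baseline ($1) to a new value ($2).
pct_gain() {
    echo $(( ($2 - $1) * 100 / $1 ))
}

# Average load per worker: total cps ($1) divided by worker count ($2),
# rounded down.
per_worker() {
    echo $(( $1 / $2 ))
}

pct_gain 20000 24000   # relative cps gain with ZAM, in percent
per_worker 20000 31    # cps per worker without ZAM
per_worker 24000 31    # cps per worker with ZAM
```

This prints 20, 645, and 774, i.e. roughly 645 cps per worker without ZAM and 774 with it.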

I noticed that besides -O ZAM, there are some additional ZAM options according to ZAM/README. Are there any options that print debug information we could use to improve performance further?

We would also appreciate any comments, ideas, or suggestions on how to tune our settings/configs for better performance. Let us know if you would like more details about our tests.

Thanks again for bringing this great feature to the new Zeek releases!

It’s great to hear that you’re getting definite performance benefits running on a large workload!

Regarding options for trying to further increase performance, there are three possibilities that come to mind:

  1. Run using --profile-scripts (both with and without -O ZAM) to generate script-level profiles to stdout, used to look for script-level optimization opportunities. I can help with interpreting what it shows.

  2. Run using -O ZAM -O profile-ZAM to get a fine-grained profile of ZAM execution that you can send to me for analysis.

  3. Compile your scripts to C++ using the features documented in src/script_opt/CPP/README.md. This is a significant undertaking, so if you want to do it you should contact me via the Zeek Slack so we can discuss the particulars.

I’ve already put a bunch of work into optimizing ZAM for executing the default scripts, so the first two suggestions above only make sense if you’re using significant additional scripts over the default ones. (Also, the first two options will make your execution run much more slowly than it normally does, because of the expense of the instrumentation used to gather the profiles, so you’ll need to adjust your traffic load accordingly.)
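
The first two suggestions could be scripted roughly as follows. This is a sketch only, assuming a POSIX shell and a recorded trace read with `-r`: `ZEEK_BIN`, the function names, and the output file names are hypothetical, while `--profile-scripts`, `-O ZAM`, and `-O profile-ZAM` are the flags named above.

```shell
# Path to the zeek binary (assumption: "zeek" is on PATH unless overridden).
ZEEK_BIN="${ZEEK_BIN:-zeek}"

# Suggestion 1: script-level profiles, once per execution mode.
# $1 = trace file, $2 = prefix for the output files.
profile_scripts() {
    "$ZEEK_BIN" --profile-scripts -r "$1" > "$2.interp.txt"
    "$ZEEK_BIN" --profile-scripts -O ZAM -r "$1" > "$2.zam.txt"
}

# Suggestion 2: fine-grained ZAM execution profile.
# stderr is captured as well, in case the profile is not written to stdout.
profile_zam() {
    "$ZEEK_BIN" -O ZAM -O profile-ZAM -r "$1" > "$2.zam-ops.txt" 2>&1
}

# Usage (hypothetical trace file and prefix):
#   profile_scripts sample.pcap run1
#   profile_zam sample.pcap run1
```

Any custom scripts or packages normally loaded in production would need to be added to each command line, since the point is to profile the scripts beyond the defaults.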

Thank you for your time and great guidance. We would actually like to try all three of the possibilities you mentioned.

Regarding methods 1 and 2, we do have some third-party plugins from packages.zeek.org and some custom scripts installed in our working environment. Since we know they require extra computation, we used the default-scripts setup as a performance baseline.

We’ll test the setup with our scripts/plugins loaded and adjust the traffic accordingly. Once we have the profile results, we’ll post them as soon as possible.

Regarding the third method, we are more than happy to try the feature. It may be that we don’t have to compile all scripts to C++, only the bottleneck ones identified by the profiling results from methods 1 and 2. We look forward to discussing the details with you on the Zeek Slack.

Thank you again for your time. We’re honored to have the chance to try ZAM and perhaps even improve its performance further.