I am trying to figure out whether it is possible to change the number of nodes running Zeek in a cluster configuration without restarting it. This could be a way to cope with increased network traffic during certain periods of the day, or on certain days when traffic is expected to peak. However, restarting Zeek could cause a loss of data, and I would rather avoid that.
As far as I understand, I can update the node.cfg file, for example with new workers, and run the deploy command in broctl to apply the configuration. But this will stop and restart the workers for a short time. Is there a way to avoid that? I had a look at the cluster framework and other parts of Zeek’s code, but it doesn’t look easy to me.
Thanks in advance,
A dynamically changing cluster is theoretically possible, but not
something I know any tricks to get working now -- it's likely some
effort to hack that feature in or else try to roll your own cluster
config that uses the underlying Broker framework to set up connections
instead of the default cluster/broctl frameworks.
The main issue with adding new workers on demand is here:
the cluster layout expects to know about all workers beforehand.
However, if you changed the code to assume that when an unknown node
connects and is calling itself worker-44, to just add a worker with
that ID to the nodes table as a worker... basically just trust the
However, there are other issues. If you're using something like
AF_PACKET and add a worker, the number of workers changes, so
connections hash differently and move between workers, which is
almost as bad as restarting things.
If you were talking about physically adding new nodes then you have a
similar problem if you are using a packet broker because the hashing
will change on that side as well.
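To illustrate the hashing issue, here is a rough Python sketch. It
assumes a simple "hash of the flow tuple modulo the number of workers"
model, which is only an approximation of what AF_PACKET fanout or a
packet broker really does, and the names worker_for and fraction_moved
are made up for this example. It counts how many flows end up on a
different worker when the worker count changes.

```python
import hashlib


def worker_for(flow, num_workers):
    """Pick a worker index for a flow: hash the flow tuple, then take it
    modulo the worker count. This is an assumed, simplified model, not
    the actual kernel or packet-broker hash."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def fraction_moved(flows, old_workers, new_workers):
    """Return the fraction of flows whose worker changes after resizing."""
    moved = sum(
        1
        for flow in flows
        if worker_for(flow, old_workers) != worker_for(flow, new_workers)
    )
    return moved / len(flows)


# Synthetic flows (source IP, source port, destination IP, destination port).
flows = [
    ("10.0.0.%d" % (i % 250), 1024 + i, "192.168.1.1", 443)
    for i in range(10000)
]

# Going from 8 to 9 workers: most flows land on a different worker.
print(fraction_moved(flows, 8, 9))
```

Under this model, adding a single worker reassigns the large majority of
flows, so in-flight connections would be split across two workers. A
consistent-hashing scheme would move fewer flows, but whether the
capture layer can be made to use one is a separate question.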
This is not a bad idea though. I had wanted to build a cluster on top
of k8s or Nomad and integrate it with Arista to be able to dynamically
provision and resize clusters.
Thanks for your reply.
My main concern is that changing the cluster config on the fly may disrupt something else in Zeek's processing of traffic, as you point out below. Not sure what else may be affected by such changes.
Still, dynamically resizing a cluster and having Zeek adapt to it seems like a possible way to handle temporary peaks in network traffic.