I am in the process of clusterizing a bunch of scripts, using worker2manager and manager2worker events to do so. This seems to be working *quite fantastically*, actually, and I see a 1-to-1 mapping on the data moving around.
I still don't quite understand how the communication happens in the background (can someone elaborate or point me to where I should be looking?).
While I am using local caches and not sending data that has already been sent around, I know the number of events has still increased significantly. I am wondering whether, in the background, the proxies/workers/manager keep a persistent connection over which bytes just move (so it doesn't much matter how many times we move the data), or whether I am in danger of overloading the proxies at some point with this communication. Would increasing the number of proxies help?
As an example test case, I am trying to synchronize a bloomfilter (populated with IPs based on outgoing SF connections seen) across workers using this technique.
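To make the pattern concrete, here is a minimal Python sketch (not actual Bro script) of what I mean by putting a local cache in front of the event. The names `Worker`, `send_to_manager`, and `saw_outgoing_sf` are hypothetical and only for illustration, and a plain set stands in for the bloomfilter; the real scripts may differ.

```python
class Worker:
    """Hypothetical stand-in for a cluster worker; not a Bro API."""

    def __init__(self, send_to_manager):
        # Callback standing in for raising a worker2manager event (assumption).
        self.send_to_manager = send_to_manager
        # Local cache of IPs already forwarded; a set stands in for the bloomfilter.
        self.already_sent = set()

    def saw_outgoing_sf(self, ip):
        """Forward an IP to the manager only the first time it is seen locally."""
        if ip in self.already_sent:
            return False  # suppressed: no event is raised
        self.already_sent.add(ip)
        self.send_to_manager(ip)
        return True


# Usage: three observations, one of them a repeat.
events = []
worker = Worker(events.append)
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    worker.saw_outgoing_sf(ip)
print(len(events))  # prints 2
```

Even with this kind of suppression, every new IP still costs at least one event per worker, which is why the overall event count goes up.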
Right now I don't see a significant increase in CPU or memory per se from doing this, but porting the old scan detection to the cluster is next on my to-do list, and I want to make sure I don't cause the proxies to explode.
Thanks,
Aashish