Cluster setup

Hi All,

Here is what I am trying to achieve: Incoming traffic on Host-A should
be sent to worker Host-B (and to more workers in future).

Here is what my config looks like in node.cfg:

Manager: Host-A
Proxy: Host-A
Worker1: Host-B (which is 10.73.149.31)
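
Roughly, in node.cfg syntax, that layout looks like the following sketch (the interface name bce1 is an assumption, taken from the "-i bce1" that shows up in the diag output later in this thread):

    [manager]
    type=manager
    host=Host-A

    [proxy-1]
    type=proxy
    host=Host-A

    [worker-1]
    type=worker
    host=10.73.149.31
    interface=bce1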

I have Bro installed on all machines. Now I start broctl on Host-A:

-bash-4.2$ sudo broctl
Password:

Welcome to BroControl 1.1

Type “help” for help.

[BroControl] > install
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/site … done.
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/auto … done.
creating policy directories … done.
installing site policies … done.
generating cluster-layout.bro … done.
generating local-networks.bro … done.
generating broctl-config.bro … done.
updating nodes … warning: host 10.73.149.31 is not alive
done.
[BroControl] > install
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/site … done.
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/auto … done.
creating policy directories … done.
installing site policies … done.
generating cluster-layout.bro … done.
generating local-networks.bro … done.
generating broctl-config.bro … done.
updating nodes … done.
[BroControl] > start
starting manager …
starting proxy-1 …
starting worker-1 …
cannot create working directory for worker-1 <<-- not sure why I get
this message.
[BroControl] >

Do I need to do anything on worker-1? Do I need to put it in some special mode?

Any help/pointers would be appreciated.

Cheers,
Hiren

Did you verify that you can ssh from host-A to host-B without
having to type a password?
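
If not, the usual setup is something along these lines, run as whichever user broctl runs as (root here, as it turns out further down):

    # on Host-A: generate a key and push it to the worker
    ssh-keygen -t rsa               # accept defaults, empty passphrase
    ssh-copy-id root@10.73.149.31
    ssh root@10.73.149.31 true      # should succeed with no password prompt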

Next, on host-B, verify that the partition where /usr/local/spool
is located is not mounted read-only and that there is some free
disk space (broctl is trying to create a directory
in /usr/local/spool on host-B).
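
For example, on host-B:

    mount | grep /usr/local      # look for a "read-only" or "ro" flag
    df -h /usr/local/spool       # check free space on that partition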

> Did you verify that you can ssh from host-A to host-B without
> having to type a password?

Just set that up.

> Next, on host-B, verify that the partition where /usr/local/spool
> is located is not mounted read-only and that there is some free
> disk space (broctl is trying to create a directory
> in /usr/local/spool on host-B).

Checked this too.

Still,

[BroControl] > install
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/site ... done.
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/auto ... done.
creating policy directories ... done.
installing site policies ... done.
generating cluster-layout.bro ... done.
generating local-networks.bro ... done.
generating broctl-config.bro ... done.
updating nodes ... warning: host 10.73.149.31 is not alive
done.
[BroControl] >

What does that mean? I still cannot get worker-1 to work properly.

in "top" (inside broctl) also, worker-1 is shown <not running>

Do I need to setup anything on worker-1?

Cheers,
Hiren

Ah, I realized that I had to set this up as "root", because broctl is run as root :-)

Set that up and now:

[BroControl] > install
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/site ... done.
removing old policies in
/usr/local/spool/installed-scripts-do-not-touch/auto ... done.
creating policy directories ... done.
installing site policies ... done.
generating cluster-layout.bro ... done.
generating local-networks.bro ... done.
generating broctl-config.bro ... done.
updating nodes ... warning: host 10.73.149.31 is not alive
done.
[BroControl] > check
manager is ok.
proxy-1 is ok.
worker-1 is ok.
[BroControl] > start
starting manager ...
starting proxy-1 ...
starting worker-1 ...
worker-1 terminated immediately after starting; check output with "diag"

[BroControl] > diag worker-1
[worker-1]

==== No reporter.log

==== stderr.log
error in /usr/local/share/bro/base/frameworks/cluster/__load__.bro,
line 16: can't open cluster-layout

==== stdout.log
unlimited
536870912
unlimited

==== .cmdline
-i bce1 -U .status -p broctl -p broctl-live -p local -p worker-1
local.bro broctl base/frameworks/cluster local-worker.bro broctl/auto

==== .env_vars
PATH=/usr/local/bin:/usr/local/share/broctl/scripts:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/home/y/bin:/root/bin
BROPATH=/usr/local/spool/installed-scripts-do-not-touch/site::/usr/local/spool/installed-scripts-do-not-touch/auto:/usr/local/share/bro:/usr/local/share/bro/policy:/usr/local/share/bro/site
CLUSTER_NODE=worker-1

==== .status
TERMINATED [atexit]

==== No prof.log

==== No packet_filter.log

==== No loaded_scripts.log
[BroControl] >

Trying to determine what is causing this.

cheers,
Hiren

Exit from broctl, then verify that you can ping the
worker machine from the manager machine. If that works,
then do another "broctl install" and make sure you
don't see any error or warning messages.
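
For example, from the manager:

    ping -c 1 10.73.149.31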

I am not able to find or understand what is causing this problem.

Anyone with some clue?

Thanks in advance,
Hiren

Alright, so it's looking for a file: cluster-layout.bro.

I could see it on the manager node at:
/usr/local/spool/installed-scripts-do-not-touch/auto/
But it was not present at the same location on the worker-1 node. (Please
let me know if there is a better way to do this.)

I scp'ed that file over, and then
[BroControl] > install
[BroControl] > start

worked.
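
For reference, the copy amounts to something like this (assuming the same install prefix on both hosts):

    scp /usr/local/spool/installed-scripts-do-not-touch/auto/cluster-layout.bro \
        root@10.73.149.31:/usr/local/spool/installed-scripts-do-not-touch/auto/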

top also shows all the nodes: manager, proxy-1 and worker-1 active.

Now it's time for actual traffic.

Cheers,
Hiren

This is from:

    def isAlive(host):
        ...
        (success, output) = runLocalCmd(os.path.join(config.Config.scriptsdir, "is-alive") + " " + util.scopeAddr(host))
        ...
        if not success and not config.Config.cron == "1":
            util.warn("host %s is not alive" % host)

which just runs the is-alive script, which in turn runs

    ping -c 1 -W 1 host

So if that is failing, are you running a restrictive iptables policy, or have you disabled ICMP?
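
A quick way to check, from the manager and the worker respectively (standard commands; pick the one matching the worker's OS):

    # from the manager: the same probe the is-alive script runs
    ping -c 1 -W 1 10.73.149.31

    # on the worker: list firewall rules and look for ICMP drops
    iptables -L -n      # Linux
    ipfw list           # FreeBSD ipfw
    pfctl -sr           # FreeBSD pf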


Thanks for pointing that out. I've seen that on quite a few (CentOS) hosts where the default iptables setup interferes with BroControl.

  .Seth

Thanks for all the hints and help, guys.

I have a question about parallelism. If I am consuming live traffic on
a 10G Intel NIC with 8 queues on my manager node, can I set up
Bro to send data to 8 separate workers such that queue 1's traffic goes
to worker 1, and so on?

Is that possible? Or am I thinking about it wrong?

Cheers,
Hiren

Yes, in that case you would just put your workers and your manager on the same physical host.
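
In BroControl versions that support load balancing, the node.cfg shorthand for a set of colocated workers looks roughly like the sketch below (lb_method=pf_ring assumes PF_RING is available, which, as discussed later in this thread, it is not on FreeBSD):

    [worker-1]
    type=worker
    host=Host-A
    interface=eth0      # illustrative device name
    lb_method=pf_ring
    lb_procs=8          # spawns 8 worker processes on this host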

.Seth

> I have a question about parallelism. If I am consuming live traffic on
> a 10G Intel NIC with 8 queues on my manager node

> Yes, in that case you would just put your workers and your manager on the same physical host.

But then would one host be able to keep up doing everything?

And there is no way I can direct a queue's traffic to a specific worker?

Thanks a lot for your help,
Hiren

You already said you had 8 queues on your NIC, though? I guess I assumed you had something like PF_Ring configured to split your traffic across multiple processes.

.Seth

Right. So (afaik) in FreeBSD we do not have PF_RING-like functionality,
where there is a PF_RING application SDK and applications can choose
which queue they want to listen to. The Intel NIC (that I am using)
can definitely distribute traffic across its 8 queues, but the question
for me is: how do I distribute it to the application/workers?

I also have a larger question here though.

How do people usually do what I am trying to do: have a box tap
continuous traffic on a 10G card and let workers parse/interpret it
(via Bro)?

Do they have a PF_RING setup which blindly maps queue 1's traffic to
worker 1, and Bro (using PF_RING's SDK) does the parsing?

The simple aim here is load distribution.

Thanks a lot for tolerating my (possibly) stupid questions :-)

Cheers,
Hiren

> Right. So (afaik) in FreeBSD we do not have PF_RING-like functionality,
> where there is a PF_RING application SDK and applications can choose

Ah, generally right now people are only doing load balancing on FreeBSD with Myricom NICs and the Myricom Sniffer driver.
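
For reference, in BroControl versions with load-balancing support, the Myricom case is expressed in node.cfg roughly as in the sketch below; host and device names are illustrative:

    [worker-1]
    type=worker
    host=Host-B
    interface=myri0      # illustrative device name
    lb_method=myricom
    lb_procs=8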

> which queue they want to listen to. The Intel NIC (that I am using)
> can definitely distribute traffic across its 8 queues, but the question
> for me is: how do I distribute it to the application/workers?

In FreeBSD, at the moment, you don't. It's possible that if you have netmap enabled you might be able to use that in some fashion, but generally those FlowDirector-based queues on the high-end Intel NICs aren't actually exposed in userland. If you are talking about RSS (receive side scaling), then that's insufficient unless you have RX and TX RSS (I'm a little confused about this, but I read something recently that seemed to indicate this might be a thing on some NICs), because both directions of each connection need to go to the same process.
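
To make that last point concrete, here is a toy sketch (plain Python, not Bro or PF_RING code) of why an order-sensitive hash can send the two directions of one connection to different queues, while a symmetric hash keeps them together:

    # Toy model: hash a packet's 4-tuple to one of 8 queues/workers.
    NUM_QUEUES = 8

    def naive_hash(src, sport, dst, dport):
        # Order-sensitive: (A -> B) and (B -> A) usually land in different queues.
        return hash((src, sport, dst, dport)) % NUM_QUEUES

    def symmetric_hash(src, sport, dst, dport):
        # Order-insensitive: sort the endpoints first, so both directions
        # of the same connection always land in the same queue.
        a, b = sorted([(src, sport), (dst, dport)])
        return hash((a, b)) % NUM_QUEUES

    fwd = ("10.0.0.1", 12345, "10.0.0.2", 80)   # client -> server
    rev = ("10.0.0.2", 80, "10.0.0.1", 12345)   # server -> client

    print(naive_hash(*fwd), naive_hash(*rev))           # usually differ
    print(symmetric_hash(*fwd), symmetric_hash(*rev))   # always equal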

> Do they have a PF_RING setup which blindly maps queue 1's traffic to
> worker 1, and Bro (using PF_RING's SDK) does the parsing?

Typically people run PF_Ring in mode 0, which is actually not exposing hardware-load-balanced traffic. It's collecting all of the traffic and load balancing it in the core.

I don't have much of a suggestion right now for FreeBSD beyond Myricom though.

  .Seth

> Right. So (afaik) in FreeBSD we do not have PF_RING-like functionality,
> where there is a PF_RING application SDK and applications can choose

> Ah, generally right now people are only doing load balancing on FreeBSD with Myricom NICs and the Myricom Sniffer driver.

This is not an option for me, but I will surely look at how they are doing it.

> which queue they want to listen to. The Intel NIC (that I am using)
> can definitely distribute traffic across its 8 queues, but the question
> for me is: how do I distribute it to the application/workers?

> In FreeBSD, at the moment, you don't. It's possible that if you have netmap enabled you might be able to use that in some fashion, but generally those FlowDirector-based queues on the high-end Intel NICs aren't actually exposed in userland. If you are talking about RSS (receive side scaling), then that's insufficient unless you have RX and TX RSS (I'm a little confused about this, but I read something recently that seemed to indicate this might be a thing on some NICs), because both directions of each connection need to go to the same process.

Yeah, the tricky part is the userland association. But I am also not too
clear on the RSS details. That looks like the only option I have; I need
to dig deeper.

> Do they have a PF_RING setup which blindly maps queue 1's traffic to
> worker 1, and Bro (using PF_RING's SDK) does the parsing?

> Typically people run PF_Ring in mode 0, which is actually not exposing hardware-load-balanced traffic. It's collecting all of the traffic and load balancing it in the core.

Here, core == Bro's core?

I really appreciate you taking the time to respond to my questions, Seth :-)

cheers,
Hiren

Sorry, totally wrong word. I meant kernel. :-)

  .Seth

Ah, okay. So instead of the card, let the kernel do all the load
balancing (probably with assistance from PF_RING).

Thank you,
Hiren