log streams in a bro cluster

Hi all,

I need some help with the logs generated by a Bro Cluster:

I have 5 bro scripts that run in all workers of my cluster infrastructure. All of them work OK, sending notices to the manager and all the staff, but one of them should create a LOG stream (warnings.log) that I can't find anywhere:

Log::create_stream(umas::WARN, [$columns=warn_info,$path="warnings"]);

If I run my script in a single bro installation, all logs and notices seem to work, but I need it working in a cluster infrastructure.

I expected this Log stream to be sent to the 'logs' directory in the manager, but that log file is not there. Only standard log files (dns.log, http.log, stdout.log, etc) are copied to the 'logs' directory.

This warnings.log file do not appear either anywhere in the worker, and not error log file is shown, so... I'm lost.

I anyone can shed some light into this, I would appreciate it.

The other problem I have is the following: My script should open a config file. In a single machine infrastructure this config file is in the same directory of the scripts, and everything work fine. The file is opened and read. However in a cluster infrastructure the file is not opened in the workers. I find that the file is copied by broctl to the worker BUT it is not read when the bro script is running. Anyone can tell me what I'm doing wrong or where should I locate that file in the workers?

Thank you for any help!!

Hi all,

I need some help with the logs generated by a Bro Cluster:

I have 5 bro scripts that run in all workers of my cluster
infrastructure. All of them work OK, sending notices to the manager and
all the staff, but one of them should create a LOG stream (warnings.log)
that I can't find anywhere:

Log::create_stream(umas::WARN, [$columns=warn_info,$path="warnings"]);

If I run my script in a single bro installation, all logs and notices
seem to work, but I need it working in a cluster infrastructure.

I expected this Log stream to be sent to the 'logs' directory in the
manager, but that log file is not there. Only standard log files
(dns.log, http.log, stdout.log, etc) are copied to the 'logs' directory.

This warnings.log file do not appear either anywhere in the worker, and
not error log file is shown, so... I'm lost.

I anyone can shed some light into this, I would appreciate it.

When are you writing to that log? Just creating the log stream doesn't create the file until you do a

    Log::write(umas::WARN, record);

The other problem I have is the following: My script should open a
config file. In a single machine infrastructure this config file is in
the same directory of the scripts, and everything work fine. The file is
opened and read. However in a cluster infrastructure the file is not
opened in the workers. I find that the file is copied by broctl to the
worker BUT it is not read when the bro script is running. Anyone can
tell me what I'm doing wrong or where should I locate that file in the
workers?

Thank you for any help!!

How are you loading the configuration file?

You should be using something like

    local config_path = fmt("%s/my-config.something", @DIR);

otherwise a relative or absolute path may not be what you expect.

Thank you for your reply Justin.

You are right, probably I didn't explain myself. There's data to be logged but the log file is not created.

Actually, if I set:

redef Log::enable_local_logging = T;

in /usr/local/bro/share/bro/base/frameworks/cluster/nodes/worker.bro

...and deploy again, I can see the mentioned log file in the worker node (stored in /home/bro/bro/spool/worker-1/warnings.log), but this log file is NOT created in the manager.

I expected all the logs from the worker nodes to be copied somehow to manager, but it does not seem to work like that.

Rgds

Hi all,

A little bit of investigation and I found (with a tcpdump) that the logs arrive to the manager process BUT they are not stored to disk. Then I found the following entry at the beginning of the communication.log file :

1465472892.006482 manager parent - - - error [#10002/192.168.1.10:57322] unserializing event/function Notice::cluster_notice: write error for creating writer

followed by a lot of errors like:

1465473767.549373 manager parent - - - error [#10001/192.168.1.10:57322] unserializing event/function Notice::cluster_notice: write error for log entry

It seems that for any reason, the deserializer writer is not able to open a writer ad then it's not possible to write the log files.

I found this error in github in RemoteSerializer.cc, in its function 'ProcessLogCreateWriter':

https://github.com/bro/bro/blob/f5ce4785ea96b56643c092331a16308f071c8092/src/RemoteSerializer.cc

But I still cannot figure out why is this happening.

Tried to change permissions to all log directories to 777, but didn't work.

Any idea why could be happening this error?

Thank you!!

Ah.. the "write error for creating writer" message is a bit misleading, it outputs that for any error in the process.

Those messages also point to an issue with notices, not with your log file. Are you also calling NOTICE somewhere?

Your problem could be that there is a discrepancy between how you defined warn_info and what you are passing Log::write. Non clustered bro doesn't need to serialize/deserialize the messages so you can get away with certain mistakes that break once you use a cluster.

The standard log files all use the same mechanism, so if you are getting an http.log then your remote communication is working and there should be nothing preventing your log file from being written.

It would help if you could post your scripts somewhere or try to come up with a minimal example that shows the problem.

This is the simplest example for writing a custom log file:

http://try.bro.org/#/trybro?example=log

If you modify it like this and deploy it to a cluster you should get a foo.log containing things like

1465477100.871640 hello from manager
1465477105.884494 hello from manager
1465477104.537564 hello from proxy-1
1465477108.648193 hello from worker-1-2
1465477108.527117 hello from worker-1-1
1465477110.887240 hello from manager
1465477113.652352 hello from worker-1-2
1465477109.552765 hello from proxy-1

module FOO;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        ts: time &log;
        msg: string &log;
    };
}

event do_log()
    {
    local l = [$ts = network_time(), $msg=fmt("hello from %s", peer_description)];
    Log::write(LOG, l);
    schedule 5sec {do_log() };
    }

event bro_init()
{
    Log::create_stream(LOG, [$columns=Info]);

    schedule 5sec {do_log() };
}

Hi Justin, thank you for your help.

I reply inline in your email. Sorry for the capital letters, they are just to make a difference between your comments and mine:

Hi all,

A little bit of investigation and I found (with a tcpdump) that the logs arrive to the manager process BUT they are not stored to disk. Then I found the following entry at the beginning of the communication.log file :

1465472892.006482 manager parent - - - error [#10002/192.168.1.10:57322] unserializing event/function Notice::cluster_notice: write error for creating writer

followed by a lot of errors like:

1465473767.549373 manager parent - - - error [#10001/192.168.1.10:57322] unserializing event/function Notice::cluster_notice: write error for log entry

Ah.. the "write error for creating writer" message is a bit misleading, it outputs that for any error in the process.

Those messages also point to an issue with notices, not with your log file. Are you also calling NOTICE somewhere?

YES I AM. PROBABLY YOU ARE RIGHT AND THAT ERROR IS RELATED TO THE NOTICE FRAMEWORK. BUT NOTICES AS BEING LOGGED FINE.

Your problem could be that there is a discrepancy between how you defined warn_info and what you are passing Log::write.

I DON'T THINK SO BECAUSE WHEN RUNNING STANDALONE, THE LOGS ARE WRITTEN FINE.

Non clustered bro doesn't need to serialize/deserialize the messages so you can get away with certain mistakes that break once you use a cluster.

...BUT I'M WORKING IN A CLUSTER CONFIGURATION. HENCE THE DESERIALIZE ERROR. WHEN I MENTION STANDALONE I BECAUSE I CAN CHECK ALSO IN A SINGLE NODE BRO CONFIGURATION, BUT MY OBJECTIVE IS TO MAKE IT WORK IN A CLUSTER.

The standard log files all use the same mechanism, so if you are getting an http.log then your remote communication is working and there should be nothing preventing your log file from being written.

YEAP, I'M GETTING DNS or CONN LOGS, SO THE LOGGING FRAMEWORK SEEMS TO BE LOGGING FINE BUT STILL MY OWN LOG FILES ARE NOT WRITTEN TO DISK IN THE MANAGER NODE.

HAVE A LOOK TO THIS. THIS IS THE LOG DIRECTORY IN ONE OF THE WORKERS:

root@Worker1:/home/bro/bro/spool/worker-1# ls -l
total 2340
-rw-r--r-- 1 root root 6005 Jun 9 16:35 communication.log
-rw-r--r-- 1 root root 8709 Jun 9 16:35 conn.log
-rw-r--r-- 1 root root 20596 Jun 9 16:35 dns.log
-rw-r--r-- 1 root root 292 Jun 9 16:32 known_services.log
-rw-r--r-- 1 root root 23596 Jun 9 16:32 loaded_scripts.log
-rw-r--r-- 1 root root 1425948 Jun 9 16:35 my-prot.log
-rw-r--r-- 1 root root 231 Jun 9 16:32 packet_filter.log
-rw-r--r-- 1 root root 290 Jun 9 16:32 reporter.log
-rw-r--r-- 1 root root 79 Jun 9 16:32 stderr.log
-rw-r--r-- 1 root root 188 Jun 9 16:32 stdout.log
-rw-r--r-- 1 root root 898 Jun 9 16:34 weird.log

...AND THIS IS THE LOG DIRECTORY IN THE MANAGER:

root@Manager:/home/bro/logs# ls -l
total 160
-rw-r--r-- 1 root root 75440 Jun 9 16:36 communication.log
-rw-r--r-- 1 root root 485 Jun 9 16:33 conn.log
-rw-r--r-- 1 root root 429 Jun 9 16:33 dns.log
-rw-r--r-- 1 root root 199 Jun 9 16:32 known_services.log
-rw-r--r-- 1 root root 23113 Jun 9 16:32 loaded_scripts.log
-rw-r--r-- 1 root root 187 Jun 9 16:32 packet_filter.log
-rw-r--r-- 1 root root 237 Jun 9 16:32 reporter.log
-rw-r--r-- 1 root root 0 Jun 9 16:32 stderr.log
-rw-r--r-- 1 root root 188 Jun 9 16:32 stdout.log
-rw-r--r-- 1 root root 249 Jun 9 16:32 weird.log

MY_PROT.LOG IS NOT THERE BUT THE OTHERS LOGS ARE. THERE'S SOMETHING I'M DOING WRONG AND DON'T KNOW WHAT IT IS..

IT'S STRANGE BECAUSE I CAN SEE WITH TCPDUMP NETWORK TRAFFIC WITH THE PROPIETARY LOGS SENT TO THE MANAGER:

     0x0190: 0008 0100 0000 1243 517a 7a36 5733 5567 .......CQzz6W3Ug
     0x01a0: 6d66 4e34 4c59 3372 6600 0000 0d01 040a mfN4LY3rf.......
     0x01b0: 0100 0c00 0000 0d01 040a 0100 6500 0000 ............e...
     0x01c0: 0201 ffff ffff ffff ffff 0000 0008 0100 ................
     0x01d0: 0000 1f30 7835 3020 2d20 5641 5249 4142 ...0x50.-.VARIAB
     0x01e0: 4c45 5f4d 4f4e 4954 4f52 5f52 4551 5545 LE_MONITOR_REQUE
     0x01f0: 5354 0000 0008 0100 0000 0000 0000 1c00 ST..............
     0x0200: 0000 0100 0000 066d 6f64 6275 7300 0000 ................
     0x0210: 0800 0000 0601 41d5 d65d ab02 e200 0000 ......A..]......
     0x0220: 0008 0100 0000 1243 517a 7a36 5733 5567 .......CQzz6W3Ug
     0x0230: 6d66 4e34 4c59 3372 6600 0000 0d01 040a mfN4LY3rf.......
     0x0240: 0100 0c00 0000 0c01 0000 0000 0000 d472 ...............r
     0x0250: 0000 0001 0000 000d 0104 0a01 0065 0000 .............e..
     0x0260: 000c 0100 0000 0000 0001 f600 0000 0100 ................

THIS IS AN EXTRACT FROM A COMMUNICATION BETWEEN WORKER AND MANAGER AND THAT'S WHAT MY-PROT.LOG SHOULD LOOK LIKE...

It would help if you could post your scripts somewhere or try to come up with a minimal example that shows the problem.

OK, I will try to summarize what my scripts do:

Workers

...

Ah! the files you are putting things in is your problem. Take everything you put in local-worker.bro and local-manager.bro and put it into a script and load that script from local.bro

When you use local-worker directly and call create_stream there then the manager knows nothing about about those log streams.

Additionally, if that is where you are defining the notice types, that will break notices as well because when the notice reaches the manager it also has no idea what to do with it.

There aren't many things that people do that actually belong in the local-worker or local-manger scripts, everything should almost always just go in local.bro. things like create_stream and NOTICE automatically do the right thing depending on what node it is ran on.

THAT WORKED!!!

Thank you Justin!

but… in this case… what is local-worker.bro for?. I understood that local-worker is what you are running in the workers… Actually the linux process still uses local-worker.bro (which is empty…):

/usr/local/bro/bin/bro -i eth0 -U .status -p broctl -p broctl-live -p local -p worker-1 local.bro broctl base/frameworks/cluster local-worker.bro broctl/auto

Thank you Justin, I would never have figured out this was the problem…

We've considered getting rid of those extra files. You have to understand the entire architecture too much to understand why they are there. We ended up going in a direction where the distinctions between the types of processes are hidden from users because it can be too difficult to know when to use those different files.

The rule of thumb is to just put everything into local.bro or your own scripts that you load in local.bro.

  .Seth