New logging architecture

Hey folks:

I'm working on building threading into the logging framework in parallel with the work I'm doing on DataSeries. I believe I have a plan, and was hoping to get some input from other folks on the list.

At the moment, a log message:

*) Is generated deep within Bro, eventually finding its way to LogMgr::Write
*) In LogMgr::Write, the following happens:
    > Checks that an appropriate LogMgr::Stream exists for the writer
    > Checks that any relevant LogWriter has been properly initialized
    > Applies any necessary filters
*) The log message is turned into a LogVal **
    > In the case of a remote filter, the LogVal ** is spirited away to serializations unknown
    > In the case of a local filter, the LogVal ** is passed along to the appropriate LogWriter::Write for processing.

So, to change this to support threading, I was planning to turn LogMgr::Stream into a self-contained object with two 0mq message-passing sockets attached:

*) One write-only (PUSH) 0mq socket created by the parent LogMgr, that could be used to send messages to the Stream object (for example, something like LOG_WRITE)
*) One read-only (PULL) 0mq socket created by the child Stream object, which would be used to receive the messages the LogMgr sent.

After the LogMgr::Stream was created, the PUSH 0mq socket would be the only means with which to communicate with it (in order to avoid needing evil things like semaphores / condition variables). This means that the LogMgr would need to generate and pass messages to the LogMgr::Stream object.

Thus, the LogWriter initialization bit and the LogWriter logging bit would both happen within the context of the Stream thread (and as a result of writing an appropriate message to the LogMgr::Stream's PUSH socket).

I figure the LogMgr would need to be able to generate (at a minimum) the following types of messages:

*) EnableStream
*) DisableStream
*) StreamInit
*) StreamFinish
*) RotateLog
*) LogMessage

(Note: as a shortcut, we could probably build a fast-track LogMessageInProc type that passed a pointer to the data to log, rather than encapsulating everything when passing within a single process... but I figure that's an optimization, and could probably be dealt with later if it proves to be necessary).

Anyway, I figure that, after this point, we'd be close to having an entirely self-contained logging infrastructure; so long as the message format was standardized (and I do have a rough draft of a message format which I'd be happy to send out, assuming the above isn't too confusing and seems technically sound), anything that spoke the correct message format could act as a logger for Bro.

So. . . thoughts? Does that make sense? What's bad / broken about doing things this way?

Thanks,
Gilbert Clark

I figure the LogMgr would need to be able to generate (at a minimum)
the following types of messages:

*) EnableStream
*) DisableStream
*) StreamInit
*) StreamFinish
*) RotateLog
*) LogMessage

Does that mean the LogMgr would send messages to Stream threads even if they're disabled? Couldn't the LogMgr itself keep track of the Stream enabled/disabled state and just not send messages to disabled Streams?

anything that spoke the correct message format could act as a logger for Bro.

That's a neat idea if you're hinting at something like creating Streams such that the LogMgr binds its 0MQ socket to a tcp port, allowing 0MQ sockets connected from other hosts (not necessarily running a full Bro process) to receive logs. But I think some 0MQ-specific caveats of that might be:

* the same PUSH/PULL pattern may not work well because it's possible for the pusher to block on a call to zmq_send(). Maybe PUB/SUB is better when working w/ Streams that aren't inproc?

* 0MQ doesn't (currently) provide a good framework for securing messages[1] or being exposed to the public Internet[2]. (I'm about to send another mail about an experiment I did with the former topic, but basically I just ended up coming to the same conclusions that they/we already talked about).

- Jon

[1] http://lists.zeromq.org/pipermail/zeromq-dev/2010-October/006559.html
[2] http://lists.zeromq.org/pipermail/zeromq-dev/2010-September/005944.html

I figure the LogMgr would need to be able to generate (at a minimum)
the following types of messages:

*) EnableStream
*) DisableStream
*) StreamInit
*) StreamFinish
*) RotateLog
*) LogMessage

Does that mean the LogMgr would send messages to Stream threads even if they're disabled? Couldn't the LogMgr itself keep track of the Stream enabled/disabled state and just not send messages to disabled Streams?

Good point. Yes :slight_smile:

Thinking about it now, I'm not sure it would benefit the client to know whether or not a stream was disabled; assuming the client has subscribed, it simply won't get any messages to log from the publisher. Logic dictates it should also be up to the client to determine when to rotate logs.

So, let's drop everything but LogMessage, and enable an additional heartbeat message that gets broadcast once every 15 seconds (in the absence of a LogMessage). If the subscriber misses two consecutive heartbeats, it's safe to assume the publisher is no longer broadcasting, and the subscriber shuts down. The stream is initialized when the first LogMessage is received, and is torn down when two heartbeats are missed OR the subscriber is terminated locally (e.g. CTRL-C).

anything that spoke the correct message format could act as a logger for Bro.

That's a neat idea if you're hinting at something like creating Streams such that the LogMgr binds its 0MQ socket to a tcp port, allowing 0MQ sockets connected from other hosts (not necessarily running a full Bro process) to receive logs.

Yup. That's the idea.

In the longer-term, I've been thinking about what would happen if someone tried to turn Bro into a set of well-defined, focused utilities all built around a shared protocol. It's a fun thought-experiment :slight_smile:

But I think some 0MQ-specific caveats of that might be:

* the same PUSH/PULL pattern may not work well because it's possible for the pusher to block on a call to zmq_send(). Maybe PUB/SUB is better when working w/ Streams that aren't inproc?

Another good point. Is there a good reason to keep inproc streams PUSH / PULL if inter-process communications use PUB / SUB?

* 0MQ doesn't (currently) provide a good framework for securing messages[1] or being exposed to the public Internet[2]. (I'm about to send another mail about an experiment I did with the former topic, but basically I just ended up coming to the same conclusions that they/we already talked about).

I wasn't really anticipating internet deployment; instead, I was thinking of limited use-cases within a Bro cluster / local environment.

I think the new message scheme would be a lot more robust than what I proposed initially, but we'd still want to implement some kind of message signing (at a minimum) if we wanted to support logging across an untrusted network; if we didn't, it'd be pretty simple to flood the remote logger with a stream of invalid events.

--Gilbert

So, to change this to support threading, I was planning to turn
LogMgr::Stream into a self-contained object with two 0mq message-passing
sockets attached:

Thinking about this a bit more, I'm not sure Stream is the right piece
to turn into a thread. Two issues:

    - That would give us only one thread per type of writer (e.g., one
    for all ASCII logs).

    - Not all the things that the Stream does are sufficiently
    decoupled from the main thread, and they wouldn't be thread-safe.
    There's in particular the filtering, where Bro goes back into
    script-land, but also raising the log event and just adding new
    filters.

So how about instead turning each LogWriter into a thread in the same
way as you propose: making it self-contained with message-passing 0mq
sockets and a simple protocol for sending the various types of
messages (like Init, Write, etc.; just as you said).

than encapsulating everything when passing within a single process...

By encapsulating do you mean the LogVal::{Read,Write} serializations?
I don't think we'll actually get around them. They are to make things
thread-safe by decoupling the writer's data from Bro's main data
structures.

Robin

So how about instead turning each LogWriter into a thread in the same
way as you propose: making it self-contained with message-passing 0mq
sockets and a simple protocol for sending the various types of
messages (like Init, Write, etc.; just as you said).

That makes sense. See note about message types, however; after reading Jon's comments, I'm starting to think that Rotate / Init / etc. should be handled by the client exclusively.

Also, supporting Init, Enable, Disable, etc. would open up a world of fun security issues to deal with on the client / subscriber end.

than encapsulating everything when passing within a single process...

By encapsulating do you mean the LogVal::{Read,Write} serializations?
I don't think we'll actually get around them. They are to make things
thread-safe by decoupling the writer's data from Bro's main data
structures.

No. I was referring more to the LogWriter::Write -- when dealing with something inproc, it doesn't make sense to take the entire LogVal ** and encapsulate it into a 0mq message if we can just pass the LogVal pointer directly instead.

--Gilbert

That makes sense. See note about message types, however; after reading
Jon's comments, I'm starting to think that Rotate / Init / etc. should
be handled by the client exclusively.

Not quite sure what you mean be handled exlusively. They'd still need
to be triggered by the LogManager so we would need messages to send
the action over to then be performed by the client. So what I'd
picture is essentially one message per method in LogWriter.

One reason that the manager needs to trigger them is that there are
parameters to pass (like path, etc. for Init). There're more though:
some can be triggered explicitly from the script-layer at any time
(e.g., Flush()); and others need to be executed synchronously across
log files (rotation: all files should rotate at the same time as
otherwise one gets inconsistent timestamps).

No. I was referring more to the LogWriter::Write -- when dealing with
something inproc, it doesn't make sense to take the entire LogVal ** and
encapsulate it into a 0mq message if we can just pass the LogVal pointer
directly instead.

Got it. Yes, that makes sense.

Robin

Hmmm ... So this is something that would be needed for communication
to external but it wouldn't if we did just in inproc, right? Not sure
I like it then. It makes the internal communication between LogMgr and
LogWriter quite complex for a case that'll be pretty rarely used
later.

Here's an alternative suggestion: why don't we leave the
LogMgr-to-LogWriter channel simple and just do inproc communication
there. Then, in addition, we can create a LogForwarder class derived
from LogWriter that in turn uses 0mq to talk to the outside the work.
This forwarder can be as complex as needed, and just like any other
writer it would run in its own thread.

Does that make sense?

Robin

Another good point. Is there a good reason to keep inproc streams PUSH
/ PULL if inter-process communications use PUB / SUB?

The PUB socket has the potential to drop messages when a high-water mark
is reached for a subscriber. That didn't seem acceptable for inproc.
(BTW, I don't know exactly how HWMs are defined, I was just glancing at
the "zmq_socket" man page).

- Jon

I could also see dropping messages under load as be a plus; wouldn't we want the event processing loop to keep running if whatever's receiving the inproc:// messages can't keep up?

It's true that we could end up with slightly inconsistent logs that way; at the same time, I think this could do quite a bit to help the event processor keep chugging along in the event the system as a whole became overloaded.

--Gilbert

Just to wrap this up: Gilbert and I talked about the inter-thread
communication more the other day and decided to just do inproc between
the LogMgr (i.e., the main thread) and LogWriters. If we want to add
inter-process/node communication, we can later do a LogWriter that
just forwards to extern whatever he receives.

Robin