libbroker status/plans

If anyone has time/interest, I feel like the main components of Broker are established now and deserving feedback/critique. Rather than try to detail how things work here, it’s probably best for people to try figuring things out from the repo (e.g. source code comments and unit test examples) and ask questions about what's unclear.

But it would be helpful to start a discussion on some of the planned features and open questions. I’ll try literally pasting my TODO list and hope it’s readable. The items are roughly ordered from most-certainty to least-certainty. Feedback welcome generally, but particularly where questions are posed.

Broker TODO

I tried building it on the newest version of Debian and got this error:

CMake Error at CMakeLists.txt:2 (cmake_minimum_required):
   CMake 2.8.12 or higher is required. You are running version 2.8.9

Why does it need version 2.8.12 of cmake?

2.8.8+ for object library targets and 2.8.12+ for MACOSX_RPATH (http://www.kitware.com/blog/home/post/510). I don’t see these as strict requirements, but they simplify some things.

- Jon

Two thoughts inline:

- C API

Do we need a vanilla C API in addition to the C++ API offered? Could be that this was a requirement, so won't argue the point: just making sure someone asked this question.

- Need/want overload or flow-control mechanisms?

     E.g. a simple policy for handling overload is to let a user specify
     a threshold for how many items are allowed in a queue before new
     messages are dropped.

More general question: are there expectations of reliability on broker messages? I'm not really familiar enough with the actor model in general / this library to know. Depending on the protocol being used, it could also be that the transport is going to provide flow-control of its own.

Also, one thing that might concern me a little is that, based on a very limited understanding of CAF, messages appear to be garbage collected [1]. How are the messages GC'd? Depending on how the GC works, my concern there would be that a poorly-timed GC cycle could lead to drops.

Cheers,
Gilbert

[1] http://actor-framework.org/pdf/cshw-nassp-13.pdf

From 4.1 -

"Message Handling and Processing: Messages shall (a) be garbage collected, (b) not be limited to particular types, and (c) provide pattern matching. Requirement (a) is owed to the experience that manual memory management in concurrent systems is error-prone and thus impractical, while the alternative approach of copying a message for each single recipient, leads to suboptimal performance if a message has multiple recipients. Requirement (b) reflects the common experience that message passing with restricted types is of limited use in practice. However, unrestricted messaging requires efficient and expressive facilities such as pattern matching (c), because message handling is a continuously recurring task to implement."

Do we need a vanilla C API in addition to the C++ API offered? Could be
that this was a requirement, so won't argue the point: just making sure
someone asked this question.

Yeah, if a goal is to make it easy to instrument existing applications w/ the library, the idea was that some will want/need the C interface.
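For context, the usual pattern for such a C layer is an opaque handle plus extern "C" wrapper functions, which is what makes the work largely mechanical once the structure is in place. A hedged sketch (all names here are hypothetical stand-ins, not libbroker's actual C API):

```cpp
#include <cassert>
#include <string>
#include <utility>

// A small C++ class standing in for part of the C++ API.
namespace broker_sketch {
class endpoint {
public:
    explicit endpoint(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};
} // namespace broker_sketch

// The C-visible handle wraps the C++ object; C callers only ever see a
// pointer to this incomplete-looking type.
struct broker_endpoint {
    broker_sketch::endpoint impl;
};

extern "C" broker_endpoint* broker_endpoint_create(const char* name) {
    return new broker_endpoint{broker_sketch::endpoint(name)};
}

extern "C" const char* broker_endpoint_name(const broker_endpoint* e) {
    return e->impl.name().c_str();
}

extern "C" void broker_endpoint_free(broker_endpoint* e) {
    delete e;
}
```

Each C++ class then gets one such handle type plus create/accessor/free functions, which is repetitive but straightforward to replicate.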

- Need/want overload or flow-control mechanisms?

    E.g. a simple policy for handling overload is to let a user specify
    a threshold for how many items are allowed in a queue before new
    messages are dropped.

More general question: are there expectations of reliability on broker
messages? I'm not really familiar enough with the actor model in
general / this library to know. Depending on the protocol being used,
it could also be that the transport is going to provide flow-control of
its own.

The message delivery is going to be reliable and taken care of for us. I was more referring to situations such as receiving remote logs at a rate faster than one can process. What happens currently in Bro and Broker is that messages pile up until memory is exhausted and you crash.

I don’t know how great the expectation is to handle that gracefully right away. We could probably easily put in a way to let a user specify “it’s ok to drop messages for this queue if it gets overloaded”. Matthias has been talking some on their mailing list about getting flow-control mechanisms into CAF (e.g. an overloaded actor tells senders “I’m currently overloaded, please slow down”). I’m not sure the degree to which something like that would be helpful in Broker; it may just push the overload problem to the sender(s), which raises the question of what they do if they can’t artificially slow down.
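The simple per-queue drop policy would look roughly like the sketch below (illustrative only: `bounded_queue` and its methods are made-up names, not Broker's API):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Sketch of a queue that drops the newest message once a user-chosen
// threshold is reached, i.e. the "it's ok to drop messages for this
// queue if it gets overloaded" policy.
template <class Message>
class bounded_queue {
public:
    explicit bounded_queue(std::size_t max_items) : max_items_(max_items) {}

    // Returns false (and counts a drop) when the queue is full.
    bool push(Message msg) {
        if (items_.size() >= max_items_) {
            ++dropped_;
            return false;
        }
        items_.push_back(std::move(msg));
        return true;
    }

    // Hands the oldest pending message to the application, if any.
    bool pop(Message& out) {
        if (items_.empty())
            return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t max_items_;
    std::size_t dropped_ = 0;
    std::deque<Message> items_;
};
```

A variant would drop the oldest entries instead, which keeps the most recent data at the cost of history; either way the drop counter gives the application a way to notice overload.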

Also, one thing that might concern me a little is that, based on a very
limited understanding of CAF, messages appear to be garbage collected
[1]. How are the messages GC'd? Depending on how the GC works, my
concern there would be that a poorly-timed GC cycle could lead to drops.

My impression was that messages are reference counted, copy-on-write tuple values that get reclaimed automatically when the ref count reaches zero.
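That scheme (hedged: this is my reading of the paper, not CAF's actual implementation) can be illustrated with a small copy-on-write wrapper. Reclamation is deterministic when the last reference count drops to zero, so there is no tracing-GC cycle that could pause delivery:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Illustrative copy-on-write message: all recipients share one tuple via
// a reference count; a writer copies only if the tuple is shared.
class cow_message {
public:
    explicit cow_message(std::vector<std::string> fields)
        : data_(std::make_shared<std::vector<std::string>>(std::move(fields))) {}

    // Readers share the underlying tuple; no copy is made.
    const std::vector<std::string>& get() const { return *data_; }

    // A writer detaches (copies) only when the tuple has other owners.
    std::vector<std::string>& get_mutable() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<std::string>>(*data_);
        return *data_;
    }

    long use_count() const { return data_.use_count(); }

private:
    std::shared_ptr<std::vector<std::string>> data_;
};
```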

- Jon

With regards to logging, I think this is one area where you can cheat a bit and just push back at scriptland to give script developers a chance to know if a logging queue is getting backed up. There are a number of ways we could deal with overload situations there.

  .Seth

Yeah, leaving things up to the application may be reasonable here. For most of the messaging patterns in Broker, I expect it to be lightweight to hand off pending messages from Broker’s queues to the application, so essentially the application is already going to be left to manage its own resources. For Bro that could mean tying into the script layer like you suggested and shuffling around or even disabling some logging/events (on either the sender or receiver side). We could probably already be doing something like that in Bro without Broker, so maybe that’s a hint that it doesn’t need to be addressed directly in the library.

- Jon

If anyone has time/interest, I feel like the main components of Broker
are established now and deserving feedback/critique.

Very cool. I'll look through the repository.

Some thoughts regarding your TODO list, in terms of priorities and your
question:

- C API

Yeah, I would like to have this early on. In principle, we could
postpone it to later, but it looks like one of those things that
will be even harder to do later if we don't get it into place
right away. Maybe others can help you with this; once an initial
structure is in place, it probably gets quite mechanical.

Another question here would be how we ensure that the C API stays in
sync going forward. Is there some testing we can put in place?

- Python bindings

That I can see skipping for the first version. While it would of
course be great to have, it should be pretty straightforward to
do once the C bindings are there.

- Persistent storage backend

My guess is that storage will end up being the primary initial use
case (because that's a capability we don't have right now; vs.
replacing existing stuff), so yeah, a good target.

- SSL/IPv6 (dependent on actor-framework support)

Can do later, though I don't remember what the conclusion was on
if/how the actor-framework supports these?

- Need/want overload or flow-control mechanisms?

Punting for now sounds good, per the other mails. Maybe CAF will get
some support eventually that we can leverage, and/or we can add the
script-level hooks.

- In-place data store value modifications

    Plan to support increment/decrement on integral values.
    Need any other operations?

Set operations (insert element, remove element).

    What to do when applying an operation to invalid data type?
    Plan to just send error message back to sender and leave further
    decisions up to them.

Sounds like a more general question: what to do in terms of semantic
errors? There are probably more like that, like writing to a store
that doesn't exist. Error messages sound right, with hooks to report
them to the application. In Bro they can show up as events in script
land.

- Data store support for optional expiry model

     What are the desired mechanisms? Options:

        (1) Inserter may specify "expire this entry at time X" ?
        (2) Inserter may specify "expire this entry based on
            create/read/modification access time" ?

How about providing on-create and on-modification, but not on-read? In
those two cases there's already communication necessary anyway, and
the expiration time could be piggybacked on that.

        (3) Other hooks to make expiry conditional?

Hmm, maybe, but it's unclear what it would look like. It could work
like Bro's expire func (i.e., potentially delay expiration), but I
think this hook could really only run at the storage node directly.
I'd skip it for now.

- Data typing model

        (1) Data holds additional type tag to suggest how to interpret
        (2) Fully implement separate Bro-types.

    Planning to try integrating w/ Bro as it is and see what specific
    problems arise. I think (1) may end up being helpful, but maybe not
    required and I'd like to avoid (2) if possible.

I'm not fully sure what you mean by (2) but I believe I agree. :-)

(1) would be good; maybe Bro doesn't strictly need it, but (a) it
would allow it to double-check input at least; and (b) for independent
applications it will be quite helpful, as otherwise it's hard to work
with data of different input types dynamically (in other words: the
applications would need a way to define what to expect, forcing them
to replicate the typing that exists in Bro already).

A more general question in this context: what's the trust model? Are
we expecting that a client taking part in the communication will play
by the rules of the protocol? That's what current Bro does, and I
think it's a reasonable assumption. On the other hand, maybe Broker
could do a bit better, as its data model isn't as complex as Bro's
native Val structure. Asked differently: to what degree can a
receiver validate that incoming data makes sense (with some
appropriate definition of "makes sense"; there are different ones,
like having the right binary layout, or semantically sending valid
information)?

- Bro integration
    Is Broker the default in Bro 2.4? That implies requiring C++11.
    Also, I'm requiring CMake 2.8.12+, and it may be hard to go below 2.8.
    Bro is still happy with 2.6.3.

We should definitely integrate it. It could be an optional dependency
or a mandatory component. I'm leaning towards the latter, to pave the
way for the future. But yeah, that then means requiring C++11 (and a
current CMake).

Did we ever come to a conclusion on C++11 for Bro? We did the survey,
but I don't recall if we settled on whether it's ok to switch now?

Robin

- C API

Yeah, I would like to have this early on. In principle, we could
postpone to later, but it looks like one of these things that if we
don't get it into place right away, it will be even harder to do
later. Maybe others can help you with this; once an initial structure
is in place it's probably getting quite mechanic.

I agree it will probably be a pretty mechanical process once there’s a better foundation that tackles some of the harder aspects and common patterns. But I wonder if prioritizing Bro integration over this may be more helpful — if we learn significant parts of the C++ interface need to change, then that may mean some of the effort making the C interface goes to waste.

Another question here would be how we ensure that the C API stays in
sync going forward. Is there some testing we can put in place?

There’s already some unit tests for the C++ API that could be replicated in C.

- Python bindings

That I can see skipping for the first version. While it would of
course be great to have, it should be pretty straightforward to
do once the C bindings are there.

Won’t BroControl require this?

- Persistent storage backend

My guess is that storage will end up being the primary initial use
case (because that's a capability we don't have right now; vs.
replacing existing stuff), so yeah, a good target.

I started looking in to this a little and I’m thinking either LevelDB or RocksDB may be good default choices to use here.

- SSL/IPv6 (dependent on actor-framework support)

Can do later, though I don't remember what the conclusion was on
if/how the actor-framework supports these?

IIRC, the idea was that it could, but doesn’t yet. I also think it’s not critical to have these in the initial version.

- In-place data store value modifications

   Plan to support increment/decrement on integral values.
   Need any other operations?

Set operations (insert element, remove element).

Ack.

   What to do when applying an operation to invalid data type?
   Plan to just send error message back to sender and leave further
   decisions up to them.

Sounds like a more general question: what to do in terms of semantic
errors? There are probably more like that, like writing to a store
that doesn't exist. Error messages sound right, with hooks to report
them to the application. In Bro they can show up as events in script
land.

Yeah, it probably does warrant a general solution for the application to get at errors that can’t be reported synchronously; I can recall a few other unrelated places in the code where I’ve written something like “TODO: log an error”.
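A minimal sketch of how in-place operations (increment on integral values, set insert) could report a type error back to the sender rather than being applied blindly. The names, and the use of std::variant (C++17, used here for brevity; the thread targets C++11, where a tagged union would stand in), are illustrative, not Broker's actual interface:

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <variant>

// A store value is either an integer or a set of strings in this sketch.
using value = std::variant<int64_t, std::set<std::string>>;

enum class modify_result { ok, type_error };

// Increment succeeds only on integral values; otherwise the caller gets
// an error result it can forward back to the requesting endpoint.
modify_result increment(value& v, int64_t by) {
    if (auto* i = std::get_if<int64_t>(&v)) {
        *i += by;
        return modify_result::ok;
    }
    return modify_result::type_error;
}

// Set insertion succeeds only on set values.
modify_result set_insert(value& v, std::string element) {
    if (auto* s = std::get_if<std::set<std::string>>(&v)) {
        s->insert(std::move(element));
        return modify_result::ok;
    }
    return modify_result::type_error;
}
```

Remove-element and decrement would follow the same shape, and the error enum is the natural place to grow other semantic errors (e.g. "no such store").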

- Data store support for optional expiry model

   What are the desired mechanisms? Options:

       (1) Inserter may specify "expire this entry at time X" ?
       (2) Inserter may specify "expire this entry based on
           create/read/modification access time" ?

How about providing on-create and on-modification, but not on-read? In
those two cases there's already communication necessary anyway, and
the expiration time could be piggybacked on that.

That may be reasonable; I’ll aim to support it and see if I hit any issues.
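For reference, options (1) and (2) with the expiration refreshed on modification but not on read might be sketched like this (all names hypothetical, not Broker's data store API):

```cpp
#include <cassert>
#include <cstdint>

// Expiry options attached to a store entry: an absolute deadline, or an
// interval measured since the last modification (refreshed on modify,
// deliberately not on read, so no extra communication is needed).
struct expiry {
    enum class mode { none, absolute, since_last_modification };
    mode m;
    uint64_t when; // absolute deadline, or interval in seconds
};

struct entry {
    uint64_t last_modified;
    expiry exp;

    bool expired(uint64_t now) const {
        switch (exp.m) {
            case expiry::mode::absolute:
                return now >= exp.when;                 // option (1)
            case expiry::mode::since_last_modification:
                return now - last_modified >= exp.when; // option (2)
            default:
                return false;
        }
    }
};
```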

       (3) Other hooks to make expiry conditional?

Hmm, maybe, but it's unclear what it would look like. It could work
like Bro's expire func (i.e., potentially delay expiration), but I
think this hook could really only run at the storage node directly.
I'd skip it for now.

Ack. Was hoping skipping this would be ok :-)

A more general question in this context: what's the trust model? Are
we expecting that a client taking part in the communication will play
by the rules of the protocol? That's what current Bro does, and I
think it's a reasonable assumption.

This was also my expectation.

On the other hand, maybe Broker
could do a bit better, as its data model isn't as complex as Bro's
native Val structure. Asked differently: to what degree can a
receiver validate that incoming data makes sense (with some
appropriate definition of "makes sense"; there are different ones,
like having the right binary layout, or semantically sending valid
information)?

If “receiver" here means the application I think it can validate pretty well. For example, if the application is Bro and it’s receiving logs, I expect it currently has enough information from Broker to be able to tell if what it got can actually be converted in to the types it needs to log.

If “receiver” means Broker versus The Internet, I’m not actually sure to what extent it would hold up -- that seems dependent on CAF being able to hold up.

- Bro integration
   Is Broker the default in Bro 2.4? That implies requiring C++11.
   Also, I'm requiring CMake 2.8.12+, and it may be hard to go below 2.8.
   Bro is still happy with 2.6.3.

We should definitely integrate it. It could be an optional dependency
or a mandatory component. I'm leaning towards the latter, to pave the
way for the future. But yeah, that then means requiring C++11 (and a
current CMake).

Did we ever come to a conclusion on C++11 for Bro? We did the survey,
but I don't recall if we settled on whether it's ok to switch now?

We didn’t reach a conclusion. I can take a closer look, but I recall supporting EL6 may be important. On RHEL6, it looks like GCC 4.4.7 is the default, but I think all major C++11 features appear in GCC 4.8 and later (I’m not sure it’s worth trying to figure out what specific C++11 features we can “get away with” in order to require an older compiler version).

- Jon

patterns. But I wonder if prioritizing Bro integration over this may
be more helpful — if we learn significant parts of the C++ interface
need to change,

I can see that either way actually: doing the C interface could turn
up problems with the C++ structure as well. But I agree in terms of
priority: Bro over C.

There’s already some unit tests for the C++ API that could be
replicated in C.

What I meant was ensuring the two stay in sync. Say we add a new
capability to C++: can we somehow trigger a test failure if we
forget to add it to C?

Won’t BroControl require this?

I was imagining for 2.4 we'd leave the BroControl parts in place as
they are now, i.e., using the old comm framework. Were you planning
to replace that already?

I started looking in to this a little and I’m thinking either LevelDB
or RocksDB may be good default choices to use here.

(No experience with either, will take a look)

If “receiver” means Broker versus The Internet

I was thinking about this case (the other current Bro does already as
well), but yeah, CAF certainly plays a part there as well.

On RHEL6, looks like GCC 4.4.7 is the default, but all major C++11
features I think appear in 4.8 and later

So maybe that means we'll need to wait another release at least before
making it mandatory, and giving people a heads-up that we plan to do
the switch. On the other hand, I would prefer to have it fully in
there right away, so that scripts can start to rely on it as soon as
possible. Tough call ...

(I’m not sure it’s worth it to try and figure out what specific C++11
features we can “get away with” and thus require older compiler
version).

Agree.

Robin

What I meant was ensuring the two stay in sync. Say we add a new
capability to C++: can we somehow trigger a test failure if we
forget to add it to C?

Don’t have any ideas for how to do that at the moment, but, yes, it would be nice to have that type of coverage test.

I was imagining for 2.4 we'd leave the BroControl parts in place as
they are now, i.e., using the old comm framework. Were you planning
to replace that already?

Yeah, I was thinking the user would be given a binary choice: either everything uses the new comm or everything uses the old comm. But I guess it also works to say the old comm is still available for everything it used to do, but additionally/concurrently there’s the option of trying the new comm for tasks related to A, B, and C. I’m not sure which approach I like best, but it may make sense to go into the integration process with the intent of just making incremental additions and then see how far we get.

- Jon

This would be my inclination, as it limits the scope of what we need
to get in shape for the first release.

Robin

I looked over them a bit, and RocksDB looks pretty cool, although also
quite complex given that we won't need all of what it offers.

Have you considered SQLite as an alternative? It's more than a
key/value store, and slower, but it would have the advantage of not
adding another dependency beyond what we already use. Not saying
that's what we should do, just wondering about the pros and cons.

Also, I was thinking it would be cool to have a command line tool that
can inspect (and potentially even manipulate (*)) the contents of a
Broker store. Say you wanted to see what IPs are currently tracked in
some table; you could just run that tool to dump it out.

Robin

(*) Do any of the DBs have support for modifying a table externally
while it's open? Then that command line tool could even add/change
entries that way. That would actually make for a nice configuration
mechanism for things like whitelists or some tuning options.

Jon,

When I tried compiling Broker with clang 3.5 (and its libc++) the
other day, I got some compiler errors. I'll look more closely later,
but was wondering: what compiler are you using for development?

Robin

I started looking in to this a little and I’m thinking either LevelDB
or RocksDB may be good default choices to use here.

I looked over them a bit, and RocksDB looks pretty cool, although also
quite complex given that we won't need all of what it offers.

Have you considered SQLite as an alternative?

Had not thought of that.

It's more than a
key/value store, and slower, but it would have the advantage of not
adding another dependency beyond what we already use.

I wasn’t that worried about adding another dependency since they’re already kind of specific, i.e. I don’t expect the DB dependency to be more of a hassle than libcaf. If it is a concern, we could consider distributing (e.g. as git submodules) and building these dependencies along with Broker (analogous to Bro redistributing the SQLite amalgamation).

In the end, I’m not expecting there to actually be a lot of code involved in implementing different persistent storage backends, so we wouldn’t be stuck with SQLite if that’s chosen as a default. And we could provide more than one option at a time. Maybe we could have the default be SQLite (for convenience), but optionally support RocksDB (for those in need of better performance).
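The "not a lot of code" expectation hinges on keeping the backend interface small; a sketch of the pluggable-backend idea (the interface and the in-memory example are illustrative, not libbroker's actual abstraction) might look like:

```cpp
#include <cassert>
#include <map>
#include <string>

// Persistent stores implement one small interface, so a default SQLite
// backend and an optional RocksDB backend could be swapped without
// touching the store logic built on top.
class storage_backend {
public:
    virtual ~storage_backend() {}
    virtual bool put(const std::string& key, const std::string& val) = 0;
    virtual bool get(const std::string& key, std::string& val) = 0;
    virtual bool erase(const std::string& key) = 0;
};

// A trivial in-memory backend standing in for a real SQLite/RocksDB one.
class memory_backend : public storage_backend {
public:
    bool put(const std::string& key, const std::string& val) override {
        data_[key] = val;
        return true;
    }
    bool get(const std::string& key, std::string& val) override {
        auto it = data_.find(key);
        if (it == data_.end())
            return false;
        val = it->second;
        return true;
    }
    bool erase(const std::string& key) override {
        return data_.erase(key) == 1;
    }

private:
    std::map<std::string, std::string> data_;
};
```

Store logic would only ever hold a `storage_backend*`, so which backend a store uses becomes a construction-time choice.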

Also, I was thinking it would be cool to have a command line tool that
can inspect (and potentially even manipulate (*)) the contents of a
Broker store. Say you wanted to see what IPs are currently tracked in
some table; you could just run that tool to dump it out.

(*) Do any of the DBs have support for modifying a table externally
while it's open? Then that command line tool could even add/change
entries that way. That would actually make for a nice configuration
mechanism for things like whitelists or some tuning options.

We need to be using Broker’s data store abstraction when making changes for those updates to be correctly propagated to clones. But it should be easy to write such tools using libbroker. Or, once there are Python bindings, those will probably be a natural way to do such dynamic querying and modification of data stores.

- Jon

$ c++ --version
Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)

- Jon

wouldn’t be stuck with SQLite if that’s chosen as a default. And we
could provide more than one option at a time. Maybe we could have the
default be SQLite (for convenience), but optionally support RocksDB
(for those in need of better performance).

That would sound good to me. SQLite is something we can ship (just as
with Bro), whereas RocksDB seems complex enough that leaving it
external may be better.

We need to be using Broker’s data store abstraction when making
changes for those updates to be correctly propagated to clones. But
should be easy to write such tools using libbroker.

Ah, good point. And yeah, working without Broker wouldn't work anyway.

Robin