0MQ security considerations

We kind of already came to the conclusion that switching to 0MQ for the
communication framework would mean dropping Bro's built-in support for
SSL/TLS communication, but here's a summary of why (and how the best
replacement option seems to be an external tunneling solution).

With 0MQ, you're not going to have access to a fd that can be wrapped in an
OpenSSL socket BIO. So my idea was to try replacing the socket BIO with
memory BIOs that sit in front of the 0MQ socket and wrap/unwrap application
data coming from or going onto it. Here's the code from that attempt:

https://github.com/jsiwek/ZeroMQ-SSL-State-Machine

And the README explains why it won't work:

This was an incomplete/failed experiment in using OpenSSL as a state machine
to complete a SSL/TLS handshake over a 0MQ socket. In summary, the handshake
can be completed for a single connection and app. data can be exchanged over
SSL/TLS for the duration of that connection, but there lies a problem in
detecting when a peer is disconnected[1][2] and thus requiring a new handshake
upon reconnection. In 0MQ, the 'tcp' transport is considered a "disconnected"
TCP transport, meaning that the connectivity state of peers is transparent
to applications. So this reaffirms previous 0MQ security discussions[3] that
possible approaches are:

1) Tunneling 0MQ traffic over another channel that performs SSL in some fashion
   (e.g. stunnel can work). This relies on the user of the application to
   be able to set this up, but you get SSL/TLS strength security for "free".

2) Using a (currently non-existent) 0MQ transport implemented as some part
   of the core 0MQ library to encrypt hop-by-hop. If this existed, drawbacks
   might be that it doesn't scale well to some of 0MQ's messaging patterns
   and would need to be implemented differently for its supported unicast
   vs. multicast transports.

3) Adding a crypto layer at the application level to wrap messages with some
   signing + encryption before sending them across the 0MQ socket. In order
   for this to provide security features that SSL/TLS offers beyond minimal
   message authn., confidentiality, integrity, it needs to be able to use a
   key-exchange algorithm (possibly PAKE), and some form of MAC'd nonce
   (replay protection). This doesn't seem worth the risk of rolling your own,
   best wait until 0MQ core is taught to use a well-established protocol.

[1] http://lists.zeromq.org/pipermail/zeromq-dev/2010-July/004230.html
[2] http://lists.zeromq.org/pipermail/zeromq-dev/2010-August/005285.html
[3] http://lists.zeromq.org/pipermail/zeromq-dev/2010-October/006562.html
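
For readers who want to see the memory-BIO idea in miniature, here is a rough sketch using Python's standard ssl module (ssl.MemoryBIO and SSLContext.wrap_bio) rather than the C OpenSSL API used in the linked experiment. It only drives the client side far enough to produce the first handshake flight. The function name and hostname are made up for illustration, and actually shipping the bytes over a 0MQ socket is left out.

```python
import ssl


def start_client_handshake(server_hostname):
    """Begin a TLS handshake whose I/O goes through memory BIOs, not an fd."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes received from the peer get written here
    outgoing = ssl.MemoryBIO()  # bytes the TLS engine wants sent are read from here
    tls = ctx.wrap_bio(incoming, outgoing, server_side=False,
                       server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the engine has queued its ClientHello and now needs
        # the peer's reply before it can continue.
        pass
    hello = outgoing.read()  # this is what would become a 0MQ message body
    return tls, incoming, outgoing, hello


tls, incoming, outgoing, hello = start_client_handshake("peer.example")
print(len(hello), "bytes of handshake data ready to send")
```

The handshake state lives in the `tls` object, i.e. it is per connection. That is exactly where the trouble described in the README starts: if the peer behind the 0MQ socket silently goes away and comes back, nothing tells the application to throw this object away and start over.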

A few thoughts inline:

Jonathan Siwek wrote:

> We kind of already came to the conclusion that switching to 0MQ for the
> communication framework would mean dropping Bro's built-in support for
> SSL/TLS communication, but here's a summary of why (and how the best
> replacement option seems to be an external tunneling solution).
>
> With 0MQ, you're not going to have access to a fd that can be wrapped in an
> OpenSSL socket BIO. So my idea was to try replacing the socket BIO with
> memory BIOs that sit in front of the 0MQ socket and wrap/unwrap application
> data coming from or going onto it. Here's the code from that attempt:
>
> https://github.com/jsiwek/ZeroMQ-SSL-State-Machine
>
> And the README explains why it won't work:
>
> This was an incomplete/failed experiment in using OpenSSL as a state machine
> to complete a SSL/TLS handshake over a 0MQ socket. In summary, the handshake
> can be completed for a single connection and app. data can be exchanged over
> SSL/TLS for the duration of that connection, but there lies a problem in
> detecting when a peer is disconnected[1][2] and thus requiring a new handshake
> upon reconnection.

What about DTLS? I think OpenSSL supports that, but I'm not sure how well.

I would see that protocol mapping more naturally to 0mq's idea of messages and / or disconnected transports, and the connection emulation it provides *might* work on top of 0mq.

(haven't looked at the code yet, so apologies if that's explained within :slight_smile:)

> 1) Tunneling 0MQ traffic over another channel that performs SSL in some fashion
>    (e.g. stunnel can work). This relies on the user of the application to
>    be able to set this up, but you get SSL/TLS strength security for "free".
  
One of the things I don't like about transparently tunneling / encrypting like this is that an increase in network I/O also leads to an increase in CPU utilization; thus, as the amount of traffic bro analyzes increases (and its use increases), so too does the load on the system as a whole. This kind of transparent encryption, then, could lead to some interesting behavior on older systems.

I don't know how relevant the statistics are, but this thread is still interesting to read: http://openvpn.net/archive/openvpn-users/2005-06/msg00224.html

Anyway, the point is that while a VPN might Just Work (tm) from an administrative / systems perspective, we'd probably need to test to see exactly how many events we could push through the VPN on various systems while simultaneously processing packets before the virtual links found themselves CPU-bound.

> 3) Adding a crypto layer at the application level to wrap messages with some
>    signing + encryption before sending them across the 0MQ socket. In order
>    for this to provide security features that SSL/TLS offers beyond minimal
>    message authn., confidentiality, integrity, it needs to be able to use a
>    key-exchange algorithm (possibly PAKE), and some form of MAC'd nonce
>    (replay protection). This doesn't seem worth the risk of rolling your own,
>    best wait until 0MQ core is taught to use a well-established protocol.
  
+1; implementing anything more complex than signing seems like a bad idea.

--Gilbert
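
To make the "signing only" end of option 3 concrete, here is a minimal sketch of a MAC'd sequence number, using only Python's hmac, hashlib and struct modules. It assumes a pre-shared key (so no key exchange at all) and provides no confidentiality. The frame layout and the names seal and Receiver are invented for this example and are not anything 0MQ or Bro provides.

```python
import hashlib
import hmac
import struct

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256
SEQ_LEN = 8                             # unsigned 64-bit sequence number


def seal(key, seq, payload):
    """Build a frame: sequence number, MAC over (sequence + payload), payload."""
    header = struct.pack("!Q", seq)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + tag + payload


class Receiver:
    """Verifies frames and rejects anything replayed or out of date."""

    def __init__(self, key):
        self.key = key
        self.last_seq = -1

    def open(self, frame):
        if len(frame) < SEQ_LEN + TAG_LEN:
            raise ValueError("frame too short")
        header = frame[:SEQ_LEN]
        tag = frame[SEQ_LEN:SEQ_LEN + TAG_LEN]
        payload = frame[SEQ_LEN + TAG_LEN:]
        expected = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("bad MAC")
        (seq,) = struct.unpack("!Q", header)
        if seq <= self.last_seq:
            raise ValueError("replayed or stale frame")
        self.last_seq = seq
        return payload
```

Even this small piece shows why going further is risky: the receiver has to keep per-peer state (last_seq), which again depends on knowing when a peer has restarted, and everything SSL/TLS adds on top (key agreement, encryption, rekeying) would have to be designed by hand.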

Thanks a lot for the summary and for trying this, even if eventually
unsuccessful.

> upon reconnection. In 0MQ, the 'tcp' transport is considered a "disconnected"
> TCP transport, meaning that the connectivity state of peers is transparent
> to applications.

Oh, that's actually something that could bite us in another way as
well. When Bro starts talking to Bro, there's some state that's
exchanged initially just after the connection has been set up and
before "normal" messages start being exchanged. If we don't learn
about a reconnect (which is how I interpret your statement above), we
can't do that state exchange.

This *may* be something we could get around by changing parts of the
protocol but (1) that would make switching to 0mq quite a bit more
complicated, and (2) I'm not sure right now whether it would work at
all.

Is there a way around this, like not doing transparent reconnects and
setting up new connections instead?

Robin

> What about DTLS? I think OpenSSL supports that, but I'm not sure how
> well.
>
> I would see that protocol mapping more naturally to 0mq's idea of
> messages and / or disconnected transports, and the connection
> emulation it provides *might* work on top of 0mq.

I didn't try, but don't think it helps. As a general scenario, let's
say a client and server both complete a handshake over 0MQ (DTLS, SSL,
TLS, whichever), but after a while of exchanging app. data, the client
crashes.

In any protocol, session resuming is supported provided that the client
saves some state (session ID, master secret). We could do that (don't
think we want to), but another question is how can the server know
that the client will ever return? That seems to require implementing
a heartbeat and DTLS seems to just rely on retransmission timers during
the handshake?

> (haven't looked at the code yet, so apologies if that's explained
> within :slight_smile:)

Not really, the code is just hacked together, but it's short enough
to read/understand if you want to try anything.

- Jon

> upon reconnection. In 0MQ, the 'tcp' transport is considered a
> "disconnected" TCP transport, meaning that the connectivity state of peers is
> transparent to applications.

> Oh, that's actually something that could bite us in another way as
> well. When Bro starts talking to Bro, there's some state that's
> exchanged initially just after the connection has been set up and
> before "normal" messages start being exchanged. If we don't learn
> about a reconnect (which is how I interpret your statement above), we
> can't do that state exchange.

If the acceptor needs to recognize new connections, that doesn't seem
well-suited to it, but if the client can say "I'm new" that could work.
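
As a sketch of what "the client can say I'm new" might look like: each client process picks a random epoch value at startup and attaches it to its messages, and the acceptor redoes the initial state exchange whenever it sees an epoch it has not seen from that peer. The class and function names here are hypothetical, and how the epoch is carried in a 0MQ message is left open.

```python
import os


def new_epoch():
    """Random per-process value chosen once at client startup."""
    return os.urandom(8)


class Acceptor:
    """Tracks the last epoch seen per peer to spot restarted clients."""

    def __init__(self):
        self._epochs = {}

    def needs_state_exchange(self, peer_id, epoch):
        """Return True if this peer is new or has restarted since last seen."""
        if self._epochs.get(peer_id) == epoch:
            return False
        self._epochs[peer_id] = epoch
        return True


acceptor = Acceptor()
epoch = new_epoch()
print(acceptor.needs_state_exchange("worker-1", epoch))  # first contact
print(acceptor.needs_state_exchange("worker-1", epoch))  # same process again
```

This only covers the direction where the client restarts. If the acceptor restarts, it loses its table and every client looks new, which happens to be the desired outcome.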

Also, the idea of having a listening socket and an accepted socket fd
like in traditional socket APIs isn't there in 0MQ. You just have
a socket bound at one endpoint that may accept connections from multiple
endpoints, but anything written to the bound socket is going to be
received by all connected endpoints. There's a ZMQ_PAIR that does
communication between just two endpoints, but only a single peer can be
connected at one time. I think we really need to try to use 0MQ as a
parallelism framework and not just as a networking library or else it's
going to be a struggle to get things to work.

> This *may* be something we could get around by changing parts of the
> protocol but (1) that would make switching to 0mq quite a bit more
> complicated, and (2) I'm not sure right now whether it would work at
> all.

In general, I'm getting the feeling that even the original idea of "let's
just try replacing the socket code with 0MQ and increment upon that" isn't
going to be easy, and to really take advantage of 0MQ's strengths requires
some redesign.

> Is there a way around this, like not doing transparent reconnects and
> setting up new connections instead?

Not that I saw.

- Jon

> I didn't try, but don't think it helps. As a general scenario, let's
> say a client and server both complete a handshake over 0MQ (DTLS, SSL,
> TLS, whichever), but after a while of exchanging app. data, the client
> crashes.
>
> In any protocol, session resuming is supported provided that the client
> saves some state (session ID, master secret). We could do that (don't
> think we want to), but another question is how can the server know
> that the client will ever return? That seems to require implementing
> a heartbeat and DTLS seems to just rely on retransmission timers during
> the handshake?

Okay, that makes sense.

Speaking of heartbeats though, what about implementing an application-level heartbeat and forcing the connection closed if X are missed (something like IRC's PING / PONG)? It's not optimal, but it might be a workaround in the short term (e.g. until 0mq acquires something native).

--Gilbert
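
The PING / PONG idea above boils down to a small piece of bookkeeping, sketched here without any sockets. Time is passed in explicitly so the logic is easy to test. The class name, the interval and the missed-beat threshold are all assumptions for illustration.

```python
class HeartbeatMonitor:
    """Declares a peer dead after max_missed heartbeat intervals of silence."""

    def __init__(self, interval, max_missed, now):
        self.interval = interval      # seconds between expected PONGs
        self.max_missed = max_missed  # the "X" in "if X are missed"
        self.last_seen = now

    def on_pong(self, now):
        """Record that the peer answered a PING at time now."""
        self.last_seen = now

    def missed(self, now):
        """Number of whole intervals that have passed without a PONG."""
        return int((now - self.last_seen) // self.interval)

    def is_dead(self, now):
        return self.missed(now) >= self.max_missed


monitor = HeartbeatMonitor(interval=1.0, max_missed=3, now=0.0)
print(monitor.is_dead(now=2.5))  # two intervals missed so far
print(monitor.is_dead(now=3.0))  # third interval missed
```

When is_dead turns true, the application would drop whatever per-peer state it holds (TLS session, initial Bro state) and treat the next message from that peer as a fresh connection.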

I'm getting that feeling as well, and I'm starting to wonder whether
0mq is the right tool for us at all. I'd really like to replace just
the socket code with something more robust initially. We may overhaul
the whole serialization (with its caching, lack of support for
broadcasts, etc.) at some point as well but I think that's orthogonal
and better done later/separately.

Does anybody know other options for the communication layer? Ideally,
it would be a *C* library so Broccoli can use it directly as well.

And: one conceptual change that we might consider is having Bro itself
actually use Broccoli and then handle all the communication in there.

Rovin

FWIW,

> I think we really need to try to use 0MQ as a
> parallelism framework and not just as a networking library or else it's
> going to be a struggle to get things to work.

+1 this.

Additionally, any kind of incremental deployment within Bro's existing communication framework seems like it would be challenging.

> In general, I'm getting the feeling that even the original idea of "let's
> just try replacing the socket code with 0MQ and increment upon that" isn't
> going to be easy, and to really take advantage of 0MQ's strengths requires
> some redesign.

+1 this, too.

--Gilbert

> Does anybody know other options for the communication layer? Ideally,
> it would be a *C* library so Broccoli can use it directly as well.

Something licensed BSD would be cool too. It always bothered me that 0MQ is lesser GPL.

> And: one conceptual change that we might consider is having Bro itself
> actually use Broccoli and then handle all the communication in there.

I actually sort of like this idea in a way. It would result in better maintenance for broccoli and new features. I've always had this feeling like I'd be able to do something super awesome if I could have broccoli hold open a socket and let a Bro connect. Right now you can't even have broccoli->broccoli because broccoli can't be the server.

> Rovin

Oh, this must be a new nickname? :stuck_out_tongue:

  .Sith

Re: alternate libraries, only thing I know of beyond boost::asio is ACE, but I never understood the license well enough to feel comfortable using it in any reasonably large project: http://www1.cse.wustl.edu/~schmidt/ACE-copying.html

It's a slick library, but only supports C++ to the best of my knowledge.

> We may overhaul
> the whole serialization (with its caching, lack of support for
> broadcasts, etc.) at some point as well but I think that's orthogonal
> and better done later/separately.

I don't necessarily agree that overhauling serialization is orthogonal, per se, to deploying a new communication framework. Instead, I feel like the serialization framework is way too tightly coupled to the communication framework, and that it's really limiting what we're able to do with communication.

Figuring out what we want to do with both before we find a replacement for the communication library might be a good thing, since it would give us an idea of what kind of band-aid we would want to use while we're waiting for the serialization stuff to get redone.

> And: one conceptual change that we might consider is having Bro itself
> actually use Broccoli and then handle all the communication in there.

I really like this idea. I have a feeling it would enforce better design, would lead to awesome (and, more importantly, transparent) support for parallel / remote processing, and would allow the development of stuff for bro without dragging the entire bro project along as a dependency.

--Gilbert

Hmm... that's already there. Broccoli doesn't depend on Bro.

  .Seth

> Hmm... that's already there. Broccoli doesn't depend on Bro.

Mea culpa. I just don't know how to read configure's output, apparently :slight_smile:

--Gilbert

We should of course consider what communication primitives we'll need
later for a new serialization framework. But other than that, it's
actually not that tightly coupled. For me it's just shuffling data
across a network connection. :slight_smile:

Anyway, I think the result of this discussion is "back to the drawing
board". That's fine, the plan was to explore whether 0mq is suitable
for us, and it looks like at best we're unsure whether it is and at
worst it just is not.

So I'd say we postpone this discussion to a more general post-1.6
strategy posting. If anybody feels like it, we can in the meantime
collect ideas and options on a web page.

Robin

Gilbert, the next question then is however if we really want to use
0mq for the inter-thread communication. That idea was based on the
assumption that we'd be using it anyway, which doesn't necessarily
seem to be the case anymore.

Should we just bite the bullet and do a pure pthreads implementation?
We could encapsulate the thread management into a class ThreadMgr that
would take care of starting/stopping/etc. threads, and a new class
Thread would be the top-level class for LogWriters to be derived from.

Robin

PS: Yeah, I know, C++0x will make everything better ... And I'm still
very reluctant to pull Boost in ...
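
To illustrate the class structure proposed above (a ThreadMgr that starts and stops threads, and a Thread base class that log writers derive from), here is a small sketch. It uses Python's threading module purely as a stand-in for pthreads. ThreadMgr and Thread are the names from the message; every method name and the CountingWriter subclass are invented for the example.

```python
import threading


class Thread:
    """Base class: owns one OS thread and a cooperative stop flag."""

    def __init__(self, name):
        self.name = name
        self._stop_requested = threading.Event()
        self._thread = threading.Thread(target=self._main, name=name)

    def start(self):
        self._thread.start()

    def request_stop(self):
        self._stop_requested.set()

    def join(self):
        self._thread.join()

    def is_alive(self):
        return self._thread.is_alive()

    def _main(self):
        # Keep calling the subclass hook until someone asks us to stop.
        while not self._stop_requested.is_set():
            self.run_once()

    def run_once(self):
        raise NotImplementedError


class ThreadMgr:
    """Central place that starts threads and shuts all of them down."""

    def __init__(self):
        self._threads = []

    def add(self, thread):
        self._threads.append(thread)
        thread.start()

    def stop_all(self):
        for t in self._threads:
            t.request_stop()
        for t in self._threads:
            t.join()
        self._threads.clear()


class CountingWriter(Thread):
    """Stand-in for a LogWriter: counts how often it was scheduled."""

    def __init__(self, name):
        super().__init__(name)
        self.ticks = 0
        self.first_tick = threading.Event()

    def run_once(self):
        self.ticks += 1
        self.first_tick.set()
        self._stop_requested.wait(0.001)  # idle briefly, wake early on stop
```

Keeping the start/stop logic in one manager is what would let the underlying threading library be swapped later without touching the writers.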

> Gilbert, the next question then is however if we really want to use
> 0mq for the inter-thread communication. That idea was based on the
> assumption that we'd be using it anyway, which doesn't necessarily
> seem to be the case anymore.

*nod* Yeah.

I'll be very sad to see the lock-free synchronization go. I'll additionally hate to lose the constraints 0mq enforces on design; there's a little bit of coding around the library to do, but in this case I think it'd lead to a cleaner overall design than we may get otherwise.

I don't know whether the above necessarily justifies introducing the additional dependency, though, especially when I'm not sure exactly how much of a win the lock-free stuff would actually turn out to be.

> Should we just bite the bullet and do a pure pthreads implementation?
> We could encapsulate the thread management into a class ThreadMgr that
> would take care of starting/stopping/etc. threads, and a new class
> Thread would be the top-level class for LogWriters to be derived from.

0mq doesn't completely eliminate the need for threads; we still need the LogWriter running in its own context. Thus, building a thread manager is definitely on the to-do list.

First, though, I thought I'd look around for a library that does something like that. If I can't find one, then it'll be time to build one.

> Robin
>
> PS: Yeah, I know, C++0x will make everything better ... And I'm still
> very reluctant to pull Boost in ...

Python would make everything better. C++0x will just make things more complicated. /cynical

Also, the dependencies involved with boost do suck, but pushing maintenance onto someone else can be a huge win.

--Gilbert

> First, though, I thought I'd look around for a library that does
> something like that. If I can't find one, then it'll be time to build
> one.

I forgot the obvious one yesterday: Intel's TBB. That's what the
multi-core Bro prototype is already using, and its main thread
abstraction is (almost?) compatible with C++0x. I could live with that
dependency. And it has lock-free data structures as well. I think
that's actually the best option I see right now.

> Also, the dependencies involved with boost do suck, but pushing
> maintenance onto someone else can be a huge win.

But somebody has to maintain the code that's *using* Boost ... My main
concern is actually that once we have Boost, folks will immediately
start using pretty much any feature it provides. :slight_smile:

Robin

> First, though, I thought I'd look around for a library that does
> something like that. If I can't find one, then it'll be time to build
> one.
>
> I forgot the obvious one yesterday: Intel's TBB. That's what the
> multi-core Bro prototype is already using, and its main thread
> abstraction is (almost?) compatible with C++0x. I could live with that
> dependency. And it has lock-free data structures as well. I think
> that's actually the best option I see right now.

I would definitely use it then. Once Bro goes multi-core I think we really don't want to have to deal with two different threading libraries. And while I don't know TBB, pthreads is definitely very bare-bones.

However, it appears that TBB is GPLv2 with runtime exception (and not Lesser GPL). Just wondering whether that's going to be an issue for us....
See: http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt01ch01s02.html

cu
Gregor

> First, though, I thought I'd look around for a library that does
> something like that. If I can't find one, then it'll be time to build
> one.
>
> I forgot the obvious one yesterday: Intel's TBB. That's what the
> multi-core Bro prototype is already using, and its main thread
> abstraction is (almost?) compatible with C++0x. I could live with that
> dependency. And it has lock-free data structures as well. I think
> that's actually the best option I see right now.

Looks interesting, but one thing from the FAQ that bothers me:

"I write software of <a particular nature>. Is TBB use appropriate for me?

It depends on what your application profile is. TBB does not try to replace I/O threads or GUI threads or general Win Threads. TBB is best for computational tasks that are not prone to frequent waiting for I/O or events in order to proceed (this is an area the TBB team does want to tackle later). "

Also, the license (GPLv2 with a linking exception) is probably sub-optimal. It could definitely be worse, though (read: bdb and Sleepycat).

> But somebody has to maintain the code that's *using* Boost

Well, sure. That's true of any library, though, so I'm not sure I really understand this argument :slight_smile:

> ... My main
> concern is actually that once we have Boost, folks will immediately
> start using pretty much any feature it provides. :slight_smile:

Okay, I'll bite :slight_smile:

I like the accepted answer here: http://stackoverflow.com/questions/1226206/is-there-a-reason-to-not-use-boost

As long as we kept our design focused, I think we'd be fine.

--Gilbert

> Once Bro goes multi-core I think we
> really don't want to have to deal with two different threading
> libraries.

+1

> However, it appears that TBB is GPLv2 with runtime exception (and not
> Lesser GPL). Just wondering whether that's going to be an issue for us....
> See: http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt01ch01s02.html

Yeah, that.

Any idea what the difference is between that and LGPL version 2?

--Gilbert

I think this concerns primarily their main usage model where one uses
TBB to automatically schedule the work across threads. We wouldn't
need that part, and instead use it more as a portable thread
abstraction. In that sense, I don't see why TBB wouldn't be a fit for
what we need. The licensing, yeah, not ideal, though it would work I
think.

But anyway, mulling over this a bit more, let's just go with pthreads.
I don't really see that introducing a new dependency is worth it just
for doing logging; and that's all we need right now. Doing the
mgr-to-writer communication shouldn't be that difficult with pthreads
either, and while lock-free queues are nice, I'm not convinced they
are crucial here. Let's make sure to encapsulate all this nicely at
some well-defined places, and we can always change the implementation
later.

Robin
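
As a rough picture of the mgr-to-writer communication without lock-free queues, the sketch below uses one mutex plus one condition variable, which is the same shape a pthreads version would take with pthread_mutex_t and pthread_cond_t. It is written with Python's threading.Condition as a stand-in; MessageQueue, writer_loop, run_demo and the STOP sentinel are made-up names, and the "writer" just appends to a list instead of writing a log file.

```python
import threading
from collections import deque

STOP = object()  # sentinel telling the writer thread to exit


class MessageQueue:
    """Unbounded FIFO guarded by a single lock and condition variable."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake the writer if it is waiting

    def get(self):
        with self._cond:
            # Loop guards against spurious wakeups, as with pthread_cond_wait.
            while not self._items:
                self._cond.wait()
            return self._items.popleft()


def writer_loop(queue, sink):
    """Writer thread body: drain messages until the STOP sentinel arrives."""
    while True:
        msg = queue.get()
        if msg is STOP:
            return
        sink.append(msg)


def run_demo(lines):
    """Manager side: start a writer, hand it the lines, shut it down."""
    queue = MessageQueue()
    sink = []
    writer = threading.Thread(target=writer_loop, args=(queue, sink))
    writer.start()
    for line in lines:
        queue.put(line)
    queue.put(STOP)
    writer.join()
    return sink


print(run_demo(["conn.log entry", "http.log entry"]))
```

Because the queue hides its locking behind put and get, it is one of the "well-defined places" where a lock-free implementation could be dropped in later if profiling ever shows the lock to matter.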