Plugin branch status

The plugin branch is almost ready for merging, except for some
clean-up and missing API docs. It does two things:

    (1) Move all protocol analyzers over to new infrastructure code
    that's structured around standalone modules (plugins): everything
    that's part of an analyzer is now contained to a single directory
    (incl. C++ code, bif, pac). Currently all these plugins are still
    compiled in statically but in the future (not 2.2) there will also
    be an option to compile individual analyzers standalone into
    dynamic libraries.

    (2) Make analyzer activation/deactivation dynamic, controllable by
    function calls via the new analyzer framework (dpd_config is
    gone).

In the future, the infrastructure for (1) will also faciliate moving
other components to the plugin-model as well (e.g., readers/writers,
packet sources)

So, my question is if I should go ahead merging this into master for
2.2. At the user-level it doesn't change much other than what relates
to (2), but internally it moves things move around quite a bit,
including renaming analyzers classes and introducing an analyzer
namespace. I think generally that's fine, but let me know what you
think.

Also, there's one particular issue coming with a merge that we would
need to fix: the Broxygen docs for analyzer bifs are now spread out
over many files, and look pretty ugly in the generated pages. I think
what we'll need to do is switching from a purely file-based model to
documenting semantic groups, like a specific analyzer. I don't think
this will actually be too difficult, the plugin infrastructure comes
with "introspection" functinality that gives you all bif elements that
a plugin defines. I believe Broxygen could just go through and turn it
into one corresponding pages (see below for output of the new "-N"
switch that summarizes this information for all available plugins).
However, it's probably still a bit of work to get this into a nice
shape.

So my question is, mostly for Jon: is that something we could tackle
for 2.2 final (during beta would be ok)? If that's too much work to be
realistic, I'm wondering if we should postpone the plugin branch for
2.3.

Robin

--------- cut -------------------------------------------------------

# bro -NN

[...]

Plugin: Bro::FTP - FTP analyzer (built-in)
    [Analyzer] FTP (ANALYZER_FTP, enabled)
    [Analyzer] FTP_ADAT (enabled)
    [Event] ftp_request
    [Event] ftp_reply
    [Type] ftp_port
    [Function] parse_ftp_port
    [Function] parse_eftp_port
    [Function] parse_ftp_pasv
    [Function] parse_ftp_epsv
    [Function] fmt_ftp_port

[...]

I say let's go for it in 2.1, except that your point about the documentation is worth considering so it sounds like it's up to Jon. :slight_smile:

I could go along with it either way.

  .Seth

I would lean towards go for it even if there is not enough time in the beta to address the broxygen issues.

It will probably be fine as long as we're doing a "when it's ready" final release and not some specific deadline. If it's the later, I'll have to look more at what will be involved before I can say.

- Jon

I'm ok with sticking with soft timetables on this.

  .Seth

Fine with me too. I'll go ahead thyen and get the branch into
mergeable state.

Robin

I started looking over the plugin branch (not done yet), it looks nice so far.

One thing I see is that it uses CMake object library targets (e.g. "add_library(<name> OBJECT <src>)"), which became available in v2.8.8. That's recent enough such that none of the platforms on our testbed meet that requirement using the CMake package in their official OS repositories.

So as it is, I think it would become a common thing for someone trying to build Bro to have to get a sufficient CMake version from a place other than what their OS repos provide. IMO, that's fine, just not sure if it was anticipated and fine with others as well?

- Jon

Interesting that you ask, I was actually about to write an email about
this. :slight_smile:

I'm aware that this a recent CMake feature. I have used it for now
because it feels like the right thing to do here. The alternative
would be creating lots of *.a libraries and then linking them all in.
That should work too, but it's not really nice. (One probably could
also manually collect all the *.o across the sub-dirs, but that's even
worse I think).

I wanted to pose exactly the question you bring up: what cmake version
can we require these days?

I'd be reluctant to up the requirement if it meant installing new
cmakes for a large share of our users. From your mail it sounds like
that could be the the case. If so, I think I'd actually prefer the *.a
version for now and we could switch over to object libraries at a
later time as >= 2.8.8 becomes more common.

Let me know any other feedback you have on the branch. I'll do a
little bit more cleanup and documetnation, maybe even later today.

Robin

Turns out I forgot one motivation for using object libs in the first
place: with static libraries, global objects aren't pulled in if they
aren't explicitly referenced. Plugins register themselves via ctors of
globals, which doesn't work with that model.

I've put in an option now that switches between objects libraries and
static libraries. Need to figure out if there's a way to make the
latter work.

Robin

I wanted to pose exactly the question you bring up: what cmake version
can we require these days?

Here's a survey:

RHEL/CentOS 6.4: cmake version 2.6-patch 4
Ubuntu 12.04.2: cmake version 2.8.7
Debian 6.0: cmake version 2.8.2
FreeBSD 9.1: cmake version 2.8.6
Mac OSX: Fink/Homebrew/MacPorts all have > 2.8.8

We'd probably hear the least complaints if 2.6.4 was the minimum requirement.

- Jon

We'd probably hear the least complaints if 2.6.4 was the minimum requirement.

Thanks for the summary. I've pushed a work-around that gets things to
work with static archives. It's not nice but should do for now until
we can up the cmake version more easily.

The plugin branch is pretty much ready now. I still see one test
failure however that I don't fully understand yet. If you're still
reviewing the branch, could you take a look at this one:

core.tunnels.teredo-known-services ... failed
  % 'btest-diff known_services.log' failed unexpectedly (exit code 1)
  % cat .diag
  btest-diff: input known_services.log does not exist.

From the test description I'm not sure if known_services.log can

legitimately be missing in the 2nd case.

rRobin

core.tunnels.teredo-known-services … failed

There's a subtle change to the test in this branch: it no longer does `bro -b`. The reason that ends up mattering for the test is that the pcap has a connection for which both Teredo and DNS analyzers get attached and the Teredo analyzer does this thing where it won't emit a protocol_confirmation if some other analyzer on the same connection has already. When doing `bro -b`, the DNS analyzer doesn't get attached since the associated scripts aren't loaded, but the Teredo analyzer does since it has a signature that matches and so it will emit a protocol_confirmation which causes the known_service.log.

From the test description I'm not sure if known_services.log can
legitimately be missing in the 2nd case.

Seems fine. Or you can add the -b back.

- Jon

The plugin branch is pretty much ready now.

Just finished reading all the changes; looks pretty great.

I put some misc. doc and typo fixes in topic/jsiwek/plugins-cleanup if you want to merge it to your branch (or I'll make a merge ticket later if it's easier).

A couple questions:

Was there anything that makes use of analyzer::Tag::subtype yet? Wasn't sure I understood how to use that or what cases would be candidates for using it.

Maybe put some initial calls to BRO_PLUGIN_VERSION(1) for everything? Just feels like that feature is going to be underused/forgotten unless people see things are already explicitly using it.

- Jon

Did that, not sure how that got lost in the first place.

Robin

I put some misc. doc and typo fixes in topic/jsiwek/plugins-cleanup

Merged, thanks!

Was there anything that makes use of analyzer::Tag::subtype yet?

That's indeed not used yet. It's in preparation for BinPAC++ where a
single class will provide a set of analyzers.

Maybe put some initial calls to BRO_PLUGIN_VERSION(1) for everything?
Just feels like that feature is going to be underused/forgotten unless
people see things are already explicitly using it.

Generally I'm thinking for statically compiled plugins versions aren't
really useful; starting to version them would probably be more
confusing than helpful. I'm seeing versions primarily for later when
we have shared libraries and people may start updating plugins
independently of Bro itself.

That said, the idea was to always set the version of static plugins to
BRO_PLUGIN_VERSION_BUILTIN. That doesn't happen right now if I'm not
mistaken. Need to take another look.

Robin

What would be the minimum we could get away with for now? I'm thinking:

1) Replace the "Built-In Functions (BIFs)" bullet under "Script Reference" with a "Plugin Reference" bullet. (better name to use here?)

2) The Plugin Reference page will have a section for each plugin.

3) Each plugin section lists its components (e.g. protocol analyzer w/ tag declaration) and BIF-items.

4) For the BIF-items, the reST markup would just use some directive to duplicate the content from where they're originally declared (the per-file documentation). Alternatively it could just be a link to that declaration; I'll probably try that first since it's easier, but I feel like it won't be enough.

I want to avoid doing too much that I'd just throw away when I do the Broxygen overhaul (config files + sphinx directives), but I think at least this much will be re-usable as a sort of "plugin index" target. If there's more we need to do than have a new "plugin index", I think I should just start working on the Broxygen overhaul/rewrite. I'll probably be doing that soon anyway, but didn't expect to fit it in before next release.

- Jon

1) Replace the "Built-In Functions (BIFs)" bullet under "Script Reference" with a "Plugin Reference" bullet. (better name to use here?)

We could just call it "Analyzer Reference" for now because there
aren't any other plugins.

I'm not quite sure how it should look longer term once we have
different component types, in particular because a single plugin can
provide different ones (like a reader and writer). We probably want to
group them by component type eventually, plus a separate plugin index
listing all of a plugin's components.

2) The Plugin Reference page will have a section for each plugin.

Ack.

3) Each plugin section lists its components (e.g. protocol analyzer w/ tag declaration) and BIF-items.

Ack.

(For later: we should add support for a README.rst coming with a
plugin that would go to the top of its section; and probably also
additional README.rst for each component it provides).

4) For the BIF-items, the reST markup would just use some directive to
duplicate the content from where they're originally declared (the
per-file documentation). Alternatively it could just be a link to
that declaration; I'll probably try that first since it's easier, but
I feel like it won't be enough.

It would be nice to have the content in there directly; duplicating is
fine for now if it doesn't mess up indexing/linking etc?

So, yes, I think this a good short-term compromise. I certainly want
avoid doing something now that we'd just be throwing away soon, and I
see that doing the full Broxygen overhaul will take more time.

Robin

1) Replace the "Built-In Functions (BIFs)" bullet under "Script Reference" with a "Plugin Reference" bullet. (better name to use here?)

We could just call it "Analyzer Reference" for now because there
aren't any other plugins.

I expect there be File Analyzer plugins real soon. Making another page for that shouldn't be difficult in the short-term, but...

I'm not quite sure how it should look longer term once we have
different component types, in particular because a single plugin can
provide different ones (like a reader and writer). We probably want to
group them by component type eventually, plus a separate plugin index
listing all of a plugin's components.

Separating by "plugin type" feels awkward because the type distinction doesn't actually exist. It's implicit by peeking at the components that belong to a plugin, but they aren't guaranteed to be all the same component type in the future. But we can probably just wait and see what component mixtures are there in the future and decide how to categorize them then.

Another thing is that the bif-items are associated with a plugin, not with particular components. So if we want docs-per-component, it's not clear what bif-items go with what component.

- Jon

I expect there be File Analyzer plugins real soon.

Looking forward to them. :slight_smile:

Separating by "plugin type" feels awkward because the type distinction
doesn't actually exist. It's implicit by peeking at the components
that belong to a plugin, but they aren't guaranteed to be all the same
component type in the future.

That's why I thought to eventually group by component type, e.g.:

    Analyzers