Proposed IOSource reorg

I'm thinking of moving the IOSource infrastructure into its own
subdirectory/namespace and turn the IOSourceRegistry into
iosource::Manager in alignment with the layout we've started to move
to with the logging/input/etc. I'd then move the classes derived from
IOSource into corresponding subdirectories, like this:

    src/iosource/
    src/iosource/Manager.{h,cc}
    src/iosource/IOSource.{h,cc}
    src/iosource/sources/pkt-src/PktSrc.{h,cc}
    src/iosource/sources/pkt-src/bpf/*
    src/iosource/sources/flow-src/*
    src/iosource/sources/dns-mgr/*
    src/iosource/sources/remote-serializer/*

The sources would turn into plugin components. New types of packet
sources (like netmap) would then go into iosource/sources/pkt-src/foo/.
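
A rough sketch of how the namespace layout could look in code. Only the names iosource::Manager and IOSource come from the proposal; every method, the pktsrc namespace, and DummySource are hypothetical stand-ins and not the existing interface:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

namespace iosource {

// Minimal stand-in for the IOSource base class (hypothetical methods).
class IOSource {
public:
    virtual ~IOSource() = default;
    virtual bool IsOpen() const = 0;
    virtual void Process() = 0;  // handle pending input, if any
};

// Stand-in for what is currently the IOSourceRegistry.
class Manager {
public:
    // Takes ownership of the source.
    void Register(std::unique_ptr<IOSource> src) {
        assert(src != nullptr);
        sources.push_back(std::move(src));
    }

    // Lets every open source process its input; returns how many did.
    int ProcessAll() {
        int n = 0;
        for (auto& s : sources) {
            if (s->IsOpen()) {
                s->Process();
                ++n;
            }
        }
        return n;
    }

    std::size_t Size() const { return sources.size(); }

private:
    std::vector<std::unique_ptr<IOSource>> sources;
};

namespace pktsrc {

// Hypothetical packet source, standing in for something like netmap.
class DummySource : public IOSource {
public:
    explicit DummySource(bool open) : open(open) {}
    bool IsOpen() const override { return open; }
    void Process() override { ++processed; }
    int Processed() const { return processed; }

private:
    bool open;
    int processed = 0;
};

}  // namespace pktsrc
}  // namespace iosource
```

The point of the sketch is only that the manager knows nothing about concrete source types, so a new packet source can be added under the pkt-src tree without touching Manager.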

Does that make sense?

One piece where I'm unsure: would it be better to keep the remote
serializer out of this and instead do a separate serializer/ hierarchy
where all the current serialization/communication code goes?

Robin

Maybe it would be best to refactor the remote serializer code so that the code implementing the IOSource interface lives in the iosource/ tree, while the code implementing the Serializer interface lives in a separate serializer/ tree?

Though if we take the reorg and plugin adaptation one step at a time, it would make sense to just stick it in iosource for now and later consider what, if any, pieces to pull out and put in serializer/.
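
To make the split concrete, here is a minimal sketch. It assumes the IOSource-facing part can be a thin adapter that delegates all encoding to a serializer object. RemoteSource, TextSerializer, and all method names are hypothetical; the real IOSource and Serializer interfaces are larger, and it is not clear yet how cleanly RemoteSerializer would separate along this line:

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace serializer {

// Stand-in for the serialization side: turns values into bytes and
// knows nothing about the I/O loop.
class Serializer {
public:
    virtual ~Serializer() = default;
    virtual std::string Serialize(int value) const = 0;
};

// Hypothetical trivial encoding, for illustration only.
class TextSerializer : public Serializer {
public:
    std::string Serialize(int value) const override {
        return "int:" + std::to_string(value);
    }
};

}  // namespace serializer

namespace iosource {

// Stand-in for the IOSource interface (hypothetical single method).
class IOSource {
public:
    virtual ~IOSource() = default;
    virtual void Process() = 0;
};

// Hypothetical adapter: the only piece that would live in iosource/.
// It queues values and hands them to the serializer when processed.
class RemoteSource : public IOSource {
public:
    explicit RemoteSource(const serializer::Serializer& s) : ser(s) {}

    void Queue(int value) { pending.push_back(value); }

    void Process() override {
        for (int v : pending)
            sent.push_back(ser.Serialize(v));
        pending.clear();
    }

    const std::vector<std::string>& Sent() const { return sent; }

private:
    const serializer::Serializer& ser;
    std::vector<int> pending;
    std::vector<std::string> sent;
};

}  // namespace iosource
```

With composition like this, sticking everything in iosource for now and moving the serializer:: half out later would mostly be a matter of relocating files.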

- Jon

Would the input framework code be morphed into iosource/sources or
continue living in its own directory?

Re breaking out communication and serialization into its own place,
it seems like it has a distinct function outside of i/o - it is used
by i/o, but it can work outside of the core functionality. Making
this a one step at a time reorg as Jon suggests might be a less
complex and destructive way to go about that...

cheers,
scott

Could be an option, though I'm not immediately sure how well it would
split.

But one step at a time sounds good in any case, so I'll go ahead with
that and we can later see.

Robin

No, the input framework is separate. The threading manager could be
affected (it's an IOSource as well, and the input framework uses it)
but that's probably best left where it is for now as well (it's
actually a similar question as with RemoteSerializer; didn't realize
that yesterday).

Robin

To document our conversation from yesterday, flow-src should probably be thrown out and the netflow analyzer turned into a file analyzer. Extending the input framework to be able to open raw sockets would then enable us to create an input stream holding open a datagram socket and attach the netflow file analyzer to it. This would simplify the whole thing and make it possible to reuse the netflow analyzer code because we could yank netflow directly off the wire with it too (pending some analyzer infrastructure re-architecting).
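
As an illustration of the reuse argument, here is a minimal sketch of an analyzer that only sees byte chunks and so does not care whether they came from a file or from a datagram socket. The class and method names are hypothetical and not the file analysis API. It also assumes each chunk holds exactly one NetFlow v5 export packet (24-byte header followed by 48-byte records), which would need checking for file input:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a 16-bit big-endian (network order) value.
static uint16_t ReadU16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// Hypothetical analyzer: it is handed raw chunks and never learns
// where they originated.
class NetFlowAnalyzer {
public:
    // Returns false if the chunk is not a complete NetFlow v5 packet.
    bool DeliverChunk(const uint8_t* data, std::size_t len) {
        const std::size_t header_len = 24;  // NetFlow v5 header size
        const std::size_t record_len = 48;  // NetFlow v5 record size

        if (len < header_len)
            return false;

        uint16_t version = ReadU16(data);      // bytes 0-1
        uint16_t count = ReadU16(data + 2);    // bytes 2-3: record count

        if (version != 5)
            return false;

        if (len < header_len + static_cast<std::size_t>(count) * record_len)
            return false;

        records_seen += count;
        return true;
    }

    std::size_t RecordsSeen() const { return records_seen; }

private:
    std::size_t records_seen = 0;
};
```

An input stream holding open a datagram socket would call DeliverChunk() once per received datagram; a file reader would call it with the file contents.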

  .Seth

As I'm working on the reorg, I propose to do the following:

    - Remove flow sources completely for now. Per below, we should
      eventually turn them into a file analyzer, and it doesn't look
      worth the effort (nor the ugliness) to migrate them over to the
      new structure first only to throw them out later. I'd be
      surprised if anybody is using them anyways.

    - Remove the secondary path from the packet-layer code. We have
      discussed this before and at that time decided to keep the
      code; see BIT-434 in the Bro tracker.

      However, I propose to go ahead and remove it now because (1) it
      doesn't really fit the new structure of making the API (mostly)
      pcap-independent (it never really fit in well in the first
      place, and has made the code a lot more complex); (2)
      large-conns.bro seems to be the only actual use case, which we
      don't ship with 2.x anymore, and I'm not convinced that by
      itself warrants a separate data path (can we find a different
      solution to the problem?); and (3) it would be quite a bit of
      additional effort to port the code and make sure it still works
      (we don't have any tests, not surprisingly).

Thoughts?

Robin

     As I'm working on the reorg, I propose to do the following:

Everything sounds good to me.

     large-conns.bro seems to be the only actual use case, which we
     don't ship with 2.x anymore, and I'm not convinced that by
     itself warrants a separate data path (can we find a different
     solution to the problem?)

I think the only reason that large-conns.bro used the secondary data path was to get access to traffic that Bro would not normally see because it was being filtered out. With our change to a default open filter (and I don't see this changing anytime soon), we don't need the secondary path to let in more traffic than is normally allowed in.

  .Seth