I'm thinking of moving the IOSource infrastructure into its own
subdirectory/namespace and turning the IOSourceRegistry into
iosource::Manager, in line with the layout we've started moving to
for logging/input/etc. I'd then move the classes derived from
IOSource into corresponding subdirectories, along these lines:
The sources would turn into plugin components. New types of packet
sources (like netmap) would then go into iosource/pkt-src/foo/.
Does that make sense?
One piece where I'm unsure: would it be better to keep the remote
serializer out of this and instead do a separate serializer/ hierarchy
where all the current serialization/communication code goes?
Maybe the best approach would be to refactor the remote serializer so that the code implementing the IOSource interface lives in the iosource/ tree, while the code implementing the Serializer interface lives in a separate serializer/ tree?
Though to take the reorg and plugin adaptation one step at a time, it would make sense to just stick it in iosource/ for now and then later consider what pieces, if any, to pull out and put in serializer/.
Would the input framework code be morphed into iosource/sources, or
continue living in its own directory?
Re breaking out communication and serialization into its own place:
it seems to have a distinct function outside of I/O. It is used by
I/O, but it can work outside of the core functionality. Making this
a one-step-at-a-time reorg, as Jon suggests, might be a less complex
and less destructive way to go about it...
No, the input framework is separate. The threading manager could be
affected (it's an IOSource as well, and the input framework uses it)
but that's probably best left where it is for now as well (it's
actually a similar question as with RemoteSerializer; didn't realize
that yesterday).
To document our conversation from yesterday, flow-src should probably be thrown out and the netflow analyzer turned into a file analyzer. Extending the input framework to be able to open raw sockets would then enable us to create an input stream holding open a datagram socket and attach the netflow file analyzer to it. This would simplify the whole thing and make it possible to reuse the netflow analyzer code because we could yank netflow directly off the wire with it too (pending some analyzer infrastructure re-architecting).
As I'm working on the reorg, I propose to do the following:
- Remove flow sources completely for now. Per the discussion above,
we should eventually turn them into a file analyzer, and it doesn't
look worth the effort (nor the ugliness) to migrate them over to
the new structure first only to throw them out later. I'd be
surprised if anybody is using them anyway.
- Remove the secondary path from the packet-layer code. We have
discussed this before and at that time decided to keep the
code; see [BIT-434] on the Bro tracker.
However, I propose to go ahead and remove it now because (1) it
doesn't really fit the new structure of making the API (mostly)
pcap-independent (it never really fit in well in the first
place, and has made the code a lot more complex); (2)
large-conns.bro seems to be the only actual use case, which we
don't ship with 2.x anymore, and I'm not convinced that by
itself warrants a separate data path (can we find a different
solution to the problem?); and (3) it would be quite a bit of
additional effort to port the code and make sure it still works
(we don't have any tests, not surprisingly).
Everything sounds good to me.
large-conns.bro seems to be the only actual use case, which we
don't ship with 2.x anymore, and I'm not convinced that by
itself warrants a separate data path (can we find a different
solution to the problem?)
I think the only reason that large-conns.bro used the secondary data path was to get access to traffic that Bro might not normally be seeing due to filtering. With our change to a default open filter (and I don't see this changing anytime soon), we don't need the secondary path to let in more traffic than is normally allowed.