(I realized this slipped through the cracks, sorry for the late
feedback, hope it still helps)
- What would be the lowest layer to build upon, or should everything be
pluggable down to the packet source?
I see three pieces here overall that I think can be tackled
independently:
(1) Link-layer: Currently hardcoded in Packet::ProcessLayer2()
(2) IP-Layer: Currently hardcoded in NetSessions::NextPacket()
(3) Transport-layer: Currently hardcoded in NetSessions::DoNextPacket().
Case (1) is all about skipping the header to get to IP. There's some
redundancy across cases, though, and MPLS makes it all more messy.
With (2), a plugin would be able to add support for non-IP protocols.
However, due to Bro generally assuming that it is analyzing IP, the
plugin would either need to take care of such packets completely (like
ARP does), or eventually get to an IP packet that it can then feed
back for further analysis (like if it is some kind of tunnel).
Similar for (3): A plugin would be able to add support for further
transport layer protocols, but it'd be mostly about stripping
additional headers to eventually get to TCP/UDP/ICMP.
There's also a more general version of (2) and (3) where we'd remove
Bro's assumption of analyzing TCP/IP protocols. But that's a separate,
large effort by itself.
On a technical level, plugging in such low-level analyzers needs to be
very efficient, in particular if we move the currently hardcoded cases
into the plugins as well (as I think we should; similar to how
application-layer analyzers have all moved into internal plugins).
Then the lookup-the-analyzer-and-dispatch operation will happen
multiple times for every packet.
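To make the efficiency point concrete, here is a rough sketch of a
lookup-and-dispatch step that stays cheap when repeated per layer. It
is purely illustrative: PacketView, Handler, Dispatcher, and the two
toy handlers are hypothetical names, not existing Bro code, and the
one-byte protocol identifier is an assumption made to keep the table
flat.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical view of a packet: pointer into the buffer plus remaining length.
struct PacketView {
    const uint8_t* data;
    size_t len;
};

// Hypothetical handler signature: consume this layer's header, advance the
// view, and return the identifier of the next protocol (or -1 to stop).
using Handler = int (*)(PacketView&);

// Flat table indexed by a one-byte protocol identifier (an assumption).
// A lookup is a single array access, so doing it once per layer is cheap.
class Dispatcher {
public:
    void Register(uint8_t id, Handler h) { table_[id] = h; }

    // Walk the layers until a handler stops or none is registered.
    // Returns the number of handlers that ran.
    int Dispatch(uint8_t id, PacketView& pkt) const {
        int hops = 0;
        int next = id;
        while (next >= 0 && next <= 255) {
            Handler h = table_[static_cast<size_t>(next)];
            if (!h)
                break;
            next = h(pkt);
            ++hops;
        }
        return hops;
    }

private:
    std::array<Handler, 256> table_{};  // value-initialized: all nullptr
};

// Toy handler: skips a made-up 4-byte header and hands off to protocol 2.
int SkipOuter(PacketView& p) {
    if (p.len < 4)
        return -1;
    p.data += 4;
    p.len -= 4;
    return 2;
}

// Toy terminal handler: takes the packet and ends the chain.
int Terminal(PacketView&) { return -1; }
```

With SkipOuter registered as 1 and Terminal as 2, dispatching a
10-byte packet as protocol 1 runs both handlers and leaves 6 bytes of
payload. A real version would need per-layer identifier spaces
(EtherType, IP protocol number, and so on) rather than one shared
byte.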
- What about the concept of connections? For some LL protocols the
concept might be counterintuitive.
A couple of cases there:
- If there's really no sense of a connection, then the plugin will
need to take complete care of the packets, as the rest of Bro
assumes connection-semantics.
- If it's just the definition of a connection that is
different, then I think we could make that more flexible. I've been
hoping for a while that we can make Bro's notion of connection IDs
dynamic, so that it's not necessarily just the 5-tuple. There are
use cases outside of new protocols for this, too. For example, one
could include the VLAN ID to deal with overlapping IP ranges in
independent VLANs.
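As an illustration of the VLAN example only (not of a fully dynamic
connection ID), a key could carry the VLAN ID next to the 5-tuple so
that identical 5-tuples in different VLANs map to separate
connections. ConnKey, ConnKeyHash, and ConnTable are hypothetical
names; the sketch assumes IPv4 addresses and uses 0 for untagged
traffic, neither of which reflects how Bro stores connection IDs.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <unordered_map>

// Hypothetical connection key: the usual 5-tuple plus the VLAN ID.
// IPv4-only and "0 means untagged" are simplifying assumptions.
struct ConnKey {
    uint32_t src_addr;
    uint32_t dst_addr;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t proto;
    uint16_t vlan;

    bool operator==(const ConnKey& o) const {
        return std::tie(src_addr, dst_addr, src_port, dst_port, proto, vlan) ==
               std::tie(o.src_addr, o.dst_addr, o.src_port, o.dst_port, o.proto, o.vlan);
    }
};

// Simple hash that mixes every field, including the VLAN ID.
struct ConnKeyHash {
    size_t operator()(const ConnKey& k) const {
        size_t h = 0;
        auto mix = [&h](size_t v) { h ^= v + 0x9e3779b9 + (h << 6) + (h >> 2); };
        mix(k.src_addr);
        mix(k.dst_addr);
        mix(k.src_port);
        mix(k.dst_port);
        mix(k.proto);
        mix(k.vlan);
        return h;
    }
};

// Connection table keyed by the extended ID; the int stands in for real state.
using ConnTable = std::unordered_map<ConnKey, int, ConnKeyHash>;
```

Two keys that differ only in the vlan field compare unequal and so
occupy separate table entries. Making the set of fields configurable
at runtime, which is what a dynamic connection ID would need, is not
shown here.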
- The interface should support passing payload to other analyzers. Does
it make sense to come up with a generalized DPD-mechanism?
Not quite sure what you're thinking here, but I believe that fully
solving this would require addressing Bro's overall assumption of
analyzing TCP/IP. For now, maybe the best way would be just having the
analyzer call back into entry points corresponding to the various
layers where analysis would then proceed as normal. I.e., some
variation of: ProcessLinkLayer(...), ProcessIP(...),
ProcessTransport(data), ProcessAppLayer(...). The caller would be
responsible for providing all the right (meta-)data, like IP headers.
Were you thinking something different / more general?
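To sketch the entry-point idea: a tunnel-style plugin strips its own
header and calls back into the IP entry point, which then continues
with the transport entry point. Everything here is hypothetical
(Engine, IPMeta, ToyTunnelAnalyzer, the 4-byte toy header, the exact
signatures); it only borrows the ProcessIP/ProcessTransport names from
above and handles plain IPv4 without options beyond the IHL field.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical metadata handed to the transport entry point.
struct IPMeta {
    uint32_t src = 0;
    uint32_t dst = 0;
    uint8_t proto = 0;
};

// Hypothetical stand-in for the core, exposing per-layer entry points.
class Engine {
public:
    std::vector<std::string> trace;  // records which entry points ran
    IPMeta last_meta;                // metadata seen by the transport layer

    // IP entry point: parses a minimal IPv4 header, then continues upward.
    void ProcessIP(const uint8_t* data, size_t len) {
        trace.push_back("ip");
        if (len < 20 || (data[0] >> 4) != 4)
            return;  // not a complete IPv4 header
        size_t hdr_len = static_cast<size_t>(data[0] & 0x0f) * 4;
        if (hdr_len < 20 || hdr_len > len)
            return;
        IPMeta meta;
        meta.src = Load32(data + 12);
        meta.dst = Load32(data + 16);
        meta.proto = data[9];
        ProcessTransport(meta, data + hdr_len, len - hdr_len);
    }

    // Transport entry point: the caller supplies the IP-level metadata.
    void ProcessTransport(const IPMeta& meta, const uint8_t* data, size_t len) {
        (void)data;
        (void)len;
        trace.push_back("transport");
        last_meta = meta;
    }

private:
    // Reads a big-endian 32-bit value.
    static uint32_t Load32(const uint8_t* p) {
        return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
               (uint32_t(p[2]) << 8) | uint32_t(p[3]);
    }
};

// Hypothetical tunnel plugin: skips a made-up 4-byte header and feeds the
// inner packet back into the IP entry point.
void ToyTunnelAnalyzer(Engine& e, const uint8_t* data, size_t len) {
    const size_t kToyHeaderLen = 4;
    if (len < kToyHeaderLen)
        return;
    e.ProcessIP(data + kToyHeaderLen, len - kToyHeaderLen);
}
```

A plugin sitting at the transport layer would instead build the IPMeta
itself and call ProcessTransport directly, which is where the burden
of providing the right metadata falls on the caller.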
Robin