From the “Class Layout” picture, every analyzer is derived from class “Analyzer”, but the wording also says that “The root node must always be of type TransportLayerAnalyzer.” So which one is the real root in Bro’s code? [...] analyzer directly derived from “Analyzer”) are located in this analyzer tree structure.
In the section “Determining Analyzer Activation”, I am also confused about the method for activating an analyzer on all connections. Foo_Analyzer is derived from TCP_ApplicationAnalyzer, so why is this Foo_Analyzer added as a child of TCP_Analyzer?
tcp->AddChildAnalyzer(new Foo_Analyzer(conn));
So what is the difference between TCP_ApplicationAnalyzer and TCP_Analyzer?
1. From the "Class Layout" picture, every analyzer is derived from class
"Analyzer", but the wording also says that "The root node must always be of
type TransportLayerAnalyzer." So which one is the real root in Bro's
code? [...] analyzer directly derived from "Analyzer") are located in this
analyzer tree structure.
There are two different trees here: (1) the class hierarchy, which is
shown on the Wiki page and in which the Analyzer class is the root;
(2) the tree of analyzer *instances* instantiated for each connection
at runtime. In the latter, a TransportLayerAnalyzer instance must be
the root. The paper may help: http://www.icir.org/robin/papers/usenix06.pdf
2. So what is the difference between TCP_ApplicationAnalyzer and
TCP_Analyzer?
The TCP_Analyzer analyzes TCP itself, while a TCP_ApplicationAnalyzer
analyzes an application-layer protocol that's running on top of TCP.
The former passes payload data on to the latter, which is why they are
linked in the analyzer tree.
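To make the two trees concrete, here is a minimal, self-contained sketch. It is a toy model, not Bro's actual code: the class names follow the thread, but the bodies, the Name() and Parent() helpers, BuildTreeForConnection(), and the exact base classes chosen for TCP_Analyzer and TCP_ApplicationAnalyzer are assumptions made for illustration. The point it tries to show is that inheritance (the class hierarchy) and parent/child links (the per-connection instance tree) are independent of each other.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the common base class (simplified, not Bro's real Analyzer).
class Analyzer {
public:
    virtual ~Analyzer() = default;
    virtual std::string Name() const = 0;

    // Instance tree: an analyzer *object* owns its child analyzer objects.
    Analyzer* AddChildAnalyzer(std::unique_ptr<Analyzer> child) {
        assert(child != nullptr);
        child->parent_ = this;
        children_.push_back(std::move(child));
        return children_.back().get();
    }

    const Analyzer* Parent() const { return parent_; }
    const std::vector<std::unique_ptr<Analyzer>>& Children() const {
        return children_;
    }

private:
    Analyzer* parent_ = nullptr;
    std::vector<std::unique_ptr<Analyzer>> children_;
};

// Class hierarchy: Analyzer is the root of the *inheritance* tree.
// (The direct base classes below are assumptions of this sketch.)
class TransportLayerAnalyzer : public Analyzer {};

class TCP_Analyzer : public TransportLayerAnalyzer {
public:
    std::string Name() const override { return "TCP"; }
};

class TCP_ApplicationAnalyzer : public Analyzer {};

class Foo_Analyzer : public TCP_ApplicationAnalyzer {
public:
    std::string Name() const override { return "Foo"; }
};

// Hypothetical helper: builds the instance tree for one connection.
// The root object is a TransportLayerAnalyzer (here a TCP_Analyzer);
// Foo_Analyzer is its child even though it does not inherit from it.
std::unique_ptr<TCP_Analyzer> BuildTreeForConnection() {
    auto tcp = std::make_unique<TCP_Analyzer>();
    tcp->AddChildAnalyzer(std::make_unique<Foo_Analyzer>());
    return tcp;
}
```

In this model, Foo_Analyzer "is an" Analyzer by inheritance, while at runtime a Foo_Analyzer object "is a child of" a TCP_Analyzer object, which mirrors the tcp->AddChildAnalyzer(...) call quoted in the question.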
For 1, I am OK. For 2, I am still confused; please see the inline comment.
So it seems that TCP_ApplicationAnalyzer behaves like a helper interface between the TCP protocol and the application protocols running over TCP. I would also like to learn how TCP_Analyzer passes payload to TCP_ApplicationAnalyzer in the implementation. For the DNP3 protocol, I actually have to write two application-level analyzers, where one passes its payload to the other for further parsing. I would like to use TCP’s implementation as a reference.
TCP's data flow is more complex than you need (I believe) because the
TCP reassembler is potentially involved too. In your case, the first
analyzer would call its ForwardStream(), and the data will then show
up in the second's DeliverStream() method.
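The hand-off described here can be sketched with a simplified model. This is not Bro's real API: the actual ForwardStream() and DeliverStream() take a length, a byte pointer, and a direction flag, and involve more machinery than shown. The std::string payloads and the FirstAnalyzer/SecondAnalyzer classes are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified analyzer base: ForwardStream() hands data to each child's
// DeliverStream(). Payloads are std::string here for brevity.
class Analyzer {
public:
    virtual ~Analyzer() = default;

    void AddChildAnalyzer(std::shared_ptr<Analyzer> child) {
        assert(child != nullptr);
        children_.push_back(std::move(child));
    }

    // Pass data down to every child in the instance tree.
    void ForwardStream(const std::string& data, bool is_orig) {
        for (auto& child : children_)
            child->DeliverStream(data, is_orig);
    }

    // Called when stream data arrives at this analyzer; default does nothing.
    virtual void DeliverStream(const std::string& /*data*/, bool /*is_orig*/) {}

private:
    std::vector<std::shared_ptr<Analyzer>> children_;
};

// Hypothetical first analyzer: would parse its own protocol here,
// then pass the data on to its children.
class FirstAnalyzer : public Analyzer {
public:
    void DeliverStream(const std::string& data, bool is_orig) override {
        ForwardStream(data, is_orig);
    }
};

// Hypothetical second analyzer: records what it receives, tagged by direction.
class SecondAnalyzer : public Analyzer {
public:
    void DeliverStream(const std::string& data, bool is_orig) override {
        received.push_back(std::string(is_orig ? "orig:" : "resp:") + data);
    }
    std::vector<std::string> received;
};
```

With a SecondAnalyzer added as a child of a FirstAnalyzer, whatever the first one passes to ForwardStream() arrives in the second one's DeliverStream(), without the two classes being related by inheritance.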
After checking Bro’s code (especially Analyzer.h and Analyzer.cc), I think the logic is as follows (please point out if I am wrong):
TCP_Analyzer parses the TCP protocol and extracts the payload (the input to the application-level protocol analyzer). This payload is passed up to the Analyzer class (how it is passed is not clear to me).
The Analyzer then has this stream of TCP payload, which is the input to ForwardStream. ForwardStream then calls the Analyzer’s children so that each runs its own DeliverStream. In each DeliverStream implementation, the BinPAC Conn function is used to parse the stream. So in my opinion, the stream does not actually change on its way from Analyzer through TCP_ApplicationAnalyzer to the application-level analyzer.
But in my situation, I have two application-level protocols, p1 and p2. p1 derives from TCP_ApplicationAnalyzer, and p1 needs to parse the TCP-level stream and reconstruct a new stream from it, rather than passing the stream directly to p2. So I think what I can do is let p2 derive from p1, and then define an event handler in p1 that reconstructs the stream as p1 parses its protocol. In this event handler, we have the reconstructed stream and can use it as the input to p1’s ForwardStream. p2 still defines DeliverStream as usual, and in this way p2’s protocol analyzer should be able to get the reconstructed stream.
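As a rough illustration of the p1/p2 idea, here is a toy sketch in which p1 rebuilds a payload stream and forwards only the rebuilt data. It follows the earlier suggestion in the thread (p1 calls ForwardStream(), p2 receives the data in DeliverStream()) by making p2 a child of p1 in the instance tree rather than a subclass. Everything specific is an assumption: the one-byte length-prefix framing is invented and is not DNP3's real format, and the classes are simplified stand-ins rather than Bro's API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified analyzer base (same toy model idea as Bro's, not its real API).
class Analyzer {
public:
    virtual ~Analyzer() = default;

    void AddChildAnalyzer(std::shared_ptr<Analyzer> child) {
        assert(child != nullptr);
        children_.push_back(std::move(child));
    }

    void ForwardStream(const std::string& data, bool is_orig) {
        for (auto& child : children_)
            child->DeliverStream(data, is_orig);
    }

    virtual void DeliverStream(const std::string& /*data*/, bool /*is_orig*/) {}

private:
    std::vector<std::shared_ptr<Analyzer>> children_;
};

// Hypothetical p1: strips an invented framing (one length byte followed by
// that many payload bytes) and forwards only the reconstructed payload.
class P1_Analyzer : public Analyzer {
public:
    void DeliverStream(const std::string& data, bool is_orig) override {
        std::string& buf = buffers_[is_orig ? 0 : 1];  // one buffer per direction
        buf += data;
        while (!buf.empty()) {
            std::size_t len = static_cast<unsigned char>(buf[0]);
            if (buf.size() < 1 + len)
                break;  // frame incomplete: wait for more stream data
            ForwardStream(buf.substr(1, len), is_orig);
            buf.erase(0, 1 + len);
        }
    }

private:
    std::string buffers_[2];
};

// Hypothetical p2: sees only the payload that p1 reconstructed.
class P2_Analyzer : public Analyzer {
public:
    void DeliverStream(const std::string& data, bool /*is_orig*/) override {
        received.push_back(data);
    }
    std::vector<std::string> received;
};
```

In this sketch the stream that p2 sees differs from the stream p1 received, because p1 forwards what it reconstructed instead of its raw input. Whether a parent/child link or inheritance fits the real DNP3 analyzers better is a design decision this toy model does not settle.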