SMB Analyzer code factorization

Hi all,

As I looked into SMBv1 analyzer, I found that most of the files
describing SMB messages have code duplication. According to the SMB
specification ([MS-CIFS]), SMB messages are composed of a fixed-length
header (defined as SMB_Header in smb1-protocol.pac for Bro) and then of
two "blocks" : the parameter block (section 2.2.3.2) and the data block
(section 2.2.3.3). Every message of the protocol I saw in the
specification or encountered in trafic always followed that scheme.
However, in Bro's SMBv1 analyzer, each .pac file describing a particular
type of message reimplement the two blocks common fields.

The SMB PDU is defined as such :

type SMB_PDU(...) = record {

header: SMB\_Header\(\.\.\.\)

message: case msg\_len of \{

    35 \-> no\_msg: SMB\_No\_Message;

    default \-> msg: SMB\_Message\(\.\.\.\);

\};

};

SMB_Message is just a switch between SMB_Message_request and
SMB_Message_Response. Then these two are defined as a switch on command
type :

type SMB_Message_Request(...) = case command of {

SMB\_COM\_READ\_ANDX \-> read\_andx: SMB1\_read\_andx\_request\(\.\.\.\);

\.\.\.

};

And then every command is implemented like this one :

type SMB_read_andx_request(...) = record {

word\_count: uint8

\.\.\. specific fields \.\.\.

byte\_count: uint16;

\.\.\. specific fields \.\.\.

};

I feel like it would be a good idea to abstract the two blocks : it
would factorize code. Also, there is currently no check on the value of
word_count field or byte_count field for the rest of the payload. This
could lead to protocol violations from BinPAC if you have a byte_count
set to 0 and then a following field trying to parse a SMB_String for
example (see SMB1_transaction_request record in smb1-com-transaction.pac
for an example of that).

A solution could be to change SMB_Message_* to something closer to what
the specification describe by dividing every message in two half
structures :

type SMB_Message_Request(...) = record {

parameter\_block: SMB\_Request\_Parameters\(\.\.\.\);

data\_block: SMB\_Request\_Data\(\.\.\.\);

};

type SMB_Request_Parameters(...) = record {

word\_count: uint8;

\# Maybe a check on word\_count here? Some messages seems to have a

predefined value for this field

words: SMB\_Request\_Words\(\.\.\.\);

};

type SMB_Request_Data(...) = record {

byte\_count: uint16;

data\_or\_not: case byte\_count of \{

    0 \-> none: empty;

    default \-> bytes: SMB\_Request\_Bytes\(\.\.\.\);

\};

};

And then SMB_Request_Words and SMB_Request_Bytes would be switch on
different messages types :

type SMB_Request_Words(...) = case command of {

SMB\_COM\_READ\_ANDX \-> words\_read\_and\_x:

SMB1_Words_read_andx_request(...);

\.\.\.

};

type SMB_Request_Bytes(...) = case command of {

SMB\_COM\_READ\_ANDX \-> bytes\_read\_and\_x:

SMB1_Bytes_read_andx_request(...);

\.\.\.

};

Of course, it means splitting every existing message and re factoring a
lot of code of the analyzer, but it would be easier then to address
problems such as the one quoted as example above. What do you think of it ?

Regards,