binpac with PDUs in multiple segments

Hi all,

I’m attempting to use binpac to create an analyzer that works with packets where the application protocol data unit might be broken across multiple TCP segments. Specifically, trying to handle TPKT/COTP in a reasonably robust way, and it’s not working as I expected. Trying to figure out if I’m doing it wrong, or if my expectation is simply wrong, or if it’s broken. I’ve pulled the key bits out into a separate skeleton analyzer.

Here’s my .pac:

type HDR = record {
version: uint8;
reserved: uint8;
len: uint16;
} &byteorder=bigendian;

type FOO_PDU(is_orig: bool) = record {
hdr: HDR;
plen: uint8;
ptype: uint8;
something: bytestring &restofdata;
} &byteorder=bigendian, &length=hdr.len;

flow FOO_Flow(is_orig: bool) {
flowunit = FOO_PDU(is_orig) withcontext(connection, this);

datagram = FOO_PDU(is_orig) withcontext(connection, this);


The trace file I have has the HDR part (4 bytes: actually a TPKT) in the first TCP segment, and the remainder in the next packet. If I run it, it throws an ExceptionOutOfBound with the message:

FOO_PDU:ptype: 6 > 4

Accurate enough, but my expectation was the flowunit together with the &length clause should cause the analyzer to first accumulate enough data for the HDR part, then dereference the length embedded there, then accumulate more data until the &length provided by the header field is fulfilled in order to fill out the rest of the structure. But that doesn’t happen. (Indeed you can see in the generated .cc that hdr.len is never evaluated.)

I tried putting a support analyzer in front of it that first accumulates the correct number of bytes before passing to the binpac code. If I do that, then remove the &length clause and change the flow to use the datagram = FOO_PDU clause, it works correctly. But isn’t that what the flowunit is supposed to do? That’s what the documentation page implies:

So, really the question is: should this work? And if so, what am I doing wrong?

You can pull and build my code from here if you like:

The rest of the code is pretty much straight from the binpac quickstart. There is a pcap in the traces directory that invokes the analyzer.

Any help in understanding this greatly appreciated.

I didn't notice anything wrong with what you were trying, but I did
find the parsing logic generated by binpac was wrong. I made a PR
with a fix (and using your test case) here:

Specifically this binpac patch in case you want to use it and work
ahead until the bugfix gets reviewed/merged into Zeek.

- Jon