binpac crash triggered by exportsourcedata

Thank you Ronka.

It is a flowunit analyzer. I checked the zeek source tree and found that only one flowunit
analyzer (tls-handshake) uses the exportsourcedata directive. I guess that exportsourcedata only
applies to non-incremental types. Maybe both of these are true:

  • all types in a datagram analyzer can use the exportsourcedata directive

  • only non-incremental types in a flowunit analyzer can use exportsourcedata

But I’m not sure what a non-incremental type is; I have to check the generated code.

The reason I want the sourcedata field is that I want to feed the whole test_pdu to another
analyzer. For now, as a workaround, I have to do something like this:

test_rpc->DeliverStream(${data}.length() + 4, ${data}.begin() - 4, is_orig);

to bring back the first 4 bytes and form the original whole PDU.

Maybe I should try datagram analyzer.

------------------ Original ------------------

It turns out that my workaround is wrong. Every field is a copy, so field.begin() - 4 will not give you the address of the data 4 bytes before the field. It may contain random data, or even be an illegal address.

------------------ Original ------------------

Hi Ronka,

The protocol I’m trying to analyze supports multiple authentication methods, including SASL Kerberos GSSAPI.
After authentication, depending on the authentication method chosen and the security layer negotiated, the RPC
requests/responses that follow can be in plain text, signed, or encrypted.

In the plain-text form, the PDU looks like:
<4 bytes length field>
<request/response data with length indicated by the 4 bytes length field>

While in the signed or encrypted form, the outermost layer of the PDU looks like:
<4 bytes length field>
<Kerberos 5 GSSAPI Wrap Token with length indicated by the 4 bytes length field>

In the latter case, the RPC request/response PDU (including the 4-byte length field indicating the length of the
request/response data) is encapsulated in the Wrap Tokens. It is possible that a big RPC request/response will
be carried by multiple Wrap Token PDUs.

So I have two analyzers:

  • controlling analyzer: deals with authentication and decryption, forwards decrypted RPC PDU data to the RPC analyzer
  • RPC analyzer: decodes RPC requests/responses

I need &exportsourcedata for the plain-text case, in which the whole controlling-analyzer PDU should be
forwarded to the RPC analyzer.

Today I will try to change the flow type of the controlling analyzer to datagram.

Best regards,

------------------ Original ------------------

I tried setting the flow type of the controlling analyzer to datagram and used &exportsourcedata. Although
the resulting analyzer works great against my test pcap file, after checking the code binpac generated
I think the datagram analyzer is not suited to TCP-based protocols. Below is the generated code:

2308 void TEST_Flow::NewData(const_byteptr t_begin_of_data, const_byteptr t_end_of_data)
2309 {
2310     try
2311     {
2312         dataunit_ = new TEST_PDU(is_orig());
2313         context_ = new ContextTEST(connection(), this);
2314         int t_dataunit__size;
2315         t_dataunit__size = dataunit_->Parse(t_begin_of_data, t_end_of_data, context_);
2316         // Evaluate 'let' and 'withinput' fields
2317         delete dataunit_;
2318         dataunit_ = 0;
2319         delete context_;
2320         context_ = 0;
2321     }
2322     catch ( binpac::Exception const &e )
2323     {
2324         delete dataunit_;
2325         dataunit_ = 0;
2326         delete context_;
2327         context_ = 0;
2328         throw;
2329     }
2330 }

Notice that at line #2312, every piece of delivered data is treated as a new PDU, which obviously is not good
for a TCP data stream.

I think the only option I have now is to build a new bytestring from the length and data fields and feed it to
the RPC analyzer. This solution is bad from a performance point of view, since we have to do two extra memory
copies: first to generate the data field, second to regenerate the original whole PDU.

------------------ Original ------------------

Just realized that the ASSERT() in binpac’s is only effective in a DEBUG build; the non-debug
version will happily accept &exportsourcedata for incremental inputs. But I need the DBG_LOG() macro, so I
just commented out the 2 ASSERT()s in, and everything goes smoothly so far :)

diff --git a/src/ b/src/
index 1b827ea..eb8b868 100644
--- a/src/
+++ b/src/
@@ -837,7 +837,7 @@ bool Type::AddSizeVar(Output* out_cc, Env* env)
 	if ( StaticSize(env) >= 0 )
 		return false;
 
-	ASSERT(! incremental_input());
+	//ASSERT(! incremental_input());
 
 	ID* size_var_id = new ID(strfmt("%s__size",
 		value_var() ? value_var()->Name() : decl_id()->Name()));
@@ -854,7 +854,7 @@ bool Type::AddSizeVar(Output* out_cc, Env* env)
 
 string Type::EvalLengthExpr(Output* out_cc, Env* env)
 	{
-	ASSERT(!incremental_input());
+	//ASSERT(!incremental_input());
 	int static_length;
 	if ( attr_length_expr_->ConstFold(env, &static_length) )

I have checked the generated code; everything is fine except a harmless superfluous statement near the end
of the type’s ParseBuffer() method:

sourcedata_.set_end(t_begin_of_data + t_TEST_Req_plain__size);

I really cannot think of a reason why incremental types cannot have a sourcedata field.

------------------ Original ------------------

Hi Ronka,

Did you mean doing everything in a single analyzer? That would make things complicated. As I said, the clear text extracted from
a single Wrap Token may be just a fragment of an RPC PDU, so we need to reassemble those fragments into a complete RPC PDU,
then feed the resulting RPC PDU to an RPC type.

The simplest solution for the reassembly I can think of is to delegate the work to a dedicated RPC flowunit analyzer. Please
note that this is a completely different analyzer. I have 2 analyzers in 1 plugin (2 AddComponent() in

The other solution I can think of is to do the reassembly inside the flow or connection, maybe implemented with FlowBuffer. But
I think the code would not be trivial (more state to keep, more boundary checks to do, buffer management …) and I’m too lazy …


------------------ Original ------------------