TCP Reassembler question

Hi,

I have some questions regarding the TCP reassembler. I have a midstream
NFS connection (i.e., no handshake) with tons of data. The NFS analyzer can
handle gaps and partial connections; however, it seems that there are some
content gaps and that the TCP Reassembler doesn't recover from them.

* When I look at the packet level, I see data packets all the time, but
  the analyzer's DeliverStream stops being called about halfway
  through the trace.

* I don't get any calls to Undelivered() either (actually I get some
  at the very end of the trace, but the delivery stops way, way earlier).

* I *don't* get content_gap and ack above hole messages,
  because the connection doesn't have a handshake. Can I force that
  somehow? (So that I can debug where the gaps happen.)

* What's the Reassembler's default / intended behavior wrt gaps in
  partial connections? Are there any policy-level settings I can tweak?

cu
gregor

I have some questions regarding the TCP reassembler. I have a midstream
NFS connection (i.e., no handshake) with tons of data. The NFS analyzer can
handle gaps and partial connections

Are you sure it does okay with those? It's layered on top of the RPC
handler, and I would think that that will make it brittle to missing
byte-stream elements.

* When I look at the packet level, I see data packets all the time, but
  the analyzer's DeliverStream stops being called about halfway
  through the trace.

If it's a huge trace, then I'm guessing you may be encountering wrap-around with:

        if ( seq_delta(start_block->seq, last_reassem_seq) > 0 ||
             seq_delta(start_block->upper, last_reassem_seq) <= 0 )
                return;

Or have you already converted all of those variables to 64-bit?
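
To make the wrap-around concern concrete, here is a minimal sketch. It assumes seq_delta() is simply the signed 32-bit difference of its arguments (an assumption, not checked against the source), and first_block_deliverable() is a hypothetical helper that only mirrors the early-return test quoted above.

```cpp
#include <cstdint>

// Assumption: seq_delta() is the signed 32-bit difference a - b.
inline int32_t seq_delta(uint32_t a, uint32_t b)
	{
	return static_cast<int32_t>(a - b);
	}

// Hypothetical helper mirroring the quoted early-return test: the first
// block is deliverable only if it starts at or before last_reassem_seq
// and ends after it.
inline bool first_block_deliverable(uint32_t seq, uint32_t upper,
                                    uint32_t last_reassem_seq)
	{
	if ( seq_delta(seq, last_reassem_seq) > 0 ||
	     seq_delta(upper, last_reassem_seq) <= 0 )
		return false;

	return true;
	}
```

Under that assumption, a sequence number 1000 bytes ahead of last_reassem_seq gives a delta of 1000 even when the raw 32-bit value has wrapped past zero. One that is more than 2^31 bytes ahead gives a negative delta, so the block looks as if it had already been delivered and the test returns early.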

Alternatively, you may be running into the behavior I sketch in response
to your next question:

* I don't get any calls to Undelivered() either (actually I get some
  at the very end of the trace, but the delivery stops way, way earlier).

This sounds like skip_deliveries is getting set. This can happen due
to the following logic:

        if ( Endpoint()->NoDataAcked() && tcp_max_above_hole_without_any_acks &&
             NumUndeliveredBytes() > tcp_max_above_hole_without_any_acks )
                {
                tcp_analyzer->Weird("above_hole_data_without_any_acks");
                ClearBlocks();
                skip_deliveries = 1;
                }
            
        if ( tcp_excessive_data_without_further_acks &&
             NumUndeliveredBytes() > tcp_excessive_data_without_further_acks )
                {
                tcp_analyzer->Weird("excessive_data_without_further_acks");
                ClearBlocks();
                skip_deliveries = 1;
                }

So it could be that you have enough loss at some point that you're tripping
over one of these thresholds.
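
As a rough illustration of how buffered data could trip these checks, here is a standalone model of the two tests. ReassemblerModel, Thresholds and Buffer() are hypothetical names for this sketch only, and the threshold values used with it are illustrative rather than the shipped defaults.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the two policy-level thresholds.
struct Thresholds
	{
	uint64_t max_above_hole_without_any_acks;
	uint64_t excessive_data_without_further_acks;
	};

// Toy model of the reassembler state that the quoted checks look at.
struct ReassemblerModel
	{
	bool no_data_acked = true;      // stands in for Endpoint()->NoDataAcked()
	uint64_t undelivered_bytes = 0; // stands in for NumUndeliveredBytes()
	bool skip_deliveries = false;

	// Buffers len bytes above a hole and applies both checks in order.
	// Returns the name of the weird that would fire, or "" if none.
	std::string Buffer(uint64_t len, const Thresholds& t)
		{
		std::string weird;
		undelivered_bytes += len;

		if ( no_data_acked && t.max_above_hole_without_any_acks &&
		     undelivered_bytes > t.max_above_hole_without_any_acks )
			{
			weird = "above_hole_data_without_any_acks";
			undelivered_bytes = 0; // models ClearBlocks()
			skip_deliveries = true;
			}

		if ( t.excessive_data_without_further_acks &&
		     undelivered_bytes > t.excessive_data_without_further_acks )
			{
			weird = "excessive_data_without_further_acks";
			undelivered_bytes = 0; // models ClearBlocks()
			skip_deliveries = true;
			}

		return weird;
		}
	};
```

With a first threshold of 4096 bytes and no data acked yet, buffering 4000 bytes and then another 200 in this model sets skip_deliveries on the second call. Nothing in the model ever clears the flag again, which would match delivery stopping for the rest of the trace.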

* I *don't* get content_gap and ack above hole messages,
  because the connection doesn't have a handshake. Can I force that
  somehow? (So that I can debug where the gaps happen.)

Changing this would require source-code edits, because it's wired in as:

  if ( content_gap &&
       endpoint->state == TCP_ENDPOINT_ESTABLISHED &&
       peer->state == TCP_ENDPOINT_ESTABLISHED )

(there's a comment before this explaining the false positives that can
result for non-established connections, though I think it's reasonable to
give users an option to accept these possibilities)

    Vern

[Re-sync]

Are you sure it does okay with those? It's layered on top of the RPC
handler, and I would think that that will make it brittle to missing
byte-stream elements.

Yes. The RPC analyzer has some heuristics to re-sync to a TCP stream
with gaps. It definitely works at the start of the trace. (My problem is
rather that DeliverStream stops being called at some stage.)

* When I look at the packet level, I see data packets all the time, but
  the analyzer's DeliverStream stops being called about halfway
  through the trace.

If it's a huge trace, then I'm guessing you may be encountering wrap-around with:

[snip]
Or have you already converted all of those variables to 64-bit?

I haven't converted them to 64-bit and the trace is indeed huge.
However, I thought that seq_delta() handles wrap-around correctly.
(Also, I think that the problem occurs after the first 4GB of data.) Or
can there be a problem if a hole spans the seq wrap-around?
I'll have a look at the reassembler code.
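
For reference, one way a 64-bit conversion could look is to unwrap each 32-bit sequence number against a 64-bit reference position. This is only a sketch: unwrap_seq() is a hypothetical helper, not something in the reassembler, and it still assumes the true position lies within 2^31 bytes of the reference.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map a 32-bit TCP sequence number onto a 64-bit
// stream position, choosing the position closest to `reference`.
inline uint64_t unwrap_seq(uint32_t seq, uint64_t reference)
	{
	// Signed distance from the low 32 bits of the reference to seq.
	int32_t delta =
		static_cast<int32_t>(seq - static_cast<uint32_t>(reference));

	// Going backwards must not run below position zero.
	assert(delta >= 0 ||
	       reference >= static_cast<uint64_t>(-static_cast<int64_t>(delta)));

	return reference + static_cast<int64_t>(delta);
	}
```

Here a reference position of 0xFFFFFF00 and a wire sequence number of 0x00000100 map to position 0x100000100, so positions keep growing across the 32-bit wrap. Comparisons on such positions would no longer depend on signed 32-bit deltas, but a hole larger than 2^31 bytes would still be ambiguous at the unwrapping step.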

* I don't get any calls to Undelivered() either (actually I get some
  at the very end of the trace, but the delivery stops way, way earlier).

This sounds like skip_deliveries is getting set. This can happen due
to the following logic:

[snip]

So it could be that you have enough loss at some point that you're tripping
over one of these thresholds.

I was thinking that too; however, I don't see any Weird message from the
Reassembler that would hint at either of those (but I think that there are
other places where skip_deliveries gets set without a Weird).

* I *don't* get content_gap and ack above hole messages,
  because the connection doesn't have a handshake. Can I force that
  somehow? (So that I can debug where the gaps happen.)

Changing this would require source-code edits, because it's wired in as:

  if ( content_gap &&
       endpoint->state == TCP_ENDPOINT_ESTABLISHED &&
       peer->state == TCP_ENDPOINT_ESTABLISHED )

(there's a comment before this explaining the false positives that can
result for non-established connections, though I think it's reasonable to
give users an option to accept these possibilities)

I'll make this policy-configurable then.
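
A minimal sketch of what such a switch might look like, kept separate from the real analyzer code. The option name report_gaps_for_partial, the helper should_report_gap() and the reduced state enum are all assumptions made for illustration.

```cpp
// Reduced stand-in for the endpoint states; only the two that matter
// for this sketch are meaningful.
enum EndpointState
	{
	TCP_ENDPOINT_INACTIVE,
	TCP_ENDPOINT_PARTIAL,
	TCP_ENDPOINT_ESTABLISHED
	};

// Hypothetical helper: decides whether a content gap should be reported.
// have_handler stands in for the `content_gap` handler check, and
// report_gaps_for_partial for the proposed policy-level option.
inline bool should_report_gap(bool have_handler, EndpointState endpoint,
                              EndpointState peer,
                              bool report_gaps_for_partial)
	{
	if ( ! have_handler )
		return false;

	bool both_established = endpoint == TCP_ENDPOINT_ESTABLISHED &&
	                        peer == TCP_ENDPOINT_ESTABLISHED;

	// Default behavior is unchanged; the option only widens the gate.
	return both_established || report_gaps_for_partial;
	}
```

With the option off, this matches the existing condition, so fully established connections behave as before. Turning it on accepts the false positives mentioned in the source comment in exchange for seeing gaps on midstream connections.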

I'll investigate this some more: see how many gaps I have, whether
skip_deliveries gets set, etc.

thanks,
Gregor