HTTP Object length calculation

Hello,

I am trying to use Bro 1.5…1 to calculate the HTTP object length from a test packet trace. I have observed that in several HTTP transactions the calculated object length (stat$body_length) is higher than the “Content-Length” (msg$content_length) r

For example:

GET /tools/services?XXX (200 “OK” ["1945 ", 11182])

I have isolated an example TCP connection and measured the bytes using Wireshark. The real object length is equal to the “Content-Length”, but the length reported by Bro is much higher. Therefore, I cannot understand what the value stat$body_length represents.

Any help would be highly appreciated.

Thank you,
Yannis

stat$body_length *should* be the actual counted number of bytes that were in the body. If you see a disparity between the two numbers, the web server could be reporting an incorrect length for the data it's sending. Could you send the trace file privately?

  .Seth

Actually, that's not exactly the case. Bro reports the body length *after decompression* (for transfer-encodings that use compression).
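
To illustrate why the two numbers can differ, here is a minimal Python sketch. It is not Bro code; the body contents and variable names are made up for the example. It only assumes that a gzip-encoded response carries the compressed size in Content-Length, while a counter that runs after decompression sees the original size.

```python
import gzip

# Hypothetical response body; repetitive markup compresses well.
body = b"<li>service entry</li>\n" * 500

# The bytes a server using gzip encoding would put on the wire.
wire_bytes = gzip.compress(body)

# Size of the encoded body, i.e. what Content-Length would describe.
content_length = len(wire_bytes)

# Size counted after decompression, analogous to a post-decoding body length.
decoded_length = len(gzip.decompress(wire_bytes))

print("on the wire:", content_length, "after decompression:", decoded_length)
```

For compressible content the second number is the larger one, which matches the direction of the mismatch in the example above.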

In addition, the Content-Length header is often unreliable. E.g., if an HTTP transfer is interrupted, fewer bytes are transferred than reported by Content-Length. This can happen often with (misconfigured) download managers. Or the Content-Length header can simply be wrong (the HTTP server sends garbage). We did a study with residential traffic and found that the Content-Length header will on average over-report the volume by a factor of about 5 (with some spikes reaching several 100(!)).

cu
Gregor

Thank you both for getting back to me.

I understand your point regarding the content-length.

However, it looks to me that there is a contradiction in the calculation.

When the host receives the full object and it is encoded, bro reports the size of the object in the server (before compression), not the actual bytes in the network (after compression).
When the object is partially downloaded (object_length<content_length) and assuming it was encoded, bro reports the actual bytes in the network (since partial decompression cannot be performed).

Am I missing something?
Is there a way to get the actual bytes transferred?

thanks,
Yannis

> When the host receives the full object and it is encoded, bro reports
> the size of the object in the server (before compression), not the
> actual bytes in the network (after compression).

Correct.

> When the object is partially downloaded (object_length<content_length)
> and assuming it was encoded, bro reports the actual bytes in the network
> (since partial decompression cannot be performed).

No, decompression for these schemes can be done on the fly. I.e., decompression can start before the whole object is downloaded. That's what Bro is doing. So if an encoded transfer is interrupted, Bro will report the number of bytes it has decompressed so far.
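
As a rough illustration of on-the-fly decompression (a Python sketch using the standard zlib module, not Bro's actual implementation; the body, the 64-byte chunk size, and the cut-off point are arbitrary assumptions), a truncated gzip stream can still be decoded incrementally up to the point where the data stops:

```python
import gzip
import zlib

# Hypothetical body: numbered rows, so the compressed stream is not tiny.
body = b"".join(b"row %d value %d\n" % (i, i * i) for i in range(2000))
wire_bytes = gzip.compress(body)

# Simulate an interrupted transfer: only the first half of the wire bytes arrive.
received = wire_bytes[: len(wire_bytes) // 2]

# MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
decoder = zlib.decompressobj(zlib.MAX_WBITS | 16)

decoded = b""
for i in range(0, len(received), 64):
    # Feed the stream in small chunks, as packets would deliver it.
    decoded += decoder.decompress(received[i:i + 64])

print("received:", len(received), "decoded so far:", len(decoded), "full body:", len(body))
```

The decoded byte count here is a prefix of the original object, not the number of bytes seen on the wire, so an interrupted encoded transfer still yields a post-decompression figure.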

> Is there a way to get the actual bytes transferred?

Unfortunately not.

cu
gregor