Support for HTTP body extraction of originator

The current HTTP body extraction mechanism only allows for recording responses:

event http_entity_data(c: connection, is_orig: bool,...
  {
  # Client body extraction is not currently supported in this script.
  if ( is_orig )
    return;

Does anyone recall the reason for this? Later in the script, we have:

local suffix = fmt("%s_%d.dat", is_orig ? "orig" : "resp", \
    c$http_state$current_response);

So simply removing the is_orig check readily enables extraction of
HTTP request bodies, and also correctly tags the extraction file with
"orig" or "resp".

The current workaround is to copy the entire event handler for
http_entity_data and simply invert the above check, which is
redundant and inefficient.

Here's my suggestion: we'd introduce an enum that specifies the
direction, e.g., ORIG, RESP, BOTH. Users can then decide what they'd
like to have recorded.
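
To make the idea concrete, here is a rough sketch of what such a knob could look like. The module name, the enum, the option, and the helper are all made up for illustration and are not part of the existing script; only the http_entity_data event itself is real.

```zeek
# Hypothetical module name, used only for this sketch.
module HTTPExtractSketch;

export {
	## Hypothetical enum: which side's bodies to extract.
	type Direction: enum { ORIG, RESP, BOTH };

	## Hypothetical option; defaults to today's behavior (responses only).
	const extract_direction = RESP &redef;
}

# Returns T if a body flowing in the given direction should be extracted.
function want_body(is_orig: bool): bool
	{
	if ( extract_direction == BOTH )
		return T;

	return is_orig ? (extract_direction == ORIG)
	               : (extract_direction == RESP);
	}

event http_entity_data(c: connection, is_orig: bool, length: count, data: string)
	{
	# Replaces the unconditional "if ( is_orig ) return;" check.
	if ( ! want_body(is_orig) )
		return;

	# ... the existing extraction logic would continue here ...
	}
```

Under these assumptions, a user would opt in with something like `redef HTTPExtractSketch::extract_direction = HTTPExtractSketch::BOTH;`.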

     Matthias

> # Client body extraction is not currently supported in this script.
> if ( is_orig )
>    return;
>
> Does anyone recall the reason for this?

Oversight on my part. :)

> Here's my suggestion: we'd introduce an enum that specifies the
> direction, e.g., ORIG, RESP, BOTH. Users can then decide what they'd
> like to have recorded.

This is all being done through the file analysis framework now and is being abstracted there. The script you are having trouble with is being removed.

  .Seth

>> Here's my suggestion: we'd introduce an enum that specifies the
>> direction, e.g., ORIG, RESP, BOTH. Users can then decide what they'd
>> like to have recorded.

> This is all being done through the file analysis framework now and is being abstracted there. The script you are having trouble with is being removed.

The script isn't being removed, just changed to use the generic file analysis events instead of http_entity_data.

And the generic file events don't currently specify any direction information, so HTTP extraction will do both request and response bodies, but they can't be controlled independently. Do I need to add an 'is_orig' flag to at least the 'file_new' event?

- Jon

> Do I need to add an 'is_orig' flag to at least the 'file_new' event?

I don't know the internals of the FA framework, I just recall a record
fa_file which appears to be what the Info record is to the logging
framework. Could it make sense to put the directionality in there for
more flexibility? Then users can access this information in any event.

     Matthias

>> Do I need to add an 'is_orig' flag to at least the 'file_new' event?

> I don't know the internals of the FA framework, I just recall a record
> fa_file which appears to be what the Info record is to the logging
> framework.

fa_file is more analogous to the connection record now.

> Could it make sense to put the directionality in there for
> more flexibility? Then users can access this information in any event.

Yeah, that might be fine. Do you have an opinion, Seth? (I thought you did when we talked about the loss of directionality before.)

- Jon

I think we had discussed creating enum values to represent each location for files. For example:
HTTP::FILE_CLIENT
HTTP::FILE_SERVER
SMTP::FILE_ENTITY
FTP::FILE_ENTITY
SSL::FILE_CLIENT_CERT
SSL::FILE_SERVER_CERT

This would give the directionality while leaving the possibility for protocols to have multiple transport mechanisms.

PROTO::FILE_CLIENT_WRITE_METHOD1
PROTO::FILE_CLIENT_WRITE_METHOD2
PROTO::FILE_CLIENT_READ_METHOD2

Do you think we need to go that far or do you think that directionality alone is enough?

I'm also not completely sure how this should be conveyed. I don't think it should be an argument to file_new, since file_new is also used for files read off disk or extracted from other files (child files). Perhaps it should just be a field in the fa_file record?

  .Seth

> This would give the directionality while leaving the possibility for protocols to have multiple transport mechanisms.
>
> PROTO::FILE_CLIENT_WRITE_METHOD1
> PROTO::FILE_CLIENT_WRITE_METHOD2
> PROTO::FILE_CLIENT_READ_METHOD2
>
> Do you think we need to go that far or do you think that directionality alone is enough?

That case seems like overkill, because the mechanism and other context are typically available in c$proto, which people can inspect in the FAF events. The part that's missing is a consistent, protocol-independent way of determining the direction the file is going. As long as they have that, any other context that's available at the time of the FAF event becomes usable.

> Perhaps it should just be a field in the fa_file record?

Seems fine for now. Will add unless there are other thoughts.
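
For illustration, a handler could then use it roughly like this. This is only a sketch: it assumes a hypothetical optional field named is_orig on fa_file, which is the proposal in this thread and not something confirmed here. The field is left unset for files that have no direction, such as files read off disk or child files.

```zeek
# Assumption: fa_file has a hypothetical field "is_orig: bool &optional".
event file_new(f: fa_file)
	{
	# No direction recorded (e.g., file read off disk or a child file).
	if ( ! f?$is_orig )
		return;

	# Report which endpoint sent the file.
	print fmt("file %s sent by %s", f$id,
	          f$is_orig ? "originator" : "responder");
	}
```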

- Jon

Sure, that seems reasonable. I think I would use it too. :)

  .Seth