Log::add_filter with mime_type or filename predicate

I'm looking for the new way (in 2.2) to filter HTTP::LOG logging based
upon mime_type or filename. With the new file analysis framework, the
filename and mime_type of an HTTP connection appear to be set in
HTTP::Info in base/protocols/http/entities.bro, inside the
file_over_new_connection event. However, I now suspect that this event is
triggered only AFTER the HTTP::LOG filter predicates are processed, since
all of the new entities fields in the HTTP::Info record are
"<uninitialized>" when printed from the predicate function. Here is a
possibly helpful code snippet that goes inside bro_init(). (Excuse the
formatting, not much I can do.)

Log::add_filter(HTTP::LOG, [$name = "http-executables",
  $path = "http_exe",
  $pred(rec: HTTP::Info) =
  {
    print "file:", rec;
    return 1==1;
  },
# This line was in the predicate function, but it no longer works
# return rec?$mime_type && rec$mime_type == "application/x-dosexec"; },

  $include = set("ts", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
                 "method", "host", "uri", "referrer", "user_agent",
                 "request_body_len", "response_body_len", "status_code",
                 "info_msg", "contenttype", "filename", "mime_type")
  ]);

Thoughts?

return rec?$resp_mime_types && "application/x-dosexec" in rec$resp_mime_types;

  .Seth

resp_mime_types is also uninitialized:

file: , [ts=1380560274.291225, uid=CtYbny3SoceMiawke6, id=[orig_h=X.X.X.X,
orig_p=43457/tcp, resp_h=74.125.239.123, resp_p=80/tcp], trans_depth=1,
method=GET, host=s0.2mdn.net, uri=/viewad/910797/pixel.gif,
referrer=http://www.kbb.com/used-cars/, user_agent=Mozilla/5.0 (Windows NT
6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0, request_body_len=0,
response_body_len=0, status_code=<uninitialized>,
status_msg=<uninitialized>, info_code=<uninitialized>,
info_msg=<uninitialized>, filename=<uninitialized>, tags={

}, username=<uninitialized>, password=<uninitialized>, capture_password=F,
range_request=F, orig_fuids=<uninitialized>,
orig_mime_types=<uninitialized>, resp_fuids=<uninitialized>,
resp_mime_types=<uninitialized>, current_entity=<uninitialized>,
orig_mime_depth=1, resp_mime_depth=0, contenttype=<uninitialized>]

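Your traffic is messed up. You aren't seeing both sides of that connection: status_code, status_msg, and the resp_* fields are all unset, and response_body_len is 0, so Bro never saw the server's reply. Perhaps this is related to your questions about sniffing multiple interfaces?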

  .Seth

Yep, that was it. Good catch, thanks!

I'm still working through this and still running into issues. The return
statement you provided makes sense and is what I need, but Bro gives me
an error when I use it:

...: not an index type (application/x-dosexec in rec$resp_mime_types)

Does the 'in' operator work with a string and a vector type?

Argh! I forgot that was a vector; I was thinking it was a set. Little things like this, where I didn't consider some situation, are pretty annoying to find out after the fact.

I don't really like this solution but it should work if you put it in your predicate...

if ( rec?$resp_mime_types )
  {
  for ( i in rec$resp_mime_types )
    {
    if ( "application/x-dosexec" == rec$resp_mime_types[i] )
      return T;
    }
  }
return F;
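
Dropped into the filter from the start of the thread, that would look roughly like the following. Treat it as an untested sketch for 2.2, not a verified script. It assumes the default base HTTP scripts are loaded, and it leaves out $include, so the filter writes the default HTTP columns to http_exe.log.

```bro
event bro_init()
    {
    Log::add_filter(HTTP::LOG, [$name = "http-executables",
        $path = "http_exe",
        $pred(rec: HTTP::Info) =
            {
            # resp_mime_types is an optional field; check that it is
            # set before touching it.
            if ( rec?$resp_mime_types )
                {
                # Looping over a vector yields its indices, not its values.
                for ( i in rec$resp_mime_types )
                    {
                    if ( rec$resp_mime_types[i] == "application/x-dosexec" )
                        return T;
                    }
                }

            # No matching MIME type seen: keep the entry out of this log.
            return F;
            }]);
    }
```

The explicit loop is there only because resp_mime_types is a vector, so the 'in' test that would work on a set produces the "not an index type" error quoted above.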

  .Seth