dns.bro

As discussed, I'll start going through the scripts in the
policy-scripts-new branch, starting with dns.bro.

I really like the new dns.log, pretty neat!

Two general, DNS-independent, questions first:

    - Does a policy/foo.bro script always load all of
    policy/foo/*.bro? Would be nice if that was consistent, and
    perhaps it already is. :slight_smile:

    - We should include new connection$uid into pretty much all
   relevant logs.

dns/base.bro:

    - There are number of commented out "print" statements. Should we
      pass this into weird.bro?

    - The script activates the binpac analyzer. Do we want to remove
      "classic" C++ analyzer?

    - There's a TODO about the EDNS/TSIG. What's the problem?

    - The reply handlers check for "ans$answer_type == DNS_ANS", but
      there are also options dns_skip_all_auth/dns_skip_all_addl in
      bro.init? Can we get rid of one of the two ways (I'd say the
      latter)?

    - The reply handlers are all almost identical. How about
      refactoring that code into a function called by them all?

    - The comment in connection_state_remove() seems misleading: this
      is the only place that logs anything, right?

    Watislat_ctveusd or
  - om o te yps n ontsbr dn' semto Do we
      want to delete he?

dns/passive-replication.bro

    - Can you remind me what the passive replication is for? I thought
    I knew but not sure that's matching with the script. :slight_smile:

    - Regarding the TODO: should "recent_requests" be a table[string]
    of set[string]?

Robin

I have no idea what happened here. :slight_smile: That's supposed to be:

    - What is last_active used for?

    - Some of the types in consts.bro don't seem to be used. Do we
      want to delete them?

Robin

   - Does a policy/foo.bro script always load all of
   policy/foo/*.bro? Would be nice if that was consistent, and
   perhaps it already is. :slight_smile:

It loads a subset that I believe are a mix between general utility and performance. More consistency would nice, but I'm not exactly sure how to get it. Any ideas?

   - We should include new connection$uid into pretty much all
  relevant logs.

I'll start doing that.

   - There are number of commented out "print" statements. Should we
     pass this into weird.bro?

Yes, I'll do that. I forgot to go back and do something with it.

   - The script activates the binpac analyzer. Do we want to remove
     "classic" C++ analyzer?

The DNS analyzer would only be used if the --use-binpac flag is given to the bro binary. A long time ago I had segfaulting trouble with the DNS binpac analyzer though (which scares me a bit). We need to run it on live traffic for a while to see how it handles itself. By default the non-binpac analyzer will still be used so we should be good for the moment.

   - There's a TODO about the EDNS/TSIG. What's the problem?

I've never worked with those messages (and don't really know much about them). I keep meaning to look into how to handle them but I haven't gotten to it yet.

   - The reply handlers check for "ans$answer_type == DNS_ANS", but
     there are also options dns_skip_all_auth/dns_skip_all_addl in
     bro.init? Can we get rid of one of the two ways (I'd say the
     latter)?

Do you think it could cause any performance problems due to the frequently high volume of DNS traffic? I suppose that's why the variables were implemented in the first place. I do like it better with the check in the handler though. We could always pass everything through and see if anyone is having trouble with performance.

I actually just tested this a bit and my trace file a lot of DNS traffic didn't really take any longer to analyze with the auth and addl responses being passed through or not. I propose we remove all options from the analyzer related to not triggering on auth/addl RRs and handle everything at the scripting layer (it's less of a surprise that way at least).

I added a script named dns/auth-addl that adds fields to store those responses in the logs, but it doesn't currently work because there seem to be a bug with the &priority attribute on script layer defined and called events. I can't influence the order that the events are handled. I left a note in the export section of dns/base.bro that comments on what may be a part of the problem since if I define the event prototype (like in events.bif.bro), I can't handle the at all.

   - The reply handlers are all almost identical. How about
     refactoring that code into a function called by them all?

Done.

   - The comment in connection_state_remove() seems misleading: this
     is the only place that logs anything, right?

Oops, it's not supposed to be misleading. I forgot to implement the logging action once all responses have been received in addition to the final logging at state removal time.

   - What is last_active used for?

Good question. It's removed now. I was thinking it would nice to know when the last response was received, but I wasn't even doing anything with it.

   - Some of the types in consts.bro don't seem to be used. Do we
     want to delete them?

Let's leave them, most of them are being used now. :slight_smile:

   - Can you remind me what the passive replication is for? I thought
   I knew but not sure that's matching with the script. :slight_smile:

It's to log responses received for queries (so you can see what a query resolved to at that specific time). I'm not completely happy with it yet, but if you run it you can see that it usually logs many fewer lines and the total log file size is greatly reduced as well. I don't know if it's even useful enough to include in the base distribution but I liked it because it's a good example of using the logging framework to build a new log.

   - Regarding the TODO: should "recent_requests" be a table[string]
   of set[string]?

Probably, but I didn't want to suck up too much memory for that script. That's yet another one of those hairy decisions that might not work in some high volume situations.

Thanks!
  .Seth

It loads a subset that I believe are a mix between general utility and
performance. More consistency would nice, but I'm not exactly sure
how to get it. Any ideas?

Not really ... But the problem I see is that it's hard to figure out
what a script is *not* loading and what thus can make sense to add
manually.

How about at least documenting this explicityly for the top-level
script, something like:

    Further scripts that one might one to load in addition are:

        - foo/bar1.bro
            <short description what it does>

        - foo/bar2.bro
            <short description what it does>

DNS binpac analyzer though (which scares me a bit). We need to run it
on live traffic for a while to see how it handles itself.

Ok, let's postone that for the more general analyzer overhaul at some
point.

addl responses being passed through or not. I propose we remove all
options from the analyzer related to not triggering on auth/addl RRs
and handle everything at the scripting layer (it's less of a surprise
that way at least).

Agreed, filed a ticket.

responses in the logs, but it doesn't currently work because there
seem to be a bug with the &priority attribute on script layer defined
and called events.

I've filed a ticket for that as well.

It's to log responses received for queries (so you can see what a
query resolved to at that specific time). I'm not completely happy
with it yet, but if you run it you can see that it usually logs many
fewer lines and the total log file size is greatly reduced as well.

I think one thing that confused me was the low timeout of 10secs; I
was expecting something longer, perhaps an hour or so (but then
listing all replies during that interval). But not sure my mental
model on using this data is right.

Also, including a timestamp by default could be helpful?

distribution but I liked it because it's a good example of using the
logging framework to build a new log.

It definitly is!

Probably, but I didn't want to suck up too much memory for that
script. That's yet another one of those hairy decisions that might
not work in some high volume situations.

Yeah, but I'd say let's first at least try to do these things "right",
even if more expensive. If it turns out to be a problem, we can still
change it change/optimize later.

Robin

Mulling over this a bit more, I'm wondering whether we should change
tack and generally always load *everything* from the top-level file,
even if that may be too much for some environments. These can always
go ala carte and load the sub-files individually. By just loading
everything, it's much easier to get an idea what's available.

Robin

Are these new scripts going to have Seth's 'dns-ext.bro'
included/merged or is this re-write changing the base dns.bro
altogether?

Both. :slight_smile: Seth is rewriting/restructuring the base scripts and
including the functionality of his *-ext scripts in the process (not
only dns, but all).

Robin

And that's not mentioning the new scripts that aren't even in my -ext scripts. :slight_smile:

  .Seth