Clarification needed

Hello,

I have a question regarding data types. I would like to create a dynamic set of domain names in one script and perform a DNS lookup on each of those domains in another script.

The problem is that Bro automatically tries to perform a DNS lookup on any domain name provided. Using a single domain name works well (global restricted_domains = abc.com), but when I try to assign a group of domains at once (global restricted_domains = { abc.com, 123.net }; or global restricted_domains: set[addr] followed by an add statement), I get a “Type Clash” error. I would like to know if there is a way to create a set of hostnames so that I can work on them later.

Since domain names are essentially strings, I think it would be nice to have an explicit conversion function to convert from strings to domain names. If there was one, the above problem would have been solved easily.

  • Pelo

The problem is that bro automatically tries to perform dns lookup on any domain names provided. Using a single domain name works well (global restricted_domains = abc.com) but when i try to assign a group of domains at a time (global restriced_domains = { abc.com, 123.net }; or global restricted_domains: set[addr] and using add statement), I get an error which states "Type Clash". I would like to know if there is a way to create a set of hostnames so that I can work on them later.

If you’re purely using unquoted domain names, you can think of them as being automatically converted into a set[addr] at parse time. E.g.

global mydomains: set[addr] = { bro.org, google.com };
for ( i in example.com ) add mydomains[i];
print mydomains;

Note, the loop over example.com is because it’s technically a set[addr] and you can only add a single element to the mydomains set at a time (at least I can’t recall an easier way to merge two sets).

Since domain names are essentially strings, I think it would be nice to have an explicit conversion function to convert from strings to domain names. If there was one, the above problem would have been solved easily.

There’s not really a distinct type for domain names: if the parser sees a string of characters that looks like a domain name and it’s not in quotes, Bro will resolve it into a set[addr] as part of the initialization process. For run-time resolution of domain names, storing the domain name as a string (e.g. by putting quotes around it) and then passing that as an argument to the “lookup_hostname” function may be what you want.

- Jon

The problem is that bro automatically tries to perform dns lookup on any domain names provided.

I don't tend to use that feature of Bro because I never have a problem that fits it quite right.

Using a single domain name works well (global restricted_domains = abc.com) but when i try to assign a group of domains at a time (global restriced_domains = { abc.com, 123.net }; or global restricted_domains: set[addr] and using add statement), I get an error which states "Type Clash". I would like to know if there is a way to create a set of hostnames so that I can work on them later.

I would have to see your code to know exactly what was failing.

Since domain names are essentially strings, I think it would be nice to have an explicit conversion function to convert from strings to domain names. If there was one, the above problem would have been solved easily.

https://www.bro.org/sphinx/scripts/base/bif/bro.bif.bro.html#id-lookup_hostname
https://www.bro.org/sphinx/scripts/base/bif/bro.bif.bro.html#id-lookup_hostname_txt
https://www.bro.org/sphinx/scripts/base/bif/bro.bif.bro.html#id-lookup_addr

You can see an example script that uses one of these functions here:
  https://github.com/bro/bro/blob/master/scripts/policy/protocols/ssh/interesting-hostnames.bro#L34

(you have to use them in when statements)

  .Seth

Thanks for the links. Below is some sample code. Error messages are also included in comments.

event bro_init() {

  ### This Works fine
  local amazon_ips = amazon.com;
  for (i in amazon_ips) print(i);

  ### Error occurs here
  ### Error Output
  ### ============
  ### error : type clash (addr and {74.125.236.213,2404:6800:4007:803::1015})
  ### error : type mismatch ({74.125.236.213,2404:6800:4007:803::1015} and addr)

  local google_ips: set[addr] = { mail.google.com, maps.google.com, youtube.com };
  for (i in google_ips) print(i);

  ### No errors and output here
  ### Anything wrong with the code???

  local ip_list: set[addr];
  local domain_list: set[string] = { "google.com", "bro.org" };

  for (domain in domain_list){
    when( local temp = lookup_hostname(domain) ){
      for (ip in temp)
        add ip_list[ip];
    }
  }
  for (i in ip_list) print(i);
}

  ### Error occurs here
  ### Error Output
  ### ============
  ### error : type clash (addr and {74.125.236.213,2404:6800:4007:803::1015})
  ### error : type mismatch ({74.125.236.213,2404:6800:4007:803::1015} and addr)
  
  local google_ips: set[addr] = { mail.google.com, maps.google.com, youtube.com };
  for (i in google_ips) print(i);

Ugh, I suspect this has something to do with using the "{ }" constructor syntax somewhere that it shouldn't be used. I.e., you've encountered a wart.

  ### No errors and output here
  ### Anything wrong with the code???

You have an issue where you are trying to synchronously access data from asynchronous operations. :)

When statements return immediately and the body only executes after the condition becomes true. You are printing before you've actually gotten a response from the DNS server. Let me try restructuring your code a bit...

  Try Zeek

Does that explain it a bit better?

  .Seth

  ### Error occurs here
  ### Error Output
  ### ============
  ### error : type clash (addr and {74.125.236.213,2404:6800:4007:803::1015})
  ### error : type mismatch ({74.125.236.213,2404:6800:4007:803::1015} and addr)
  
  local google_ips: set[addr] = { mail.google.com, maps.google.com, youtube.com };
  for (i in google_ips) print(i);

Moving the declaration up into a global makes it work for me.

  ### No errors and output here
  ### Anything wrong with the code???

  local ip_list: set[addr];
  local domain_list: set[string] = { "google.com", "bro.org" };
  
  for (domain in domain_list){
    when( local temp = lookup_hostname(domain) ){
      for (ip in temp)
        add ip_list[ip];
    }
  }
  for (i in ip_list) print(i);
}

In the absence of input sources (e.g. reading live traffic), Bro may terminate before the lookup returns. The way to tell it not to do that is to redef “exit_only_after_terminate”. You also have to do the printing within the body of the when statement, as that’s when the results are actually available. For example see:

http://try.bro.org/#/trybro/saved/46c2c025-6462-41e2-a581-14a9c3eba656
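
For readers who cannot open the link, here is a minimal sketch of the approach described above. It is not the code behind the link. It assumes the Bro 2.x-era when syntax used in this thread (newer Zeek releases differ, e.g. in how a when body captures local variables such as the loop variable), and the outstanding counter is a hypothetical helper added only so the script can call terminate() once every lookup has answered.

```zeek
# Keep Bro running after bro_init() so the asynchronous lookups can finish.
redef exit_only_after_terminate = T;

global domain_list: set[string] = { "google.com", "bro.org" };

# Hypothetical helper: number of lookups that have not answered yet.
global outstanding: count = 0;

event bro_init()
    {
    outstanding = |domain_list|;

    for ( domain in domain_list )
        {
        # The when statement returns immediately; its body runs later,
        # once lookup_hostname() has a result for this domain.
        when ( local ips = lookup_hostname(domain) )
            {
            # Results are only available here, so print here.
            for ( ip in ips )
                print fmt("%s resolved to %s", domain, ip);

            # Shut down explicitly after the last lookup has returned.
            --outstanding;
            if ( outstanding == 0 )
                terminate();
            }
        }
    }
```

A lookup that never answers would leave this sketch running, so a real script would probably also want a timeout block on the when statement.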

- Jon

Regarding the schedule statement used in the code, I see that execution is halted until the specified time expires. Since Bro executes all the event handlers in FIFO order, if by mistake I wrote a schedule statement with a time interval of, say, 10 sec, would this block the execution of all the event handlers in the queue, thereby delaying the whole process?

Regarding “exit_only_after_terminate”, does the execution repeat a specific loop until it satisfies the condition? As per the logic, it should be repeating the loop from the statement right after the when statement. Could you clarify this for me?

Neither timers (schedule blocks) nor triggers block execution. Instead, when bro sees a timer / trigger, it just makes a note of it and moves on to the next line of code it sees. In the case of the timer described below, bro would keep doing other stuff for 1 second before eventually coming back to execute the code in the { }. In the case of the typo, bro would keep doing other stuff for 10 seconds before eventually coming back to execute the code in the { }.
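
To illustrate the non-blocking behaviour of a timer, here is a minimal sketch. The event name delayed_report is hypothetical and exists only for this example. The redef of exit_only_after_terminate is an assumption about how the script is run: without traffic, Bro needs it to stay alive until the timer fires.

```zeek
# Keep Bro alive when there is no traffic so the timer gets a chance to fire.
redef exit_only_after_terminate = T;

# Hypothetical event used only for this illustration.
event delayed_report()
    {
    print "timer fired";
    terminate();
    }

event bro_init()
    {
    # Registers a timer and returns immediately; nothing blocks here.
    schedule 10sec { delayed_report() };

    # This statement runs right away, without waiting for the timer.
    print "timer registered, carrying on";
    }
```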

Triggers operate in a similar fashion to timers, except that the conditions for *every* trigger are evaluated at least once / every packet bro observes. In general, this means that *every registered trigger* is going to add per-packet overhead, so there's a pretty good argument to be made that relatively few triggers should be active at once.

Also, as far as I know, exit_only_after_terminate is a global flag that will simply request that bro wait to exit until there's an explicit request to do so [1]. It shouldn't really have any impact on bro's execution otherwise: it's only there to allow operations with longer execution times to complete before bro actually exits.

As a note, there are actually relatively few blocking calls supported by bro just because blocking script execution for any reason is going to eat through queue space *incredibly* quickly (and likely lead to burst losses).

HTH,
Gilbert

[1] http://comments.gmane.org/gmane.comp.security.detection.bro/5998

In hindsight, that triggers paragraph was poorly written. Let me try again:

Triggers operate in a similar fashion to timers, except that the conditions for *every* trigger are evaluated at least once / every packet bro observes. Note, however, that bro is actually pretty intelligent about the way it evaluates the condition for a trigger. Bro actually keeps track of which values a when() depends on, and will actually only *re-execute* the code in the when() if one of these values has been changed. Thus, the actual total overhead per trigger per packet should actually work out to be a function of *not only* how many active triggers there are, *but also* what exactly those triggers have defined in their when().

More experienced bro folks can feel free to correct / refine the above if desired :)

Regardless, I'm afraid the last e-mail I sent may have come across more strongly than I intended. The point I was trying to make there wasn't that "defining triggers is terribly expensive and no one should ever do it", but instead that the cost of maintaining a trigger could be relatively more expensive than maintaining a timer, and that their use should therefore be considered more carefully.

Cheers,
Gilbert

I went and double-checked what I wrote about triggers (again), and found that they don't seem to work quite how I thought they did. There seems to be a lot *less* overhead than I was expecting there to be [1].

So, rather than trying to correct myself again and possibly putting more FUD on the list: would someone (e.g. Robin, Seth) mind offering the Right Explanation (tm) here? It looks like I don't understand triggers quite as well as I originally thought I did, so time to retract and punt the question!

Cheers,
Gilbert

[1] Script I used to test the overhead introduced by a set of triggers

redef exit_only_after_terminate = T;

module Counters;

global conn_count:int;
global exec_count:int;
global eval_count:int;
global pending_count:int;
global shared_var:int;
global target_count:int;

event bro_init() {
    Counters::conn_count = 0;
    Counters::exec_count = 0;
    Counters::eval_count = 0;
    Counters::pending_count = 0;
    Counters::shared_var = 0;
    Counters::target_count = 0;
}

function evalfunction() :bool {
    eval_count = eval_count + 1;
    return shared_var > target_count;
}

event new_packet(c: connection, p: pkt_hdr) {
    pending_count = pending_count + Instrumentation::GetPendingTriggerCount();
}

event connection_established(c: connection) {
    conn_count = conn_count + 1;
    shared_var = shared_var + 1;
    when ( evalfunction() ) {
        exec_count = exec_count + 1;
        target_count += 100;
        print("New target count:");
        print target_count;
    }
}

event bro_done() {
    print("Counters:");
    print("Executed");
    print(exec_count);
    print("Number of connections");
    print(conn_count);
    print("Number of evaluations");
    print(eval_count);
    print("Total pending");
    print(pending_count);
    print("Shared value");
    print(shared_var);
    print("Next target count");
    print(target_count);
}