Bro scripts

Hi all,

I'm doing work on Bro's policy scripts for the next release and I want to find policy scripts floating around that can be shared and any helpful code snippets. Anything you can contribute would be greatly appreciated, thanks!

  .Seth

> I'm doing work on Bro's policy scripts for the next release and I want
> to find policy scripts floating around that can be shared and any
> helpful code snippets. Anything you can contribute would be greatly
> appreciated, thanks!

The whole buzz about Firesheep caused me to hack up a sidejacking
detector. I haven't tested it because I literally wrote it 5 minutes
ago.

   Matthias

Here is the code:

    @load http-request
    @load http-reply

    module HTTP;

    export
    {
        redef enum Notice += { CookieReuse };

        # Maximum number of client addresses that may present the same cookie.
        const max_cookies = 1 &redef;

        # Time after which entries expire.
        const cookie_expiration = 1 hr &redef;
    }

    # Maps each cookie value to the set of client addresses that sent it.
    global cookies: table[string] of set[addr] &write_expire = cookie_expiration;

    event http_header(c: connection, is_orig: bool, name: string, value: string)
    {
        # We are only looking for session IDs in the client cookie header.
        if (! (is_orig && name == /[cC][oO][oO][kK][iI][eE]/))
            return;

        local client = c$id$orig_h;
        if (value !in cookies)
            cookies[value] = set();

        # Always record the client, including the first one seen for a cookie.
        add cookies[value][client];

        if (|cookies[value]| <= max_cookies)
            return;

        local s = lookup_http_request_stream(c);
        NOTICE([$note=CookieReuse, $src=client,
                $msg=fmt("potential sidejacking by %s: cookie used by %d addresses",
                client, |cookies[value]|)]);
    }

That's pretty cool! I do have one suggestion, though: Instead of
tracking by IP, how about one cookie per user agent? That will help
catch the side jacking when used under a NAT.

Good point! Changing the tracking global from...

global cookies: table[string] of set[addr]
to...
global cookies: table[string] of set[addr, string]

and then storing the user-agent in the last string would take care of that.
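
A minimal sketch of that change, written in Python rather than Bro so it can be run standalone. The names `cookies`, `observe`, and `MAX_CLIENTS` are illustrative assumptions, not part of the script above; the point is only that the set now holds (address, user-agent) pairs instead of bare addresses.

```python
from collections import defaultdict

# Hypothetical threshold: how many distinct (address, user-agent) pairs
# may present the same cookie before it is flagged.
MAX_CLIENTS = 1

# Maps a cookie value to the set of (address, user-agent) pairs seen with it.
cookies = defaultdict(set)


def observe(cookie, addr, user_agent):
    """Record one sighting; return True if the cookie now looks reused."""
    cookies[cookie].add((addr, user_agent))
    return len(cookies[cookie]) > MAX_CLIENTS
```

With this keying, two different browsers behind the same NAT address that present the same cookie count as two clients.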

I think your point about NAT gets to a more general point of what techniques could we use to detect NAT? I know that there are a lot of little indicators of addresses that are doing NAT, but I think it could be really worthwhile to organize them all and then write a script to implement all of them so that we can get reliable NAT detection with Bro. I can start with a few thoughts.

* Multiple web browser user-agents at a single address
    - Must match some regex for a "real" browser so that weird applications throwing junk in the user-agent don't trigger this.
    - Must occur close together in time.
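
The heuristic above could be sketched roughly as follows, in Python for ease of testing. The browser regex, the 300-second window, and all names here are assumptions chosen for illustration, not tested values.

```python
import re
from collections import defaultdict

# Assumed pattern for "real" browsers; tune it for your own traffic.
BROWSER_RE = re.compile(r"MSIE|Firefox|Chrome|Safari")

# Assumed time window in seconds within which user-agents must co-occur.
WINDOW = 300.0

# addr -> {user_agent: timestamp of the most recent sighting}
seen = defaultdict(dict)


def looks_like_nat(addr, user_agent, now):
    """Return True if addr showed more than one browser UA within WINDOW."""
    if not BROWSER_RE.search(user_agent):
        return False  # ignore applications that put junk in the user-agent
    agents = seen[addr]
    agents[user_agent] = now
    # Forget user-agents that have not been seen within the window.
    for stale in [ua for ua, ts in agents.items() if now - ts > WINDOW]:
        del agents[stale]
    return len(agents) > 1
```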

Over the past several years I've had a lot of ideas for detecting NATs, but they have all completely escaped me. Anyone else have thoughts to add to this?

  .Seth

I think that will definitely work for detecting NATs if you stick to
regexing the variants on the major browsers. As we've all seen, most
browser plugins have their own UA, so you're bound to get many UAs
out of a single computer naturally, but they should not all be for
Internet Explorer, for example. I think scoping to IE, FF, and Webkit
engines would be good enough to be effective.

One other point, once a NAT is detected, would it be possible to
exclude that IP from future detection to save resources? I'm a bit
concerned with memory utilization for all of these state tables.

> Instead of tracking by IP, how about one cookie per user agent?

> Good point!

Indeed.

> global cookies: table[string] of set[addr]
> to...
> global cookies: table[string] of set[addr, string]

That will almost do it, except that I now need to write a handler for
http_all_headers instead of http_header to obviate the need for some
global glue code.

Furthermore, the Cookie header often bundles a bunch of cookie key-value
pairs of which only a few define the actual user session. The others can
vary and thus cause false negatives. Firesheep fortunately ships with a
bunch of handlers for major sites which I will use as a baseline to
define user session for specific sites, i.e.,

    # Distills relevant cookies that define a user session.
    type user_session: record
    {
        url: pattern; # URL
        cookies: pattern; # Cookie keys that define the user session.
    };

    const session_info: table[string] of user_session =
    {
        ["Amazon"] = [$url=/amazon.com/, $cookies=/x-main/],
        ["Dropbox"] = [$url=/dropbox.com/, $cookies=/lid/],
        ["Facebook"] = [$url=/facebook.com/, $cookies=/xs|c_user|sid/],
        ["Flickr"] = [$url=/flickr.com/, $cookies=/cookie_session/],
        ["Google"] = [$url=/google.com/, $cookies=/NID|SID|HSID|PREF/],
        ["NY Times"] = [$url=/nytimes.com/, $cookies=/NYT-s|nyt-d/],
        ["Twitter"] = [$url=/twitter.com/, $cookies=/_twitter_sess/],
        ["Yelp"] = [$url=/yelp.com/, $cookies=/__utma/],
        ["Windows Live"] = [$url=/live.com/,
                            $cookies=/MSP(Prof|Auth)|RPSTAuth|NAP/],
        ["Wordpress"] = [$url=/wordpress.com/,
                            $cookies=/wordpress_[0-9a-fA-F]+/]
    } &redef;

What remains to do is to split the Cookie string into its key-value pairs
and then match the keys against user_session$cookies. Instead of a regular
expression, I'd prefer to have a set[string], but this cannot be
statically defined in a record, i.e.,

    ["Facebook"] = [$url=/facebook.com/, $cookies={"xs", "c_user", "sid"}],
                                                  ^^^^^^^^^^^^^^^^^^^^^^^
appears not to be correct Bro syntax, because I think variable-size
types inside records cannot be initialized statically. Is that correct?
If so, I'd probably change to a simple table[string] of set[string] to
represent user sessions.

In any case, the downside is that this would only detect sidejacking for
known sites. I think it would make sense to do the following. If a
profile for a user_session for a particular site (as defined above)
exists, use it, and otherwise use the entire cookie value.
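
A rough sketch of that plan in Python: split the Cookie header into key-value pairs, keep only the keys that a site profile marks as session-defining, and fall back to the whole header when no profile applies. `SESSION_INFO` is a two-entry subset of the table above using sets of names instead of patterns; `parse_cookie` and `session_key` are hypothetical helpers, not functions from the actual script.

```python
import re

# Hypothetical subset of the session_info table above:
# site name -> (URL pattern, cookie names that define the user session).
SESSION_INFO = {
    "Facebook": (re.compile(r"facebook\.com"), {"xs", "c_user", "sid"}),
    "Twitter": (re.compile(r"twitter\.com"), {"_twitter_sess"}),
}


def parse_cookie(header):
    """Split a Cookie header value into a dict of key -> value."""
    pairs = {}
    for part in header.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # skip fragments without an '='
            pairs[key] = val
    return pairs


def session_key(host, header):
    """Return the part of the cookie that identifies the user session."""
    pairs = parse_cookie(header)
    for url_re, names in SESSION_INFO.values():
        if url_re.search(host):
            relevant = sorted((k, v) for k, v in pairs.items() if k in names)
            if relevant:
                return "; ".join(f"{k}={v}" for k, v in relevant)
    # No profile for this site: use the entire cookie value.
    return header
```

Sorting the selected pairs makes the key independent of the order in which the browser happens to send its cookies.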

> I think your point about NAT gets to a more general point of what
> techniques could we use to detect NAT?

This is truly an important issue to tackle. I wonder if it is possible
to have better abstractions in Bro to support user-based analysis. For
example, it would be neat to augment several events with a "user"
argument which is essentially a record filled by many other events. In
HTTP for example, some code would parse the User-Agent and fill this
record, so that the script writer could simply refer to user$os or
user$browser.

   Matthias

Using user-agents for this is tricky. I've written some code to analyze
the output of your http-user-agents.log in Splunk, and found that the
best thing to look at is the architecture and OS, ignoring the
browser itself.

The script I use is here:

http://github.com/JustinAzoff/splunk-scripts/blob/master/ua2os.py

It's for use in Splunk, but it's 90% regexes, stuff like this:

os_mapping = (
    ('Windows .. 5.1', 'Windows XP'),
    ('Windows .. 5.2', 'Windows XP'),
    ('Windows NT 6.0', 'Windows Vista'),
    ('Windows 6.0', 'Windows Server 2008'),
    ('Windows NT 6.1', 'Windows 7'),
    ('OS X 10.5', 'MAC OS X 10.5.x'),
    ('Darwin', 'MAC OS X other'),
    ...
    ('Android', 'Android'),
    ('Linux ', 'Linux'),
    ('Windows', 'Windows - Other'),
    ('iPad', 'ipad'),
    ('iPod', 'ipod'),
    ('iPhone', 'iphone'),
)

arch_mapping = (
    ('Windows .. 5.2', 'x64'),
    ('x64', 'x64'),
    ...
    ('iPad', 'ipad'),
    ('iPod', 'ipod'),
    ('iPhone', 'iphone'),
    ('Intel', 'Intel'),
)

It is not uncommon to have one machine using multiple browsers, but rare
for it to identify as both Vista and Windows 7, or both i386 and x64, or
Windows XP and Mac OS X 10.5.
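
As a rough illustration of that check, the sketch below maps each user-agent to an OS with a few of the regexes listed above and flags an address once it has identified as more than one OS. The first-match-wins loop and the names `ua_to_os`, `os_by_addr`, and `observe_os` are assumptions made for this example; the linked script may work differently.

```python
import re
from collections import defaultdict

# A few entries copied from os_mapping above; the first match wins,
# so the generic 'Windows' catch-all has to come last.
OS_MAPPING = (
    ("Windows .. 5.1", "Windows XP"),
    ("Windows NT 6.0", "Windows Vista"),
    ("Windows NT 6.1", "Windows 7"),
    ("OS X 10.5", "MAC OS X 10.5.x"),
    ("Linux ", "Linux"),
    ("Windows", "Windows - Other"),
)


def ua_to_os(user_agent):
    """Return the OS label for a user-agent, or None if nothing matches."""
    for pattern, name in OS_MAPPING:
        if re.search(pattern, user_agent):
            return name
    return None


# addr -> set of OS labels seen from that address
os_by_addr = defaultdict(set)


def observe_os(addr, user_agent):
    """Record the OS for addr; return True if addr has shown several OSes."""
    os_name = ua_to_os(user_agent)
    if os_name is not None:
        os_by_addr[addr].add(os_name)
    return len(os_by_addr[addr]) > 1
```

Two browsers on the same Windows 7 machine map to the same label and do not trigger; a Windows 7 and a Mac OS X user-agent from one address do.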

Also, some user-agents can immediately identify NAT: iOS and Android
devices do not have Ethernet interfaces, so if one of these devices is
found on a non-wireless subnet it indicates the presence of a rogue access
point.
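
A small sketch of that check, assuming you know which subnets are wireless. The network in `WIRELESS_NETS` is a made-up placeholder, and `rogue_ap_suspect` is a hypothetical name.

```python
import ipaddress
import re

# Placeholder: replace with the site's actual wireless subnets.
WIRELESS_NETS = [ipaddress.ip_network("10.20.0.0/16")]

# User-agent substrings for devices assumed to have no wired interface.
MOBILE_RE = re.compile(r"iPad|iPod|iPhone|Android")


def rogue_ap_suspect(addr, user_agent):
    """Return True if a mobile user-agent shows up outside wireless subnets."""
    if not MOBILE_RE.search(user_agent):
        return False
    ip = ipaddress.ip_address(addr)
    return not any(ip in net for net in WIRELESS_NETS)
```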

Thanks for sharing that. Obviously in a corporate environment (or any
in which desktops are managed) most user agents will appear the same
because they are all running the same browser version. However, I
have seen that for guest wireless and other public access points, the
number of plugins, .NET versions, etc. makes the UAs fairly unique,
so off the bat your mileage will vary depending on the client class.
Using the detected OS would certainly be more accurate, but the
chances of an attacker having the same OS as the victim are pretty
good, so you'll obviously have to deal with a lot of false negatives.
Maybe concatenating the p0f signature with the user agent is the best
way to get a pseudo machine ID.
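
One way to picture that pseudo machine ID, as a sketch only: hash the passive OS fingerprint together with the user-agent. The function name `machine_id` and the fingerprint strings used with it are placeholders; how the p0f signature is obtained is left out.

```python
import hashlib


def machine_id(p0f_signature, user_agent):
    """Combine an OS fingerprint and a user-agent into one stable ID."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    data = p0f_signature + "\x00" + user_agent
    return hashlib.sha1(data.encode("utf-8")).hexdigest()
```

Two sightings of the same cookie with different IDs would then suggest different machines, even behind one address.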

> I think it would make sense to do the following. If a profile for a
> user_session for a particular site (as defined above) exists, use it,
> and otherwise use the entire cookie value.

Attached is the full version of the sidejacking detector that includes
all the Firesheep handlers. I tested it for Twitter, Amazon, and Google.
The script successfully reports alarms when I hijack my own connections
with Firesheep.

   Matthias

sidejack.bro (4.51 KB)

> expression, I'd preferably have a set[string], but this cannot be
> statically defined in a record, i.e.,
>
>     ["Facebook"] = [$url=/facebook.com/, $cookies={"xs", "c_user", "sid"}],
>                                                   ^^^^^^^^^^^^^^^^^^^^^^^
> appears not to be correct Bro syntax, because I think variable-size
> types inside records cannot be initialized statically. Is that correct?

You can construct sets using

  .... $cookies=set("xs", "c_user", "sid")

for example.

    Vern

Hi,

I've played around with NAT detection based on user-agent strings and IP
TTL.
See http://www.icir.org/gregor/papers/gregor-phd.pdf, Chapter 4

cu
gregor

>   .... $cookies=set("xs", "c_user", "sid")

That works, thanks.

I attached a revamped version of the sidejacking detector which should
exhibit fewer false positives now (and has slightly improved logging).

   Matthias

sidejack.bro (5.65 KB)