Unified scan.bro script

As part of the sumstats work I've been looking into, I tried refactoring scan.bro to put less load on sumstats.

The refactored script is at https://gist.github.com/JustinAzoff/fe68223da6f81319d3389c605b8dfb99

It is.. amazing! The unified code is simpler, uses less memory, puts less load on sumstats, generates nicer notice messages, and detects attackers scanning across multiple victims AND ports.

Details:

The current scan.bro maintains two sumstats streams: one keyed by attacker and port, the other keyed by attacker and victim IP.

When an attacker attempts to connect to a victim on port 22, sumstats effectively creates:

an [attacker 22] key with data containing [victim]
an [attacker victim] key with data containing [22]

It does this so it can figure out whether an attacker is scanning lots of victims on one port, or lots of ports on one victim.

When an attacker does the equivalent of 'nmap -p 22 your/16', sumstats ends up with 65536 extra [attacker victim] keys. This kills the sumstats :)

My refactored version simply creates:

an [attacker] key containing [victim/22, othervictim/22, ...]

This means that no matter how many hosts or ports an attacker scans, there will only ever be one key.
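To make the difference in key counts concrete, here is a rough Python model of the two layouts. It is not Bro code and does not use the sumstats API; the function names, the attacker address and the 10.1.0.0/16 target network are all made up for illustration, and the value set is left uncapped here.

```python
# Toy model of the two key layouts described above; not Bro script.
from collections import defaultdict
from ipaddress import ip_network


def legacy_keys(attacker, probes):
    """Current scan.bro layout: one stream per question being asked."""
    addr_scan = defaultdict(set)  # (attacker, port)   -> {victims}
    port_scan = defaultdict(set)  # (attacker, victim) -> {ports}
    for victim, port in probes:
        addr_scan[(attacker, port)].add(victim)
        port_scan[(attacker, victim)].add(port)
    return addr_scan, port_scan


def unified_keys(attacker, probes):
    """Refactored layout: a single key per attacker."""
    unified = defaultdict(set)    # attacker -> {"victim/port"}
    for victim, port in probes:
        unified[attacker].add(f"{victim}/{port}")
    return unified


# Equivalent of 'nmap -p 22 your/16' against a hypothetical /16.
attacker = "198.51.100.7"
probes = [(str(ip), 22) for ip in ip_network("10.1.0.0/16")]

addr_scan, port_scan = legacy_keys(attacker, probes)
unified = unified_keys(attacker, probes)

print(len(addr_scan), len(port_scan), len(unified))  # 1 65536 1
```

In this model the legacy layout ends up with 65536 [attacker victim] keys for a single-port sweep, while the unified layout has one key regardless of how the attacker scans.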

Additionally, since the reducer is configured as

    ... $apply=set(SumStats::UNIQUE), $unique_max=double_to_count(scan_threshold+2)

the data the key references cannot grow without bound, so a full /16 port scan can only create 1 key and scan_threshold+2 values per worker process. This is a huge reduction in the amount of data stored.
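The effect of the cap can be modelled in a few lines of Python. This is only a sketch of the idea, not the sumstats UNIQUE implementation: the BoundedUnique class is hypothetical, the exact boundary behaviour of $unique_max in Bro may differ by one, and scan_threshold = 100 is an assumed value.

```python
# Toy model of a capped unique-value set; not the actual sumstats plugin.
class BoundedUnique:
    def __init__(self, unique_max):
        self.unique_max = unique_max
        self.vals = set()

    def observe(self, val):
        # Once the cap is reached, new values are silently dropped.
        if len(self.vals) < self.unique_max:
            self.vals.add(val)

    @property
    def unique(self):
        return len(self.vals)


scan_threshold = 100  # assumed value, for illustration only
reducer = BoundedUnique(scan_threshold + 2)

# One attacker probing port 22 on every address of a hypothetical /16.
for host in range(65536):
    reducer.observe(f"10.1.{host >> 8}.{host & 255}/22")

print(reducer.unique)  # 102
```

However long the sweep runs, the model never holds more than scan_threshold+2 strings for that attacker.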

The downside of this was that the notices were effectively "attacker scanned... something!", but I realized I could analyze all the victim/port strings in unique_vals and figure out what was scanned. With that in place, bro now generates notices like this:

Scan::Scan 198.20.69.98 made 102 failed connections on 102 hosts and 77 ports in 4m59s
Scan::Scan 198.20.99.130 made 102 failed connections on 102 hosts and 78 ports in 4m59s
Scan::Scan 36.101.163.186 made 102 failed connections on port 23 in 0m14s
Scan::Scan 91.212.44.254 made 102 failed connections on ports 135, 445 in 4m59s
Scan::Scan 207.244.70.169 made 103 failed connections on port 389 in 5m0s
Scan::Scan 222.124.28.164 made 102 failed connections on port 23 in 0m14s
Scan::Scan 91.236.75.4 made 102 failed connections on ports 8080, 3128 in 4m58s
Scan::Scan 177.18.254.165 made 102 failed connections on port 23 in 0m38s
Scan::Scan 14.169.221.169 made 102 failed connections on port 23 in 0m36s
Scan::Scan 192.99.58.163 made 100 failed connections on 100 hosts and 100 ports in 4m55s
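The analysis of the victim/port strings could look roughly like the Python sketch below. It is not the code from the gist: the describe and fmt_duration helpers are hypothetical, the cut-off of 5 listed ports is an assumption, ports are sorted here for stable output, and only the message forms shown above are handled.

```python
# Sketch of turning "victim/port" strings into a notice message; not Bro script.
def describe(unique_vals, max_listed=5):
    hosts, ports = set(), set()
    for val in unique_vals:
        # Split on the last "/" so the host part is kept whole.
        host, _, port = val.rpartition("/")
        hosts.add(host)
        ports.add(int(port))

    if len(ports) == 1:
        what = f"port {next(iter(ports))}"
    elif len(ports) <= max_listed:
        what = "ports " + ", ".join(str(p) for p in sorted(ports))
    else:
        what = f"{len(hosts)} hosts and {len(ports)} ports"
    return f"made {len(unique_vals)} failed connections on {what}"


def fmt_duration(seconds):
    """Format a duration the way the notices above do, e.g. 299 -> '4m59s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"


vals = {"10.0.0.1/135", "10.0.0.2/445", "10.0.0.3/135"}
print(f"{describe(vals)} in {fmt_duration(299)}")
# made 3 failed connections on ports 135, 445 in 4m59s
```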

The only downside is that 192.99.58.163 appears to be backscatter (conn_state and history are OTH H), but that's an issue somewhere inside is_failed_conn, which is unchanged from scan.bro.

It should be a drop-in replacement for scan.bro, except that any notice policies or scan policy hooks will need to be changed.

It could possibly be changed to still raise Address_Scan/Port_Scan notices at least in some cases. I don't know how people may be using those notices differently - we handle them the same, so the change to a unified notice type is a non-issue for us.

Nice job Justin! Perhaps this raises the question of whether we should use this version in Bro? We do have a tendency to make design decisions so that Bro works the best that it can with minimal configuration for even the largest sites.

I think the notices are very reasonable and have the additional benefit of being a single notice to watch for "scanning". Having to watch for two different notices always felt a bit unnatural. I think that I personally care about scans, not the type of scan being performed (although there may be some nuance to that that someone is taking advantage of?).

  .Seth

Nice job Justin! Perhaps this raises the question of whether we should use this version in Bro? We do have a tendency to make design decisions so that Bro works the best that it can with minimal configuration for even the largest sites.

I think that is the hard part :) At a minimum, as a first step, we can make it available with 2.5 but disabled by default. If someone isn't relying on the existing behavior they can take advantage of it immediately. We can move the parts common to scan.bro and scan_unified.bro into a common script so they won't conflict. We could also make it the default in 2.5, but as long as someone keeps their old local.bro nothing will change unless they want it to.

We just need to fix the backscatter issue first :)

I think the notices are very reasonable and have the additional benefit of being a single notice to watch for "scanning". Having to watch for two different notices always felt a bit unnatural. I think that I personally care about scans, not the type of scan being performed (although there may be some nuance to that that someone is taking advantage of?).

That did occur to me.. with this new version it is hard to apply a notice policy to the resulting notice.. e.g. do one thing if they were scanning port 22, do something else if they were scanning port 3389, do something else if they port scanned a single machine.. If only I could put the set of ports and hosts scanned inside the notice somewhere..

The unified scanning detection complicates the notice generation. Before, there was 1 notice for each of 2 different behaviors; my script has 1 notice for 5 behaviors:

* Scanning 1 port on many hosts
* Scanning <= 5 ports on many hosts
* Scanning many ports on 1 host
* Scanning many ports on <= 5 hosts
* Scanning many ports on many hosts.
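As a sketch of how the five cases above could be told apart from the host and port counts alone, here is a small Python model. The classify function is hypothetical, it assumes the scan threshold has already been crossed, and the order in which the cases are checked is my assumption, not something the script specifies.

```python
# Toy classifier for the five behaviors listed above; not Bro script.
def classify(n_hosts, n_ports, few=5):
    """Assumes the scan threshold was already crossed for this attacker."""
    if n_ports == 1:
        return "1 port on many hosts"
    if n_hosts == 1:
        return "many ports on 1 host"
    if n_ports <= few:
        return "<= 5 ports on many hosts"
    if n_hosts <= few:
        return "many ports on <= 5 hosts"
    return "many ports on many hosts"


for hosts, ports in [(102, 1), (102, 2), (1, 102), (3, 100), (102, 77)]:
    print(hosts, ports, "->", classify(hosts, ports))
```

A label like this, carried alongside the notice, would let a policy branch on the behavior without parsing the message text.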

Maybe a solution is to raise different notices? Otherwise, someone needs to do nasty regex stuff inside of a notice policy to tell them apart. It would help if I knew how current Bro users were using the Scan::Address_Scan and Scan::Port_Scan notices.

A further iteration of the unified scan.bro script is now in the branch topic/jazoff/scan-unified

Use of the branch isn't required, though: since it is a self-contained change, one can just grab

https://raw.githubusercontent.com/bro/bro/31b63445ed07e2e76f98c49dd59091b1742523d1/scripts/policy/misc/scan.bro

and replace the stock scan.bro with it (or better, put it in site and change the loading from misc/scan to just ./scan.bro).

It aims to replace scan.bro, so you cannot run both at the same time. However, if you really wanted to, you could search/replace all the identifiers that conflict with scan.bro and run both.

Its visible behavior should be similar to the current scan.bro, except that there is a new Random_Scan notice:

Scan::Random_Scan 198.20.69.74 scanned at least 102 hosts on 82 ports in 4m51s

The existing notices may now report more than one port or host (up to 5); beyond that, the scan is reported as a Random_Scan. For example:

Address_Scan 91.236.75.4 scanned at least 102 unique hosts on ports 3128, 8080 in 4m47s