As part of the sumstats work I've been looking into, I tried refactoring scan.bro to put less load on sumstats.
The refactored script is at https://gist.github.com/JustinAzoff/fe68223da6f81319d3389c605b8dfb99
It is.. amazing! The unified code is simpler, uses less memory, puts less load on sumstats, generates nicer notice messages, and detects attackers scanning across multiple victims AND ports.
The current scan.bro maintains two sumstats streams, one keyed by attacker and port and one keyed by attacker and victim.
When an attacker attempts to connect to a victim on port 22, sumstats effectively creates:
an [attacker 22] key with data containing [victim]
an [attacker victim] key with data containing [22]
It does this so it can figure out if an attacker is scanning lots of victims on one port, or lots of ports on one victim.
When an attacker does the equivalent of 'nmap -p 22 your/16', sumstats ends up with 65536 extra [attacker victim] keys. This kills sumstats.
My refactored version simply creates:
an [attacker] key containing [victim/22, othervictim/22, ...]
This means that no matter how many hosts or ports an attacker scans, there will only ever be one key.
Additionally, since the reducer is configured as
... $apply=set(SumStats::UNIQUE), $unique_max=double_to_count(scan_threshold+2)
the data the key references cannot grow unbounded, so a full /16 port scan can only create 1 key and scan_threshold+2 values per worker process. This is a huge reduction in the amount of data stored.
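To make the difference concrete, here is a rough Python model of the two keying schemes. It is only a sketch and not the Bro code: the function names and the SCAN_THRESHOLD value of 100 are assumptions made up for illustration, and the cap is applied as a simple "stop adding once full" rule, which may differ at the boundary from what the sumstats unique plugin really does.

```python
SCAN_THRESHOLD = 100  # assumed value, for illustration only


def old_scheme_key_count(observations):
    """Count keys created by the two-stream layout.

    observations is an iterable of (attacker, victim, port) tuples.
    """
    by_port = {}    # [attacker port]   -> set of victims
    by_victim = {}  # [attacker victim] -> set of ports
    for attacker, victim, port in observations:
        by_port.setdefault((attacker, port), set()).add(victim)
        by_victim.setdefault((attacker, victim), set()).add(port)
    return len(by_port) + len(by_victim)


def new_scheme_table(observations, unique_max):
    """Build the single-stream layout: one key per attacker, capped values."""
    table = {}  # [attacker] -> set of "victim/port" strings
    for attacker, victim, port in observations:
        vals = table.setdefault(attacker, set())
        # Model of unique_max: stop storing new values once the cap is hit.
        if len(vals) < unique_max:
            vals.add(f"{victim}/{port}")
    return table


# Equivalent of 'nmap -p 22 your/16': one attacker, 65536 victims, one port.
scan = [("attacker", f"10.0.{i // 256}.{i % 256}", 22) for i in range(65536)]

old_keys = old_scheme_key_count(scan)
new_table = new_scheme_table(scan, SCAN_THRESHOLD + 2)
print(old_keys, len(new_table), len(new_table["attacker"]))
```

In this model the old layout ends up with one [attacker 22] key plus one [attacker victim] key per victim, while the new layout holds a single key whose value set stops growing at the cap.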
The downside of this was that the notices were effectively "attacker scanned... something!", but I realized I could analyze all the victim/port strings in unique_vals and figure out what was scanned. With that in place, bro now generates notices like this:
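The idea of recovering what was scanned from the victim/port strings can be sketched in Python as follows. This is not the logic from the gist: the describe_scan name, the MAX_LISTED_PORTS cutoff and the rule for when to list ports versus print counts are all assumptions, the ports are sorted here purely for deterministic output, and the elapsed time shown in the real notices is left out.

```python
MAX_LISTED_PORTS = 5  # hypothetical cutoff for listing ports by number


def describe_scan(unique_vals):
    """Summarise a collection of 'victim/port' strings as a notice fragment."""
    hosts = set()
    ports = set()
    for val in unique_vals:
        # Split on the last '/', so the host part may contain anything.
        host, _, port = val.rpartition("/")
        hosts.add(host)
        ports.add(int(port))

    if len(ports) == 1:
        what = f"port {next(iter(ports))}"
    elif len(ports) <= MAX_LISTED_PORTS:
        what = "ports " + ", ".join(str(p) for p in sorted(ports))
    else:
        what = f"{len(hosts)} hosts and {len(ports)} ports"
    return f"made {len(unique_vals)} failed connections on {what}"


print(describe_scan({"10.0.0.1/23", "10.0.0.2/23", "10.0.0.3/23"}))
print(describe_scan({"10.0.0.1/135", "10.0.0.2/445"}))
```

A single distinct port is named directly, a handful of ports are listed, and anything broader falls back to host and port counts.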
Scan::Scan 126.96.36.199 made 102 failed connections on 102 hosts and 77 ports in 4m59s
Scan::Scan 188.8.131.52 made 102 failed connections on 102 hosts and 78 ports in 4m59s
Scan::Scan 184.108.40.206 made 102 failed connections on port 23 in 0m14s
Scan::Scan 220.127.116.11 made 102 failed connections on ports 135, 445 in 4m59s
Scan::Scan 18.104.22.168 made 103 failed connections on port 389 in 5m0s
Scan::Scan 22.214.171.124 made 102 failed connections on port 23 in 0m14s
Scan::Scan 126.96.36.199 made 102 failed connections on ports 8080, 3128 in 4m58s
Scan::Scan 188.8.131.52 made 102 failed connections on port 23 in 0m38s
Scan::Scan 184.108.40.206 made 102 failed connections on port 23 in 0m36s
Scan::Scan 220.127.116.11 made 100 failed connections on 100 hosts and 100 ports in 4m55s
The only downside is that 18.104.22.168 appears to be backscatter (conn_state and history are OTH H), but that's an issue somewhere inside is_failed_conn, which is unchanged from scan.bro.
It should be a drop-in replacement for scan.bro, except that any notice policies or scan policy hooks will need to be changed.
It could possibly be changed to still raise Address_Scan/Port_Scan notices at least in some cases. I don't know how people may be using those notices differently - we handle them the same, so the change to a unified notice type is a non-issue for us.