Keyword matching in documents


Is it possible for Bro to perform keyword matching on document files (such as plain text, OpenOffice, PDF, etc.) and generate notices when the keyword is found?


Vikram Basu

I have made a sample Bro script after looking into the ssn-exposure and credit-card-exposure scripts, but I am getting the following error in reporter.log:

{"ts":1505214009.989112,"level":"Reporter::ERROR","message":"string without NUL terminator: \u0022CONFIDENTIAL\u005cx0a\u0022","location":""}

How would I fix this?



Here is the script

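# Keyword Matching Basic script

@load base/frameworks/notice

module KeywordMatch;

export {
	# Keyword Matching Log ID definition
	redef enum Log::ID += { LOG };

	redef enum Notice::Type += {
		# The notice name was lost from the post; "Found" is an
		# assumed placeholder.
		Found
	};

	type Info: record {
		ts:   time    &log;
		uid:  string  &log;
		id:   conn_id &log;
		word: string  &log &optional;
		data: string  &log;
	};

	# The keyword that is being matched
	const keyword = "CONFIDENTIAL" &redef;
}

event bro_init() &priority=5
	{
	Log::create_stream(KeywordMatch::LOG, [$columns=Info]);
	}

# Logs the match and raises a notice if the keyword occurs in the data chunk.
function check_keyword(c: connection, data: string): bool
	{
	local it_matched = F;

	if ( keyword in data )
		{
		it_matched = T;
		}

	if ( it_matched )
		{
		local log: Info = [$ts=network_time(),
		                   $uid=c$uid, $id=c$id,
		                   $word=keyword, $data=data];
		Log::write(KeywordMatch::LOG, log);

		# Only the $msg field survived in the post; $note and $conn
		# are assumed here.
		NOTICE([$note=Found, $conn=c,
		        $msg=fmt("Keyword Matched %s", keyword)]);
		return T;
		}

	return F;
	}

# Receives each chunk of file data from the data event analyzer.
event KeywordMatch::stream_data(f: fa_file, data: string)
	{
	local c: connection;

	for ( id in f$conns )
		{
		c = f$conns[id];
		}

	if ( c$start_time > network_time()-20secs )
		check_keyword(c, data);
	}

event file_new(f: fa_file)
	{
	if ( f$source == "HTTP" )
		{
		# The end of this call was lost from the post; the
		# $stream_event argument is inferred from the event above.
		Files::add_analyzer(f, Files::ANALYZER_DATA_EVENT,
		                    [$stream_event=KeywordMatch::stream_data]);
		}
	}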


Hi Vikram,

It turns out that you found a small bug (or at least a gotcha) in Bro. Bro
has a few functions that do not deal very well with binary data, and "in"
happens to be one of them.

I wrote a small patch to Bro that should fix this problem. It is in the
branch topic/johanna/in-binary. If you want to manually apply it, you only
need the single line change in

I also created a merge request for this at if you are interested in
tracking this.
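
Until a patched build is available, one possible workaround is to test the data with a pattern instead of a string, since "pattern in string" is evaluated by Bro's regular expression matcher rather than by the string form of "in". This is a minimal, untested sketch and not something proposed in this thread: the names keyword_pattern and data_has_keyword are hypothetical, and whether the pattern matcher avoids the binary-data problem on a given Bro version is an assumption that should be checked before relying on it.

```bro
module KeywordMatch;

export {
	# Hypothetical pattern form of the keyword (not in the original script).
	const keyword_pattern = /CONFIDENTIAL/ &redef;
}

# Hypothetical helper: returns T if the pattern occurs anywhere in the chunk.
function data_has_keyword(data: string): bool
	{
	return keyword_pattern in data;
	}
```

With this in place, check_keyword could call data_has_keyword(data) where it currently evaluates "keyword in data"; the logging and notice code would stay the same.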