Bro + Yara File Scanning Module?

Hello all:

I wanted to poke the hive mind to see if anyone has considered, or is actively pursuing, integrating Yara into a Bro script.

An idea for a script I would like to write is to simply take any file from a ‘file_new’ event and attach something like Files::ANALYZER_YARA to do the heavy lifting: take a user-defined path to a master Yara file, scan the file, append the results to either files.log or notice.log, and finally extract any file that hit on a signature (for further analysis).

Interested if this is something that has been considered previously? If so, what were the results? If not, I’m happy to spin off an effort of my own. Either way I see it as a good project to get into Bro scripting at a deeper level.

Thanks,
Jason

Jason,

I would be curious to hear more about this as well. I don’t know if it already exists, but we are considering functionality here very similar to what you’ve described. We were considering moving the extracted files to another system for Yara scanning, but integrating it within Bro might be a more efficient process.

Thanks,

John

The process I use is to have all of the files written to a directory, and then a Python script monitors that directory for new files. It uses a Redis keystore and checks the sha256 of each file. If the hash exists in the keystore, it simply deletes the file and moves on. If it does not exist, it adds the hash to the keystore and then moves the file somewhere else for further processing. This could be Yara or whatever. I will see if I can dig it up, but it was rather simple Python. I did this because I didn’t want to tie up Bro, especially if you are seeing high file volume.
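A minimal sketch of the dedupe step described above, not the original script. It assumes a plain Python set standing in for the Redis keystore, and the function names and directory layout (`incoming`, `unique`) are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large extractions do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def process_new_files(incoming: Path, unique: Path, seen: set) -> list:
    """Delete files whose hash was already seen; move first-seen files to `unique`.

    `seen` is an in-memory stand-in (assumption) for the Redis keystore.
    Returns the paths of the files that were kept.
    """
    unique.mkdir(parents=True, exist_ok=True)
    kept = []
    for path in sorted(incoming.iterdir()):
        if not path.is_file():
            continue
        file_hash = sha256_of(path)
        if file_hash in seen:
            path.unlink()  # duplicate content: drop it and move on
            continue
        seen.add(file_hash)
        target = unique / file_hash  # name by hash so the move is idempotent
        shutil.move(str(path), str(target))
        kept.append(target)
    return kept
```

Whatever does the scanning (Yara or anything else) would then only ever look at the `unique` directory, which keeps the expensive work out of Bro's path.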

Mike

I was working on this a while ago and got it working. :slight_smile:

Unfortunately it required some changes to Yara itself to add an incremental analysis API. I need to update those changes because the Yara developers have since been modifying the same areas of the code. I've been thinking of coming back around to that code to get it cleaned up and contributed back to the Yara developers so that we could easily have a Yara analyzer in Bro.

  .Seth

These solutions are very awesome and mirror the path we are taking at Cisco with OpenSOC to scale up and out. I’ll be speaking a bit deeper about our plans at BroCon in a few weeks but the theories are very similar: gather telemetry data (bro logs), gather intelligence data (yara results, threat intel lists, etc), inspect (storm, python scripts, etc).

For this specific instance we queue the logs through kafka to enter our storm topology and plan to throw the files into hdfs for retention/deeper analysis.

I forgot to mention one more point. It was pretty slow because of the internal architecture of Yara and I had started reworking a bit of Yara to fix the problem (compiling rule sets is slow and they mix rule match state with the compiled rule structure so you can't match multiple files concurrently with the same compiled rule set).
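To illustrate the separation being described, here is a toy sketch in Python. It is not Yara's API or internals: the class names are hypothetical and plain substring matching stands in for real rule evaluation. The point is only that the compiled rule set is built once and shared, while each file gets its own small match state that is fed incrementally.

```python
class CompiledRules:
    """Built once and never mutated afterwards, so it can be shared across files."""

    def __init__(self, patterns):
        self.patterns = dict(patterns)  # rule name -> bytes pattern
        longest = max((len(p) for p in self.patterns.values()), default=1)
        # Bytes to carry over between chunks so matches spanning a boundary are found.
        self.overlap = max(0, longest - 1)


class MatchState:
    """Per-file state: holds everything that changes while scanning one file."""

    def __init__(self, rules):
        self.rules = rules
        self.tail = b""
        self.matched = set()

    def feed(self, chunk):
        """Scan one chunk of the file as it arrives."""
        window = self.tail + chunk
        for name, pattern in self.rules.patterns.items():
            if name not in self.matched and pattern in window:
                self.matched.add(name)
        overlap = self.rules.overlap
        self.tail = window[-overlap:] if overlap else b""
```

Because `feed` only touches the `MatchState`, two files can be interleaved against the same `CompiledRules` object without their results bleeding into each other.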

Is there anyone out there interested in taking on this rework and pushing it to completion? (you need to know C)

  .Seth

Out of curiosity, were you working with Yara 2.0 when you were developing? It is several orders of magnitude faster than previous versions.

To your question, I would be interested in this effort but before diving in would like some time to familiarize myself more with Bro development. I will be at this year's BroCon in pursuit of that goal and would welcome further collaboration toward this end :slight_smile:

Ideally, what I would love to see is a way to take actions on alerts generated by some kind of ‘Files::ANALYZER_YARA’. So say if I have a ZIP file for example and a Yara rule to detect a ZIP. I think it would be very valuable for someone to not only just trigger on that, but then invoke an event that decompresses the ZIP and feeds the contents through the same scanning engine. Now replace ZIP files with a known crypter/obfuscation or something else and you can perhaps start to see the power and possibilities that begin to unfold :slight_smile:
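A rough sketch of that "match, unpack, rescan" loop, written as standalone Python rather than as a Bro analyzer. It assumes the whole file is already available in memory, uses byte-substring signatures as a stand-in for Yara rules, and the `scan` function and its parameters are hypothetical.

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header signature at the start of a ZIP


def scan(data, signatures, name="<root>", depth=0, max_depth=3):
    """Return (object name, rule name) hits, recursing into ZIP members.

    `signatures` maps rule names to byte patterns (a toy stand-in for Yara).
    `max_depth` bounds recursion so nested archives cannot loop forever.
    """
    hits = [(name, rule) for rule, pattern in signatures.items() if pattern in data]
    if data.startswith(ZIP_MAGIC) and depth < max_depth:
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for info in archive.infolist():
                    if info.is_dir():
                        continue
                    member = archive.read(info)
                    # Feed each member back through the same scanning function.
                    hits.extend(
                        scan(member, signatures, f"{name}/{info.filename}",
                             depth + 1, max_depth)
                    )
        except zipfile.BadZipFile:
            pass  # looked like a ZIP but was not parseable; keep the hits so far
    return hits
```

Swapping the ZIP branch for a decoder tied to a specific crypter signature would follow the same shape: a hit selects an unpacking step, and the output goes back into `scan`.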

Full disclosure time…:

I am a malware reverse engineer by trade. When I RE a binary I can tell my customers (the analysts) a lot about it, however, their ability to take action on the intelligence I give them is oftentimes limited by their capabilities / security posture as an organization.

Enter Bro. With its modular framework, I look to it as a means to turn the observables I gain from my RE efforts into more valuable, actionable intelligence for my team. By implementing this modular ‘take action on X’ mentality with respect to Bro and Yara, my signatures get more mileage, as do my observables on how certain crypters/encodings can be defeated.

Imagine this: I have a signature for shellcode that always decrypts a PE in a certain way at a certain offset. My Yara rule hits on this signature and triggers an event that unmasks the binary. Out pops the dropper, which is scanned again and hits on the signature I created for the dropper, etc, etc…

So I’ve automated analysis that is usually done by someone more experienced on the command line. Not only that, but now the analyst knows more about what they are dealing with which directly informs IR/Intel efforts.

Hope that helps paint the picture a little more :slight_smile:

  - Jason

> Out of curiosity, were you working with Yara 2.0 when you were developing? It is several orders of magnitude faster than previous versions.

I was working on it during the lead up to the 2.0 code so my work was developed around the changes they made.

> To your question, I would be interested in this effort but before diving in would like some time to familiarize myself more with Bro development. I will be at this year's BroCon in pursuit of that goal and would welcome further collaboration toward this end :slight_smile:

Once an incremental analysis API is added to Yara and Yara's match state and compiled rules are separated, the Bro module is really simple (and it's already been written somewhere...).

> Ideally, what I would love to see is a way to take actions on alerts generated by some kind of 'Files::ANALYZER_YARA'. So say if I have a ZIP file for example and a Yara rule to detect a ZIP. I think it would be very valuable for someone to not only just trigger on that, but then invoke an event that decompresses the ZIP and feeds the contents through the same scanning engine. Now replace ZIP files with a known crypter/obfuscation or something else and you can perhaps start to see the power and possibilities that begin to unfold :slight_smile:

It's a bit more complicated than that unfortunately. :slight_smile:

Everything in Bro is organized around incremental analysis. If a Yara rule fires, you can't go back and look at the old data; it's gone already. You'd need to write Bro scripts that extract files temporarily and then possibly re-analyze them with new information.
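The constraint can be sketched outside of Bro in a few lines of Python. This is only an illustration under assumptions: `analyze_stream`, `trigger`, and `reanalyze` are hypothetical names, and a byte pattern stands in for a Yara rule. Each chunk is seen once as it streams past, so the only way to look at earlier data again is to have spooled a copy to temporary storage.

```python
import tempfile


def analyze_stream(chunks, trigger, reanalyze):
    """Scan chunks incrementally while spooling a temporary copy of the file.

    If `trigger` is seen anywhere in the stream, the spooled copy is read back
    and handed to `reanalyze`; otherwise the copy is discarded and None returned.
    """
    keep = max(0, len(trigger) - 1)  # bytes carried over to catch boundary matches
    with tempfile.TemporaryFile() as spool:
        fired = False
        tail = b""
        for chunk in chunks:
            spool.write(chunk)  # temporary extraction of the data seen so far
            window = tail + chunk
            if trigger in window:
                fired = True
            tail = window[-keep:] if keep else b""
        if not fired:
            return None
        spool.seek(0)
        return reanalyze(spool.read())  # second pass over the complete file
```

The cost is that every file has to be written out on the chance that a rule fires, which is part of why the trade-offs here are not obvious.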

> By implementing this modular 'take action on X' mentality with respect to Bro and Yara, my signatures get more mileage,

I agree there, but there are some questions left lingering. We aren't really sure if you'll be able to run large rule sets against all files and just how much help they will be.

> Imagine this, I have a signature for shellcode that decrypts a PE in a certain way always at a certain offset. My Yara rule hits on this signature and triggers an event that unmasks the binary as well, out pops the dropper, that is scanned again, and hits on the signature I created for the dropper, etc, etc..

This is one of those areas where the file would need to be extracted and re-analyzed.

> Hope that helps paint the picture a little more :slight_smile:

Yes! I'm just excited that someone that doesn't primarily look at network traffic is playing with Bro, or at least looking into it. :slight_smile:

  .Seth

It probably isn’t what you’re looking for, but I tried making something similar to Yara a little while back. It is a hack on top of the Intel framework.

https://github.com/anthonykasza/scratch_pad/tree/master/rules