Bro logs from JSON to TSV

Greetings,

Hope my email finds you well. I was wondering if someone can help me figure out how to transform existing Bro logs from JSON format to TSV format. The TSV format is what Bro uses by default to write log files. Thanks in advance!

Sincerely,

Moustafa ElBadry, Information Security Analyst, Office of Information Security
Oregon State University | Information Services | 541-737-4545

Do you want the exact TSV format with the #fields and #types header, or just TSV in general?

This is a somewhat strange thing to want to do, since working with the data in JSON format is generally easier. What exactly are you trying to accomplish after you convert the logs?

I want the exact TSV format.

We currently have our Bro cluster writing logs in JSON. There are a couple of network traffic analytics tools, like RITA (Real Intelligence Threat Analytics) and some AWK scripts, that we want to use. The problem is that these tools work only with Bro’s default TSV format.

Moustafa

Ah, I see now. You have a few options here.

You could just tell bro to write out the logs in both formats at the same time. For older logs, there is only a script for bro that can re-log to json, not the other way around; most people have the opposite problem.

There is an open issue for RITA to support json: https://github.com/ocmdev/rita/issues/146

A tool to convert the json logs back into the TSV format could be written, but ultimately that would be a waste of time. Better to update RITA to support json instead of writing more tools to work with the tsv format that only bro uses.

For awk stuff you can swap out bro-cut for jq or https://github.com/JustinAzoff/json-cut

json-cut doesn't support all the options that bro-cut supports and may be a bit buggy, but it makes it easier to extract a few fields from a json log as TSV and is 2x faster than jq. If I can find a nice, small json library for C we can probably update bro-cut to natively support the json logs.

For now, to extract note and msg from a stream of notice logs with bro-cut and json-cut you just do

    zcat notice.* | bro-cut note msg | awk ...
    zcat notice.* | json-cut note msg | awk ...

For jq you use something like

    zcat notice.* | jq -r '[.note, .msg]|@tsv' | awk ...
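One difference worth keeping in mind with the jq approach: Bro's TSV writes `-` for an unset field, while jq's `@tsv` renders a missing (null) field as an empty string. A minimal sketch of one way to paper over that, assuming jq is installed; the sample records below are made up for illustration, and `// "-"` is only appropriate for string fields like note and msg, since it would also replace a boolean `false`.

```shell
# Hypothetical sample records (made-up values); the second one has no msg field.
sample='{"note":"Scan::Port_Scan","msg":"example message"}
{"note":"SSL::Invalid_Server_Cert"}'

# `// "-"` substitutes "-" when a field is missing or null,
# mirroring what Bro's TSV writer does for unset fields.
out=$(printf '%s\n' "$sample" | jq -r '[.note // "-", .msg // "-"] | @tsv')
echo "$out"
```

In a real pipeline the `printf` would be replaced by the `zcat notice.*` from the commands above.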

If the awk scripts are hardcoding top level field numbers like $3 and $5 instead of using bro-cut... they should not do that :)
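For scripts that do read the TSV logs directly, here is a rough sketch of how an awk script could look columns up by name from the `#fields` header instead of hardcoding positions. The sample log is made up for illustration and only includes the `#fields` and `#types` header lines; the `col` array is just a name chosen for this example.

```shell
# Hypothetical sample: a tiny notice log in Bro's TSV layout (values are made up).
log=$(mktemp)
printf '#fields\tts\tuid\tnote\tmsg\n' > "$log"
printf '#types\ttime\tstring\tenum\tstring\n' >> "$log"
printf '1500000000.000000\tCabc123\tScan::Port_Scan\texample message\n' >> "$log"

# Build a name -> column map from the #fields line, then address columns by name.
# The header line has "#fields" as its first token, so data columns are offset by one.
out=$(awk -F'\t' '
  $1 == "#fields" { for (i = 2; i <= NF; i++) col[$i] = i - 1; next }
  /^#/            { next }   # skip the remaining header lines
  { print $(col["note"]) "\t" $(col["msg"]) }
' "$log")
echo "$out"
rm -f "$log"
```

This way the script keeps working if fields are added or reordered in the log.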

Great. Thanks Justin for sharing this. Definitely helps us a lot.

Moustafa

Hello,

I have a follow up question on this. Justin, you mentioned that I could tell bro to write out the logs in both formats (TSV and JSON) at the same time. How can I do this? And can I have the TSV logs saved in one directory and the JSON logs saved in another directory?

Is the ascii.bro file located at /usr/local/bro/share/bro/base/frameworks/logging/writers/ the right file where we can configure bro to write in two different formats?

Thanks a lot for your help. I really appreciate it!

Moustafa

Great. Thanks for sharing this. I really appreciate it!

Moustafa

    Look at add-JSON:
    
    https://gist.github.com/J-Gras/f9f86828f9e9d9c0b8f0908bc3573bb0
    
    That will log JSON output to the path you define in path_json, and should retain the standard logging as well. Add-JSON is also available as a bro package.
    
    I haven't been able to get the log rotation to work for this script, though. I ended up creating a cron job that stops bro once a day, purges the JSON logs, and restarts.