How to convert the name field in smb_files.log to a "readable" string?

Hi, all

Is there a way to convert the name field of smb_files.log to a “readable” string?

I get name values like “\u00ec\u0099\u0084”.

It looks like escaped Unicode, and I see garbled strings (e.g. ê¸°íš íŒ€) when I send the log to ELK (character set: UTF-8).

I probably need to convert it first.

Any comments would be appreciated!

Thanks!

I've been thinking about how to handle this for a while. The data that is being written into the log is technically already UTF-8; it's just that the non-ASCII bytes are escaped.

I think we can deal with this by adding a switch that makes the logs "UTF-8". It would incur a bit of overhead, because each string would have to be scanned for valid UTF-8 characters before being written, and then only the invalid bytes would be escaped.

   .Seth
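
The idea described above (scan each string, keep what is valid UTF-8, escape only the invalid bytes) can be sketched roughly in Python. This is only an illustration, not how Bro does or would implement it: the function name `escape_invalid_utf8` is made up, and the sketch leans on Python's built-in `backslashreplace` error handler, which writes undecodable bytes as `\xNN`.

```python
def escape_invalid_utf8(raw: bytes) -> str:
    # Keep valid UTF-8 sequences as text. The "backslashreplace" handler is
    # only invoked for bytes that are not part of a valid UTF-8 sequence,
    # and it replaces each of those bytes with a \xNN escape.
    return raw.decode("utf-8", errors="backslashreplace")


valid = b"\xec\x99\x84"          # the three bytes from the sample name value
broken = b"abc\xff\xec\x99\x84"  # the same bytes after a stray 0xff byte

print(escape_invalid_utf8(valid))   # one decoded character
print(escape_invalid_utf8(broken))  # 0xff stays escaped, the rest is decoded
```

The scan is the overhead mentioned above: every string has to be walked once before it is written.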

Does the JSON log writer make this simpler for users? I think Bro writes out valid JSON for this,
so any JSON parser should give you proper UTF-8 strings.

It writes out valid JSON, but strings aren't handled as well as they could be. That's why I said that non-ASCII bytes are escaped. The escaping follows the JSON spec, but it has other problems.

   .Seth
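
One way to see the problem is to parse the sample value from the original post with a JSON parser. The snippet below is a small Python check; the one-field record is a made-up stand-in for a real smb_files.log line. Because each escaped byte was written as its own `\u00XX` escape, the parser hands back three separate characters instead of one decoded UTF-8 character.

```python
import json

# The sample value from the original post, placed in a hypothetical
# one-field JSON log line.
line = '{"name": "\\u00ec\\u0099\\u0084"}'

name = json.loads(line)["name"]

# The parser returns one character per escaped byte (U+00EC, U+0099, U+0084),
# not the single character those three bytes encode in UTF-8.
print(len(name))
print([hex(ord(ch)) for ch in name])
```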

I see.
So I need to convert the escaped non-ASCII bytes to UTF-8.
I want the logs to be readable, even if that adds a bit of overhead.
Is there a sample Bro script that does this?
It's hard for me to write one because I'm new to Bro scripting.

Thanks!
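
This is not a Bro script, but as a possible workaround outside Bro, the JSON log lines could be post-processed before they are shipped to ELK. Below is a minimal Python sketch under some assumptions: the log is written as one JSON object per line, every escaped character stands for exactly one raw byte, and only top-level string fields need fixing. The helper names `repair` and `repair_line` are hypothetical.

```python
import json


def repair(value: str) -> str:
    # Assumption: each character stands for one raw byte (code point < 256)
    # and those bytes together form valid UTF-8. If either step fails, the
    # value is returned unchanged.
    try:
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


def repair_line(line: str) -> str:
    record = json.loads(line)
    fixed = {k: repair(v) if isinstance(v, str) else v for k, v in record.items()}
    # ensure_ascii=False keeps the decoded characters instead of re-escaping them.
    return json.dumps(fixed, ensure_ascii=False)


print(repair_line('{"name": "\\u00ec\\u0099\\u0084"}'))
```

Strings that are already plain ASCII pass through unchanged, and a string that cannot be reinterpreted as UTF-8 is left as it was rather than dropped.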