Order events by timestamp when JSON logs are used

Hi all,

My Zeek hosts are working OK, with Splunk as the back end and Kafka as the message broker. There is only one problem: events are ordered by the time they arrive at Splunk instead of by the Zeek event timestamp. Is it possible to change this behavior?

Many thanks.

Hey there, usually what Splunk uses for the timestamp gets defined in the props.conf of the deployment app. I think your sourcetype would be different, but this is an example for a sourcetype from a Corelight sensor (yours could be zeek_conn or something else).
[corelight_conn]
TIME_PREFIX = _write_ts(?:"\s*:\s*")?
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6QZ

I think Corelight is what adds that _write_ts field, so you may just use the ts field instead. That said, the above applies if you’re using something like the Splunk Universal Forwarder.
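
If it helps to see what those two settings do, here is a rough Python sketch of the idea: TIME_PREFIX tells Splunk where the timestamp starts, and TIME_FORMAT tells it how to read it. This is only an emulation for testing a pattern offline, not how Splunk implements it. The sample line and the extract_timestamp helper are made up for illustration, and Python's %f stands in for Splunk's %6Q.

```python
import re
from datetime import datetime, timezone

# Hypothetical raw event, shaped like a Zeek/Corelight JSON log line.
raw = ('{"_path":"conn","_write_ts":"2022-06-27T19:42:44.456853Z",'
       '"ts":"2022-06-27T19:42:44.221064Z"}')

# Same pattern as the TIME_PREFIX in the stanza above.
WRITE_TS_PREFIX = r'_write_ts(?:"\s*:\s*")?'
# Assumed alternative for the plain Zeek ts field. The leading quote keeps
# it from matching the tail of "_write_ts".
TS_PREFIX = r'"ts"\s*:\s*"'

# Python equivalent of TIME_FORMAT (%f = microseconds, like %6Q).
PY_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def extract_timestamp(line: str, prefix: str) -> datetime:
    """Find the prefix, then parse the text that follows it up to the next quote."""
    match = re.search(prefix, line)
    if match is None:
        raise ValueError("prefix did not match")
    candidate = line[match.end():].split('"', 1)[0]
    return datetime.strptime(candidate, PY_TIME_FORMAT).replace(tzinfo=timezone.utc)


print(extract_timestamp(raw, WRITE_TS_PREFIX).isoformat())
print(extract_timestamp(raw, TS_PREFIX).isoformat())
```

If the second call picks out the ts value, a prefix along those lines is probably a reasonable starting point for a ts-based stanza, but test it against your own logs.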

If you’re using HEC (HTTP Event Collector), then the props.conf settings above don’t apply. Instead, Splunk uses the "time" field in the HEC envelope itself, or falls back to the time at which it receives the log. I’m not 100% sure about your data path, but I’m going to guess you’re using Splunk Connect for Kafka, which looks like it uses HEC, so you’re seeing the latter behavior, where Splunk is just using the time it receives the log. If it helps, this is roughly how a Zeek log carried over HEC looks on the wire.

{
  "time": 1656358964.456853,    <- this is what Splunk uses
  "host": "ricky-ap200",
  "sourcetype": "corelight_files",
  "index": "corelight-hec",
  "event": {
    "_path": "files",
    "_system_name": "ricky-ap200",
    "_write_ts": "2022-06-27T19:42:44.456853Z",
    "ts": "2022-06-27T19:42:44.221064Z",
    "fuid": "F65LW1V5LhuLgC4Jd",
    "tx_hosts": [
      "192.168.1.6"
    ],
    …truncating
    "extracted_size": null
  }
}
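
To make the envelope idea concrete, here is a small Python sketch of how a sender could fill "time" from the Zeek ts field rather than leaving it out. In your setup the Kafka connector builds the envelope, so this is only an illustration of the shape. The helper names (zeek_ts_to_epoch, to_hec_envelope) are hypothetical, and it assumes ts is an ISO 8601 UTC string like in the example above.

```python
import json
from datetime import datetime, timezone


def zeek_ts_to_epoch(ts: str) -> float:
    """Convert an ISO 8601 UTC timestamp (as in the example) to epoch seconds."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def to_hec_envelope(event: dict, sourcetype: str, index: str) -> dict:
    """Wrap a Zeek event in an HEC-style envelope, using ts as the event time."""
    return {
        "time": zeek_ts_to_epoch(event["ts"]),
        "host": event.get("_system_name", "unknown"),
        "sourcetype": sourcetype,
        "index": index,
        "event": event,
    }


# A cut-down event with only the fields needed for the illustration.
event = {
    "_path": "files",
    "_system_name": "ricky-ap200",
    "ts": "2022-06-27T19:42:44.221064Z",
}
envelope = to_hec_envelope(event, "corelight_files", "corelight-hec")
print(json.dumps(envelope, indent=2))
```

With "time" set this way, Splunk would index the event at the moment Zeek observed it rather than when the log was written or received.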

I haven’t used Splunk Connect for Kafka myself, but looking at their docs ("Data ingestion parameters for Splunk Connect for Kafka" in the Splunk Documentation), I THINK what you need is in the Timestamp Extraction Parameters:
enable.timestamp.extraction: To enable timestamp extraction, set the value of this field to true. NOTE: Applicable only if splunk.hec.raw is false. Default: false
timestamp.regex: Regex for timestamp extraction. NOTE: Regex must have a named capture group "time". For example, \"time\":\s*\"(?<time>.*?)\" is formatted correctly. Default: ""

That’s going to depend somewhat on your field and how it is formatted, but I would guess you need to enable that option and then specify a regex that matches ts and your values.
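
Before putting a regex into the connector config, it may be worth checking it against a sample record. Below is a minimal Python sketch of that check. The patterns are my guesses for a ts field, not something taken from the docs. One catch: the connector is Java-based and would take the named group as (?<time>...), while Python's re module spells it (?P<time>...), so the sketch converts between the two. The second pattern assumes Zeek's default JSON output, where ts is an unquoted epoch number.

```python
import re
from typing import Optional

# Candidate value for timestamp.regex, in the connector's (Java) syntax.
JAVA_STYLE_ISO = r'"ts"\s*:\s*"(?<time>.*?)"'
# Variant for an unquoted epoch ts such as 1656358964.221064.
JAVA_STYLE_EPOCH = r'"ts"\s*:\s*(?<time>\d+(?:\.\d+)?)'


def to_python_regex(java_regex: str) -> str:
    """Rewrite the Java named group into Python's (?P<time>...) form."""
    return java_regex.replace("(?<time>", "(?P<time>")


def extract_time(record: str, java_regex: str) -> Optional[str]:
    """Return the text captured by the 'time' group, or None if nothing matches."""
    match = re.search(to_python_regex(java_regex), record)
    return match.group("time") if match else None


iso_record = '{"_write_ts": "2022-06-27T19:42:44.456853Z", "ts": "2022-06-27T19:42:44.221064Z"}'
epoch_record = '{"ts": 1656358964.221064, "uid": "CHhAvVGS1DHFjwGM9"}'

print(extract_time(iso_record, JAVA_STYLE_ISO))
print(extract_time(epoch_record, JAVA_STYLE_EPOCH))
```

The leading quote in "ts" matters: without it, the pattern would also match the end of "_write_ts" and could pick up the wrong field.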