Rounding error for intervals from timestamp deltas

Hello, I found a strange error with intervals calculated from timestamp differences. When I run zeek with a script containing the following code:

event zeek_init() {
    local x: time = double_to_time(0.000002);
    local y: time = double_to_time(0.000001);
    local z = x-y;
    print z;
    print interval_to_double(z);
}

I get the expected output:

1.0 usec
0.000001

When I change the time to a more current date though, there is some kind of rounding error.

event zeek_init() {
    local x: time = double_to_time(1717676016.000002);
    local y: time = double_to_time(1717676016.000001);
    local z = x-y;
    print z;
    print interval_to_double(z);
}

output:

0.953674 usecs
9.5367431640625e-07

Does anyone know what is causing this, or have an idea how to fix it? I originally stumbled across this while inspecting the duration field in some conn.log files.
Thanks in advance,
Timo

Floating-point numbers like double (or time, which in Zeek is really just a type wrapping a double) cannot always represent decimal numbers exactly. This is a general property of floating-point numbers in many languages.

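In this particular case none of the numbers can be represented exactly. A double carries 53 bits of significand (roughly 15 to 17 significant decimal digits), and that budget is shared between the integer and the fractional part. 0.000002 and 0.000001 have no integer part, so nearly all of the capacity goes to the fraction and the floating-point representation ends up very close to the decimal values (in absolute terms). 1717676016.000002 and 1717676016.000001, on the other hand, spend most of it on the ten digits before the decimal point: at that magnitude adjacent doubles are 2^-22 s (about 0.24 usec) apart, so in absolute terms their deviation is much bigger. Your two timestamps end up four such steps apart, which is exactly the 2^-20 s = 9.5367431640625e-07 you are seeing.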
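
To make the spacing argument concrete, here is a minimal sketch in Python rather than Zeek script (both use IEEE 754 doubles, so the arithmetic is the same). The helper round_to_usec is a hypothetical name of mine, not a Zeek function, and rounding to microseconds is only reasonable under the assumption that the original timestamps have microsecond resolution.

```python
import math

x = 1717676016.000002
y = 1717676016.000001

# Distance from x to the next representable double (requires Python 3.9+).
# For values between 2**30 and 2**31 this is 2**-22 seconds.
step = math.ulp(x)

# Both timestamps were snapped to that grid when they were parsed,
# so their difference is a whole number of grid steps.
diff = x - y

def round_to_usec(seconds: float) -> float:
    """Hypothetical helper: round a duration to whole microseconds."""
    return round(seconds * 1e6) / 1e6

print(step)                 # grid spacing near the timestamps
print(diff)                 # the "wrong" interval from the question
print(diff / step)          # number of grid steps between x and y
print(round_to_usec(diff))  # duration after rounding to microseconds
```

Rounding like this does not recover lost precision; it only hides the grid artifacts when you know the inputs cannot be finer than a microsecond anyway.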

Thanks for your answer. One small followup question:
As I said, I got confused when I saw the value 9.5367431640625e-07 in the duration field of a conn.log. The timestamp field of the same log entry has microsecond precision. I assume the duration is calculated from this timestamp and the timestamp of the last packet in the connection, or something similar. Is there a reason the duration value suggests much higher precision? I also noticed this is only the case when using logs in JSON format. With logs in TSV format the duration value is rounded to six decimal places, which matches the precision of the timestamp.

AFAICT the output format of double values is unrelated to their actual precision (which would depend on how the values were calculated); the difference is simply due to different rendering. The default formatter explicitly limits output to six decimal places, while for JSON output no such limit is put in place.
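
As a rough analogy in Python (this is not Zeek's writer code, just the same double rendered in two ways), a fixed six-decimal format and a shortest round-trip format produce the two outputs from the question:

```python
import json

# The duration from the thread; 2**-20 is exactly 9.5367431640625e-07.
duration = 2.0 ** -20

# Fixed six decimal places, similar in spirit to what the TSV output shows.
fixed = f"{duration:.6f}"

# Python's json module emits the shortest string that round-trips the double.
shortest = json.dumps(duration)

print(fixed)
print(shortest)
```

The stored value is identical in both cases; only the number of digits printed differs.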