I've found it convenient to use an undocumented feature of SumStats: changing the epoch. This comes in particularly handy when creating statistics for human consumption, as it is often useful to synchronize to a logging interval. For example, if hourly stats are desired, it is useful to give the first sumstat a shorter epoch that ends on the hour, then have subsequent sumstats trigger on the hour.
Looking into this, I realized that the epoch can be changed if the argument to SumStats::create is a variable rather than the usual anonymous record. Then, in epoch_result or epoch_finished, the timeout for the next epoch can be recomputed on the fly using calc_next_rotate().
However, this does not work as expected, because the next epoch is scheduled before epoch_result and epoch_finished are executed. What does work is the following hack:
- Create the initial sumstat with an epoch that will synchronize to the logging interval
- Immediately change the epoch to the desired interval
Example:
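    # Declare the event so it can be scheduled before its handler is defined
    global setup_sumstat: event();

    event bro_init()
        {
        # So network_time() will be initialized…
        schedule 0 usec { setup_sumstat() };
        }

    event setup_sumstat()
        {
        … blah …
        local mysumstat: SumStats::SumStat;
        mysumstat = [
            $name="mysumstat",
            # calc_next_rotate() already returns the interval remaining until
            # the next 10-minute boundary, so network_time() is not subtracted
            $epoch=calc_next_rotate(10 min),
            etc…
        ];
        SumStats::create(mysumstat);
        # Now the SumStat has been created and the initial epoch scheduled;
        # change the epoch to the regular interval for the future
        mysumstat$epoch = 10 min;
        }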
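To make the timing concrete, here is a rough sketch of the arithmetic in Python (not Bro script). It assumes interval boundaries fall on exact multiples of the interval counted from time 0, which is a simplification: calc_next_rotate() aligns to Bro's configured rotation base time. The function names are made up for this illustration.

```python
def seconds_to_next_boundary(now: float, interval: float) -> float:
    """Seconds from `now` until the next multiple of `interval`.

    Assumption for this sketch: boundaries are multiples of `interval`
    counted from time 0. If `now` is already on a boundary, a full
    interval is returned so the first epoch is never zero-length.
    """
    return interval - (now % interval)


def epoch_ends(start: float, interval: float, count: int) -> list:
    """End times of the first `count` epochs: one short, aligned epoch
    followed by regular ones of length `interval`."""
    ends = []
    t = start + seconds_to_next_boundary(start, interval)
    for _ in range(count):
        ends.append(t)
        t += interval
    return ends


if __name__ == "__main__":
    # Started 437 s past a 10-minute boundary: first epoch lasts 163 s.
    print(seconds_to_next_boundary(600437, 600))
    print(epoch_ends(600437, 600, 3))
```

With a start time 437 seconds past a 10-minute boundary, the first epoch is 163 seconds long and every later epoch ends exactly on a boundary.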
It would be convenient if the epoch could be changed in epoch_result or epoch_finished, but that would require some changes to the internals: the reschedule would need to take place after processing results, which could throw the timing off a bit. On the other hand, unless one is interested in exact statistics over a known time period (as I am), the small amount of jitter probably wouldn't be noticeable or significant.
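The ordering issue can be illustrated with a toy model in Python. This is not the SumStats implementation, only a sketch of the behavior described here: it assumes the next epoch end is scheduled from whatever epoch value is current at that moment, and that the callback switches the epoch to the regular interval. All names are hypothetical.

```python
def simulate(first_epoch, regular_epoch, n, schedule_before_callback):
    """Return the end times of the first `n` epochs, starting at t=0.

    schedule_before_callback=True models the behavior described above:
    the next epoch is scheduled first, then the callback changes the
    epoch. False models the hypothetical reordering.
    """
    epoch = first_epoch
    next_end = epoch  # initial schedule made at creation time
    ends = []
    for _ in range(n):
        now = next_end
        ends.append(now)
        if schedule_before_callback:
            next_end = now + epoch   # still uses the old epoch value
            epoch = regular_epoch    # callback's change arrives too late
        else:
            epoch = regular_epoch    # callback runs first
            next_end = now + epoch   # schedule picks up the new value
    return ends


if __name__ == "__main__":
    print(simulate(163, 600, 3, True))   # [163, 326, 926]
    print(simulate(163, 600, 3, False))  # [163, 763, 1363]
```

In the first case the short 163-second epoch is repeated once before the change takes effect, so every later epoch ends off the boundary. In the second case the epochs stay aligned.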
The create-then-change workaround is horribly hackish, and a different approach to accomplishing the goal would be to allow user scripts to schedule the end of the epoch:
- Mark epoch as &optional.
- Expose and document SumStats::finish_epoch as part of the public API.
- Make the minor changes to not schedule SumStats::finish_epoch if epoch is undefined.
By not defining epoch, a script would indicate that it will manage epoch timing itself. The script would schedule the first epoch based on the logging interval, and in the epoch_finished function schedule each successive epoch to stay in sync with the logging interval.
Any comments, suggestions, etc. ???
Jim