My site was an early tester of 2.5_beta, and during most of our testing, we did not experience much in the way of issues related to log rotation. As soon as we started using 2.5_beta we were utilizing logger mode, as we run an extremely busy cluster, and had been experiencing memory utilization exhaustion issues with the manager being single threaded At one point near the end of the 2.5 beta, we started seeing some irregular rotation intervals, which we noted were potentially related to us starting to use the Intel framework.
I have since moved to 2.5 stable, and I’m still seeing irregular rotation issues which appear to be related to how busy the manager is at the time of log rotation. I do notice that the main local-manager process seems to stay loaded at or near 100% CPU utilization when we are seeing a lot of traffic. As an example, I rotated logs perfectly from midnight up to 07:00-08:00. After that, my current logs in spool/logger are running well over into 9 o’clocks data set. I frequently see .rotated files well after a log rotation interval has run (the following was observed 09:41 today):
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.capture_loss
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.communication
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.conn
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.conn-summary
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.dce_rpc
-rw-r–r-- 1 root root 18 Nov 30 03:00 .rotated.dhcp
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.dns
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.dpd
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.files
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.ftp
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.http
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.intel
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.irc
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.kerberos
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.known_certs
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.known_hosts
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.known_services
-rw-r–r-- 1 root root 18 Nov 29 22:00 .rotated.loaded_scripts
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.mysql
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.notice
-rw-r–r-- 1 root root 18 Nov 29 22:00 .rotated.packet_filter
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.pe
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.radius
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.rdp
-rw-r–r-- 1 root root 18 Nov 29 23:00 .rotated.reporter
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.rfb
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.sip
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.smtp
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.snmp
-rw-r–r-- 1 root root 18 Nov 30 01:00 .rotated.socks
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.software
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.ssh
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.ssl
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.stats
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.syslog
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.tunnel
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.weird
-rw-r–r-- 1 root root 18 Nov 30 08:00 .rotated.x509
Any tips or pointers on how to track down the culprit here and get things back to normally scheduled log rotation intervals?
FWIW, I’m running on a brand new R730xd with SSD’s for the OS, an NVMe drive for the Bro installation, and RAID 10 & RAID 5 respectively for the 60 day archive and long term storage archive volumes. I’m linting my intel files to make sure they are properly formatted, and there doesn’t seem to be an issue here…
Respectfully,
-Erin Shelton
Program Manager: Incident Response and Network Security
Office of Information Technology
University of Colorado Boulder