Issues since 2.5.5

Well, I think I'll also put my name in the "something is funky with 2.5.5" group. I have seen far more crashes and OOMs with 2.5.5 than with 2.5.4. Case in point just now:

[425271.774232] bro invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
[425271.774234] bro cpuset=/ mems_allowed=0
[425271.774239] CPU: 0 PID: 2482 Comm: bro Tainted: G I 4.10.0-35-generic #39~16.04.1-Ubuntu
[425271.774240] Hardware name: Dell Inc. PowerEdge
[425271.774241] Call Trace:
[425271.774249] dump_stack+0x63/0x90
[425271.774253] dump_header+0x7b/0x1fd
[425271.774257] ? security_capable_noaudit+0x45/0x60
[425271.774261] oom_kill_process+0x219/0x3e0
[425271.774264] out_of_memory+0x120/0x4b0
[425271.774267] __alloc_pages_slowpath+0x9ea/0xb30
[425271.774270] __alloc_pages_nodemask+0x21a/0x2a0
[425271.774272] alloc_pages_vma+0xa2/0x270
[425271.774277] handle_mm_fault+0xdbc/0x1270
[425271.774282] __do_page_fault+0x240/0x4e0
[425271.774285] do_page_fault+0x22/0x30
[425271.774289] page_fault+0x28/0x30
[425271.774291] RIP: 0033:0x7fb47bef3786
[425271.774292] RSP: 002b:00007fff751e5330 EFLAGS: 00010206
[425271.774294] RAX: 000000000000ffe1 RBX: 00007fb46c000020 RCX: 0000000000000065
[425271.774295] RDX: 00007fb46d670fc0 RSI: 00007fb46d671020 RDI: 0000000000000000
[425271.774296] RBP: 0000000000000061 R08: 0000000000000000 R09: 0000000000000000
[425271.774297] R10: 0000000000000001 R11: 00007fb46d6521b0 R12: 00007fb46c000078
[425271.774298] R13: 00007fb46c000078 R14: 0000000000002710 R15: 00007fb46c0000c8
[425271.774300] Mem-Info:
[425271.774305] active_anon:5547944 inactive_anon:396201 isolated_anon:0
                  active_file:1269 inactive_file:726 isolated_file:32
                  unevictable:0 dirty:0 writeback:83 unstable:0
                  slab_reclaimable:5205 slab_unreclaimable:6731
                  mapped:133175 shmem:1921 pagetables:17900 bounce:0
                  free:41669 free_pcp:583 free_cma:0
[425271.774310] Node 0 active_anon:22191776kB inactive_anon:1584804kB active_file:5076kB inactive_file:2904kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:532700kB dirty:0kB writeback:332kB shmem:7684kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 8192kB writeback_tmp:0kB unstable:0kB pages_scanned:17295 all_unreclaimable? yes
[425271.774311] Node 0 DMA free:15896kB min:40kB low:52kB high:64kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15904kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[425271.774316] lowmem_reserve[]: 0 3178 24041 24041 24041
[425271.774320] Node 0 DMA32 free:92204kB min:8928kB low:12180kB high:15432kB active_anon:2645908kB inactive_anon:543392kB active_file:580kB inactive_file:1504kB unevictable:0kB writepending:0kB present:3378660kB managed:3296088kB mlocked:0kB slab_reclaimable:596kB slab_unreclaimable:500kB kernel_stack:16kB pagetables:7936kB bounce:0kB free_pcp:1188kB local_pcp:272kB free_cma:0kB
[425271.774325] lowmem_reserve[]: 0 0 20862 20862 20862
[425271.774328] Node 0 Normal free:58576kB min:58608kB low:79968kB high:101328kB active_anon:19546072kB inactive_anon:1041212kB active_file:4508kB inactive_file:1316kB unevictable:0kB writepending:4kB present:21757952kB managed:21363532kB mlocked:0kB slab_reclaimable:20224kB slab_unreclaimable:26416kB kernel_stack:2736kB pagetables:63664kB bounce:0kB free_pcp:1144kB local_pcp:240kB free_cma:0kB
[425271.774334] lowmem_reserve[]: 0 0 0 0 0
[425271.774337] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[425271.774349] Node 0 DMA32: 129*4kB (UE) 97*8kB (UME) 60*16kB (UE) 94*32kB (UME) 34*64kB (UME) 24*128kB (UME) 31*256kB (UME) 18*512kB (UE) 11*1024kB (UE) 8*2048kB (UME) 9*4096kB (UME) = 92172kB
[425271.774363] Node 0 Normal: 602*4kB (UMEH) 761*8kB (UMEH) 480*16kB (UMEH) 976*32kB (UMEH) 125*64kB (UMEH) 20*128kB (UMH) 2*256kB (UH) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 58480kB
[425271.774375] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[425271.774377] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[425271.774378] 6037 total pagecache pages
[425271.774379] 2026 pages in swap cache
[425271.774380] Swap cache stats: add 19065590, delete 19063564, find 8257735/16409127
[425271.774381] Free swap = 0kB
[425271.774381] Total swap = 9782268kB
[425271.774382] 6288152 pages RAM
[425271.774383] 0 pages HighMem/MovableOnly
[425271.774383] 119271 pages reserved
[425271.774384] 0 pages cma reserved
[425271.774384] 0 pages hwpoisoned
[425271.774385] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[425271.774390] [ 769] 0 769 10865 714 25 3 50 0 systemd-journal
[425271.774393] [ 780] 0 780 11465 12 24 3 518 -1000 systemd-udevd
[425271.774395] [ 1119] 0 1119 7155 25 19 3 52 0 systemd-logind
[425271.774397] [ 1125] 0 1125 6801 36 19 3 216 0 smartd
[425271.774399] [ 1187] 0 1187 1098 0 7 3 33 0 acpid
[425271.774401] [ 1190] 0 1190 7468 0 19 3 61 0 cgmanager
[425271.774403] [ 1191] 0 1191 6931 35 18 3 35 0 cron
[425271.774405] [ 1197] 101 1197 64097 475 28 3 610 0 rsyslogd
[425271.774407] [ 1201] 102 1201 10726 65 28 3 58 -900 dbus-daemon
[425271.774409] [ 1216] 0 1216 6510 8 18 3 44 0 atd
[425271.774411] [ 1217] 0 1217 68647 28 36 3 150 0 accounts-daemon
[425271.774413] [ 1405] 0 1405 4900 37 14 3 39 0 irqbalance
[425271.774415] [ 1406] 0 1406 69271 67 39 3 109 0 polkitd
[425271.774417] [ 1419] 103 1419 86258 26 69 4 836 0 whoopsie
[425271.774419] [ 1437] 0 1437 16376 0 37 3 177 -1000 sshd
[425271.774421] [ 1438] 106 1438 22001 1564 39 3 12831 0 redis-server
[425271.774423] [ 1633] 0 1633 4934 2 14 3 71 0 run-bro
[425271.774425] [ 1639] 0 1639 611863 6111 129 6 14689 0 bro
[425271.774427] [ 1640] 0 1640 27527 420 56 3 14833 0 bro
[425271.774430] [ 1823] 0 1823 4934 2 13 3 71 0 run-bro
[425271.774432] [ 1829] 0 1829 32819 8914 68 3 13033 0 bro
[425271.774433] [ 1830] 0 1830 27365 94 55 3 13927 0 bro
[425271.774435] [ 2456] 0 2456 4934 2 14 3 72 0 run-bro
[425271.774437] [ 2461] 0 2461 4934 2 14 3 72 0 run-bro
[425271.774439] [ 2467] 0 2467 4934 2 14 3 71 0 run-bro
[425271.774441] [ 2470] 0 2470 4934 2 15 3 71 0 run-bro
[425271.774443] [ 2482] 0 2482 734289 702195 1419 5 11537 0 bro
[425271.774445] [ 2488] 0 2488 600424 569376 1176 5 20589 0 bro
[425271.774447] [ 2492] 0 2492 2711715 1785937 5298 13 913479 0 bro
[425271.774449] [ 2493] 0 2493 4439547 3000280 8675 19 1428890 0 bro
[425271.774451] [ 2495] 0 2495 58638 33238 116 3 13250 0 bro
[425271.774453] [ 2494] 0 2494 58630 33605 116 3 13627 0 bro
[425271.774455] [ 2496] 0 2496 58616 33114 117 3 13422 0 bro
[425271.774457] [ 2497] 0 2497 58619 33112 117 3 14094 0 bro
[425271.774459] [ 2564] 0 2564 3663 0 11 3 37 0 agetty
[425271.774461] Out of memory: Kill process 2493 (bro) score 499 or sacrifice child
[425271.774515] Killed process 2497 (bro) total-vm:234476kB, anon-rss:1340kB, file-rss:131108kB, shmem-rss:0kB
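
To make the per-process table above easier to read, here is a minimal Python sketch that ranks processes by resident plus swapped memory. It assumes the column layout shown in the log (pid, uid, tgid, total_vm, rss, nr_ptes, nr_pmds, swapents, oom_score_adj, name), that rss and swapents are page counts, and a 4 kB page size (consistent with the Mem-Info lines, where active_anon:5547944 pages corresponds to 22191776kB). The function names are hypothetical.

```python
import re

# One row of the OOM killer's process table, e.g.
# "[425271.774447] [ 2492] 0 2492 2711715 1785937 5298 13 913479 0 bro"
ROW = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<nr_pmds>\d+)\s+"
    r"(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)\s*$"
)

PAGE_KB = 4  # assumption: 4 kB pages, as implied by the Mem-Info section


def parse_oom_table(lines):
    """Return one dict per process row; non-matching lines are skipped."""
    rows = []
    for line in lines:
        m = ROW.match(line.strip())
        if m is None:
            continue  # header row, Mem-Info, call trace, etc.
        rows.append({
            "pid": int(m.group("pid")),
            "name": m.group("name"),
            "rss_kb": int(m.group("rss")) * PAGE_KB,
            "swap_kb": int(m.group("swapents")) * PAGE_KB,
        })
    return rows


def top_consumers(rows, n=3):
    """Sort by resident + swapped memory, largest first."""
    return sorted(rows, key=lambda r: r["rss_kb"] + r["swap_kb"], reverse=True)[:n]
```

Run against the table above, this puts PIDs 2493 and 2492 (both bro) at the top, well ahead of every other process.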

I think I'll try the 2.6 beta and see if that helps. If there's other info I can provide, just let me know. Thank you.

James

Various thoughts:

* This is the first I've heard of trouble directly related to 2.5.5 in
contrast to 2.5.4. If you have references to others reporting something
similar, please point me at them as it may help with correlating/diagnosing.

* For any crashes, forwarding stack traces to reports@bro.org would help.

* For OOM, a first sanity check is to make sure reporter.log isn't
showing any scripting errors. E.g. uninitialized record field access is
known to leak memory, but it's also an underlying scripting mistake
that needs to be fixed.

* Similarly, remember that memory utilization is affected by scripting
logic. If you use any custom or external scripts/packages that are
not conservative with how they track state over time, that's always a
possible source of OOM problems that's independent of Bro version. So
a question would be whether you are comparing the same configuration
between 2.5.4 and 2.5.5, or whether some scripts/packages were different.
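
The reporter.log check suggested above can be sketched in Python. This is a rough sketch, not an official tool. It assumes the log is in Bro's default tab-separated format, with a "#fields" header line naming the columns and with "level" and "message" among them. The function names are hypothetical.

```python
from collections import Counter


def summarize_reporter_log(lines):
    """Count (level, message) pairs in a reporter.log passed as an iterable of lines."""
    fields = None
    counts = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the "#fields" token, tab-separated.
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#") or fields is None:
            continue  # other header/footer lines, or data before any header
        record = dict(zip(fields, line.split("\t")))
        counts[(record.get("level", "?"), record.get("message", "?"))] += 1
    return counts


def errors_only(counts):
    """Keep only entries whose level ends in ERROR (e.g. Reporter::ERROR)."""
    return {key: n for key, n in counts.items() if key[0].endswith("ERROR")}
```

To use it, open the current reporter.log from your own logs directory (the path depends on the install), pass the file object to summarize_reporter_log, and print the result of errors_only. A message that repeats many times is the first place to look for a script-level leak.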

- Jon

Thanks Jon. I'll do my research and report my findings.

James