Simple way to get a combined unique IP list from an arbitrary date range

Help with this would be greatly appreciated. I am trying to figure out a simple way to get a combined list of unique IP addresses from an arbitrary date range. I want the unique IP addresses from the conn.log fields id.orig_h and id.resp_h as a single list. Answering questions like "give me the unique IPs from the past 7/14/30/60/90 days" is quite tedious to do by hand.

I can do it manually, as in the example below, using a temp file for the working data.



This should do it:

zcat 2016-01-0{1..5}/conn.* | bro-cut id.orig_h id.resp_h -F $'\n' | sort | uniq -c | sort -n > /tmp/alluniqip.txt
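For "the past N days" questions, the list of day directories can be generated rather than typed. Here is a minimal sketch, assuming the logs live in one directory per day named YYYY-MM-DD as in the command above. The function name day_dirs is made up for illustration.

```python
from datetime import date, timedelta


def day_dirs(days, end=None):
    """Return YYYY-MM-DD directory names for the `days` days ending at `end`.

    `end` defaults to today; the result is ordered oldest first.
    Assumes one log directory per day named by ISO date (hypothetical layout).
    """
    if end is None:
        end = date.today()
    return [(end - timedelta(days=n)).isoformat()
            for n in range(days - 1, -1, -1)]
```

Joining the result with "/conn.*" gives the glob patterns to hand to zcat for a 7-, 14-, or 90-day window.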

If you're going to be doing that a lot, it would make sense to process each day individually (keeping each day's output sorted by IP). Reporting on a date range would then just involve a k-way merge across the per-day files.
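A rough sketch of that k-way merge, assuming each day has already been reduced to a file with one unique IP per line, sorted in byte order (for example with LC_ALL=C sort -u, so the order matches Python's string comparison). The per-day file names are whatever you pass on the command line, and the function names are made up for illustration.

```python
import heapq
import sys
from itertools import groupby


def merge_unique(sorted_streams):
    """Yield each distinct item once from several already-sorted iterables."""
    merged = heapq.merge(*sorted_streams)  # lazy k-way merge
    for item, _ in groupby(merged):        # collapse adjacent duplicates
        yield item


def merge_files(paths):
    """Merge per-day sorted IP files into one sorted, de-duplicated stream."""
    files = [open(p) for p in paths]
    try:
        # Strip newlines so a final line without one still compares equal.
        streams = [(line.rstrip("\n") for line in f) for f in files]
        for ip in merge_unique(streams):
            yield ip
    finally:
        for f in files:
            f.close()


if __name__ == "__main__":
    for ip in merge_files(sys.argv[1:]):
        print(ip)
```

Because heapq.merge only holds one pending line per input file, memory use stays small no matter how many days are merged. The order is plain string order, not numeric IP order.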

I use this program as a replacement for sort | uniq -c | sort -n. As long as you have the memory, it ends up being a lot faster:

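#!/usr/bin/env python3
# Count identical lines on stdin and print "count line", least frequent first.
import sys
from collections import defaultdict

c = defaultdict(int)

# Each line keeps its trailing newline, which is reused on output.
for line in sys.stdin:
    c[line] += 1

# Sort the (line, count) pairs by count, ascending.
top = sorted(c.items(), key=lambda kv: kv[1])
for k, v in top:
    print(v, k, end="")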