Hi all,
I’ve set up a Bro instance to test URL extraction from SMTP, using the smtp-embedded-url-bloom.bro script. For the most part the extraction and logging work, but I often find that the logged host and URL are truncated. For example, one email had 20 links extracted, but one of its log entries had the host name “award” and the URL “http://award”. The remaining URLs for that email look to be extracted correctly.
Has anyone else noticed this issue?
Thanks,
Steve
Yep…I suspect quoted-printable emails fall victim to this:
https://en.wikipedia.org/wiki/Quoted-printable
James
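[Illustration] A minimal Python sketch of how quoted-printable soft line breaks could cut a URL short if a pattern were applied to the still-encoded body. The sample text, the example.com host and the URL pattern are all made up for illustration; whether this is what happens inside the Bro script depends on whether the body has already been decoded by the time the script sees it.

```python
import quopri
import re

# Hypothetical URL pattern, not the one used by the Bro script.
URL_RE = re.compile(rb'https?://[^\s<>"]+')

# Hand-written quoted-printable text: "=" at the end of a line is a soft
# line break, and "=3D" is an encoded "=" character.
encoded = b"Claim your prize at http://award=\n.example.com/claim?id=3D12345 today\n"

# Matching against the encoded text stops at the soft line break.
raw_urls = URL_RE.findall(encoded)

# Decoding first removes the soft break and restores the full URL.
decoded = quopri.decodestring(encoded)
decoded_urls = URL_RE.findall(decoded)

print(raw_urls)
print(decoded_urls)
```

In this sketch the match on the encoded text ends at the soft line break, while the match on the decoded text covers the whole URL.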
Hello James,
Yes, that was caused in a very early version of the script because of using mime_segment_data instead of mime_all_data.
You should try this:
- event mime_segment_data(c: connection, length: count, data: string) &priority=-5
+ event mime_all_data(c: connection, length: count, data: string) &priority=-5
Or try this policy:
https://github.com/initconf/smtp-analysis/blob/master/smtp-embedded-url-bloom.bro
Aashish
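[Illustration] A rough Python sketch of why matching URLs one segment at a time can log a truncated URL, while matching against the whole body does not. This is not the Bro script: the body text, the URL pattern and the 16-byte segment size are assumptions chosen only to force a split in the middle of a URL.

```python
import re

# Hypothetical URL pattern, not the one used by the Bro script.
URL_RE = re.compile(r'https?://[^\s<>"]+')

def split_into_segments(data, size):
    """Cut the body into fixed-size pieces, the way a segment event would see it."""
    return [data[i:i + size] for i in range(0, len(data), size)]

body = "See http://award.example.com/winner and http://news.example.org/today for details."

# The segment size is an assumption picked so a boundary falls inside a URL.
segments = split_into_segments(body, 16)

# Per-segment matching: a URL that straddles a boundary is cut short.
per_segment_urls = [url for seg in segments for url in URL_RE.findall(seg)]

# Whole-body matching: every URL is seen in one piece.
whole_body_urls = URL_RE.findall(body)

print(per_segment_urls)
print(whole_body_urls)
```

The first segment here ends right after "http://award", so the per-segment pass reports that fragment as a URL, which resembles the truncated entry described at the start of the thread.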
Thank you Aashish…that’s awesome!
James
Unfortunately I get this when running the latest version:
1459456959.248537 expression error in /usr/local/bro/share/bro/site/smtp-embedded-url-bloom.bro, line 156: field value missing [SMTPurl::c$smtp$from]
Thank you.
James
Ah! I see the entries in reporter.log.
I have uploaded a revised version. This should fix the issue.
Please try this
https://github.com/initconf/smtp-analysis/blob/master/smtp-embedded-url-cluster.bro
Also note:
SMTP_Link_in_EMAIL_Clicked will only partially work in the cluster setup with this policy.
I have a clusterized version of this policy but I am not entirely satisfied with it. It syncs extracted URLs across the nodes, so the check runs against all HTTP traffic rather than just the traffic on the node which saw the SMTP connection. However, there are a few corner cases I need to address.
Aashish