Detecting software components that do strange DNS queries

C_L_Martinez · March 20, 2013, 7:25am

Hi all,

Is it possible to detect which software components are making "strange"
queries? For example, in our network, we detected queries to
"abnormal" domains like these:

1363608064.778525|VmUnpNRkiF5|192.168.65.160|2933|10.196.0.67|53|udp|54891|gqtpngnqt.com|1|C_INTERNET|1|A|-|-|F|F|T|F|0|-|-
1363608064.792823|JT4SuPtIQ3k|192.168.65.160|2940|10.196.0.67|53|udp|3431|wvxzfmyw.cc|1|C_INTERNET|1|A|-|-|F|F|T|F|0|-|-
1363608064.794325|tYWZyjP18fd|192.168.65.160|2941|10.196.0.67|53|udp|15204|shlghhw.org|1|C_INTERNET|1|A|-|-|F|F|T|F|0|-|-
1363608079.436835|TO6u5Zqbx1|192.168.65.160|2962|10.196.0.67|53|udp|50810|xqqkwjqdbhh.ws|1|C_INTERNET|1|A|0|NOERROR|F|F|T|T|0|149.20.56.32,149.20.56.33,149.20.56.34|6024.000000,6024.000000,6024.000000

... and a lot more.

Any ideas on how to accomplish this?

lysemose · March 20, 2013, 8:03am

Hi

Maybe this could help you…
http://code.google.com/p/security-onion/wiki/DNSAnomalyDetection

/Lysemose

Mike_Sconzo · March 20, 2013, 12:41pm

Are you asking from a host perspective (now that you've seen this
traffic on a network, what is causing it on the host) or from a
network perspective (how do I find suspicious queries like these in
network traffic)?

-=Mike

Tritium_Cat · March 21, 2013, 8:36pm

Character frequency analysis.

C_L_Martinez · March 22, 2013, 7:32am

Do you mean http://arxiv.org/pdf/1004.4358?

lysemose · March 22, 2013, 7:47am

I saw this the other day on Twitter, https://github.com/sethhall/bro-domain-generation, but that still doesn’t answer your original question.

/Lysemose

Vlad_Grigorescu2 · March 22, 2013, 2:06pm

You can do character frequency analysis with a simple Bro script. Look at <http://www.bro.org/documentation-git/scripts/base/strings.bif.html> to see the functions you can use for strings.
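To make the idea concrete outside of Bro, here is a minimal Python sketch of character frequency analysis on a single name. It is only an illustration, not a Bro script: the function names `char_frequencies` and `vowel_ratio` are hypothetical, and counting every character of the full name (dots and TLD included) is an assumption made to keep the example short.

```python
from collections import Counter

VOWELS = set("aeiou")

def char_frequencies(name: str) -> dict:
    """Return the relative frequency of each character in the name."""
    name = name.lower()
    if not name:
        return {}
    counts = Counter(name)
    total = len(name)
    return {ch: n / total for ch, n in counts.items()}

def vowel_ratio(name: str) -> float:
    """Share of alphabetic characters that are vowels (0.0 if none)."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in VOWELS for ch in letters) / len(letters)

if __name__ == "__main__":
    for domain in ("google.com", "gqtpngnqt.com"):
        print(domain, round(vowel_ratio(domain), 3), char_frequencies(domain))
```

A real detector would compare these frequencies against a baseline built from known-good names, which this sketch does not attempt.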

I think that this is asking the wrong question, however. I'd be amazed if you could reliably distinguish "good" domains from "bad" domains based simply on character frequency analysis. Bro can calculate entropy for you: <http://www.bro.org/documentation/scripts/base/bro.bif.html#id-find_entropy>. That being said, I don't think entropy is the right answer either.

Here are the entropy results (in no particular order) for the 4 domains you listed and for 4 very common domains (google.com, twitter.com, fbcdn.net and amazon.co.uk):

[entropy=2.646439, chi_square=450.8, mean=100.2, monte_carlo_pi=4.0, serial_correlation=0.096875]
[entropy=3.085055, chi_square=400.538462, mean=104.692308, monte_carlo_pi=4.0, serial_correlation=-0.005991]
[entropy=3.095795, chi_square=338.090909, mean=106.727273, monte_carlo_pi=4.0, serial_correlation=0.062381]
[entropy=3.027169, chi_square=384.636364, mean=104.727273, monte_carlo_pi=4.0, serial_correlation=0.011643]
[entropy=3.182006, chi_square=424.857143, mean=105.5, monte_carlo_pi=4.0, serial_correlation=-0.050923]
[entropy=2.947703, chi_square=303.888889, mean=98.0, monte_carlo_pi=4.0, serial_correlation=-0.316796]
[entropy=3.084963, chi_square=372.0, mean=97.666667, monte_carlo_pi=4.0, serial_correlation=-0.248104]
[entropy=2.845351, chi_square=431.181818, mean=102.818182, monte_carlo_pi=4.0, serial_correlation=-0.322755]

I don't know about you, but I can't tell which are good and which are bad. I suspect that DNS names are too short a sample to provide any meaningful data.
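For anyone who wants to reproduce the entropy column without Bro, the following is a minimal Python sketch of Shannon entropy over the characters of a name, in bits per character. It is not the Bro implementation, and the function name `shannon_entropy` is hypothetical. For google.com it gives about 2.646439, which matches one of the values listed above.

```python
import math
from collections import Counter

def shannon_entropy(name: str) -> float:
    """Shannon entropy of the characters in name, in bits per character."""
    if not name:
        return 0.0
    counts = Counter(name)
    total = len(name)
    # Sum of -p * log2(p) over the observed character probabilities.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

if __name__ == "__main__":
    for domain in ("google.com", "twitter.com", "gqtpngnqt.com", "wvxzfmyw.cc"):
        print(f"{domain}: {shannon_entropy(domain):.6f}")
```

With names of ten or so characters, only a handful of distinct values are possible, which is consistent with the point about sample size.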

I think you should focus instead on the behavior that you're trying to detect. Looking at your example, some alerts that'd be more useful might be:

- Too many NXDOMAIN queries.
- A query that resolves to an ISC sinkhole.
- Queries for a domain that no one else queried.
- Repetitive queries every X seconds with little to no deviation.
- Queries for a domain that you haven't seen before.
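
Some of these checks can be sketched offline against the log itself. The Python below is a rough illustration under several assumptions, not a Bro script: the field positions are inferred from the pipe-separated sample lines earlier in this thread, the sinkhole addresses are simply the three answers seen in that sample, and the names `parse_line`, `summarize`, and `nxdomain_threshold` are hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: field positions inferred from the sample lines in this thread.
TS, CLIENT, QUERY, RCODE_NAME, ANSWERS = 0, 2, 8, 14, 20

# Assumption: the answers in the one resolved sample query are sinkhole IPs.
SINKHOLE_IPS = {"149.20.56.32", "149.20.56.33", "149.20.56.34"}

def parse_line(line: str) -> dict:
    """Split one pipe-separated DNS log line into the fields used here."""
    fields = line.rstrip("\n").split("|")
    answers = [] if fields[ANSWERS] == "-" else fields[ANSWERS].split(",")
    return {
        "ts": float(fields[TS]),
        "client": fields[CLIENT],
        "query": fields[QUERY],
        "rcode": fields[RCODE_NAME],
        "answers": answers,
    }

def summarize(lines, nxdomain_threshold=10) -> dict:
    """Collect clients with many NXDOMAINs, sinkholed queries, and
    domains that only a single client asked for."""
    nxdomain_counts = Counter()
    clients_per_domain = defaultdict(set)
    sinkholed = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        rec = parse_line(line)
        clients_per_domain[rec["query"]].add(rec["client"])
        if rec["rcode"] == "NXDOMAIN":
            nxdomain_counts[rec["client"]] += 1
        if SINKHOLE_IPS.intersection(rec["answers"]):
            sinkholed.append((rec["client"], rec["query"]))
    return {
        "noisy_clients": sorted(
            c for c, n in nxdomain_counts.items() if n >= nxdomain_threshold
        ),
        "sinkholed": sinkholed,
        "single_client_domains": sorted(
            d for d, clients in clients_per_domain.items() if len(clients) == 1
        ),
    }
```

Note that the unanswered queries in the sample show "-" rather than NXDOMAIN in the response code field, so the NXDOMAIN count only helps where the response was actually logged. The single-client check is also only meaningful over a log that covers many hosts.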

Hope this helps,

--Vlad

C_L_Martinez · March 22, 2013, 3:06pm

Many, many thanks, Vlad, for your explanation. I'll think about it this weekend.

Tritium_Cat · March 23, 2013, 1:53am

Yes, thanks for the example and detail.

CFA was the first thing that crossed my mind, so I googled it and found the arXiv paper. It sounds promising to me, but I can see your point about the length.

While searching for supporting information I found old Google Code and GitHub projects with some code inspired by the paper. It appears someone forked the original project but abandoned it after updating the README file.

Readme: https://code.google.com/p/dnapy/

Code: https://github.com/gourryinverse/dnapy
