Hi guys,
I just upgraded Zeek on my test server from version 6.0.4 to 6.0.6 without any change to the Zeek configuration file. I noticed that dns.log no longer appears in the directory with the current logs.
Tried version 6.0.8 instead. Same behaviour. Switched back to 6.0.4 and now the dns.log is there.
I read through all the release notes starting with 6.0.5 but didn’t see anything that could explain this behaviour. Build instructions used for all versions were the same. There are no errors in stdout.log or stderr.log, and nothing in the other logs either.
I’m using Docker containers, so it is extremely easy to switch between versions. Letting it run for a longer period is not necessary, because this test server gets the same data feed as one of the production servers and receives around 600k DNS requests/hour.
I’ll go through the git logs to see if anything changed in the docker build files and will also build another 6.0.4 container using the current build files just to find out what I can reproduce, or not.
A few remarkable things, besides there being no dns.log:
there is a lot of DNS traffic (UDP port 53) in the conn.log, but the “service”:dns field is missing;
other mentions such as service:http are present;
in the known_services.log there is no mention of any DNS service, unlike on other installations.
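For anyone wanting to check the same thing on their own sensor, here is a rough Python sketch that counts the UDP port 53 entries in conn.log by their service value. It assumes the logs are written as JSON, one record per line, and uses the standard conn.log field names proto, id.resp_p and service. The function name summarize_port53 and the sample records are made up for illustration.

```python
import json
from collections import Counter


def summarize_port53(lines):
    """Count UDP/53 conn.log records, grouped by their 'service' value.

    Assumes JSON-formatted Zeek logs (one object per line); records
    without a 'service' field are counted under '<missing>'.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and any header-style lines
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON, ignore rather than crash
        if rec.get("proto") == "udp" and rec.get("id.resp_p") == 53:
            counts[rec.get("service", "<missing>")] += 1
    return counts


# Made-up sample records, only to show the shape of the input.
sample = [
    '{"proto": "udp", "id.resp_p": 53, "service": "dns"}',
    '{"proto": "udp", "id.resp_p": 53}',
    '{"proto": "tcp", "id.resp_p": 80, "service": "http"}',
]
print(summarize_port53(sample))
```

On a real sensor you would pass an open file instead of the sample list, e.g. `with open("conn.log") as f: print(summarize_port53(f))`, where the path is whatever your current log directory is. A large '<missing>' count next to little or no 'dns' would match what I describe above.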
I recompiled 6.0.4 in a container using our current Dockerfiles and unfortunately got the same behaviour, so the problem must be somewhere else in the setup: a library, an installed package, etc.
I’m now trying to recreate a brand new container using version 6.0.4 stripped down to roughly the same instructions as available inside the docker/ source directory. I took this as a starting point earlier and there aren’t many differences but I want to start as clean as possible.
Hmm, a packet checksum issue? Do things change if ignore_checksums is set to T? Maybe some ethtool settings that enable/disable checksum offloading have changed?
I reran all steps from scratch. I now have a running 6.0.4 inside a container with the same compilation options as I have in production. So far so good.
Next step is to add the Zeek packages we normally use, one by one, to see which of them causes this weird problem.
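The one-by-one approach could be sketched roughly like this in Python. Everything here is hypothetical: first_breaking_package is just a name for the loop, and dns_log_appears stands for whatever check you do by hand (rebuild the container with the given packages, let it run briefly, see whether dns.log shows up). The package names are placeholders, not real zkg packages.

```python
def first_breaking_package(packages, dns_log_appears):
    """Add packages one at a time and return the first one after which
    the check fails, or None if the check passes with all of them.

    dns_log_appears is a caller-supplied function (hypothetical here)
    that takes the list of installed packages and returns True when
    dns.log is still being written.
    """
    installed = []
    for pkg in packages:
        installed.append(pkg)
        if not dns_log_appears(list(installed)):
            return pkg
    return None


# Stand-in check for illustration: pretend "pkg-b" breaks dns.log.
def fake_check(installed):
    return "pkg-b" not in installed


print(first_breaking_package(["pkg-a", "pkg-b", "pkg-c"], fake_check))
```

Adding packages cumulatively rather than testing each one alone also catches the case where the problem only shows up with a combination of packages, at the cost of one rebuild per package.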
Thinking it through, my guess is that spicy-dns might be the problem. I tried to install it separately instead of via spicy-analyzers (zkg install --force zeek/spicy-dns), but that fails. The only message I get is “error: failed to run tests for zeek/zeek/spicy-dns: test_command failed with exit code 1”.
Haven’t had the time to figure out exactly why. Everything runs inside containers so it makes debugging a bit harder.