Updating testing/external

I'm going to do some work on the testing/external infrastructure,
which means the testing repositories may not be available for a bit
(and will likely need a fresh clone afterwards). This is to remove the
big trace files from the repositories; wasn't a good idea to store
them in git.

Robin

Done with this. I've recreated the bro-testing repository so if you
have checked out the old one into testing/external, first remove it
and then follow the (updated) README again. It will now pull the
traces via curl.

Robin

One problem I'm running into is that different libmagic setups
classify data in different ways. For example, I see a number of HTTP
entities classified as text/html on one machine yet as text/plain on
another.

Not sure how to deal with that for test baselines. I'm thinking of
preprocessing the logs before diffing so that they just carry a boolean
flag indicating whether there is a mime type at all, which otherwise
makes the comparison oblivious to the actual value.
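
A rough sketch of what such a preprocessing step could look like, in
Python. It assumes tab-separated logs with a "#fields" header line, "-"
for an unset value, and a column named "mime_type"; the function name
flag_mime_types is made up for illustration, so adjust all of that to
the actual logs.

```python
def flag_mime_types(lines, field="mime_type", unset="-"):
    """Replace the MIME type column with T/F depending on whether it is set.

    Assumes a tab-separated log whose "#fields" header names the columns
    and which uses "-" for unset values (both are assumptions).
    """
    index = None
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the "#fields" tag, so names[i] labels
            # data column i.
            names = line.split("\t")[1:]
            index = names.index(field) if field in names else None
        elif not line.startswith("#") and index is not None:
            cols = line.split("\t")
            if index < len(cols):
                cols[index] = "F" if cols[index] == unset else "T"
            line = "\t".join(cols)
        out.append(line)
    return out
```

Running both the baseline and the new log through this before diffing
would hide a text/html vs. text/plain disagreement, while a type that
appears or disappears altogether would still show up.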

Better ideas?

Robin

Jon handled this in several places; are there some he missed? Which tests are you having trouble with?

Part of the general file analysis work will be to stop relying on libmagic for file type identification entirely; it's way too annoying.

  .Seth

The tests running on traces in external/*. What's the trick to make
them ignore the differences?

Robin

> Jon handled this in several places; are there some he missed? Which tests are you having trouble with?

Could be; I just tried to make it work for some of the testing/btest unit tests that I caught using it, but I didn't try to address the problem for testing/external.

The way I did it for the testing/btest tests was to use filters: either filter mime types out completely if the test doesn't depend on them, or, if it does, normalize them to some constant dummy value. That approach might not scale well as a general solution for testing/external.
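
To illustrate the normalizing variant, here is a minimal Python sketch. It is not the actual filter used in testing/btest; the regular expression, the function name normalize_mime_types, and the dummy value FAKE_MIME are all assumptions made for this example.

```python
import re

# Matches strings shaped like "type/subtype" for the common top-level
# MIME types. This is a heuristic: it would also rewrite something like
# "text/foo" inside a file path.
MIME_RE = re.compile(
    r"\b(?:application|audio|image|message|multipart|text|video)"
    r"/[A-Za-z0-9][A-Za-z0-9.+-]*"
)

def normalize_mime_types(text, dummy="FAKE_MIME"):
    """Replace every MIME-type-looking token with a constant dummy value."""
    return MIME_RE.sub(dummy, text)
```

Applied to both the baseline and the fresh output, this keeps the presence of a mime type visible in the diff while hiding which type libmagic picked. Because it pattern-matches on the value rather than on a named column, it needs no knowledge of the log layout, but it can also hit unrelated fields that happen to look like a mime type.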

- Jon

> The tests running on traces in external/*. What's the trick to make
> them ignore the differences?

testing/btest/scripts/base/protocols/irc/dcc-extract.test has an example of what I did to normalize mime type for unit tests. Maybe it's easy enough to brute-force the same filtering approach for now if the number of logs/fields that depend on libmagic is small.

- Jon