It would be extremely difficult to compare IDS systems and here are a couple of reasons why.
What does it mean to compare IDS systems? Would you compare performance? Sure, that can be measured, but it is so heavily ruleset-dependent that the comparison rarely makes sense.
Detection accuracy? All of them basically do the same job - reassemble a number of packets into a stream, compare that stream against a huge set of signatures, flag matches. If one of those components does not work correctly, for example IDS X reassembles TCP flows correctly only 99% of the time, then it is either a bug in the engine, a problem with the capture, or a performance problem.
Then comes the comparison of “how many alerts will IDS X generate vs IDS Y vs Z for the same input”. Again, you’re not really comparing IDSes, but their rulesets. Some rulesets can be used by multiple engines - Emerging Threats, for example, has versions for both Snort and Suricata. Some can’t - like the commercial Palo Alto Networks ones.
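To make that point concrete, here is a toy sketch in Python. It is not how any real IDS works: the "engine" is a plain substring match, and the traffic, rule names and patterns are all invented for illustration. The only thing it shows is that the same engine on the same input produces different alert counts as soon as the ruleset changes.

```python
from typing import Dict, Iterable


def count_alerts(payloads: Iterable[bytes], ruleset: Dict[str, bytes]) -> int:
    """Toy 'engine': one alert per (payload, rule) pair whose pattern occurs in the payload."""
    return sum(
        1
        for payload in payloads
        for pattern in ruleset.values()
        if pattern in payload
    )


# Invented sample input: the same traffic is fed to the engine both times.
traffic = [
    b"GET /index.html HTTP/1.1",
    b"GET /admin.php?cmd=whoami HTTP/1.1",
    b"POST /login HTTP/1.1",
]

# Two hypothetical rulesets for the very same engine.
ruleset_a = {"suspicious-cmd-param": b"cmd="}
ruleset_b = {
    "suspicious-cmd-param": b"cmd=",
    "admin-page-access": b"/admin",
    "any-post": b"POST ",
}

alerts_a = count_alerts(traffic, ruleset_a)
alerts_b = count_alerts(traffic, ruleset_b)
print(alerts_a, alerts_b)  # prints: 1 3
```

The engine did not get better or worse between the two runs. Only the rules changed.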
Finally, you have the “neutral” engines, like Zeek (or Suricata without any rules and with flow and protocol logging enabled in eve-json). They do not tell you what is good and what is bad - because that’s up to you. They merely tell you that a connection happened in the past: it went from A to B, N bytes and packets were sent, M bytes and packets were received, it took 5 minutes and it was SSL.
With Zeek or Suricata you can also have protocol analysis done, so for our SSL connection you will see the SNI, the negotiated ciphersuites, X509 certificate details and so on.
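As a rough illustration of what such a "neutral" record looks like, here is a small Python sketch that turns one flow event into the kind of sentence used above. The sample line is invented and only shaped like a Suricata EVE flow event. The field names are the ones I would expect in eve.json, so treat them as an assumption and check them against the output of your own version.

```python
import json
from datetime import datetime
from typing import Optional

# Timestamp layout assumed for the sample below (e.g. 2024-01-01T12:00:00.000000+0000).
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# Hypothetical sample line shaped like an EVE "flow" event; all values are made up.
sample_line = json.dumps({
    "timestamp": "2024-01-01T12:05:00.000000+0000",
    "event_type": "flow",
    "src_ip": "10.0.0.5", "src_port": 51514,
    "dest_ip": "192.0.2.10", "dest_port": 443,
    "proto": "TCP", "app_proto": "tls",
    "flow": {
        "pkts_toserver": 120, "pkts_toclient": 200,
        "bytes_toserver": 15000, "bytes_toclient": 480000,
        "start": "2024-01-01T12:00:00.000000+0000",
        "end": "2024-01-01T12:05:00.000000+0000",
    },
})


def describe_flow(line: str) -> Optional[str]:
    """Summarise one flow event; return None for any other event type."""
    event = json.loads(line)
    if event.get("event_type") != "flow":
        return None
    flow = event["flow"]
    start = datetime.strptime(flow["start"], TS_FORMAT)
    end = datetime.strptime(flow["end"], TS_FORMAT)
    duration = (end - start).total_seconds()
    return (
        f"{event['src_ip']} -> {event['dest_ip']}:{event['dest_port']} "
        f"{event.get('app_proto', 'unknown')}: "
        f"sent {flow['bytes_toserver']} B / {flow['pkts_toserver']} pkts, "
        f"received {flow['bytes_toclient']} B / {flow['pkts_toclient']} pkts, "
        f"{duration:.0f} s"
    )


print(describe_flow(sample_line))
```

Note that nothing in the record says "malicious" or "benign". It is just a statement of what happened.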
At this point nothing is technically good or bad. Now, you as an analyst can feed Zeek with rules saying “connections to or from IP A are always bad” - and Zeek will let you know when those happen (or try to happen).
Or you can say “all SSL connections with a certificate with a serial number 12345 are bad” or flag a domain name in many places (not just the DNS traffic), calculate file hashes, analyze PE files, SMB and RPC sessions, etc.
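Here is a minimal sketch of that idea, done as post-processing in Python over JSON-style records rather than inside Zeek itself (Zeek has its own scripting language and an Intelligence Framework for this, which is what you would normally use). The records and indicator values are invented. The keys `id.orig_h`, `id.resp_h` and `certificate.serial` follow the names I would expect in Zeek's conn.log and x509.log when written as JSON, so take them as an assumption.

```python
from typing import Dict, List

# Hypothetical indicators supplied by the analyst.
BAD_IPS = {"203.0.113.7"}
BAD_CERT_SERIALS = {"12345"}


def conn_is_bad(record: Dict[str, object]) -> bool:
    """Flag a connection if either endpoint is on the analyst's IP list."""
    return record.get("id.orig_h") in BAD_IPS or record.get("id.resp_h") in BAD_IPS


def cert_is_bad(record: Dict[str, object]) -> bool:
    """Flag a certificate whose serial number is on the analyst's list."""
    return record.get("certificate.serial") in BAD_CERT_SERIALS


# Invented records, shaped like JSON conn.log and x509.log entries.
conn_records: List[Dict[str, object]] = [
    {"uid": "C1", "id.orig_h": "10.0.0.5", "id.resp_h": "192.0.2.10", "service": "ssl"},
    {"uid": "C2", "id.orig_h": "10.0.0.6", "id.resp_h": "203.0.113.7", "service": "ssl"},
    {"uid": "C3", "id.orig_h": "203.0.113.7", "id.resp_h": "10.0.0.5", "service": "http"},
]
x509_records: List[Dict[str, object]] = [
    {"certificate.serial": "0A1B2C", "certificate.subject": "CN=example.org"},
    {"certificate.serial": "12345", "certificate.subject": "CN=bad.example"},
]

bad_conn_uids = [r["uid"] for r in conn_records if conn_is_bad(r)]
bad_cert_subjects = [r["certificate.subject"] for r in x509_records if cert_is_bad(r)]
print(bad_conn_uids)      # prints: ['C2', 'C3']
print(bad_cert_subjects)  # prints: ['CN=bad.example']
```

The logs themselves stay neutral. The verdict comes entirely from the lists you supply.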
An NSM tool like Zeek is basically a giant time machine + a matching engine + an engine that can do almost arbitrary operations on network flows. It’s up to you to program it.
And that’s why I think it cannot be compared with IDSes like Snort (purely rule based) or Suricata (a combination of a traditional IDS with NSM functionality).
Comparing Snort vs Suricata doesn’t make sense either - because you would be comparing rulesets, not engines.