changing Notice::policy mechanism

I've discussed changing the Notice::policy notice handling mechanism with a few people and I think that generally everyone agrees that the current mechanism, while very powerful, sucks from a usability perspective. This is a good thing to address now since we're either on the precipice of repeating a mistake to more frameworks or adapting to a style that is easier to understand. I'm going to give an example of how I think it could work in an evented model now. Please shoot holes. :slight_smile:

These implement the decidedly silly case of sending a notice to an email if the SSH connection originated locally.

event Notice::policy(n: Notice::Info)
  {
  if ( n$note == SSH::Login &&
       Site::is_local_host(n$id$orig_h) )
    {
    add n$actions[Notice::ACTION_EMAIL];
    }
  }

That would be replacement to the current model of this:

redef Notice::policy += {
  [$pred(n: Notice::Info) = {
    return ( n$note == SSH::Login && Site::is_local_host(n$id$orig_h) );
   },
   $action = Notice::ACTION_EMAIL],
};

Here are some random thoughts about these two approaches in no particular order:

- The evented model (top one) is more Bro-y and easier which is a BIG plus.

- The PolicyItem model (bottom one) has the ability to halt further processing with the $halt attribute of PolicyItems. I don't think I'm convinced that this is a huge issue.

- The evented model has latency from the event queue, but I don't think this is a huge issue. The latency is normally ok. Jon, is it an issue for the file analysis framework? I don't remember. The actions being applied would be processed through an event queue too so they will be processed after the policy events anyway.

- Code block prioritization is built into the evented model using the &priority attribute. It's specifically implemented for PolicyItem model.

Any thoughts?

  .Seth

I like this idea (it seems MUCH easier to use). How would
"suppress_for" be implemented in the evented model?

- The PolicyItem model (bottom one) has the ability to halt further processing with the $halt attribute of PolicyItems. I don't think I'm convinced that this is a huge issue.

I think the same thing could be done with the event-model -- if Notice::Info just had a boolean field that each event handler body could check and then bail out early if necessary. And that field could even be something left up to a user to define in a redef.

- The evented model has latency from the event queue, but I don't think this is a huge issue. The latency is normally ok. Jon, is it an issue for the file analysis framework?

I think the latency is a problem in that case since immediate feedback is helpful if certain actions are to work without buffering incoming file data. E.g. if the only thing I want to happen is for files to be extracted to disk, then I wouldn't expect that to require buffering.

It may be workable, but I think the latency does make things more complicated so it might not make sense to re-use this policy model for the file analysis framework, but that shouldn't be a showstopper for using it for notices. I think it's fine to have the policy models be inconsistent, as long as both are easy to understand and use for their purpose.

    Jon

Suppression is implemented at a pretty low level in the notice framework. That Notice::policy event would never be generated for suppressed notices (just like the notice policy variable currently isn't evaluated for suppressed notices).

  .Seth

As we discussed earlier and I never brought up in my overview, would it be better if we had some mechanism sitting between events and functions? "Synchronous events"? "Multi-bodied functions"? Here's an example of how a synchronous event might look at the script land…

global awesome: event(a: count) &synchronous;
global foobar: event(a: count);

event bro_init()
  {
  event foobar(1);
  event awesome(5);
  }

event awesome(a: count) &priority=5
  {
  print fmt("awesome: %d", a+5);
  }

event awesome(a: count) &priority=2
  {
  print fmt("awesome: %d", a+2);
  }

event foobar(a: count)
  {
  print "foobar handler!";
  }

Assuming synchronous events existed, the output would be this:

awesome: 10
awesome: 7
foobar handler!

I would really kind of like to move away from this policy model. The more I think about it, the less Bro-y it feels and I have trouble remembering how exactly to use them too (which I don't think bodes well for anyone!). Focusing on simplicity and consistency at the script layer is something that we'll probably have to continue to struggle and fight for forever. Although, I'm generally ok with complexity anywhere in Bro as long as we find a clear and obvious way to expose it to users.

  .Seth

Aaand, to reply to myself… Now that I implemented that sample code, I don't like it. Events presume the event queue. If we bypass that, it just seems bad and inconsistent. Maybe we just need a mechanism for multi-bodied functions? It would basically be functions that could be defined over and over, could have a &priority attribute applied to them, and would be enforced to have no return value.

global foobar: function(a: count) &multibody;

function foobar(a: count) &priority=5
  {
  print "foobar at priority 5";
  }

function foobar(a: count) &priority=8
  {
  print "foobar at priority 8";
  }

print foobar(1); // <-- Maybe it could return a count to indicate number of bodies executed?

Sample output

One more reply to myself. :slight_smile:

Dammit.

There would be some work that would need to be done to deal with edge cases brought about by smashing scopes together like this, but I think it should be possible to watch for locals being created and making sure that there aren't any conflicting names (Bro should probably not start up if there are conflicting names).

Actually, nevermind. I'm not fully thinking through the scope smashing. We could just create a scope for each of the function implementations…

function foobar(a: count) &priority=5
  {
  local test = "nope";
  print test;
  }

function foobar(a: count) &priority=8
  {
  local test = "yup";
  print test;
  }

Those would inline into this…

function foobar(a: count)
        {
         {
  local test = "nope";
  print test;
  }
  {
  local test = "yup";
  print test;
  }
  }

The only problem is that this doesn't currently work, but it begs the question… should it?

  .Seth

function foobar(a: count) &priority=5
  {
  local test = "nope";
  print test;
  }

function foobar(a: count) &priority=8
  {
  local test = "yup";
  print test;
  }

Adding the &multibody attribute to do this I think might work alright. And I think it's not much to implement it -- I just did some quick hacks as a proof of concept and so far it didn't look like a problem.

Those would inline into this…

function foobar(a: count)
       {
        {
  local test = "nope";
  print test;
  }
  {
  local test = "yup";
  print test;
  }
  }

The only problem is that this doesn't currently work, but it begs the question… should it?

That probably should work, but it's also not related to implementing the proposed &multibody attribute for functions since the bodies are already parsed separately with different scopes and can then be internally stored as separate bodies/scopes, there's no need to do the mashing together under a common scope -- just keep them separate and iterate over the bodies according to priority.

This code already exists because of the similarity of functions and events, it's just that functions are only allowed multiple body definitions if they have &redef, and in that case, the later definition *replaces* earlier ones instead of being pushed on to the vector of bodies.

    Jon

Ahh, that makes sense.

I actually thought about it a bit more and I'm still not completely attached to the idea. I sort of don't like how the documentation for that would look where we have to say that normally you shouldn't implement a function over and over, but in special cases it's just how something works.

I almost wonder if we should have another type of code block in addition to events and functions. It's really just a difference in semantics at this point, but it would be much easier to document if there was a completely different name. We could document the use and behavior of each of them very clearly with the third name.

  .Seth

Can't say I like the multi-body functions; it's an unusual concept and
they'd come with untuitive limitations (like the no return value
constraint). Or put differently, I believe what you call multi-body
functions here are actually just the synchronous events you proposed
first. :slight_smile:

Let me try to structure this a bit. The current policy approach gives
us four things I believe (let me know if I'm missing something):

    (1) Immediate execution to completion (that's the synchronous).

    (2) The ability for an item to prevent others from being considered.

    (3) Returning values back to the code running executing the policy
        (actions, suppress_for, etc.)

    (4) Items sorted by priority.

Events currently only provide (4), but I can see changing them to
do (1) and (2):

    - (1): That's the "synchronous" part. There's actually nothing
    internally that would prevent us from executing event handlers
    immediately, rather than queuing them first. In some sense, they
    would jump to the head of the queue and be processed immediately.
    We could indeed introduce a new attribute for such events, though
    I'm not sure I like &synchronous (but no better idea right now).

    This would vioalate the current FIFO order of events, but with
    explicit marking, I think that'd be ok.

    - (2): Jon suggested a boolean flag but I don't like that because
    now each handler would need check for it, which is easy to forget
    and generally repetetive. Another idea for this has been floating
    around already: allow event handlers to stop execution of further
    handlers of the same kind for an event. Something like:

        event foo()
        {
           [...]
           event stop; # No further handlers for this event.
        }

(3) is tricky though, not sure how to do that without return values?
Question is: can we do without? We don't need it for actions (they can
just turn into function calls), but what about suppress-for style
stuff. Seth, you said that suppress_for is not a problem either?

Robin

  - (1): That's the "synchronous" part. There's actually nothing
   internally that would prevent us from executing event handlers
   immediately, rather than queuing them first. In some sense, they
   would jump to the head of the queue and be processed immediately.
   We could indeed introduce a new attribute for such events, though
   I'm not sure I like &synchronous (but no better idea right now).

I don't like that attribute either.

       event foo()
       {
          [...]
          event stop; # No further handlers for this event.
       }

I actually don't like this. I like to be able to assume that all event handlers will execute and I can see users causing some major debugging headaches for us with it.

(3) is tricky though, not sure how to do that without return values?
Question is: can we do without? We don't need it for actions (they can
just turn into function calls), but what about suppress-for style
stuff. Seth, you said that suppress_for is not a problem either?

Correct, suppress_for isn't a problem and I don't think we need (3) anyway.

What about my proposal I sent out a little bit ago? Make a third type of code block with all of the characteristics that we want? We don't have to compromise on existing and good designs for functions or events that way.

  .Seth

(3) is tricky though, not sure how to do that without return values?
Question is: can we do without?

I think so. People could design the "return value" as something expected as a side effect of the call ? e.g. the value of a record field could get modified in the bodies and then inspected after.

    Jon

Continuing the discussion from the group call, I think that we're starting to converge on adding a "policy" keyword to the language to represent something between a function and an event. Jon and I chatted about it some more and we couldn't come to a more abstracted and still clear keyword than "policy". (I really like the policy keyword here)

As a concrete example of the existing policy mechanism in the notice framework and the reworked mechanism I'll give an equivalent implementation of both.

# New style under discussion.
policy Notice::trigger(n: Notice::Info) &priority=5
  {
  if (n$note == Whatever && Site::is_local_host(n$conn$id$orig_h)
    add n$actions[ACTION_BLAH];
  }

# Existing style
redef Notice::policy += {
  [$pred(n: Notice::Info) = {
     return (n$note == Whatever && Site::is_local_host(n$conn$id$orig_h);
   },
   $priority = 5,
   $action = ACTION_BLAH]
};

To address your various points about how this needs to work:

  1. Immediate policy handler execution. Not even by jumping to the beginning of the
     event queue and waiting for the queue to be flushed, just direct execution.

  2. Adding a "break"-like keyword to halt policy execution (only valid in policy handlers).

  3. No return value(s) when calling policy handlers. We'll reuse the mutability mechanism
     that I subversively began using all over the place by modifying records in place.
     Calling policy handlers would look like this:
         policy Notice::trigger(n);

  4. Policy handlers sorted by priority. Equivalent priority has undefined execution order,
     but I don't see that as a problem since it's already a concern for event handlers.

.Seth

Continuing the discussion from the group call, I think that we're starting
to converge on adding a "policy" keyword to the language to represent
something between a function and an event.

Hmmmm. This is sounding weird. I think it's telling that you're heading
towards introducing a concept that isn't (in terms of the name that comes
to mind for it) about anything relating to language semantics, but rather
to intended use, i.e., that this mechanism is for honing policy. That's
often a sign that it's time to stop & rethink things.

Let's start with what's deficient about the current style. The overarching
problem is it's hard to understand/explain, correct? How much of this is
the funky [$pred...] syntax, vs. that we've wound up adding a bunch of
separate hooks into the framework so it's hard to know which hook to use,
and/or it's hard to understand what a given collection of hooks does?

I feel like we need to better determine just what problem we want/need to
solve in order to then think about how to coherently provide better support
for it.

    Vern

Let's start with what's deficient about the current style. The overarching
problem is it's hard to understand/explain, correct?

Yep.

How much of this is
the funky [$pred...] syntax, vs. that we've wound up adding a bunch of
separate hooks into the framework so it's hard to know which hook to use,
and/or it's hard to understand what a given collection of hooks does?

Personally, it's mostly the syntax. There are also some cases where it's hard to express what you want in a single policy item too. Some of this could probably be fixed within the notice policy now, but then we'd essentially be turning the notice policy into a set of functions which starts to look suspiciously similar to events being executed immediately without the familiar syntax of events.

I feel like we need to better determine just what problem we want/need to
solve in order to then think about how to coherently provide better support
for it.

I suspect that the reason the notice policy mechanism was first created was due to need for immediately evaluated events. The event handling mechanism feels very natural in Bro after using it for a while and the existing notice policy ends up feeling weird because you have to do a lot of uncommon and unfamiliar syntax. I'm not completely tied to the "policy" keyword, but does it seem unreasonable to have something like this immediately executed event that we've been discussing?

As

I think I was done. I'm not sure what the stray "As" is there for. :slight_smile:

  .Seth

I'm not completely tied to the "policy" keyword, but does it seem
unreasonable to have something like this immediately executed event that
we've been discussing?

Yeah, it seems somewhat strange and confusing.

If we back up, it seems to me there are several things the framework
tries to achieve.

  (1) Associate predicates with different possible cases. This
      is the "n$note == Whatever" clause that's currently included
      in basically every policy element.

  (2) A predicate that applies logic when that case holds to yield
      a classification decision.

  (3) Priorities to decide which predicate "wins" when multiple ones
      of them match the case.

  (4) The ability for an early winner to short-circuit processing by
      others that match the case, either for efficiency or to prevent
      their side-effects.

What I wonder about is having a starting point of dealing with (1) differently.
Conceptually, one would like to say:

  switch n$note {

  case Notice::Sneakiness:
    if ( pred1() ) ...
    if ( pred2() ) ...

  }

but the problem is that we want refinement of the code inside each
case, without having to edit the overall code block.

I know that languages going as far back as CLOS had mechanisms for
doing this sort of refinement. Not being a CLOS hacker, this is just
from ancient memory, but I believe the CLOS equivalent for what we
want would be:

  function policy_action(n: Notice::Info &case n$note==Notice::Sneakiness,
        other_arguments_here ...)
    {
    ...
    }

That seems like too much verbiage. But it has me wondering whether there
might be some sort of prioritized case construct that works better than
"match ... using". Ideally this would be a language construct that nicely
applies in other switch-like contexts, too.

    Vern

2. Adding a "break"-like keyword to halt policy execution (only valid in policy handlers).

3. No return value(s) when calling policy handlers.

Just an alternative idea I had -- maybe they could be required to return a boolean value that's used internally to determine whether to keep traversing the vector of handler bodies. Then you don't need to add a new keyword for the halt/break functionality.

    Jon

I feel like we need to better determine just what problem we want/need to
solve in order to then think about how to coherently provide better support
for it.

I think the construct we need to solve the problem already exists purely in the script-layer:

    type TriggerFunc: function(n: Notice::Info): bool;
    global Notice::policy: vector of set[TriggerFunc];

I think that the issue is it being cumbersome to manipulate/redef? So we're working around that with the new proposal, but maybe we can do more to make redefining something like that syntactically clean and easy?

    Jon