support for event handlers using a subset of parameters

Just wanted to draw more attention/feedback to a proof-of-concept
patch I just made:

https://github.com/zeek/zeek/pull/262

Basically lets us do this:

    global my_event: event(a: count, b: string);

    event my_event(b: string)
        { print "my_event", b; }

Which is good because:

* user doesn't care about parameter 'a', so they shouldn't have to list it
* it makes it easier for to deprecate/change event parameters

I didn't see any drawbacks to this change, but maybe I missed
something, let me know.

- Jon

    global my_event: event(a: count, b: string);

    event my_event(b: string)
        { print "my_event", b; }

Which is good because:

* user doesn't care about parameter 'a', so they shouldn't have to list it
* it makes it easier for to deprecate/change event parameters

This seems like a pretty niche pair of benefits. Is there a compelling
use-case that's motivating this change?

One thing I initially wondered was whether this was going to tie our hands
in the future if we want to introduce C++-style overloading. However, it
looks like you've implemented this based on matching the names in the
declaration rather than the types, so that should be okay.

    Vern

> * user doesn't care about parameter 'a', so they shouldn't have to list it
> * it makes it easier for to deprecate/change event parameters

This seems like a pretty niche pair of benefits. Is there a compelling
use-case that's motivating this change?

The compelling use-case I'd say is the ability to change/deprecate
event parameters without suddenly breaking people's code since that
has come up many times already. I briefly skimmed NEWS for just the
last 2.6 release and count 5 times we broke an event signature where
this patch would have helped.

I think there's also some other higher-profile changes to event args
we haven't moved forward with because we didn't want to break user
code that this would help with. Old example from unresolved ticket:

https://bro-tracker.atlassian.net/browse/BIT-1431

One thing I initially wondered was whether this was going to tie our hands
in the future if we want to introduce C++-style overloading. However, it
looks like you've implemented this based on matching the names in the
declaration rather than the types, so that should be okay.

Also this change only effects events and hooks, not functions. The
semantics are different enough that maybe we would only want
overloading for functions anyway.

That is, functions have a single, fixed implementation/body that is
defined by the *author*, so you may want to re-use the same name for
something implemented in different ways. The *user* is the one that
calls the function.

Hooks and events have multiple implementations/bodies that are defined
by the *user*. The *author* is generally the one the generates
(calls) the event/hook.

So if the event/hook name were overloaded, it's a bit confusing -- the
user now has to decide between different signatures to handle, each
containing different data sets and maybe neither contains the set they
want (so now they handle two events of the same name instead of one).
Really, an event is a unique name with some amount of data
(parameters) associated with it and may always be generated with the
full data set -- the user then chooses which data they are interested
in by defining that explicitly in the handler's parameter list.

- Jon

I'm sure the main use case is changing an existing event's parameters
without breaking existing scripts -- someting we've been increasingly
running into as a major challenge.

It's a nice a idea to relax parameter passing to work by name, and
allow subsets. However, I can't quite get myself to really like it in
this form, because it *looks* like an error to not have matching
argument lists. Is there some syntax that would make it more clear
what's going on?

Robin

Not sure. If the syntax were different, that introduces a "one more
thing to remember" issue, so I might prefer consistency with other
function-like constructs.

Any other language we know that has multi-body functions we can
reference for ideas?

Did it look like an error in the sense of the user making a mistake or
in the sense of traditional way functions in other languages like
C/C++ require matching signatures?

In the former, I think the semantics/intentions are actually clearer
than before: the user didn't list a parameter because they don't care
about it, so why make them. I know what event they want because they
use unique names and the parameters they listed do map in a valid way.

On the traditional side of things, overloading seems it's maybe a
legit reason for requiring matching signatures, but I also explained
why I think overloading wouldn't make sense in the context of events.

- Jon

I think the change to using names does make things a bit more confusing for users, but it opens the door for us to greatly improve reliability of scripts in the long term and generally it feels like a nice way for analyzer authors to deprecate functionality without needing to create all new events. In my opinion even though there are hairy side effects to this I think it's a net positive. It would be great to get case sensitive versions of dns events and the http header event. That has been a very long standing deficit.

I guess if there is some more obvious way to do it could make sense, but I haven't been able to come up with anything after thinking about this for quite a while.

   .Seth

Note that the patch makes matching based on parameter name optional.
One can still pick their own parameter names as long as they strictly
follow the declaration's parameter order. i.e. there's no impact to
people that aren't paying attention to this change, they see no
difference in behavior. If a user finds this change confusing it's
because they read the docs/example/release-notes and didn't understand
them, so it's on us to improve them.

- Jon

The compelling use-case I'd say is the ability to change/deprecate
event parameters without suddenly breaking people's code since that
has come up many times already.

I see how it allows adding new parameters. I don't see how it helps with
deprecating existing parameters (which seems would be better served with
some sort of &deprecated attribute), and I don't see how it helps with
changing the semantics of existing event parameters.

Also this change only effects events and hooks, not functions. The
semantics are different enough that maybe we would only want
overloading for functions anyway.

It actually makes sense to me to support overloading for events. Then for
example you could have two event signatures depending on what information
the handler was going to leverage, which would allow the event engine to
offload work if there isn't a handler for a signature that requires extra
computation.

Hooks and events have multiple implementations/bodies that are defined
by the *user*. The *author* is generally the one the generates
(calls) the event/hook.

The big exception being the event engine (if I follow what you mean by
user/author).

So if the event/hook name were overloaded, it's a bit confusing -- the
user now has to decide between different signatures to handle, each
containing different data sets and maybe neither contains the set they
want (so now they handle two events of the same name instead of one).

Not really seeing this. I'm picturing that a common idiom will be a
lightweight version of an event and a heavyweight version, or maybe a
spectrum from light-to-heavy.

Really, an event is a unique name with some amount of data
(parameters) associated with it and may always be generated with the
full data set -- the user then chooses which data they are interested
in by defining that explicitly in the handler's parameter list.

Yeah, I agree that this too would allow (most of) the sort of offloading
I sketch above.

    Vern

I don't see how it helps with
deprecating existing parameters (which seems would be better served with
some sort of &deprecated attribute),

Support for &deprecated in parameters is part of the changes.

But if we don't allow the user to immediately remove the field, they
are then stuck doing 2 changes:

Step 1: we mark a field &deprecated
Step 2: the user sees that, so they remove uses of that parameter from
their body
Step 3: we actually remove the &deprecated field
Step 4: if the user was forced to still have the &deprecated param in
their handler's param list they now have to do a second change to
remove it instead of just removing it right away

With the proposed patch, we get rid of the need for Step 4 and
decrease burden on users.

and I don't see how it helps with
changing the semantics of existing event parameters.

Step 1: we mark old field &deprecated and introduce a new parameter
Step 2: the users sees that.. etc, etc. same as above.

It actually makes sense to me to support overloading for events. Then for
example you could have two event signatures depending on what information
the handler was going to leverage, which would allow the event engine to
offload work if there isn't a handler for a signature that requires extra
computation.

I think the same kind of offloading is possible with the "parameter
subset" approach. We know exactly what parameters are being
consumed, so we might have optimizations that don't produce a
parameter if no one consumes it. And if no one consumes any
parameters we also don't generate the call.

If you have two different event signatures, we just get the same type
of optimization we currently do, which only optimize out the entire
call if there's no handlers, but doesn't know if individual parameters
are being consumed or not. e.g:

    http_request(a, b, c)
    http_request(d, e, f)

If someone only consumes 'a' and 'e', you still have to produce both
function calls in their entirety (and also the other unused params),
but:

    http_request(a, b, c, d, e, f)

You can potentially not do any work generating the unused parameters
and only have to do the one function call with 'a' and 'e'.

Technically, we can still require a matching signature and do such an
optimization by walking the AST and finding local parameter usages. I
guess you have to do that ultimately, but it's an easy head start at
implementing such optimizations as a test/idea if we can simply see
someone isn't using a parameter because it's not in their handler
param list.

- Jon

It's probably the latter, but I'm not sure that helps: when I see the
above code, I wonder: "Heck, why is that working ... Or, wait, maybe
it isn't?". It makes it confusing to me to read the code.

If at least the prototype told me somehow "it's ok if parameters are
left off", then I'd have a clue that everything is fine.

The following would be even worst in terms of confusion:

    global my_event: event(a: string, b: string);
    event my_event(b: string)

Now I need to know if the language goes by order of parameters or by
parameter name.

I do see the appeal of making things just work when event handlers
change, but is there really no different way to support that?

(I don't have an opinion on the event overloading discussion yet, need
to digest that more)

Robin

The following would be even worst in terms of confusion:

    global my_event: event(a: string, b: string);
    event my_event(b: string)

Now I need to know if the language goes by order of parameters or by
parameter name.

The issue may be exaggerated because we're using contrived parameter
names. If we make it a bit more real (still "shortened" for sake of
example):

    http_request: event(method: string, version: string);
    event http_request(version: string) { ... }

The "parameter order vs. name" issue doesn't even cross my mind here
because the intent is clear -- I understand the meaning of the
handler's parameter because the original author of the event chose a
meaningful and useful name for it (which will be the common case).
That's all the reader cares about -- understanding the meaning of any
given parameter.

I do see the appeal of making things just work when event handlers
change, but is there really no different way to support that?

If the goal is to avoid breaking user-code, I don't think we're in a
position to do anything else. For example, we have:

    global my_event: event(a: string);

And we want to add a parameter:

    global my_event: event(a: string, b: string);

The user has already written their handler as:

    event my_event(a: string) { ... }

We can't magically change that user's code. We have to, somehow, let
that work as it is, without user intervention.

- Jon

Yeah, I think that's right. I was trying to come up with some proposed model for indicating that you're specifying named parameters but that would still force existing code to be updated so that it could keep using the existing model.

One thing I've been thinking about a little bit is how much we're concerning ourselves with maintaining perfect backward compatibility and if there is some benefit to breaking a bit of backward compatibility for something truly nicer? Like, should we have some specialized syntax for specifying named parameter use? Should we have something like anonymous records where you specify a variant of record syntax for named parameters? (ruby comes to mind for this, but I've seen people do similar things in c too)

I'm just wondering if we should do a one-time source compatibility break to could get some tangible benefit in usability or understanding. I think Robin's concern is about backing us into a corner with a syntax that makes code confusing. Is that correct Robin?

   .Seth

I think compatibility is a growing issue with scripts being released as plugins. I’m already seeing some code shift to:

@if (Version …)
new event
@else
old event

I think I like Seth’s idea of records, but I’m still thinking it through. It would formalize a growing trend towards moving more parameters into records anyway. If we’re worried about backwards compatibility, then maybe we have a built in version number in each record. Whenever fields are added/removed, or there are more subtle contextual changes, the version number could increase.

As a concrete example, the is_server parameter of the ssh_capabilities event was being set backwards (effectively as “is_client.”) I introduced a change[1] where the is_server parameter was corrected, but now how to interpret the value depended on the Bro version. This is a very subtle case, where no field was added or deleted, but users were still expected to change their code.

–Vlad

[1] - <https://github.com/zeek/zeek/pull/191>

One thing I've been thinking about a little bit is how much we're
concerning ourselves with maintaining perfect backward compatibility and
if there is some benefit to breaking a bit of backward compatibility for
something truly nicer?

Sure, I wouldn't argue against taking that approach if there's a good
suggestion.

Like, should we have some specialized syntax for
specifying named parameter use? Should we have something like anonymous
records where you specify a variant of record syntax for named
parameters?

I think a user would prefer to always do whatever is most robust and
won't break in the face of upstream changes. If I'm going to use the
alternative syntax everywhere because it's better, why have an
alternative in the first place.

- Jon

I _think_ I like Seth's idea of records, but I'm still thinking it through. It would formalize a growing trend towards moving more parameters into records anyway. If we're worried about backwards compatibility, then maybe we have a built in version number in each record. Whenever fields are added/removed, or there are more subtle contextual changes, the version number could increase.

Explicit versioning is a neat idea to maybe try expanding on. I'm not
quite sure how it would look for the user to, when they write their
code, make it explicit that they expect the semantics of the record to
match version XYZ. But yeah, on our end, we could ensure that if we
see someone wanted XYZ, we send them that version of the record, which
matches their semantic expectations.

As a concrete example, the is_server parameter of the ssh_capabilities event was being set backwards (effectively as "is_client.")

In this particular example, the way I would have wanted to solve it
with my proposed patch would be: deprecate "is_server", document it as
actually meaning "is client" (don't change semantics of that parameter
to avoid breaking user code that is already doing the inversion), but
do introduce a new "is_client" which actually means "is client", then
later remove "is_server".

- Jon

I _think_ I like Seth's idea of records, but I'm still thinking it through. It would formalize a growing trend towards moving more parameters into records anyway. If we're worried about backwards compatibility, then maybe we have a built in version number in each record. Whenever fields are added/removed, or there are more subtle contextual changes, the version number could increase.

Explicit versioning is a neat idea to maybe try expanding on. I'm not
quite sure how it would look for the user to, when they write their
code, make it explicit that they expect the semantics of the record to
match version XYZ.

If I get that right, this would mean that every record type comes with an individual version number, independent from the package or zeek version. While I can see that this allows to achieve backwards compatibility I am quite sure it will end in a complete mess. I expect the resulting code to be harder to write and to read, assuming that most of the records come with versioning constraints.

Actually, I like Jon's solution most. From a coder's perspective, the only thing I have to keep in mind is that parameters are matched by name. If I want to rename I can fallback to matching by order. That said, I would guess renaming a parameter is not a critical use case. I cannot imagine a situation where I rely on that feature. For me, renaming parameters makes code harder to read if I am familiar with the original definition. One could even think about introducing syntactically explicit renaming of parameters (which would probably break backwards compatibility again).

This thread got backburnered for me due to other $dayjob stuff, but returning
to it now, one thing I wondered would be what if instead of just allowing
missing parameters as a way to support changing event interfaces, we
introduced an explicit notion of deprecation, as follows.

Suppose "event foo(a: string)" is an event to which we want to add
a second parameter, b: string. We could express this in event.bif as:

  event foo%(a: string%) &deprecated;
  event foo%(a: string, b: string%);

This would (1) allow the developer who's changing the API to decide up
front whether the change can be backward-compatible by leaving out parameters,
(2) put the maintenance burden just on the developer, (3) allow users to
(automatically) look for the uses of the deprecated API if they wish,
(4) avoid certain forms of user errors, and (5) allow us to consider other
API changes such as removing parameters.

It's not as easy to implement ... but strikes me as cleaner than where
this discussion has wound up going.

    Vern

Suppose "event foo(a: string)" is an event to which we want to add
a second parameter, b: string. We could express this in event.bif as:

        event foo%(a: string%) &deprecated;
        event foo%(a: string, b: string%);

Nice idea. I also think that would work in this "want to add a new param" case.

(4) avoid certain forms of user errors

Which particular user errors were a concern?

And you mean in contrast w/ my initial proposal? I was still thinking
name-based matching most clearly expressed the users intent, so I
doubt that results in increased errors.

and (5) allow us to consider other
API changes such as removing parameters.

Yeah, removing should work following this scheme. What about changing
semantics of a parameter though? Take earlier example Vlad gave, I
think the process is:

Step 1:

    event foo%(is_server: bool%) &deprecated;
    event foo%(is_server: bool, is_client: bool%);

This is ok so far: no code will break and users can update to the new
handler with extra parameter. It is a bit odd to have the "is_server"
parameter available for use when the actual intent is to later get rid
of it.

Step 2:

    event foo%(is_server: bool, is_client: bool%) &deprecated;
    event foo%(is_client: bool%);

Now this is really awkward: users have to update their code once again
(in my proposal, they update their code just once).

The other problem is that if a user is skipping a version, they may
end up with a handler that treats "is_server" in the original way, but
the meaning has been changed and we only match events based on type +
number of parameters. With name-based parameter matching, we can
catch this scenario as well.

It's not as easy to implement ... but strikes me as cleaner than where
this discussion has wound up going.

I still prefer my original proposal from everything I've seen so far,
I think it best fit the notion of what event handlers actually are
(but maybe not how they've traditionally been thought about): we pick
points in time as an "event" and associate various bits of data with
them, then users declare their interest in some subset of that data.
Most natural way for them to do that seems to be by naming each bit of
data.

Can you maybe break down what you mean by "clean" further so we can
better compare/contrast solutions ? I don't recall seeing anything
that was a showstopper for my original patch (but been a while).

- Jon

> (4) avoid certain forms of user errors

Which particular user errors were a concern?

The main one being when there's an API change that's *not* backward
compatible, and the user doesn't update their scripts to be in conformance
with it as is required. Clearly something we'll in general strive to avoid.

I was still thinking
name-based matching most clearly expressed the users intent, so I
doubt that results in increased errors.

Agreed for those instances where it's compatible.

Yeah, removing should work following this scheme. What about changing
semantics of a parameter though?

That's indeed trickier. Here's a thought for the example you provide:

  event foo%(is_server: bool%) &deprecated;
  event foo%(is_client: bool%);

This would mean "if the handler is defined using the name is_server as the
parameter, that's the deprecated version". Any other declaration (that's
for a parameter of type bool, of course) would refer to the new version.

By the way, I didn't follow whether in Vlad's example, the semantics of
the parameter changed, or if his fix was just to give it the correct name
to match its actual semantics. For the latter, existing handlers are
presumably flat-out broken, and there's no benefit in trying to continue
to support them.

One concern I have with leaning on is-the-name-a-match is not-too-implausible
user coding along the lines of:

  event foo(is_server_orig: bool)
    {
    ...
    local is_server = my_twisty_logic(is_server_orig, x, y, z);
    ...
    }

i.e., the parameter being renamed by the user anyway because they want to
use the original name for a local variable. These users will be bitten
by changes in parameter semantics regardless of which approach we use;
name-based matching isn't a panacea.

The other problem is that if a user is skipping a version, they may
end up with a handler that treats "is_server" in the original way, but
the meaning has been changed and we only match events based on type +
number of parameters. With name-based parameter matching, we can
catch this scenario as well.

With the tweak I outline above, this would only bite them once the &deprecated
version is removed. That could span several releases, depending on the
particular situation. I don't have a lot of sympathy for users who upgrade
across a number of releases, for which the release notes flag backward
compatibility issues, and they don't take heed to address those.

Can you maybe break down what you mean by "clean" further so we can
better compare/contrast solutions ? I don't recall seeing anything
that was a showstopper for my original patch (but been a while).

I find an approach that changes the fundamental type-checking used for
event handlers to be more heavy-handed than one that provides control to
the developer to explicitly decide when they've made a change for which
it makes sense to support a deprecated version.

If Zeek had originally used name-based parameters, then I'd be fine with
this being the solution. However, switching from positional to name-based
strikes me as a conceptually deep change for a relatively modest problem.

    Vern

I like this. It addresses my main concern of making it clear what's
happening. If I see an implementation of the event and look up the
prototype, I'll find both and can understand what's going on.

Robin