[Proposal] Language extensions for better Broker support

Would that mean you wouldn't be able to return a bool and check for errors at the same time?

No, it would always return an opaque ("opaque of Broker::Data" in this
case). If you used that in a boolean context, it would evaluate to the
error flag. If you wanted to get the actual value out of it, you'd
cast it accordingly with the new cast operator; and that includes
casting to "bool".

Robin

So let's think that through. Let's say we did an Error enum that can
reflect various error conditions, something like this:

    local v = Broker::lookup(h, 42) # Continues to return 'opaque of Broker::Data'

    if ( v == Error::Success )
        print (v as string);

    else if ( v == Error::NotFound )
        print("not found");

    else if ( v == Error::Timeout )
        print("timeout");

    else
         print("unknown error");
    }

(That could also be a switch statement, naturally).

I think that looks pretty good. But where's that Error type defined?
Is it a new global type that Bro predefines for everybody to use? Then
we'd either need to limit the type of errors to a small predefined
set, or allow scripts to redef the enum and add their own types; but
the latter risks naming clashes.

Alternatively, we could leave it to frameworks to define their own
error types. So for Broker, we'd have Broker::NotFound,
Broker::Timeout, etc. And the opaque types would define internally
what they convert to, and how.

Seth, a question: do you have more in mind by "nicely generalized
error handling" than this? Do you see a way to generalize this to
purely script-level logic (i.e., no opaques, just normal Bro types
being passed around)?

Robin

The thing that got me about that for this particular example was that I can’t distinguish whether the "lookup" or the “put” failed, which might be important since the “test” operation is between them and I may or may not want “test" to happen in the retry attempt depending on what exactly failed.

Not really making an argument/vote against this feature, just thinking out loud whether it would be something I’d personally use on the chance it helps you better plan/design. Still think I’d instead always manually unpack/test individual async return values.

- Jon

I think that looks pretty good. But where's that Error type defined?
Is it a new global type that Bro predefines for everybody to use? Then
we'd either need to limit the type of errors to a small predefined
set, or allow scripts to redef the enum and add their own types; but
the latter risks naming clashes.

Is there risk of clashes here? Don’t recall how intentional it is, but don’t enum redefinitions export the identifiers into the current module/namespace? Example:

module GLOBAL;

type Error: enum {
    Success,
    InternalTimeout,
};

module Broker;

export {
    redef enum Error += {
        NotFound,
    };
}

module SomeOtherFramework;

export {
    redef enum Error += {
        WeirdError,
    };
}

module GLOBAL;

print Success;
print Broker::NotFound;
print SomeOtherFramework::WeirdError;

Alternatively, we could leave it to frameworks to define their own
error types. So for Broker, we'd have Broker::NotFound,
Broker::Timeout, etc. And the opaque types would define internally
what they convert to, and how.

So far, there's 3 pieces of information that can come out of an async function:

1) An error code that is specific to that function (e.g. Error::NotFound)
2) An error code that is not specific to that function, but comes predefined in Bro internals (e.g. Error::InternalTimeout)
3) The actual data you’d want to use in the case the function succeeds

Seems appropriate to let each framework define (1) and (3), but somehow have (2) internally predefined. I guess the above example allows that, depending on how much you want to rely on the “redef enum” namespacing behavior?

- Jon

Good point!

I've always found this a bit confusing, though. In fact I can never
even remember how exactly it works. So not sure if that's a good thing
to rely on here? On the other hand, given that it's there and can
solve the problem, we might as well use it ... I'm torn. :slight_smile:

Robin

   local v = Broker::lookup(h, 42) # Continues to return 'opaque of Broker::Data'

   if ( v == Error::Success )
       print (v as string);

This is a neat trick and would be nice if all Bro types could be compared against error enums.

Seth, a question: do you have more in mind by "nicely generalized
error handling" than this? Do you see a way to generalize this to
purely script-level logic (i.e., no opaques, just normal Bro types
being passed around)?

I don't have any concrete suggestions beyond or alternate to the one you suggested above. I actually really like the look but I think I need to try writing some code like that to see if I find issues with it. One thing I see immediately is that it seems like we're want to force people to cast the value to an error type, maybe like this...

if ( v as error == Error::Success )

I find "Error::Success" really unintuitive and kind of funny too. :slight_smile:

  .Seth

force people to cast the value to an error type, maybe like this...

if ( v as error == Error::Success )

Mind elaboratong why you think we need to force the cast? That seems
to take away some of the appeal of the solution.

Btw, the above is an example of why I'm still a bit reluctant on the
"as" syntax. The line above would really benefit from parentheses:

    if ( (v as error) == Error::Success )

I find "Error::Success" really unintuitive and kind of funny too. :slight_smile:

Yeah, agree, that's part of the question what namespace to use.
"Broker::Success" would certainly be nicer.

Robin

force people to cast the value to an error type, maybe like this...

if ( v as error == Error::Success )

Mind elaboratong why you think we need to force the cast? That seems
to take away some of the appeal of the solution.

I don't know. It just feels too magic otherwise, but I guess it's no different than what the system is already doing in it's normal automatic type casting situations so I don't think I can really complain here.

I find "Error::Success" really unintuitive and kind of funny too. :slight_smile:

Yeah, agree, that's part of the question what namespace to use.
"Broker::Success" would certainly be nicer.

Yep.

  .Seth

To me, "v == Error::XYZ" would feel ok still, yet "v == Broker::XYZ"
is getting borderline (because the latter means there's no clear
single concept here (error checking), it could mean whatever the value
type wants).

Maybe we need something else altogether. We could add a BIF that takes
the role of the cast, but gives more context what happens:

    local v = Broker::lookup(h, 42)

    if ( status(v) == Broker::SUCCESS )
        print (v as string);

That could be a Broker-specific BIF (then probably Broker::status())
or a generic oben working with opaques in general.

We could also make the two different return values explicit:

    [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]

    if ( result == Broker::SUCCESS ) ...

This would be new syntax, as we don't have tuples yet to implement
this directly. But for-loops already support something not too
different for iterating over multi-value indices. We could also bite
the bullet and add full tuples to the language (but that's quite a bit
of work). Another downside is that "result", and "value" would need to
be declared first, making it less concise.

Robin

I've updated the web page according to what I believe we have
converged on so far. I've also added some "Notes" blocks where I have
diverged from the discussion, let me know what you think.

See here: http://bro.org/development/projects/broker-lang-ext.html

Alternatively, we could leave it to frameworks to define their own
error types. So for Broker, we'd have Broker::NotFound,
Broker::Timeout, etc. And the opaque types would define internally
what they convert to, and how.

It looks like this is the model you went with in the revised proposal.
For better modularity, that's also the model I'd prefer.

> if ( status(v) == Broker::SUCCESS )

Thinking more about this, I kind of like this version actually, and
have for now included that into the proposal. Curious to hear what
others think about this. It would be an easy solution actually.

I'm not sure if it's a typo or not, but your code example has a new,
so-far not discussed form of type castin: T to x. For example:

    case bool to b: # Make the boolean available as "b".
        print "bool", b;
        break;

That would mean we'd have new keywords: is, to, as. I find that's too
confusing and think we should go with either "to" or "as" but not both.

    Matthias

Yes, in this case that's not possible. Though nothing speaks against
tearing the expression apart if you need to that distinction. It might
not have been the best example, but I wanted to illustrate the semantics
of composition of asynchronous functions.

Overall, Bro is an asynchronous language with imperative building
blocks. Especially with this inherent (and now increasing) asynchrony,
I think it's important to look at established concepts from functional
paradigms that demonstrate the utility of composable primitives.

    Matthias

We could also make the two different return values explicit:

    [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]

    if ( result == Broker::SUCCESS ) ...

I would prefer this solution, as it feels more natural coming from other
languages like python. Introducing new keywords/magic functions like
status() might be conceptually elegant from the perspective of the Bro
language itself but makes it more difficult to learn the language
respectively understand code.

Jan

> [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]
>
> if ( result == Broker::SUCCESS ) ...

I would prefer this solution, as it feels more natural coming from other
languages like python.

I like it as well because there's no call to status() and the concrete
types of the return variables are irrelevant to the user. (Not only
scripting languages but also C++17 finally introduces "structured
bindings" to support this form of local tuple binding.)

The key question is whether it will only be a one-off or fits more
broadly into the language as part of first-class tuple support.

    Matthias

    [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]

    if ( result == Broker::SUCCESS ) ...

I would prefer this solution, as it feels more natural coming from other
languages like python.

The key question is whether it will only be a one-off or fits more
broadly into the language as part of first-class tuple support.

Something similar is already available in context of expiration
callbacks for tables using multiple indices (see e.g.,
https://github.com/bro/bro/blob/master/scripts/base/frameworks/intel/main.bro#L258).
Introducing tuples would probably allow to get rid of using the any type
at this point. Maybe there are also some scripts, whose readability can
be improved by using tuples.

Jan

Yeah, I agree, don't like that version anymore either. I have just
committed a first implementation of the type-based switch that uses
this syntax instead:

    case type string:
        ....

    case type count as c:
        ....

What do you think of that? The additional "type" makes it visually
clear what's it's about, and also helps the parser figure it out.

Robin

Actually it's the opposite: status() wouldn't need anything new,
that's part of the appeal there. It would just be a bif (although
probably called Broker::status()) that takes the Broker opaque and
query its most recent state.

Robin

Yeah, although that's indeed a one-off that would be hard to avoid
doing for more cases (for-loop over tables is another one).

I agree that if we had full tuple support, that would be the way to go
here; and it would generally be a very nice extension of the language.
However, introducing tuples is a major piece by itself, and I'm
reluctant to have the Broker changes depend on that.

We could go the Broker::status() route for now and switch over to
tuples later if/when we get them ...

Robin

Yeah, I agree, don't like that version anymore either.

Ok, good. :slight_smile:

    case type count as c:
        ....

What do you think of that? The additional "type" makes it visually
clear what's it's about, and also helps the parser figure it out.

I find that there's too much going on in a single line now. With the
extra "type" keyword, my mind gets stuck figuring out precedence rules.

What I haven't considered until now is that in the current proposal,
"as" can occur in two forms:

    local x = async Broker::lookup(..);
    local c = x as count;

and

    case type count as c:

It would be great if LHS and RHS are consistent, ideally LHS always an
identifier and RHS a type, which reads most naturally. For "switch" that
would imply:

    case c as count:

Is that any easier for the parser?

    Matthias