Feedback on configuration framework implementation

Hello everyone,

the branch topic/johanna/config contains an implementation of the
configuration framework as it was discussed in an earlier thread on this
list. GitHub link: https://github.com/bro/bro/compare/topic/johanna/config

The implementation is basically what we discussed in the earlier thread
with some additional components like a reader for configuration values and
a script-level framework.

It would be great if people could take a look at all of this and see if
this makes sense, or if they see any problems with the implementation as
it is at the moment.

In the rest of the mail I wil go into a bit more detail and describe the
different parts of this change. Note that the rest of this email will be
very similar to the git commit message which also describes this change :slight_smile:

The configuration framework consists of three mostly distinct parts:

* option variables
* the config reader
* the script level framework

option variable

Hello everyone,

the branch topic/johanna/config contains an implementation of the
configuration framework as it was discussed in an earlier thread on this
list. GitHub link: https://github.com/bro/bro/compare/topic/johanna/config

Nice! I am curious to see all the usability improvements that can be built on top of this.

It would be great if people could take a look at all of this and see if
this makes sense, or if they see any problems with the implementation as
it is at the moment.

Having I quick look at the documentation, everything seems well though-out to me. I have just two small questions:

option variable

The option keyword allows variables to be specified as run-tine options.
Such variables cannot be changed using normal assignments. Instead, they
can be changed using Option::set. It is possible to "subscribe" to
options and be notified when an option value changes.

Change handlers can also change values before they are applied; this
gives them the opportunity to reject changes. Priorities can be
specified if there are several handlers for one option.

1. Thinking of handlers that may change values and are associated with a priority, hooks come to my mind (e.g. Intel::extend_match). Are functions preferable compared to hooks here?

config reader

The config reader provides a way to read configuration files back into
Bro. Most importantly it automatically converts values to the correct
types. This is important because it is at least inconvenient (and
sometimes near impossible) to perform the necessary type conversions in
Bro scripts themselves. This is especially true for sets/vectors.

Configuration generally look like this:

[option name][tab/spaces][new variable value]

2. Are module namespaces part of the option name (e.g. "Notice::reply_to" vs. "reply_to")?

Thanks,
Jan

1. Thinking of handlers that may change values and are associated with a
priority, hooks come to my mind (e.g. Intel::extend_match). Are
functions preferable compared to hooks here?

In this case - yes. The problem with hooks is that they cannot return a
value, which is used here to let user change (or reject) changes to
options. :slight_smile:

> config reader
> =============
>
> The config reader provides a way to read configuration files back into
> Bro. Most importantly it automatically converts values to the correct
> types. This is important because it is at least inconvenient (and
> sometimes near impossible) to perform the necessary type conversions in
> Bro scripts themselves. This is especially true for sets/vectors.
>
> Configuration generally look like this:
>
> [option name][tab/spaces][new variable value]

2. Are module namespaces part of the option name (e.g.
"Notice::reply_to" vs. "reply_to")?

Yes

Johanna

The Intel::extend_match hook allows changing values or rejecting as well. If the "chain of hooks" is "broken", i.e. one hook executed a break statement, the call to the hook returns false and (in that case) the log write is rejected. Otherwise, all changes made to the hook arguments inside the handlers are propagated allowing changes.

Jan

Ah, you have a point there it is possible to do it like that, I did not think of that. I honestly also never liked modifying the values that are passed in arguments; this is for example also theoretically possible for events, but something that we have avoided to use in practice so far. Functionally they are, however, obviously equivalent.

I think I prefer functions in this case from a stylistic point of view. I am happy to change it over to hooks though if there is a consensus that that is the more fitting approach here. :slight_smile:

Johanna

I like the hook approach that is used in the Intel-Framework but as I am biased, I will abstain :wink:

Jan

Yeah, and it also won't work for atomic values, at least not since

went in. :slight_smile:

Robin

I almost used this feature (bug) when I was trying to figure out how to implement user configurable
dynamic scan thresholds! Now I'm glad I never got it to work right :slight_smile:

Me too. :slight_smile:

Robin

> think of that. I honestly also never liked modifying the values that are
> passed in arguments; this is for example also theoretically possible for
> events, but something that we have avoided to use in practice so far.

Yeah, and it also won't work for atomic values, at least not since
Fix assignments to event arguments becoming visible to subsequent · bro/bro@5b88936 · GitHub
went in. :slight_smile:

And as far as I can tell that applies to hooks too, true?

This is actually a but sneaky - it should not be a problem for
Intel::extend_match that Jan mentioned earlier because it is unlikely that
someone will just assign a new value to info. But if someone does it will
fail.

Which, after thinking about it for a few moments seems like the right
choice in any case. :slight_smile:

Johanna

And as far as I can tell that applies to hooks too, true?

Yeah, correct.

But if someone does it will fail.

Which, after thinking about it for a few moments seems like the right
choice in any case. :slight_smile:

Yeah, that seems like behaviour that's really not good to rely on.

Robin

I think in case of hooks it might be a nice feature making hooks even more useful. For events I agree that allowing to reassign atomic values is undesirable.

Jan

What are the limits of this automatic conversion?

There's currently a few use cases that are difficult to do using the input framework
when then involve loading things into a nested data structure... like a

    table[subnet] of set[port]

It can be done, but requires using the input events and doing bookkeeping yourself.

Bro can serialize stuff to json, but I don't think we have the inverse implemented anywhere

Could be nice to be able to lay out values using something like

    port_whitelist {192.168.0.0/24: [22/tcp,80/tcp], 192.168.1.0/24: [443/tcp]}

Maybe this is more of a job for broker? I know broker can easily serialize and transfer such
a data structure over the network, is there a plain text serialization implementation too?

Cool.. so you figure something like a python script to load/organize your data from whatever upstream source you have, then just call Option::set using broker?

I think the ascii representation of data structures would still help in a few places.. bro is in a weird place right now where we have json and 'print' that can output an
ascii representation of almost any data structure, but what it outputs is not always valid bro code that can be parsed in the other direction.

Like,

const foo: table[subnet] of set[port] = {
    [192.168.0.0/24] = set(22/tcp)
};

Gets turned into

{
[192.168.0.0/24] = {
22/tcp
}
}

But the bro parser doesn't parse {...} as a set.

and

const foo: table[subnet] of vector of port = {
    [192.168.0.0/24] = vector(22/tcp)
};

Gets turned into

{
[192.168.0.0/24] = [22/tcp]
}

But trying to parse that crashes:

internal error in ././trybro.bro, line 2: missing aggregate in ListExpr::InitVal (22/tcp)

Indeed, that is my thought. This seems like a job for broker, instead of trying to somehow force this into a complex ascii-representation.

Note that this is just a limitation of the config reader - the rest of the config framework (that does not deal with file reading) does not care what you throw at it and will happily accept tables, etc. So if you get Broker to give you a table you should be able to just use the calls for setting options with that table afterwards.

Johanna

Cool.. so you figure something like a python script to load/organize your data from whatever upstream source you have, then just call Option::set using broker?

Yup.

I think the ascii representation of data structures would still help in a few places.. bro is in a weird place right now where we have json and 'print' that can output an
ascii representation of almost any data structure, but what it outputs is not always valid bro code that can be parsed in the other direction.

You might have a point - however that kind of is a problem that I think is out of scope for the config framework. Plus - if we ever introduce something it will work without changes with the config framework as it is now (and the config reader should be able to read anything new that the input framework supports because it just re-uses functionality)

Johanna