Configuration framework syntax proposal

Hello bro-dev,

In this email I want to get feedback on a possible syntax for the configuration
framework. The aim of the configuration framework is to provide an easy method
for Bro users and script writers to change configuration options during the
runtime of Bro (as opposed to only on startup as already possible using redef).

Let me start with an example of what a script using the configuration framework
could look like:

Could the definition be

const filter = "ip" &config;

if you just wanted to use NameSpace::filter? That kinda seems like the best of both worlds… Especially if anything marked &redef was automatically registered as a configuration variable.

Thinking of all my scripts that could use this feature I think I would always want NameSpace::option.

> Could the definition be
>
> const filter = "ip" &config;
>
> if you just wanted to use NameSpace::filter? That kinda seems like the best of both worlds… Especially if anything marked &redef was automatically registered as a configuration variable.

technically - yes. Though I am not quite sure that I like it :).

On the redef side - this specifically does not touch the functionality of
redef and also does not aim to automatically integrate with redef. The
background is that we do not know if a variable that is currently
redef-able will work as a configuration variable, or needs additional
commands to be run (or just works if set at startup as it is actually the
case with a lot of the current consts). I don't think going the route of
intermingling that would be a good idea - if someone wants something to be
a config variable, I think it should be an explicit opt-in.

> Thinking of all my scripts that could use this feature I think I would always want NameSpace::option.

Ok. That would actually pull me more toward using the other syntax again
(configopt varname) and not doing &plugin at all.

Thanks a lot :)
Johanna

I also don't like it. I think with this proposal there is some recognition that our community has separate and distinct parts (and some overlap between them): the people that program Bro scripts and the people that use Bro scripts. I feel like there are benefits to getting a chance to separate the notion of configuration away from the notion of programming.

Invariably, there will be variable names chosen within software that will be short and convenient or long and explanatory but may not end up being just right if someone simply wants to configure a behavior. There's also the problem of single level namespaces which will limit the expressiveness and depth that you could possibly give through configuration keys.

   .Seth

> If we decide to use the "const" syntax, then is the plan
> to allow a const to have both the &config and &redef attributes?
> (presumably, we wouldn't allow "&redef" with "configopt")

Actually, yes, that was the thought so far; since they do not interact,
they are combinable if someone desires this.

Johanna

> I also don't like it. I think with this proposal there is some
> recognition that our community has separate and distinct parts (and some
> overlap between them): the people that program Bro scripts and the
> people that use Bro scripts. I feel like there are benefits to getting
> a chance to separate the notion of configuration away from the notion of
> programming.

While I agree that there are two (more or less) distinct groups and that the notion of configuration should be separated from the notion of brogramming, I don't think that anyone would profit from introducing something like &config="just.another.name". Because of the additional name mapping, this will make scripts harder to understand for brogrammers (especially for beginners), whereas the benefit for users would be minimal.

> Invariably, there will be variable names chosen within software that
> will be short and convenient or long and explanatory but may not end up
> being just right if someone simply wants to configure a behavior.

I think that would just be bad coding style.

> There's also the problem of single level namespaces which will limit the
> expressiveness and depth that you could possibly give through
> configuration keys.

So that is definitely a valid point! But instead of coming up with a "new language element" I would prefer to add support for multi-level namespaces into Bro. As the Bro language is already quite extensive, I think that every new syntax, which does not follow the already existing concepts, reduces usability. That's also why I would choose &config instead of configopt.

That all said, what about the poor users? Instead of throwing a huge config file of key-value-pairs at someone who does not know about the internals of the corresponding scripts, I would provide a UI to them (maybe someone comes up with a bro-package...). The UI could display all options in a structured way and further support configuration by also showing the corresponding documentation for each option. This would allow a clean cut between users and brogrammers without intermingling both worlds.

Jan

> While I agree that there are two (more or less) distinct groups and that
> the notion of configuration should be separated from the notion of
> brogramming, I don't think that anyone would profit from introducing
> something like &config="just.another.name". Because of the additional
> name mapping, this will make scripts harder to understand for
> brogrammers (especially for beginners), whereas the benefit for users
> would be minimal.

I don't think that this proposal makes scripts any harder to understand for programmers than things are currently. From a programmer's perspective, they would still use variables the same way they currently do. The only difference is that a user might not be using redef to change values, but the programmer doesn't see that happening anyway. It also doesn't change anything about what configuration values the programmer makes available since the programmer is already forced to expose their configuration through the export section.

>> Invariably, there will be variable names chosen within software that
>> will be short and convenient or long and explanatory but may not end up
>> being just right if someone simply wants to configure a behavior.
>
> I think that would just be bad coding style.

That's a fair statement. In an ideal world, all programmers would name their variables perfectly. I know that the opportunity for poor naming is still present with the &config attribute value, but for me at least, it puts me in a different frame of mind because this is the name that I'm exposing to users as a configuration option. It almost forces people to step out of the programming mentality.

> So that is definitely a valid point! But instead of coming up with a
> "new language element" I would prefer to add support for multi-level
> namespaces into Bro.

Yes, please do this! :)

> That all said, what about the poor users? Instead of throwing a huge
> config file of key-value-pairs at someone who does not know about the
> internals of the corresponding scripts, I would provide a UI to them
> (maybe someone comes up with a bro-package...). The UI could display all
> options in a structured way and further support configuration by also
> showing the corresponding documentation for each option. This would
> allow a clean cut between users and brogrammers without intermingling
> both worlds.

Yep, this notion of making things abstract-able into easy configuration interfaces and/or good documentation (using the inline broxygen comments) was always in the proposal; Johanna pointed it out in the original code sample.

There is just something about the idea of exposing variable names to users (even if it's wrapped in a gui) that is intensely unpalatable to me. It's pretty much unheard of among other types of software. It would be like exposing internal variable names to command line programs instead of abstracting it into easy flags (e.g. -a or --help) or, if in a gui a text entry box had a label next to it like "GUI::My_Program::user_name" instead of showing "Username".

Sometimes abstraction like this isn't warranted, but I think it has to be done here. Bro needs to turn into a platform that treats users as first class citizens in the community and we need to acknowledge that there will be a day that they won't be reading script source code and they won't want to be exposed to programmer-isms.

   .Seth

> There is just something about the idea of exposing variable names to
> users (even if it's wrapped in a gui) that is intensely unpalatable to
> me. It's pretty much unheard of among other types of software. It
> would be like exposing internal variable names to command line programs
> instead of abstracting it into easy flags (e.g. -a or --help)

I guess I was just thinking from my perspective, every script I would write would just have

module Foo;

export {
  ## Set the threshold in bytes.
  const threshold = 1234 &config="foo.threshold";
}

And I would just be repeating the namespace + variable name for each option with no added value. It would just become unnecessary repetition and a source of errors:

  const one = 1234 &config="foo.one";
  const two = 1234 &config="foo.tow"; #oops
  const three = 1234 &config="foo.tow"; #oops!

I say this as someone that will absolutely screw this up :)

Maybe the design should support renaming variables for the configuration, but programmers should be strongly discouraged from renaming things unless they have a good reason for deviating from the automatic namespace + variable name.
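
To make the "automatic default plus optional override" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `config_key`, `find_duplicate_keys`, and the lower-cased `module.variable` default are hypothetical and not part of Bro or of the proposal. It derives a key from the module and variable name unless an explicit name is given, and reports keys that end up being used by more than one identifier, which is the kind of copy-and-paste mistake shown above.

```python
# Hypothetical sketch: none of these names exist in Bro itself.

def config_key(module, variable, override=None):
    """Return the explicit override if given, else "<module>.<variable>"."""
    if override is not None:
        return override
    # Assumed default: lower-cased module name joined to the variable name.
    return f"{module.lower()}.{variable}"


def find_duplicate_keys(options):
    """Map each configuration key used more than once to its identifiers."""
    first_seen = {}
    duplicates = {}
    for module, variable, override in options:
        key = config_key(module, variable, override)
        identifier = f"{module}::{variable}"
        if key in first_seen:
            # Record the first user of the key once, then every later one.
            duplicates.setdefault(key, [first_seen[key]]).append(identifier)
        else:
            first_seen[key] = identifier
    return duplicates


# (module, variable, explicit &config-style name or None)
options = [
    ("Foo", "one", None),
    ("Foo", "two", "foo.tow"),
    ("Foo", "three", "foo.tow"),
]
print(config_key("Foo", "threshold"))
print(find_duplicate_keys(options))
```

With automatic keys only, a clash of this kind cannot happen within one module, since identifiers are already unique there; a check like this would only matter for the explicitly renamed options.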

> or, if in
> a gui a text entry box had a label next to it like
> "GUI::My_Program::user_name" instead of showing "Username".

I'm not sure how exposing something like "input.pcap.filter" is any different from exposing something like "Pcap::filter" from that standpoint. Maybe there's a larger discussion here around what the user experience should look like? I feel like two different things are being talked about now.

Directly using variable names in UI elements is not unheard of though, a lot of UI frameworks will do things like present a variable like user_name as "User Name" in the UI. This is usually a simple text transform like

    >>> s='My_Program::user_name'
    >>> s.replace("::", " - ").replace("_", " ").title()
    'My Program - User Name'

This way you don't end up with code that looks something like

    Args {
        user_name .. display as "User Name"
        age .. display as "Age"
        favorite_color .. display as "Favorite Color"
        favorite_food .. display as "Favorite Food"
        pin .. display as "PIN"
    }

Instead you only need to override the display when you have a good reason to deviate from the standard underscore to space and Title Case transform:

    Args {
        user_name
        age
        favorite_color
        favorite_food
        pin .. display as "PIN"
    }
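
In Python, the "default transform plus rare override" approach might look like the sketch below. The function name `display_name` and the override table are made up for illustration; nothing here is an existing Bro or UI-framework API.

```python
# Hypothetical sketch of a default display-name transform with overrides.

# Only identifiers whose default label is unsatisfactory need an entry here.
DISPLAY_OVERRIDES = {
    "My_Program::pin": "My Program - PIN",
}


def display_name(identifier, overrides=DISPLAY_OVERRIDES):
    """Return a human-readable label for a namespaced identifier."""
    if identifier in overrides:
        return overrides[identifier]
    # Default: "::" becomes " - ", underscores become spaces, then Title Case.
    return identifier.replace("::", " - ").replace("_", " ").title()


for name in ("My_Program::user_name", "My_Program::favorite_color", "My_Program::pin"):
    print(name, "->", display_name(name))
```

One caveat of using `str.title()` as the default: it lower-cases everything after the first letter of each word, so acronyms such as "PIN" or a module name like "SSH" come out as "Pin" and "Ssh" unless they are overridden or the transform treats them specially.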

Having a bro configuration tool display something like the current SSH::password_guesses_limit as

    SSH Password Guesses Limit
    The number of failed SSH connections before a host is designated as guessing passwords.
    Type: count
    Current Value: 30

Or Site::darknet_mode as

    Site Darknet Mode
    I just realized I didn't document the variable name itself :)
    Type: DarknetMode enum
    Current Value: DARKNET
    Choices:
        DARKNET: Only hosts defined in darknet_address_space are dark
        NOT_ALLOCATED: Only hosts NOT listed in used_address_space are dark
        DARKNET_OR_NOT_ALLOCATED: Only hosts defined in darknet_address_space OR NOT listed in used_address_space are dar...
        DARKNET_AND_NOT_ALLOCATED: Only hosts both defined in darknet_address_space AND NOT listed in used_address...

wouldn't be crazy, and such a tool seems like it would be pretty user friendly to me.

> Yep, this notion of making things abstract-able into easy configuration
> interfaces and/or good documentation (using the inline broxygen
> comments) was always in the proposal; Johanna pointed it out in the
> original code sample.

Yeah, I was wondering what a UI would currently look like if you tried to use existing functionality, e.g. just identifier names and broxygen comments. Like Jan, I had a hard time understanding the benefit of having two names for the same value: the identifier and config string. It seems to push more burden than needed onto script authors, like maybe they don’t really care about a UI, but want the improved configuration capabilities. i.e. maybe the requirements of a UI can be separate from the requirements of the new “configuration variables” concept.

Maybe one thing to do is try to actually build/design your ideal UI and/or configuration tool starting with just the existing Bro functionality. You’ll definitely get an understanding of the low-level requirements that way. i.e. first design/build the most basic user experience that functionally works and then, from that state, add whatever you think will be an improvement.

> There is just something about the idea of exposing variable names to
> users (even if it's wrapped in a gui) that is intensely unpalatable to
> me. It's pretty much unheard of among other types of software. It
> would be like exposing internal variable names to command line programs
> instead of abstracting it into easy flags (e.g. -a or --help) or, if in
> a gui a text entry box had a label next to it like
> "GUI::My_Program::user_name" instead of showing "Username".

I’m half facetious in bringing it up, but have you seen CMake?

- Jon

> I'm not sure how exposing something like "input.pcap.filter" is any different from exposing something like "Pcap::filter" from that standpoint. Maybe there's a larger discussion here around what the user experience should look like? I feel like two different things are being talked about now.

I’m thinking along the same lines.

> Directly using variable names in UI elements is not unheard of though, a lot of UI frameworks will do things like present a variable like user_name as "User Name" in the UI. This is usually a simple text transform like
>
> >>> s='My_Program::user_name'
> >>> s.replace("::", " - ").replace("_", " ").title()
> 'My Program - User Name'

Yeah, I’ve noticed this approach in other UIs as well.

- Jon

Yeah, that's been my original concern as well. What if we focused that
new attribute just on displaying something to the user:

    const user_name: string &redef &display_name="User name"

A UI would show it as "User name", but everything else (incl.
internally the configuration framework) would use
My_Program::user_name. This would even work more generically, anything
could have a &display_name and we'd have Broxygen pick up on it too.

Robin

I am not sure that we do need a new language element for that at all. If
we want a new attribute for just displaying information in a different
way, that (at least to me) feels more like something broxygen would do
(i.e. something that a script writer could put into one of the ## comments
if they so desire for the respective variable).

That being said, I still think it would be nice to have something in the
Bro language to denote that a value is a configuration option, mostly for
the reasons stated in the very first email. The biggest reason from my
point of view is strong typing - we tried to implement this just as Bro
scripts and it ends up not so nice.

So - how about something like this:

## The username for our new feature

> I'm not sure how exposing something like "input.pcap.filter" is any different from exposing something like "Pcap::filter" from that standpoint. Maybe there's a larger discussion here around what the user experience should look like? I feel like two different things are being talked about now.

> I’m thinking along the same lines.

Yes, I can see that argument.

> Directly using variable names in UI elements is not unheard of though, a lot of UI frameworks will do things like present a variable like user_name as "User Name" in the UI. This is usually a simple text transform like
>
>>>> s='My_Program::user_name'
>>>> s.replace("::", " - ").replace("_", " ").title()
> 'My Program - User Name'

> Yeah, I’ve noticed this approach in other UIs as well.

I admittedly did not think of this - that could work as a neat default.

The only thing that I would like to avoid (which is obviously separate
from this) is internally remapping variable names to configuration names
in a non-reversible manner; then one suddenly has to think about what to
do when names conflict (several variable names being able to automatically
map to the same configuration name). But - that seems to be a separate
concern :)

Johanna

Still not sure how much of an issue that is, provided the display names are only for display and not used to actually locate/update identifier values. E.g. if a user sees 2 “User Name” fields in a UI, I think we’re still able to fall back on the broxygen documentation comments to provide more context to the user. Or if there are standardized/automatic conventions for these display names that are based on modules/namespacing, I’m not sure how often you’d even see such conflicts, or I’d expect they’d get patched out pretty rapidly by the community when they pop up.

- Jon

This actually was my point - which I apparently did not make clear. As long as it is only for display it is not a problem - I just don't want it to be used for identification :)

Johanna
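
The distinction drawn in the last few messages (display names may collide, identifiers may not) can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the identifiers are made up, and `to_display` deliberately drops the module part to show a lossy, non-reversible mapping.

```python
from collections import defaultdict

# Hypothetical sketch: shows why a lossy display name cannot identify an option.


def to_display(identifier):
    """Lossy label: drop the module, turn underscores into spaces, Title Case."""
    variable = identifier.split("::")[-1]
    return variable.replace("_", " ").title()


def display_collisions(identifiers):
    """Return each display label shared by more than one identifier."""
    groups = defaultdict(list)
    for identifier in identifiers:
        groups[to_display(identifier)].append(identifier)
    return {label: ids for label, ids in groups.items() if len(ids) > 1}


# Made-up identifiers; the full name stays the lookup key, the label is cosmetic.
identifiers = ["My_Program::user_name", "Other_Program::user_name", "Pcap::filter"]
print(display_collisions(identifiers))
```

As long as values are located and updated through the full identifier, a collision like this is only a cosmetic problem that documentation comments or a module prefix in the label can resolve.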

First of all, thanks to Johanna for getting this discussion going, and thanks to everyone who’s weighed in so far. I’m really excited to see this feature in Bro, and I’m also happy to see how much interest this has already garnered.

To extend what Seth said about our two user groups – I think that this feature is where those two groups intersect. While a lot of thought has gone into what this looks like from an end-user perspective, I want to make sure that we also make this easy and elegant from a developer’s perspective. Bro scripting already has a high barrier to entry, and I think that we need to be careful not to raise that barrier further. As was discussed during BroCon, I think that the Bro community is increasingly relying on developers outside of the core team to contribute scripts – and that’s a great thing!

I think that it’s important to have this behavior come with a reasonable default. I think that whatever we choose should just work out of the box. For example, I think the existing construct should continue to work:

const user_name: string &redef

At the end of the day, what we’re discussing is how a developer should document and expose a feature to an end-user. If, as a developer, I choose a bad variable name, then I’m not providing a good experience for the end-user, but that’s my decision. I don’t think that forcing developers to essentially add documentation via syntactic sugar is the right approach. If their variable names are confusing, people are less likely to use their script.

I think that a lot of what users might want to re-configure is too complex to be explained through a variable name anyway. We already have a system in place to document variables, and I think that we need to rely on that instead of focusing so much on which name is exposed to the user.

As we look at moving Bro scripts to packages, I think we need to look at how other package repositories have handled similar configuration options. Puppet Forge, for instance, has a types tab which documents the names of the parameters, and what they do: https://forge.puppet.com/puppetlabs/mysql/types This would be pretty easy to do with the Broxygen documentation, and a UI could also expose this.

tl;dr version: I want to find something that makes life easy for both developers and end-users, and I think we already have the documentation mechanism in place to be more expressive about variables.

–Vlad

> I think that it's important to have this behavior come with a reasonable
> default. I think that whatever we choose should just work out of the box.
> For example, I think the existing construct should continue to work:

> const user_name: string &redef

I agree; note that what I proposed preserves this compatibility (it does
not change anything at all about redefs). The feature that the
configuration feature wants to bring is the ability to change options
during runtime - which cannot be accomplished with redefs. redef-able
consts will still have their place afterwards (for everything that still
cannot be changed during runtime).

> At the end of the day, what we're discussing is how a developer should
> document and expose a feature to an end-user. If, as a developer, I choose
> a bad variable name, then I'm not providing a good experience for the
> end-user, but that's my decision. I don't think that forcing developers to
> essentially add documentation via syntactic sugar is the right approach. If
> their variable names are confusing, people are less likely to use their
> script.

> I think that a lot of what users might want to re-configure is too complex
> to be explained through a variable name anyway. We already have a system in
> place to document variables, and I think that we need to rely on that
> instead of focusing so much on which name is exposed to the user.

I agree with this.

> As we look at moving Bro scripts to packages, I think we need to look at
> how other package repositories have handled similar configuration options.
> Puppet Forge, for instance, has a types tab which documents the names of
> the parameters, and what they do:
> https://forge.puppet.com/puppetlabs/mysql/types This would be pretty easy
> to do with the Broxygen documentation, and a UI could also expose this.

Yup, I also agree with this.

Johanna

I just wanted to note that after keeping up on this thread that I agree with those same points. :)

   .Seth

Just had a misc. thought related to the use of ‘const’:

I remember first being confused/unfamiliar with Bro’s use of ‘const’ and thought it meant “never changes” only to learn it’s further qualified as “never changes at run-time” so that the ‘const’ + ‘&redef’ combo ultimately means “never changes at run-time, but initial value may be changed at parse-time”.

Though, technically, ‘const’ can still be modified at run-time, if you know how… e.g. send_id...

And that’s maybe all ok -- it’s easy to explain unfamiliar context as I did above and the means of subverting runtime modification restrictions isn’t actively advertised as such. Though, with a new config framework, there are opportunities:

1) could remove need for the backdoor method of modifying ‘const’ values at runtime, (e.g. via send_id) as that’s done through new identifiers explicitly tagged for config purposes

2) using a new ‘configopt’ access modifier may be warranted over re-using ‘const’ for configuration values as the semantics are now blatantly different than what ‘const’ is expected to mean. i.e. config values are meant to change at run-time, but only via a restricted API and ‘const’ is still intended to never change at run-time

- Jon
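
As a rough illustration of point 2, the Python sketch below models options that can change at run-time only through a small, type-checked API. It is not how Bro implements anything: `OptionRegistry` and its methods are hypothetical names, and checking against the type of the default value merely stands in for the strong typing mentioned earlier in the thread.

```python
# Hypothetical sketch: run-time changes go through one restricted, typed API.

class OptionRegistry:
    def __init__(self):
        self._values = {}
        self._types = {}

    def register(self, name, default):
        """Declare an option; its type is fixed by the default value."""
        if name in self._values:
            raise ValueError(f"option {name!r} already registered")
        self._types[name] = type(default)
        self._values[name] = default

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        """Change an option at run-time, rejecting unknown names and wrong types."""
        if name not in self._values:
            raise KeyError(f"unknown option {name!r}")
        expected = self._types[name]
        if type(value) is not expected:
            raise TypeError(
                f"{name!r} expects {expected.__name__}, got {type(value).__name__}"
            )
        self._values[name] = value


registry = OptionRegistry()
registry.register("Pcap::filter", "ip")
registry.set("Pcap::filter", "tcp")
print(registry.get("Pcap::filter"))
```

Anything not registered this way would keep the usual 'const' meaning of never changing at run-time.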