first try at a skeleton script

http://github.com/sethhall/bro_scripts/blob/master/skeleton.bro

Sorry for taking a bit to get back on this, but I'm still mulling
over this. Generally, this looks good. Here are some additional
(unsorted) thoughts:

- The explanatory text is very helpful when writing a script,
however we should make it clear that once a script is done, this
should be removed, as otherwise we'll end up with many scripts
having the same boilerplate in them. So, perhaps it makes sense to
start anything that's just explaining something with an explicit
marker, like

    # REMOVE-ME: The export section contains the external ...
    # REMOVE-ME: ......

- Also, some elements have a documentation string (i.e., something
that will end up in the auto-generated documentation later) while
others don't. I think that generally at least everything in the
export section should have documentation.

- Taking the two previous points together, it doesn't yet become
clear what is documentation for a script's *users*, vs. what is help
for a script *writer*.

- As an alternative, we could keep the writer help inside a script
to a minimum, but have an additional external document giving more
details. Not sure which approach is better.

- The Notice enums should be documented as well.

- Another thought, considering our goal of building "configuration
wizards": it would be cool if such a wizard could extract from
the script everything that it needs for providing the user with a
corresponding configuration dialog. Actually I don't think there's
much additional stuff needed to facilitate that once the
documentation strings are in place, but it's worth thinking about
what else could be helpful here.
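
To illustrate the REMOVE-ME idea from the first point: if writer help is always prefixed with that marker, removing it from a finished script becomes mechanical. Below is a minimal sketch in Python, assuming the marker is exactly "# REMOVE-ME:" at the start of a comment line. The function name and the sample script are made up for illustration and are not part of the skeleton.

```python
REMOVE_MARKER = "# REMOVE-ME:"  # marker text as proposed above (assumed exact)


def strip_writer_help(script_text: str) -> str:
    """Drop every line whose first non-blank text is the REMOVE-ME marker."""
    kept = [
        line
        for line in script_text.splitlines()
        if not line.lstrip().startswith(REMOVE_MARKER)
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical skeleton fragment: one writer-help line, one real doc line.
    skeleton = (
        "# REMOVE-ME: The export section contains the external interface.\n"
        "export {\n"
        "    # Threshold used by this script.\n"
        "    const threshold = 10 &redef;\n"
        "}\n"
    )
    print(strip_writer_help(skeleton), end="")
```

Ordinary comments are left alone, so user-facing documentation survives the cleanup.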

Robin

Regarding the TODO: "Figure out a style for Sphinx-type
documentation", here are my thoughts:
    
    - As a general rule, I guess a comment block should always be
    associated with the next language element.
    
    - The script itself should start with a block describing the
    overall purpose of the script and perhaps potential
    caveats/things to keep in mind etc.
    
    - Each individual doc section should start with a single short
    sentence summarizing the documented element, followed by further
    text going into more detail.
    
    - Generally, all doc text is formatted in reST. However, one
    thing I always hate when writing documentation strings is adding
    all kinds of markers for things like parameters, return values,
    etc. (e.g., "@param", or ":param:"). What I've played with for
    HILTI is reducing this stuff to a minimum and extracting it from
    the structure of the text. Example:
    
         # Function for checking That.

Finally coming back around to this...

    - As a general rule, I guess a comment block should always be
    associated with the next language element.

ACK.

    - The script itself should start with a block describing the
    overall purpose of the script and perhaps potential
    caveats/things to keep in mind etc.

ACK.
Question: how do we distinguish a script's doc section from the doc
section of the first language element if somebody chooses to use only
one of them?

    - Each individual doc section should start with a single short
    sentence summarizing the documented element, followed by further
    text going into more detail.

ACK.

    - Generally, all doc text is formatted in reST. However, one
    thing I always hate when writing documentation strings is adding
    all kinds of markers for things like parameters, return values,
    etc. (e.g., "@param", or ":param:").
[cut]
     - "Returns: texttextext" is the magic to mark the description
     of the return value.

Thoughts?

I'm just wondering whether not using markers might end up exploding in
our faces once enough people write doc strings ....

- The explanatory text is very helpful when writing a script,
however we should make it clear that once a script is done, this
should be removed, as otherwise we'll end up with many scripts
having the same boilerplate in them. So, perhaps it makes sense to
start anything that's just explaining something with an explicit
marker, like

    # REMOVE-ME: The export section contains the external ...
    # REMOVE-ME: ......

- Also, some elements have a documentation string (i.e., something
that will end up in the auto-generated documentation later) while
others don't. I think that generally at least everything in the
export section should have documentation.

- Taking the two previous points together, it doesn't yet become
clear what is documentation for a script's *users*, vs. what is help
for a script *writer*.

Is a user somebody who just needs to use the "interface" (globals,
events, etc.) the script defines or somebody who might want to modify
the script according to their needs?
I guess the question is: where do we draw the line? Maybe it would help
if the per-script doc-block would specify the "public" interface that
users have to know about.....

Or maybe the per-element doc-comments can have a flag indicating whether
they are public interface or internal functionality. Then the doc
generator can separate these elements in the output. A user can then
quickly see what's relevant (which events, functions, globals), and
somebody who needs to understand the script (to enhance or modify it)
also has all the information.

- As an alternative, we could keep the writer help inside a script
to a minimum, but have an additional external document giving more
details. Not sure which approach is better.

I think that everything that documents the actual script should go into
the script. If we separate code from documentation, we will probably
face the problem that the documentation and the code don't match further
down the road. Documentation about language elements can maybe go into a
different document.

- The Notice enums should be documented as well.

Maybe more generally: can we come up with a nice way to document the redef
of the various "special purpose" globals that are edited by many
scripts? E.g., Notice enums, capture filters, analyzer configurations, etc.

cu
gregor

Probably a good idea. If they are messing with the internal interface, it is probably easier to look at the source code than to bring up a separate documentation manual. It would be easy to flood the documentation with so much information that it offers nothing beyond what reading the comments in the source directly would.

I'm just wondering whether not using markers might end up exploding in
our faces once enough people write doc strings ....

"Exploding" in which sense?

(Note that I'm not too strong on not using markers; I personally
find them cumbersome, but I can also configure my editor to do it
for me :-) )

Is a user somebody who just needs to use the "interface" (globals,
events, etc.) the script defines or somebody who might want to modify
the script according to their needs?

I meant the former. I don't think we should formalize documenting a
script's internals. As Adam wrote, for that it's often easier to
read the source directly, and where it isn't, a few informal
comments (which don't go into the reference manual) are fine, just
as it is now.

My point was a different one: Seth's initial skeleton had some
instructions for the person writing a script *initially* (e.g., how
to structure things etc.). I found that confusing as there was no
separation between these and things which will be left in the final
script for people later needing to understand the interface.

Or maybe the per-element doc-comments can have a flag indicating whether
they are public interface or internal functionality.

Can't we just use the export section? Everything in there is by
definition part of the interface and should be documented.

Maybe more generally: can we come up with a nice way to document the redef
of the various "special purpose" globals that are edited by many
scripts? E.g., Notice enums, capture filters, analyzer configurations, etc.

Good point. For these we need two different types of documentation:
one defining the purpose of the global, and one which states how a
script modifies the global when loaded. However, there aren't that
many of these so we might be fine special-casing them (i.e., just
listing which notices, filters, etc. a script defines).

Robin

I'll rework the skeleton script sometime this week to account for the documentation aimed at script users and at script developers. Maybe the right way to do it is for me to remove the embedded documentation for the script developers and move all of that documentation into a separate document. The skeleton script will only contain what the script developer would write.

  .Seth

I'm just wondering whether not using markers might end up exploding in
our faces once enough people write doc strings ....

"Exploding" in which sense?

(Note that I'm not too strong on not using markers; I personally
find them cumbersome, but I can also configure my editor to do it
for me :-) )

I'm worrying that if we don't require markers (but infer the parameters)
somebody will write a doc string that confuses the auto-detection and we
get incorrect / weird output. Don't know how much of a problem this is
going to be though.
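
To make the concern concrete, here is a minimal Python sketch of marker-free extraction. It assumes only the two conventions mentioned in this thread: the first sentence of a block is the summary, and a line starting with "Returns:" describes the return value. The function name and the sample comments are hypothetical. This is not how HILTI or any existing doc generator does it.

```python
def parse_doc_block(comment_lines):
    """Split a comment block into (summary, details, returns)."""
    # Strip the leading '#' and surrounding whitespace from every line.
    text_lines = [line.lstrip().lstrip("#").strip() for line in comment_lines]
    returns = None
    body = []
    for line in text_lines:
        if line.startswith("Returns:"):
            returns = line[len("Returns:"):].strip()
        else:
            body.append(line)
    text = " ".join(part for part in body if part)
    # Naive rule: the summary ends at the first period followed by a space.
    summary, sep, details = text.partition(". ")
    if sep:
        summary += "."
    return summary, details, returns


if __name__ == "__main__":
    ok = [
        "# Function for checking That.",
        "# It looks at the connection state.",
        "# Returns: true if That holds.",
    ]
    print(parse_doc_block(ok))
    # An abbreviation is enough to cut the summary short.
    confusing = ["# Checks e.g. the header.", "# Returns: nothing useful."]
    print(parse_doc_block(confusing))
```

The second input shows the kind of surprise in question: the naive sentence split stops at "e.g." and produces a truncated summary, which explicit markers would avoid.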

Is a user somebody who just needs to use the "interface" (globals,
events, etc.) the script defines or somebody who might want to modify
the script according to their needs?

I meant the former. I don't think we should formalize documenting a
script's internals. As Adam wrote, for that it's often easier to
read the source directly, and where it isn't, a few informal
comments (which don't go into the reference manual) are fine, just
as it is now.

Ok.
OTOH, I think it's nice if functions and parameters are described or
documented briefly, even if they are internal. If somebody has to
modify the policy script's code (as I had to do for my http-analysis),
it's helpful to have such documentation in the source code. But yes,
it doesn't need to go into the reference manual.

My point was a different one: Seth's initial skeleton had some
instructions for the person writing a script *initially* (e.g., how
to structure things etc.). I found that confusing as there was no
separation between these and things which will be left in the final
script for people later needing to understand the interface.

ACK.

Or maybe the per-element doc-comments can have a flag indicating whether
they are public interface or internal functionality.

Can't we just use the export section? Everything in there is by
definition part of the interface and should be documented.

Yes. Didn't think about that actually.
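
If the export section defines the interface, a simple check could flag exported elements that lack a doc comment. Below is a rough Python sketch, assuming a declaration counts as documented when the previous non-blank line is a "#" comment and that the section opens with "export {" on a single line. The keyword list, the function name, and the sample script are illustrative assumptions, not an existing tool.

```python
DECL_KEYWORDS = ("global", "const", "type", "redef")  # assumed declaration starters


def undocumented_exports(script_text):
    """Return export-section declarations whose previous non-blank line is not a comment."""
    missing = []
    depth = 0
    in_export = False
    prev = ""
    for raw in script_text.splitlines():
        line = raw.strip()
        if not in_export:
            if line.startswith("export") and line.endswith("{"):
                in_export, depth, prev = True, 1, ""
            continue
        first = line.split(None, 1)[0] if line else ""
        # Only look at declarations directly inside the export block.
        if depth == 1 and first in DECL_KEYWORDS and not prev.startswith("#"):
            missing.append(line)
        if not line.startswith("#"):
            depth += line.count("{") - line.count("}")
        if depth <= 0:
            in_export = False
        if line:
            prev = line
    return missing


if __name__ == "__main__":
    # Hypothetical script fragment with one documented and one undocumented element.
    sample = (
        "export {\n"
        "    # Threshold for raising the notice.\n"
        "    const threshold = 10 &redef;\n"
        "\n"
        "    global seen_hosts: set[addr];\n"
        "}\n"
    )
    print(undocumented_exports(sample))
```

Anything outside the export block is ignored, which matches the idea that internals only need informal comments.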

Maybe more generally: can we come up with a nice way to document the redef
of the various "special purpose" globals that are edited by many
scripts? E.g., Notice enums, capture filters, analyzer configurations, etc.

Good point. For these we need two different types of documentation:
one defining the purpose of the global, and one which states how a
script modifies the global when loaded. However, there aren't that
many of these so we might be fine special-casing them (i.e., just
listing which notices, filters, etc. a script defines).

Yeah. And maybe have a tool or script that, given the names of the
"special globals", goes through the policy files and lists how each
policy script modifies them.
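
Such a tool could start out very small. Below is a Python sketch that, given a script's text and a list of special globals, collects the redef lines touching each one. It assumes redefs are written on a line beginning with "redef" (optionally "redef enum") followed by the global's name. The function name and the sample redefs (including the notice name) are made up for illustration.

```python
import re

# Matches "redef NAME" and "redef enum NAME" at the start of a line.
REDEF_PATTERN = re.compile(r"^\s*redef\s+(?:enum\s+)?([A-Za-z_][\w:]*)")


def redefs_of(script_text, special_globals):
    """Map each special global to the redef lines in the script that touch it."""
    found = {name: [] for name in special_globals}
    for line in script_text.splitlines():
        match = REDEF_PATTERN.match(line)
        if match and match.group(1) in found:
            found[match.group(1)].append(line.strip())
    return found


if __name__ == "__main__":
    # Hypothetical policy script content.
    script = (
        "redef enum Notice += { SSH_Login };\n"
        'redef capture_filters += { ["ssh"] = "tcp port 22" };\n'
        "redef some_other = 1;\n"
    )
    for name, lines in redefs_of(script, ["Notice", "capture_filters"]).items():
        print(name, lines)
```

Running this over every policy file and grouping the results by script would give the per-script listing described above. Multi-line redef bodies would need more than this line-based match.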

cu
gregor