package manager progress

On the other hand, if, for example, somebody ends up
browsing the package source repository on GitHub, I'm sure they'd be
confused by all the packages pointing to very old versions.

Yeah, agree that is confusing and a problem of using submodules without ever updating them in the source repo.

So I'm
wondering if it would be worth switching to custom index instead of
submodules; seems that wouldn't be difficult either if indeed all we
need to do is track the external URLs somehow.

I’m on board with switching to a custom index format.

Also, if you want to
track discoverability metadata there already as well, seems that the
URL could just become part of that, no?

Yes.

Any preference on the new index format? Single index file? Multiple files? INI, JSON, something else?

I think still allowing package sources to be structured in a directory hierarchy is more intuitive to navigate and maybe less intimidating to modify than a single file as the source grows over time. And I’m already using INI format in 2 other places, so seems fine to apply here, too. So a proposed structure of a package source:

  https://github.com/bro/packages
    alice/
      bro-pkg.index
    bob/
      bro-pkg.index
    …

alice’s bro-pkg.index:

  [foo]
  url = https://github.com/alice/foo
  keywords = Mr.T pity

  [bar]
  url = https://github.com/alice/bar
  keywords = club pub drinks

bob’s bro-pkg.index:

  [baz]
  url = https://github.com/bob/baz
  keywords = lightning storm

Output of `bro-pkg list all`:

  bro/alice/foo
  bro/alice/bar
  bro/bob/baz
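
To make the proposed layout concrete, here is a minimal sketch of how a directory of bro-pkg.index files could be turned into the `bro-pkg list all` output shown above. It assumes the indexes are plain INI files readable by Python's configparser; the function name `list_packages` and the throwaway temporary directory are made up for illustration and are not part of bro-pkg.

```python
import configparser
import os
import tempfile

INDEX_NAME = "bro-pkg.index"  # index file name taken from the proposal


def list_packages(source_name, source_dir):
    """Return names like 'bro/alice/foo' for each section of each index file."""
    names = []
    for root, dirs, files in os.walk(source_dir):
        dirs.sort()  # visit author directories in a stable order
        if INDEX_NAME not in files:
            continue
        # Interpolation is disabled so '%' in a URL is never treated specially.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(os.path.join(root, INDEX_NAME), encoding="utf-8")
        parts = [source_name]
        rel = os.path.relpath(root, source_dir)
        if rel != ".":
            parts.extend(rel.split(os.sep))
        for section in parser.sections():
            names.append("/".join(parts + [section]))
    return names


ALICE_INDEX = """\
[foo]
url = https://github.com/alice/foo
keywords = Mr.T pity

[bar]
url = https://github.com/alice/bar
keywords = club pub drinks
"""

BOB_INDEX = """\
[baz]
url = https://github.com/bob/baz
keywords = lightning storm
"""

# Recreate the example source in a temporary directory and list it.
with tempfile.TemporaryDirectory() as source_dir:
    for author, text in (("alice", ALICE_INDEX), ("bob", BOB_INDEX)):
        os.makedirs(os.path.join(source_dir, author))
        path = os.path.join(source_dir, author, INDEX_NAME)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    names = list_packages("bro", source_dir)

print("\n".join(names))
```

Because each author directory holds its own small file, a pull request that adds a package only touches that one index.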

- Would suggest to rename “pkg.meta” to, say, “bro-pkg.meta”

Sure.

- Does "upgrade" show the packages affected and ask for confirmation?
I would suggest either doing that or require an --all option for
upgrading everything, as that's a potentially dangerous operation.

It doesn’t ask for confirmation, but I’m in favor of requiring an explicit --all.
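
As a rough illustration of the "explicit --all" idea, the sketch below uses argparse to refuse a bare `upgrade`. This is not the actual bro-pkg command line; `build_parser`, `select_upgrades`, and the list of installed packages are hypothetical names used only for this example.

```python
import argparse


def build_parser():
    """Build a hypothetical bro-pkg command line with an 'upgrade' subcommand."""
    parser = argparse.ArgumentParser(prog="bro-pkg")
    subparsers = parser.add_subparsers(dest="command")
    upgrade = subparsers.add_parser("upgrade", help="upgrade installed packages")
    upgrade.add_argument("packages", nargs="*", help="packages to upgrade")
    upgrade.add_argument("--all", action="store_true",
                         help="upgrade every installed package")
    return parser


def select_upgrades(args, installed):
    """Return the packages to upgrade, refusing an implicit 'everything'."""
    if args.all and args.packages:
        raise ValueError("pass either package names or --all, not both")
    if args.all:
        return list(installed)
    if not args.packages:
        raise ValueError("nothing selected: name packages or pass --all")
    unknown = [name for name in args.packages if name not in installed]
    if unknown:
        raise ValueError("not installed: " + ", ".join(unknown))
    return list(args.packages)


# Hypothetical set of installed packages, for demonstration only.
installed = ["bro/alice/foo", "bro/bob/baz"]
parser = build_parser()
everything = select_upgrades(parser.parse_args(["upgrade", "--all"]), installed)
just_foo = select_upgrades(parser.parse_args(["upgrade", "bro/alice/foo"]), installed)
```

With this shape, `bro-pkg upgrade` on its own is an error rather than a silent upgrade of everything.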

- I suppose upgrading does (better: will do) dependency checking
again, including making sure the Bro version matches the one that the
update now requires?

Yes, I imagine the dependency analysis for upgrading and installing being the same or similar process.

- When installing the package manager as part of Bro, could we pull in
the Python dependencies automatically, for example by installing
them into the same prefix?

Yes, I can likely get that to work.

Both GitHub and semantic_version are
pretty non-standard. Using them is ok I think but it would be nice
if "bro-pkg" wouldn't abort first thing because they aren't
installed yet.

Alternatively, I can have CMake detect whether they are installed, then, if not, don’t install bro-pkg and put a warning/explanation in the CMake summary output. Let me know which is preferred. I’m a bit in favor of auto-installing the python dependencies into Bro’s install prefix.
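
Independent of which install option is chosen, the "don't abort first thing" request could also be handled inside bro-pkg by checking for the modules up front and printing an explanation. The following is only a sketch: the import names in `REQUIRED_MODULES` are my assumption (the thread names the dependencies only loosely), and `missing_modules` and `dependency_report` are hypothetical helpers.

```python
import importlib.util

# Assumed top-level import names for the two dependencies mentioned above.
REQUIRED_MODULES = ("semantic_version", "github")


def missing_modules(module_names):
    """Return the top-level modules from module_names that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def dependency_report(module_names=REQUIRED_MODULES):
    """Return a readable message about missing modules, or None if all exist."""
    missing = missing_modules(module_names)
    if not missing:
        return None
    return "bro-pkg is missing Python modules: " + ", ".join(missing)


message = dependency_report()
if message is not None:
    print(message)
```

The check only looks modules up without importing them, so it is cheap enough to run at startup.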

- How about adding a note to either packages.bro or the whole
packages/ directory that it's automatically maintained and not
supposed to be manually messed with?

Ok.

- In bro-pkg.conf, has "default" in "[sources]" a special meaning, or
could it be any tag? Assuming the latter, I would just call it
"bro"

It’s arbitrary, will change ‘default’ to ‘bro’.

- For our default package source, do we want to support non-GitHub
repositories? If so, a naming scheme by GitHub user won't work.

The hierarchy isn’t strictly required to use GitHub usernames. Generally it could be “$author_name/$package_name”, where the most common case is for $author_name to be a GitHub user name. A domain name, company/organization name, or any string to help identify the author would work.

- Suggest to rename "/opt/bro/var/lib/package-manager" to
"../bro-package-manager" or "../bro-pkg".

Agree about changing “package-manager” to “bro-package-manager”, but do you also mean to get rid of the “lib” subdir? I think that fits within the Filesystem Hierarchy Standard [1]. For /var/lib that says:

“State information. Persistent data modified by programs as they run, e.g., databases, packaging system metadata, etc.”

There are probably nuances that let you get away with other locations when installing to prefixes other than ‘/’, but I find it generally works well to just replace ‘/’ with the user’s preferred install prefix. Let me know what you think.

- Once we support dependencies on Bro versions, would be nice if that
worked also with the "x.y-z" scheme that git master uses (and maybe
it just does anyways).

Should already work via semantic_version.

   - Python 3.x works, right? Then I'd list that explicitly.

Worked for me. Will do.

   - A quick-start guide would be helpful that just mentions the most
     important steps, including basic installation along with Bro
     itself (once that's merged).

Tried to do this in the Overview/README’s “Installation” section. I think reorganizing that in smaller sections with bullet points to follow and re-labeling it as “quick-start guide” may help.

   - The "Installation" section becomes a bit confusing towards
     the end with all those paths. Maybe split some parts out into an
     advanced section or so?

Yeah, will try to re-organize.

   - How-tos would be helpful that show by example how to create
     (1) a pure script package, (2) a binary Bro plugin, and (3) a
     BroControl plugin.

Sure, I’ll add explicit step-by-step guides for each.

- Jon

[1] https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

I think still allowing package sources to be structured in a directory
hierarchy is more intuitive to navigate and maybe less intimidating to
modify than a single file as the source grows over time. And I’m
already using INI format in 2 other places, so seems fine to apply
here, too.

Yep, agree with both. That also makes merging pull requests easy /
contained.

So a proposed structure of a package source:

Looks good to me.

the CMake summary output. Let me know which is preferred. I’m a bit
in favor of auto-installing the python dependencies into Bro’s install
prefix.

I also prefer auto-installation, unless there's a reasonable risk that
it could interfere with already installed versions of those packages,
not sure?

The hierarchy isn’t strictly required to use GitHub usernames.
Generally it could be “$author_name/$package_name”, where the most common
case is for $author_name to be a GitHub user name. A domain name,
company/organization name, or any string to help identify the author
would work.

Ok, we probably need to write down our policy somewhere what we
do/expect for the default source.

Agree about changing “package-manager” to “bro-package-manager”, but
do you also mean to get rid of the “lib” subdir?

No, I didn't, sorry for the confusion. I was just too lazy to type the
full path again, should have inserted 3 dots to make that clear.

Tried to do this in the Overview/README’s “Installation” section. I
think reorganizing that in smaller sections with bullet points to
follow and re-labeling it as “quick-start guide” may help.

Ack.

Robin

in favor of auto-installing the python dependencies into Bro’s install
prefix.

I also prefer auto-installation, unless there's a reasonable risk that
it could interfere with already installed versions of those packages,
not sure?

Don’t think so.

Ok, we probably need to write down our policy somewhere what we
do/expect for the default source.

Expanding the README of https://github.com/bro/packages to include notes on the submission process and naming convention/policy seems like the place to me.

- Jon

Hmm, is it naming conventions that people have a problem with or the implication of policing? These can be separated. I don’t see a downside to promoting conventions.

It also seems that some of the reasoning (e.g., that we have metadata) is based on an assumption that we will have good metadata. But I recall a lot of resistance to requiring basic metadata.

I believe this merits a little more discussion and would like to nudge behavior if possible, though not compel it. We could do this by simply providing a skeleton taxonomy in which people could always just throw things into “misc” or some equivalent.

The problem is that the suggested naming convention wouldn't work:
it's not clear how somebody would name their plugin if it provided
more than one specific piece of functionality.

Robin

I would expect that any package repository has that same issue, there is no perfect taxonomy.

If we are going to rely on metadata, which I agree can be better as you can tag a package with multiple categories, we should probably have some basic requirements for this type of metadata. Do you agree?

:Adam

At a minimum, it’s useful to provide a list of “suggested keywords” that people can choose from when tagging their packages so there’s at least a common terminology to search within.

Do you mean to require something more than that? E.g. “packages submitted to the ‘bro’ package source MUST NOT use keywords outside the pre-approved list”? Or “packages submitted to the ‘bro’ package source MUST contain at least one keyword tag”?

- Jon

If we are going to rely on metadata, which I agree can be better as you can tag a package with multiple categories, we should probably have some basic requirements for this type of metadata.

At a minimum, it’s useful to provide a list of “suggested keywords” that people can choose from when tagging their packages so there’s at least a common terminology to search within.

Do you mean to require something more than that? E.g. “packages submitted to the ‘bro’ package source MUST NOT use keywords outside the pre-approved list”? Or “packages submitted to the ‘bro’ package source MUST contain at least one keyword tag”?

I was thinking of requiring at least one keyword.
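
A check like that would be easy to automate when reviewing submissions. Here is a minimal sketch, assuming the INI index format proposed earlier in the thread and a space-separated `keywords` field; `packages_missing_keywords` is a hypothetical helper, and the example index is altered from the earlier one (the `bar` entry has its keywords removed and `qux` is invented) to show failures.

```python
import configparser


def packages_missing_keywords(index_text):
    """Return the package sections that have no keyword at all."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(index_text)
    return [name for name in parser.sections()
            if not parser.get(name, "keywords", fallback="").split()]


# Altered example index: 'bar' lacks the field, 'qux' leaves it empty.
EXAMPLE_INDEX = """\
[foo]
url = https://github.com/alice/foo
keywords = Mr.T pity

[bar]
url = https://github.com/alice/bar

[qux]
url = https://github.com/alice/qux
keywords =
"""

untagged = packages_missing_keywords(EXAMPLE_INDEX)
print(untagged)
```

A missing field and an empty field are treated the same way here, since either one leaves the package unsearchable by keyword.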