adding credits to the schema for package metadata

A few months ago at (what was then called) BroCon, in the Community Session
I put up a list of newly contributed packages along with my best guess as
to authorship / whom to credit for the contribution. A couple of contributors
came up to me afterwards to discuss adjusting how they were credited; and,
more generally, the notion of adding an explicit "credits" field to the
info associated with a package.

This could look like:

  [package]
  credit = Originally written by A. Sacker <ace@sacker.com>. JSON
      support added by W00ter (Acme Corporation).

As suggested by the example, the field would be free-form. Here, the
original author decided to include their email address, and the additional
contributor their company affiliation.

We wouldn't have any a priori rules about who can update the "credit" field,
but rather rely on the community to do that reasonably (and I imagine go
back to the original contributor if a dispute arose).

How does that sound?

    Vern

> [package]
> credit = Originally written by A. Sacker <ace@sacker.com>. JSON
>     support added by W00ter (Acme Corporation).

I like this idea.

To support incremental growth of the credit field, we could call it
"credits" and make it a list of strings:

   credits = A. Sacker <ace@sacker.com>.,
     JSON support added by W00ter (Acme Corporation)

This may make it easier to grow the credits, simply by adding another credit string to the end.
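
For illustration, a minimal sketch of how a comma-separated "credits" field could be read back out of a metadata file. It assumes the metadata is INI-style text that Python's standard configparser can parse; the helper name parse_credits and the sample text are hypothetical, not part of the package manager.

```python
import configparser

# Hypothetical metadata text using the proposed "credits" field.
# Continuation lines must be indented for configparser to join them.
META_TEXT = """\
[package]
credits = A. Sacker <ace@sacker.com>.,
    JSON support added by W00ter (Acme Corporation)
"""


def parse_credits(meta_text):
    """Return the entries of the "credits" field as a list of strings.

    Assumption: entries are separated by commas, as proposed above.
    A missing field yields an empty list.
    """
    # Disable interpolation so a literal '%' in a credit is not an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(meta_text)
    raw = parser.get("package", "credits", fallback="")
    # Collapse line breaks and extra whitespace, then split on commas.
    flat = " ".join(raw.split())
    return [entry.strip() for entry in flat.split(",") if entry.strip()]


credits = parse_credits(META_TEXT)
for entry in credits:
    print(entry)
```

One caveat of this format: a credit string that itself contains a comma (say, "Acme, Inc.") would be split in two, so a real implementation would need an escaping rule or a different separator.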

    Matthias

So first I have to question the use-case: who or what will need this
information and how often?

This is not information I think users of the command-line tool ever
care about. E.g. I've never used other package-management tools to
look into credit/author info.

It's maybe something packages.zeek.org would show, but I don't see how
that would be better than looking at the "contributors" stats already
compiled by GitHub from author info encoded directly into git commits.

It's definitely info we've needed/used ourselves, like for the BroCon
presentation. Is that going to be a regular thing? What was
insufficient about using implicit author info in the git commit log?
It's left up to the contributor to self-report (via their git config)
the name/email info as they wish to be recognized. I'm not sure how
we can get any more accurate than that. Introducing
manually-maintained data can only make that info less reliable.

Alternatively, why would it help to have more free-form "credit"
information specifically in the metadata file versus in README,
CONTRIBUTORS, or AUTHORS files, which are already a common convention
in open-source projects?

Are there other use-cases I missed?

I'm not so much opposed to standardizing on a new metadata field
(people can already add a 'credit' field right now if they want,
there's no code changes to bro-pkg needed since it won't use it for
anything), but if it's only optional, not sure it will be
adopted/maintained widely and so not solve the problem as intended.

- Jon

> To support incremental growth of the credit field, we could call it
> "credits" and make it a list of strings:

Good thought - makes extracting & formatting it easier.

    Vern

> So first I have to question the use-case: who or what will need this
> information and how often?

The uses I have in mind are (1) displayed by the Web UI when browsing
packages, and (2) shout-outs at our annual conference.

> This is not information I think users of the command-line tool ever
> care about. E.g. I've never used other package-management tools to
> look into credit/author info.

Agreed.

> It's maybe something packages.zeek.org would show, but I don't see how
> that would be better than looking at the "contributors" stats already
> compiled by GitHub from author info encoded directly into git commits.

The problem with those stats is (1) they're removed from the Web UI,
(2) they're not in a coherent form. #2 was the issue that came up at
BroCon. The git commits don't necessarily identify the author like they
would want for public recognition; can be missing co-authors for instances
where one author does the git end of publishing the package even though
several people worked on it; and can make it unclear whether a given
git contributor merits public credit.

A primary goal here is to encourage contributors in terms of gaining
public visibility for themselves / their group / their affiliation.
I think there are enough degrees of freedom in doing so that we won't be
able to simply infer the correct way to do it based on the GitHub activity.

> It's definitely info we've needed/used ourselves, like for the BroCon
> presentation. Is that going to be a regular thing?

I would like it to be. It helps convey a sense of our active developer
community.

> What was
> insufficient about using implicit author info in the git commit log?

See the above. Those shortcomings are what led to the contributors coming
up to me afterwards.

> Alternatively, why would it help to have more free-form "credit"
> information specifically in the metadata file versus in README,
> CONTRIBUTORS, or AUTHORS files, which are already a common convention
> in open-source projects?

There are 3 possible advantages. (1) We know where to look for it.
(2) The Web UI can display it. (3) Contributors can know to think about
it up-front in terms of what ought to be displayed publicly, which could
be a bit different than what might go into one of those files.

> I'm not so much opposed to standardizing on a new metadata field
> (people can already add a 'credit' field right now if they want,
> there's no code changes to bro-pkg needed since it won't use it for
> anything), but if it's only optional, not sure it will be
> adopted/maintained widely and so not solve the problem as intended.

Can we make it non-optional by having it be part of the contribution
process, just like (I believe) the need to clarify licensing currently is?

    Vern

>> It's maybe something packages.zeek.org would show, but I don't see how
>> that would be better than looking at the "contributors" stats already
>> compiled by GitHub from author info encoded directly into git commits.
>
> The problem with those stats is (1) they're removed from the Web UI,

Could just link to it or else gather stats ourselves for display there
if it's important enough.

> (2) they're not in a coherent form.

That seems like it's up to the committer to get right -- if they don't
care enough to use coherent git user information, then that seems like
an indication that they don't care how others recognize their
contributions.

> The git commits don't necessarily identify the author like they
> would want for public recognition

But that's completely under their own control to change however they want.

> can be missing co-authors for instances

There's a common git convention for co-authors they should then use:

https://help.github.com/articles/creating-a-commit-with-multiple-authors/
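
As a rough sketch of what that convention makes possible: co-authors are recorded as "Co-authored-by: Name <email>" trailer lines at the end of a commit message, so they can be collected mechanically. The helper co_authors, the sample commit message, and the w00ter@example.com address below are hypothetical and only illustrate the idea.

```python
import re

# Matches one "Co-authored-by: Name <email>" trailer line.
CO_AUTHOR_RE = re.compile(
    r"^Co-authored-by:\s*(?P<name>.+?)\s*<(?P<email>[^<>]+)>\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def co_authors(commit_message):
    """Return (name, email) pairs for each co-author trailer found."""
    return [
        (match.group("name"), match.group("email"))
        for match in CO_AUTHOR_RE.finditer(commit_message)
    ]


# Hypothetical commit message; the second address is made up.
MESSAGE = """\
Add JSON support

Co-authored-by: A. Sacker <ace@sacker.com>
Co-authored-by: W00ter <w00ter@example.com>
"""

for name, email in co_authors(MESSAGE):
    print(f"{name} <{email}>")
```

This only recovers names and addresses as the committers wrote them; it says nothing about affiliation or how someone wants to be credited publicly.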

> can make it unclear whether a given git contributor merits public credit.

I think I get that point, but who gets chosen still seems arbitrary,
which is why I feel the focus should be on getting the git data
accurate since that can speak for itself. But I can see how it may be
important to have alternate credit mechanisms in cases where
historical git data is not correct and can't be changed.

>> Alternatively, why would it help to have more free-form "credit"
>> information specifically in the metadata file versus in README,
>> CONTRIBUTORS, or AUTHORS files, which are already a common convention
>> in open-source projects?
>
> There are 3 possible advantages. (1) We know where to look for it.
> (2) The Web UI can display it. (3) Contributors can know to think about
> it up-front in terms of what ought to be displayed publicly, which could
> be a bit different than what might go into one of those files.

(1) and (2) are just about choosing a standardized location and
documenting it. That can be anywhere, but I was more just pointing
out that while adding it to the package metadata does work, it's also
currently only hosting data that serves a functional purpose for the
command-line tool. Credit information would not be used in any
functional way by the command-line tool, so that's why I was
suggesting alternatives that would put this issue more in the "good
conventions/practices for open-source project management" camp rather
than anything specific to bro-pkg. (Thinking it's generally a good
idea to limit our involvement in how people choose to maintain their
own work).

> Can we make it non-optional by having it be part of the contribution
> process, just like (I believe) the need to clarify licensing currently is?

It could, but I'd say it's not great to add requirements to the
contribution process unless really needed. We're also not going to be
able to enforce whether people keep that field updated properly, and it
raises the question of what to do about existing packages that don't
promptly add a credits field to their metadata.

I'm still cool with documenting an optional "credits" field for the
package metadata, but just making sure, given all the caveats, that it
solves the problem sufficiently? I'd probably word the docs like:

"If you have particular requirements or concerns regarding how authors
or contributors for your package are credited in public listings, you
may explicitly provide the text that should be used to name or
describe such people in this field". And then also provide your
example.

- Jon

I went ahead and documented it [1] as I was updating other things
there. Feel free to PR any language tweaks.

- Jon

[1] https://docs.zeek.org/projects/package-manager/en/stable/package.html#credits-field