Proposing changes to the OpenBMC tree (to make upstreaming easier)

Wed Apr 6 12:19:14 AEST 2022

Hi Ed,

I think what's below largely points to a bit of an identity crisis for
the project, on a couple of fronts. Fundamentally OpenBMC is a distro
(or as Yocto likes to point out, a meta-distro), and we can:

1. Identify as a traditional OSS distro: An integration of otherwise
   independent applications

2. Identify as an appliance distro: The distro and the
   applications are a monolith

You're proposing 2, while I think there exists some tension towards 1.

With the amount of custom userspace, we've always kinda sat in between.
I'd like to see libraries and applications that have use cases outside
of OpenBMC be accessible to people with those external use cases,
without being burdened by understanding the rest of the OpenBMC context.
I'm concerned that integrating things in the way you're proposing will
lead to more inertia there (e.g. for libmctp and libpldm, which
implement the MCTP and PLDM standards).

On Tue, 5 Apr 2022, at 03:58, Ed Tanous wrote:
> The OpenBMC development process as it stands is difficult for people
> new to the project to understand, which severely limits our ability to
> onboard new maintainers, developers, and groups which would otherwise
> contribute major features to upstream, but don't have the technical
> expertise to do so.  This initiative, much like others before it[1] is
> attempting to reduce the toil and OpenBMC-specific processes of
> passing changes amongst the community, and move things to being more
> like other projects that have largely solved this problem already.

Can you be more specific about which projects you mean here? Do you
have links to examples?

>
> To that end, I'd like to propose a change to the way we structure our
> repositories within the project: specifically, putting (almost) all of
> the Linux Foundation OpenBMC owned code into a single repo that we can
> version as a single entity, rather than spreading out amongst many
> repos.  In practice, this would have some significant advantages:
>
> - The tree would be easily shareable amongst the various people
> working on OpenBMC, without having to rely on a single-source Gerrit
> instance.  Git is designed to be distributed, but if our recipe files
> point at other repositories, it largely defeats a lot of this
> capability.  Today, if you want to share a tree that has a change in
> it, you have to fork the main tree, then fork every single subproject
> you've made modifications to, then update the main tree to point to
> your forks. 

This isn't true, as you can add patches in the OpenBMC tree.

CI prevents these from being submitted, as it should, but there's nothing to
stop anyone from using `devtool modify ...` / `devtool finish ...` and
committing the result as a workflow to exchange state (I do this).

Is the issue instead with devtool? Is it bad? Is the learning curve too steep?
It is at least the Yocto workflow.

> This gets very onerous over time, especially for simple
> commits.  Having maintained several different companies' forks
> personally, and spoken to many others having problems with the same,
> adding major features is difficult to test and rebase because of
> this.  Moving the code to a single tree makes a lot of the toil of
> tagging and modifying local trees a lot more manageable, as a series
> of well-documented git commands (generally git rebase[2]).  It also
> increases the likelihood that someone pulls down the fork to test it
> if it's highly likely that they can apply it to their own tree in a
> single command.

Again, this is moot if the patches are applied in-tree.

>
> - There would be a reduction in reviews.  Today, anytime a person
> wants to make a change that would involve any part of the tree,
> there's at least 2 code reviews, one for the commit, and one for the
> recipe bump.  Compared to a single tree, this at least doubles the
> number of reviews we need to process.

Is there more work? Yes.

Is it always double? No. Is it sometimes double? Yes.

Often bumps batch multiple application commits. I think this paragraph 
overstates the problem somewhat, but what it does get right is 
identifying that *some* overhead exists.
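
To put rough numbers on that, here's a back-of-the-envelope sketch. It
assumes one review per application commit plus one per recipe bump in
the current layout, and one review per commit in a single tree. The
function names and the example counts are made up for illustration;
they are not measured data.

```python
def reviews_multi_repo(app_commits: int, bumps: int) -> int:
    """Assumed model: one review per application commit, one per recipe bump."""
    return app_commits + bumps


def reviews_single_tree(app_commits: int) -> int:
    """Assumed model: no recipe bumps, so one review per commit."""
    return app_commits


def overhead_ratio(app_commits: int, bumps: int) -> float:
    """How many times more reviews the multi-repo layout needs."""
    return reviews_multi_repo(app_commits, bumps) / reviews_single_tree(app_commits)


if __name__ == "__main__":
    # Worst case: every commit gets its own bump, so reviews double.
    print(overhead_ratio(app_commits=6, bumps=6))  # 2.0
    # Batched case: the same six commits picked up by two bumps.
    print(overhead_ratio(app_commits=6, bumps=2))
```

Under that model the ratio only reaches 2 when nothing is batched, and
it falls towards 1 as more commits share a bump.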

>  For changes that want to make
> any change to a few subsystems, as is the case when developing a
> feature, they require 2 X <number of project changes> reviews, all of
> which need to be synchronized.

Same issue as above here: bumps can batch several changes, so it isn't
always 2 X.

> There is a well documented problem
> where we have no official way to synchronize merging of changes to
> userspace applications within a bump without manual human
> intervention.  This would largely render that problem moot.

Right, this can be hard to handle.

It can be mitigated by versioning interfaces, which the D-Bus spec
calls out[6][7] but which OpenBMC (I think?) fails to do, and by
supporting multiple interface versions for the transition period.

That said, that's also more work, and so needs to be considered in the 
set of trade-offs.

[6] https://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-names-interface
[7] https://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-names-bus
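
As a rough sketch of what the transition period could look like from a
client's point of view: the major version is a numeric suffix on the
interface name, the server exports old and new versions side by side,
and each client picks the newest one it understands. The interface
names and the pick_interface() helper are hypothetical, not existing
OpenBMC interfaces or APIs.

```python
from typing import Iterable, Optional, Set

# Hypothetical base name; real interface names would differ.
BASE = "xyz.openbmc_project.Example"


def pick_interface(base: str, exported: Iterable[str],
                   supported_majors: Set[int]) -> Optional[str]:
    """Return the newest exported version of `base` the client supports.

    Versioned names are assumed to be `base` followed by a decimal major
    version, e.g. "xyz.openbmc_project.Example2". Returns None when no
    exported version is supported.
    """
    best_major = None
    best_name = None
    for name in exported:
        if not name.startswith(base):
            continue
        suffix = name[len(base):]
        # Skip unversioned names and unrelated names sharing the prefix.
        if not (suffix.isascii() and suffix.isdigit()):
            continue
        major = int(suffix)
        if major in supported_majors and (best_major is None or major > best_major):
            best_major, best_name = major, name
    return best_name


if __name__ == "__main__":
    # During the transition the object exports both major versions.
    during = [BASE + "1", BASE + "2", "org.freedesktop.DBus.Properties"]
    print(pick_interface(BASE, during, {1}))     # old client
    print(pick_interface(BASE, during, {1, 2}))  # updated client
    # After the transition only version 2 remains.
    print(pick_interface(BASE, [BASE + "2"], {1}))
```

The point is that old clients keep working until the old version is
dropped, so the application change and its consumers don't have to land
in the same bump.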

>
> - It would allow most developers to not need to understand Yocto at
> all to do their day to day work on existing applications.  No more
> "devtool modify", and related SRCREV bumps.  This will help most of
> the new developers on the project with a lower mental load, which will
> mean people are able to ramp up faster.

Okay. So devtool is seen as an issue.

Can we improve its visibility and any education around it? Or is it a 
lost cause? If so, why?

Separately, I'm concerned this is an attempt to shield people from
skills that help them work with upstream Yocto. OpenBMC feels like it's
a bit of an on-ramp for open-source contributions for people who have
worked in what was previously quite a proprietary environment. We tried
shielding people in the past wrt kernel contributions, and that failed
pretty spectacularly. We (at least Joel and I) now encourage people to
work with upstream directly *and support them in the process of doing
that* rather than trying to mitigate some of the difficulties with
working upstream by avoiding them.

>
> - It would give an opportunity for individuals and companies to "own"
> well-supported public forks (i.e. Red Hat) of the codebase, which would
> increase participation in the project overall.  This already happens
> quite a bit, but in practice, the forks that do it squash history,
> making it nearly impossible to get their changes upstreamed from an
> outside entity.

Not sure this is something we want to encourage, even if it happens in 
practice.

>
> - It would centralize the bug databases.  Today, bugs filed against
> sub projects tend to not get answered. 

Do you have some numbers handy?

> Having all the bugs in
> openbmc/openbmc would help in the future to avoid duplicating bugs
> across projects.

Has this actually been a problem?

>
> - It would increase the likelihood that someone contributes a patch,
> especially a patch written by someone else.  If contributing a patch
> was just a matter of cherry-picking a tree of commits and submitting
> it to gerrit, it's a lot more likely that people would do it.

It sounds plausible, but again, some evidence for this would be helpful.

Why is this easier than submitting the patches to the application repo?

> My proposed version of this tree is pushed to a github fork here, and
> is based on the tree from a few weeks ago:
> https://github.com/edtanous/openbmc
>
> It implements all the above for the main branch.  This tree is based
> on the output of the automated tooling, and in the case where this
> proposal is accepted, the tooling would be re-run to capture the state
> of the tree at the point where we chose to make this change.
>
> The tool I wrote to generate this tree is also published, if you're
> interested in how this tree was built, and is quite interesting in its
> use of git export/import [5], but functionally, I would not expect
> that tooling to survive after this transition is made.

I think it would be good to capture the script in openbmc-tools if we 
choose to go ahead with this, mainly as a record of how we achieved it.

Andrew

>
> [1] https://lore.kernel.org/openbmc/CACWQX821ADQCrekLj_bGAu=1SSLCv5pTee7jaoVo2Zs6havgnA@mail.gmail.com/
> [2] https://git-scm.com/docs/git-rebase
> [3] https://github.com/openbmc/docs/blob/master/CONTRIBUTING.md#inclusive-naming
> [4] https://www.yoctoproject.org/docs/1.8/ref-manual/ref-manual.html#ref-classes-externalsrc
> [5] https://github.com/edtanous/obmc-repo-combine/blob/main/combine