Proposing changes to the OpenBMC tree (to make upstreaming easier)

Thu May 26 01:02:26 AEST 2022

On Wed, May 25, 2022 at 6:32 AM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
>
>
> On 2022/5/24 12:27 AM, Ed Tanous wrote:
> > On Tue, Apr 12, 2022 at 12:23 AM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
> >> I like the idea, for we don't utilize additional tools like repo to
> >> maintain the code, and it should make it easier for us to maintain
> >> multiple internal branches.
> >>
> > Hi Heyi,
> > Glad to see you on the project.  Do you think you could elaborate a
> > little about how you're hoping to use OpenBMC and its review process,
> > and if any of the changes being proposed here would help you?
>
> Hi Ed,
>
> The background is that our team uses basic git commands to manage the
> openbmc repositories, so the current multi-repository structure costs
> extra effort for our code maintenance, including:
>
> 1. A single change normally requires two commits, one for the
> component repo and one for openbmc, because our internal releases are
> more frequent and fixes must be merged ASAP. We also created a script
> to check whether openbmc includes the latest commits of all component
> repos.
>
> 2. Stable branches are not easy to maintain, because they require
> branches for both openbmc and the integrated components.
>
> 3. It is not easy to search code across all the component repos. I'd
> like to use "git grep" to search for a keyword in a single repo, but
> that doesn't work here. It is also not easy to make a generic fix for
> all repos, as you said.
>
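For the check described in item 1, a minimal sketch of such a script might look like the following. The script mentioned above is not shown in this thread, so this is only an illustration: it assumes each recipe pins its component with a literal SRCREV = "<sha1>" line, and the function names and the latest_heads mapping are hypothetical.

```python
import re

# Assumption: recipes pin a component with a line such as
#   SRCREV = "<40 hex characters>"
SRCREV_RE = re.compile(r'^\s*SRCREV\s*=\s*"([0-9a-f]{40})"', re.MULTILINE)


def pinned_srcrev(recipe_text):
    """Return the 40-character hash a recipe pins, or None if absent."""
    match = SRCREV_RE.search(recipe_text)
    return match.group(1) if match else None


def stale_components(recipes, latest_heads):
    """List components whose recipe does not pin the latest commit.

    recipes:      {component name: recipe file contents}
    latest_heads: {component name: newest commit hash of that repo},
                  e.g. collected with `git rev-parse HEAD` in each clone
    """
    stale = []
    for name, text in sorted(recipes.items()):
        if pinned_srcrev(text) != latest_heads.get(name):
            stale.append(name)
    return stale
```

With a single tree there is no pin to compare, so a check like this would no longer be needed.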
> I think a monorepo will help to improve the situation, and it may help
> prevent the division of the community.
>
> The code review process is not difficult for us, because reviewers are
> chosen automatically by Gerrit.
>
> If you have a better practice for the current multi-repo structure,
> please advise and help us improve :)
>
> Thanks,
>
> Heyi

Thank you for your feedback.

>
> >
> >> Thanks,
> >>
> >> Heyi
> >>
> >> On 2022/4/5 2:28 AM, Ed Tanous wrote:
> >>> The OpenBMC development process as it stands is difficult for people
> >>> new to the project to understand, which severely limits our ability to
> >>> onboard new maintainers, developers, and groups which would otherwise
> >>> contribute major features to upstream, but don't have the technical
> >>> expertise to do so.  This initiative, much like others before it[1], is
> >>> attempting to reduce the toil and OpenBMC-specific processes of
> >>> passing changes amongst the community, and move things to being more
> >>> like other projects that have largely solved this problem already.
> >>>
> >>> To that end, I'd like to propose a change to the way we structure our
> >>> repositories within the project: specifically, putting (almost) all of
> >>> the Linux Foundation OpenBMC owned code into a single repo that we can
> >>> version as a single entity, rather than spreading it out amongst many
> >>> repos.  In practice, this would have some significant advantages:
> >>>
> >>> - The tree would be easily shareable amongst the various people
> >>> working on OpenBMC, without having to rely on a single-source Gerrit
> >>> instance.  Git is designed to be distributed, but if our recipe files
> >>> point at other repositories, it largely defeats a lot of this
> >>> capability.  Today, if you want to share a tree that has a change in
> >>> it, you have to fork the main tree, then fork every single subproject
> >>> you've made modifications to, then update the main tree to point to
> >>> your forks.  This gets very onerous over time, especially for simple
> >>> commits.  Having maintained several different companies' forks
> >>> personally, and having spoken to many others with the same problem,
> >>> I have found that major features are difficult to test and rebase
> >>> because of this.  Moving the code to a single tree makes the toil
> >>> of tagging and modifying local trees much more manageable, reducing
> >>> it to a series of well-documented git commands (generally git
> >>> rebase[2]).  It also
> >>> increases the likelihood that someone pulls down the fork to test it
> >>> if it's highly likely that they can apply it to their own tree in a
> >>> single command.
> >>>
> >>> - There would be a reduction in reviews.  Today, any time a person
> >>> wants to make a change that involves any part of the tree, there
> >>> are at least two code reviews: one for the commit, and one for the
> >>> recipe bump.  Compared to a single tree, this at least doubles the
> >>> number of reviews we need to process.  Changes that touch a few
> >>> subsystems, as is the case when developing a feature, require
> >>> 2 x <number of project changes> reviews, all of which need to be
> >>> synchronized.  There is a well-documented problem
> >>> where we have no official way to synchronize merging of changes to
> >>> userspace applications within a bump without manual human
> >>> intervention.  This would largely render that problem moot.
> >>>
> >>> - It would allow most developers to do their day-to-day work on
> >>> existing applications without needing to understand Yocto at all.
> >>> No more "devtool modify" and related SRCREV bumps.  This would
> >>> lower the mental load for most new developers on the project, which
> >>> means people would be able to ramp up faster.
> >>>
> >>> - It would give an opportunity for individuals and companies to "own"
> >>> well-supported public forks (e.g. Red Hat) of the codebase, which would
> >>> increase participation in the project overall.  This already happens
> >>> quite a bit, but in practice, the forks that do it squash history,
> >>> making it nearly impossible to get their changes upstreamed from an
> >>> outside entity.
> >>>
> >>> - It would centralize the bug databases.  Today, bugs filed against
> >>> sub projects tend to not get answered.  Having all the bugs in
> >>> openbmc/openbmc would help in the future to avoid duplicating bugs
> >>> across projects.
> >>>
> >>> - It would increase the likelihood that someone contributes a patch,
> >>> especially a patch written by someone else.  If contributing a patch
> >>> were just a matter of cherry-picking a tree of commits and submitting
> >>> it to Gerrit, it's a lot more likely that people would do it.
> >>>
> >>> - It would greatly increase the ease with which stats are collected.
> >>> Questions like: How many patches were submitted last year?  How many
> >>> lines of code changed between commit A and commit B?  Where was this
> >>> regression injected (i.e. git bisect)?  How much of our codebase is
> >>> C++?  How many users of the dbus Sensor.Value interface are there?
> >>> All are easily answered with one-line git commands once this change
> >>> is done.
> >>>
> >>> - New features would no longer require single-point-of-contact core
> >>> maintainer processes (e.g., creating a repo for changes, setting up
> >>> maintainer groups, etc.) and could just be submitted as a series of
> >>> patches to openbmc/openbmc.
> >>>
> >>> - Tree-wide changes (C++ standard, Yocto updates, formatting, etc.)
> >>> would be much easier to accomplish in a small number of patches, or
> >>> a series of patches that is easy to pull and test as a unit.
> >>>
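As one illustration of the stats point above, the "how much of our codebase is C++?" question could be answered from a single file listing once everything lives in one tree. This is a sketch only: the suffix set and the function name are assumptions, and it counts files rather than lines.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes are treated as C++ for the purposes of the count.
CPP_SUFFIXES = {".cpp", ".hpp", ".cc", ".hh", ".cxx", ".hxx"}


def cpp_file_share(paths):
    """Return the fraction of tracked files that look like C++ sources.

    paths: iterable of repository-relative paths, for example the lines
           printed by `git ls-files` in the combined tree.
    """
    paths = list(paths)
    if not paths:
        return 0.0
    cpp = sum(
        1 for p in paths if PurePosixPath(p).suffix.lower() in CPP_SUFFIXES
    )
    return cpp / len(paths)
```

Today the same question means running the listing once per component repo and adding up the results by hand.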
> >>> In terms of concretely how we would accomplish this, I've put together
> >>> what such a tree would look like, and I'm looking for input on how it
> >>> could be improved.  Some key points on what it represents:
> >>>
> >>> - All history for both openbmc and sub projects will be retained.
> >>> Commits are interleaved based on the date on which they were submitted
> >>> using custom tooling that was built on top of git fast-export and
> >>> fast-import.  All previously available tags will have similar tags in
> >>> the new repository pointing at their equivalent commits in the new
> >>> repository.
> >>>
> >>> - Inclusive guidelines: To make progress toward an unrelated but
> >>> important goal at the same time, I'm recommending that the
> >>> openbmc/master branch be left as-is, and that the newly-created sha1
> >>> be pushed to the branch openbmc/openbmc:main, to retain people's
> >>> links to previous commits on master, and retain the exact project
> >>> history while at the same time moving the project to having more
> >>> inclusive naming, as has been documented previously[3].  At some point
> >>> in the future the master branch could be renamed and deprecated, but
> >>> this is considered out of scope for this specific change.
> >>>
> >>> - Each individual sub-project will be given a folder within
> >>> openbmc/openbmc based on their current repository name.  While there
> >>> is an opportunity to reorganize in more specific ways (e.g., put all
> >>> ipmi-oem handler repos in a folder) this proposal intentionally
> >>> doesn't, under the proposition that once this change is made, any sort
> >>> of folder rearranging will be much easier to accomplish, and to keep
> >>> the scope limited.
> >>>
> >>> - Yocto recipes will be changed to point to their path equivalent, and
> >>> inherit the externalsrc bbclass[4].  This workflow is exactly the workflow
> >>> devtool uses to point to local repositories during a "devtool modify",
> >>> so it's unlikely we will have incremental build-consistency issues
> >>> with this approach, as was a concern in the past.
> >>>
> >>> - Places where we've forked other well-supported projects (u-boot,
> >>> kernel, etc.) will continue to point to the openbmc/<projectname> fork.
> >>> This is done to ensure that we don't inflict the same problem we're
> >>> attempting to solve in OpenBMC upon those working in the subproject
> >>> forks, and to reinforce to contributors that patches to these projects
> >>> should preferably be submitted first to the relevant upstream.
> >>>
> >>> - Subprojects that are intended to be reused outside of OpenBMC (e.g.
> >>> sdbusplus) will retain their previous commits, history, and trees, such
> >>> that they are usable outside the project.  This is intended to make
> >>> sure that the code that should be reusable by others remains so.
> >>>
> >>> - The above intentionally makes no changes to our subtree update
> >>> process, which would remain the same as it is today.  The
> >>> openbmc-specific autobump job in Jenkins would be disabled, since
> >>> it's no longer required in this approach.
> >>>
> >>> - Most Gerrit patches would now be submitted to openbmc/openbmc.
> >>>
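To make the recipe change in the list above more concrete, here is a rough sketch of rewriting a recipe so that it builds from an in-tree folder. "inherit externalsrc" and the EXTERNALSRC variable come from the Yocto class referenced in [4]; the function name, the decision to drop the remote pin lines, and the example path are assumptions of this sketch, not necessarily what the proposed tree does.

```python
def to_externalsrc(recipe_text, src_path):
    """Rewrite recipe text to build from a local folder via externalsrc.

    Drops single-line SRC_URI/SRCREV assignments (multi-line values with
    backslash continuations are not handled) and appends the two lines
    the externalsrc class needs. src_path is a hypothetical folder in
    the combined tree.
    """
    kept = [
        line
        for line in recipe_text.splitlines()
        if not line.lstrip().startswith(("SRC_URI", "SRCREV"))
    ]
    kept.append("inherit externalsrc")
    kept.append(f'EXTERNALSRC = "{src_path}"')
    return "\n".join(kept) + "\n"
```

For example, to_externalsrc(text, "/path/to/openbmc/bmcweb") would leave the rest of the recipe untouched and point the build at that folder, which is the same effect "devtool modify" has today.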
> >>> My proposed version of this tree is pushed to a github fork here, and
> >>> is based on the tree from a few weeks ago:
> >>> https://github.com/edtanous/openbmc
> >>>
> >>> It implements all the above for the main branch.  This tree is based
> >>> on the output of the automated tooling, and if this proposal is
> >>> accepted, the tooling would be re-run to capture the state of the
> >>> tree at the point where we choose to make this change.
> >>>
> >>> The tool I wrote to generate this tree is also published, if you're
> >>> interested in how this tree was built, and is quite interesting in its
> >>> use of git fast-export/fast-import[5], but functionally, I would not expect
> >>> that tooling to survive after this transition is made.
> >>>
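The interleaving step can be pictured as a merge of per-repository commit lists by date. The sketch below is not the code from [5]: it uses hypothetical (timestamp, subject) tuples and heapq.merge purely to illustrate the ordering, and ignores everything git fast-export and fast-import actually handle (trees, parents, tags).

```python
import heapq


def interleave_by_date(histories):
    """Merge per-repo commit lists into one list ordered by commit date.

    histories: {repo name: [(unix_timestamp, subject), ...]} where each
               list is assumed to be oldest-first already.
    Returns [(unix_timestamp, repo, subject), ...] oldest-first.
    """
    streams = (
        [(ts, repo, subject) for ts, subject in commits]
        for repo, commits in histories.items()
    )
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*streams))
```

Commits with identical timestamps fall back to ordering by repository name here, which is an arbitrary choice made by this sketch.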
> >>> Let me know what you think.
> >>>
> >>> -Ed
> >>>
> >>> [1] https://lore.kernel.org/openbmc/CACWQX821ADQCrekLj_bGAu=1SSLCv5pTee7jaoVo2Zs6havgnA@mail.gmail.com/
> >>> [2] https://git-scm.com/docs/git-rebase
> >>> [3] https://github.com/openbmc/docs/blob/master/CONTRIBUTING.md#inclusive-naming
> >>> [4] https://www.yoctoproject.org/docs/1.8/ref-manual/ref-manual.html#ref-classes-externalsrc
> >>> [5] https://github.com/edtanous/obmc-repo-combine/blob/main/combine