Structured feeds

Daniel Axtens dja at
Thu Nov 7 22:09:24 AEDT 2019

Sending on to the patchwork list for discussion. I think at least some
of this makes sense for Patchwork to support; I'll do a more detailed
analysis/breakdown later on.

Konstantin Ryabitsev <konstantin at> writes:

> On Thu, Nov 07, 2019 at 02:35:08AM +1100, Daniel Axtens wrote:
>>This is a non-trivial problem, fwiw. Patchwork's email parser clocks in
>>at almost thirteen hundred lines, and that's with the benefit of the
>>Python standard library. It also regularly gets patched to handle
>>changes to email systems (e.g. DMARC), changes to git (git request-pull
>>format changed subtly in 2.14.3), the bizarre ways people send email,
>>and so on.
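
To make the parsing burden concrete: even the subject line alone needs
care. Below is a minimal sketch, not Patchwork's actual parser, that uses
Python's standard `email` package to pull the series version and patch
index out of a subject such as "[RFC PATCH v2 3/7] ...". The function
name, the regular expression and the returned keys are all assumptions
made for illustration.

```python
import re
from email import message_from_string
from email.policy import default

# Hypothetical pattern: a bracketed prefix containing the word PATCH,
# e.g. "[PATCH]", "[RFC PATCH v2 3/7]".
PREFIX_RE = re.compile(r"\[([^\]]*\bPATCH\b[^\]]*)\]", re.IGNORECASE)


def parse_patch_subject(raw_message):
    """Return basic patch metadata, or None if no PATCH prefix is found."""
    msg = message_from_string(raw_message, policy=default)
    subject = str(msg["Subject"] or "")

    match = PREFIX_RE.search(subject)
    if match is None:
        return None

    info = {
        "message_id": str(msg["Message-ID"] or "").strip(),
        "version": 1,      # series without a "vN" token are version 1
        "index": None,     # position in the series, if "n/m" is present
        "total": None,
        "name": subject[match.end():].strip(),
    }

    # Inspect each whitespace-separated token inside the brackets.
    for token in match.group(1).split():
        version = re.fullmatch(r"v(\d+)", token, re.IGNORECASE)
        counter = re.fullmatch(r"(\d+)/(\d+)", token)
        if version:
            info["version"] = int(version.group(1))
        elif counter:
            info["index"] = int(counter.group(1))
            info["total"] = int(counter.group(2))
    return info
```

Replies ("Re: [PATCH] ..."), pull requests and patches with unusual
prefixes all need extra handling that this sketch skips, which hints at
where the line count comes from.
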
> I'm actually very interested in seeing patchwork switch from being fed 
> mail directly from postfix to using public-inbox repositories as its 
> source of patches. I know it's easy enough to accomplish as-is, by 
> piping things from public-inbox to, but it would be even 
> more awesome if patchwork learned to work with these repos natively.
>
> The way I see it:
> - site administrator configures upstream public-inbox feeds
> - a backend process clones these repositories
>    - if it doesn't find a refs/heads/json, then it does its own parsing 
>      to generate a structured feed with patches/series/trailers/pull 
>      requests, cross-referencing them by series as necessary. Something 
>      like a subset of this, excluding patchwork-specific data:
>    - if it does find an existing structured feed, it simply uses it (e.g.  
>      it was made available by another patchwork instance)
> - the same backend process updates the repositories from upstream using 
>    proper manifest files (e.g. see 
> - patchwork projects then consume one (or more) of these structured 
>    feeds to generate the actionable list of patches that maintainers can 
>    use, perhaps with optional filtering by specific headers (list-id, 
>    from, cc), patch paths, keywords, etc.
>
> Basically, is split into two, where one part does feed 
> cloning, pulling, and parsing into structured data (if not already 
> done), and another populates actual patchwork project with patches 
> matching requested parameters.
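
The second half of that split, populating a project from one or more
feeds, could look roughly like the sketch below. It assumes a structured
feed yields records with a message ID, subject, headers and touched
paths; `FeedEntry`, `ProjectFilter` and `select_for_project` are
hypothetical names, not existing Patchwork code, and the record layout is
a guess rather than a proposed format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class FeedEntry:
    """One parsed patch in a hypothetical structured feed."""
    message_id: str
    subject: str
    headers: dict                       # lower-cased header name -> value
    paths: list = field(default_factory=list)  # files touched by the patch


@dataclass
class ProjectFilter:
    """Per-project selection criteria; an empty criterion matches all."""
    list_ids: set = field(default_factory=set)
    path_globs: list = field(default_factory=list)
    keywords: list = field(default_factory=list)

    def matches(self, entry):
        if self.list_ids:
            list_id = entry.headers.get("list-id", "")
            if not any(lid in list_id for lid in self.list_ids):
                return False
        if self.path_globs:
            if not any(fnmatch(path, glob)
                       for path in entry.paths
                       for glob in self.path_globs):
                return False
        if self.keywords:
            subject = entry.subject.lower()
            if not any(word.lower() in subject for word in self.keywords):
                return False
        return True


def select_for_project(feeds, project_filter):
    """Merge several feeds, dropping cross-posted duplicates by Message-ID."""
    seen = set()
    selected = []
    for feed in feeds:
        for entry in feed:
            if entry.message_id in seen:
                continue
            seen.add(entry.message_id)
            if project_filter.matches(entry):
                selected.append(entry)
    return selected
```

Deduplicating on Message-ID is what would let a project draw on several
lists without showing a cross-posted patch twice.
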
> I see the following upsides to this:
> - we consume public-inbox feeds directly, no longer losing patches due 
>    to MTA problems, postfix burps, parse failures, etc
> - a project can have multiple sources for patches instead of being tied 
>    to a single mailing list
> - downstream patchwork instances (the "local patchwork" tool I mentioned 
>    earlier) can benefit from structured feeds provided by 
>
>>Patchwork does expose much of this as an API, for example for patches:
>> so if you want to
>>build on that feel free. We can possibly add data to the API if that
>>would be helpful. (Patches are always welcome too, if you don't want to
>>wait an indeterminate amount of time.)
>
> As I said previously, I may be able to fund development of various 
> features, but I want to make sure that I properly work with upstream.  
> That requires getting consensus on features to make sure that we don't 
> spend funds and efforts on a feature that gets rejected. :)
>
> Would the above feature (using one or more public-inbox repositories as 
> sources for a patchwork project) be a welcome addition to upstream?
> -K
