[rfc] Extending Patchwork as a GSoC project

Wed May 6 23:12:08 AEST 2020

Hi Rohit, hi all,

On 06/05/2020 05:13, Rohit Sarkar wrote:
> It would be great to hear about the views of the Patchwork community
> regarding this project. This would help us in better defining the work
> items and making informed architectural decisions regarding the
> interaction between PaStA and Patchwork.

Thanks for picking this up, and thanks for starting the discussion.

Daniel, just to keep you in sync: We (Lukas, Rohit and I) had a video
call yesterday, and we were able to identify three milestones for the
project:

1. Get PaStA and Patchwork in sync. Both need to work on the same data
   sources.
2. (Differentially) analyse new incoming data, such as new patches on
   lists, or new commits in the repo(s).
3. Update Patchwork relations by using the existing API.

But we need to sort out some technical/architectural details before
Rohit can start coding.

So let me start the discussion for 1.:

Do I understand correctly that the official Linux Patchwork instances
receive the ML data on their own? So they do not rely on, for example,
public inboxes, right?

PaStA supports both: mboxes and public inboxes. PaStA also understands
the X-Patchwork-ID header to uniquely identify mails. Public inboxes are
a great exchange format: we know exactly what was added since our last
pull. But we need some alternative strategy in case you don't support
them, and this might be tricky.
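
To illustrate the header-based identification, here is a minimal sketch
using only Python's standard email module. The function name
patchwork_id is hypothetical and this is not PaStA's actual code; it
only shows how the X-Patchwork-ID header could be read from a raw mail.

```python
from email import message_from_string


def patchwork_id(raw_mail):
    """Return the X-Patchwork-ID of a raw mail as an int, or None if absent.

    Hypothetical helper for illustration; header lookup in the email
    module is case-insensitive, so 'X-Patchwork-Id' matches as well.
    """
    message = message_from_string(raw_mail)
    value = message.get("X-Patchwork-ID")
    if value is None or not value.strip().isdigit():
        return None
    return int(value.strip())
```

Mails without the header simply yield None, so such a lookup can sit
next to another identifier (e.g., the Message-ID) for sources that did
not pass through Patchwork.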

Daniel, I have in mind that there is already some kind of infrastructure
in Patchwork for receiving raw patches... AFAIR, Mete implemented an
export routine that eases the initial import. Is there a way to reliably
"receive all new patches since my last pull"?
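
To make the question concrete, this is roughly what I have in mind on
the client side. It is only a sketch under assumptions: that the patch
list endpoint lives at /api/patches/, that it accepts "project" and
"since" query parameters, and that each returned patch carries an ISO
8601 "date" field. All of this needs to be checked against the API
documentation of the instance; the helper names are hypothetical.

```python
from urllib.parse import urlencode


def patches_since_url(base_url, project, since):
    """Build the URL for 'all patches of a project since a timestamp'.

    Assumption: the endpoint path and the 'project'/'since' query
    parameters exist in this form on the target instance.
    """
    query = urlencode({"project": project, "since": since})
    return f"{base_url.rstrip('/')}/api/patches/?{query}"


def next_watermark(patches, previous):
    """Return the newest 'date' seen so far, to be used for the next pull.

    Assumption: 'date' values are ISO 8601 strings in one uniform
    format, so plain string comparison orders them chronologically.
    """
    return max([patch["date"] for patch in patches] + [previous])
```

The open point is whether a date-based watermark like this is actually
reliable, or whether mails can show up later with an older date.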

Rohit, I guess the best thing you can do is to play with a local
Patchwork instance. Convert an existing public inbox back to an mbox and
split it in the middle. Then, feed the first half to Patchwork, and try
to retrieve all patches via the API. Then, feed the second half and try
to retrieve the rest of the patches. Compare the result of the API (e.g.,
all Patchwork-IDs) with the database entries of Patchwork to ensure that
we didn't miss a single mail.
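
The splitting and the final comparison could look roughly like this. It
is a minimal sketch based on Python's standard mailbox module; the
function names and file paths are hypothetical, and feeding the halves
to Patchwork as well as querying the API are left out.

```python
import mailbox


def split_mbox(src_path, first_path, second_path):
    """Split an mbox in the middle and return the sizes of both halves."""
    src = mailbox.mbox(src_path)
    first = mailbox.mbox(first_path)    # created if it does not exist
    second = mailbox.mbox(second_path)  # created if it does not exist
    try:
        messages = list(src)  # messages in the order they appear in the file
        middle = len(messages) // 2
        for index, message in enumerate(messages):
            target = first if index < middle else second
            target.add(message)
        first.flush()
        second.flush()
        return middle, len(messages) - middle
    finally:
        src.close()
        first.close()
        second.close()


def missing_ids(expected_ids, received_ids):
    """Return the IDs that are expected (database) but were not received (API)."""
    return sorted(set(expected_ids) - set(received_ids))
```

An empty result from missing_ids after both rounds would mean that no
mail got lost between the two pulls.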

Thanks
  Ralf