[PATCH v2 0/5] Update REST API: Add 'project patches as mbox' field

Lukas Bulwahn lukas.bulwahn at gmail.com
Tue Jul 2 15:38:42 AEST 2019


Hi Daniel, hi Andrew,

(I am mentoring Mete)

On Tue, Jul 2, 2019 at 6:33 AM Andrew Donnellan <ajd at linux.ibm.com> wrote:
>
> On 2/7/19 10:26 am, Daniel Axtens wrote:
> > So there are two possible complementary approaches I can think of:
> >
> >   - gather the data from a download of the mailing list that patchwork
> >     ingests. For LKML you can get this from
> >     https://www.kernel.org/lore.html, for example.
> >     You could then pass this through a local patchwork instance. (Let me
> >     know if you want my scripts for importing a public-archive git repo
> >     into patchwork.)
> >

In fact, Pasta can already take public inbox git repositories as
input, and for reproducibility of results, this is our preferred
input for analysis with pasta.
During the design discussion, I considered having Mete simply
extend patchwork to also take public inboxes as input.

We made this our second priority, because:
1. we expected it to be somewhat more complex and to shift our focus
away from the original goal of combining patchwork and pasta towards a
secondary goal of making public inbox and patchwork work nicely
together.
2. we expected that our first alpha users might not be mailing list
administrators, but users who just run on their personal inbox and
would be interested in trying out pasta's capabilities, but do not have
a public inbox set up for their personal inbox.

The public inbox to patchwork integration would certainly be nice and
helpful for using it on mailing lists that are already set up to be
archived with public inbox, and it would make it easier to keep the
state of pasta and patchwork consistent.

Please let us know where to find those scripts. We can then try them
out and see whether the setup would work for us in principle and which
extensions and changes we would need.

> >   - add a management command to export a project as an mbox and then
> >     coordinate with patchwork admins at the instance you're interested in
> >     to run the export at a time that suits them and provide you with a
> >     heavily compressed copy of the output.
> >
> If we were to add an API for this kind of bulk mbox export, I think it
> would need to export a fixed number of emails at a time (100 or 250 or
> something like that).
>
> For patchwork setups where the raw mailing list data isn't easily
> retrievable, a management command to export the whole project as mbox
> could be extended fairly easily to "export the past N days of the
> project as mbox". Then just put that in your crontab and have it export
> a new archive every so often, which you compress and then serve up
> statically.
>

It sounds reasonable to implement the functionality we want as a
management command rather than through the REST API. We will continue
to dig into that and let you know if we hit any unexpected issues.
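
To make the "export the past N days of the project as mbox" idea a bit
more concrete, here is a minimal sketch using only Python's standard
library. It is not patchwork code: the function names recent_messages
and export_mbox are hypothetical, and it assumes the messages are
already available as email.message.Message objects. A real management
command would have to fetch them from patchwork's database instead.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def recent_messages(messages, days, now=None):
    """Yield messages whose Date header lies within the past `days` days.

    Hypothetical helper: `messages` is any iterable of
    email.message.Message objects.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    for msg in messages:
        try:
            sent = parsedate_to_datetime(msg["Date"])
        except (TypeError, ValueError):
            continue  # skip messages with a missing or unparsable Date
        if sent.tzinfo is None:
            # Assumption: treat dates without a zone as UTC.
            sent = sent.replace(tzinfo=timezone.utc)
        if sent >= cutoff:
            yield msg


def export_mbox(messages, path):
    """Append the given messages to an mbox file and return the count."""
    box = mailbox.mbox(path)  # creates the file if it does not exist
    try:
        count = 0
        for msg in messages:
            box.add(msg)
            count += 1
        box.flush()
    finally:
        box.close()
    return count
```

A cron job could then call export_mbox(recent_messages(msgs, days=7),
"project.mbox") and compress the result before serving it statically.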


Best regards,

Lukas

