[PATCH] Include all email headers in mboxes

Veronika Kabatova vkabatov at redhat.com
Fri Apr 6 00:58:35 AEST 2018


----- Original Message -----
> From: "Daniel Axtens" <dja at axtens.net>
> To: "Johannes Berg" <johannes at sipsolutions.net>, vkabatov at redhat.com, patchwork at lists.ozlabs.org
> Sent: Thursday, April 5, 2018 3:47:03 PM
> Subject: Re: [PATCH] Include all email headers in mboxes
> 
> Johannes Berg <johannes at sipsolutions.net> writes:
> 
> > On Thu, 2018-04-05 at 19:58 +1000, Daniel Axtens wrote:
> >> vkabatov at redhat.com writes:
> >> 
> >> > From: Veronika Kabatova <vkabatov at redhat.com>
> >> > 
> >> > Solves issue #165 (Exported mboxes should include In-Reply-To,
> >> > References, etc headers). Instead of including only a few chosen ones,
> >> > all received headers are added to mboxes.
> >> 
> >> Thanks for the patch.
> >> 
> >> I'm a little worried that this will get really messy - I've included a
> >> snippet of headers from an email from an unrelated bug below. Maybe we
> >> don't care - I guess this isn't really for human consumption but is for
> >> consumption by e.g. git-am.
> >> 
> >> Alternatively we can blacklist headers: I don't think there's anything
> >> worth having in Received, X-*, List-*, DKIM, ARC, SPF, etc. But I wonder
> >> if this is just whack-a-mole in reverse.
> >> 
> >> Thoughts?
> >
> > I'm not really sure human consumption would be a worry for mbox files?
> > Having the full headers - since it's sort of an email archive already -
> > would be useful though.
> >
> > In particular, the change to keep the original Subject would be useful
> > for us, as we have a script that automatically replies to the email
> > saying "thank you, I've applied your patch" or similar.
> 
> Hmm, I think the hasher.py and patchwork-update-commits scripts were
> designed to facilitate this sort of thing in a slightly more robust
> way. But I've never really felt like I understood it, and I don't recall
> any docs.
> 
> I hope that the change to the subject mangling doesn't break anything
> else, but I can't imagine it would. [0]
> 
> > That said, using a dict for this is in general not quite right, since
> > many header lines are valid multiple times, e.g. "Received:", but I'm
> > not sure they're even all stored.
> 
> Right, you and Veronika have convinced me. Clearly as a developer of
> patchwork my view is a bit skewed - I have to look at them fairly
> frequently when people report parsing bugs so I forget that other people
> don't do this.
> 
> Veronika - can you check on the repeated headers issue that Johannes raises?
> 

Right, this won't work well with duplicate header keys. We do store all
headers, including ones with duplicate keys, so I can easily change that and
post an updated patch shortly.
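
To illustrate the duplicate-key problem, here is a minimal sketch that uses
only Python's standard email module. It is not the patchwork code; the sample
message and the variable names are made up for illustration. It shows that a
dict keeps a single value per header name, while a list of (name, value)
pairs keeps every occurrence:

```python
from email import message_from_string
from email.message import Message

# Hypothetical sample message with a repeated "Received" header.
raw = (
    "Received: from a.example.org\n"
    "Received: from b.example.org\n"
    "Subject: [PATCH] test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

# A dict holds one value per key, so one of the Received lines is dropped.
as_dict = dict(msg.items())

# items() returns (name, value) pairs and keeps every header, in order.
as_pairs = msg.items()

# Rebuilding a message from the pairs keeps both Received headers, because
# Message.__setitem__ appends a header instead of replacing an existing one.
rebuilt = Message()
for name, value in as_pairs:
    rebuilt[name] = value

print(len(as_dict), len(as_pairs))
print(rebuilt.get_all("Received"))
```

So building the mbox headers from a list of pairs rather than from a dict
should be enough to keep repeated headers such as Received.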

However, since this was brought up, the same situation is present in
the API -- only one of the duplicate headers is shown. Based on my testing,
the code on our side does return the correct values, so this is probably a
problem with the REST framework or something similar. I might take a look at
it later but can't promise I'll be able to fix it easily.


Thanks,
Veronika


> Thanks all.
> 
> Regards,
> Daniel
> 
> [0] Obligatory xkcd - https://xkcd.com/1172/
> >
> > johannes
> 

