[PATCH 08/10] parse(mail|archive): handle early fail within email module

Stephen Finucane stephen at that.guru
Thu Jun 29 06:02:04 AEST 2017


On Thu, 2017-06-29 at 00:06 +1000, Daniel Axtens wrote:
> Andrew Donnellan <andrew.donnellan at au1.ibm.com> writes:
> 
> > On 28/06/17 17:48, Daniel Axtens wrote:
> > > Certain really messed up email messages can cause a failure within
> > > the email module (at least on py3). Catch this.
> > > 
> > > Signed-off-by: Daniel Axtens <dja at axtens.net>
> > > ---
> > >  patchwork/management/commands/parsearchive.py |  9 ++++++++
> > >  patchwork/management/commands/parsemail.py    | 31 ++++++++++++++++-----
> > > ------
> > >  2 files changed, 27 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/patchwork/management/commands/parsearchive.py
> > > b/patchwork/management/commands/parsearchive.py
> > > index a3c8360186c8..3aab58a45bcd 100644
> > > --- a/patchwork/management/commands/parsearchive.py
> > > +++ b/patchwork/management/commands/parsearchive.py
> > > @@ -77,6 +77,15 @@ class Command(BaseCommand):
> > > 
> > >          count = len(mbox)
> > > 
> > > +        # detect broken mails in the mbox
> > > +        # see earlyfail fuzz test on py3
> > > +        try:
> > > +            for m in mbox:
> > > +                pass
> > > +        except AttributeError:
> > > +            logger.warning('Broken mbox/Maildir, aborting')
> > > +            return
> > > +
> > 
> > This seems a bit non-obvious and could do with a little bit of explanation?
> 
> The message, or the code structure? I structured the code this way
> rather than the more obvious
>  try:
>    mbox = [m for m in mbox]
>  ...
> because the more obvious way requires loading the entire mbox/maildir
> into memory and I was a bit worried about the memory consumption of that
> when parsing a large mbox.
> 
> I agree a more helpful comment would have been in order. Stephen, do you
> want a v2 of this patch by itself? I can resend the series but it seems
> a bit excessive... Or I could do a follow-up.

Yeah, let's just do a follow-up. I'm going to go ahead and merge the rest of
these.

Stephen


More information about the Patchwork mailing list