[PATCH v2 0/9] Tools and fixes for parallel parsing
Daniel Axtens
dja at axtens.net
Mon Feb 26 16:42:27 AEDT 2018
Stephen Finucane <stephen at that.guru> writes:
> On Sun, 2018-02-25 at 01:50 +1100, Daniel Axtens wrote:
>> Thomas Petazzoni reported that Patchwork would occasionally lose
>> Buildroot email. Andrew - having talked to jk and sfr - suggested that
>> this may be race-condition related.
>>
>> I investigated and found some bugs. I first had to develop some tools.
>> Along the way I found other unrelated bugs too.
>>
>> Patches 1-4 are tooling - ways to do parallel parsing of messages and
>> get and compare the output. (Patch 1 fixes an issue I found when
>> running the tool from patch 2)
>>
>> Patch 5 is an unrelated fix that came up along the way and
>> demonstrates that humans remain the best fuzzers, and that Python's
>> email module is still adorably* quirky.
>>
>> Patch 6 is a bug that came up very quickly in testing but is unlikely
>> to be the actual bug Buildroot is hitting, as it can only occur the
>> first time an email address is seen.
>>
>> Patch 7 is a related tidy-up/optimisation.
>>
>> Patch 8 fixes up a MySQL-only bug, but also adds some robustness.
>>
>> I think patch 9 closes the most likely issue for Buildroot patches.
>>
>> Pending review, patches 5, 6, 8 and 9 should go to stable.
>>
>> V2: Address review comments from Andrew, see patches.
>> Added the new python script to the automatic pep8 testing.
>> Minor improvement to parallel parse script to allow different
>> pythons to be used.
>
> Well, this is all rather unfortunate :( When I was developing this, I
> was testing using 'parsearchive' with a single archive, which naturally
> meant everything was running in series. It seems the use of an MTA
> means the 'parsemail' script, which 'parsearchive' harnesses, gets
> called multiple times here and highlights all these races.
Likewise, and yes, I think that's how we get here.
> I've reviewed most of the patches (bar one, which I'm still working on)
> at this point, and think most of these should be merged. However, I've
> noted that some of the patches leave FIXMEs in place. I'm pretty sure
> all of these occur where we apply a "lookup now and save later _if_
> these preconditions are met" pattern, which we do for things like
> series references, authors, etc. This isn't going to be easy to fix,
> IMO.
>
> That brings me to the real question here: what do we need to do to
> ultimately fix this? Far as I can see, there are two options:
>
> * Save stuff unconditionally. If we're able to find an email, save
> that even if it's not a patch/cover letter/comment. If we find an
> author that we haven't seen before, save that too even if they
> haven't touched a patch, cover letter or comment. This greatly
> simplifies things, but would result in a more congested database and
> API, and would require some background work (saving non-patch/cover
> letter emails).
I don't think we need to do this - and I'm not sure the problem is the
lookup-now-save-later pattern itself; I think it's just our
implementation of it.
The big thing is series references, and I think that (apart from
unthreaded series and possibly deeply threaded series?) we can solve
that pretty easily by holding the 'head email' (patch 0 or patch 1) in
the SeriesReference. But I will think about it more; I don't have
certain answers just yet.
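To sketch the idea - and this is only a rough, Django-flavoured sketch
in a hypothetical models.py, with invented field names, not the current
Patchwork schema or a finished design:

    from django.db import models

    class SeriesReference(models.Model):
        # Message-ID of this mail.
        msgid = models.CharField(max_length=255)
        # Message-ID of the series' head email (patch 0/n or 1/n).
        head_msgid = models.CharField(max_length=255)

        class Meta:
            unique_together = [('msgid', 'head_msgid')]

    # Parallel parsers could then race on something like
    #     SeriesReference.objects.get_or_create(msgid=msgid,
    #                                           head_msgid=head_msgid)
    # and the database's uniqueness constraint picks a single winner,
    # rather than two workers silently creating divergent references.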
> * Use locks. I recall a patch from Damien Lespiau some years ago
> around making this a parallel process [1]. Couldn't we just do this
> on the critical sections of code (anything that involves DB
> accesses, I guess)? This would require minimal changes but could
> have scalability issues.
So I have Feelings about introducing locking mechanisms.
In particular, looking at [1], I worry we're going to land in the
territory of bizarre and irritating issues that robust futexes / pthread
mutexes were invented to deal with.
In short, a futex is a fast userspace mutex facilitated by the kernel.
It's used to implement higher-level locks/mutexes/semaphores.
It works over shared memory - basically it twiddles bits in memory to
use as a lock.
Now, consider two processes, A and B, using shared memory and futexes.
A takes the futex, and B sits waiting for the futex so it can proceed.
While holding the futex, A segfaults. A is still holding the lock and
can no longer clear it, so B is now deadlocked.
It's easy to go "oh, we should catch segfaults", and with the right
handler functions you can. So let's say A does that. Now if it
segfaults, the handler unlocks and B continues. This is all well and
good, but now consider the situation where an impatient sysadmin (or the
kernel) SIGKILLs A. In this case no handler can help you: B will always
deadlock.
Robust futexes solve this problem with kernel magic - that's what makes
them "robust". (Basically, the kernel deals with the case of the process
going away with the lock held.)
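To make that concrete, here's a toy Python demonstration - nothing to
do with Patchwork itself, and the lock type and timings are arbitrary -
of what happens to a waiter when the holder of a non-robust lock is
SIGKILLed:

    import multiprocessing
    import os
    import signal
    import time

    def holder(lock):
        lock.acquire()              # A takes the lock...
        time.sleep(60)              # ...and never gets to release it

    def waiter(lock):
        lock.acquire()              # B waits here
        print("B got the lock")     # never reached in this scenario
        lock.release()

    if __name__ == '__main__':
        lock = multiprocessing.Lock()      # not robust: nobody cleans it up
        a = multiprocessing.Process(target=holder, args=(lock,))
        a.start()
        time.sleep(1)                      # let A grab the lock
        b = multiprocessing.Process(target=waiter, args=(lock,))
        b.start()
        os.kill(a.pid, signal.SIGKILL)     # the impatient sysadmin
        b.join(timeout=5)
        print("B still waiting?", b.is_alive())   # True: B is deadlocked
        b.terminate()                      # clean up the stuck waiter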
So, in Damien's case:
- we could deadlock if there is a Python exception
- so we catch this with a finally: and unlock
- but we're still stuffed if the Python process is killed in a way that
  stops Python from running the finally block
So we will have introduced a new locking mechanism with new, fragile,
tricky-to-debug failure modes*. There is no mechanism we could
conceivably hook into to become truly "robust". Now, Damien's code does
try: the underlying implementation in
https://lists.ozlabs.org/pipermail/patchwork/2015-October/001957.html
creates the lockfile (/tmp/whatever) as a symlink pointing to
"hostname:pid", because symlink creation is atomic. Good! But, to avoid
the robustness issue, if you try to take the lock and it's owned by a
process whose PID does not exist, it guesses that the process has died
and steals the lock. (There's a rough sketch of this pattern after the
list below.) This doesn't work if any of the following apply:
- you're in different PID namespaces but the same hostname namespace
  (e.g. docker --net=none).
- there have been enough processes that the PID counter has rolled over
  and the PID has been reused.
- the code tests for the existence of a process by sending signal 0 to
  the PID and seeing what happens. It's very easy to imagine a paranoid
  sysadmin using seccomp to block parsemail from sending signals, as
  there's no obvious reason why it should need to. (I checked with
  strace; it doesn't appear to send any at the moment.)
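For concreteness, the pattern is roughly this - a hand-written sketch,
not Damien's actual code, with the lockfile name and details invented
for illustration - and the comments mark where the failure modes above
bite:

    import errno
    import os
    import socket

    LOCKFILE = '/tmp/patchwork.parsemail.lock'   # name is made up

    def try_lock():
        me = '%s:%d' % (socket.gethostname(), os.getpid())
        try:
            os.symlink(me, LOCKFILE)   # symlink creation is atomic: good!
            return True
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise
        # Someone else holds the lock: guess whether they're still alive.
        host, pid = os.readlink(LOCKFILE).rsplit(':', 1)
        try:
            os.kill(int(pid), 0)   # signal 0: "does this PID exist?"
                                   # ...in *our* PID namespace, and only if
                                   # seccomp lets us send signals at all
            return False           # looks alive - or is a reused PID
        except ProcessLookupError:
            # The real owner may be alive in another container with the
            # same hostname; stealing here runs two parsers at once.
            os.unlink(LOCKFILE)    # "steal" the apparently-stale lock
            return try_lock()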
To add insult to injury, we would have added this concurrency control
mechanism to solve a problem in how we use an underlying system that
already has (literally!!) decades of work on correct concurrent
behaviour.
I would rather not do this.
Regards,
Daniel
* More interesting failure modes:
- Each parsemail process runs in a docker container with a unique /tmp
- PW runs on a RO filesystem
- PW runs distributed across machines - have to share /tmp (and w/
something that supports atomic symlinks e.g. NFS)
- can another unrelated process delete the lock file?
>
> Thoughts?
>
> [1] http://patchwork.ozlabs.org/patch/532930/