[PATCH 2/3] Include the responsible user in patch-related events

Tue Oct 8 06:32:24 AEDT 2019

On Mon, Oct 7, 2019 at 6:06 PM Stephen Finucane <stephen at that.guru> wrote:
> On Mon, 2019-10-07 at 00:57 +0200, Johan Herland wrote:
> > We want to use the events as an audit log of patch state/delegate
> > changes. An important part of this is recording _who_ changed the
> > patch state/delegate.
> >
> > To do this, we need to know the active user (request.user) at the
> > point where we create the Event instance. Event creation is currently
> > triggered by signals, but neither the signal handlers, nor the model
> > classes themselves have easy access to the active user.
>
> I ran into the same issue when trying to resolve this (tracked via [1],
> which you could probably refer to in your commit messages)...

Ah, cool. I was not aware of this issue. Will add reference to the
commit message.

> > However, for the patch-based Events that are relevant to us, we can do
> > the following hack:
> >
> > The relevant events are created in signal handlers that are all hooked
> > up to either the pre_save or post_save signals sent by the save()
> > method of the Patch model. Before a Patch.save() can happen, naturally,
> > Patchwork must query Patch.is_editable() to ascertain whether the patch
> > can in fact be changed by the active user. Hence, we do in fact have
> > access to the active user (in Patch.is_editable()) before the Event is
> > created (as a result of the signals emitted by Patch.save()).
> >
> > Thus, all we really need is a way to communicate this user instance
> > from Patch.is_editable() to the signal handlers that create the
> > resulting Events. The Patch object is available in both places, and is
> > the simplest place to put this information, so we simply add an
> > ._edited_by attribute (which fortunately is not detected as a persistent
> > database field by Django).
>
> We _could_ do this but, as you note, it is a hack. Before we commit,
> it's probably worth exploring the other options we have. From my
> investigation into this in the distant past, there seemed to be two
> potential solutions: using middleware instead of events to intercept
> the request and create events,

I came across this in my research as well, and found multiple pointers
to things like https://pypi.org/project/django-currentuser/ (or worse
hacks along the same lines) in order to resolve it. I guess the
essence of the matter is that we're trying to accomplish something
that (seemingly) violates the MVC design of Django. The current user
is/should only be available in the view part of the app, and
everything is sufficiently decoupled that the "downstream" parts (e.g.
models and signals) simply don't have this information at hand. This
would suggest that creating our events should happen close to the
views themselves, but this seems like a really bad idea when very
different views (e.g. REST API vs. webpages) end up modifying the same
models in the same ways, as we would then have to duplicate the event
creation across a lot of views.

> adding 'last_modified_by' fields to
> various models and saving the user there for consumption when creating
> the event,

I briefly considered this, but it really does not appeal to me, as I
cannot see how this extra 'last_modified_by' field brings any value to
the models at all (given that the Events would store a much richer set
of details).

I am also unsure exactly which models would want this field. The
current event categories concern these models: CoverLetter (creation),
Patch (creation, state-change, delegation, completion), Check
(creation), and Series (creation, completion). But several of these
events (cover-created, patch-created, series-created, at least) happen
as a result of incoming emails, and therefore - I suspect - does not
really have a proper 'actor' as such. And the series-completed seems
to be closely tied to / triggered by patch-completed. This leaves
check-created, and the remaining patch-* events, which suggests that
only Patch and Check would need a 'last_modified_by'. Looking closer
at Check, however, it already has a 'user' field, which I suspect (but
I'm not sure) would correlate closely to 'last_modified_by' [1].

This leaves Patch as the only model requiring a 'last_modified_by'
field. And I believe the end result of that change would be very
similar to this patch, with the exception that my
transient/non-persistent ._edited_by attribute would instead become a
"proper"/persistent .last_modified_by field (but it would still add no
value to the Patch model itself, IMHO).

> and modifying the 'save' methods of the various models.

The current architecture hooks the event creation up to the pre_save
and post_save signals, meaning that the .save() methods are already
(indirectly) involved in the event creation. What do you gain by
plugging things directly into the .save() method? Do you have easier
access to the current user from here? As far as I could see, getting
the current user from inside the models (modulo my hack of
.is_editable()) is just as hard as getting it from inside
signals.py...

> The
> first option was the one I pursued but I ended up not following through
> with it because (a) it was hard to read/test and (b) it had to be
> enabled by the admin of the instance (via adding it to 'MIDDLEWARE' in
> settings).

Understandable.

>I don't recall trying out the other two approaches but the
> latter one (modifying 'save' methods) seems like it would do the trick,
> at the expense of making the 'models.py' file even longer. Did you
> check this approach out and, if so, how did it work?

I haven't tried overriding the .save() method(s), and I'm not sure
what the result would look like, as I'm still stuck (AFAICS) with the
issue of how to get the current user in this context.

> Thanks for taking a look into this. Much appreciated :)

No worries, I'm just happy it resonated with upstream, as having to
maintain patches (especially patches with schema changes) downstream
is not at all desirable. :)

...Johan

[1]: I have no experience with Checks, so I don't know: when a check
succeeds/fails, do we change the .state of the Check object, or do we
simply add another Check object with the new state? If the former is
true, should we add a check-state-changed event in addition to
check-created?

> Stephen
>
> [1] https://github.com/getpatchwork/patchwork/issues/73

-- 
Johan Herland, <johan at herland.net>
www.herland.net