[PATCH 1/5] KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest

Sean Christopherson seanjc at google.com
Wed Aug 18 10:12:06 AEST 2021


Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to
transferring to a KVM guest, which is roughly equivalent to an exit to
userspace and processes many of the same pending actions.  While the task
cannot be in an rseq critical section as the KVM path is reachable only
via ioctl(KVM_RUN), the side effects that apply to rseq outside of a
critical section still apply, e.g. the CPU ID needs to be updated if the
task is migrated.

Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults
and other badness in userspace VMMs that use rseq in combination with KVM,
e.g. due to the CPU ID being stale after task migration.

Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function")
Reported-by: Peter Foley <pefoley at google.com>
Bisected-by: Doug Evans <dje at google.com>
Cc: Shakeel Butt <shakeelb at google.com>
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: stable at vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc at google.com>
---
 kernel/entry/kvm.c | 4 +++-
 kernel/rseq.c      | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
index 49972ee99aff..049fd06b4c3d 100644
--- a/kernel/entry/kvm.c
+++ b/kernel/entry/kvm.c
@@ -19,8 +19,10 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
 		if (ti_work & _TIF_NEED_RESCHED)
 			schedule();
 
-		if (ti_work & _TIF_NOTIFY_RESUME)
+		if (ti_work & _TIF_NOTIFY_RESUME) {
 			tracehook_notify_resume(NULL);
+			rseq_handle_notify_resume(NULL, NULL);
+		}
 
 		ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work);
 		if (ret)
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 35f7bd0fced0..58c79a7918cd 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -236,7 +236,7 @@ static bool in_rseq_cs(unsigned long ip, struct rseq_cs *rseq_cs)
 
 static int rseq_ip_fixup(struct pt_regs *regs)
 {
-	unsigned long ip = instruction_pointer(regs);
+	unsigned long ip = regs ? instruction_pointer(regs) : 0;
 	struct task_struct *t = current;
 	struct rseq_cs rseq_cs;
 	int ret;
@@ -250,7 +250,7 @@ static int rseq_ip_fixup(struct pt_regs *regs)
 	 * If not nested over a rseq critical section, restart is useless.
 	 * Clear the rseq_cs pointer and return.
 	 */
-	if (!in_rseq_cs(ip, &rseq_cs))
+	if (!regs || !in_rseq_cs(ip, &rseq_cs))
 		return clear_rseq_cs(t);
 	ret = rseq_need_restart(t, rseq_cs.flags);
 	if (ret <= 0)
-- 
2.33.0.rc1.237.g0d66db33f3-goog



More information about the Linuxppc-dev mailing list