[PATCH] KVM: PPC: Book3S HV: Do not expose HFSCR sanitisation to nested hypervisor
Fabiano Rosas
farosas at linux.ibm.com
Tue Mar 9 02:04:16 AEDT 2021
Nicholas Piggin <npiggin at gmail.com> writes:
> Excerpts from Fabiano Rosas's message of March 6, 2021 9:10 am:
>> As one of the arguments of the H_ENTER_NESTED hypercall, the nested
>> hypervisor (L1) prepares a structure containing the values of various
>> hypervisor-privileged registers with which it wants the nested guest
>> (L2) to run. Since the nested HV runs in supervisor mode it needs the
>> host to write to these registers.
>>
>> To stop a nested HV manipulating this mechanism and using a nested
>> guest as a proxy to access a facility that has been made unavailable
>> to it, we have a routine that sanitises the values of the HV registers
>> before copying them into the nested guest's vcpu struct.
>>
>> However, when coming out of the guest the values are copied as they
>> were back into L1 memory, which means that any sanitisation we did
>> during guest entry will be exposed to L1 after H_ENTER_NESTED returns.
>>
>> This is not a problem by itself, but in the case of the Hypervisor
>> Facility Status and Control Register (HFSCR), we use the intersection
>> between L2 hfscr bits and L1 hfscr bits. That means that L1 could use
>> this to indirectly read the (hv-privileged) value from its vcpu
>> struct.
>>
>> This patch fixes this by making sure that L1 only gets back the bits
>> that are necessary for regular functioning.
>
> The general idea of restricting exposure of HV privileged bits is fine, but
> for the case of HFSCR a guest can probe the HFSCR anyway by testing which
> facilities are available (and presumably an HV may need some way to know
> what features are available for it to advertise to its own guests), so
> is this necessary? Perhaps a comment would be sufficient.
>
Well, I'd be happy to force them through the arduous path then =). And
there are features that are emulated by the HV which L1 would not be
able to probe.

I think we should implement a mechanism that stops all leaks now, rather
than having to ponder this every time we touch an hv_reg in that
structure. I'm not too worried about HFSCR specifically.

Let me think about this some more and see if I can make it more generic.
I realise that sticking the saved_hfscr on the side is not the most
elegant approach.
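
For anyone following along, here is a minimal userspace sketch of the
HFSCR handling under discussion. It is not the kernel code: the helper
names (sanitise_hfscr, return_hfscr_unpatched, return_hfscr_patched,
hfscr_demo) and the example register values are made up for
illustration, and HFSCR_INTR_CAUSE is assumed to be the top byte of the
register. It only shows the mask arithmetic: if L1 requests every
facility bit, the unpatched return path hands back exactly the bits the
host enabled for L1, while the patched path hands back what L1 asked
for plus the interrupt cause.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the interrupt cause field occupies the top byte of HFSCR. */
#define HFSCR_INTR_CAUSE (0xFFULL << 56)

/*
 * Entry path (modelled on sanitise_hv_regs()): restrict the L2 value to
 * the facilities the host granted L1, keeping the interrupt cause field.
 * Returns the value L1 originally supplied.
 */
static uint64_t sanitise_hfscr(uint64_t l1_hfscr, uint64_t *l2_hfscr)
{
	uint64_t saved = *l2_hfscr;

	*l2_hfscr &= (HFSCR_INTR_CAUSE | l1_hfscr);
	return saved;
}

/* Exit path without the patch: the sanitised value is returned as-is. */
static uint64_t return_hfscr_unpatched(uint64_t vcpu_hfscr)
{
	return vcpu_hfscr;
}

/*
 * Exit path with the patch: facility bits come from what L1 supplied,
 * only the interrupt cause comes from the value L2 actually ran with.
 */
static uint64_t return_hfscr_patched(uint64_t saved, uint64_t vcpu_hfscr)
{
	return (~HFSCR_INTR_CAUSE & saved) | (HFSCR_INTR_CAUSE & vcpu_hfscr);
}

/* Hypothetical walk-through with made-up values; returns 0 on success. */
static int hfscr_demo(void)
{
	uint64_t host_l1_hfscr = 0x5a5ULL;       /* what the host set for L1 */
	uint64_t requested = ~HFSCR_INTR_CAUSE;  /* L1 asks for every facility */
	uint64_t l2 = requested;
	uint64_t saved = sanitise_hfscr(host_l1_hfscr, &l2);
	uint64_t after_run = l2 | (0x03ULL << 56); /* pretend a cause was set */

	/* Unpatched: facility bits seen by L1 equal the host's L1 value. */
	assert((return_hfscr_unpatched(after_run) & ~HFSCR_INTR_CAUSE) ==
	       host_l1_hfscr);

	/* Patched: L1 sees its own request plus the interrupt cause. */
	assert(return_hfscr_patched(saved, after_run) ==
	       (requested | (0x03ULL << 56)));
	return 0;
}
```

Calling hfscr_demo() from a small main() is enough to exercise both
paths. The sketch says nothing about emulated facilities, which is the
case where probing from L1 would not give the same answer.
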
> Thanks,
> Nick
>
>>
>> Signed-off-by: Fabiano Rosas <farosas at linux.ibm.com>
>> ---
>> arch/powerpc/kvm/book3s_hv_nested.c | 22 +++++++++++++++++-----
>> 1 file changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
>> index 0cd0e7aad588..860004f46e08 100644
>> --- a/arch/powerpc/kvm/book3s_hv_nested.c
>> +++ b/arch/powerpc/kvm/book3s_hv_nested.c
>> @@ -98,12 +98,20 @@ static void byteswap_hv_regs(struct hv_guest_state *hr)
>> }
>>
>> static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
>> - struct hv_guest_state *hr)
>> + struct hv_guest_state *hr, u64 saved_hfscr)
>> {
>> struct kvmppc_vcore *vc = vcpu->arch.vcore;
>>
>> + /*
>> + * During sanitise_hv_regs() we used HFSCR bits from L1 state
>> + * to restrict what the L2 state is allowed to be. Since L1 is
>> + * not allowed to read this SPR, do not include these
>> + * modifications in the return state.
>> + */
>> + hr->hfscr = ((~HFSCR_INTR_CAUSE & saved_hfscr) |
>> + (HFSCR_INTR_CAUSE & vcpu->arch.hfscr));
>> +
>> hr->dpdes = vc->dpdes;
>> - hr->hfscr = vcpu->arch.hfscr;
>> hr->purr = vcpu->arch.purr;
>> hr->spurr = vcpu->arch.spurr;
>> hr->ic = vcpu->arch.ic;
>> @@ -132,12 +140,14 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
>> }
>> }
>>
>> -static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
>> +static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr,
>> + u64 *saved_hfscr)
>> {
>> /*
>> * Don't let L1 enable features for L2 which we've disabled for L1,
>> * but preserve the interrupt cause field.
>> */
>> + *saved_hfscr = hr->hfscr;
>> hr->hfscr &= (HFSCR_INTR_CAUSE | vcpu->arch.hfscr);
>>
>> /* Don't let data address watchpoint match in hypervisor state */
>> @@ -272,6 +282,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>> u64 hdec_exp;
>> s64 delta_purr, delta_spurr, delta_ic, delta_vtb;
>> u64 mask;
>> + u64 hfscr;
>> unsigned long lpcr;
>>
>> if (vcpu->kvm->arch.l1_ptcr == 0)
>> @@ -324,7 +335,8 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>> mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
>> LPCR_LPES | LPCR_MER;
>> lpcr = (vc->lpcr & ~mask) | (l2_hv.lpcr & mask);
>> - sanitise_hv_regs(vcpu, &l2_hv);
>> +
>> + sanitise_hv_regs(vcpu, &l2_hv, &hfscr);
>> restore_hv_regs(vcpu, &l2_hv);
>>
>> vcpu->arch.ret = RESUME_GUEST;
>> @@ -345,7 +357,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>> delta_spurr = vcpu->arch.spurr - l2_hv.spurr;
>> delta_ic = vcpu->arch.ic - l2_hv.ic;
>> delta_vtb = vc->vtb - l2_hv.vtb;
>> - save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv);
>> + save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv, hfscr);
>>
>> /* restore L1 state */
>> vcpu->arch.nested = NULL;
>> --
>> 2.29.2
>>
>>