[PATCH 14/27] Add book3s_64 specific opcode emulation
Alexander Graf
agraf at suse.de
Thu Nov 5 21:09:43 EST 2009
On 05.11.2009, at 01:53, Segher Boessenkool wrote:
>>>> + case OP_31_XOP_EIOIO:
>>>> + break;
>>>
>>> Have you always executed an eieio or sync when you get here, or
>>> do you just not allow direct access to I/O devices? Other context
>>> synchronising insns are not enough, they do not broadcast on the
>>> bus.
>>
>> There is no device passthrough yet :-). It's theoretically
>> possible, but nothing for it is implemented so far.
>
> You could just always do an eieio here, it's not expensive at all
> compared to the emulation trap itself.
>
> However -- eieio is a Book II insn, it will never trap anyway!
Don't all 31 ops trap? I'm pretty sure I added the emulation because I
saw the trap.
>>>> + case OP_31_XOP_DCBZ:
>>>> + {
>>>> + ulong rb = vcpu->arch.gpr[get_rb(inst)];
>>>> + ulong ra = 0;
>>>> + ulong addr;
>>>> + u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
>>>> +
>>>> + if (get_ra(inst))
>>>> + ra = vcpu->arch.gpr[get_ra(inst)];
>>>> +
>>>> + addr = (ra + rb) & ~31ULL;
>>>> + if (!(vcpu->arch.msr & MSR_SF))
>>>> + addr &= 0xffffffff;
>>>> +
>>>> + if (kvmppc_st(vcpu, addr, 32, zeros)) {
>>>
>>> DCBZ zeroes out a cache line, not 32 bytes; except on 970, where
>>> there
>>> are HID bits to make it work on 32 bytes only, and an extra DCBZL
>>> insn
>>> that always clears a full cache line (128 bytes).
>>
>> Yes. We only come here when we patched the dcbz opcodes to invalid
>> instructions
>
> Ah yes, I forgot. Could you rename it to OP_31_XOP_FAKE_32BIT_DCBZ
> or such?
Good idea.
>> because cache line size of target == 32.
>> On 970 with MSR_HV = 0 we actually use the dcbz 32-bytes mode.
>>
>> Admittedly though, this could be a lot more clever.
>
>>>> + /* guest HID5 set can change is_dcbz32 */
>>>> + if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
>>>> + (mfmsr() & MSR_HV))
>>>> + vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
>>>> + break;
>>>
>>> Wait, does this mean you allow other HID writes when MSR[HV] isn't
>>> set? All HIDs (and many other SPRs) cannot be read or written in
>>> supervisor mode.
>>
>> When we're running in MSR_HV=0 mode on a 970 we can use the 32 byte
>> dcbz HID flag. So all we need to do is tell our entry/exit code to
>> set this bit.
>
> Which patch contains that entry/exit code?
That's patch 7 / 27.
+ /* Some guests may need to have dcbz set to 32 byte length.
+ *
+ * Usually we ensure that by patching the guest's instructions
+ * to trap on dcbz and emulate it in the hypervisor.
+ *
+ * If we can, we should tell the CPU to use 32 byte dcbz though,
+ * because that's a lot faster.
+ */
+
+ ld r3, VCPU_HFLAGS(r4)
+ rldicl. r3, r3, 0, 63 /* CR = ((r3 & 1) == 0) */
+ beq no_dcbz32_on
+
+ mfspr r3,SPRN_HID5
+ ori r3, r3, 0x80 /* XXX HID5_dcbz32 = 0x80 */
+ mtspr SPRN_HID5,r3
+
+no_dcbz32_on:
>> If we're on 970 on a hypervisor or on a non-970 though we can't use
>> the HID5 bit, so we need to binary patch the opcodes.
>>
>> So in order to emulate real 970 behavior, we need to be able to
>> emulate that HID5 bit too! That's what this chunk of code does - it
>> basically sets us in dcbz32 mode when allowed on 970 guests.
>
> But when MSR[HV]=0 and MSR[PR]=0, mtspr to a hypervisor resource
> will not trap but be silently ignored. Sorry for not being more
> clear.
> ...Oh. You run your guest as MSR[PR]=1 anyway! Tricky.
Yeah, the guest is always running in PR=1, so all HV checks are for
the host. Usually we run in HV=1 on the host, because IBM doesn't sell
machines that have HV=0 accessible for mortals :-).
I'll address your comments in a follow-up patch once the stuff is
merged.
Alex
More information about the Linuxppc-dev
mailing list