[PATCH v4 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

Alexey Kardashevskiy aik at ozlabs.ru
Wed Dec 4 12:08:09 AEDT 2019



On 04/12/2019 11:49, Ram Pai wrote:
> On Wed, Dec 04, 2019 at 11:04:04AM +1100, Alexey Kardashevskiy wrote:
>>
>>
>> On 04/12/2019 03:52, Ram Pai wrote:
>>> On Tue, Dec 03, 2019 at 03:24:37PM +1100, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 03/12/2019 15:05, Ram Pai wrote:
>>>>> On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote:
>>>>>>
>>>>>>
>>>>>> On 03/12/2019 13:08, Ram Pai wrote:
>>>>>>> On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/12/2019 17:45, Ram Pai wrote:
>>>>>>>>> The H_PUT_TCE_INDIRECT hcall takes a page filled with TCE entries as one
>>>>>>>>> of its parameters. One page per cpu is dedicated to this purpose for the
>>>>>>>>> lifetime of the kernel. On secure VMs, when the hypervisor accesses this
>>>>>>>>> page it sees only encrypted contents, but it needs the unencrypted TCE
>>>>>>>>> entries to update the TCE table accordingly. There is nothing secret or
>>>>>>>>> sensitive about these entries, hence share the page with the hypervisor.
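
(For context, the idea boils down to roughly the following when the per-cpu
page is allocated; this is only a sketch, not the exact hunk from the patch,
and the tcep variable name is just illustrative:)

	tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
	/* Share the page so the HV can read the TCEs the guest writes into
	 * it; on a secure pseries guest set_memory_decrypted() ends up in
	 * uv_share_page(). */
	if (tcep && is_secure_guest())
		set_memory_decrypted((unsigned long)tcep, 1);
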
>>>>>>>>
>>>>>>>> This unsecures a page in the guest at a random location, which creates an
>>>>>>>> additional attack surface; it is hard to exploit, but nevertheless it is
>>>>>>>> there. A safer option would be not to use the hcall-multi-tce hypertas
>>>>>>>> option (which translates to FW_FEATURE_MULTITCE in the guest).
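
(In other words, something along these lines; the placement is made up and
this is only a sketch of the idea, not a tested patch:)

	/* Do not use the multi-TCE hcalls when running secure so the TCE
	 * page never needs to be shared with the hypervisor. */
	if (is_secure_guest())
		powerpc_firmware_features &= ~FW_FEATURE_MULTITCE;
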
>>>>>>>
>>>>>>>
>>>>>>> Hmm... How do we not use it? AFAICT the hcall-multi-tce option gets enabled
>>>>>>> automatically when the IOMMU option is enabled.
>>>>>>
>>>>>> It is advertised by QEMU but the guest does not have to use it.
>>>>>
>>>>> Are you suggesting that even a normal guest should not use hcall-multi-tce,
>>>>> or just a secure guest?
>>>>
>>>>
>>>> Just secure.
>>>
>>> hmm..  how are the TCE entries communicated to the hypervisor, if
>>> hcall-multi-tce is disabled?
>>
>> Via H_PUT_TCE, which updates one entry at a time (sets or clears it).
>> hcall-multi-tce enables H_PUT_TCE_INDIRECT (up to 512 entries at once) and
>> H_STUFF_TCE (clearing, up to ~4 billion entries at once); these are simply
>> an optimization.
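
(In code terms the difference is roughly the following; just a sketch, see
tce_build_pSeriesLP()/tce_buildmulti_pSeriesLP() upstream for the real thing:)

	/* One TCE per hcall, always available: */
	rc = plpar_tce_put((u64)tbl->it_index,
			   tcenum << tbl->it_page_shift, tce);

	/* With hcall-multi-tce: up to 512 TCEs read by the HV from a guest
	 * page, which is why that page has to be readable by the hypervisor: */
	rc = plpar_tce_put_indirect((u64)tbl->it_index,
				    tcenum << tbl->it_page_shift,
				    (u64)__pa(tcep), npages);
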
> 
> Do you still think a secure VM should use H_PUT_TCE and not
> H_PUT_TCE_INDIRECT, while a normal VM should use H_PUT_TCE_INDIRECT?
> Is there any advantage to special-casing it for secure VMs?


Reducing the amount of insecure memory at a random location.


> In fact, we could make use of as many optimizations as possible.
> 
> 
>>
>>>>>> Isn't this for pci+swiotlb?
> ..snip..
>>>>> This patch is purely to help the hypervisor set up the TCE table, in the
>>>>> presence of an IOMMU.
>>>>
>>>> Then the hypervisor should be able to access the guest pages mapped for
>>>> DMA and these pages should be made unsecure for this to work. Where/when
>>>> does this happen?
>>>
>>> This happens in the SWIOTLB code.  The code to do that is already
>>> upstream.  
>>>
>>> The sharing of the pages containing the SWIOTLB bounce buffers is done in
>>> init_svm(), which calls swiotlb_update_mem_attributes(), which in turn
>>> calls set_memory_decrypted(). In the case of pseries, set_memory_decrypted()
>>> calls uv_share_page().
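
(That pseries set_memory_decrypted() is, roughly from memory, just:)

	int set_memory_decrypted(unsigned long addr, int numpages)
	{
		if (!PAGE_ALIGNED(addr))
			return -EINVAL;

		uv_share_page(PHYS_PFN(__pa(addr)), numpages);

		return 0;
	}
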
>>
>>
>> This does not seem enough: when you enforce iommu_platform=on, QEMU starts
>> accessing virtio buffers via the IOMMU, so the bounce buffers have to be
>> mapped explicitly via H_PUT_TCE&co. Where does this happen?
>>
> 
> I think it happens at boot time. Every page of guest memory is TCE-mapped
> if the IOMMU is enabled; the SWIOTLB pages get implicitly TCE-mapped as
> part of that operation.


Ah, I see. This works via the huge DMA window. OK, makes sense now.
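
I.e. at boot the guest ends up doing roughly the following for the whole of
its RAM (a very rough sketch; the real code batches this via
H_PUT_TCE_INDIRECT):

	/* Map all guest RAM 1:1 into the huge 64bit window; the swiotlb pool
	 * is covered simply because it is part of guest RAM. */
	for (phys = 0; phys < memblock_end_of_DRAM();
			phys += IOMMU_PAGE_SIZE(tbl)) {
		rc = plpar_tce_put((u64)tbl->it_index,
				   (tbl->it_offset << tbl->it_page_shift) + phys,
				   phys | TCE_PCI_READ | TCE_PCI_WRITE);
		if (rc)
			break;
	}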

It just seems like a waste: we could map the swiotlb buffer 1:1 via the
always-present small DMA window, but instead we rely on the huge window to
map these small buffers. This way we are wasting the entire 32bit window and
most of the huge window. We may fix it in the future (not right now), but
for now I would still avoid unsecuring additional memory. Thanks,


-- 
Alexey

