ibmvtpm byteswapping inconsistency

Sat Jan 28 05:02:34 AEDT 2017

On 01/26/2017 05:50 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote:
>> On 01/26/2017 12:22 PM, Michal Suchánek wrote:
>>> Hello,
>>>
>>> building ibmvtpm I noticed gcc warning complaining that second word
>>> of
>>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized.
>>>
>>> The structure is defined as 
>>>
>>> struct ibmvtpm_crq {
>>>         u8 valid;
>>>         u8 msg;
>>>         __be16 len;
>>>         __be32 data;
>>>         __be64 reserved;
>>> } __attribute__((packed, aligned(8)));
>>>
>>> initialized as
>>>
>>>         struct ibmvtpm_crq crq;
>>>         u64 *buf = (u64 *) &crq;
>>> ...
>>>         crq.valid = (u8)IBMVTPM_VALID_CMD;
>>>         crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND;
>>>
>>> and submitted with
>>>
>>>         rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]),
>>>                               cpu_to_be64(buf[1]));
>>
>> These should be be64_to_cpu() here. The underlying hcall made by
>> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the
>> RTAS interface which requires data in BE.
> 
> Hrm... an hcall takes register arguments. Register arguments don't have
> an endianness.

I wasn't suggesting that they do. However, I still believe my point is
valid that the arguments need to be loaded into the registers according
to the endianness of the cpu. We had several bugs during LE porting
where assumptions were made that parameters should be loaded BE
regardless of cpu endian. For example:

commit 3df76a9dcc74d5f012b94ea01ed6e7aaf8362c5a
Author: Cyril Bur <cyrilbur at gmail.com>
Date:   Wed Jan 21 13:32:00 2015 +1100

    powerpc/pseries: Fix endian problems with LE migration

    RTAS events require arguments be passed in big endian while
    hypercalls have their arguments passed in registers and the values
    should therefore be in CPU endian.

> 
> The problem is that we are packing an in-memory structure into 2
> registers and it's expected that this structure is laid out in the
> registers as if it had been loaded by a BE CPU.

This is only the case if the cpu is BE. If the cpu is LE, regardless of
the fact that our in memory structure is laid out BE, when we break it
into 2 words each of those words needs to be loaded LE.

-Tyrel

> 
> So we have two things at play here:
> 
>   - The >8-bit fields should be laid out BE in the memory image
>   - That whole 128-bit structure should be loaded into 2 64-bit
> registers MSB first.
> 
> So the "double" swap is somewhat needed. The uglyness comes from the
> passing-by-register of the h-call but it should work.
> 
> That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you
> better result (though recent gcc's might not make a difference).
>>>
>>> which means that the second word indeed contains purely garbage.
>>>
>>> This is repeated a few times in the driver so I added memset to
>>> quiet
>>> gcc and make behavior deterministic in case the unused fields get
>>> some
>>> meaning in the future.
>>>
>>> However, in tpm_ibmvtpm_send the structure is initialized as
>>>
>>> 	struct ibmvtpm_crq crq;
>>>         __be64 *word = (__be64 *)&crq;
>>> ...
>>>         crq.valid = (u8)IBMVTPM_VALID_CMD;
>>>         crq.msg = (u8)VTPM_TPM_COMMAND;
>>>         crq.len = cpu_to_be16(count);
>>>         crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle);
>>>
>>> and submitted with
>>>
>>> 	rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]),
>>>                               be64_to_cpu(word[1]));
>>> meaning it is swapped twice.
>>>
>>>
>>> Where is the interface defined? Are the command arguments passed as
>>> BE
>>> subfields (the second case was correct before adding the extra
>>> whole
>>> word swap) or BE words (the first case doing whole word swap is
>>> correct)?
>>
>> The interface is defined in PAPR. The crq format is defined in BE
>> terms.
>> However, when we break the crq apart into high and low words they
>> need
>> to be in cpu endian as mentioned above.
>>
>> -Tyrel
>>
>>>
>>> Thanks
>>>
>>> Michal
>>>
>