[SLOF] [PATCH 3/4] fbuffer: Implement MRMOVE as an accelerated primitive

Alexey Kardashevskiy aik at ozlabs.ru
Mon Aug 3 16:48:16 AEST 2015


On 08/03/2015 04:22 PM, Thomas Huth wrote:
> On 03/08/15 08:01, Alexey Kardashevskiy wrote:
>> On 07/31/2015 11:00 PM, Thomas Huth wrote:
>>> The character drawing function fb8-draw-character uses "mrmove"
>>> (which moves main memory contents to IO memory) to copy the data
>>> of the character from main memory to the frame buffer. However,
>>> the current implementation of "mrmove"  performs quite badly on
>>> board-qemu since it triggers a hypercall for each memory access
>>> (e.g. for each 8 bytes that are transfered).
>>> But since the KVMPPC_H_LOGICAL_MEMOP hypercall can transfer bigger
>>> regions at once, we can accelerate the character drawing quite a
>>> bit by simply mapping the "mrmove" to the same macro that is
>>> already used for the "rmove".
>>
>> You described "mrmove" (which is good), please add a couple of words how
>> "rmove" is different from mrmove.
>
> While MRMOVE transfers data from main memory to IO memory, RMOVE
> transfers data from IO memory to IO memory. It is used for scrolling the
> screen, for example.

I'd never ever guess from their names...

I am missing the point here. Why not to get rid of "mrmove"? The proposed 
implementation with make "mrmove" doing pretty much the same thing as 
"rmove" already does.



>>> For keeping board-js2x in sync,
>>> this patch also transforms the "mrmove" for js2x into primitives.
>>>
>>> Signed-off-by: Thomas Huth <thuth at redhat.com>
>>> ---
>>>    include/ppc970/cache.h | 18 ++++++++++++++++++
>>>    include/ppcp7/cache.h  |  2 ++
>>>    slof/fs/rmove.fs       | 16 ----------------
>>>    slof/prim.code         |  6 ++++++
>>>    slof/prim.in           |  2 +-
>>>    5 files changed, 27 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/include/ppc970/cache.h b/include/ppc970/cache.h
>>> index b748689..3d27f92 100644
>>> --- a/include/ppc970/cache.h
>>> +++ b/include/ppc970/cache.h
>>> @@ -83,4 +83,22 @@ cache_inhibited_access(uint64_t, 64)
>>>            default:        _RMOVE(s, d, size, type_c); break; \
>>>        }
>>>
>>> +/* main RAM to IO memory move */
>>> +#define _MRMOVE(s, d, size, t)    \
>>> +{ \
>>> +    t *s1=(t *)s, *d1=(t *)d; \
>>
>>
>> The next patch puts spaces around "=" but not this one.
>>
>> Also, values passed to a macro should always be wrapped in braces.
>
> Yeah, I guess I should pay more attention when I copy-n-past code from
> other spots in the file (_MRMOVE started as a copy from _FWRMOVE).
>
>>> +    register t tmp; \
>>> +    while (size > 0) { \
>>> +        tmp = *s1++; SET_CI; *d1++ = tmp; CLR_CI; size -= sizeof(t); \
>>> +    } \
>>> +}
>>> +
>>> +#define _FASTMRMOVE(s, d, size) \
>>> +    switch (((type_u)s | (type_u)d | size) & (sizeof(type_u)-1)) { \
>>> +        case 0:            _MRMOVE(s, d, size, type_u); break; \
>>> +        case sizeof(type_l):    _MRMOVE(s, d, size, type_l); break; \
>>> +        case sizeof(type_w):    _MRMOVE(s, d, size, type_w); break; \
>>> +        default:        _MRMOVE(s, d, size, type_c); break; \
>>> +    }
>>> +
>>
>> You could have one _FASTMRMOVE() (or even expand it prim.code) and
>> define _MRMOVE() per board (would be ci_rmove() for qemu).
>
> Ok, I'll have a try.
>
>>> diff --git a/slof/prim.in b/slof/prim.in
>>> index 7a0d6a2..f323aed 100644
>>> --- a/slof/prim.in
>>> +++ b/slof/prim.in
>>> @@ -104,8 +104,8 @@ cod(SEMICOLON)
>>>    cod(EXECUTE)
>>>
>>>    cod(MOVE)
>>> -// cod(RMOVE64)
>>
>> This does not belong to this patch.
>
> Darn! You've caught me ;-)

The change is fine, just send in a separate patch (altogether with many 
other things which I bet you already want to get rid of) :)



-- 
Alexey


More information about the SLOF mailing list