[SLOF] [PATCH 3/4] fbuffer: Implement MRMOVE as an accelerated primitive
Thomas Huth
thuth at redhat.com
Mon Aug 3 17:55:39 AEST 2015
On 03/08/15 08:01, Alexey Kardashevskiy wrote:
> On 07/31/2015 11:00 PM, Thomas Huth wrote:
>> The character drawing function fb8-draw-character uses "mrmove"
>> (which moves main memory contents to IO memory) to copy the data
>> of the character from main memory to the frame buffer. However,
>> the current implementation of "mrmove" performs quite badly on
>> board-qemu since it triggers a hypercall for each memory access
>> (e.g. for each 8 bytes that are transferred).
>> But since the KVMPPC_H_LOGICAL_MEMOP hypercall can transfer bigger
>> regions at once, we can accelerate the character drawing quite a
>> bit by simply mapping the "mrmove" to the same macro that is
>> already used for the "rmove".
...
>> + register t tmp; \
>> + while (size > 0) { \
>> + tmp = *s1++; SET_CI; *d1++ = tmp; CLR_CI; size -= sizeof(t); \
>> + } \
>> +}
>> +
>> +#define _FASTMRMOVE(s, d, size) \
>> + switch (((type_u)s | (type_u)d | size) & (sizeof(type_u)-1)) { \
>> + case 0: _MRMOVE(s, d, size, type_u); break; \
>> + case sizeof(type_l): _MRMOVE(s, d, size, type_l); break; \
>> + case sizeof(type_w): _MRMOVE(s, d, size, type_w); break; \
>> + default: _MRMOVE(s, d, size, type_c); break; \
>> + }
>> +
>
> You could have one _FASTMRMOVE() (or even expand it in prim.code) and
> define _MRMOVE() per board (would be ci_rmove() for qemu).
Ok, tried this now, but it unfortunately does not work out very well:
For _MRMOVE in board-js2x, I need the access size parameter to be the
type operator ("type_x"), while for ci_rmove() the access size parameter
needs to be the number of bits to shift (i.e. log2(sizeof(type_x))).
I could of course add some more wrapper code in there, but I think this
is only getting uglier, so I'll stick with the original idea to provide
a FASTMRMOVE macro in both cache.h files.
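To illustrate the mismatch, here is a minimal sketch of the kind of wrapper
code that would be needed. It is not taken from the patch: the helper names
pick_access_size() and access_size_to_shift() are hypothetical, and the
fixed-width types stand in for type_u, type_l, type_w and type_c on the
assumption that they are 8, 4, 2 and 1 bytes wide. The first helper mirrors
the switch in the quoted _FASTMRMOVE, the second turns the resulting byte
size into the shift amount that a ci_rmove()-style interface would expect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the access size in bytes from the combined
 * alignment of source address, destination address and length. It follows
 * the switch in the quoted _FASTMRMOVE macro, so any combination other
 * than 0, 4 or 2 in the low three bits falls back to byte accesses.
 */
static size_t pick_access_size(uintptr_t src, uintptr_t dst, size_t size)
{
	switch ((src | dst | size) & (sizeof(uint64_t) - 1)) {
	case 0:
		return sizeof(uint64_t);
	case sizeof(uint32_t):
		return sizeof(uint32_t);
	case sizeof(uint16_t):
		return sizeof(uint16_t);
	default:
		return sizeof(uint8_t);
	}
}

/*
 * Hypothetical helper: convert an access size in bytes (a power of two)
 * into the number of bits to shift, i.e. log2(access_size).
 */
static unsigned int access_size_to_shift(size_t access_size)
{
	unsigned int shift = 0;

	/* Only non-zero powers of two have an exact log2. */
	assert(access_size != 0 && (access_size & (access_size - 1)) == 0);

	while (access_size > 1) {
		access_size >>= 1;
		shift++;
	}
	return shift;
}
```

With something like this, a shared dispatcher could compute the size once
and hand either the size or the shift to the board-specific code, but the
js2x variant would still need a type name rather than a number, which is
exactly the extra wrapping mentioned above.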
Thomas