[Skiboot] [PATCH 05/15] libc/string: Add memcpy_ci
Claudio Carvalho
cclaudio at linux.vnet.ibm.com
Tue Sep 13 23:25:26 AEST 2016
On 09/03/2016 12:20 AM, Oliver O'Halloran wrote:
> On Fri, Sep 2, 2016 at 2:59 PM, Claudio Carvalho
> <cclaudio at linux.vnet.ibm.com> wrote:
>>
>>
>> On 09/01/2016 06:34 AM, Stewart Smith wrote:
>>> Claudio Carvalho <cclaudio at linux.vnet.ibm.com> writes:
>>>> --- /dev/null
>>>> +++ b/libc/string/memcpy_ci.c
>>>> @@ -0,0 +1,50 @@
>>>> +/* Copyright 2013-2016 IBM Corp
>>>> + *
>>>> + * Licensed under the Apache License, Version 2.0 (the "License");
>>>> + * you may not use this file except in compliance with the License.
>>>> + * You may obtain a copy of the License at
>>>> + *
>>>> + * http://www.apache.org/licenses/LICENSE-2.0
>>>> + *
>>>> + * Unless required by applicable law or agreed to in writing, software
>>>> + * distributed under the License is distributed on an "AS IS" BASIS,
>>>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
>>>> + * implied.
>>>> + * See the License for the specific language governing permissions and
>>>> + * limitations under the License.
>>>> + */
>>>> +
>>>> +#include <string.h>
>>>> +#include <io.h>
>>>> +
>>>> +const void* memcpy_ci(const void *destpp, const void *srcpp, size_t len)
>
> I really don't like this name. If I encoutered a function called
> memcpy_ci() I would assume it performs the copy with CI loads *and*
> stores. This function copies from CI memory to cachable memory so it
> should be called memcpy_from_ci() or similar. This might seem a little
> nitpicky, but mixing cached and CI accesses to the same memory can
> cause a checkstop (even in real mode) so it's important to be clear
> about how memory is being accessed.
>
Related changes applied for V2:
- File name renamed to memcpy_from_ci.c
- Function renamed to memcpy_from_ci()
>>>> +{
>>>> + size_t i;
>>>> + const size_t block = sizeof(uint32_t);
>>>> +
>>>> + unsigned long int destp = (long int) destpp;
>>>> + unsigned long int srcp = (long int) srcpp;
>>>> +
>>>> + /* copy just a few bytes to make dst aligned */
>>>> + i = len % block;
>>>> + while (i > 0)
>>>> + {
>>>> + *((uint8_t*) destp) = in_8((uint8_t*)srcp);
>>>> + srcp += 1;
>>>> + destp += 1;
>>>> + i--;
>>>> + }
>>>> +
>>>> + len -= len % block;
>>>> +
>>>> + /* copy the aligned bytes */
>>>> + while(len > 0)
>>>> + {
>
> Nitpick: In skiboot we use the Kernel's brace style. Functions should
> have their brace on a new line, but control flow statements should
> have the brace on the same line.
>
Braces fixed for V2.
>>>> + *((uint32_t*) destp) = in_be32((uint32_t*)srcp);
>>>> + srcp += block;
>>>> + destp += block;
>>>> + len -= block;
>>>> + }
>>>> +
>>>> + return destpp;
>>>> +}
>>>
>>> I think it would be good to add a unit test for this function.
>>>
>>
>> Makes sense. I will figure out how to add a unit test in skiboot for
>> this function.
>>
>>
>>> It also (currently) assumes that src is a multiple of uint32_t, although
>>> nothing checks that and so would silently fail if called for, say 33 bytes.
>>>
>>
>> Not sure if I understand your suggestion here.
>>
>> This function doesn't check if srcpp and dstpp are multiple of uint32_t,
>> but it assumes that the processor is able to read a char and a uint32_t
>> from src. Thus, the first loop copies at most len % sizeof(uint32_t) to
>> optimize the number of reads
>
> The length checking is fine, the bigger issue here is that the first
> loop will leave srcp unaligned. If we wanted to copy 7 bytes starting
> from 0x0 the first loop would move srcp to 0x3 before attempting to
> finish the copy with a single 32 bit load. On P8 unaligned CI loads
> will cause an alignment interrupt so the assumption that the processor
> can read a uint32_t from srcp is faulty. If you want to keep this
> optimisation the first loop needs to align srcp so that (srcp % block)
> == 0. The bulk of the copy can then be done with with 32 (or 64bit)
> loads and finished with a byte-by-byte copy. You could move the byte
> copies into a helper function to eliminate all the int <-> pointer
> casting too.
>
Is this better? I am not sure if I understand what you mean about the
helper function.
void* memcpy_from_ci(void *destpp, const void *srcpp, size_t len)
{
const size_t block = sizeof(uint32_t);
unsigned long int destp = (long int) destpp;
unsigned long int srcp = (long int) srcpp;
/* Copy as many blocks as possible if srcp is block aligned */
if ((srcp % block) == 0) {
while ((len - block) > -1) {
*((uint32_t*) destp) =
in_be32((uint32_t*)srcp);
srcp += block;
destp += block;
len -= block;
}
}
/*
* Byte-by-byte copy if srcp is not block aligned or len
* is/becomes less than one block
*/
while (len > 0) {
*((uint8_t*) destp) = in_8((uint8_t*)srcp);
srcp += 1;
destp += 1;
len--;
}
return destpp;
}
>>
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
>
>
More information about the Skiboot
mailing list