[Cbe-oss-dev] Memalign and free doesn't work

Vikberg, Gunnar vikbergg at msoe.edu
Mon Oct 12 11:06:10 EST 2009


Phil,

If you're trying to program the Cell's SPEs, I would suggest using the functions malloc_align() and free_align().  You'll have to include the headers <spu_intrinsics.h> and <libmisc.h> for them to work.  Take a look at the description of malloc_align() below, which I copied directly from the comment block in the malloc_align.h file.

/* Function
 *
 *      void * malloc_align(size_t size, unsigned int log2_align)
 *
 * Description
 *      The malloc_align routine allocates a memory buffer of <size>
 *      bytes aligned to the power of 2 alignment specified by <log2_align>.
 *      For example, malloc_align(4096, 7) will allocate a memory heap 
 *      buffer of 4096 bytes aligned on a 128 byte boundary.
 *
 *      The aligned malloc routine allocates an enlarged buffer
 *      from the standard memory heap. Space for the real allocated memory
 *      pointer is reserved on the front of the memory buffer.
 *
 *            ----------------- <--- start of allocated memory
 *           |    pad 0 to     |
 *           |(1<<log2_align)-1|
 *           |     bytes       |
 *           |-----------------|
 *           | allocation size |
 *           |-----------------|
 *           | real buffer ptr |
 *           |-----------------|<---- returned aligned memory pointer
 *           |                 |
 *           |    requested    |
 *           |     memory      |
 *           |     buffer      |
 *           |      size       |
 *           |      bytes      |
 *           |_________________|
 *
 *      Memory allocated by this routine must be freed using the free_align
 *      routine.
 *
 *      The size of the allocation is saved for special cases where the 
 *      data must be moved during a realloc_align.
 */

Also, please note that this function does not take the alignment in bytes as its argument.  Instead, it takes the base-2 logarithm of the alignment, that is, the exponent n for which the alignment is 2^n bytes.  So, for an alignment of 16 bytes you would pass 4, since 2^4 is 16.  For 128 bytes you would pass 7, since 2^7 is 128, and so on.
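
If you would rather think in bytes, the conversion is easy to wrap up.  The helper below is only a sketch of mine; the name log2_of_alignment() is hypothetical and is not part of libmisc.  It returns the exponent to pass to malloc_align(), or -1 if the alignment is not a power of two.

```c
#include <stddef.h>

/* Hypothetical helper (not part of libmisc): convert an alignment given
 * in bytes into the log2 value that malloc_align() expects.
 * Returns -1 if align is zero or not a power of two. */
int log2_of_alignment(size_t align)
{
    int log2_align = 0;

    /* A power of two has exactly one bit set. */
    if (align == 0 || (align & (align - 1)) != 0)
        return -1;

    /* Count how many times we can halve the value before reaching 1. */
    while (align > 1) {
        align >>= 1;
        log2_align++;
    }
    return log2_align;
}
```

With this, log2_of_alignment(16) gives 4 and log2_of_alignment(128) gives 7, matching the examples above.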

In addition, as the diagram above shows, the function allocates somewhat more memory than requested.  The extra space holds the alignment padding plus the bookkeeping stored in front of the buffer: the real buffer pointer and the allocation size, the latter kept mainly in case realloc_align() is called.  Either way, don't be surprised if a little more memory is allocated than you asked for.  This also answers your original question about memalign allocating 68848 bytes.
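
To make the layout in the diagram concrete, here is a portable sketch of how an allocator of this kind could be written on top of plain malloc().  This is not the libmisc source; the names sketch_header, sketch_malloc_align() and sketch_free_align() are my own, and the real routine may differ in its details.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping stored directly in front of the aligned
 * pointer, in the same order as the diagram: size, then real pointer. */
struct sketch_header {
    size_t size;      /* requested size, useful for a later realloc */
    void  *real_ptr;  /* pointer actually returned by malloc()      */
};

void *sketch_malloc_align(size_t size, unsigned int log2_align)
{
    size_t align = (size_t)1 << log2_align;
    /* Worst case: align-1 bytes of padding plus the header. */
    size_t total = size + (align - 1) + sizeof(struct sketch_header);
    struct sketch_header hdr;
    unsigned char *raw = malloc(total);
    uintptr_t aligned;

    if (raw == NULL)
        return NULL;

    /* Skip past the header, then round up to the next aligned address. */
    aligned = ((uintptr_t)raw + sizeof hdr + (align - 1))
              & ~(uintptr_t)(align - 1);

    /* Store the header just below the aligned pointer.  memcpy is used
     * because that address may not be suitably aligned for the struct. */
    hdr.size = size;
    hdr.real_ptr = raw;
    memcpy((unsigned char *)aligned - sizeof hdr, &hdr, sizeof hdr);
    return (void *)aligned;
}

void sketch_free_align(void *ptr)
{
    struct sketch_header hdr;

    if (ptr == NULL)
        return;

    /* Recover the real malloc() pointer from the header and free it. */
    memcpy(&hdr, (unsigned char *)ptr - sizeof hdr, sizeof hdr);
    free(hdr.real_ptr);
}
```

The point of the sketch is only that the pointer handed back to you is not the one malloc() returned, which is why a buffer from an aligned allocator like this has to go back through its matching free routine rather than plain free().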

My source code follows below.  I have included the Makefile used.  Copy the text labelled Makefile to a file called, well... you guessed it, "Makefile".  Also, create a C file called "spu_mem.c" in the same directory, and copy my source code into it.  Now, just run the command "make <ENTER>" from the command line and it should build the program "spu_mem" that you can run directly.

code: Makefile
============================================
PROGRAM_spu := spu_mem
IMPORTS = -lmisc
include /opt/cell/sdk/buildutils/make.footer
============================================

code: spu_mem.c
============================================
#include <stdio.h>
#include <spu_intrinsics.h>
#include <libmisc.h>

// Register 1 in SPU provides difference between heap and stack pointers
// Heap pointer not provided in SPU, but stack pointer is in first element
//      of Register 1, and difference between heap and stack is in second
//      element of Register 1.
// Set reg_1 equal to Register 1.
register volatile vector unsigned int reg_1 asm("1");

int main(unsigned long long spe_id, unsigned long long
        program_data_ea, unsigned long long env)
{
        int before_alloc, after_alloc;
        void *triangles_local_mem;

        // Print differences in stack and heap BEFORE memory allocation.
        // First, extract second element from Register 1, then print it.
        before_alloc = spu_extract(reg_1, 1);
        printf("Before malloc_align allocation: stack - heap = %#x\n",
                before_alloc);

        // Allocate 64 KB (65536 bytes), 16-byte aligned, using malloc_align.
        triangles_local_mem = (void *)malloc_align(65536, 4);

        // Print differences in stack and heap AFTER memory allocation.
        // First, extract second element from Register 1, then print it.
        after_alloc = spu_extract(reg_1, 1);
        printf("After malloc_align allocation: stack - heap = %#x\n",
                after_alloc);

        // Show the difference in space available in memory.
        printf("Space allocated by malloc_align: %d bytes\n",
                before_alloc - after_alloc);

        // Free allocated memory in SPU using free_align.
        free_align(triangles_local_mem);
        return 0;
}
============================================

The full sequence of commands to compile and run the program follows, along with the output from the spulet above:

[user at ps3 spu]# ls
Makefile  spu_mem.c

[user at ps3 spu]# make
/usr/bin/ccache /usr/bin/spu-gcc        -W -Wall -Winline  -I.  -I /opt/cell/sdk/usr/spu/include  -O3 -c spu_mem.c
spu_mem.c: In function 'main':
spu_mem.c:12: warning: unused parameter 'spe_id'
spu_mem.c:13: warning: unused parameter 'program_data_ea'
spu_mem.c:13: warning: unused parameter 'env'
/usr/bin/ccache /usr/bin/spu-gcc -o spu_mem  spu_mem.o      -L/opt/cell/sdk/usr/spu/lib -Wl,-N   -lmisc
/usr/bin/ppu-embedspu -m32 spu_mem spu_mem spu_mem-embed.o
/usr/bin/ppu-ar -qcs spu_mem.a spu_mem-embed.o

[user at ps3 spu]# ./spu_mem 
Before malloc_align allocation: stack - heap = 0x3e5d0
After malloc_align allocation: stack - heap = 0x2e3b0
Space allocated by malloc_align: 66080 bytes

As you can see above, the heap grew by 66080 bytes for the malloc_align() call rather than the 65536 bytes requested, which is 544 bytes more.  The difference is due to the extra information it also stores in the process, as discussed previously.
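
If you want to check the arithmetic yourself, the small sketch below redoes it from the two hex values printed by the program.  The function names heap_growth() and allocation_overhead() are hypothetical, used only for this illustration.

```c
/* Hypothetical helpers for checking the numbers printed above.
 * "before" and "after" are the (stack - heap) values read from
 * Register 1 before and after the allocation. */
long heap_growth(long before, long after)
{
    /* The gap between stack and heap shrinks as the heap grows. */
    return before - after;
}

long allocation_overhead(long before, long after, long requested)
{
    /* Whatever the heap grew by beyond the requested size is overhead. */
    return heap_growth(before, after) - requested;
}
```

Plugging in the output above, heap_growth(0x3e5d0, 0x2e3b0) is 66080 and allocation_overhead(0x3e5d0, 0x2e3b0, 65536) is 544.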

A great resource in case you're learning to program for the Cell B.E. is Matthew Scarpino's book "Programming the Cell Processor: For Games, Graphics, and Computation."

I hope this helps,
Gunnar

-----Original Message-----
From: cbe-oss-dev-bounces+vikbergg=msoe.edu at lists.ozlabs.org on behalf of Phil Pratt-Szeliga
Sent: Sun 10/11/09 9:18
To: W
Cc: cbe-oss-dev at lists.ozlabs.org
Subject: Re: [Cbe-oss-dev] Memalign and free doesn't work
 
Well this is causing me to run out of memory.  Isn't this wrong in
standard unix then too?

Phil

On Sun, Oct 11, 2009 at 2:36 AM, W <w.l.fischer at googlemail.com> wrote:
> Hi Phillip,
>
> nearly the same on standard linux:
>
> $ ./a.out
> MemUsage: 0
> MemUsage: 135168
> $ uname -a
> Linux z4 2.6.26-1-686 #1 SMP Fri Mar 13 18:08:45 UTC 2009 i686 GNU/Linux
>
> Willem
>
> On Sat, Oct 10, 2009 at 3:13 PM, Philip Pratt-Szeliga
> <phil.pratt.szeliga at gmail.com> wrote:
>> Hello,
>>
>> I am noticing that memalign and free doesn't work on the ps3.
>>
>> code:
>> =======================
>> #include <malloc.h>
>>
>> static void printMemUsage()
>> {
>>  struct mallinfo mi;
>>  mi = mallinfo();
>>  printf("MemUsage: %d\n", mi.arena);
>> }
>>
>> int main(unsigned long long spe_id, unsigned long long
>> program_data_ea, unsigned long long env)
>> {
>>  printMemUsage();
>>  void * triangles_local_mem = memalign(16, 65535);
>>  free(triangles_local_mem);
>>  printMemUsage();
>>  return 0;
>> }
>> =======================
>>
>> compile:
>> spu-gcc cell_function.c
>>
>> results:
>> #./a.out
>> MemUsage: 0
>> MemUsage: 68848
>> #
>>
>> Am I doing something wrong?
>>
>> -Phil
>> _______________________________________________
>> cbe-oss-dev mailing list
>> cbe-oss-dev at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/cbe-oss-dev
>>
>


