[PATCH] x86: OLPC: speed up device tree creation during boot
Andres Salomon
dilinger at queued.net
Thu Oct 28 04:50:52 EST 2010
On Wed, 27 Oct 2010 11:39:24 +0100
Grant Likely <grant.likely at secretlab.ca> wrote:
> On Fri, Oct 22, 2010 at 05:22:47PM -0700, Andres Salomon wrote:
> >
> > Calling alloc_bootmem() for tiny chunks of memory over and over is
> > really slow; on an XO-1, it caused the time between when the kernel
> > started booting and when the display came alive (post-lxfb probe)
> > to increase to 44s. This patch optimizes the prom_early_alloc
> > function by calling alloc_bootmem for 4k-sized blocks of memory,
> > and handing out chunks of that to callers. With this hack, the
> > time between kernel load and display initialization decreased to
> > 23s. If there's a better way to do this early in the boot process,
> > please let me know.
> >
> > (Note: increasing the chunk size to 16k didn't noticeably affect
> > boot time, and wasted 9k.)
> >
> > Signed-off-by: Andres Salomon <dilinger at queued.net>
> > ---
> > arch/x86/kernel/olpc_dt.c | 27 +++++++++++++++++++++++----
> > 1 files changed, 23 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/olpc_dt.c b/arch/x86/kernel/olpc_dt.c
> > index f660a11..44dd2ae 100644
> > --- a/arch/x86/kernel/olpc_dt.c
> > +++ b/arch/x86/kernel/olpc_dt.c
> > @@ -123,16 +123,35 @@ static int __init olpc_dt_pkg2path(phandle node, char *buf,
> > }
> >
> > static unsigned int prom_early_allocated __initdata;
> > +#define DT_CHUNK_SIZE (1<<12)
>
> PAGE_SIZE perhaps?
>
I'd rather not imply that it's anything but completely arbitrary.
> >
> > void * __init prom_early_alloc(unsigned long size)
> > {
> > + static u8 *mem = NULL;
> > + static size_t free_mem = 0;
> > void *res;
> >
> > - res = alloc_bootmem(size);
> > - if (res)
> > - memset(res, 0, size);
> > + if (free_mem >= size) {
> > + /* allocate from the local cache */
> > + free_mem -= size;
> > + res = mem;
> > + mem += size;
> > + return res;
> > + }
> >
> > - prom_early_allocated += size;
> > + /*
> > + * To minimize the number of allocations, grab 4k of memory (that's
> > + * an arbitrary choice that matches PAGE_SIZE on the platforms we care
> > + * about, and minimizes wasted bootmem) and hand off chunks of it to
> > + * callers.
> > + */
> > + res = alloc_bootmem(DT_CHUNK_SIZE);
> > + if (res) {
> > + prom_early_allocated += DT_CHUNK_SIZE;
> > + memset(res, 0, DT_CHUNK_SIZE);
> > + free_mem = DT_CHUNK_SIZE - size;
> > + mem = res + size;
> > + }
>
> These two hunks should be flipped around so that only one chunk does
> the allocation from the pool. Like so:
>
> /*
> * To minimize the number of allocations, grab 4k of memory (that's
> * an arbitrary choice that matches PAGE_SIZE on the platforms we care
> * about, and minimizes wasted bootmem) and hand off chunks of it to
> * callers.
> */
> if (free_mem < size) {
> free_mem = max(DT_CHUNK_SIZE, size);
> mem = alloc_bootmem(free_mem);
> if (!mem) {
> free_mem = 0;
> return NULL;
> }
> memset(mem, 0, free_mem);
> prom_early_allocated += free_mem;
> }
>
> res = mem;
> free_mem -= size;
> mem += size;
> return res;
>
> g.
Makes sense, thanks.
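
For anyone who wants to poke at the single-path version outside the kernel,
here is a rough user-space sketch of the same idea. It is not the patch
itself: backing_alloc() is a hypothetical stand-in for alloc_bootmem() built
on calloc(), and early_alloc(), CHUNK_SIZE, total_allocated and
early_alloc_demo() are names made up for this illustration. One small
difference from the snippet above: on allocation failure the sketch leaves
the old pool untouched rather than zeroing free_mem. Like the patch, it makes
no attempt to align the pointers it hands out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary chunk size, mirroring DT_CHUNK_SIZE in the patch. */
#define CHUNK_SIZE ((size_t)1 << 12)

/* Plays the role of prom_early_allocated: bytes taken from the backing store. */
static size_t total_allocated;

/* Hypothetical stand-in for alloc_bootmem(): zeroed memory, or NULL. */
static void *backing_alloc(size_t size)
{
    return calloc(1, size);
}

/* Hand out zeroed memory from a local pool, refilling only when needed. */
void *early_alloc(size_t size)
{
    static uint8_t *mem;      /* next free byte in the current chunk */
    static size_t free_mem;   /* bytes left in the current chunk */
    void *res;

    if (free_mem < size) {
        /* Refill; whatever is left of the old chunk is abandoned. */
        size_t want = size > CHUNK_SIZE ? size : CHUNK_SIZE;
        uint8_t *chunk = backing_alloc(want);

        if (!chunk)
            return NULL;      /* old pool stays usable for smaller requests */
        mem = chunk;
        free_mem = want;
        total_allocated += want;
    }

    /* The only place that carves memory out of the pool. */
    res = mem;
    free_mem -= size;
    mem += size;
    return res;
}

/* Self-check; assumes it is the first caller of early_alloc(). */
int early_alloc_demo(void)
{
    uint8_t *a = early_alloc(16);
    uint8_t *b = early_alloc(32);

    assert(a && b);
    assert(b == a + 16);                    /* carved from the same chunk */
    assert(total_allocated == CHUNK_SIZE);  /* only one backing allocation */
    return 0;
}
```

Requests larger than CHUNK_SIZE get a chunk of exactly their own size, which
is what the max() in the suggested code is for; without it the
"DT_CHUNK_SIZE - size" arithmetic in my original patch would underflow.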