NUMA memory block size

Mon Apr 5 13:19:41 EST 2004

On Sat, 2004-04-03 at 09:48, Olof Johansson wrote:
> On Sat, 3 Apr 2004, Dave Hansen wrote:
> > On Fri, 2004-04-02 at 22:50, Olof Johansson wrote:
> >
> > How about using the alloc_bootmem() functions instead of the LMB ones?
> > They're a bit more standard, and everyone else will realize what you're
> > doing.  That isn't too early, is it?
>
> I think it is. The array is built first thing in do_init_bootmem, so I
> wouldn't expect bootmem stuff to be available yet.

Actually, I think that the ppc64 do_init_bootmem() overuses the LMB
allocator a bit.  Look at the regular do_init_bootmem() case in
arch/ppc64/mm/init.c.  It LMB-reserves the bootmem bitmap.

But, look at the way i386 setup_memory() does it.  It calls
reserve_bootmem() *on* the bootmem bitmap itself just after bootmem has
been set up.  It's a bit counter-intuitive, but it obviously works.
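
To make that ordering concrete, here is a minimal user-space sketch of a
bitmap allocator that reserves the pages holding its own bitmap.  It is a
toy model, not kernel code: every toy_* name, the page size, and the
"physical address" passed in are assumptions for illustration, and unlike
the real bootmem allocator it starts with every page free.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Toy stand-in for a bootmem allocator: one bit per page, 1 = reserved. */
struct toy_bootmem {
    unsigned char *bitmap;    /* in a kernel this lives in managed memory */
    unsigned long  bitmap_pa; /* pretend physical address of the bitmap */
    unsigned long  npages;
};

static unsigned long toy_bitmap_bytes(unsigned long npages)
{
    return (npages + 7) / 8;
}

/* Mark [pa, pa + size) reserved, rounding outward to page boundaries. */
static void toy_reserve(struct toy_bootmem *bm, unsigned long pa,
                        unsigned long size)
{
    unsigned long start = pa >> TOY_PAGE_SHIFT;
    unsigned long end = (pa + size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;

    for (unsigned long pfn = start; pfn < end && pfn < bm->npages; pfn++)
        bm->bitmap[pfn / 8] |= (unsigned char)(1u << (pfn % 8));
}

static int toy_is_reserved(const struct toy_bootmem *bm, unsigned long pfn)
{
    assert(pfn < bm->npages);
    return (bm->bitmap[pfn / 8] >> (pfn % 8)) & 1;
}

/*
 * Set up the allocator first (all pages free in this toy), then ask it to
 * reserve the range that holds its own bitmap.
 */
static int toy_init(struct toy_bootmem *bm, unsigned long npages,
                    unsigned long bitmap_pa)
{
    bm->bitmap = calloc(toy_bitmap_bytes(npages), 1);
    if (!bm->bitmap)
        return -1;
    bm->bitmap_pa = bitmap_pa;
    bm->npages = npages;
    toy_reserve(bm, bitmap_pa, toy_bitmap_bytes(npages));
    return 0;
}

/* First-fit single-page allocation; returns the pfn, or -1 if none is free. */
static long toy_alloc_page(struct toy_bootmem *bm)
{
    for (unsigned long pfn = 0; pfn < bm->npages; pfn++) {
        if (!toy_is_reserved(bm, pfn)) {
            toy_reserve(bm, pfn << TOY_PAGE_SHIFT, TOY_PAGE_SIZE);
            return (long)pfn;
        }
    }
    return -1;
}
```

With 64 pages and the bitmap placed at page 2, later allocations skip
page 2 without any help from a second, earlier allocator.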

Now, the two approaches become equivalent as soon as the loop that calls
reserve_bootmem() on all of the LMB-reserved areas has run, but I think I
like the i386 method better.

Are there any {pa,page,pfn}_to_nid() operations that occur before the
ppc64 do_init_bootmem()?  I'm pretty sure there can't be any page_*
operations at that point, at least, because there's no mem_map yet.

If we can delay setting up numa_memory_lookup_table[] until after the
bootmem init is done, then it could be bootmem-alloc'd, and sized to fit
the amount of memory actually present.
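
As a rough user-space sketch of what "sized for the actual memory" could
mean: a lookup table with one entry per fixed-size block, allocated at
run time from the top of memory instead of being a fixed-size array.  The
toy_* names and the 16MB block granularity are assumptions for
illustration, not the ppc64 definitions, and malloc() stands in for a
bootmem allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed granularity: one table entry per 16MB block (illustrative). */
#define TOY_INCREMENT_SHIFT 24

struct toy_nid_table {
    signed char  *nid;      /* node id per memory block, -1 = unknown */
    unsigned long entries;
};

/* Number of entries needed to cover [0, top_of_mem). */
static unsigned long toy_table_entries(unsigned long long top_of_mem)
{
    unsigned long long inc = 1ULL << TOY_INCREMENT_SHIFT;

    return (unsigned long)((top_of_mem + inc - 1) >> TOY_INCREMENT_SHIFT);
}

/* Allocate just enough entries for the memory that is present. */
static int toy_table_init(struct toy_nid_table *t,
                          unsigned long long top_of_mem)
{
    t->entries = toy_table_entries(top_of_mem);
    t->nid = malloc(t->entries ? t->entries : 1);
    if (!t->nid)
        return -1;
    memset(t->nid, -1, t->entries);
    return 0;
}

/* Record that [start, start + size) belongs to node nid. */
static void toy_table_set(struct toy_nid_table *t, unsigned long long start,
                          unsigned long long size, int nid)
{
    unsigned long first = (unsigned long)(start >> TOY_INCREMENT_SHIFT);
    unsigned long last = toy_table_entries(start + size);

    for (unsigned long i = first; i < last && i < t->entries; i++)
        t->nid[i] = (signed char)nid;
}

/* Physical address to node id: a shift and an array index. */
static int toy_pa_to_nid(const struct toy_nid_table *t, unsigned long long pa)
{
    unsigned long idx = (unsigned long)(pa >> TOY_INCREMENT_SHIFT);

    assert(idx < t->entries);
    return t->nid[idx];
}
```

For 1GB of memory this needs 64 one-byte entries, whatever the
architectural maximum would otherwise force a static array to cover.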

That reminds me.  Are the htab sizes user-adjustable at all, or are they
completely fixed in size by what the hardware wants?  It would be really
nice to be able to disable relocation as early as possible and avoid any
of the klimit or lmb-style memory allocations and leave them all up to
the arch-independent bootmem allocator.  That's effectively what x86
does by having a few pagetable structures actually defined in the kernel
image.

-- Dave

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/