[RFC] implicit hugetlb pages (hugetlb_implicit)
Adam Litke
agl at us.ibm.com
Tue Jan 13 10:38:00 EST 2004
Thank you for your comments and suggestions. They are proving very
helpful as I work to clean this up.
On Sun, 2004-01-11 at 20:19, David Gibson wrote:
> On Fri, Jan 09, 2004 at 01:27:20PM -0800, Adam Litke wrote:
> >
> > hugetlb_implicit (2.6.0):
> > This patch includes the anonymous mmap work from Dave Gibson
> > (right?)
>
> I'm not sure what you're referring to here. My patches for lbss
> support also include support for copy-on-write of hugepages and
> various other changes which can make them act kind of like anonymous
> pages.
>
> But I don't see much in this patch that looks familiar.
Hmm. Could the original author of hugetlb for anonymous mmap claim
credit for the initial code?
> > + /* Do we have enough free huge pages? */
> > + if (!is_hugepage_mem_enough(len))
> > + return 0;
>
> Is this test safe/necessary? i.e. a) is there any potential race
> which could cause the mmap() to fail because it's short of memory
> despite suceeding the test here and b) can't we just let the mmap fail
> and fall back then rather than checking beforehand?
You're right. Now that safe fallback is working, we might as well defer
this test to get_unmapped area.
>
> Do we need/want any consideration of the given "hint" address here?
I am trying to do what the kernel does for normal mmaps here. If
someone hints at an address, they hopefully have a good reason for it.
I wouldn't want to override it just so I can do implicit hugetlb. Most
applications pass NULL for the hint right?
> > + /* Explicit requests for huge pages are allowed to return errors */
> > + if (*flags & MAP_HUGETLB) {
> > + if (pre_error)
> > + return pre_error;
> > + return hugetlb_get_unmapped_area(NULL, addr, len, pgoff, *flags);
> > + }
> > +
> > + /*
> > + * When implicit request fails, return 0 so we can
> > + * retry later with regular pages.
> > + */
> > + if (mmap_hugetlb_implicit(len)) {
> > + if (pre_error)
> > + goto out;
> > + addr = hugetlb_get_unmapped_area(NULL, addr, len, pgoff, *flags);
> > + if (IS_ERR((void *)addr))
> > + goto out;
> > + else {
> > + *flags |= MAP_HUGETLB;
> > + return addr;
> > + }
> > + }
> > +
> > +out:
> > + *flags &= ~MAP_HUGETLB;
> > + return 0;
> > +}
>
> This does assume that 0 is never a valid address returned for a
> hugepage range. That's true now, but it makes be slightly
> uncomfortable, since there's no inherent reason we couldn't make
> segment zero a hugepage segment.
You definately found an ugly part of the patch. Cleanup in progress.
> > +#ifdef CONFIG_HUGETLBFS
> > +int shm_with_hugepages(int shmflag, size_t size)
> > +{
> > + /* flag specified explicitly */
> > + if (shmflag & SHM_HUGETLB)
> > + return 1;
> > + /* Are we disabled? */
> > + if (!shm_use_hugepages)
> > + return 0;
> > + /* Must be HPAGE aligned */
> > + if (size & ~HPAGE_MASK)
> > + return 0;
> > + /* Are we under the max per file? */
> > + if ((size >> HPAGE_SHIFT) > shm_hugepages_per_file)
> > + return 0;
>
> I don't really understand this per-file restriction. More comments
> below.
Since hugetlb pages are a relatively scarce resource, this is a
rudimentary method to ensure that one application doesn't allocate more
than its fair share of hugetlb memory.
> > + /* Do we have enough free huge pages? */
> > + if (!is_hugepage_mem_enough(size))
> > + return 0;
>
> Same concerns with this test as in the mmap case.
Your right. This is racey. I haven't given the shared mem part of the
patch nearly as much attention as the mmap part. I am going to leave
this partially broken until I clean up the fallback code for mmaps so I
can put that here as well.
> > @@ -501,8 +505,17 @@ unsigned long do_mmap_pgoff(struct file
> >
> > /* Obtain the address to map to. we verify (or select) it and ensure
> > * that it represents a valid section of the address space.
> > + * VM_HUGETLB will never appear in vm_flags when CONFIG_HUGETLB is
> > + * unset.
> > */
> > - addr = get_unmapped_area(file, addr, len, pgoff, flags);
> > +#ifdef CONFIG_HUGETLBFS
> > + addr = try_hugetlb_get_unmapped_area(NULL, addr, len, pgoff, &flags);
> > + if (IS_ERR((void *)addr))
> > + return addr;
>
> This doesn't look right - we don't fall back if try_hugetlb...()
> fails. But it can fail if we don't have the right permissions, for
> one thing in which case we certainly do want to fall back.
I admit this is messy and I am working on cleaning it up.
--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/
More information about the Linuxppc64-dev
mailing list