[RFC] implicit hugetlb pages (hugetlb_implicit)

Adam Litke agl at us.ibm.com
Tue Jan 13 10:38:00 EST 2004


Thank you for your comments and suggestions.  They are proving very
helpful as I work to clean this up.

On Sun, 2004-01-11 at 20:19, David Gibson wrote:
> On Fri, Jan 09, 2004 at 01:27:20PM -0800, Adam Litke wrote:
> >
> > hugetlb_implicit (2.6.0):
> >    This patch includes the anonymous mmap work from Dave Gibson
> > (right?)
>
> I'm not sure what you're referring to here.  My patches for lbss
> support also include support for copy-on-write of hugepages and
> various other changes which can make them act kind of like anonymous
> pages.
>
> But I don't see much in this patch that looks familiar.

Hmm.  Could the original author of hugetlb for anonymous mmap claim
credit for the initial code?

> > +	/* Do we have enough free huge pages? */
> > +	if (!is_hugepage_mem_enough(len))
> > +		return 0;
>
> Is this test safe/necessary?  i.e. a) is there any potential race
> which could cause the mmap() to fail because it's short of memory
> despite suceeding the test here and b) can't we just let the mmap fail
> and fall back then rather than checking beforehand?

You're right.  Now that safe fallback is working, we might as well defer
this test to get_unmapped area.

>
> Do we need/want any consideration of the given "hint" address here?

I am trying to do what the kernel does for normal mmaps here.  If
someone hints at an address, they hopefully have a good reason for it.
I wouldn't want to override it just so I can do implicit hugetlb.  Most
applications pass NULL for the hint right?

> > +	/* Explicit requests for huge pages are allowed to return errors */
> > +	if (*flags & MAP_HUGETLB) {
> > +		if (pre_error)
> > +			return pre_error;
> > +		return hugetlb_get_unmapped_area(NULL, addr, len, pgoff, *flags);
> > +	}
> > +
> > +	/*
> > +	 * When implicit request fails, return 0 so we can
> > +	 * retry later with regular pages.
> > +	 */
> > +	if (mmap_hugetlb_implicit(len)) {
> > +		if (pre_error)
> > +			goto out;
> > +		addr = hugetlb_get_unmapped_area(NULL, addr, len, pgoff, *flags);
> > +		if (IS_ERR((void *)addr))
> > +			goto out;
> > +		else {
> > +			*flags |= MAP_HUGETLB;
> > +			return addr;
> > +		}
> > +	}
> > +
> > +out:
> > +	*flags &= ~MAP_HUGETLB;
> > +	return 0;
> > +}
>
> This does assume that 0 is never a valid address returned for a
> hugepage range.  That's true now, but it makes be slightly
> uncomfortable, since there's no inherent reason we couldn't make
> segment zero a hugepage segment.

You definately found an ugly part of the patch.  Cleanup in progress.

> > +#ifdef CONFIG_HUGETLBFS
> > +int shm_with_hugepages(int shmflag, size_t size)
> > +{
> > +	/* flag specified explicitly */
> > +	if (shmflag & SHM_HUGETLB)
> > +		return 1;
> > +	/* Are we disabled? */
> > +	if (!shm_use_hugepages)
> > +		return 0;
> > +	/* Must be HPAGE aligned */
> > +	if (size & ~HPAGE_MASK)
> > +		return 0;
> > +	/* Are we under the max per file? */
> > +	if ((size >> HPAGE_SHIFT) > shm_hugepages_per_file)
> > +		return 0;
>
> I don't really understand this per-file restriction.  More comments
> below.

Since hugetlb pages are a relatively scarce resource, this is a
rudimentary method to ensure that one application doesn't allocate more
than its fair share of hugetlb memory.

> > +	/* Do we have enough free huge pages? */
> > +	if (!is_hugepage_mem_enough(size))
> > +		return 0;
>
> Same concerns with this test as in the mmap case.

Your right.  This is racey.  I haven't given the shared mem part of the
patch nearly as much attention as the mmap part.  I am going to leave
this partially broken until I clean up the fallback code for mmaps so I
can put that here as well.

> > @@ -501,8 +505,17 @@ unsigned long do_mmap_pgoff(struct file
> >
> >  	/* Obtain the address to map to. we verify (or select) it and ensure
> >  	 * that it represents a valid section of the address space.
> > +	 * VM_HUGETLB will never appear in vm_flags when CONFIG_HUGETLB is
> > +	 * unset.
> >  	 */
> > -	addr = get_unmapped_area(file, addr, len, pgoff, flags);
> > +#ifdef CONFIG_HUGETLBFS
> > +	addr = try_hugetlb_get_unmapped_area(NULL, addr, len, pgoff, &flags);
> > +	if (IS_ERR((void *)addr))
> > +		return addr;
>
> This doesn't look right - we don't fall back if try_hugetlb...()
> fails.  But it can fail if we don't have the right permissions, for
> one thing in which case we certainly do want to fall back.

I admit this is messy and I am working on cleaning it up.

--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/





More information about the Linuxppc64-dev mailing list