[PATCH] Fix do_mlock so page alignment is to hugepage boundaries when needed

Eric Paris eparis at redhat.com
Thu May 18 05:59:24 EST 2006


On Wed, 2006-05-17 at 18:42 +0100, Hugh Dickins wrote:
> On Wed, 17 May 2006, Eric Paris wrote:
> > sys_m{,un}lock and do_mlock all align memory references and the length
> > of the mlock given by userspace to page boundaries.  If the page being
> > mlocked is a hugepage instead of a normal page the start and finish of
> > the mlock will still only be aligned to normal page boundaries.
> > Ultimately upon the process exiting we will eventually call unmap_vmas
> > which will call unmap_hugepage_range for all of the ranges.
> > unmap_hugepage_range checks to make sure the beginning and the end of
> > the range are actually hugepage aligned and if not will BUG().  Since we
> > only aligned to a normal page boundary the end of the first range and
> > the beginning of the second will likely (unless userspace passed in
> > values already hugepage aligned) not be hugepage aligned and thus we
> > bomb.
> 
> When did you test this?  It should have been fixed in 2.6.11 onwards
> by split_vma()'s simple:
> 
> 	if (is_vm_hugetlb_page(vma) && (addr & ~HPAGE_MASK))
> 		return -EINVAL;
> 
> Hugh

You're right.  My initial BUG() problem was on a pre-2.6.11 kernel.  I
wrote this patch and tested that it allows the test program I described
to run successfully on 2.6.16.  Admittedly this is the first time I have
tested on unpatched 2.6.16 and, as you said, the mlock fails with
"Invalid argument" (EINVAL) instead of hitting BUG().  So the panic is
gone from 2.6.11 onwards, but the problem remains: sys_mlock and
do_mlock still round incorrectly for hugepages.
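
To make the mismatch concrete, here is a small userspace sketch (not
kernel code and not part of the patch).  The EX_* names and the sizes
(4 KiB pages, 16 MiB hugepages) are assumptions for illustration only.
It rounds a range the way sys_mlock does for normal pages and then
applies the same test as the split_vma() check Hugh quoted:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed sizes, for illustration only: 4 KiB pages, 16 MiB hugepages. */
#define EX_PAGE_SIZE	(1UL << 12)
#define EX_PAGE_MASK	(~(EX_PAGE_SIZE - 1))
#define EX_HPAGE_SIZE	(1UL << 24)
#define EX_HPAGE_MASK	(~(EX_HPAGE_SIZE - 1))

/*
 * Page-granular rounding: grow len to cover the partial first page,
 * round it up to a whole number of pages, then round start down.
 */
static void ex_round_to_page(unsigned long *start, size_t *len)
{
	assert(start && len);
	*len = (*len + (*start & ~EX_PAGE_MASK) + EX_PAGE_SIZE - 1)
		& EX_PAGE_MASK;
	*start &= EX_PAGE_MASK;
}

/* Same test as the quoted split_vma() check, applied to a hugetlb vma. */
static int ex_hugetlb_split_check(unsigned long addr)
{
	if (addr & ~EX_HPAGE_MASK)
		return -EINVAL;
	return 0;
}
```

With start = 0x10000064 and len = 5000, the rounded range is
0x10000000..0x10002000.  The start happens to be hugepage aligned, but
the end is not, so a split at the end of the range would be refused
with -EINVAL.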

This patch still solves a real problem: the kernel is currently more
restrictive about the length of the mlock it accepts from userspace
when the memory is a hugepage than when it is a regular page.  With a
regular page we round the value from userspace and happily go about our
business of mlocking.  With a hugepage the kernel just rejects the
request unless userspace has aligned it itself.  The patch lets the
kernel do the same rounding work for hugepages that it does for normal
pages.
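
The idea can be sketched in userspace roughly as follows.  This is not
the patch itself; struct ex_range, ex_align_range and ex_mlock_range
are hypothetical names, and the page and hugepage sizes are passed in
by the caller rather than taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a half-open range [start, end). */
struct ex_range {
	unsigned long start;
	unsigned long end;
};

/* Round [start, start + len) outward to multiples of size. */
static struct ex_range ex_align_range(unsigned long start, size_t len,
				      unsigned long size)
{
	struct ex_range r;
	unsigned long mask = size - 1;

	assert(size != 0 && (size & mask) == 0);	/* power of two */
	r.start = start & ~mask;
	r.end = (start + len + mask) & ~mask;
	return r;
}

/* Pick the rounding granularity from the kind of mapping. */
static struct ex_range ex_mlock_range(unsigned long start, size_t len,
				      int is_hugetlb,
				      unsigned long page_size,
				      unsigned long hpage_size)
{
	return ex_align_range(start, len,
			      is_hugetlb ? hpage_size : page_size);
}
```

Note that rounding outward to a hugepage boundary can lock more memory
than the caller asked for, in the same way page rounding already does,
just at a coarser granularity.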

-Eric



