[RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault

Sat Jun 20 02:43:33 AEST 2015

On Fri, 19 Jun 2015, Michal Hocko wrote:

> On Thu 18-06-15 16:30:48, Eric B Munson wrote:
> > On Thu, 18 Jun 2015, Michal Hocko wrote:
> [...]
> > > Wouldn't it be much more reasonable and straightforward to have
> > > MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
> > > explicitly disallow any form of pre-faulting? It would be usable for
> > > other usecases than with MAP_LOCKED combination.
> > 
> > I don't see a clear case for it being more reasonable, it is one
> > possible way to solve the problem.
> 
> MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
> around is all or nothing feature. Either all mappings (which support
> this) fault around or none. There is no way to tell the kernel that
> this particular mapping shouldn't fault around. I haven't seen such a
> request yet but we have seen requests to have a way to opt out from
> a global policy in the past (e.g. per-process opt out from THP). So
> I can imagine somebody will come with a request to opt out from any
> speculative operations on the mapped area in the future.
> 
> > But I think it leaves us in an even
> > more akward state WRT VMA flags.  As you noted in your fix for the
> > mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
> > not present.  Having VM_LOCKONFAULT states that this was intentional, if
> > we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
> > longer set VM_LOCKONFAULT (unless we want to start mapping it to the
> > presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
> > populate failure state harder.
> 
> I am not sure I understand your point here. Could you be more specific
> how would you check for that and what for?

My thought on detecting was that someone might want to know if they had
a VMA that was VM_LOCKED but had not been made present becuase of a
failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
is at least explicit about what is happening which would make detecting
the VM_LOCKED but not present state easier.  This assumes that
MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
it would have to.

> 
> From my understanding MAP_LOCKONFAULT is essentially
> MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
> single MAP_LOCKED unfortunately). I would love to also have
> MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
> skeptical considering how my previous attempt to make MAP_POPULATE
> reasonable went.

Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
new MAP_LOCKONFAULT flag (or both)?  If you prefer that MAP_LOCKED |
MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
instead of introducing MAP_LOCKONFAULT.  I went with the new flag
because to date, we have a one to one mapping of MAP_* to VM_* flags.

> 
> > If this is the preferred path for mmap(), I am fine with that. 
> 
> > However,
> > I would like to see the new system calls that Andrew mentioned (and that
> > I am testing patches for) go in as well. 
> 
> mlock with flags sounds like a good step but I am not sure it will make
> sense in the future. POSIX has screwed that and I am not sure how many
> applications would use it. This ship has sailed long time ago.

I don't know either, but the code is the question, right?  I know that
we have at least one team that wants it here.

> 
> > That way we give users the
> > ability to request VM_LOCKONFAULT for memory allocated using something
> > other than mmap.
> 
> mmap(MAP_FAULTPOPULATE); mlock() would have the same semantic even
> without changing mlock syscall.

That is true as long as MAP_FAULTPOPULATE set a flag in the VMA(s).  It
doesn't cover the actual case I was asking about, which is how do I get
lock on fault on malloc'd memory?

>  
> > > > This patch introduces the ability to request that pages are not
> > > > pre-faulted, but are placed on the unevictable LRU when they are finally
> > > > faulted in.
> > > > 
> > > > To keep accounting checks out of the page fault path, users are billed
> > > > for the entire mapping lock as if MAP_LOCKED was used.
> > > > 
> > > > Signed-off-by: Eric B Munson <emunson at akamai.com>
> > > > Cc: Michal Hocko <mhocko at suse.cz>
> > > > Cc: linux-alpha at vger.kernel.org
> > > > Cc: linux-kernel at vger.kernel.org
> > > > Cc: linux-mips at linux-mips.org
> > > > Cc: linux-parisc at vger.kernel.org
> > > > Cc: linuxppc-dev at lists.ozlabs.org
> > > > Cc: sparclinux at vger.kernel.org
> > > > Cc: linux-xtensa at linux-xtensa.org
> > > > Cc: linux-mm at kvack.org
> > > > Cc: linux-arch at vger.kernel.org
> > > > Cc: linux-api at vger.kernel.org
> > > > ---
> [...]
> -- 
> Michal Hocko
> SUSE Labs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20150619/0df3bdb0/attachment.sig>