[RFC 12/18] limits: track RLIMIT_MEMLOCK actual max

Doug Ledford dledford at redhat.com
Sat Jun 18 10:59:13 AEST 2016


On 6/13/2016 3:44 PM, Topi Miettinen wrote:
> Track maximum size of locked memory, presented in /proc/self/limits.

You should have probably Cc:ed everyone on the cover letter and probably
patch 1 of this series.  This patch is hard to decipher without the
additional context of those items.  However, that said, I think I see
what you are doing.  But your wording of your comments below is bad:

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index feb9bb7..d3f3c9f 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -3378,10 +3378,16 @@ static inline unsigned long rlimit_max(unsigned int limit)
>  	return task_rlimit_max(current, limit);
>  }
>  
> +static inline void task_bump_rlimit(struct task_struct *tsk,
> +				    unsigned int limit, unsigned long r)
> +{
> +	if (READ_ONCE(tsk->signal->rlim_curmax[limit]) < r)
> +		tsk->signal->rlim_curmax[limit] = r;
> +}
> +
>  static inline void bump_rlimit(unsigned int limit, unsigned long r)
>  {
> -	if (READ_ONCE(current->signal->rlim_curmax[limit]) < r)
> -		current->signal->rlim_curmax[limit] = r;
> +	return task_bump_rlimit(current, limit, r);
>  }
>  
>  #ifdef CONFIG_CPU_FREQ
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 46ecce4..192001e 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -76,6 +76,9 @@ static int bpf_map_charge_memlock(struct bpf_map *map)
>  		return -EPERM;
>  	}
>  	map->user = user;
> +	/* XXX resource limits apply per task, not per user */
> +	bump_rlimit(RLIMIT_MEMLOCK, atomic_long_read(&user->locked_vm) <<
> +		    PAGE_SHIFT);

No, these resource limits do not apply per task.  They are per user.
However, you are doing maximum  usage accounting on a per-task basis by
adding a new counter to the signal struct of the task.  Fine, but your
comments need to reflect that instead of the confusing comment above.
In addition, your function name is horrible for what you are doing.  A
person reading this function will think that you are bumping the actual
rlimit on the task, which is not what you are doing.  You are performing
per-task accounting of MEMLOCK memory.  The actual permission checks are
per-user, and the primary accounting is per-user.  So, really, this is
just a nice little feature that provides a more granular per-task usage
(but not control) so a user can see where their overall memlock memory
is being used.  Fine.  I would reword the comment something like this:

/* XXX resource is tracked and limit enforced on a per user basis,
   but we track it on a per-task basis as well so users can identify
   hogs of this resource, stats can be found in /proc/<pid>/limits */

And I would rename bump_rlimit and task_bump_rlimit to something like
account_rlimit and task_account_rlimit.  Calling it bump just gives the
wrong idea entirely on first read.

>  	return 0;
>  }
>  
> @@ -601,6 +604,9 @@ static int bpf_prog_charge_memlock(struct bpf_prog *prog)
>  		return -EPERM;
>  	}
>  	prog->aux->user = user;
> +	/* XXX resource limits apply per task, not per user */
> +	bump_rlimit(RLIMIT_MEMLOCK, atomic_long_read(&user->locked_vm) <<
> +		    PAGE_SHIFT);
>  	return 0;
>  }

> @@ -798,6 +802,9 @@ int user_shm_lock(size_t size, struct user_struct *user)
>  	get_uid(user);
>  	user->locked_shm += locked;
>  	allowed = 1;
> +
> +	/* XXX resource limits apply per task, not per user */
> +	bump_rlimit(RLIMIT_MEMLOCK, user->locked_shm << PAGE_SHIFT);
>  out:
>  	spin_unlock(&shmlock_user_lock);
>  	return allowed;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 0963e7f..4e683dd 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2020,6 +2020,9 @@ static int acct_stack_growth(struct vm_area_struct *vma, unsigned long size, uns
>  		return -ENOMEM;
>  
>  	bump_rlimit(RLIMIT_STACK, actual_size);
> +	if (vma->vm_flags & VM_LOCKED)
> +		bump_rlimit(RLIMIT_MEMLOCK,
> +			    (mm->locked_vm + grow) << PAGE_SHIFT);
>  
>  	return 0;
>  }
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 1f157ad..ade3e13 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -394,6 +394,9 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
>  		*p = charged;
>  	}
>  
> +	if (vma->vm_flags & VM_LOCKED)
> +		bump_rlimit(RLIMIT_MEMLOCK, (mm->locked_vm << PAGE_SHIFT) +
> +			    new_len - old_len);
>  	return vma;
>  }
>  
> 


-- 
Doug Ledford <dledford at redhat.com>
    GPG Key ID: 0E572FDD

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 884 bytes
Desc: OpenPGP digital signature
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20160617/03d876d1/attachment.sig>


More information about the Linuxppc-dev mailing list