Call sites of ucounts_limit_cmp() would allow the global root or a capable user to bypass RLIMIT_NPROC on the bottom level of the user_ns tree by not looking at ucounts at all.
As the traversal up the user_ns tree continues, the ucounts to which the task is charged may switch the owning user (to the creator of user_ns). If the new chargee is root, we don't really care about RLIMIT_NPROC observation, so lift the limit to the max.
The result is that an unprivileged user U can globally run more than RLIMIT_NPROC (of user_ns) tasks, but within each user_ns U is still limited to RLIMIT_NPROC (as passed into task->signal->rlim), iff the user namespaces are created by the privileged user.
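As a rough userspace illustration of the walk described above (not kernel code; the model_* types, fields and function are hypothetical simplifications of struct ucounts, struct user_namespace and ucounts_limit_cmp()), the effect of lifting the limit at root-owned levels can be sketched as:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_ns;

struct model_ucounts {
	struct model_ns *ns;	/* namespace this counter belongs to */
	long nproc;		/* tasks charged at this level */
};

struct model_ns {
	struct model_ucounts *creator;	/* creator's ucounts in the parent ns, NULL at the top */
	bool owner_is_root;		/* assumption: stands in for uid_eq(owner, GLOBAL_ROOT_UID) */
	long nproc_max;			/* limit recorded for this ns */
};

/* Returns a positive excess if some level of the walk is over its limit. */
long model_limit_cmp(struct model_ucounts *uc, long max)
{
	long excess = 0;

	for (struct model_ucounts *iter = uc; iter; iter = iter->ns->creator) {
		excess = iter->nproc - max;
		if (excess > 0)
			return excess;
		max = iter->ns->nproc_max;
		/* Next level is charged to root: the limit is moot there. */
		if (iter->ns->owner_is_root)
			max = LONG_MAX;
	}
	return excess;
}
```

In this model, a user at the limit inside a root-created namespace is not rejected because of tasks charged to root further up, while exceeding the per-namespace limit still yields a positive excess.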
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/ucount.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 53ccd96387dd..f52b7273a572 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -356,6 +356,9 @@ long ucounts_limit_cmp(struct ucounts *ucounts, enum ucount_type type, unsigned
 		if (excess > 0)
 			return excess;
 		max = READ_ONCE(iter->ns->ucount_max[type]);
+		/* Next ucounts owned by root? RLIMIT_NPROC is moot */
+		if (type == UCOUNT_RLIMIT_NPROC && uid_eq(iter->ns->owner, GLOBAL_ROOT_UID))
+			max = LONG_MAX;
 	}
 	return excess;
 }