On Fri, Jul 08, 2022 at 09:01:32AM -0500, Frederick Lawler wrote:
On 7/8/22 7:10 AM, Christian Göttsche wrote:
On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com> wrote:
While creating an LSM BPF MAC policy to block user namespace creation, we used the LSM cred_prepare hook because that is the closest hook to prevent a call to create_user_ns().
The calls look something like this:
    cred = prepare_creds()
        security_prepare_creds()
            call_int_hook(cred_prepare, ...
    if (cred)
        create_user_ns(cred)
We noticed that error codes were not propagated from this hook and introduced a patch [1] to propagate those errors.
The discussion notes that security_prepare_creds() is not appropriate for MAC policies, and instead the hook is meant for LSM authors to prepare credentials for mutation. [2]
Ultimately, we concluded that a better course of action is to introduce a new security hook for LSM authors. [3]
This patch set first introduces a new security_create_user_ns() function and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
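To illustrate the error propagation being asked for, here is a minimal userspace sketch, not the patch itself. It models an int-returning hook chain in which the first non-zero return value stops the walk and is handed back to the caller, which then skips the namespace creation. All names (userns_hook_fn, call_userns_hooks, model_create_user_ns, and the two example hooks) are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered LSM hook callback. */
typedef int (*userns_hook_fn)(int caller_is_privileged);

/* Walk the hook list; the first non-zero result wins, otherwise return 0. */
static int call_userns_hooks(const userns_hook_fn *hooks, size_t n,
                             int caller_is_privileged)
{
    for (size_t i = 0; i < n; i++) {
        int rc = hooks[i](caller_is_privileged);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example hook that never objects. */
static int allow_all(int caller_is_privileged)
{
    (void)caller_is_privileged;
    return 0;
}

/* Example hook that rejects unprivileged callers. */
static int deny_unprivileged(int caller_is_privileged)
{
    return caller_is_privileged ? 0 : -EPERM;
}

/*
 * Model of the caller: consult the hooks first and propagate any error
 * unchanged. *created is set to 1 only when every hook returned 0.
 */
static int model_create_user_ns(const userns_hook_fn *hooks, size_t n,
                                int caller_is_privileged, int *created)
{
    int rc = call_userns_hooks(hooks, n, caller_is_privileged);
    if (rc)
        return rc;
    *created = 1;
    return 0;
}
```

With both example hooks registered, an unprivileged caller gets -EPERM back and nothing is created, while a privileged caller gets 0.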
Some thoughts:
I.
Why not make the hook more generic, e.g. support all other existing and potential future namespaces?
The main issue with a generic hook is that different namespaces have different calling contexts. We decided in a previous discussion to opt out of a generic hook for this reason. [1]
Agreed.
Also I think the naming scheme is <object>_<verb>.
That's a good callout. I was originally hoping to keep the security_*() function name, the hook name, and the caller function name matched so that everything stays aligned. If no one objects, I can rename the hook for v3.
LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
unsigned int flags)
where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h (like CLONE_NEWUSER).
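As a rough illustration of how a policy could consume such a flags bitmap, the sketch below tests for CLONE_NEWUSER and leaves the other namespace flags alone. It is a userspace model under stated assumptions: namespace_create_policy() and its caller_is_privileged parameter are hypothetical, and only the CLONE_* constants come from the uapi header mentioned above.

```c
#include <errno.h>
#include <linux/sched.h>   /* CLONE_NEWUSER, CLONE_NEWNET, ... */

/*
 * Hypothetical policy: refuse user namespace creation for unprivileged
 * callers and allow every other combination of flags.
 */
static int namespace_create_policy(unsigned int flags, int caller_is_privileged)
{
    if ((flags & CLONE_NEWUSER) && !caller_is_privileged)
        return -EPERM;
    return 0;
}
```

Because flags is a bitmap, a request such as CLONE_NEWUSER | CLONE_NEWNET is still rejected for an unprivileged caller, while CLONE_NEWNET on its own passes this particular check.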
II.
While adding policing for namespaces, maybe also add a new hook for setns(2):
LSM_HOOK(int, 0, namespace_join, const struct cred *subj,
         const struct cred *obj, unsigned int flags)
IIUC, setns() will create a new namespace for the other namespaces except for user namespace. If we add a security hook for the other create_*_ns() functions, then we can catch setns() at that point.
setns() doesn't create new namespaces. It just switches to already existing ones:
    setns(<pidfd>, <flags>)
    -> prepare_nsset()
       /*
        * Notice the 0 passed as flags which means all namespaces will
        * just take a reference.
        */
       -> create_new_namespaces(0, ...)
You're thinking about unshare(), and unshare() will be caught in create_user_ns().
If you block the creation of user namespaces by unprivileged users in create_user_ns(), you can only create user namespaces as a privileged user. Consequently, only privileged users can setns() to a user namespace. So either the caller has CAP_SYS_ADMIN in the parent userns, or they are located in the parent userns and are the owner of the userns they are attaching to. So if you lock create_user_ns() to capable(CAP_SYS_ADMIN), you should be done.
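The attach rule described above can be sketched as a small userspace model. This is only an illustration of the two conditions as stated in this mail, not kernel code: the struct layouts, the function model_may_setns_userns(), and the precomputed cap_sys_admin_in_parent flag are all hypothetical simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a user namespace. */
struct model_userns {
    const struct model_userns *parent;  /* NULL for the initial namespace */
    unsigned int owner_uid;             /* uid that created the namespace */
};

/* Hypothetical model of the task calling setns(). */
struct model_caller {
    const struct model_userns *userns;  /* namespace the caller lives in */
    unsigned int uid;
    bool cap_sys_admin_in_parent;       /* simplification: precomputed */
};

/*
 * Allow the attach if the caller has CAP_SYS_ADMIN in the target's parent
 * namespace, or if the caller lives in that parent and owns the target.
 */
static bool model_may_setns_userns(const struct model_caller *caller,
                                   const struct model_userns *target)
{
    if (caller->cap_sys_admin_in_parent)
        return true;
    return caller->userns == target->parent &&
           caller->uid == target->owner_uid;
}
```

In this model, the owner of a child namespace who still lives in the parent may attach, another unprivileged user in the same parent may not, and a caller with CAP_SYS_ADMIN in the parent always may.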