From: Xu Kuohai <xukuohai@huawei.com>
A bpf prog attached to the file_alloc_security hook that returns a positive number will cause a kernel panic.
Here is a panic log:
[ 441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
[ 441.236748] #PF: supervisor write access in kernel mode
[ 441.237429] #PF: error_code(0x0002) - not-present page
[ 441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
[ 441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
[ 441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
[ 441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
[ 441.241933] RIP: 0010:alloc_file+0x4b/0x190
[ 441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
[ 441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
[ 441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
[ 441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
[ 441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
[ 441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
[ 441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
[ 441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
[ 441.252688] Call Trace:
[ 441.253011]  <TASK>
[ 441.253296]  ? __die+0x24/0x70
[ 441.253702]  ? page_fault_oops+0x15b/0x480
[ 441.254236]  ? fixup_exception+0x26/0x330
[ 441.254750]  ? exc_page_fault+0x6d/0x1c0
[ 441.255257]  ? asm_exc_page_fault+0x26/0x30
[ 441.255792]  ? alloc_file+0x4b/0x190
[ 441.256257]  alloc_file_pseudo+0x9f/0xf0
[ 441.256760]  __anon_inode_getfile+0x87/0x190
[ 441.257311]  ? lock_release+0x14e/0x3f0
[ 441.257808]  bpf_link_prime+0xe8/0x1d0
[ 441.258315]  bpf_tracing_prog_attach+0x311/0x570
[ 441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
[ 441.259605]  __sys_bpf+0x1bb7/0x2dc0
[ 441.260070]  __x64_sys_bpf+0x20/0x30
[ 441.260533]  do_syscall_64+0x72/0x140
[ 441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 441.261643] RIP: 0033:0x4b0349
[ 441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
[ 441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[ 441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
[ 441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
[ 441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
[ 441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
The reason is that a positive number returned by a bpf prog is not a valid errno and cannot be filtered out by IS_ERR, which the file system uses to check for errors. As a result, the file system mistakenly uses this random positive number as a file pointer, causing the panic.
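To make the failure concrete, here is a minimal userspace sketch (my illustration, not the actual kernel code) of the alloc_file()-style pattern: IS_ERR_VALUE() only treats the top MAX_ERRNO (4095) values of the address space as errors, so a positive hook return slips through the error check and is later used as a pointer:

#include <stdio.h>

/* simplified copies of the kernel's error-pointer helpers */
#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* stand-in for a bpf lsm prog on file_alloc_security */
static long security_file_alloc(void)
{
	return 1; /* positive: neither 0 nor a valid -errno */
}

/* stand-in for the alloc_file()-style error handling */
static void *alloc_file_like(void)
{
	long err = security_file_alloc();

	if (err)
		return (void *)err; /* like ERR_PTR(1): bogus "error pointer" */
	return NULL; /* real allocation elided */
}

int main(void)
{
	void *f = alloc_file_like();

	if (IS_ERR_VALUE((unsigned long)f))
		printf("caught error %ld\n", -(long)f);
	else
		printf("IS_ERR missed %p; the kernel would dereference it\n", f);
	return 0;
}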
To fix this issue, there are two possible schemes:
1. Modify the call sites of file_alloc_security to treat positive return values as zero.
2. Make the bpf verifier ensure that no unexpected value is returned by an lsm bpf prog.
Considering that the file_alloc_security hook never returned a positive number before bpf lsm was introduced, and that other lsm hooks may have the same problem, scheme 2 is more reasonable.
So this series adds an lsm return value check to the verifier to fix this issue.
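For illustration, a minimal bpf lsm program along the following lines (a sketch I am adding here, not one of the selftests in this series) is enough to hit the problem before the fix; with the verifier check in place it is rejected at load time, since the expected return range for this hook is [-MAX_ERRNO, 0]:

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm/file_alloc_security")
int BPF_PROG(bad_file_alloc, struct file *file)
{
	/* positive return: not 0 and not a -errno */
	return 1;
}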
v3:
1. Fix incorrect lsm hook return value ranges, add a disabled hook list for bpf lsm, and merge the two LSM_RET_INT patches. (KP Singh)
2. Prevent bpf lsm progs attached to different hooks from calling each other via tail call.
3. Fix a CI failure caused by false rejection of an AND operation.
4. Add tests.
v2: https://lore.kernel.org/bpf/20240325095653.1720123-1-xukuohai@huaweicloud.co...
    - fix bpf ci failure
v1: https://lore.kernel.org/bpf/20240316122359.1073787-1-xukuohai@huaweicloud.co...
Xu Kuohai (11):
  bpf, lsm: Annotate lsm hook return value range
  bpf, lsm: Add helper to read lsm hook return value range
  bpf, lsm: Check bpf lsm hook return values in verifier
  bpf, lsm: Add bpf lsm disabled hook list
  bpf: Avoid progs for different hooks calling each other with tail call
  bpf: Fix compare error in function retval_range_within
  bpf: Fix a false rejection caused by AND operation
  selftests/bpf: Avoid load failure for token_lsm.c
  selftests/bpf: Add return value checks for failed tests
  selftests/bpf: Add test for lsm tail call
  selftests/bpf: Add verifier tests for bpf lsm
 include/linux/bpf.h                                |   2 +
 include/linux/bpf_lsm.h                            |   8 +
 include/linux/lsm_hook_defs.h                      | 591 +++++++++---------
 include/linux/lsm_hooks.h                          |   6 -
 kernel/bpf/bpf_lsm.c                               |  84 ++-
 kernel/bpf/btf.c                                   |   5 +-
 kernel/bpf/core.c                                  |  22 +-
 kernel/bpf/verifier.c                              |  82 ++-
 security/security.c                                |   1 +
 .../selftests/bpf/prog_tests/test_lsm.c            |  46 +-
 .../selftests/bpf/prog_tests/verifier.c            |   3 +-
 tools/testing/selftests/bpf/progs/err.h            |  10 +
 .../selftests/bpf/progs/lsm_tailcall.c             |  34 +
 .../selftests/bpf/progs/test_sig_in_xattr.c        |   4 +
 .../bpf/progs/test_verify_pkcs7_sig.c              |   8 +-
 tools/testing/selftests/bpf/progs/token_lsm.c      |   4 +-
 .../bpf/progs/verifier_global_subprogs.c           |   7 +-
 .../selftests/bpf/progs/verifier_lsm.c             | 155 +++++
 18 files changed, 754 insertions(+), 318 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_tailcall.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_lsm.c
From: Xu Kuohai <xukuohai@huawei.com>
Add macro LSM_RET_INT to annotate the return integer type, the default return value, and the expected return range of each lsm hook.
LSM_RET_INT is declared as:
LSM_RET_INT(defval, min, max)
where
- defval is the default return value
- min and max indicate the expected return range is [min, max]
The return value range for each lsm hook is taken from the description in security/security.c.
The expansion of LSM_RET_INT is unchanged, so the compiled output stays the same.
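To see why nothing changes at compile time: LSM_RET_INT is defined (in kernel/bpf/bpf_lsm.c and security/security.c below) as a variadic macro that keeps only its first argument, so the annotated hook definitions expand exactly as before. A standalone sketch of the expansion:

#include <stdio.h>

/* same definition this patch adds: keep only the default value */
#define LSM_RET_INT(defval, ...) defval

int main(void)
{
	/* LSM_RET_INT(0, -MAX_ERRNO, 0) expands to plain 0 */
	printf("%d\n", LSM_RET_INT(0, -4095, 0));
	/* LSM_RET_INT(-EOPNOTSUPP, ...) expands to plain -EOPNOTSUPP (-95) */
	printf("%d\n", LSM_RET_INT(-95, -4095, 0));
	return 0;
}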
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
 include/linux/lsm_hooks.h     |   6 -
 kernel/bpf/bpf_lsm.c          |  10 +
 security/security.c           |   1 +
 4 files changed, 313 insertions(+), 295 deletions(-)
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 334e00efbde4..708f515ffbf3 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -18,435 +18,448 @@
  * The macro LSM_HOOK is used to define the data structures required by
  * the LSM framework using the pattern:
  *
- * LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
+ * LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
  *
  * struct security_hook_heads {
- * #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
+ * #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
  * #include <linux/lsm_hook_defs.h>
  * #undef LSM_HOOK
  * };
  */
-LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
-LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
 	const struct cred *to)
-LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
 	const struct cred *to)
-LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
 	const struct cred *to, const struct file *file)
-LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ptrace_access_check, struct task_struct *child,
 	unsigned int mode)
-LSM_HOOK(int, 0, ptrace_traceme, struct task_struct *parent)
-LSM_HOOK(int, 0, capget, const struct task_struct *target, kernel_cap_t *effective,
-	kernel_cap_t *inheritable, kernel_cap_t *permitted)
-LSM_HOOK(int, 0, capset, struct cred *new, const struct cred *old,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ptrace_traceme, struct task_struct *parent)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), capget, const struct task_struct *target,
+	kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), capset, struct cred *new, const struct cred *old,
 	const kernel_cap_t *effective, const kernel_cap_t *inheritable,
 	const kernel_cap_t *permitted)
-LSM_HOOK(int, 0, capable, const struct cred *cred, struct user_namespace *ns,
-	int cap, unsigned int opts)
-LSM_HOOK(int, 0, quotactl, int cmds, int type, int id, const struct super_block *sb)
-LSM_HOOK(int, 0, quota_on, struct dentry *dentry)
-LSM_HOOK(int, 0, syslog, int type)
-LSM_HOOK(int, 0, settime, const struct timespec64 *ts,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), capable, const struct cred *cred,
+	struct user_namespace *ns, int cap, unsigned int opts)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), quotactl, int cmds, int type, int id,
+	const struct super_block *sb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), quota_on, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), syslog, int type)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), settime, const struct timespec64 *ts,
 	const struct timezone *tz)
-LSM_HOOK(int, 1, vm_enough_memory, struct mm_struct *mm, long pages)
-LSM_HOOK(int, 0, bprm_creds_for_exec, struct linux_binprm *bprm)
-LSM_HOOK(int, 0, bprm_creds_from_file, struct linux_binprm *bprm, const struct file *file)
-LSM_HOOK(int, 0, bprm_check_security, struct linux_binprm *bprm)
+LSM_HOOK(int, LSM_RET_INT(1, INT_MIN, INT_MAX), vm_enough_memory, struct mm_struct *mm, long pages)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bprm_creds_for_exec, struct linux_binprm *bprm)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bprm_creds_from_file, struct linux_binprm *bprm,
+	const struct file *file)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bprm_check_security, struct linux_binprm *bprm)
 LSM_HOOK(void, LSM_RET_VOID, bprm_committing_creds, const struct linux_binprm *bprm)
 LSM_HOOK(void, LSM_RET_VOID, bprm_committed_creds, const struct linux_binprm *bprm)
-LSM_HOOK(int, 0, fs_context_submount, struct fs_context *fc, struct super_block *reference)
-LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), fs_context_submount, struct fs_context *fc,
+	struct super_block *reference)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), fs_context_dup, struct fs_context *fc,
 	struct fs_context *src_sc)
-LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
+LSM_HOOK(int, LSM_RET_INT(-ENOPARAM, -MAX_ERRNO, 0), fs_context_parse_param, struct fs_context *fc,
 	struct fs_parameter *param)
-LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_alloc_security, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
-LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
-LSM_HOOK(int, 0, sb_mnt_opts_compat, struct super_block *sb, void *mnt_opts)
-LSM_HOOK(int, 0, sb_remount, struct super_block *sb, void *mnt_opts)
-LSM_HOOK(int, 0, sb_kern_mount, const struct super_block *sb)
-LSM_HOOK(int, 0, sb_show_options, struct seq_file *m, struct super_block *sb)
-LSM_HOOK(int, 0, sb_statfs, struct dentry *dentry)
-LSM_HOOK(int, 0, sb_mount, const char *dev_name, const struct path *path,
-	const char *type, unsigned long flags, void *data)
-LSM_HOOK(int, 0, sb_umount, struct vfsmount *mnt, int flags)
-LSM_HOOK(int, 0, sb_pivotroot, const struct path *old_path,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_eat_lsm_opts, char *orig, void **mnt_opts)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_mnt_opts_compat, struct super_block *sb,
+	void *mnt_opts)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_remount, struct super_block *sb, void *mnt_opts)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_kern_mount, const struct super_block *sb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_show_options, struct seq_file *m,
+	struct super_block *sb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_statfs, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_mount, const char *dev_name,
+	const struct path *path, const char *type, unsigned long flags, void *data)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_umount, struct vfsmount *mnt, int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_pivotroot, const struct path *old_path,
 	const struct path *new_path)
-LSM_HOOK(int, 0, sb_set_mnt_opts, struct super_block *sb, void *mnt_opts,
-	unsigned long kern_flags, unsigned long *set_kern_flags)
-LSM_HOOK(int, 0, sb_clone_mnt_opts, const struct super_block *oldsb,
-	struct super_block *newsb, unsigned long kern_flags,
-	unsigned long *set_kern_flags)
-LSM_HOOK(int, 0, move_mount, const struct path *from_path,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_set_mnt_opts, struct super_block *sb,
+	void *mnt_opts, unsigned long kern_flags, unsigned long *set_kern_flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sb_clone_mnt_opts, const struct super_block *oldsb,
+	struct super_block *newsb, unsigned long kern_flags, unsigned long *set_kern_flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), move_mount, const struct path *from_path,
 	const struct path *to_path)
-LSM_HOOK(int, -EOPNOTSUPP, dentry_init_security, struct dentry *dentry,
-	int mode, const struct qstr *name, const char **xattr_name,
-	void **ctx, u32 *ctxlen)
-LSM_HOOK(int, 0, dentry_create_files_as, struct dentry *dentry, int mode,
-	struct qstr *name, const struct cred *old, struct cred *new)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), dentry_init_security, struct dentry *dentry,
+	int mode, const struct qstr *name, const char **xattr_name, void **ctx, u32 *ctxlen)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), dentry_create_files_as, struct dentry *dentry,
+	int mode, struct qstr *name, const struct cred *old, struct cred *new)
 #ifdef CONFIG_SECURITY_PATH
-LSM_HOOK(int, 0, path_unlink, const struct path *dir, struct dentry *dentry)
-LSM_HOOK(int, 0, path_mkdir, const struct path *dir, struct dentry *dentry,
-	umode_t mode)
-LSM_HOOK(int, 0, path_rmdir, const struct path *dir, struct dentry *dentry)
-LSM_HOOK(int, 0, path_mknod, const struct path *dir, struct dentry *dentry,
-	umode_t mode, unsigned int dev)
-LSM_HOOK(void, LSM_RET_VOID, path_post_mknod, struct mnt_idmap *idmap,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_unlink, const struct path *dir,
+	struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_mkdir, const struct path *dir,
+	struct dentry *dentry, umode_t mode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_rmdir, const struct path *dir,
 	struct dentry *dentry)
-LSM_HOOK(int, 0, path_truncate, const struct path *path)
-LSM_HOOK(int, 0, path_symlink, const struct path *dir, struct dentry *dentry,
-	const char *old_name)
-LSM_HOOK(int, 0, path_link, struct dentry *old_dentry,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_mknod, const struct path *dir,
+	struct dentry *dentry, umode_t mode, unsigned int dev)
+LSM_HOOK(void, LSM_RET_VOID, path_post_mknod, struct mnt_idmap *idmap, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_truncate, const struct path *path)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_symlink, const struct path *dir,
+	struct dentry *dentry, const char *old_name)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_link, struct dentry *old_dentry,
 	const struct path *new_dir, struct dentry *new_dentry)
-LSM_HOOK(int, 0, path_rename, const struct path *old_dir,
-	struct dentry *old_dentry, const struct path *new_dir,
-	struct dentry *new_dentry, unsigned int flags)
-LSM_HOOK(int, 0, path_chmod, const struct path *path, umode_t mode)
-LSM_HOOK(int, 0, path_chown, const struct path *path, kuid_t uid, kgid_t gid)
-LSM_HOOK(int, 0, path_chroot, const struct path *path)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_rename, const struct path *old_dir,
+	struct dentry *old_dentry, const struct path *new_dir, struct dentry *new_dentry,
+	unsigned int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_chmod, const struct path *path, umode_t mode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_chown, const struct path *path, kuid_t uid,
+	kgid_t gid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_chroot, const struct path *path)
 #endif /* CONFIG_SECURITY_PATH */
 /* Needed for inode based security check */
-LSM_HOOK(int, 0, path_notify, const struct path *path, u64 mask,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), path_notify, const struct path *path, u64 mask,
 	unsigned int obj_type)
-LSM_HOOK(int, 0, inode_alloc_security, struct inode *inode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_alloc_security, struct inode *inode)
 LSM_HOOK(void, LSM_RET_VOID, inode_free_security, struct inode *inode)
-LSM_HOOK(int, -EOPNOTSUPP, inode_init_security, struct inode *inode,
-	struct inode *dir, const struct qstr *qstr, struct xattr *xattrs,
-	int *xattr_count)
-LSM_HOOK(int, 0, inode_init_security_anon, struct inode *inode,
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), inode_init_security, struct inode *inode,
+	struct inode *dir, const struct qstr *qstr, struct xattr *xattrs, int *xattr_count)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_init_security_anon, struct inode *inode,
 	const struct qstr *name, const struct inode *context_inode)
-LSM_HOOK(int, 0, inode_create, struct inode *dir, struct dentry *dentry,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_create, struct inode *dir, struct dentry *dentry,
 	umode_t mode)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_create_tmpfile, struct mnt_idmap *idmap,
 	struct inode *inode)
-LSM_HOOK(int, 0, inode_link, struct dentry *old_dentry, struct inode *dir,
-	struct dentry *new_dentry)
-LSM_HOOK(int, 0, inode_unlink, struct inode *dir, struct dentry *dentry)
-LSM_HOOK(int, 0, inode_symlink, struct inode *dir, struct dentry *dentry,
-	const char *old_name)
-LSM_HOOK(int, 0, inode_mkdir, struct inode *dir, struct dentry *dentry,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_link, struct dentry *old_dentry,
+	struct inode *dir, struct dentry *new_dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_unlink, struct inode *dir, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_symlink, struct inode *dir,
+	struct dentry *dentry, const char *old_name)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_mkdir, struct inode *dir, struct dentry *dentry,
 	umode_t mode)
-LSM_HOOK(int, 0, inode_rmdir, struct inode *dir, struct dentry *dentry)
-LSM_HOOK(int, 0, inode_mknod, struct inode *dir, struct dentry *dentry,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_rmdir, struct inode *dir, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_mknod, struct inode *dir, struct dentry *dentry,
 	umode_t mode, dev_t dev)
-LSM_HOOK(int, 0, inode_rename, struct inode *old_dir, struct dentry *old_dentry,
-	struct inode *new_dir, struct dentry *new_dentry)
-LSM_HOOK(int, 0, inode_readlink, struct dentry *dentry)
-LSM_HOOK(int, 0, inode_follow_link, struct dentry *dentry, struct inode *inode,
-	bool rcu)
-LSM_HOOK(int, 0, inode_permission, struct inode *inode, int mask)
-LSM_HOOK(int, 0, inode_setattr, struct mnt_idmap *idmap, struct dentry *dentry,
-	struct iattr *attr)
-LSM_HOOK(void, LSM_RET_VOID, inode_post_setattr, struct mnt_idmap *idmap,
-	struct dentry *dentry, int ia_valid)
-LSM_HOOK(int, 0, inode_getattr, const struct path *path)
-LSM_HOOK(int, 0, inode_setxattr, struct mnt_idmap *idmap,
-	struct dentry *dentry, const char *name, const void *value,
-	size_t size, int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_rename, struct inode *old_dir,
+	struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_readlink, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_follow_link, struct dentry *dentry,
+	struct inode *inode, bool rcu)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_permission, struct inode *inode, int mask)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_setattr, struct mnt_idmap *idmap,
+	struct dentry *dentry, struct iattr *attr)
+LSM_HOOK(void, LSM_RET_VOID, inode_post_setattr, struct mnt_idmap *idmap, struct dentry *dentry,
+	int ia_valid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_getattr, const struct path *path)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_setxattr, struct mnt_idmap *idmap,
+	struct dentry *dentry, const char *name, const void *value, size_t size, int flags)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_setxattr, struct dentry *dentry,
 	const char *name, const void *value, size_t size, int flags)
-LSM_HOOK(int, 0, inode_getxattr, struct dentry *dentry, const char *name)
-LSM_HOOK(int, 0, inode_listxattr, struct dentry *dentry)
-LSM_HOOK(int, 0, inode_removexattr, struct mnt_idmap *idmap,
-	struct dentry *dentry, const char *name)
-LSM_HOOK(void, LSM_RET_VOID, inode_post_removexattr, struct dentry *dentry,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_getxattr, struct dentry *dentry,
 	const char *name)
-LSM_HOOK(int, 0, inode_set_acl, struct mnt_idmap *idmap,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_listxattr, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_removexattr, struct mnt_idmap *idmap,
+	struct dentry *dentry, const char *name)
+LSM_HOOK(void, LSM_RET_VOID, inode_post_removexattr, struct dentry *dentry, const char *name)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_set_acl, struct mnt_idmap *idmap,
 	struct dentry *dentry, const char *acl_name, struct posix_acl *kacl)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_set_acl, struct dentry *dentry,
 	const char *acl_name, struct posix_acl *kacl)
-LSM_HOOK(int, 0, inode_get_acl, struct mnt_idmap *idmap,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_get_acl, struct mnt_idmap *idmap,
 	struct dentry *dentry, const char *acl_name)
-LSM_HOOK(int, 0, inode_remove_acl, struct mnt_idmap *idmap,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_remove_acl, struct mnt_idmap *idmap,
 	struct dentry *dentry, const char *acl_name)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_remove_acl, struct mnt_idmap *idmap,
 	struct dentry *dentry, const char *acl_name)
-LSM_HOOK(int, 0, inode_need_killpriv, struct dentry *dentry)
-LSM_HOOK(int, 0, inode_killpriv, struct mnt_idmap *idmap,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, INT_MAX), inode_need_killpriv, struct dentry *dentry)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_killpriv, struct mnt_idmap *idmap,
 	struct dentry *dentry)
-LSM_HOOK(int, -EOPNOTSUPP, inode_getsecurity, struct mnt_idmap *idmap,
-	struct inode *inode, const char *name, void **buffer, bool alloc)
-LSM_HOOK(int, -EOPNOTSUPP, inode_setsecurity, struct inode *inode,
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, INT_MAX), inode_getsecurity,
+	struct mnt_idmap *idmap, struct inode *inode, const char *name, void **buffer, bool alloc)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), inode_setsecurity, struct inode *inode,
 	const char *name, const void *value, size_t size, int flags)
-LSM_HOOK(int, 0, inode_listsecurity, struct inode *inode, char *buffer,
-	size_t buffer_size)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, INT_MAX), inode_listsecurity, struct inode *inode,
+	char *buffer, size_t buffer_size)
 LSM_HOOK(void, LSM_RET_VOID, inode_getsecid, struct inode *inode, u32 *secid)
-LSM_HOOK(int, 0, inode_copy_up, struct dentry *src, struct cred **new)
-LSM_HOOK(int, -EOPNOTSUPP, inode_copy_up_xattr, const char *name)
-LSM_HOOK(int, 0, kernfs_init_security, struct kernfs_node *kn_dir,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_copy_up, struct dentry *src, struct cred **new)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 1), inode_copy_up_xattr, const char *name)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernfs_init_security, struct kernfs_node *kn_dir,
 	struct kernfs_node *kn)
-LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
-LSM_HOOK(int, 0, file_alloc_security, struct file *file)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_permission, struct file *file, int mask)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_alloc_security, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_release, struct file *file)
 LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
-LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_ioctl, struct file *file, unsigned int cmd,
 	unsigned long arg)
-LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_ioctl_compat, struct file *file, unsigned int cmd,
 	unsigned long arg)
-LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
-LSM_HOOK(int, 0, mmap_file, struct file *file, unsigned long reqprot,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), mmap_addr, unsigned long addr)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), mmap_file, struct file *file, unsigned long reqprot,
 	unsigned long prot, unsigned long flags)
-LSM_HOOK(int, 0, file_mprotect, struct vm_area_struct *vma,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_mprotect, struct vm_area_struct *vma,
 	unsigned long reqprot, unsigned long prot)
-LSM_HOOK(int, 0, file_lock, struct file *file, unsigned int cmd)
-LSM_HOOK(int, 0, file_fcntl, struct file *file, unsigned int cmd,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_lock, struct file *file, unsigned int cmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_fcntl, struct file *file, unsigned int cmd,
 	unsigned long arg)
 LSM_HOOK(void, LSM_RET_VOID, file_set_fowner, struct file *file)
-LSM_HOOK(int, 0, file_send_sigiotask, struct task_struct *tsk,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_send_sigiotask, struct task_struct *tsk,
 	struct fown_struct *fown, int sig)
-LSM_HOOK(int, 0, file_receive, struct file *file)
-LSM_HOOK(int, 0, file_open, struct file *file)
-LSM_HOOK(int, 0, file_post_open, struct file *file, int mask)
-LSM_HOOK(int, 0, file_truncate, struct file *file)
-LSM_HOOK(int, 0, task_alloc, struct task_struct *task,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_receive, struct file *file)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_open, struct file *file)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_post_open, struct file *file, int mask)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_truncate, struct file *file)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_alloc, struct task_struct *task,
 	unsigned long clone_flags)
 LSM_HOOK(void, LSM_RET_VOID, task_free, struct task_struct *task)
-LSM_HOOK(int, 0, cred_alloc_blank, struct cred *cred, gfp_t gfp)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), cred_alloc_blank, struct cred *cred, gfp_t gfp)
 LSM_HOOK(void, LSM_RET_VOID, cred_free, struct cred *cred)
-LSM_HOOK(int, 0, cred_prepare, struct cred *new, const struct cred *old,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), cred_prepare, struct cred *new, const struct cred *old,
 	gfp_t gfp)
-LSM_HOOK(void, LSM_RET_VOID, cred_transfer, struct cred *new,
-	const struct cred *old)
+LSM_HOOK(void, LSM_RET_VOID, cred_transfer, struct cred *new, const struct cred *old)
 LSM_HOOK(void, LSM_RET_VOID, cred_getsecid, const struct cred *c, u32 *secid)
-LSM_HOOK(int, 0, kernel_act_as, struct cred *new, u32 secid)
-LSM_HOOK(int, 0, kernel_create_files_as, struct cred *new, struct inode *inode)
-LSM_HOOK(int, 0, kernel_module_request, char *kmod_name)
-LSM_HOOK(int, 0, kernel_load_data, enum kernel_load_data_id id, bool contents)
-LSM_HOOK(int, 0, kernel_post_load_data, char *buf, loff_t size,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_act_as, struct cred *new, u32 secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_create_files_as, struct cred *new,
+	struct inode *inode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_module_request, char *kmod_name)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_load_data, enum kernel_load_data_id id,
+	bool contents)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_post_load_data, char *buf, loff_t size,
 	enum kernel_load_data_id id, char *description)
-LSM_HOOK(int, 0, kernel_read_file, struct file *file,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_read_file, struct file *file,
 	enum kernel_read_file_id id, bool contents)
-LSM_HOOK(int, 0, kernel_post_read_file, struct file *file, char *buf,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), kernel_post_read_file, struct file *file, char *buf,
 	loff_t size, enum kernel_read_file_id id)
-LSM_HOOK(int, 0, task_fix_setuid, struct cred *new, const struct cred *old,
-	int flags)
-LSM_HOOK(int, 0, task_fix_setgid, struct cred *new, const struct cred * old,
-	int flags)
-LSM_HOOK(int, 0, task_fix_setgroups, struct cred *new, const struct cred * old)
-LSM_HOOK(int, 0, task_setpgid, struct task_struct *p, pid_t pgid)
-LSM_HOOK(int, 0, task_getpgid, struct task_struct *p)
-LSM_HOOK(int, 0, task_getsid, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_fix_setuid, struct cred *new,
+	const struct cred *old, int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_fix_setgid, struct cred *new,
+	const struct cred *old, int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_fix_setgroups, struct cred *new,
+	const struct cred *old)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_setpgid, struct task_struct *p, pid_t pgid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_getpgid, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_getsid, struct task_struct *p)
 LSM_HOOK(void, LSM_RET_VOID, current_getsecid_subj, u32 *secid)
-LSM_HOOK(void, LSM_RET_VOID, task_getsecid_obj,
-	struct task_struct *p, u32 *secid)
-LSM_HOOK(int, 0, task_setnice, struct task_struct *p, int nice)
-LSM_HOOK(int, 0, task_setioprio, struct task_struct *p, int ioprio)
-LSM_HOOK(int, 0, task_getioprio, struct task_struct *p)
-LSM_HOOK(int, 0, task_prlimit, const struct cred *cred,
+LSM_HOOK(void, LSM_RET_VOID, task_getsecid_obj, struct task_struct *p, u32 *secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_setnice, struct task_struct *p, int nice)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_setioprio, struct task_struct *p, int ioprio)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_getioprio, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_prlimit, const struct cred *cred,
 	const struct cred *tcred, unsigned int flags)
-LSM_HOOK(int, 0, task_setrlimit, struct task_struct *p, unsigned int resource,
-	struct rlimit *new_rlim)
-LSM_HOOK(int, 0, task_setscheduler, struct task_struct *p)
-LSM_HOOK(int, 0, task_getscheduler, struct task_struct *p)
-LSM_HOOK(int, 0, task_movememory, struct task_struct *p)
-LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
-	int sig, const struct cred *cred)
-LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_setrlimit, struct task_struct *p,
+	unsigned int resource, struct rlimit *new_rlim)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_setscheduler, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_getscheduler, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_movememory, struct task_struct *p)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), task_kill, struct task_struct *p,
+	struct kernel_siginfo *info, int sig, const struct cred *cred)
+LSM_HOOK(int, LSM_RET_INT(-ENOSYS, -MAX_ERRNO, INT_MAX), task_prctl, int option, unsigned long arg2,
 	unsigned long arg3, unsigned long arg4, unsigned long arg5)
-LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
-	struct inode *inode)
-LSM_HOOK(int, 0, userns_create, const struct cred *cred)
-LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
-LSM_HOOK(void, LSM_RET_VOID, ipc_getsecid, struct kern_ipc_perm *ipcp,
-	u32 *secid)
-LSM_HOOK(int, 0, msg_msg_alloc_security, struct msg_msg *msg)
+LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p, struct inode *inode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), userns_create, const struct cred *cred)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ipc_permission, struct kern_ipc_perm *ipcp, short flag)
+LSM_HOOK(void, LSM_RET_VOID, ipc_getsecid, struct kern_ipc_perm *ipcp, u32 *secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_msg_alloc_security, struct msg_msg *msg)
 LSM_HOOK(void, LSM_RET_VOID, msg_msg_free_security, struct msg_msg *msg)
-LSM_HOOK(int, 0, msg_queue_alloc_security, struct kern_ipc_perm *perm)
-LSM_HOOK(void, LSM_RET_VOID, msg_queue_free_security,
-	struct kern_ipc_perm *perm)
-LSM_HOOK(int, 0, msg_queue_associate, struct kern_ipc_perm *perm, int msqflg)
-LSM_HOOK(int, 0, msg_queue_msgctl, struct kern_ipc_perm *perm, int cmd)
-LSM_HOOK(int, 0, msg_queue_msgsnd, struct kern_ipc_perm *perm,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_queue_alloc_security, struct kern_ipc_perm *perm)
+LSM_HOOK(void, LSM_RET_VOID, msg_queue_free_security, struct kern_ipc_perm *perm)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_queue_associate, struct kern_ipc_perm *perm,
+	int msqflg)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_queue_msgctl, struct kern_ipc_perm *perm, int cmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_queue_msgsnd, struct kern_ipc_perm *perm,
 	struct msg_msg *msg, int msqflg)
-LSM_HOOK(int, 0, msg_queue_msgrcv, struct kern_ipc_perm *perm,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), msg_queue_msgrcv, struct kern_ipc_perm *perm,
 	struct msg_msg *msg, struct task_struct *target, long type, int mode)
-LSM_HOOK(int, 0, shm_alloc_security, struct kern_ipc_perm *perm)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), shm_alloc_security, struct kern_ipc_perm *perm)
 LSM_HOOK(void, LSM_RET_VOID, shm_free_security, struct kern_ipc_perm *perm)
-LSM_HOOK(int, 0, shm_associate, struct kern_ipc_perm *perm, int shmflg)
-LSM_HOOK(int, 0, shm_shmctl, struct kern_ipc_perm *perm, int cmd)
-LSM_HOOK(int, 0, shm_shmat, struct kern_ipc_perm *perm, char __user *shmaddr,
-	int shmflg)
-LSM_HOOK(int, 0, sem_alloc_security, struct kern_ipc_perm *perm)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), shm_associate, struct kern_ipc_perm *perm, int shmflg)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), shm_shmctl, struct kern_ipc_perm *perm, int cmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), shm_shmat, struct kern_ipc_perm *perm,
+	char __user *shmaddr, int shmflg)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sem_alloc_security, struct kern_ipc_perm *perm)
 LSM_HOOK(void, LSM_RET_VOID, sem_free_security, struct kern_ipc_perm *perm)
-LSM_HOOK(int, 0, sem_associate, struct kern_ipc_perm *perm, int semflg)
-LSM_HOOK(int, 0, sem_semctl, struct kern_ipc_perm *perm, int cmd)
-LSM_HOOK(int, 0, sem_semop, struct kern_ipc_perm *perm, struct sembuf *sops,
-	unsigned nsops, int alter)
-LSM_HOOK(int, 0, netlink_send, struct sock *sk, struct sk_buff *skb)
-LSM_HOOK(void, LSM_RET_VOID, d_instantiate, struct dentry *dentry,
-	struct inode *inode)
-LSM_HOOK(int, -EOPNOTSUPP, getselfattr, unsigned int attr,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sem_associate, struct kern_ipc_perm *perm, int semflg)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sem_semctl, struct kern_ipc_perm *perm, int cmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sem_semop, struct kern_ipc_perm *perm,
+	struct sembuf *sops, unsigned nsops, int alter)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), netlink_send, struct sock *sk, struct sk_buff *skb)
+LSM_HOOK(void, LSM_RET_VOID, d_instantiate, struct dentry *dentry, struct inode *inode)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, INT_MAX), getselfattr, unsigned int attr,
 	struct lsm_ctx __user *ctx, u32 *size, u32 flags)
-LSM_HOOK(int, -EOPNOTSUPP, setselfattr, unsigned int attr,
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), setselfattr, unsigned int attr,
 	struct lsm_ctx *ctx, u32 size, u32 flags)
-LSM_HOOK(int, -EINVAL, getprocattr, struct task_struct *p, const char *name,
-	char **value)
-LSM_HOOK(int, -EINVAL, setprocattr, const char *name, void *value, size_t size)
-LSM_HOOK(int, 0, ismaclabel, const char *name)
-LSM_HOOK(int, -EOPNOTSUPP, secid_to_secctx, u32 secid, char **secdata,
+LSM_HOOK(int, LSM_RET_INT(-EINVAL, -MAX_ERRNO, INT_MAX), getprocattr, struct task_struct *p,
+	const char *name, char **value)
+LSM_HOOK(int, LSM_RET_INT(-EINVAL, -MAX_ERRNO, INT_MAX), setprocattr, const char *name, void *value,
+	size_t size)
+LSM_HOOK(int, LSM_RET_INT(0, 0, 1), ismaclabel, const char *name)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), secid_to_secctx, u32 secid, char **secdata,
 	u32 *seclen)
-LSM_HOOK(int, 0, secctx_to_secid, const char *secdata, u32 seclen, u32 *secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), secctx_to_secid, const char *secdata, u32 seclen,
+	u32 *secid)
 LSM_HOOK(void, LSM_RET_VOID, release_secctx, char *secdata, u32 seclen)
 LSM_HOOK(void, LSM_RET_VOID, inode_invalidate_secctx, struct inode *inode)
-LSM_HOOK(int, 0, inode_notifysecctx, struct inode *inode, void *ctx, u32 ctxlen)
-LSM_HOOK(int, 0, inode_setsecctx, struct dentry *dentry, void *ctx, u32 ctxlen)
-LSM_HOOK(int, -EOPNOTSUPP, inode_getsecctx, struct inode *inode, void **ctx,
-	u32 *ctxlen)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_notifysecctx, struct inode *inode, void *ctx,
+	u32 ctxlen)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inode_setsecctx, struct dentry *dentry, void *ctx,
+	u32 ctxlen)
+LSM_HOOK(int, LSM_RET_INT(-EOPNOTSUPP, -MAX_ERRNO, 0), inode_getsecctx, struct inode *inode,
+	void **ctx, u32 *ctxlen)
 #if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
-LSM_HOOK(int, 0, post_notification, const struct cred *w_cred,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), post_notification, const struct cred *w_cred,
 	const struct cred *cred, struct watch_notification *n)
 #endif /* CONFIG_SECURITY && CONFIG_WATCH_QUEUE */
 #if defined(CONFIG_SECURITY) && defined(CONFIG_KEY_NOTIFICATIONS)
-LSM_HOOK(int, 0, watch_key, struct key *key)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), watch_key, struct key *key)
 #endif /* CONFIG_SECURITY && CONFIG_KEY_NOTIFICATIONS */
 #ifdef CONFIG_SECURITY_NETWORK
-LSM_HOOK(int, 0, unix_stream_connect, struct sock *sock, struct sock *other,
-	struct sock *newsk)
-LSM_HOOK(int, 0, unix_may_send, struct socket *sock, struct socket *other)
-LSM_HOOK(int, 0, socket_create, int family, int type, int protocol, int kern)
-LSM_HOOK(int, 0, socket_post_create, struct socket *sock, int family, int type,
-	int protocol, int kern)
-LSM_HOOK(int, 0, socket_socketpair, struct socket *socka, struct socket *sockb)
-LSM_HOOK(int, 0, socket_bind, struct socket *sock, struct sockaddr *address,
-	int addrlen)
-LSM_HOOK(int, 0, socket_connect, struct socket *sock, struct sockaddr *address,
-	int addrlen)
-LSM_HOOK(int, 0, socket_listen, struct socket *sock, int backlog)
-LSM_HOOK(int, 0, socket_accept, struct socket *sock, struct socket *newsock)
-LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg,
-	int size)
-LSM_HOOK(int, 0, socket_recvmsg, struct socket *sock, struct msghdr *msg,
-	int size, int flags)
-LSM_HOOK(int, 0, socket_getsockname, struct socket *sock)
-LSM_HOOK(int, 0, socket_getpeername, struct socket *sock)
-LSM_HOOK(int, 0, socket_getsockopt, struct socket *sock, int level, int optname)
-LSM_HOOK(int, 0, socket_setsockopt, struct socket *sock, int level, int optname)
-LSM_HOOK(int, 0, socket_shutdown, struct socket *sock, int how)
-LSM_HOOK(int, 0, socket_sock_rcv_skb, struct sock *sk, struct sk_buff *skb)
-LSM_HOOK(int, -ENOPROTOOPT, socket_getpeersec_stream, struct socket *sock,
-	sockptr_t optval, sockptr_t optlen, unsigned int len)
-LSM_HOOK(int, -ENOPROTOOPT, socket_getpeersec_dgram, struct socket *sock,
-	struct sk_buff *skb, u32 *secid)
-LSM_HOOK(int, 0, sk_alloc_security, struct sock *sk, int family, gfp_t priority)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), unix_stream_connect, struct sock *sock,
+	struct sock *other, struct sock *newsk)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), unix_may_send, struct socket *sock,
+	struct socket *other)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_create, int family, int type, int protocol,
+	int kern)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_post_create, struct socket *sock, int family,
+	int type, int protocol, int kern)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_socketpair, struct socket *socka,
+	struct socket *sockb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_bind, struct socket *sock,
+	struct sockaddr *address, int addrlen)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_connect, struct socket *sock,
+	struct sockaddr *address, int addrlen)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_listen, struct socket *sock, int backlog)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_accept, struct socket *sock,
+	struct socket *newsock)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_sendmsg, struct socket *sock,
+	struct msghdr *msg, int size)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_recvmsg, struct socket *sock,
+	struct msghdr *msg, int size, int flags)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_getsockname, struct socket *sock)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_getpeername, struct socket *sock)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_getsockopt, struct socket *sock, int level,
+	int optname)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_setsockopt, struct socket *sock, int level,
+	int optname)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_shutdown, struct socket *sock, int how)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), socket_sock_rcv_skb, struct sock *sk,
+	struct sk_buff *skb)
+LSM_HOOK(int, LSM_RET_INT(-ENOPROTOOPT, -MAX_ERRNO, 0), socket_getpeersec_stream,
+	struct socket *sock, sockptr_t optval, sockptr_t optlen, unsigned int len)
+LSM_HOOK(int, LSM_RET_INT(-ENOPROTOOPT, -MAX_ERRNO, 0), socket_getpeersec_dgram,
+	struct socket *sock, struct sk_buff *skb, u32 *secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sk_alloc_security, struct sock *sk, int family,
+	gfp_t priority)
 LSM_HOOK(void, LSM_RET_VOID, sk_free_security, struct sock *sk)
-LSM_HOOK(void, LSM_RET_VOID, sk_clone_security, const struct sock *sk,
-	struct sock *newsk)
+LSM_HOOK(void, LSM_RET_VOID, sk_clone_security, const struct sock *sk, struct sock *newsk)
 LSM_HOOK(void, LSM_RET_VOID, sk_getsecid, const struct sock *sk, u32 *secid)
 LSM_HOOK(void, LSM_RET_VOID, sock_graft, struct sock *sk, struct socket *parent)
-LSM_HOOK(int, 0, inet_conn_request, const struct sock *sk, struct sk_buff *skb,
-	struct request_sock *req)
-LSM_HOOK(void, LSM_RET_VOID, inet_csk_clone, struct sock *newsk,
-	const struct request_sock *req)
-LSM_HOOK(void, LSM_RET_VOID, inet_conn_established, struct sock *sk,
-	struct sk_buff *skb)
-LSM_HOOK(int, 0, secmark_relabel_packet, u32 secid)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), inet_conn_request, const struct sock *sk,
+	struct sk_buff *skb, struct request_sock *req)
+LSM_HOOK(void, LSM_RET_VOID, inet_csk_clone, struct sock *newsk, const struct request_sock *req)
+LSM_HOOK(void, LSM_RET_VOID, inet_conn_established, struct sock *sk, struct sk_buff *skb)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), secmark_relabel_packet, u32 secid)
 LSM_HOOK(void, LSM_RET_VOID, secmark_refcount_inc, void)
 LSM_HOOK(void, LSM_RET_VOID, secmark_refcount_dec, void)
 LSM_HOOK(void, LSM_RET_VOID, req_classify_flow, const struct request_sock *req,
 	struct flowi_common *flic)
-LSM_HOOK(int, 0, tun_dev_alloc_security, void **security)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), tun_dev_alloc_security, void **security)
 LSM_HOOK(void, LSM_RET_VOID, tun_dev_free_security, void *security)
-LSM_HOOK(int, 0, tun_dev_create, void)
-LSM_HOOK(int, 0, tun_dev_attach_queue, void *security)
-LSM_HOOK(int, 0, tun_dev_attach, struct sock *sk, void *security)
-LSM_HOOK(int, 0, tun_dev_open, void *security)
-LSM_HOOK(int, 0, sctp_assoc_request, struct sctp_association *asoc,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), tun_dev_create, void)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), tun_dev_attach_queue, void *security)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), tun_dev_attach, struct sock *sk, void *security)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), tun_dev_open, void *security)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sctp_assoc_request, struct sctp_association *asoc,
 	struct sk_buff *skb)
-LSM_HOOK(int, 0, sctp_bind_connect, struct sock *sk, int optname,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sctp_bind_connect, struct sock *sk, int optname,
 	struct sockaddr *address, int addrlen)
 LSM_HOOK(void, LSM_RET_VOID, sctp_sk_clone, struct sctp_association *asoc,
 	struct sock *sk, struct sock *newsk)
-LSM_HOOK(int, 0, sctp_assoc_established, struct sctp_association *asoc,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), sctp_assoc_established, struct sctp_association *asoc,
 	struct sk_buff *skb)
-LSM_HOOK(int, 0, mptcp_add_subflow, struct sock *sk, struct sock *ssk)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), mptcp_add_subflow, struct sock *sk, struct sock *ssk)
 #endif /* CONFIG_SECURITY_NETWORK */
 #ifdef CONFIG_SECURITY_INFINIBAND
-LSM_HOOK(int, 0, ib_pkey_access, void *sec, u64 subnet_prefix, u16 pkey)
-LSM_HOOK(int, 0, ib_endport_manage_subnet, void *sec, const char *dev_name,
-	u8 port_num)
-LSM_HOOK(int, 0, ib_alloc_security, void **sec)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ib_pkey_access, void *sec, u64 subnet_prefix, u16 pkey)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ib_endport_manage_subnet, void *sec,
+	const char *dev_name, u8 port_num)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), ib_alloc_security, void **sec)
 LSM_HOOK(void, LSM_RET_VOID, ib_free_security, void *sec)
 #endif /* CONFIG_SECURITY_INFINIBAND */
 #ifdef CONFIG_SECURITY_NETWORK_XFRM
-LSM_HOOK(int, 0, xfrm_policy_alloc_security, struct xfrm_sec_ctx **ctxp,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_policy_alloc_security, struct xfrm_sec_ctx **ctxp,
 	struct xfrm_user_sec_ctx *sec_ctx, gfp_t gfp)
-LSM_HOOK(int, 0, xfrm_policy_clone_security, struct xfrm_sec_ctx *old_ctx,
-	struct xfrm_sec_ctx **new_ctx)
-LSM_HOOK(void, LSM_RET_VOID, xfrm_policy_free_security,
-	struct xfrm_sec_ctx *ctx)
-LSM_HOOK(int, 0, xfrm_policy_delete_security, struct xfrm_sec_ctx *ctx)
-LSM_HOOK(int, 0, xfrm_state_alloc, struct xfrm_state *x,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_policy_clone_security,
+	struct xfrm_sec_ctx *old_ctx, struct xfrm_sec_ctx **new_ctx)
+LSM_HOOK(void, LSM_RET_VOID, xfrm_policy_free_security, struct xfrm_sec_ctx *ctx)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_policy_delete_security, struct xfrm_sec_ctx *ctx)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_state_alloc, struct xfrm_state *x,
 	struct xfrm_user_sec_ctx *sec_ctx)
-LSM_HOOK(int, 0, xfrm_state_alloc_acquire, struct xfrm_state *x,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_state_alloc_acquire, struct xfrm_state *x,
 	struct xfrm_sec_ctx *polsec, u32 secid)
 LSM_HOOK(void, LSM_RET_VOID, xfrm_state_free_security, struct xfrm_state *x)
-LSM_HOOK(int, 0, xfrm_state_delete_security, struct xfrm_state *x)
-LSM_HOOK(int, 0, xfrm_policy_lookup, struct xfrm_sec_ctx *ctx, u32 fl_secid)
-LSM_HOOK(int, 1, xfrm_state_pol_flow_match, struct xfrm_state *x,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_state_delete_security, struct xfrm_state *x)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_policy_lookup, struct xfrm_sec_ctx *ctx,
+	u32 fl_secid)
+LSM_HOOK(int, LSM_RET_INT(1, INT_MIN, INT_MAX), xfrm_state_pol_flow_match, struct xfrm_state *x,
 	struct xfrm_policy *xp, const struct flowi_common *flic)
-LSM_HOOK(int, 0, xfrm_decode_session, struct sk_buff *skb, u32 *secid,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), xfrm_decode_session, struct sk_buff *skb, u32 *secid,
 	int ckall)
 #endif /* CONFIG_SECURITY_NETWORK_XFRM */
 /* key management security hooks */
 #ifdef CONFIG_KEYS
-LSM_HOOK(int, 0, key_alloc, struct key *key, const struct cred *cred,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), key_alloc, struct key *key, const struct cred *cred,
 	unsigned long flags)
 LSM_HOOK(void, LSM_RET_VOID, key_free, struct key *key)
-LSM_HOOK(int, 0, key_permission, key_ref_t key_ref, const struct cred *cred,
-	enum key_need_perm need_perm)
-LSM_HOOK(int, 0, key_getsecurity, struct key *key, char **buffer)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), key_permission, key_ref_t key_ref,
+	const struct cred *cred, enum key_need_perm need_perm)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, INT_MAX), key_getsecurity, struct key *key, char **buffer)
 LSM_HOOK(void, LSM_RET_VOID, key_post_create_or_update, struct key *keyring,
-	struct key *key, const void *payload, size_t payload_len,
-	unsigned long flags, bool create)
+	struct key *key, const void *payload, size_t payload_len, unsigned long flags, bool create)
 #endif /* CONFIG_KEYS */
 #ifdef CONFIG_AUDIT
-LSM_HOOK(int, 0, audit_rule_init, u32 field, u32 op, char *rulestr,
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), audit_rule_init, u32 field, u32 op, char *rulestr,
 	void **lsmrule)
-LSM_HOOK(int, 0, audit_rule_known, struct audit_krule *krule)
-LSM_HOOK(int, 0, audit_rule_match, u32 secid, u32 field, u32 op, void *lsmrule)
+LSM_HOOK(int, LSM_RET_INT(0, 0, 1), audit_rule_known, struct audit_krule *krule)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 1), audit_rule_match, u32 secid, u32 field, u32 op,
+	void *lsmrule)
 LSM_HOOK(void, LSM_RET_VOID, audit_rule_free, void *lsmrule)
 #endif /* CONFIG_AUDIT */
 #ifdef CONFIG_BPF_SYSCALL
-LSM_HOOK(int, 0, bpf, int cmd, union bpf_attr *attr, unsigned int size)
-LSM_HOOK(int, 0, bpf_map, struct bpf_map *map, fmode_t fmode)
-LSM_HOOK(int, 0, bpf_prog, struct bpf_prog *prog)
-LSM_HOOK(int, 0, bpf_map_create, struct bpf_map *map, union bpf_attr *attr,
-	struct bpf_token *token)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf, int cmd, union bpf_attr *attr, unsigned int size)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_map, struct bpf_map *map, fmode_t fmode)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_prog, struct bpf_prog *prog)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_map_create, struct bpf_map *map,
+	union bpf_attr *attr, struct bpf_token *token)
 LSM_HOOK(void, LSM_RET_VOID, bpf_map_free, struct bpf_map *map)
-LSM_HOOK(int, 0, bpf_prog_load, struct bpf_prog *prog, union bpf_attr *attr,
-	struct bpf_token *token)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_prog_load, struct bpf_prog *prog,
+	union bpf_attr *attr, struct bpf_token *token)
 LSM_HOOK(void, LSM_RET_VOID, bpf_prog_free, struct bpf_prog *prog)
-LSM_HOOK(int, 0, bpf_token_create, struct bpf_token *token, union bpf_attr *attr,
-	struct path *path)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_token_create, struct bpf_token *token,
+	union bpf_attr *attr, struct path *path)
 LSM_HOOK(void, LSM_RET_VOID, bpf_token_free, struct bpf_token *token)
-LSM_HOOK(int, 0, bpf_token_cmd, const struct bpf_token *token, enum bpf_cmd cmd)
-LSM_HOOK(int, 0, bpf_token_capable, const struct bpf_token *token, int cap)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_token_cmd, const struct bpf_token *token,
+	enum bpf_cmd cmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), bpf_token_capable, const struct bpf_token *token,
+	int cap)
 #endif /* CONFIG_BPF_SYSCALL */
-LSM_HOOK(int, 0, locked_down, enum lockdown_reason what)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), locked_down, enum lockdown_reason what)
 #ifdef CONFIG_PERF_EVENTS
-LSM_HOOK(int, 0, perf_event_open, struct perf_event_attr *attr, int type)
-LSM_HOOK(int, 0, perf_event_alloc, struct perf_event *event)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), perf_event_open, struct perf_event_attr *attr,
+	int type)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), perf_event_alloc, struct perf_event *event)
 LSM_HOOK(void, LSM_RET_VOID, perf_event_free, struct perf_event *event)
-LSM_HOOK(int, 0, perf_event_read, struct perf_event *event)
-LSM_HOOK(int, 0, perf_event_write, struct perf_event *event)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), perf_event_read, struct perf_event *event)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), perf_event_write, struct perf_event *event)
 #endif /* CONFIG_PERF_EVENTS */
 #ifdef CONFIG_IO_URING
-LSM_HOOK(int, 0, uring_override_creds, const struct cred *new)
-LSM_HOOK(int, 0, uring_sqpoll, void)
-LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), uring_override_creds, const struct cred *new)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), uring_sqpoll, void)
+LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a2ade0ffe9e7..14690cad4fb9 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -98,12 +98,6 @@ static inline struct xattr *lsm_get_xattr_slot(struct xattr *xattrs,
 	return &xattrs[(*xattr_count)++];
 }
-/*
- * LSM_RET_VOID is used as the default value in LSM_HOOK definitions for void
- * LSM hooks (in include/linux/lsm_hook_defs.h).
- */
-#define LSM_RET_VOID ((void) 0)
-
 /*
  * Initializing a security_hook_list structure takes
  * up a lot of space in a source file. This macro takes
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 68240c3c6e7d..ee9d1a795334 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -18,6 +18,14 @@
 #include <linux/ima.h>
 #include <linux/bpf-cgroup.h>
+/*
+ * LSM_RET_VOID is used as the default value in LSM_HOOK definitions for void
+ * LSM hooks (in include/linux/lsm_hook_defs.h).
+ */
+#define LSM_RET_VOID ((void) 0)
+
+#define LSM_RET_INT(defval, ...) defval
+
 /* For every LSM hook that allows attachment of BPF programs, declare a nop
  * function where a BPF program can be attached.
  */
@@ -29,6 +37,8 @@ noinline RET bpf_lsm_##NAME(__VA_ARGS__)	\
 #include <linux/lsm_hook_defs.h>
 #undef LSM_HOOK
+#undef LSM_RET_INT
+#undef LSM_RET_VOID
 #define LSM_HOOK(RET, DEFAULT, NAME, ...) BTF_ID(func, bpf_lsm_##NAME)
 BTF_SET_START(bpf_lsm_hooks)
diff --git a/security/security.c b/security/security.c
index 7e118858b545..665c531497c4 100644
--- a/security/security.c
+++ b/security/security.c
@@ -834,6 +834,7 @@ int lsm_fill_user_ctx(struct lsm_ctx __user *uctx, u32 *uctx_len,
  * The macros below define static constants for the default value of each
  * LSM hook.
  */
+#define LSM_RET_INT(defval, ...) defval
 #define LSM_RET_DEFAULT(NAME)	(NAME##_default)
 #define DECLARE_LSM_RET_DEFAULT_void(DEFAULT, NAME)
 #define DECLARE_LSM_RET_DEFAULT_int(DEFAULT, NAME) \
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
>
> Add macro LSM_RET_INT to annotate the return integer type, the default
> return value, and the expected return range of each lsm hook.
>
> LSM_RET_INT is declared as:
>
> LSM_RET_INT(defval, min, max)
>
> where
>
> - defval is the default return value
> - min and max indicate the expected return range is [min, max]
>
> The return value range for each lsm hook is taken from the description
> in security/security.c.
>
> The expansion of LSM_RET_INT is unchanged, so the compiled output stays
> the same.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
>  include/linux/lsm_hooks.h     |   6 -
>  kernel/bpf/bpf_lsm.c          |  10 +
>  security/security.c           |   1 +
>  4 files changed, 313 insertions(+), 295 deletions(-)
>
> ...
>
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 334e00efbde4..708f515ffbf3 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -18,435 +18,448 @@
>   * The macro LSM_HOOK is used to define the data structures required by
>   * the LSM framework using the pattern:
>   *
> - * LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> + * LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
>   *
>   * struct security_hook_heads {
> - * #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> + * #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
>   * #include <linux/lsm_hook_defs.h>
>   * #undef LSM_HOOK
>   * };
>   */
> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
>  	const struct cred *to)
> -LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
>  	const struct cred *to)
> -LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
>  	const struct cred *to, const struct file *file)
I'm not overly excited about injecting these additional return value range annotations into the LSM hook definitions, especially since the vast majority of the hooks "returns 0 on success, negative values on error". I'd rather see some effort put into looking at the feasibility of converting some (all?) of the LSM hook return value exceptions into the more conventional 0/-ERRNO format. Unfortunately, I haven't had the time to look into that myself, but if you wanted to do that I think it would be a good thing.
On 6/7/2024 5:53 AM, Paul Moore wrote:
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
I'm not overly excited about injecting these additional return value range annotations into the LSM hook definitions, especially since the vast majority of the hooks "returns 0 on success, negative values on error". I'd rather see some effort put into looking at the feasibility of converting some (all?) of the LSM hook return value exceptions into the more conventional 0/-ERRNO format. Unfortunately, I haven't had the time to look into that myself, but if you wanted to do that I think it would be a good thing.
I agree that having all hooks return a consistent range of 0/-ERRNO is more elegant than adding return value range annotations. However, there are two issues that might need to be addressed first:
1. Compatibility
For instance, security_vm_enough_memory_mm() determines whether to set cap_sys_admin by checking if the hook vm_enough_memory returns a positive number. If we were to change the hook vm_enough_memory to return 0 to indicate the need for cap_sys_admin, then for the LSM BPF program currently returning 0, the interpretation of its return value would be reversed after the modification.
2. Expressing multiple non-error states using 0/-ERRNO
IIUC, although 0/-ERRNO can be used to express different errors, only 0 can be used for non-error state. If there are multiple non-error states, they cannot be distinguished. For example, security_inode_need_killpriv() returns < 0 on error, 0 if security_inode_killpriv() doesn't need to be called, and > 0 if security_inode_killpriv() does need to be called.
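[Editorial note: for readers unfamiliar with the second example, a hedged sketch of the caller pattern follows. The function name is hypothetical and the body only approximates the fs/attr.c usage rather than quoting it.]

/* Three outcomes flow out of one int: error, "no killpriv needed",
 * and "killpriv needed". The positive state has no 0/-ERRNO encoding.
 */
static int setattr_killpriv_check(struct dentry *dentry, struct iattr *attr)
{
	int rc = security_inode_need_killpriv(dentry);

	if (rc < 0)
		return rc;				/* error */
	if (rc > 0)
		attr->ia_valid |= ATTR_KILL_PRIV;	/* must kill priv */
	return 0;					/* rc == 0: nothing to do */
}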
On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 6/7/2024 5:53 AM, Paul Moore wrote:
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
I agree that having all hooks return a consistent range of 0/-ERRNO is more elegant than adding return value range annotations. However, there are two issues that might need to be addressed first:
- Compatibility
For instance, security_vm_enough_memory_mm() determines whether to set cap_sys_admin by checking if the hook vm_enough_memory returns a positive number. If we were to change the hook vm_enough_memory to return 0 to indicate the need for cap_sys_admin, then for the LSM BPF program currently returning 0, the interpretation of its return value would be reversed after the modification.
This is not an issue. bpf lsm progs are no different from other lsm-s. If the meaning of return value or arguments to lsm hook change all lsm-s need to adjust as well. Regardless of whether they are written as in-kernel lsm-s, bpf-lsm, or out-of-tree lsm-s.
- Expressing multiple non-error states using 0/-ERRNO
IIUC, although 0/-ERRNO can be used to express different errors, only 0 can be used for non-error state. If there are multiple non-error states, they cannot be distinguished. For example, security_inode_need_killpriv() returns < 0 on error, 0 if security_inode_killpriv() doesn't need to be called, and > 0 if security_inode_killpriv() does need to be called.
This looks like a problem indeed. Converting all hooks to 0/-errno doesn't look practical.
On 6/8/2024 6:54 AM, Alexei Starovoitov wrote:
On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 6/7/2024 5:53 AM, Paul Moore wrote:
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
- Expressing multiple non-error states using 0/-ERRNO
IIUC, although 0/-ERRNO can be used to express different errors, only 0 can be used for non-error state. If there are multiple non-error states, they cannot be distinguished. For example, security_inode_need_killpriv() returns < 0 on error, 0 if security_inode_killpriv() doesn't need to be called, and > 0 if security_inode_killpriv() does need to be called.
This looks like a problem indeed.
Hang on. There aren't really three states here. security_inode_killpriv() is called only on the security_inode_need_killpriv() > 0 case. I'm not looking at the code this instant, but adjusting the return to something like -ENOSYS (OK, maybe not a great choice, but you get the idea) instead of 0 in the don't call case and switching the positive value to 0 should work just fine.
We're working on getting the LSM interfaces to be more consistent. This particular pair of hooks is an example of why we need to do that.
Converting all hooks to 0/-errno doesn't look practical.
On Sun, Jun 9, 2024 at 1:39 PM Casey Schaufler casey@schaufler-ca.com wrote:
On 6/8/2024 6:54 AM, Alexei Starovoitov wrote:
On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 6/7/2024 5:53 AM, Paul Moore wrote:
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
I agree that having all hooks return a consistent range of 0/-ERRNO is more elegant than adding return value range annotations. However, there are two issues that might need to be addressed first:
- Compatibility
For instance, security_vm_enough_memory_mm() determines whether to set cap_sys_admin by checking if the hook vm_enough_memory returns a positive number. If we were to change the hook vm_enough_memory to return 0 to indicate the need for cap_sys_admin, then for the LSM BPF program currently returning 0, the interpretation of its return value would be reversed after the modification.
This is not an issue. bpf lsm progs are no different from other lsm-s. If the meaning of return value or arguments to lsm hook change all lsm-s need to adjust as well. Regardless of whether they are written as in-kernel lsm-s, bpf-lsm, or out-of-tree lsm-s.
Yes, there are no guarantees around compatibility in the kernel/LSM interface from one kernel release to the next. If we need to change a LSM hook, we can change a LSM hook; the important part is that when we change the LSM hook we must make sure to update all of the in-tree LSMs which make use of that hook.
- Expressing multiple non-error states using 0/-ERRNO
IIUC, although 0/-ERRNO can be used to express different errors, only 0 can be used for non-error state. If there are multiple non-error states, they cannot be distinguished. For example, security_inode_need_killpriv() returns < 0 on error, 0 if security_inode_killpriv() doesn't need to be called, and > 0 if security_inode_killpriv() does need to be called.
This looks like a problem indeed.
Hang on. There aren't really three states here. security_inode_killpriv() is called only on the security_inode_need_killpriv() > 0 case. I'm not looking at the code this instant, but adjusting the return to something like -ENOSYS (OK, maybe not a great choice, but you get the idea) instead of 0 in the don't call case and switching the positive value to 0 should work just fine.
We're working on getting the LSM interfaces to be more consistent. This particular pair of hooks is an example of why we need to do that.
Yes, exactly. Aside from the issues with BPF verification, we've seen problems in the past with LSM hooks that differ from the usual "0 on success, negative values on failure" pattern. I'm not saying it is possible to convert all of the hooks to fit this model, but even if we can only adjust one or two I think that is still a win.
As far as security_inode_need_killpriv()/security_inode_killpriv() is concerned, one possibility would be to shift the ATTR_KILL_PRIV set/mask operation into the LSM hook, something like this:
[WARNING: completely untested, likely broken, yadda yadda]
/**
 * ...
 * Returns: Return 0 on success, negative values on failure. @attrs may be
 * updated on success.
 */
int security_inode_need_killpriv(struct dentry *dentry, unsigned int *attrs)
{
	int rc;

	rc = call_int_hook(inode_need_killpriv, dentry);
	if (rc < 0)
		return rc;
	if (rc > 0)
		*attrs |= ATTR_KILL_PRIV;
	else
		*attrs &= ~ATTR_KILL_PRIV;
	return 0;
}
Yes, that doesn't fix the problem for the individual LSMs, but it does make the hook a bit more consistent with the rest of the kernel.
On 6/10/2024 2:17 AM, Paul Moore wrote:
On Sun, Jun 9, 2024 at 1:39 PM Casey Schaufler casey@schaufler-ca.com wrote:
On 6/8/2024 6:54 AM, Alexei Starovoitov wrote:
On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 6/7/2024 5:53 AM, Paul Moore wrote:
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
I agree that having all hooks return a consistent range of 0/-ERRNO is more elegant than adding return value range annotations. However, there are two issues that might need to be addressed first:
- Compatibility
For instance, security_vm_enough_memory_mm() determines whether to set cap_sys_admin by checking if the hook vm_enough_memory returns a positive number. If we were to change the hook vm_enough_memory to return 0 to indicate the need for cap_sys_admin, then for the LSM BPF program currently returning 0, the interpretation of its return value would be reversed after the modification.
This is not an issue. bpf lsm progs are no different from other lsm-s. If the meaning of return value or arguments to lsm hook change all lsm-s need to adjust as well. Regardless of whether they are written as in-kernel lsm-s, bpf-lsm, or out-of-tree lsm-s.
Yes, there are no guarantees around compatibility in the kernel/LSM interface from one kernel release to the next. If we need to change a LSM hook, we can change a LSM hook; the important part is that when we change the LSM hook we must make sure to update all of the in-tree LSMs which make use of that hook.
Great, so there are no compatibility restrictions on either the LSM or the BPF side.
- Expressing multiple non-error states using 0/-ERRNO
IIUC, although 0/-ERRNO can be used to express different errors, only 0 can be used for non-error state. If there are multiple non-error states, they cannot be distinguished. For example, security_inode_need_killpriv() returns < 0 on error, 0 if security_inode_killpriv() doesn't need to be called, and > 0 if security_inode_killpriv() does need to be called.
This looks like a problem indeed.
Hang on. There aren't really three states here. security_inode_killpriv() is called only on the security_inode_need_killpriv() > 0 case. I'm not looking at the code this instant, but adjusting the return to something like -ENOSYS (OK, maybe not a great choice, but you get the idea) instead of 0 in the don't call case and switching the positive value to 0 should work just fine.
We're working on getting the LSM interfaces to be more consistent. This particular pair of hooks is an example of why we need to do that.
Yes, exactly. Aside from the issues with BPF verification, we've seen problems in the past with LSM hooks that differ from the usual "0 on success, negative values on failure" pattern. I'm not saying it is possible to convert all of the hooks to fit this model, but even if we can only adjust one or two I think that is still a win.
As far as security_inode_need_killpriv()/security_inode_killpriv() is concerned, one possibility would be to shift the ATTR_KILL_PRIV set/mask operation into the LSM hook, something like this:
[...]
Yes, that doesn't fix the problem for the individual LSMs, but it does make the hook a bit more consistent with the rest of the kernel.
Alright, I'll give it a try. Perhaps in the end there will be a few hooks that cannot be converted. If that's the case, it seems we can just provide exceptions to the return value descriptions for these unconverted hooks, maybe on the BPF side only, thus avoiding the need to annotate return values for all LSM hooks.
On Mon, Jun 10, 2024 at 10:25 PM Xu Kuohai xukuohai@huaweicloud.com wrote:
Alright, I'll give it a try. Perhaps in the end there will be a few hooks that cannot be converted. If that's the case, it seems we can just provide exceptions to the return value descriptions for these unconverted hooks, maybe on the BPF side only, thus avoiding the need to annotate return values for all LSM hooks.
Thanks. Yes, while I don't think we will be able to normalize all of the hooks to 0/-ERRNO, my guess is that we can reduce the exceptions to a manageable count.
From: Xu Kuohai xukuohai@huawei.com
Add a helper to read the lsm hook return value range. The following patch will use this information to verify lsm hook return values in the bpf verifier.
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 include/linux/bpf_lsm.h |  8 ++++++
 kernel/bpf/bpf_lsm.c    | 54 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 61 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 1de7ece5d36d..e51c042abf43 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -9,6 +9,7 @@

 #include <linux/sched.h>
 #include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
 #include <linux/lsm_hooks.h>

 #ifdef CONFIG_BPF_LSM
@@ -45,6 +46,8 @@ void bpf_inode_storage_free(struct inode *inode);
 void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);

+int bpf_lsm_get_retval_range(const struct bpf_prog *prog,
+			     struct bpf_retval_range *range);
 #else /* !CONFIG_BPF_LSM */

 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -78,6 +81,11 @@ static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 {
 }

+static inline int bpf_lsm_get_retval_range(const struct bpf_prog *prog,
+					   struct bpf_retval_range *range)
+{
+	return -EOPNOTSUPP;
+}
 #endif /* CONFIG_BPF_LSM */

 #endif /* _LINUX_BPF_LSM_H */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index ee9d1a795334..4e1a4a333000 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -11,7 +11,6 @@
 #include <linux/lsm_hooks.h>
 #include <linux/bpf_lsm.h>
 #include <linux/kallsyms.h>
-#include <linux/bpf_verifier.h>
 #include <net/bpf_sk_storage.h>
 #include <linux/bpf_local_storage.h>
 #include <linux/btf_ids.h>
@@ -40,6 +39,29 @@ noinline RET bpf_lsm_##NAME(__VA_ARGS__) \
 #undef LSM_RET_INT
 #undef LSM_RET_VOID

+struct lsm_retval_desc {
+	unsigned long func_addr;
+	int minval;
+	int maxval;
+};
+
+#define LSM_RET_INT(defval, min, max) min, max
+
+#define LSM_HOOK_RETVAL_int(NAME, ...) \
+{ (unsigned long)&bpf_lsm_##NAME, __VA_ARGS__ },
+
+#define LSM_HOOK_RETVAL_void(NAME, ...)
+
+#define LSM_HOOK(RET, RET_DESCRIPTION, NAME, ...) \
+LSM_HOOK_RETVAL_##RET(NAME, RET_DESCRIPTION)
+
+static struct lsm_retval_desc lsm_retvalues[] = {
+#include <linux/lsm_hook_defs.h>
+};
+#undef LSM_HOOK
+#undef LSM_RET_INT
+#undef LSM_RET_VOID
+
 #define LSM_HOOK(RET, DEFAULT, NAME, ...) BTF_ID(func, bpf_lsm_##NAME)
 BTF_SET_START(bpf_lsm_hooks)
 #include <linux/lsm_hook_defs.h>
@@ -399,3 +421,33 @@ const struct bpf_verifier_ops lsm_verifier_ops = {
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
 };
+
+static struct lsm_retval_desc *find_retval_desc(const char *func_name)
+{
+	unsigned long addr;
+	struct lsm_retval_desc *desc;
+
+	addr = kallsyms_lookup_name(func_name);
+	for (unsigned int i = 0U; i < ARRAY_SIZE(lsm_retvalues); i++) {
+		desc = &lsm_retvalues[i];
+		if (addr == desc->func_addr)
+			return desc;
+	}
+
+	return NULL;
+}
+
+int bpf_lsm_get_retval_range(const struct bpf_prog *prog,
+			     struct bpf_retval_range *retval_range)
+{
+	struct lsm_retval_desc *desc;
+
+	desc = find_retval_desc(prog->aux->attach_func_name);
+	if (desc == NULL)
+		return -ENODEV;
+
+	retval_range->minval = desc->minval;
+	retval_range->maxval = desc->maxval;
+
+	return 0;
+}
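[Editorial note: to see what the table machinery above produces, here is a hand-expanded sketch for one int hook and one void hook. The hook names are illustrative, not quoted from the patch.]

/* LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), file_permission, ...)
 * goes through LSM_HOOK_RETVAL_int() and emits a table entry, while a
 * void hook goes through LSM_HOOK_RETVAL_void() and emits nothing:
 */
static struct lsm_retval_desc lsm_retvalues[] = {
	{ (unsigned long)&bpf_lsm_file_permission, -MAX_ERRNO, 0 },
	/* LSM_HOOK(void, LSM_RET_VOID, inode_free_security, ...) -> (empty) */
};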
From: Xu Kuohai xukuohai@huawei.com
A bpf prog returning positive number attached to file_alloc_security hook will make kernel panic.
The reason is that the positive number returned by the bpf prog is not a valid errno and cannot be filtered out by IS_ERR, which the file system uses to check for errors. As a result, the file system uses this positive number as a file pointer, causing a panic.

Considering that the hook file_alloc_security never returned a positive number before bpf lsm was introduced, and that other bpf lsm hooks may have the same problem, this patch adds an lsm return value check to the bpf verifier to ensure no unpredicted value is returned by an lsm bpf prog.
Fixes: 520b7aa00d8c ("bpf: lsm: Initialize the BPF LSM hooks")
Reported-by: Xin Liu liuxin350@huawei.com
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/btf.c      |  5 +++-
 kernel/bpf/verifier.c | 57 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 55 insertions(+), 8 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5034c1b4ded7..7aedb4827a94 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -917,6 +917,7 @@ struct bpf_insn_access_aux {
 		};
 	};
 	struct bpf_verifier_log *log; /* for verbose logs */
+	bool is_retval; /* is accessing function return value ? */
 };

 static inline void
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 90c4a32d89ff..d593684d80c6 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6227,8 +6227,11 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	if (arg == nr_args) {
 		switch (prog->expected_attach_type) {
-		case BPF_LSM_CGROUP:
 		case BPF_LSM_MAC:
+			/* mark we are accessing the return value */
+			info->is_retval = true;
+			fallthrough;
+		case BPF_LSM_CGROUP:
 		case BPF_TRACE_FEXIT:
 			/* When LSM programs are attached to void LSM hooks
 			 * they use FEXIT trampolines and when attached to
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2aad6d90550f..05c7c5f2bec0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2321,6 +2321,25 @@ static void mark_reg_unknown(struct bpf_verifier_env *env,
 	__mark_reg_unknown(env, regs + regno);
 }

+static int __mark_reg_s32_range(struct bpf_verifier_env *env,
+				struct bpf_reg_state *regs,
+				u32 regno,
+				s32 s32_min,
+				s32 s32_max)
+{
+	struct bpf_reg_state *reg = regs + regno;
+
+	reg->s32_min_value = max_t(s32, reg->s32_min_value, s32_min);
+	reg->s32_max_value = min_t(s32, reg->s32_max_value, s32_max);
+
+	reg->smin_value = max_t(s64, reg->smin_value, s32_min);
+	reg->smax_value = min_t(s64, reg->smax_value, s32_max);
+
+	reg_bounds_sync(reg);
+
+	return reg_bounds_sanity_check(env, reg, "s32_range");
+}
+
 static void __mark_reg_not_init(const struct bpf_verifier_env *env,
 				struct bpf_reg_state *reg)
 {
@@ -5555,11 +5574,12 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
 /* check access to 'struct bpf_context' fields.  Supports fixed offsets only */
 static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, int size,
 			    enum bpf_access_type t, enum bpf_reg_type *reg_type,
-			    struct btf **btf, u32 *btf_id)
+			    struct btf **btf, u32 *btf_id, bool *is_retval)
 {
 	struct bpf_insn_access_aux info = {
 		.reg_type = *reg_type,
 		.log = &env->log,
+		.is_retval = false,
 	};

 	if (env->ops->is_valid_access &&
@@ -5572,6 +5592,7 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
 		 * type of narrower access.
 		 */
 		*reg_type = info.reg_type;
+		*is_retval = info.is_retval;

 		if (base_type(*reg_type) == PTR_TO_BTF_ID) {
 			*btf = info.btf;
@@ -6725,6 +6746,17 @@ static int check_stack_access_within_bounds(
 	return grow_stack_state(env, state, -min_off /* size */);
 }

+static bool get_func_retval_range(struct bpf_prog *prog,
+				  struct bpf_retval_range *range)
+{
+	if (prog->type == BPF_PROG_TYPE_LSM &&
+	    prog->expected_attach_type == BPF_LSM_MAC &&
+	    !bpf_lsm_get_retval_range(prog, range)) {
+		return true;
+	}
+	return false;
+}
+
 /* check whether memory at (regno + off) is accessible for t = (read | write)
  * if t==write, value_regno is a register which value is stored into memory
  * if t==read, value_regno is a register which will receive the value from memory
@@ -6829,6 +6861,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 		if (!err && value_regno >= 0 && (t == BPF_READ || rdonly_mem))
 			mark_reg_unknown(env, regs, value_regno);
 	} else if (reg->type == PTR_TO_CTX) {
+		bool is_retval = false;
+		struct bpf_retval_range range;
 		enum bpf_reg_type reg_type = SCALAR_VALUE;
 		struct btf *btf = NULL;
 		u32 btf_id = 0;
@@ -6844,7 +6878,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 			return err;

 		err = check_ctx_access(env, insn_idx, off, size, t, &reg_type, &btf,
-				       &btf_id);
+				       &btf_id, &is_retval);
 		if (err)
 			verbose_linfo(env, insn_idx, "; ");
 		if (!err && t == BPF_READ && value_regno >= 0) {
@@ -6853,7 +6887,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 			 * case, we know the offset is zero.
 			 */
 			if (reg_type == SCALAR_VALUE) {
-				mark_reg_unknown(env, regs, value_regno);
+				if (is_retval && get_func_retval_range(env->prog, &range)) {
+					err = __mark_reg_s32_range(env, regs, value_regno,
+								   range.minval, range.maxval);
+					if (err)
+						return err;
+				} else {
+					mark_reg_unknown(env, regs, value_regno);
+				}
 			} else {
 				mark_reg_known_zero(env, regs, value_regno);
@@ -15492,10 +15533,12 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 	case BPF_PROG_TYPE_LSM:
 		if (env->prog->expected_attach_type != BPF_LSM_CGROUP) {
-			/* Regular BPF_PROG_TYPE_LSM programs can return
-			 * any value.
-			 */
-			return 0;
+			/* no range found, any return value is allowed */
+			if (!get_func_retval_range(env->prog, &range))
+				return 0;
+			/* no restricted range, any return value is allowed */
+			if (range.minval == S32_MIN && range.maxval == S32_MAX)
+				return 0;
 		}
 		if (!env->prog->aux->attach_func_proto->type) {
 			/* Make sure programs that attach to void
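[Editorial note: to make the new check concrete, here is a hedged sketch of the kind of program the cover letter's panic came from; with this patch the verifier refuses to load it instead of letting the bogus value reach alloc_file(). The prog name is hypothetical.]

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm/file_alloc_security")
int BPF_PROG(bad_ret, struct file *file)
{
	/* 1 is not a valid 0/-ERRNO result for this hook; the verifier
	 * now rejects the load with a "should have been in [-4095, 0]"
	 * style error instead of allowing a runtime panic.
	 */
	return 1;
}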
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
Acked-by: Eduard Zingerman eddyz87@gmail.com
From: Xu Kuohai xukuohai@huawei.com
Add a disabled hooks list for bpf lsm. Progs attached to the listed hooks will be rejected by the verifier.
Suggested-by: KP Singh kpsingh@kernel.org
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 kernel/bpf/bpf_lsm.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 4e1a4a333000..7f5648b404f2 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -68,6 +68,12 @@ BTF_SET_START(bpf_lsm_hooks)
 #undef LSM_HOOK
 BTF_SET_END(bpf_lsm_hooks)

+BTF_SET_START(bpf_lsm_disabled_hooks)
+BTF_ID(func, bpf_lsm_getprocattr)
+BTF_ID(func, bpf_lsm_setprocattr)
+BTF_ID(func, bpf_lsm_ismaclabel)
+BTF_SET_END(bpf_lsm_disabled_hooks)
+
 /* List of LSM hooks that should operate on 'current' cgroup regardless
  * of function signature.
  */
@@ -129,15 +135,25 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
			const struct bpf_prog *prog)
 {
+	u32 btf_id = prog->aux->attach_btf_id;
+	const char *func_name = prog->aux->attach_func_name;
+
 	if (!prog->gpl_compatible) {
 		bpf_log(vlog,
 			"LSM programs must have a GPL compatible license\n");
 		return -EINVAL;
 	}

-	if (!btf_id_set_contains(&bpf_lsm_hooks, prog->aux->attach_btf_id)) {
+	if (btf_id_set_contains(&bpf_lsm_disabled_hooks, btf_id)) {
+		bpf_log(vlog,
+			"attach_btf_id %u points to disabled bpf lsm hook %s\n",
+			btf_id, func_name);
+		return -EINVAL;
+	}
+
+	if (!btf_id_set_contains(&bpf_lsm_hooks, btf_id)) {
 		bpf_log(vlog, "attach_btf_id %u points to wrong type name %s\n",
-			prog->aux->attach_btf_id, prog->aux->attach_func_name);
+			btf_id, func_name);
 		return -EINVAL;
 	}
From: Xu Kuohai xukuohai@huawei.com
LSM and tracing bpf programs are attached to kernel functions which may have different types: the hooked functions may have different parameters, different return types, or different return ranges. Progs attached to different hook types may therefore receive different context structures and are subject to different return constraints, so they should not be allowed to call each other via tail call.
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 include/linux/bpf.h |  1 +
 kernel/bpf/core.c   | 22 ++++++++++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7aedb4827a94..dea7f1bdd2e6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -292,6 +292,7 @@ struct bpf_map {
 	 * same prog type, JITed flag and xdp_has_frags flag.
 	 */
 	struct {
+		const struct btf_type *attach_func_proto;
 		spinlock_t lock;
 		enum bpf_prog_type type;
 		bool jited;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index a41718eaeefe..6dd176481b71 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2303,6 +2303,7 @@ bool bpf_prog_map_compatible(struct bpf_map *map,
 {
 	enum bpf_prog_type prog_type = resolve_prog_type(fp);
 	bool ret;
+	struct bpf_prog_aux *aux = fp->aux;

 	if (fp->kprobe_override)
 		return false;
@@ -2312,9 +2313,8 @@ bool bpf_prog_map_compatible(struct bpf_map *map,
 	 * in the case of devmap and cpumap). Until device checks
 	 * are implemented, prohibit adding dev-bound programs to program maps.
 	 */
-	if (bpf_prog_is_dev_bound(fp->aux))
+	if (bpf_prog_is_dev_bound(aux))
 		return false;
-
 	spin_lock(&map->owner.lock);
 	if (!map->owner.type) {
 		/* There's no owner yet where we could check for
@@ -2322,12 +2322,26 @@ bool bpf_prog_map_compatible(struct bpf_map *map,
 		 */
 		map->owner.type  = prog_type;
 		map->owner.jited = fp->jited;
-		map->owner.xdp_has_frags = fp->aux->xdp_has_frags;
+		map->owner.xdp_has_frags = aux->xdp_has_frags;
+		map->owner.attach_func_proto = aux->attach_func_proto;
 		ret = true;
 	} else {
 		ret = map->owner.type  == prog_type &&
 		      map->owner.jited == fp->jited &&
-		      map->owner.xdp_has_frags == fp->aux->xdp_has_frags;
+		      map->owner.xdp_has_frags == aux->xdp_has_frags;
+		if (ret &&
+		    map->owner.attach_func_proto != aux->attach_func_proto) {
+			switch (prog_type) {
+			case BPF_PROG_TYPE_TRACING:
+			case BPF_PROG_TYPE_LSM:
+			case BPF_PROG_TYPE_EXT:
+			case BPF_PROG_TYPE_STRUCT_OPS:
+				ret = false;
+				break;
+			default:
+				break;
+			}
+		}
 	}
 	spin_unlock(&map->owner.lock);
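[Editorial note: a hedged userspace sketch of what the new check changes. The fd variables are assumed to hold loaded LSM progs attached to hooks with different prototypes; error handling is elided.]

#include <bpf/bpf.h>

void demo(int map_fd, int lsm_file_open_fd, int lsm_task_alloc_fd)
{
	__u32 zero = 0, one = 1;

	/* First insert records the owner, now including attach_func_proto. */
	bpf_map_update_elem(map_fd, &zero, &lsm_file_open_fd, BPF_ANY);

	/* Same prog type and JIT flag, but a different attach_func_proto:
	 * with this patch the update fails, so the two progs can no longer
	 * tail call each other through this BPF_MAP_TYPE_PROG_ARRAY.
	 */
	bpf_map_update_elem(map_fd, &one, &lsm_task_alloc_fd, BPF_ANY);
}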
From: Xu Kuohai xukuohai@huawei.com
After adding the lsm hook return range check to the verifier, the test case "test_progs -t test_lsm" failed, and the failure log says:
libbpf: prog 'test_int_hook': BPF program load failed: Invalid argument
libbpf: prog 'test_int_hook': -- BEGIN PROG LOAD LOG --
0: R1=ctx() R10=fp0
; int BPF_PROG(test_int_hook, struct vm_area_struct *vma, @ lsm.c:89
0: (79) r0 = *(u64 *)(r1 +24)         ; R0_w=scalar(smin=smin32=-4095,smax=smax32=0) R1=ctx()
[...]
24: (b4) w0 = -1                      ; R0_w=0xffffffff
; int BPF_PROG(test_int_hook, struct vm_area_struct *vma, @ lsm.c:89
25: (95) exit
At program exit the register R0 has smin=4294967295 smax=4294967295 should have been in [-4095, 0]

The instruction "w0 = -1" wrote -1 to the lower 32 bits of r0 and zero-extended it to the full 64-bit register, setting both the smin and smax values of r0 to 4294967295. This resulted in a false rejection when r0 was checked against the range [-4095, 0].

Since bpf_retval_range is a 32-bit range, this patch fixes the false rejection by changing the comparison between r0 and the return range from a 64-bit operation to a 32-bit operation.
Fixes: 8fa4ecd49b81 ("bpf: enforce exact retval range on subprog/callback exit")
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 kernel/bpf/verifier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 05c7c5f2bec0..5393d576c76f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9879,7 +9879,7 @@ static bool in_rbtree_lock_required_cb(struct bpf_verifier_env *env)

 static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg)
 {
-	return range.minval <= reg->smin_value && reg->smax_value <= range.maxval;
+	return range.minval <= reg->s32_min_value && reg->s32_max_value <= range.maxval;
 }

 static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
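[Editorial note: the zero-extension behavior is easy to reproduce in plain C; this is an illustration only, but BPF 32-bit ALU writes behave like the assignment below.]

#include <stdio.h>

int main(void)
{
	unsigned long long r0;
	unsigned int w0 = -1;	/* 32-bit write of -1 */

	r0 = w0;		/* zero-extended: r0 == 0xffffffff */

	/* As a signed 64-bit value this is 4294967295, outside [-4095, 0];
	 * as a signed 32-bit value it is -1, which is in range. Hence the
	 * switch to comparing the s32 bounds.
	 */
	printf("s64=%lld s32=%d\n", (long long)r0, (int)r0);
	return 0;
}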
On Thu, Apr 11, 2024 at 08:27:47PM +0800, Xu Kuohai wrote:
[...]
Logic-wise LGTM
While the status quo is that the return value is always truncated to 32 bits, looking back there was an attempt to use a 64-bit return value for bpf_prog_run [1] (not merged due to issues on 32-bit architectures). Also, from a reading of the BPF standardization ABI it could be inferred that the return value is in the 64-bit range:
BPF has 10 general purpose registers and a read-only frame pointer register, all of which are 64-bits wide.
The BPF calling convention is defined as:
* R0: return value from function calls, and exit value for BPF programs ...
So I am adding the relevant people to the thread for opinions.
1: https://lore.kernel.org/bpf/20221115193911.u6prvskdzr5jevni@apollo/
On Thu, Apr 11, 2024 at 5:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
 static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg)
 {
-	return range.minval <= reg->smin_value && reg->smax_value <= range.maxval;
+	return range.minval <= reg->s32_min_value && reg->s32_max_value <= range.maxval;
Are all BPF programs treated as if they return int instead of long? If not, we should probably have a bool flag in bpf_retval_range indicating whether the comparison should be 32-bit or 64-bit?
On 4/26/2024 7:41 AM, Andrii Nakryiko wrote:
On Thu, Apr 11, 2024 at 5:24 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
[...]
Are all BPF programs treated as if they return int instead of long? If not, we should probably have a bool flag in bpf_retval_range indicating whether the comparison should be 32-bit or 64-bit?
It seems that when an fmod_return prog is attached to a kernel function that returns a long, the bpf prog should also return a long. To confirm this, I'll try to find an example or construct a test case.
From: Xu Kuohai xukuohai@huawei.com
With the lsm return value check, the no-alu32 version of test_libbpf_get_fd_by_id_opts is rejected by the verifier, and the log says:
0: R1=ctx() R10=fp0
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
0: (b7) r0 = 0                        ; R0_w=0
1: (79) r2 = *(u64 *)(r1 +0)
func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map'
2: R1=ctx() R2_w=trusted_ptr_bpf_map()
; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29
2: (18) r3 = 0xffff9742c0951a00       ; R3_w=map_ptr(map=data_input,ks=4,vs=4)
4: (5d) if r2 != r3 goto pc+4         ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4)
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
5: (79) r0 = *(u64 *)(r1 +8)          ; R0_w=scalar() R1=ctx()
; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32
6: (67) r0 <<= 62                     ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000))
7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
; @ test_libbpf_get_fd_by_id_opts.c:0
8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
9: (95) exit
And here is the C code of the prog.
SEC("lsm/bpf_map") int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) { if (map != (struct bpf_map *)&data_input) return 0;
if (fmode & FMODE_WRITE) return -EACCES;
return 0; }
It is clear that the prog can only return either 0 or -EACCES, and both values are legal.
So why is it rejected by the verifier?
The verifier log shows that the second if statement and the return value assignment in the prog are compiled into the bitwise operations "r0 s>>= 63" and "r0 &= -13". The verifier correctly deduces that the value of r0 is in the range [-1, 0] after verifying the instruction "r0 s>>= 63", but when it proceeds to verify the instruction "r0 &= -13", it fails to deduce the correct value range of r0:

7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))

So why does the verifier fail to deduce the result of "r0 &= -13"?

The verifier uses tnums to track register values, and the two ranges [-1, 0] and [0, -1ULL] are encoded as the same tnum. When verifying the instruction "r0 &= -13", the verifier erroneously deduces the result from "[0, -1ULL] AND -13", which is outside the expected return range [-4095, 0].

To fix it, this patch adds a special SCALAR32 case to the verifier: when the source operand of the AND instruction is a constant and the destination operand spans from negative to non-negative within the range [-256, 256], the result range is deduced by enumerating all possible AND results.
Signed-off-by: Xu Kuohai xukuohai@huawei.com
---
 kernel/bpf/verifier.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5393d576c76f..62e259f18f35 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13369,6 +13369,29 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}

+	if (src_known &&
+	    dst_reg->s32_min_value < 0 && dst_reg->s32_min_value >= -256 &&
+	    dst_reg->s32_max_value >= 0 && dst_reg->s32_max_value <= 256 &&
+	    dst_reg->s32_min_value == dst_reg->smin_value &&
+	    dst_reg->s32_max_value == dst_reg->smax_value) {
+		s32 s32_min = S32_MAX;
+		s32 s32_max = S32_MIN;
+		s32 v = dst_reg->s32_min_value;
+
+		while (v <= dst_reg->s32_max_value) {
+			s32 w = (v & src_reg->s32_min_value);
+
+			if (w < s32_min)
+				s32_min = w;
+			if (w > s32_max)
+				s32_max = w;
+			v++;
+		}
+		dst_reg->s32_min_value = s32_min;
+		dst_reg->s32_max_value = s32_max;
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+		return;
+	}
+
 	/* We get our minimum from the var_off, since that's inherently
 	 * bitwise.  Our maximum is the minimum of the operands' maxima.
 	 */
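[Editorial note: a small standalone illustration of the imprecision described above; this mirrors the tnum argument and is not kernel code. A register known to hold only -1 or 0 has every bit unknown, so the abstraction must treat it like any value in [0, ~0ULL], and the two-element result set is lost.]

#include <stdio.h>

int main(void)
{
	/* The only two concrete values r0 can hold after "r0 s>>= 63". */
	long long vals[] = { -1, 0 };

	for (int i = 0; i < 2; i++)
		printf("%lld & -13 = %lld\n", vals[i], vals[i] & -13LL);

	/* Prints -13 and 0. But the tnum for {-1, 0} is
	 * (value=0, mask=~0ULL), identical to that of a fully unknown
	 * register, so after the AND the verifier only learns
	 * "bits 2 and 3 are clear" (var_off=(0x0; 0xff..f3)), which does
	 * not fit inside [-4095, 0].
	 */
	return 0;
}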
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
Hello,
Sorry for the delay, I had to think about this issue a bit. I found the clang transformation that generates the pattern this patch tries to handle. It is located in the DAGCombiner::SimplifySelectCC() method (see [1]). The transformation happens as part of the DAG-to-DAG rewrites. (LLVM uses several internal representations: the generic optimizer uses LLVM IR, where most of the work is done; before instruction selection, the IR is converted to a Selection DAG, where some optimizations are applied as a set of pattern replacements; finally, the Selection DAG is converted to machine code, where further optimizations are applied.)

The full pattern is described as follows:
// fold (select_cc seteq (and x, y), 0, 0, A) -> (and (sra (shl x)) A)
// where y has a single bit set.
// A plaintext description would be, we can turn the SELECT_CC into an AND
// when the condition can be materialized as an all-ones register. Any
// single bit-test can be materialized as an all-ones register with
// shift-left and shift-right-arith.
For this particular test case the DAG is converted as follows:
                 .--------------- lhs   The meaning of this select_cc is:
                 |         .------- rhs `lhs == rhs ? true value : false value`
                 |         | .----- true value
                 |         | | .--- false value
                 v         v v v
(select_cc seteq (and X 2) 0 0 -13)
                               ^
  ->                           '-----------.
                                           |
(and (sra (sll X 62) 63)                   |
     -13)  <-------------------------------'

Before the pattern is applied, it checks that the second 'and' operand has only one bit set (which is true for '2').
The pattern itself generates a logical shift left / arithmetic shift right pair, which ensures that the result is either all ones (-1) or all zeros (0). Hence, applying 'and' to the shift result and the false value generates a correct result.
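The rewrite is easy to check with a small program (my illustration, not from the thread; it assumes the usual two's-complement conversion and arithmetic right shift of signed values that gcc/clang provide, matching BPF's s>>):

#include <stdint.h>
#include <stdio.h>

static int64_t branchy(uint64_t x)
{
	return (x & 2) ? -13 : 0;       /* the original select_cc */
}

static int64_t branchfree(uint64_t x)
{
	/* sll 62 moves bit 1 into the sign bit; sra 63 smears it into
	 * all ones (-1) or all zeros (0), which then masks -13.
	 */
	return ((int64_t)(x << 62) >> 63) & -13;
}

int main(void)
{
	for (uint64_t x = 0; x < 4; x++)
		printf("x=%llu: %lld == %lld\n", (unsigned long long)x,
		       (long long)branchy(x), (long long)branchfree(x));
	return 0;
}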
In my opinion the approach taken by this patch is sub-optimal:
- 512 iterations is too much;
- it does not cover all code that could be generated by the above-mentioned LLVM transformation (e.g. the second 'and' operand could be 1 << 16).
Instead, I suggest making a special case for the source or destination register of the '&=' operation being in the range [-1, 0]. This means that one of the '&=' operands is either:
- all ones, in which case the counterpart is the result of the operation;
- all zeros, in which case zero is the result of the operation.
MIN and MAX values can then be derived from these two observations.
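A compact sketch of that derivation (my illustration; the names are mine, and the v2 patch later in this thread adds the equivalent checks inside the verifier's scalar32_min_max_and()/scalar_min_max_and()):

#include <stdint.h>

/* If one AND operand is known to lie in [-1, 0], it acts as a mask that
 * is either all ones (result = other operand) or all zeros (result = 0),
 * so the signed bounds of the result are min(other_smin, 0) and
 * max(other_smax, 0).
 */
static void and_bounds_with_minus1_or_0_mask(int64_t other_smin,
					     int64_t other_smax,
					     int64_t *res_smin,
					     int64_t *res_smax)
{
	*res_smin = other_smin < 0 ? other_smin : 0;
	*res_smax = other_smax > 0 ? other_smax : 0;
}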
[1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985...
Best regards,
Eduard
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
Thanks for your detailed and invaluable explanation!
[...]
Totally agree, I'll cook a new patch as you suggested.
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
Thanks Eduard for the detailed explanation. It looks like we could resolve this issue without adding too much complexity to the verifier. Also, this code pattern seems generic enough to be worth a verifier change.

Kuohai, please add the detailed explanation (as described by Eduard) to the commit message.
[...]
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
Sure, already added. The commit message and the change now look like this:
---
bpf: Fix a false rejection caused by AND operation
[...]
As explained by Eduard in [0], the clang transformation that generates this pattern is located in the DAGCombiner::SimplifySelectCC() method (see [1]).
[...]
As suggested by Eduard, this patch makes a special case for the source or destination register of the '&=' operation being in the range [-1, 0].
This means that one of the '&=' operands is either:
- all ones, in which case the counterpart is the result of the operation;
- all zeros, in which case zero is the result of the operation.
MIN and MAX values can then be derived from these two observations.
[0] https://lore.kernel.org/bpf/e62e2971301ca7f2e9eb74fc500c520285cad8f5.camel@g...
[1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985...
Suggested-by: Eduard Zingerman eddyz87@gmail.com
Signed-off-by: Xu Kuohai xukuohai@huawei.com
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 640747b53745..30c551d39329 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13374,6 +13374,24 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 	dst_reg->u32_min_value = var32_off.value;
 	dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
 
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+	    dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+	    dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_min_value, 0);
+		return;
+	}
+
+	/* Special case: dst_reg is known and src_reg is in range [-1, 0] */
+	if (dst_known &&
+	    src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0 &&
+	    src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+		dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, dst_reg->s32_min_value, 0);
+		return;
+	}
+
 	/* Safe to set s32 bounds by casting u32 result into s32 when u32
 	 * doesn't cross sign boundary. Otherwise set s32 bounds to unbounded.
 	 */
@@ -13404,6 +13422,24 @@ static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
 	dst_reg->umin_value = dst_reg->var_off.value;
 	dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
 
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+	    dst_reg->smin_value == -1 && dst_reg->smax_value == 0 &&
+	    dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		dst_reg->smin_value = min_t(s64, src_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, src_reg->smin_value, 0);
+		return;
+	}
+
+	/* Special case: dst_reg is known and src_reg is in range [-1, 0] */
+	if (dst_known &&
+	    src_reg->smin_value == -1 && src_reg->smax_value == 0 &&
+	    src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0) {
+		dst_reg->smin_value = min_t(s64, dst_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, dst_reg->smin_value, 0);
+		return;
+	}
+
 	/* Safe to set s64 bounds by casting u64 result into s64 when u64
 	 * doesn't cross sign boundary. Otherwise set s64 bounds to unbounded.
 	 */
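As a quick sanity check of the new special case (an illustration, not part of the patch): in the motivating program, dst_reg is in [-1, 0] and src_reg is the known constant -13, so the derived bounds are smin = min(-13, 0) = -13 and smax = max(-13, 0) = 0, and both possible runtime results land inside them:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int64_t src = -13;
	int64_t smin = src < 0 ? src : 0;   /* min_t(s64, src, 0) = -13 */
	int64_t smax = src > 0 ? src : 0;   /* max_t(s64, src, 0) = 0 */

	for (int64_t dst = -1; dst <= 0; dst++) {
		int64_t r = dst & src;
		printf("dst=%lld: r=%lld, within [%lld, %lld]? %s\n",
		       (long long)dst, (long long)r, (long long)smin,
		       (long long)smax, r >= smin && r <= smax ? "yes" : "no");
	}
	return 0;
}

Both results (-13 and 0) also fall inside the LSM hook's expected return range [-4095, 0], so the program is no longer rejected.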
[...]
On 4/23/24 7:25 PM, Xu Kuohai wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+	    dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+	    dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
Do we need to check dst_reg->smin_value/smax_value here? They should not impact the final dst_reg->s32_{min,max}_value computation, right? Similarly, for the later 64-bit min/max AND, the 32-bit value does not really matter.
[...]
On 4/25/2024 6:06 AM, Yonghong Song wrote:
On 4/23/24 7:25 PM, Xu Kuohai wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+	    dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+	    dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
Do we need to check dst_reg->smin_value/smax_value here? They should not impact the final dst_reg->s32_{min,max}_value computation, right?
Right, the check was simply copied from the old code, which only handled the case where the 64-bit range is the same as the 32-bit range.
Similarly, for the later 64-bit min/max AND, the 32-bit value does not really matter.
hmm, the 32-bit check is completely unnecessary.
[...]
On 4/24/24 7:42 PM, Xu Kuohai wrote:
On 4/25/2024 6:06 AM, Yonghong Song wrote:
On 4/23/24 7:25 PM, Xu Kuohai wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote: > From: Xu Kuohai xukuohai@huawei.com > > With lsm return value check, the no-alu32 version > test_libbpf_get_fd_by_id_opts > is rejected by the verifier, and the log says: > > 0: R1=ctx() R10=fp0 > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t > fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 0: (b7) r0 = 0 ; R0_w=0 > 1: (79) r2 = *(u64 *)(r1 +0) > func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map' > 2: R1=ctx() R2_w=trusted_ptr_bpf_map() > ; if (map != (struct bpf_map *)&data_input) @ > test_libbpf_get_fd_by_id_opts.c:29 > 2: (18) r3 = 0xffff9742c0951a00 ; > R3_w=map_ptr(map=data_input,ks=4,vs=4) > 4: (5d) if r2 != r3 goto pc+4 ; > R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t > fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 5: (79) r0 = *(u64 *)(r1 +8) ; R0_w=scalar() R1=ctx() > ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32 > 6: (67) r0 <<= 62 ; > R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; > 0xc000000000000000)) > 7: (c7) r0 s>>= 63 ; > R0_w=scalar(smin=smin32=-1,smax=smax32=0) > ; @ test_libbpf_get_fd_by_id_opts.c:0 > 8: (57) r0 &= -13 ; > R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; > 0xfffffffffffffff3)) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t > fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 9: (95) exit > > And here is the C code of the prog. > > SEC("lsm/bpf_map") > int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) > { > if (map != (struct bpf_map *)&data_input) > return 0; > > if (fmode & FMODE_WRITE) > return -EACCES; > > return 0; > } > > It is clear that the prog can only return either 0 or -EACCESS, > and both > values are legal. > > So why is it rejected by the verifier? > > The verifier log shows that the second if and return value setting > statements in the prog is optimized to bitwise operations "r0 > s>>= 63" > and "r0 &= -13". The verifier correctly deduces that the the > value of > r0 is in the range [-1, 0] after verifing instruction "r0 s>>= 63". > But when the verifier proceeds to verify instruction "r0 &= > -13", it > fails to deduce the correct value range of r0. > > 7: (c7) r0 s>>= 63 ; > R0_w=scalar(smin=smin32=-1,smax=smax32=0) > 8: (57) r0 &= -13 ; > R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; > 0xfffffffffffffff3)) > > So why the verifier fails to deduce the result of 'r0 &= -13'? > > The verifier uses tnum to track values, and the two ranges "[-1, > 0]" and > "[0, -1ULL]" are encoded to the same tnum. When verifing > instruction > "r0 &= -13", the verifier erroneously deduces the result from > "[0, -1ULL] AND -13", which is out of the expected return range > [-4095, 0]. > > To fix it, this patch simply adds a special SCALAR32 case for the > verifier. That is, when the source operand of the AND > instruction is > a constant and the destination operand changes from negative to > non-negative and falls in range [-256, 256], deduce the result > range > by enumerating all possible AND results. > > Signed-off-by: Xu Kuohai xukuohai@huawei.com > ---
Hello,
Sorry for the delay, I had to think about this issue a bit. I found the clang transformation that generates the pattern this patch tries to handle. It is located in the DAGCombiner::SimplifySelectCC() method (see [1]). The transformation happens as a part of DAG to DAG rewrites. LLVM uses several internal representations:
- the generic optimizer uses LLVM IR, and most of the work is done using this representation;
- before instruction selection, the IR is converted to a Selection DAG; some optimizations are applied at this stage, all of them a set of pattern replacements;
- the Selection DAG is converted to machine code, with some optimizations applied at the machine code level.
Full pattern is described as follows:
// fold (select_cc seteq (and x, y), 0, 0, A) -> (and (sra (shl x)) A)
// where y is has a single bit set.
// A plaintext description would be, we can turn the SELECT_CC into an AND
// when the condition can be materialized as an all-ones register. Any
// single bit-test can be materialized as an all-ones register with
// shift-left and shift-right-arith.
For this particular test case the DAG is converted as follows:
                 .---------------- lhs           The meaning of this select_cc is:
                 |         .------- rhs          `lhs == rhs ? true value : false value`
                 |         | .----- true value
                 |         | | .-- false value
                 v         v v v
(select_cc seteq (and X 2) 0 0 -13)
                 ^
->               '---------------.
(and (sra (sll X 62) 63)         |
     -13)                        |
                                 |
Before pattern is applied, it checks that second 'and' operand has
only one bit set, (which is true for '2').
The pattern itself generates a logical-shift-left / arithmetic-shift-right pair that ensures the result is either all ones (-1) or all zeros (0). Hence, applying 'and' to the shift result and the false value produces a correct result.
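To make the transformation easier to check, here is a minimal standalone C sketch (an illustration written for this discussion, not code from LLVM or the kernel) that compares the folded shift pair against the original select_cc semantics:

#include <assert.h>
#include <stdint.h>

/* (and (sra (sll x 62) 63) -13): bit 1 of x is moved into the sign bit,
 * then smeared across all 64 bits by the arithmetic shift, producing an
 * all-ones or all-zeros mask. Assumes arithmetic right shift for signed
 * values, which is what the BPF s>>= instruction performs. */
static int64_t select_cc_folded(uint64_t x)
{
	int64_t mask = (int64_t)(x << 62) >> 63;

	return mask & -13;
}

int main(void)
{
	uint64_t x;

	for (x = 0; x < 16; x++) {
		/* the original select_cc: (x & 2) ? -13 : 0 */
		int64_t expect = (x & 2) ? -13 : 0;

		assert(select_cc_folded(x) == expect);
	}
	return 0;
}

When bit 1 of x is set, the mask is -1 and the result is -13 (-EACCES); otherwise the mask is 0 and the result is 0, matching the C source of the prog.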
Thanks for your detailed and invaluable explanation!
Thanks Eduard for the detailed explanation. It looks like we can resolve this issue without adding too much complexity to the verifier. Also, the code pattern above seems generic enough to be worth a verifier change.

Kuohai, please add the detailed explanation (as described by Eduard) in the commit message.
Sure, already added. The commit message and the change now look like this:
bpf: Fix a false rejection caused by AND operation
With lsm return value check, the no-alu32 version test_libbpf_get_fd_by_id_opts is rejected by the verifier, and the log says:
0: R1=ctx() R10=fp0
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
0: (b7) r0 = 0                        ; R0_w=0
1: (79) r2 = *(u64 *)(r1 +0)
func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map'
2: R1=ctx() R2_w=trusted_ptr_bpf_map()
; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29
2: (18) r3 = 0xffff9742c0951a00       ; R3_w=map_ptr(map=data_input,ks=4,vs=4)
4: (5d) if r2 != r3 goto pc+4         ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4)
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
5: (79) r0 = *(u64 *)(r1 +8)          ; R0_w=scalar() R1=ctx()
; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32
6: (67) r0 <<= 62                     ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000))
7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
; @ test_libbpf_get_fd_by_id_opts.c:0
8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27
9: (95) exit
And here is the C code of the prog.
SEC("lsm/bpf_map") int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) { if (map != (struct bpf_map *)&data_input) return 0;
if (fmode & FMODE_WRITE) return -EACCES;
return 0; }
It is clear that the prog can only return either 0 or -EACCES, and both values are legal.
So why is it rejected by the verifier?
The verifier log shows that the second if and the return value setting statements in the prog are optimized to the bitwise operations "r0 s>>= 63" and "r0 &= -13". The verifier correctly deduces that the value of r0 is in the range [-1, 0] after verifying instruction "r0 s>>= 63". But when the verifier proceeds to verify instruction "r0 &= -13", it fails to deduce the correct value range of r0.
7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
So why does the verifier fail to deduce the result of 'r0 &= -13'?
The verifier uses tnum to track values, and the two ranges "[-1, 0]" and "[0, -1ULL]" are encoded to the same tnum. When verifying instruction "r0 &= -13", the verifier erroneously deduces the result from "[0, -1ULL] AND -13", which is out of the expected return range [-4095, 0].
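To see the collision concretely, here is a userspace replica of the kernel's tnum_and() (the helper mirrors kernel/bpf/tnum.c; the surrounding program is an illustrative sketch for this discussion):

#include <inttypes.h>
#include <stdio.h>

/* value holds the known-1 bits, mask holds the unknown bits. The range
 * [-1, 0] has every bit unknown (value = 0, mask = ~0), which is the
 * same encoding as the full range [0, -1ULL]. */
struct tnum { uint64_t value; uint64_t mask; };

static struct tnum tnum_and(struct tnum a, struct tnum b)
{
	uint64_t alpha = a.value | a.mask;	/* bits that may be 1 in a */
	uint64_t beta = b.value | b.mask;	/* bits that may be 1 in b */
	uint64_t v = a.value & b.value;		/* bits known to be 1 in both */

	return (struct tnum){ v, alpha & beta & ~v };
}

int main(void)
{
	struct tnum r0 = { 0, ~0ULL };		/* [-1, 0]: all bits unknown */
	struct tnum imm = { (uint64_t)-13, 0 };	/* the constant -13 */
	struct tnum res = tnum_and(r0, imm);

	/* prints value=0 mask=0xfffffffffffffff3, i.e. the loose var_off
	 * seen at instruction 8 of the log above */
	printf("value=%#" PRIx64 " mask=%#" PRIx64 "\n", res.value, res.mask);
	return 0;
}

Since the tnum for [-1, 0] is indistinguishable from the tnum for [0, -1ULL], the AND can only clear bits 2 and 3 of the unknown mask, and the derived umax stays close to -1ULL instead of being confined to [-4095, 0].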
As explained by Eduard in [0], the clang transformation that generates this pattern is located in DAGCombiner::SimplifySelectCC() method (see [1]).
The transformation happens as a part of DAG to DAG rewrites. LLVM uses several internal representations:
- the generic optimizer uses LLVM IR, and most of the work is done using this representation;
- before instruction selection, the IR is converted to a Selection DAG; some optimizations are applied at this stage, all of them a set of pattern replacements;
- the Selection DAG is converted to machine code, with some optimizations applied at the machine code level.
Full pattern is described as follows:
// fold (select_cc seteq (and x, y), 0, 0, A) -> (and (sra (shl x)) A)
// where y is has a single bit set.
// A plaintext description would be, we can turn the SELECT_CC into an AND
// when the condition can be materialized as an all-ones register. Any
// single bit-test can be materialized as an all-ones register with
// shift-left and shift-right-arith.
For this particular test case the DAG is converted as follows:
                 .---------------- lhs           The meaning of this select_cc is:
                 |         .------- rhs          `lhs == rhs ? true value : false value`
                 |         | .----- true value
                 |         | | .-- false value
                 v         v v v
(select_cc seteq (and X 2) 0 0 -13)
                 ^
->               '---------------.
(and (sra (sll X 62) 63)         |
     -13)                        |
                                 |
Before pattern is applied, it checks that second 'and' operand has
only one bit set, (which is true for '2').
The pattern itself generates a logical-shift-left / arithmetic-shift-right pair that ensures the result is either all ones (-1) or all zeros (0). Hence, applying 'and' to the shift result and the false value produces a correct result.
As suggested by Eduard, this patch makes a special case for source or destination register of '&=' operation being in range [-1, 0].
Meaning that one of the '&=' operands is either:
- all ones, in which case the counterpart is the result of the operation;
- all zeros, in which case zero is the result of the operation.
And MIN and MAX values could be derived based on above two observations.
[0] https://lore.kernel.org/bpf/e62e2971301ca7f2e9eb74fc500c520285cad8f5.camel@g... [1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985...
Suggested-by: Eduard Zingerman eddyz87@gmail.com Signed-off-by: Xu Kuohai xukuohai@huawei.com
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 640747b53745..30c551d39329 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13374,6 +13374,24 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		dst_reg->u32_min_value = var32_off.value;
 		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
 
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+		dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+		dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
do we need to check dst_reg->smin_value/smax_value here? They should not impact final dst_reg->s32_{min,max}_value computation, right?
right, the check was simply copied from the old code, which only handled the case where 64-bit range is the same as the 32-bit range
What if we do not have the 64-bit smin_value/smax_value check? Could you give more explanation here? In my opinion, deducing the lower 32-bit range should not care about the upper 32-bit values.
Similarly, for the later 64-bit min/max AND, the 32-bit value does not really matter.
hmm, the 32-bit check is completely unnecessary.
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_min_value, 0);
+		return;
+	}
+
+	/* Special case: dst_reg is known and src_reg is in range [-1, 0] */
+	if (dst_known &&
+		src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0 &&
+		src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+		dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, dst_reg->s32_min_value, 0);
+		return;
+	}
 	/* Safe to set s32 bounds by casting u32 result into s32 when u32
 	 * doesn't cross sign boundary. Otherwise set s32 bounds to unbounded.
 	 */
@@ -13404,6 +13422,24 @@ static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
 		dst_reg->umin_value = dst_reg->var_off.value;
 		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+		dst_reg->smin_value == -1 && dst_reg->smax_value == 0 &&
+		dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		dst_reg->smin_value = min_t(s64, src_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, src_reg->smin_value, 0);
+		return;
+	}
+
+	/* Special case: dst_reg is known and src_reg is in range [-1, 0] */
+	if (dst_known &&
+		src_reg->smin_value == -1 && src_reg->smax_value == 0 &&
+		src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0) {
+		dst_reg->smin_value = min_t(s64, dst_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, dst_reg->smin_value, 0);
+		return;
+	}
 
 	/* Safe to set s64 bounds by casting u64 result into s64 when u64
 	 * doesn't cross sign boundary. Otherwise set s64 bounds to unbounded.
 	 */
In my opinion the approach taken by this patch is sub-optimal:
- 512 iterations is too much;
- this does not cover all code that could be generated by the above mentioned LLVM transformation (e.g. second 'and' operand could be 1 << 16).
Instead, I suggest to make a special case for source or dst register of '&=' operation being in range [-1, 0]. Meaning that one of the '&=' operands is either:
- all ones, in which case the counterpart is the result of the operation;
- all zeros, in which case zero is the result of the operation;
- derive MIN and MAX values based on above two observations.
Totally agree, I'll cook a new patch as you suggested.
[1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985...
Best regards, Eduard
On 4/26/2024 12:28 AM, Yonghong Song wrote:
On 4/24/24 7:42 PM, Xu Kuohai wrote:
On 4/25/2024 6:06 AM, Yonghong Song wrote:
On 4/23/24 7:25 PM, Xu Kuohai wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
[...]
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+		dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+		dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
do we need to check dst_reg->smin_value/smax_value here? They should not impact final dst_reg->s32_{min,max}_value computation, right?
right, the check was simply copied from the old code, which only handled the case where 64-bit range is the same as the 32-bit range
What if we do not have the 64-bit smin_value/smax_value check? Could you give more explanation here? In my opinion, deducing the lower 32-bit range should not care about the upper 32-bit values.
I agree that for the AND operation there's no need to check the upper 32 bits. But for other operations, we may need to consider the impact of upper 32-bit overflow.

I added the 64-bit check in the old patch to make sure the special case only works when the 64-bit and 32-bit ranges are the same, since the ranges are the same in the failing verifier log.
[...]
On Tue, Apr 23, 2024 at 7:26 PM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 640747b53745..30c551d39329 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13374,6 +13374,24 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		dst_reg->u32_min_value = var32_off.value;
 		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+		dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+		dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
please keep if () condition aligned across multiple lines, it's super confusing this way
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_min_value, 0);
do we need to update tnum parts as well (or reset and re-derive, probably)?
btw, can't we support src being a range here? The idea is that dst_reg is either all ones or all zeros. For AND it means that the result either stays all zeros, or will be *exactly equal* to src, right? So I think the logic would be:
a) if [s32_min, s32_max] is on the same side of zero, then resulting range would be [min(s32_min, 0), max(s32_max, 0)], just like you have here
b) if [s32_min, s32_max] contains zero, then resulting range will be exactly [s32_min, s32_max]
Or did I make a mistake above?
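Here is a small brute-force C sketch of the case analysis above (an illustration for this discussion, not verifier code): with one operand confined to {-1, 0}, the result set of the AND is the other operand's value set plus the value 0, so both cases collapse to [min(s32_min, 0), max(s32_max, 0)].

#include <assert.h>
#include <stdint.h>

int main(void)
{
	int32_t lo, hi, v, m;

	for (lo = -4; lo <= 4; lo++)
		for (hi = lo; hi <= 4; hi++) {
			int32_t rmin = INT32_MAX, rmax = INT32_MIN;

			for (v = lo; v <= hi; v++)
				for (m = -1; m <= 0; m++) {
					int32_t r = v & m; /* m is -1 or 0 */

					if (r < rmin)
						rmin = r;
					if (r > rmax)
						rmax = r;
				}
			/* cases a) and b) collapse to one formula */
			assert(rmin == (lo < 0 ? lo : 0));
			assert(rmax == (hi > 0 ? hi : 0));
		}
	return 0;
}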
[...]
On 4/27/2024 4:36 AM, Andrii Nakryiko wrote:
On Tue, Apr 23, 2024 at 7:26 PM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
[...]
+	/* Special case: src_reg is known and dst_reg is in range [-1, 0] */
+	if (src_known &&
+		dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0 &&
+		dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
please keep if () condition aligned across multiple lines, it's super confusing this way
OK, will update the align style
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_min_value, 0);
do we need to update tnum parts as well (or reset and re-derive, probably)?
btw, can't we support src being a range here? The idea is that dst_reg is either all ones or all zeros. For AND it means that the result either stays all zeros, or will be *exactly equal* to src, right? So I think the logic would be:
a) if [s32_min, s32_max] is on the same side of zero, then resulting range would be [min(s32_min, 0), max(s32_max, 0)], just like you have here
b) if [s32_min, s32_max] contains zero, then resulting range will be exactly [s32_min, s32_max]
Or did I make a mistake above?
Totally agree, the AND of any set with the range [-1, 0] is equivalent to adding the number 0 to the set!
Based on this observation, I've rewritten the patch as follows.
diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index 3c13240077b8..5e795d728b9f 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -52,6 +52,9 @@ struct tnum tnum_mul(struct tnum a, struct tnum b);
 /* Return a tnum representing numbers satisfying both @a and @b */
 struct tnum tnum_intersect(struct tnum a, struct tnum b);
 
+/* Return a tnum representing numbers satisfying either @a or @b */
+struct tnum tnum_union(struct tnum a, struct tnum b);
+
 /* Return @a with all but the lowest @size bytes cleared */
 struct tnum tnum_cast(struct tnum a, u8 size);
 
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 9dbc31b25e3d..9d4480a683ca 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -150,6 +150,29 @@ struct tnum tnum_intersect(struct tnum a, struct tnum b)
 	return TNUM(v & ~mu, mu);
 }
 
+/*
+ * Each bit has 3 states: unknown, known 0, known 1. If using x to represent
+ * the unknown state, the result of the union of two bits is as follows:
+ *
+ *      | x  0  1
+ * -----+------------
+ *   x  | x  x  x
+ *   0  | x  0  x
+ *   1  | x  x  1
+ *
+ * For tnum a and b, only the bits that are both known 0 or known 1 in a
+ * and b are known in the result of union a and b.
+ */
+struct tnum tnum_union(struct tnum a, struct tnum b)
+{
+	u64 v0, v1, mu;
+
+	mu = a.mask | b.mask;           // unknown bits either in a or b
+	v1 = (a.value & b.value) & ~mu; // "known 1" bits in both a and b
+	v0 = (~a.value & ~b.value) & ~mu; // "known 0" bits in both a and b
+	return TNUM(v1, mu | ~(v0 | v1));
+}
+
 struct tnum tnum_cast(struct tnum a, u8 size)
 {
 	a.value &= (1ULL << (size * 8)) - 1;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
 
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_max_value, 0);
+		return;
+	}
+
+	/* Special case: src_reg is in range [-1, 0] */
+	if (src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0) {
+		var32_off = tnum_union(dst_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+		dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, dst_reg->s32_max_value, 0);
+		return;
+	}
+
 	/* We get our minimum from the var_off, since that's inherently
 	 * bitwise. Our maximum is the minimum of the operands' maxima.
 	 */
@@ -13508,6 +13530,26 @@ static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
 
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
+		dst_reg->var_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->umin_value = dst_reg->var_off.value;
+		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+		dst_reg->smin_value = min_t(s64, src_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, src_reg->smax_value, 0);
+		return;
+	}
+
+	/* Special case: src_reg is in range [-1, 0] */
+	if (src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+		dst_reg->var_off = tnum_union(dst_reg->var_off, tnum_const(0));
+		dst_reg->umin_value = dst_reg->var_off.value;
+		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+		dst_reg->smin_value = min_t(s64, dst_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, dst_reg->smax_value, 0);
+		return;
+	}
+
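As a quick sanity check of the tnum_union() semantics above, the following standalone sketch (illustrative userspace code, not part of the patch) unions the constant -13 with the constant 0, which is what the new special case computes for the failing prog:

#include <assert.h>
#include <stdint.h>

struct tnum { uint64_t value; uint64_t mask; };
#define TNUM(v, m) ((struct tnum){ .value = (v), .mask = (m) })

/* same derivation as the proposed kernel tnum_union() above */
static struct tnum tnum_union(struct tnum a, struct tnum b)
{
	uint64_t mu = a.mask | b.mask;			/* unknown in a or b */
	uint64_t v1 = (a.value & b.value) & ~mu;	/* known 1 in both */
	uint64_t v0 = (~a.value & ~b.value) & ~mu;	/* known 0 in both */

	return TNUM(v1, mu | ~(v0 | v1));
}

int main(void)
{
	/* union of the constants -13 and 0: no bit is known 1 in both,
	 * and only bits 2 and 3 are known 0 in both */
	struct tnum t = tnum_union(TNUM((uint64_t)-13, 0), TNUM(0, 0));

	assert(t.value == 0);
	assert(t.mask == (uint64_t)-13);	/* 0xfffffffffffffff3 */
	return 0;
}

Note that the resulting var_off is just as loose as the one tnum_and() produced; the improvement comes from the explicit smin/smax assignments in the special case, which pin the signed range to [min(src.smin, 0), max(src.smax, 0)].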
[...]
On Sun, Apr 28, 2024 at 8:15 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/27/2024 4:36 AM, Andrii Nakryiko wrote:
On Tue, Apr 23, 2024 at 7:26 PM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote: > From: Xu Kuohai xukuohai@huawei.com > > With lsm return value check, the no-alu32 version test_libbpf_get_fd_by_id_opts > is rejected by the verifier, and the log says: > > 0: R1=ctx() R10=fp0 > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 0: (b7) r0 = 0 ; R0_w=0 > 1: (79) r2 = *(u64 *)(r1 +0) > func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map' > 2: R1=ctx() R2_w=trusted_ptr_bpf_map() > ; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29 > 2: (18) r3 = 0xffff9742c0951a00 ; R3_w=map_ptr(map=data_input,ks=4,vs=4) > 4: (5d) if r2 != r3 goto pc+4 ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 5: (79) r0 = *(u64 *)(r1 +8) ; R0_w=scalar() R1=ctx() > ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32 > 6: (67) r0 <<= 62 ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000)) > 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0) > ; @ test_libbpf_get_fd_by_id_opts.c:0 > 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3)) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 9: (95) exit
[...]
Totally agree, the AND of any set with the range [-1, 0] is equivalent to adding the number 0 to the set!
Based on this observation, I've rewritten the patch as follows.
[...]
+struct tnum tnum_union(struct tnum a, struct tnum b)
+{
+	u64 v0, v1, mu;
+
+	mu = a.mask | b.mask;           // unknown bits either in a or b
+	v1 = (a.value & b.value) & ~mu; // "known 1" bits in both a and b
+	v0 = (~a.value & ~b.value) & ~mu; // "known 0" bits in both a and b
no C++-style comments, please
+	return TNUM(v1, mu | ~(v0 | v1));
+}
I've CC'ed Edward, hopefully he can take a look as well. Please CC him on future patches touching tnum as well.
 struct tnum tnum_cast(struct tnum a, u8 size)
 {
 	a.value &= (1ULL << (size * 8)) - 1;

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
can you explain the logic behind the u32 min/max updates, especially that we use completely different values for min/max and it's not clear why the u32_min <= u32_max invariant will always hold. Same below
[...]
On Mon, 2024-04-29 at 13:58 -0700, Andrii Nakryiko wrote:
[...]
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
can you explain the logic behind the u32 min/max updates, especially that we use completely different values for min/max and it's not clear why the u32_min <= u32_max invariant will always hold. Same below
I agree with Andrii here. It appears that dst_reg.{min,max} fields should be set as {min(src.min, 0), max(src.max, 0)} for both signed and unsigned cases. Wdyt?
[...]
On 4/30/2024 6:18 AM, Eduard Zingerman wrote:
On Mon, 2024-04-29 at 13:58 -0700, Andrii Nakryiko wrote:
[...]
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
can you explain the logic behind the u32 min/max updates, especially that we use completely different values for min/max and it's not clear why the u32_min <= u32_max invariant will always hold. Same below
I agree with Andrii here. It appears that dst_reg.{min,max} fields should be set as {min(src.min, 0), max(src.max, 0)} for both signed and unsigned cases. Wdyt?
Agree, since 0 is the minimum unsigned number, the result range is equal to [0, src.u32_max].
[...]
On 4/30/2024 4:58 AM, Andrii Nakryiko wrote:
On Sun, Apr 28, 2024 at 8:15 AM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/27/2024 4:36 AM, Andrii Nakryiko wrote:
On Tue, Apr 23, 2024 at 7:26 PM Xu Kuohai xukuohai@huaweicloud.com wrote:
On 4/24/2024 5:55 AM, Yonghong Song wrote:
On 4/20/24 1:33 AM, Xu Kuohai wrote:
On 4/20/2024 7:00 AM, Eduard Zingerman wrote:
[...]
+	mu = a.mask | b.mask;           // unknown bits either in a or b
+	v1 = (a.value & b.value) & ~mu; // "known 1" bits in both a and b
+	v0 = (~a.value & ~b.value) & ~mu; // "known 0" bits in both a and b
no C++-style comments, please
OK, will fix in the formal patch.
+	return TNUM(v1, mu | ~(v0 | v1));
+}
I've CC'ed Edward, hopefully he can take a look as well. Please CC him on future patches touching tnum as well.
Sure
 struct tnum tnum_cast(struct tnum a, u8 size)
 {
 	a.value &= (1ULL << (size * 8)) - 1;

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f0f2e21699e..b69c89bc5cfc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13478,6 +13478,28 @@ static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0) {
+		var32_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
can you explain the logic behind the u32 min/max updates, especially that we use completely different values for min/max and it's not clear why the u32_min <= u32_max invariant will always hold. Same below
We're adding 0 to the existing range, and 0 is the smallest unsigned number, so the resulting unsigned min can only get smaller, and the unsigned max is not affected. In fact, since 0 is added to the range, var32_off.value should be 0. And since -1 is included in dst_reg, dst_reg->u32_max_value should be -1U, the maximum unsigned integer. So we can just set u32_min to 0, and set u32_max to umax_val.
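To make the bounds concrete, a worked example with hypothetical numbers (dst in [-1, 0], src a known constant 5):

	/* dst & src can only be 0 or 5, so:
	 *   var32_off = tnum_union(const 5, const 0) = {value = 0, mask = 5}
	 *   u32_min   = var32_off.value              = 0
	 *   u32_max   = min(0xffffffff, 5)           = 5
	 *   s32_min   = min_t(s32, 5, 0)             = 0
	 *   s32_max   = max_t(s32, 5, 0)             = 5
	 * u32_min <= u32_max holds because 0 is the smallest unsigned value.
	 */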
+		dst_reg->s32_min_value = min_t(s32, src_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, src_reg->s32_max_value, 0);
+		return;
+	}
+
+	/* Special case: src_reg is in range [-1, 0] */
+	if (src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0) {
+		var32_off = tnum_union(dst_reg->var_off, tnum_const(0));
+		dst_reg->var_off = tnum_with_subreg(dst_reg->var_off, var32_off);
+		dst_reg->u32_min_value = var32_off.value;
+		dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);
+		dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, dst_reg->s32_max_value, 0);
+		return;
+	}
 	/* We get our minimum from the var_off, since that's inherently
 	 * bitwise.  Our maximum is the minimum of the operands' maxima.
 	 */
@@ -13508,6 +13530,26 @@ static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
 		return;
 	}
+	/* Special case: dst_reg is in range [-1, 0] */
+	if (dst_reg->smin_value == -1 && dst_reg->smax_value == 0) {
+		dst_reg->var_off = tnum_union(src_reg->var_off, tnum_const(0));
+		dst_reg->umin_value = dst_reg->var_off.value;
+		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+		dst_reg->smin_value = min_t(s64, src_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, src_reg->smax_value, 0);
+		return;
+	}
+
+	/* Special case: src_reg is in range [-1, 0] */
+	if (src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+		dst_reg->var_off = tnum_union(dst_reg->var_off, tnum_const(0));
+		dst_reg->umin_value = dst_reg->var_off.value;
+		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
+		dst_reg->smin_value = min_t(s64, dst_reg->smin_value, 0);
+		dst_reg->smax_value = max_t(s64, dst_reg->smax_value, 0);
+		return;
+	}
+		return;
+	}
+	/* Special case: dst_reg is known and src_reg is in range [-1, 0] */
+	if (dst_known &&
+		src_reg->s32_min_value == -1 && src_reg->s32_max_value == 0 &&
+		src_reg->smin_value == -1 && src_reg->smax_value == 0) {
+		dst_reg->s32_min_value = min_t(s32, dst_reg->s32_min_value, 0);
+		dst_reg->s32_max_value = max_t(s32, dst_reg->s32_min_value, 0);
+		return;
+	}
 	/* Safe to set s32 bounds by casting u32 result into s32 when u32
 	 * doesn't cross sign boundary. Otherwise set s32 bounds to unbounded.
 	 */
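As an illustration of the sign-boundary condition in that comment (numbers made up): the u32 range [1, 3] stays on one side of the boundary, so the s32 bounds can be set to [1, 3] directly; a u32 range like [0x7ffffffe, 0x80000001], cast to s32, covers both 0x7ffffffe and -0x7fffffff, so the s32 bounds must be left unbounded.

	/* u32 [1, 3]                   -> s32 [1, 3]     (no crossing)
	 * u32 [0x7ffffffe, 0x80000001] -> s32 unbounded  (crosses 0x80000000)
	 */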
[...]
On Sun, 2024-04-28 at 23:15 +0800, Xu Kuohai wrote:
[...]
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 9dbc31b25e3d..9d4480a683ca 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -150,6 +150,29 @@ struct tnum tnum_intersect(struct tnum a, struct tnum b)
 	return TNUM(v & ~mu, mu);
 }
 
+/*
+ * Each bit has 3 states: unknown, known 0, known 1. If using x to represent
+ * the unknown state, the result of the union of two bits is as follows:
+ *
+ *      | x    0    1
+ * -----+--------------
+ *   x  | x    x    x
+ *   0  | x    0    x
+ *   1  | x    x    1
+ *
+ * For tnum a and b, only the bits that are both known 0 or known 1 in a
+ * and b are known in the result of the union of a and b.
+ */
+struct tnum tnum_union(struct tnum a, struct tnum b)
+{
+	u64 v0, v1, mu;
+
+	mu = a.mask | b.mask; // unknown bits either in a or b
+	v1 = (a.value & b.value) & ~mu; // "known 1" bits in both a and b
+	v0 = (~a.value & ~b.value) & ~mu; // "known 0" bits in both a and b
+	return TNUM(v1, mu | ~(v0 | v1));
+}
Zero would be represented as {.value=0,.mask=0}, suppose 'b' is zero:
1. mu = a.mask | 0;
   v1 = (a.value & 0) & ~mu;
   v0 = (~a.value & ~0) & ~mu;
   return TNUM(v1, mu | ~(v0 | v1));

2. mu = a.mask;
   v1 = 0;
   v0 = ~a.value & ~mu;
   return TNUM(v1, mu | ~(v0 | v1));

3. v1 = 0;
   v0 = ~a.value & ~a.mask;
   return TNUM(v1, a.mask | ~(v0 | v1));

4. v1 = 0;
   v0 = ~a.value & ~a.mask;
   return TNUM(0, a.mask | ~(~a.value & ~a.mask));

5. return TNUM(0, a.mask | a.value)
So ultimately this says that the 1's we previously knew are no longer known to be 1's, which seems to make sense.
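Beyond the zero case, the union can also be brute-forced in userspace over all small tnums. A standalone sketch (my own check, not part of the series; restricted to 6-bit tnums to keep the run time trivial):

	#include <stdio.h>
	#include <stdint.h>

	struct tnum { uint64_t value, mask; };

	/* tnum_union as proposed in the patch */
	static struct tnum tnum_union(struct tnum a, struct tnum b)
	{
		uint64_t v0, v1, mu;

		mu = a.mask | b.mask;
		v1 = (a.value & b.value) & ~mu;
		v0 = (~a.value & ~b.value) & ~mu;
		return (struct tnum){ v1, mu | ~(v0 | v1) };
	}

	/* n is represented by t iff n agrees with t.value on all known bits */
	static int member(uint64_t n, struct tnum t, uint64_t w)
	{
		return ((n ^ t.value) & ~t.mask & w) == 0;
	}

	int main(void)
	{
		const uint64_t w = 0x3f; /* check 6-bit tnums only */

		for (uint64_t av = 0; av <= w; av++)
		for (uint64_t am = 0; am <= w; am++) {
			if (av & am) /* value and mask must not overlap */
				continue;
			for (uint64_t bv = 0; bv <= w; bv++)
			for (uint64_t bm = 0; bm <= w; bm++) {
				if (bv & bm)
					continue;
				struct tnum a = { av, am }, b = { bv, bm };
				struct tnum u = tnum_union(a, b);

				/* every member of a or b must be in the union */
				for (uint64_t n = 0; n <= w; n++)
					if ((member(n, a, w) || member(n, b, w)) &&
					    !member(n, u, w)) {
						printf("unsound for n=%#llx\n",
						       (unsigned long long)n);
						return 1;
					}
			}
		}
		printf("tnum_union sound for all 6-bit tnums\n");
		return 0;
	}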
From: Xu Kuohai <xukuohai@huawei.com>
The compiler optimized the two bpf progs in token_lsm.c to derive the return value from the bool variable even on the "return -1" path, causing an unexpected rejection:
  0: R1=ctx() R10=fp0
; int BPF_PROG(bpf_token_capable, struct bpf_token *token, int cap) @ bpf_lsm.c:17
  0: (b7) r6 = 0                        ; R6_w=0
; if (my_pid == 0 || my_pid != (bpf_get_current_pid_tgid() >> 32)) @ bpf_lsm.c:19
  1: (18) r1 = 0xffffc9000102a000      ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5)
  3: (61) r7 = *(u32 *)(r1 +0)         ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5) R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  4: (15) if r7 == 0x0 goto pc+11      ; R7_w=scalar(smin=umin=umin32=1,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  5: (67) r7 <<= 32                    ; R7_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
  6: (c7) r7 s>>= 32                   ; R7_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
  7: (85) call bpf_get_current_pid_tgid#14 ; R0=scalar()
  8: (77) r0 >>= 32                    ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  9: (5d) if r0 != r7 goto pc+6        ; R0_w=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff)) R7=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff))
; if (reject_capable) @ bpf_lsm.c:21
 10: (18) r1 = 0xffffc9000102a004      ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4)
 12: (71) r6 = *(u8 *)(r1 +0)          ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4) R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff))
; @ bpf_lsm.c:0
 13: (87) r6 = -r6                     ; R6_w=scalar()
 14: (67) r6 <<= 56                    ; R6_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
 15: (c7) r6 s>>= 56                   ; R6_w=scalar(smin=smin32=-128,smax=smax32=127)
; int BPF_PROG(bpf_token_capable, struct bpf_token *token, int cap) @ bpf_lsm.c:17
 16: (bf) r0 = r6                      ; R0_w=scalar(id=1,smin=smin32=-128,smax=smax32=127) R6_w=scalar(id=1,smin=smin32=-128,smax=smax32=127)
 17: (95) exit
At program exit the register R0 has smin=-128 smax=127 should have been in [-4095, 0]
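For context, the prog shape implied by the source annotations in the log is roughly the following (reconstructed for illustration from the log, not the verbatim selftest):

	int my_pid;
	bool reject_capable;	/* the bool that trips the verifier */

	SEC("lsm/bpf_token_capable")
	int BPF_PROG(bpf_token_capable, struct bpf_token *token, int cap)
	{
		if (my_pid == 0 || my_pid != (bpf_get_current_pid_tgid() >> 32))
			return 0;
		if (reject_capable)	/* compiled to -(s8)reject_capable, */
			return -1;	/* hence R0's range [-128, 127]     */
		return 0;
	}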
To avoid this failure, change the variable type from bool to int.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 tools/testing/selftests/bpf/progs/token_lsm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/token_lsm.c b/tools/testing/selftests/bpf/progs/token_lsm.c
index e4d59b6ba743..a6002d073b1b 100644
--- a/tools/testing/selftests/bpf/progs/token_lsm.c
+++ b/tools/testing/selftests/bpf/progs/token_lsm.c
@@ -8,8 +8,8 @@ char _license[] SEC("license") = "GPL";
 
 int my_pid;
-bool reject_capable;
-bool reject_cmd;
+int reject_capable;
+int reject_cmd;
 
 SEC("lsm/bpf_token_capable")
 int BPF_PROG(token_capable, struct bpf_token *token, int cap)
From: Xu Kuohai <xukuohai@huawei.com>
The return ranges of some bpf lsm test progs cannot be deduced accurately by the verifier. To avoid erroneous rejections, add explicit return value checks to these progs.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 tools/testing/selftests/bpf/progs/err.h               | 10 ++++++++++
 tools/testing/selftests/bpf/progs/test_sig_in_xattr.c |  4 ++++
 .../selftests/bpf/progs/test_verify_pkcs7_sig.c       |  8 ++++++--
 .../selftests/bpf/progs/verifier_global_subprogs.c    |  7 ++++++-
 4 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/err.h b/tools/testing/selftests/bpf/progs/err.h
index d66d283d9e59..38529779a236 100644
--- a/tools/testing/selftests/bpf/progs/err.h
+++ b/tools/testing/selftests/bpf/progs/err.h
@@ -5,6 +5,16 @@
 #define MAX_ERRNO 4095
 #define IS_ERR_VALUE(x) (unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO
 
+#define __STR(x) #x
+
+#define set_if_not_errno_or_zero(x, y)			\
+({							\
+	asm volatile ("if %0 s< -4095 goto +1\n"	\
+		      "if %0 s<= 0 goto +1\n"		\
+		      "%0 = " __STR(y) "\n"		\
+		      : "+r"(x));			\
+})
+
 static inline int IS_ERR_OR_NULL(const void *ptr)
 {
 	return !ptr || IS_ERR_VALUE((unsigned long)ptr);
diff --git a/tools/testing/selftests/bpf/progs/test_sig_in_xattr.c b/tools/testing/selftests/bpf/progs/test_sig_in_xattr.c
index 2f0eb1334d65..8ef6b39335b6 100644
--- a/tools/testing/selftests/bpf/progs/test_sig_in_xattr.c
+++ b/tools/testing/selftests/bpf/progs/test_sig_in_xattr.c
@@ -6,6 +6,7 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include "bpf_kfuncs.h"
+#include "err.h"
char _license[] SEC("license") = "GPL";
@@ -79,5 +80,8 @@ int BPF_PROG(test_file_open, struct file *f)
 	ret = bpf_verify_pkcs7_signature(&digest_ptr, &sig_ptr, trusted_keyring);
 
 	bpf_key_put(trusted_keyring);
+
+	set_if_not_errno_or_zero(ret, -EFAULT);
+
 	return ret;
 }
diff --git a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
index f42e9f3831a1..12034a73ee2d 100644
--- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
+++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
@@ -11,6 +11,7 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include "bpf_kfuncs.h"
+#include "err.h"
 
 #define MAX_DATA_SIZE (1024 * 1024)
 #define MAX_SIG_SIZE 1024
@@ -55,12 +56,12 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size)
 
 	ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value);
 	if (ret)
-		return ret;
+		goto out;
 
 	ret = bpf_copy_from_user(data_val, sizeof(struct data),
				 (void *)(unsigned long)value);
 	if (ret)
-		return ret;
+		goto out;
 
 	if (data_val->data_len > sizeof(data_val->data))
 		return -EINVAL;
@@ -84,5 +85,8 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size)
bpf_key_put(trusted_keyring);
+out:
+	set_if_not_errno_or_zero(ret, -EFAULT);
+
 	return ret;
 }
diff --git a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
index baff5ffe9405..5df7a98a4c51 100644
--- a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
+++ b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
@@ -7,6 +7,7 @@
 #include "bpf_misc.h"
 #include "xdp_metadata.h"
 #include "bpf_kfuncs.h"
+#include "err.h"
 
 int arr[1];
 int unkn_idx;
@@ -324,7 +325,11 @@ SEC("?lsm/bpf")
 __success __log_level(2)
 int BPF_PROG(arg_tag_ctx_lsm)
 {
-	return tracing_subprog_void(ctx) + tracing_subprog_u64(ctx);
+	int ret;
+
+	ret = tracing_subprog_void(ctx) + tracing_subprog_u64(ctx);
+	set_if_not_errno_or_zero(ret, -1);
+	return ret;
 }
SEC("?struct_ops/test_1")
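For readers skimming the asm in err.h above: set_if_not_errno_or_zero(x, y) implements roughly the following C logic; inline asm is used so the compiler cannot fold the branches away and the verifier sees an explicit range check (my paraphrase, not code from the series):

	/* leave x alone if it is already in [-4095, 0], otherwise set it to y */
	if (x < -4095 || x > 0)
		x = y;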
From: Xu Kuohai <xukuohai@huawei.com>
Add a test for lsm tail calls to ensure a tail call can only be used between bpf lsm progs attached to the same hook.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 .../selftests/bpf/prog_tests/test_lsm.c       | 46 ++++++++++++++++++-
 .../selftests/bpf/progs/lsm_tailcall.c        | 34 ++++++++++++++
 2 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_tailcall.c
diff --git a/tools/testing/selftests/bpf/prog_tests/test_lsm.c b/tools/testing/selftests/bpf/prog_tests/test_lsm.c
index 16175d579bc7..2a27f3714f5c 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_lsm.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_lsm.c
@@ -12,6 +12,7 @@
 #include <stdlib.h>
 
 #include "lsm.skel.h"
+#include "lsm_tailcall.skel.h"
 
 char *CMD_ARGS[] = {"true", NULL};
 
@@ -95,7 +96,7 @@ static int test_lsm(struct lsm *skel)
 	return 0;
 }
 
-void test_test_lsm(void)
+static void test_lsm_basic(void)
 {
 	struct lsm *skel = NULL;
 	int err;
@@ -114,3 +115,46 @@ void test_test_lsm(void)
 close_prog:
 	lsm__destroy(skel);
 }
+
+static void test_lsm_tailcall(void)
+{
+	struct lsm_tailcall *skel = NULL;
+	int map_fd, prog_fd;
+	int err, key;
+
+	skel = lsm_tailcall__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "lsm_tailcall__skel_load"))
+		goto close_prog;
+
+	map_fd = bpf_map__fd(skel->maps.jmp_table);
+	if (CHECK_FAIL(map_fd < 0))
+		goto close_prog;
+
+	prog_fd = bpf_program__fd(skel->progs.lsm_file_permission_prog);
+	if (CHECK_FAIL(prog_fd < 0))
+		goto close_prog;
+
+	key = 0;
+	err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
+	if (CHECK_FAIL(!err))
+		goto close_prog;
+
+	prog_fd = bpf_program__fd(skel->progs.lsm_file_alloc_security_prog);
+	if (CHECK_FAIL(prog_fd < 0))
+		goto close_prog;
+
+	err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
+	if (CHECK_FAIL(err))
+		goto close_prog;
+
+close_prog:
+	lsm_tailcall__destroy(skel);
+}
+
+void test_test_lsm(void)
+{
+	if (test__start_subtest("lsm_basic"))
+		test_lsm_basic();
+	if (test__start_subtest("lsm_tailcall"))
+		test_lsm_tailcall();
+}
diff --git a/tools/testing/selftests/bpf/progs/lsm_tailcall.c b/tools/testing/selftests/bpf/progs/lsm_tailcall.c
new file mode 100644
index 000000000000..49c075ce2d4c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_tailcall.c
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Huawei Technologies Co., Ltd */
+
+#include "vmlinux.h"
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(max_entries, 1);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+SEC("lsm/file_permission")
+int lsm_file_permission_prog(void *ctx)
+{
+	return 0;
+}
+
+SEC("lsm/file_alloc_security")
+int lsm_file_alloc_security_prog(void *ctx)
+{
+	return 0;
+}
+
+SEC("lsm/file_alloc_security")
+int lsm_file_alloc_security_entry(void *ctx)
+{
+	bpf_tail_call_static(ctx, &jmp_table, 0);
+	return 0;
+}
From: Xu Kuohai <xukuohai@huawei.com>
Add verifier tests to check bpf lsm return values and disabled hooks.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |   3 +-
 .../selftests/bpf/progs/verifier_lsm.c        | 155 ++++++++++++++++++
 2 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_lsm.c
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index c4f9f306646e..07398846085c 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -84,6 +84,7 @@
 #include "verifier_xadd.skel.h"
 #include "verifier_xdp.skel.h"
 #include "verifier_xdp_direct_packet_access.skel.h"
+#include "verifier_lsm.skel.h"
#define MAX_ENTRIES 11
@@ -196,8 +197,8 @@ void test_verifier_value_illegal_alu(void) { RUN(verifier_value_illegal_alu); }
 void test_verifier_value_or_null(void) { RUN(verifier_value_or_null); }
 void test_verifier_var_off(void) { RUN(verifier_var_off); }
 void test_verifier_xadd(void) { RUN(verifier_xadd); }
-void test_verifier_xdp(void) { RUN(verifier_xdp); }
 void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
+void test_verifier_lsm(void) { RUN(verifier_lsm); }
 static int init_test_val_map(struct bpf_object *obj, char *map_name)
 {
diff --git a/tools/testing/selftests/bpf/progs/verifier_lsm.c b/tools/testing/selftests/bpf/progs/verifier_lsm.c
new file mode 100644
index 000000000000..005f28eebf71
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_lsm.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+SEC("lsm/file_alloc_security")
+__description("lsm bpf prog exit with valid return code. test 1")
+__success
+__naked int return_code_valid_test1(void)
+{
+	asm volatile ("		\
+	r0 = 0;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_alloc_security")
+__description("lsm bpf prog exit with valid return code. test 2")
+__success
+__naked int return_code_valid_test2(void)
+{
+	asm volatile ("		\
+	r0 = -4095;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_alloc_security")
+__description("lsm bpf prog exit with valid return code. test 3")
+__success
+__naked int return_code_valid_test3(void)
+{
+	asm volatile ("			\
+	call %[bpf_get_prandom_u32];	\
+	r0 <<= 63;			\
+	r0 s>>= 63;			\
+	r0 &= -13;			\
+	exit;				\
+"	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+SEC("lsm/vm_enough_memory")
+__description("lsm bpf prog exit with valid return code. test 4")
+__success
+__naked int return_code_valid_test4(void)
+{
+	asm volatile ("		\
+	r0 = 0;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/vm_enough_memory")
+__description("lsm bpf prog exit with valid return code. test 5")
+__success
+__naked int return_code_valid_test5(void)
+{
+	asm volatile ("		\
+	r0 = -4096;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/vm_enough_memory")
+__description("lsm bpf prog exit with valid return code. test 6")
+__success
+__naked int return_code_valid_test6(void)
+{
+	asm volatile ("		\
+	r0 = 4096;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_free_security")
+__description("lsm bpf prog exit with valid return code. test 7")
+__success
+__naked void return_code_valid_test7(void)
+{
+	asm volatile ("		\
+	r0 = -4096;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_free_security")
+__description("lsm bpf prog exit with valid return code. test 8")
+__success
+__naked void return_code_valid_test8(void)
+{
+	asm volatile ("		\
+	r0 = 4096;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_alloc_security")
+__description("lsm bpf prog exit with invalid return code. test 1")
+__failure __msg("R0 has smin=1 smax=1 should have been in [-4095, 0]")
+__naked int return_code_invalid_test1(void)
+{
+	asm volatile ("		\
+	r0 = 1;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/file_alloc_security")
+__description("lsm bpf prog exit with invalid return code. test 2")
+__failure __msg("R0 has smin=-4096 smax=-4096 should have been in [-4095, 0]")
+__naked int return_code_invalid_test2(void)
+{
+	asm volatile ("		\
+	r0 = -4096;		\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/getprocattr")
+__description("lsm disabled hook: getprocattr")
+__failure __msg("points to disabled bpf lsm hook")
+__naked int disabled_hook_test1(void)
+{
+	asm volatile ("		\
+	r0 = 0;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/setprocattr")
+__description("lsm disabled hook: setprocattr")
+__failure __msg("points to disabled bpf lsm hook")
+__naked int disabled_hook_test2(void)
+{
+	asm volatile ("		\
+	r0 = 0;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+SEC("lsm/ismaclabel")
+__description("lsm disabled hook: ismaclabel")
+__failure __msg("points to disabled bpf lsm hook")
+__naked int disabled_hook_test3(void)
+{
+	asm volatile ("		\
+	r0 = 0;			\
+	exit;			\
+"	::: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";