The LTP test io_pgetevents02 fails in 32bit compat mode because an nr_max of -1 appears to be treated as a large positive integer. This causes pgetevents_time64 to return an event. The test expects the call to fail and errno to be set to EINVAL.
Using the compat syscall fixes the issue.
Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec") Signed-off-by: Richard Palethorpe rpalethorpe@suse.com --- arch/x86/entry/syscalls/syscall_32.tbl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 960a021d543e..0985d8333368 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -420,7 +420,7 @@ 412 i386 utimensat_time64 sys_utimensat 413 i386 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 414 i386 ppoll_time64 sys_ppoll compat_sys_ppoll_time64 -416 i386 io_pgetevents_time64 sys_io_pgetevents +416 i386 io_pgetevents_time64 sys_io_pgetevents compat_sys_io_pgetevents_time64 417 i386 recvmmsg_time64 sys_recvmmsg compat_sys_recvmmsg_time64 418 i386 mq_timedsend_time64 sys_mq_timedsend 419 i386 mq_timedreceive_time64 sys_mq_timedreceive
On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe rpalethorpe@suse.com wrote:
The LTP test io_pgetevents02 fails in 32bit compat mode because an nr_max of -1 appears to be treated as a large positive integer. This causes pgetevents_time64 to return an event. The test expects the call to fail and errno to be set to EINVAL.
Using the compat syscall fixes the issue.
Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec") Signed-off-by: Richard Palethorpe rpalethorpe@suse.com
Thanks a lot for finding this, indeed there is definitely a mistake that this function is defined and not used, but I don't yet see how it would get to the specific failure you report.
Between the two implementations, I can see a difference in the handling of the signal mask, but that should only affect architectures with incompatible compat_sigset_t, i.e. big-endian or _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is never true for currently supported architectures. On x86, there is no difference in the sigset at all.
The negative 'nr' and 'min_nr' arguments that you list as causing the problem /should/ be converted by the magic SYSCALL_DEFINE6() definition. If this is currently broken, I would expect other syscalls to be affected as well.
Have you tried reproducing this on non-x86 architectures? If I misremembered how the compat conversion in SYSCALL_DEFINE6() works, then all architectures that support CONFIG_COMPAT have to be fixed.
Arnd
Hi,
I also wondered, if it should be added for other archs like the other compat functions.
Kind regards, Petr
Hello Arnd,
Arnd Bergmann arnd@arndb.de writes:
On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe rpalethorpe@suse.com wrote:
The LTP test io_pgetevents02 fails in 32bit compat mode because an nr_max of -1 appears to be treated as a large positive integer. This causes pgetevents_time64 to return an event. The test expects the call to fail and errno to be set to EINVAL.
Using the compat syscall fixes the issue.
Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec") Signed-off-by: Richard Palethorpe rpalethorpe@suse.com
Thanks a lot for finding this, indeed there is definitely a mistake that this function is defined and not used, but I don't yet see how it would get to the specific failure you report.
Between the two implementations, I can see a difference in the handling of the signal mask, but that should only affect architectures with incompatible compat_sigset_t, i.e. big-endian or _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is never true for currently supported architectures. On x86, there is no difference in the sigset at all.
The negative 'nr' and 'min_nr' arguments that you list as causing the problem /should/ be converted by the magic SYSCALL_DEFINE6() definition. If this is currently broken, I would expect other syscalls to be affected as well.
That is what I thought, but I couldn't think of another explanation for it.
Have you tried reproducing this on non-x86 architectures? If I misremembered how the compat conversion in SYSCALL_DEFINE6() works, then all architectures that support CONFIG_COMPAT have to be fixed.
Arnd
No, but I suppose I can try it on ARM or PowerPC. I suppose printing the arguments would be a good idea too.
Richard Palethorpe rpalethorpe@suse.de writes:
Hello Arnd,
Arnd Bergmann arnd@arndb.de writes:
On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe rpalethorpe@suse.com wrote:
The LTP test io_pgetevents02 fails in 32bit compat mode because an nr_max of -1 appears to be treated as a large positive integer. This causes pgetevents_time64 to return an event. The test expects the call to fail and errno to be set to EINVAL.
Using the compat syscall fixes the issue.
Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec") Signed-off-by: Richard Palethorpe rpalethorpe@suse.com
Thanks a lot for finding this, indeed there is definitely a mistake that this function is defined and not used, but I don't yet see how it would get to the specific failure you report.
Between the two implementations, I can see a difference in the handling of the signal mask, but that should only affect architectures with incompatible compat_sigset_t, i.e. big-endian or _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is never true for currently supported architectures. On x86, there is no difference in the sigset at all.
The negative 'nr' and 'min_nr' arguments that you list as causing the problem /should/ be converted by the magic SYSCALL_DEFINE6() definition. If this is currently broken, I would expect other syscalls to be affected as well.
That is what I thought, but I couldn't think of another explanation for it.
Have you tried reproducing this on non-x86 architectures? If I misremembered how the compat conversion in SYSCALL_DEFINE6() works, then all architectures that support CONFIG_COMPAT have to be fixed.
Arnd
No, but I suppose I can try it on ARM or PowerPC. I suppose printing the arguments would be a good idea too.
It appears it really is failing to sign extend the s32 to s64. I added the following printks
modified fs/aio.c @@ -2054,6 +2054,7 @@ static long do_io_getevents(aio_context_t ctx_id, long ret = -EINVAL;
if (likely(ioctx)) { + printk("comparing %ld <= %ld\n", min_nr, nr); if (likely(min_nr <= nr && min_nr >= 0)) ret = read_events(ioctx, min_nr, nr, events, until); percpu_ref_put(&ioctx->users); @@ -2114,6 +2115,8 @@ SYSCALL_DEFINE6(io_pgetevents, bool interrupted; int ret;
+ printk("io_pgetevents(%lx, %ld, %ld, ...)\n", ctx_id, min_nr, nr); + if (timeout && unlikely(get_timespec64(&ts, timeout))) return -EFAULT;
Then the output is:
[ 11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...) [ 11.252401] comparing 4294967295 <= 1 io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22) [ 11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...) [ 11.252748] comparing 1 <= 4294967295 io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
Richard Palethorpe rpalethorpe@suse.de writes:
Richard Palethorpe rpalethorpe@suse.de writes:
Hello Arnd,
Arnd Bergmann arnd@arndb.de writes:
On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe rpalethorpe@suse.com wrote:
The LTP test io_pgetevents02 fails in 32bit compat mode because an nr_max of -1 appears to be treated as a large positive integer. This causes pgetevents_time64 to return an event. The test expects the call to fail and errno to be set to EINVAL.
Using the compat syscall fixes the issue.
Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec") Signed-off-by: Richard Palethorpe rpalethorpe@suse.com
Thanks a lot for finding this, indeed there is definitely a mistake that this function is defined and not used, but I don't yet see how it would get to the specific failure you report.
Between the two implementations, I can see a difference in the handling of the signal mask, but that should only affect architectures with incompatible compat_sigset_t, i.e. big-endian or _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is never true for currently supported architectures. On x86, there is no difference in the sigset at all.
The negative 'nr' and 'min_nr' arguments that you list as causing the problem /should/ be converted by the magic SYSCALL_DEFINE6() definition. If this is currently broken, I would expect other syscalls to be affected as well.
That is what I thought, but I couldn't think of another explanation for it.
Have you tried reproducing this on non-x86 architectures? If I misremembered how the compat conversion in SYSCALL_DEFINE6() works, then all architectures that support CONFIG_COMPAT have to be fixed.
Arnd
No, but I suppose I can try it on ARM or PowerPC. I suppose printing the arguments would be a good idea too.
It appears it really is failing to sign extend the s32 to s64. I added the following printks
modified fs/aio.c @@ -2054,6 +2054,7 @@ static long do_io_getevents(aio_context_t ctx_id, long ret = -EINVAL; if (likely(ioctx)) {
if (likely(min_nr <= nr && min_nr >= 0)) ret = read_events(ioctx, min_nr, nr, events, until); percpu_ref_put(&ioctx->users);printk("comparing %ld <= %ld\n", min_nr, nr);
@@ -2114,6 +2115,8 @@ SYSCALL_DEFINE6(io_pgetevents, bool interrupted; int ret;
- printk("io_pgetevents(%lx, %ld, %ld, ...)\n", ctx_id, min_nr, nr);
- if (timeout && unlikely(get_timespec64(&ts, timeout))) return -EFAULT;
Then the output is:
[ 11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...) [ 11.252401] comparing 4294967295 <= 1 io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22) [ 11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...) [ 11.252748] comparing 1 <= 4294967295 io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
and below is the macro expansion for the automatically generated 32bit to 64bit io_pgetevents. I believe it is casting u32 to s64, which appears to mean there is no sign extension. I don't know if this is the expected behaviour?
For the manually written compat version we cast back to s32 which is what fixes the issue.
long __ia32_sys_io_pgetevents(const struct pt_regs *regs) { return __se_sys_io_pgetevents((unsigned int)regs->bx, (unsigned int)regs->cx, (unsigned int)regs->dx, (unsigned int)regs->si, (unsigned int)regs->di, (unsigned int)regs->bp); } static long __se_sys_io_pgetevents( __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((aio_context_t)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((aio_context_t)0), typeof(0ULL))), 0LL, 0L)) ctx_id, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) min_nr, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) nr, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((struct io_event *)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((struct io_event *)0), typeof(0ULL))), 0LL, 0L)) events, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((struct __kernel_timespec *)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((struct __kernel_timespec *)0), typeof(0ULL))), 0LL, 0L)) timeout, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((const struct __aio_sigset *)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((const struct __aio_sigset *)0), typeof(0ULL))), 0LL, 0L)) usig) { long ret = __do_sys_io_pgetevents( (aio_context_t)ctx_id, (long)min_nr, (long)nr, (struct io_event *)events, (struct __kernel_timespec *)timeout, (const struct __aio_sigset *)usig);
... }
On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe rpalethorpe@suse.de wrote:
Richard Palethorpe rpalethorpe@suse.de writes:
Then the output is:
[ 11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...) [ 11.252401] comparing 4294967295 <= 1 io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22) [ 11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...) [ 11.252748] comparing 1 <= 4294967295 io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
and below is the macro expansion for the automatically generated 32bit to 64bit io_pgetevents. I believe it is casting u32 to s64, which appears to mean there is no sign extension. I don't know if this is the expected behaviour?
Thank you for digging through this, I meant to already reply once more yesterday but didn't get around to that.
__typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) min_nr, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) nr,
The part that I remembered is in arch/s390/include/asm/syscall_wrapper.h, which uses this version instead:
#define __SC_COMPAT_CAST(t, a) \ ({ \ long __ReS = a; \ \ BUILD_BUG_ON((sizeof(t) > 4) && !__TYPE_IS_L(t) && \ !__TYPE_IS_UL(t) && !__TYPE_IS_PTR(t) && \ !__TYPE_IS_LL(t)); \ if (__TYPE_IS_L(t)) \ __ReS = (s32)a; \ if (__TYPE_IS_UL(t)) \ __ReS = (u32)a; \ if (__TYPE_IS_PTR(t)) \ __ReS = a & 0x7fffffff; \ if (__TYPE_IS_LL(t)) \ return -ENOSYS; \ (t)__ReS; \ })
This also takes care of s390-specific pointer conversion, which is the reason for needing an architecture-specific wrapper, but I suppose the handling of signed arguments as done in s390 should also be done everywhere else.
I also noticed that only x86 and s390 even have separate entry points for normal syscalls when called in compat mode, while the others all just zero the upper halves of the registers in the low-level entry code and then call the native entry point.
Arnd
Hello Arnd,
Arnd Bergmann arnd@arndb.de writes:
On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe rpalethorpe@suse.de wrote:
Richard Palethorpe rpalethorpe@suse.de writes:
Then the output is:
[ 11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...) [ 11.252401] comparing 4294967295 <= 1 io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22) [ 11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...) [ 11.252748] comparing 1 <= 4294967295 io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
and below is the macro expansion for the automatically generated 32bit to 64bit io_pgetevents. I believe it is casting u32 to s64, which appears to mean there is no sign extension. I don't know if this is the expected behaviour?
Thank you for digging through this, I meant to already reply once more yesterday but didn't get around to that.
Thanks, no problem. I suppose this will effect other systemcalls as well. Which if nothing else is a pain for testing.
__typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) min_nr, __typeof(__builtin_choose_expr( (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) || __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))), 0LL, 0L)) nr,
The part that I remembered is in arch/s390/include/asm/syscall_wrapper.h, which uses this version instead:
#define __SC_COMPAT_CAST(t, a) \ ({ \ long __ReS = a; \ \ BUILD_BUG_ON((sizeof(t) > 4) && !__TYPE_IS_L(t) && \ !__TYPE_IS_UL(t) && !__TYPE_IS_PTR(t) && \ !__TYPE_IS_LL(t)); \ if (__TYPE_IS_L(t)) \ __ReS = (s32)a; \ if (__TYPE_IS_UL(t)) \ __ReS = (u32)a; \ if (__TYPE_IS_PTR(t)) \ __ReS = a & 0x7fffffff; \ if (__TYPE_IS_LL(t)) \ return -ENOSYS; \ (t)__ReS; \ })
This also takes care of s390-specific pointer conversion, which is the reason for needing an architecture-specific wrapper, but I suppose the handling of signed arguments as done in s390 should also be done everywhere else.
I also noticed that only x86 and s390 even have separate entry points for normal syscalls when called in compat mode, while the others all just zero the upper halves of the registers in the low-level entry code and then call the native entry point.
Arnd
It looks to me like aarch64 also has something similar? At any rate, I can try to fix it for x86 and investigate what else might be effected.
On Thu, Sep 23, 2021 at 12:01 PM Richard Palethorpe rpalethorpe@suse.de wrote:
Arnd Bergmann arnd@arndb.de writes:
On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe rpalethorpe@suse.de wrote:
Richard Palethorpe rpalethorpe@suse.de writes:
I also noticed that only x86 and s390 even have separate entry points for normal syscalls when called in compat mode, while the others all just zero the upper halves of the registers in the low-level entry code and then call the native entry point.
It looks to me like aarch64 also has something similar? At any rate, I can try to fix it for x86 and investigate what else might be effected.
arm64 also has a custom asm/syscall_wrapper.h, but it only does this for accessing pt_regs (as x86 does), not for doing any argument conversion. x86 does the 32-to-64 widening in the wrapper, arm64 relies on the pt_regs already having the upper halves zeroed.
Arnd