I wonder if it would be better to not copy the ring mapping on fork. Something like the below? I think that would be more in line with expectations (the ring isn't available in the child process).
I like this idea a lot, but would this be viewed as subtly changing the userland to kernel interface?
If we're okay with this change though, I think it makes sense.
On Wed, Jan 11, 2023 at 2:37 PM Jeff Moyer jmoyer@redhat.com wrote:
Hi, Seth,
Seth Jenkins sethjenkins@google.com writes:
Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced a null-deref if mremap is called on an old aio mapping after fork as mm->ioctx_table will be set to NULL.
Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring") Cc: stable@vger.kernel.org Signed-off-by: Seth Jenkins sethjenkins@google.com
fs/aio.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c index 5b2ff20ad322..74eae7de7323 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -361,16 +361,18 @@ static int aio_ring_mremap(struct vm_area_struct *vma) spin_lock(&mm->ioctx_lock); rcu_read_lock(); table = rcu_dereference(mm->ioctx_table);
for (i = 0; i < table->nr; i++) {
struct kioctx *ctx;
ctx = rcu_dereference(table->table[i]);
if (ctx && ctx->aio_ring_file == file) {
if (!atomic_read(&ctx->dead)) {
ctx->user_id = ctx->mmap_base = vma->vm_start;
res = 0;
if (table) {
for (i = 0; i < table->nr; i++) {
struct kioctx *ctx;
ctx = rcu_dereference(table->table[i]);
if (ctx && ctx->aio_ring_file == file) {
if (!atomic_read(&ctx->dead)) {
ctx->user_id = ctx->mmap_base = vma->vm_start;
res = 0;
}
break; }
break; } }
I wonder if it would be better to not copy the ring mapping on fork. Something like the below? I think that would be more in line with expectations (the ring isn't available in the child process).
-Jeff
diff --git a/fs/aio.c b/fs/aio.c index 562916d85cba..dbf3b0749cb4 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -390,7 +390,7 @@ static const struct vm_operations_struct aio_ring_vm_ops = {
static int aio_ring_mmap(struct file *file, struct vm_area_struct *vma) {
vma->vm_flags |= VM_DONTEXPAND;
vma->vm_flags |= VM_DONTEXPAND|VM_DONTCOPY; vma->vm_ops = &aio_ring_vm_ops; return 0;
}