On Mon, Oct 28, 2024 at 8:00 PM Eric Sandeen <sandeen@redhat.com> wrote:
On 10/28/24 4:43 PM, Kalesh Singh wrote:
Commit 78ff64081949 ("vfs: Convert tracefs to use the new mount API") converted tracefs to use the new mount APIs, which caused mounting with the gid=<gid> option to not take effect.
Or any other mount options. I'm sure this isn't unique to gid, right? So, might want to fix the commit title.
Hi Eric,
You are right, the same applies to any of the mount options. I'll update the commit text for v2.
The tracefs superblock can be updated from multiple paths:
  - on fs_initcall(), from init_trace_printk_function_export()
  - from a work queue that initializes eventfs, tracer_init_tracefs_work_func()
  - from the fsconfig() syscall, to mount or remount tracefs
The tracefs super block root inode gets created early on in init_trace_printk_function_export().
With the new mount API tracefs effectively uses get_tree_single() instead of the old API mount_single().
Previously, mount_single() ensured that the options are always applied to the superblock root inode:
  (1) If the root inode didn't exist, it called fill_super() to create it and apply the options.
  (2) If the root inode exists, it called reconfigure_single(), which effectively called tracefs_apply_options() to parse and apply options to the superblock's fs_info and inode and remount eventfs (if necessary).
On the other hand, get_tree_single() effectively calls vfs_get_super(), which:
  (3) If the root inode doesn't exist, calls fill_super() to create it and apply the options.
  (4) If the root inode already exists, updates the fs_context root with the superblock's root inode.
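The difference between cases (2) and (4) can be sketched in a small userspace model. This is not kernel code: every name below (struct model_sb, model_mount_single(), model_get_tree_single(), and so on) is hypothetical and only mirrors the control flow described above, assuming the superblock root already exists when the mount arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; not the kernel's struct super_block. */
struct model_sb {
	bool has_root;		/* has the root inode been created? */
	unsigned int gid;	/* gid currently applied to the root inode */
};

/* Models fill_super(): create the root inode and apply the options. */
static void model_fill_super(struct model_sb *sb, unsigned int gid)
{
	sb->has_root = true;
	sb->gid = gid;
}

/* Models a reconfigure: re-apply options to an existing superblock. */
static void model_reconfigure(struct model_sb *sb, unsigned int gid)
{
	sb->gid = gid;
}

/* Old behaviour, cases (1) and (2): the options are applied either way. */
static void model_mount_single(struct model_sb *sb, unsigned int gid)
{
	if (!sb->has_root)
		model_fill_super(sb, gid);
	else
		model_reconfigure(sb, gid);
}

/* New behaviour, cases (3) and (4): an existing root is reused untouched. */
static void model_get_tree_single(struct model_sb *sb, unsigned int gid)
{
	if (!sb->has_root)
		model_fill_super(sb, gid);
	/* case (4): nothing else happens, so the requested gid is dropped */
}

/* Early init creates the root with gid 0, then a mount asks for gid 1234. */
static void model_demo(void)
{
	struct model_sb old_api = { false, 0 };
	struct model_sb new_api = { false, 0 };

	model_fill_super(&old_api, 0);
	model_fill_super(&new_api, 0);

	model_mount_single(&old_api, 1234);
	model_get_tree_single(&new_api, 1234);

	assert(old_api.gid == 1234);	/* option applied */
	assert(new_api.gid == 0);	/* option ignored */
}
```

In this model the only way to get the old behaviour back with the new-style entry point is for the caller to invoke the reconfigure step itself after the get-tree step succeeds.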
I'm honestly a little lost here, but given the differences between mount_single() and get_tree_single() - are other get_tree_single() users similarly broken?
I'm not sure if other filesystems are broken in the same way. The issue happened for tracefs due to the fact that the sb root is created before calling mount -- from init_trace_printk_function_export(). If there are other filesystems that have similar early initialization (before mount) they may be broken as well.
Should get_tree_single() just be calling reconfigure_single() internally like mount_single() did? The comment in reconfigure_single() confuses me.
(4) above is always the case for tracefs mounts, since the super block's root inode will already be created by init_trace_printk_function_export().
this reminds me a little of
commit a6097180d884ddab769fb25588ea8598589c218c
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 17 09:07:26 2022 +1100

    devtmpfs regression fix: reconfigure on each mount
Interesting, yes, it seems like the same root cause: the devtmpfs sb root is created early on, from kernel_init() ... -> driver_init() -> devtmpfs_init(), which sets up the root inode for devtmpfs in shmem_get_tree() ... -> vfs_get_super(). That is case (4), meaning that it would have ignored the options on mount for the same reason: the superblock root inode already exists.
I'm not very familiar with the filesystem area. If this is a common scenario, maybe there needs to be a separate API to handle this case?
This means that the gid mount option gets ignored:
  - Since it isn't applied to the super block's root inode, it doesn't get inherited by the children.
  - Since eventfs is initialized from a separate work queue, before the call to mount with the options, it doesn't get remounted on mount.
Ensure that the mount options are applied to the super block and eventfs is remounted to respect the new mount options.
[1] https://lore.kernel.org/r/536e99d3-345c-448b-adee-a21389d7ab4b@redhat.com/
Fixes: 78ff64081949 ("vfs: Convert tracefs to use the new mount API")
Cc: David Howells <dhowells@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>

 fs/tracefs/inode.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 1748dff58c3b..cfc614c638da 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -392,6 +392,9 @@ static int tracefs_reconfigure(struct fs_context *fc)
 	struct tracefs_fs_info *sb_opts = sb->s_fs_info;
 	struct tracefs_fs_info *new_opts = fc->s_fs_info;

+	if (!new_opts)
+		return 0;
Can this really happen?
Yes. The first time the super block is allocated (from init_trace_printk_function_export()) and added to file_system_type->fs_supers, fc->s_fs_info is reset to NULL [1]. I think that is OK, since the fs_info would have already been copied to the super block [2].
See sget_fc():
[1] https://github.com/torvalds/linux/blob/v6.12-rc4/fs/super.c#L774
[2] https://github.com/torvalds/linux/blob/v6.12-rc4/fs/super.c#L766
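To illustrate why the NULL check is reachable, here is a minimal userspace sketch of the hand-off described above. It is not the kernel implementation: struct xfer_opts, struct xfer_fc, struct xfer_sb, xfer_sget_new() and xfer_reconfigure() are hypothetical names, and the model only assumes what the reply states, namely that the options pointer moves from the context to a newly allocated superblock and the context's pointer is then cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct xfer_opts { unsigned int gid; };
struct xfer_fc { struct xfer_opts *s_fs_info; };
struct xfer_sb { struct xfer_opts *s_fs_info; };

/* Models the hand-off for a newly allocated superblock. */
static void xfer_sget_new(struct xfer_fc *fc, struct xfer_sb *sb)
{
	sb->s_fs_info = fc->s_fs_info;	/* superblock takes the options */
	fc->s_fs_info = NULL;		/* context no longer holds them */
}

/* Models a reconfigure step with the guard from the patch. */
static int xfer_reconfigure(struct xfer_fc *fc, struct xfer_sb *sb)
{
	struct xfer_opts *new_opts = fc->s_fs_info;

	if (!new_opts)
		return 0;	/* options already live in the superblock */

	*sb->s_fs_info = *new_opts;	/* structure copy of the new options */
	return 0;
}

static void xfer_demo(void)
{
	struct xfer_opts first = { .gid = 1000 };
	struct xfer_opts second = { .gid = 2000 };
	struct xfer_fc fc_first = { .s_fs_info = &first };
	struct xfer_fc fc_second = { .s_fs_info = &second };
	struct xfer_sb sb = { .s_fs_info = NULL };

	/* First use: the superblock is new, so the pointer is moved. */
	xfer_sget_new(&fc_first, &sb);
	assert(fc_first.s_fs_info == NULL);
	assert(xfer_reconfigure(&fc_first, &sb) == 0);	/* guarded no-op */
	assert(sb.s_fs_info->gid == 1000);

	/* Later mount: the superblock exists, the context still has options. */
	assert(xfer_reconfigure(&fc_second, &sb) == 0);
	assert(sb.s_fs_info->gid == 2000);
}
```

Without the guard, the first xfer_reconfigure() call in this model would dereference a NULL pointer in the structure copy.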
--Kalesh
 	sync_filesystem(sb);
 	/* structure copy of new mount options to sb */
 	*sb_opts = *new_opts;
FWIW doing this as a structure copy was probably a terrible choice on my part. :(
@@ -478,14 +481,17 @@ static int tracefs_fill_super(struct super_block *sb, struct fs_context *fc)
 	sb->s_op = &tracefs_super_operations;
 	sb->s_d_op = &tracefs_dentry_operations;

-	tracefs_apply_options(sb, false);
-
 	return 0;
 }

 static int tracefs_get_tree(struct fs_context *fc)
 {
-	return get_tree_single(fc, tracefs_fill_super);
+	int err = get_tree_single(fc, tracefs_fill_super);
+
+	if (err)
+		return err;
+
+	return tracefs_reconfigure(fc);
 }

 static void tracefs_free_fc(struct fs_context *fc)