On 03-Jul-2023, at 8:38 PM, Steven Rostedt rostedt@goodmis.org wrote:
I just looked at that code and the commit, and I honestly believe that is a horrible hack, and very fragile. It's in the smb code, so it was unlikely reviewed by anyone outside that subsystem. I really do not want to prolificate that solution around the kernel. We need to come up with something else.
I also think it's buggy (yes the cifs code is buggy!) because in the comment above the down_read_nested() it says:
/*
- nested locking. NOTE: rwsems are not allowed to recurse
- (which occurs if the same task tries to acquire the same
- lock instance multiple times), but multiple locks of the
- same lock class might be taken, if the order of the locks
- is always the same. This ordering rule can be expressed
- to lockdep via the _nested() APIs, but enumerating the
- subclasses that are used. (If the nesting relationship is
- static then another method for expressing nested locking is
- the explicit definition of lock class keys and the use of
- lockdep_set_class() at lock initialization time.
- See Documentation/locking/lockdep-design.rst for more details.)
*/
So this is NOT a solution (and the cifs code should be fixed too!)
Can you show me the exact backtrace where the reader lock gets taken again? We will have to come up with a way to not take the same lock twice.
[ 244.185505] eventfs_root_lookup+0x37/0x1f0 <--- require read lock [ 244.185509] __lookup_slow+0x72/0x100 [ 244.185511] lookup_one_len+0x6a/0x70 [ 244.185513] eventfs_start_creating+0x58/0xd0 [ 244.185515] ? security_locked_down+0x2e/0x50 [ 244.185518] eventfs_create_file+0x57/0x150 [ 244.185521] dcache_dir_open_wrapper+0x1c6/0x260 <--- require read lock [ 244.185524] ? __pfx_dcache_dir_open_wrapper+0x10/0x10 [ 244.185526] do_dentry_open+0x1ed/0x420 [ 244.185529] vfs_open+0x2d/0x40
We can also look to see if we can implement this with RCU. What exactly is this rwsem protecting?
- struct eventfs_file holds the meta-data for file or dir. https://github.com/intel-lab-lkp/linux/blob/dfe0dc15a73261ed83cdc728e43f4b3d... - eventfs_rwsem is supposed to protect the 'link-list which is made of struct eventfs_file ' and elements of struct eventfs_file.
I tried one more solution i.e by checking owner of lock: static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem) { return (struct task_struct *) (atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK); }
But rwsem_owner() is static.
Looking further for your input. I will add explanation in v4.
+}
[..]
- This function creates the top of the trace event directory.
- */
+struct dentry *eventfs_create_events_dir(const char *name,
struct dentry *parent,
struct rw_semaphore *eventfs_rwsem)
OK, I'm going to have to really look at this. Passing in a lock to the API is just broken. We need to find a way to solve this another way.
eventfs_rwsem is a member of struct trace_array, I guess we should pass pointer to trace_array.
No, it should not be part of the trace_array. If we can't do this with RCU, then we need to add a descriptor that contains the dentry that is returned above, and have the lock held there. The caller of the eventfs_create_events_dir() should not care about locking. That's an implementation detail that should *not* be part of the API.
That is, if you need a lock:
struct eventfs_dentry { struct dentry *dentry; struct rwsem *rwsem; };
And then get to that lock by using the container_of() macro. All created eventfs dentry's could have this structure, where the rwsem points to the top one. Again, that's only if we can't do this with RCU.
Ok. Let’s first fix locking issue.
-Ajay