On Mon, 3 Jul 2023 10:13:22 +0000 Ajay Kaher <akaher@vmware.com> wrote:
+/**
+ * eventfs_down_write - acquire write lock function
+ * @eventfs_rwsem: a pointer to rw_semaphore
+ *
+ * helper function to perform write lock on eventfs_rwsem
+ */
+static void eventfs_down_write(struct rw_semaphore *eventfs_rwsem)
+{
+	while (!down_write_trylock(eventfs_rwsem))
+		msleep(10);
What's this loop for? Something like that needs a very good explanation in a comment. Loops like these are usually a sign of a workaround for a bug in the design, or worse, simply hide an existing bug.
Yes, correct. This logic is there to solve a deadlock:
	Thread 1					Thread 2
	down_read_nested() - read lock acquired
							down_write() - waiting for write lock to acquire
	down_read_nested() - deadlock
The deadlock happens because an rwsem will not allow a read lock to be acquired while a write lock is waiting. down_write_trylock() does not add the writer to the wait queue, and hence helps to prevent the deadlock scenario.
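To spell the interleaving out in code (a minimal sketch only, not code from the patch; the two thread functions and the lock name are hypothetical):

#include <linux/rwsem.h>

static DECLARE_RWSEM(example_rwsem);

/* Thread 1: takes the read lock, then re-enters a path that takes it again. */
static void thread1_lookup(void)
{
	down_read(&example_rwsem);	/* read lock acquired */
	/* ... meanwhile Thread 2 calls down_write() and queues as a waiter ... */
	down_read(&example_rwsem);	/* blocks behind the queued writer: deadlock */
	up_read(&example_rwsem);
	up_read(&example_rwsem);
}

/* Thread 2: a plain writer. */
static void thread2_update(void)
{
	down_write(&example_rwsem);	/* waits for Thread 1's first read lock */
	/* ... modify eventfs entries ... */
	up_write(&example_rwsem);
}

Because down_write_trylock() never joins the wait queue, Thread 1's second down_read() can still succeed; that is what the msleep() loop above is relying on.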
I was stuck with this deadlock, tried a few methods, and finally borrowed the approach from cifs, as it is upstreamed, tested, and working there; please refer to: https://elixir.bootlin.com/linux/v6.3.1/source/fs/cifs/file.c#L438
I just looked at that code and the commit, and I honestly believe that is a horrible hack, and very fragile. It's in the smb code, so it was unlikely to have been reviewed by anyone outside that subsystem. I really do not want to proliferate that solution around the kernel. We need to come up with something else.
I also think it's buggy (yes the cifs code is buggy!) because in the comment above the down_read_nested() it says:
/*
 * nested locking. NOTE: rwsems are not allowed to recurse
 * (which occurs if the same task tries to acquire the same
 * lock instance multiple times), but multiple locks of the
 * same lock class might be taken, if the order of the locks
 * is always the same. This ordering rule can be expressed
 * to lockdep via the _nested() APIs, but enumerating the
 * subclasses that are used. (If the nesting relationship is
 * static then another method for expressing nested locking is
 * the explicit definition of lock class keys and the use of
 * lockdep_set_class() at lock initialization time.
 * See Documentation/locking/lockdep-design.rst for more details.)
 */
So this is NOT a solution (and the cifs code should be fixed too!)
Can you show me the exact backtrace where the reader lock gets taken again? We will have to come up with a way to not take the same lock twice.
We can also look to see if we can implement this with RCU. What exactly is this rwsem protecting?
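For comparison, the usual shape of an RCU conversion (purely a hypothetical sketch; it assumes the rwsem is only protecting lookups of some list of entries, which may not match the real data structure) looks like this:

#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/string.h>

struct eventfs_entry {			/* hypothetical element */
	struct list_head	list;
	const char		*name;
	struct rcu_head		rcu;
};

static LIST_HEAD(eventfs_entries);
static DEFINE_MUTEX(eventfs_mutex);	/* serializes writers only */

/* Readers: lockless, can nest freely, never block behind a writer. */
static bool eventfs_entry_exists(const char *name)
{
	struct eventfs_entry *e;
	bool found = false;

	rcu_read_lock();
	list_for_each_entry_rcu(e, &eventfs_entries, list) {
		if (!strcmp(e->name, name)) {
			found = true;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* Writers: serialize among themselves and defer the free past readers. */
static void eventfs_remove_entry(struct eventfs_entry *e)
{
	mutex_lock(&eventfs_mutex);
	list_del_rcu(&e->list);
	mutex_unlock(&eventfs_mutex);
	kfree_rcu(e, rcu);
}

With the read side lockless, the recursive-read deadlock cannot happen at all, but whether this fits depends on what the rwsem actually protects.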
Looking forward to your input. I will add an explanation in v4.
+}
[..]
+ * This function creates the top of the trace event directory.
+ */
+struct dentry *eventfs_create_events_dir(const char *name,
+					  struct dentry *parent,
+					  struct rw_semaphore *eventfs_rwsem)
OK, I'm going to have to really look at this. Passing in a lock to the API is just broken. We need to solve this another way.
eventfs_rwsem is a member of struct trace_array; I guess we should pass a pointer to the trace_array instead.
No, it should not be part of the trace_array. If we can't do this with RCU, then we need to add a descriptor that contains the dentry that is returned above, and have the lock held there. The caller of the eventfs_create_events_dir() should not care about locking. That's an implementation detail that should *not* be part of the API.
That is, if you need a lock:
	struct eventfs_dentry {
		struct dentry		*dentry;
		struct rw_semaphore	*rwsem;
	};
And then get to that lock by using the container_of() macro. All created eventfs dentries could have this structure, where the rwsem points to the top one. Again, that's only if we can't do this with RCU.
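Roughly, and only as a sketch of the idea (the helper below is illustrative, not a proposed API; it assumes callers hand around the address of the embedded dentry pointer):

#include <linux/container_of.h>
#include <linux/dcache.h>
#include <linux/rwsem.h>

struct eventfs_dentry {
	struct dentry		*dentry;
	struct rw_semaphore	*rwsem;	/* every entry points at the top dir's rwsem */
};

/*
 * Recover the wrapper, and with it the lock, from a pointer to the
 * embedded dentry member, so the rwsem never has to appear in the
 * eventfs_create_events_dir() signature.
 */
static void eventfs_dentry_down_write(struct dentry **dentryp)
{
	struct eventfs_dentry *ed;

	ed = container_of(dentryp, struct eventfs_dentry, dentry);
	down_write(ed->rwsem);
}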
-- Steve
I'm about to board a plane to JFK shortly; I'm hoping to play with this while flying back.
I have replied to the major concerns. All the other minor ones I will take care of in v4.
Thanks a lot for giving time to the eventfs patches.
- Ajay