On Tue, Oct 05, 2021 at 12:47:22PM -0700, Kees Cook wrote:
On Mon, Sep 27, 2021 at 09:37:57AM -0700, Luis Chamberlain wrote:
This adds initial failure injection support to kernfs. We start off with debug knobs which when enabled allow test drivers, such as test_sysfs, to then make use of these to try to force certain difficult races to take place with a high degree of certainty.
This only adds runtime code *iff* the new bool CONFIG_FAIL_KERNFS_KNOBS is enabled in your kernel. If you don't have this enabled this provides no new functional. When CONFIG_FAIL_KERNFS_KNOBS is disabled the new routine kernfs_debug_should_wait() ends up being transformed to if (false), and so the compiler should optimize these out as dead code producing no new effective binary changes.
We start off with enabling failure injections in kernfs by allowing us to alter the way kernfs_fop_write_iter() behaves. We allow for the routine kernfs_fop_write_iter() to wait for a certain condition in the kernel to occur, after which it will sleep a predefined amount of time. This lets kernfs users to time exactly when it want kernfs_fop_write_iter() to complete, allowing for developing race conditions and test for correctness in kernfs.
You'd boot with this enabled on your kernel command line:
fail_kernfs_fop_write_iter=1,100,0,1
The values are <interval,probability,size,times>, we don't care for size, so for now we ignore it. The above ensures a failure will trigger only once.
*How* we allow for this routine to change behaviour is left to knobs we expose under debugfs:
# ls -1 /sys/kernel/debug/kernfs/config_fail_kernfs_fop_write_iter/
I'd expect this to live under /sys/kernel/debug/fail_kernfs, like the other fault injectors.
Yes I see, thanks will fix up!
diff --git a/Documentation/fault-injection/fault-injection.rst b/Documentation/fault-injection/fault-injection.rst index 4a25c5eb6f07..d4d34b082f47 100644 --- a/Documentation/fault-injection/fault-injection.rst +++ b/Documentation/fault-injection/fault-injection.rst @@ -28,6 +28,28 @@ Available fault injection capabilities injects kernel RPC client and server failures. +- fail_kernfs_fop_write_iter
- Allows for failures to be enabled inside kernfs_fop_write_iter(). Enabling
- this does not immediately enable any errors to occur. You must configure
- how you want this routine to fail or change behaviour by using the debugfs
- knobs for it:
- # ls -1 /sys/kernel/debug/kernfs/config_fail_kernfs_fop_write_iter/
- wait_after_active
- wait_after_mutex
- wait_at_start
- wait_before_mutex
This should be split up and detailed in the "debugfs entries" section below here.
Done!
diff --git a/MAINTAINERS b/MAINTAINERS index 1b4cefcb064c..fadfd961ad80 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10384,7 +10384,7 @@ M: Greg Kroah-Hartman gregkh@linuxfoundation.org M: Tejun Heo tj@kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git -F: fs/kernfs/ +F: fs/kernfs/* F: include/linux/kernfs.h KEXEC diff --git a/fs/kernfs/Makefile b/fs/kernfs/Makefile index 4ca54ff54c98..bc5b32ca39f9 100644 --- a/fs/kernfs/Makefile +++ b/fs/kernfs/Makefile @@ -4,3 +4,4 @@ # obj-y := mount.o inode.o dir.o file.o symlink.o +obj-$(CONFIG_FAIL_KERNFS_KNOBS) += failure-injection.o diff --git a/fs/kernfs/failure-injection.c b/fs/kernfs/failure-injection.c new file mode 100644 index 000000000000..4130d202c13b --- /dev/null +++ b/fs/kernfs/failure-injection.c
I'd name this fault_inject.c, which matches the more common case:
$ find . -type f -name '*fault*inject*.c' ./fs/nfsd/fault_inject.c ./drivers/nvme/host/fault_inject.c ./drivers/scsi/ufs/ufs-fault-injection.c ./lib/fault-inject.c ./lib/fault-inject-usercopy.c
Sure, done.
+int __kernfs_debug_should_wait_kernfs_fop_write_iter(bool evaluate) +{
- if (!evaluate)
return 0;
- return should_fail(&fail_kernfs_fop_write_iter, 0);
+}
Every caller ends up doing the wait, so how about just including that here instead? It should make things much less intrusive and more readable.
And for the naming, other fault injectors use "should_fail_$topic", so maybe better here would be something like may_wait_kernfs(...).
In case anyone is reading Hail Mary by Andy Weir: "Yes yes yes!"
Indeed, that's a great idea. Changed!
+DECLARE_COMPLETION(kernfs_debug_wait_completion); +EXPORT_SYMBOL_NS_GPL(kernfs_debug_wait_completion, KERNFS_DEBUG_PRIVATE);
+void kernfs_debug_wait(void) +{
- unsigned long timeout;
- timeout = wait_for_completion_timeout(&kernfs_debug_wait_completion,
msecs_to_jiffies(3000));
- if (!timeout)
pr_info("%s waiting for kernfs_debug_wait_completion timed out\n",
__func__);
- else
pr_info("%s received completion with time left on timeout %u ms\n",
__func__, jiffies_to_msecs(timeout));
- /**
* The goal is wait for an event, and *then* once we have
* reached it, the other side will try to do something which
* it thinks will break. So we must give it some time to do
* that. The amount of time is configurable.
*/
- msleep(kernfs_config_fail.sleep_after_wait_ms);
- pr_info("%s ended\n", __func__);
+}
All the uses of "__func__" here seems redundant; I would drop them.
Alright, and I also added the pr_fmt define which I forgot.
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 60e2a86c535e..4479c6580333 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -259,6 +259,9 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) const struct kernfs_ops *ops; char *buf;
- if (kernfs_debug_should_wait(kernfs_fop_write_iter, at_start))
kernfs_debug_wait();
So this could just be:
may_wait_kernfs(kernfs_fop_write_iter, at_start);
Yup! Thanks!
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h index f9cc912c31e1..9e3abf597e2d 100644 --- a/fs/kernfs/kernfs-internal.h +++ b/fs/kernfs/kernfs-internal.h +#define __kernfs_config_wait_var(func, when) \
- (kernfs_config_fail. func ## _fail.wait_ ## when)
^^ ^ ^
nit: needless spaces
Trimmed.
Luis