This patch series contains stability fixes and error handling for remoteproc.
The changes included in this series do the following: Patch 1: Fixes the creation of the rproc character device. Patch 2: Validates rproc as the first step of rproc_add(). Patch 3: Fixes the rproc cdev remove and the order of dev_del() and cdev_del(). Patch 4: Adds error handling in rproc_add().
v2 -> v3: - Removed extra file that got added by mistake.
v1 -> v2: - Added extra patch which addresses Bjorn's comments on patch 3 from v1. - Fixed commit text for patch 2 (s/calling making/making).
Siddharth Gupta (4): remoteproc: core: Move cdev add before device add remoteproc: core: Move validate before device add remoteproc: core: Fix cdev remove and rproc del remoteproc: core: Cleanup device in case of failure
drivers/remoteproc/remoteproc_cdev.c | 2 +- drivers/remoteproc/remoteproc_core.c | 27 ++++++++++++++++++--------- 2 files changed, 19 insertions(+), 10 deletions(-)
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems: - mknod is never called for the cdev and hence no cdev appears on devtmpfs. - sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org --- drivers/remoteproc/remoteproc_core.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index 6348aaa..9ad8c5f 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -2333,6 +2333,11 @@ int rproc_add(struct rproc *rproc) struct device *dev = &rproc->dev; int ret;
+ /* add char device for this remoteproc */ + ret = rproc_char_device_add(rproc); + if (ret < 0) + return ret; + ret = device_add(dev); if (ret < 0) return ret; @@ -2346,11 +2351,6 @@ int rproc_add(struct rproc *rproc) /* create debugfs entries */ rproc_create_debug_dir(rproc);
- /* add char device for this remoteproc */ - ret = rproc_char_device_add(rproc); - if (ret < 0) - return ret; - /* if rproc is marked always-on, request it to boot */ if (rproc->auto_boot) { ret = rproc_trigger_auto_boot(rproc);
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org
drivers/remoteproc/remoteproc_core.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
On 6/14/2021 9:56 PM, Greg KH wrote:
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
My testing was done with toybox + Android's ueventd ramdisk. As I mentioned in the discussion, the race became evident recently. I will make sure to test all such changes without systemd/ueventd in the future.
Thanks, Sid
On Tue, Jun 15, 2021 at 12:03:26PM -0700, Siddharth Gupta wrote:
On 6/14/2021 9:56 PM, Greg KH wrote:
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
My testing was done with toybox + Android's ueventd ramdisk. As I mentioned in the discussion, the race became evident recently. I will make sure to test all such changes without systemd/ueventd in the future.
It isn't an issue of systemd/ueventd, those do not control /dev on a normal system, that is what devtmpfs is for.
And devtmpfs nodes are only created if you create a struct device somewhere with a proper major/minor, which you were not doing here, so you must have had a static /dev on your test systems, right?
thanks,
greg k-h
On 6/15/2021 10:58 PM, Greg KH wrote:
On Tue, Jun 15, 2021 at 12:03:26PM -0700, Siddharth Gupta wrote:
On 6/14/2021 9:56 PM, Greg KH wrote:
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
My testing was done with toybox + Android's ueventd ramdisk. As I mentioned in the discussion, the race became evident recently. I will make sure to test all such changes without systemd/ueventd in the future.
It isn't an issue of systemd/ueventd, those do not control /dev on a normal system, that is what devtmpfs is for.
I am not fully aware of when devtmpfs is enabled or not, but in case it is not - systemd/ueventd will create these files with mknod, right? I was even manually able to call mknod from the terminal when some of the remoteproc character device entries showed up (using major number from there, and minor number being the remoteproc id), and that allowed me to boot up the remoteprocs as well.
And devtmpfs nodes are only created if you create a struct device somewhere with a proper major/minor, which you were not doing here, so you must have had a static /dev on your test systems, right?
I am not sure of what you mean by a static /dev? Could you explain? In case you mean the character device would be non-functional, that is not the case. They have been working for us since the beginning.
Thanks, Sid
thanks,
greg k-h
On Wed, Jun 16, 2021 at 11:47:01AM -0700, Siddharth Gupta wrote:
On 6/15/2021 10:58 PM, Greg KH wrote:
On Tue, Jun 15, 2021 at 12:03:26PM -0700, Siddharth Gupta wrote:
On 6/14/2021 9:56 PM, Greg KH wrote:
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems:
- mknod is never called for the cdev and hence no cdev appears on devtmpfs.
- sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
My testing was done with toybox + Android's ueventd ramdisk. As I mentioned in the discussion, the race became evident recently. I will make sure to test all such changes without systemd/ueventd in the future.
It isn't an issue of systemd/ueventd, those do not control /dev on a normal system, that is what devtmpfs is for.
I am not fully aware of when devtmpfs is enabled or not, but in case it is not - systemd/ueventd will create these files with mknod, right?
No, systemd does not create device nodes, and neither does udev. Hasn't done so for well over 10 years now.
I was even manually able to call mknod from the terminal when some of the remoteproc character device entries showed up (using major number from there, and minor number being the remoteproc id), and that allowed me to boot up the remoteprocs as well.
Yes, that is fine, but that also means that this was not working from the very beginning :(
And devtmpfs nodes are only created if you create a struct device somewhere with a proper major/minor, which you were not doing here, so you must have had a static /dev on your test systems, right?
I am not sure of what you mean by a static /dev? Could you explain? In case you mean the character device would be non-functional, that is not the case. They have been working for us since the beginning.
/dev on modern systems is managed by devtmpfs, which knows to create the device nodes when you properly register the device with the driver core. A "static" /dev is managed by mknod from userspace, like you did "by hand", and that is usually only done by older systems.
thanks,
greg k-h
On 6/23/2021 12:27 AM, Greg KH wrote:
On Wed, Jun 16, 2021 at 11:47:01AM -0700, Siddharth Gupta wrote:
On 6/15/2021 10:58 PM, Greg KH wrote:
On Tue, Jun 15, 2021 at 12:03:26PM -0700, Siddharth Gupta wrote:
On 6/14/2021 9:56 PM, Greg KH wrote:
On Mon, Jun 14, 2021 at 07:21:08PM -0700, Siddharth Gupta wrote:
When cdev_add is called after device_add has been called there is no way for the userspace to know about the addition of a cdev as cdev_add itself doesn't trigger a uevent notification, or for the kernel to know about the change to devt. This results in two problems: - mknod is never called for the cdev and hence no cdev appears on devtmpfs. - sysfs links to the new cdev are not established.
The cdev needs to be added and devt assigned before device_add() is called in order for the relevant sysfs and devtmpfs entries to be created and the uevent to be properly populated.
So this means no one ever ran this code on a system that used devtmpfs?
How was it ever tested?
My testing was done with toybox + Android's ueventd ramdisk. As I mentioned in the discussion, the race became evident recently. I will make sure to test all such changes without systemd/ueventd in the future.
It isn't an issue of systemd/ueventd, those do not control /dev on a normal system, that is what devtmpfs is for.
I am not fully aware of when devtmpfs is enabled or not, but in case it is not - systemd/ueventd will create these files with mknod, right?
No, systemd does not create device nodes, and neither does udev. Hasn't done so for well over 10 years now.
Oh okay. I thought ueventd does it because it allows setting the node permissions through ueventd.rc: https://android.googlesource.com/platform/system/core/+/master/rootdir/ueven...
I was even manually able to call mknod from the terminal when some of the remoteproc character device entries showed up (using major number from there, and minor number being the remoteproc id), and that allowed me to boot up the remoteprocs as well.
Yes, that is fine, but that also means that this was not working from the very beginning :(
Right. To clarify, I did this after we started seeing the problem on our devices, which led me to believe there was a race between ueventd and cdev_add(). Not sure anymore if that is not the case.
And devtmpfs nodes are only created if you create a struct device somewhere with a proper major/minor, which you were not doing here, so you must have had a static /dev on your test systems, right?
I am not sure of what you mean by a static /dev? Could you explain? In case you mean the character device would be non-functional, that is not the case. They have been working for us since the beginning.
/dev on modern systems is managed by devtmpfs, which knows to create the device nodes when you properly register the device with the driver core. A "static" /dev is managed by mknod from userspace, like you did "by hand", and that is usually only done by older systems.
Thanks for the explanation! As I mentioned earlier - I was under the impression that ueventd does it. I will go through our older builds where this was working (without this patch) and try to see how the dev nodes were being populated.
Thanks, Sid
thanks,
greg k-h
We can validate whether the remoteproc is correctly setup before making the cdev_add and device_add calls. This saves us the trouble of cleaning up later on.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org --- drivers/remoteproc/remoteproc_core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index 9ad8c5f..b65fce3 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -2333,16 +2333,16 @@ int rproc_add(struct rproc *rproc) struct device *dev = &rproc->dev; int ret;
- /* add char device for this remoteproc */ - ret = rproc_char_device_add(rproc); + ret = rproc_validate(rproc); if (ret < 0) return ret;
- ret = device_add(dev); + /* add char device for this remoteproc */ + ret = rproc_char_device_add(rproc); if (ret < 0) return ret;
- ret = rproc_validate(rproc); + ret = device_add(dev); if (ret < 0) return ret;
On Mon, Jun 14, 2021 at 07:21:09PM -0700, Siddharth Gupta wrote:
We can validate whether the remoteproc is correctly setup before making the cdev_add and device_add calls. This saves us the trouble of cleaning up later on.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org
drivers/remoteproc/remoteproc_core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
The rproc_char_device_remove() call currently unmaps the cdev region instead of simply deleting the cdev that was added as a part of the rproc_char_device_add() call. This change fixes that behaviour, and also fixes the order in which device_del() and cdev_del() need to be called.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org --- drivers/remoteproc/remoteproc_cdev.c | 2 +- drivers/remoteproc/remoteproc_core.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c index 0b8a84c..4ad98b0 100644 --- a/drivers/remoteproc/remoteproc_cdev.c +++ b/drivers/remoteproc/remoteproc_cdev.c @@ -124,7 +124,7 @@ int rproc_char_device_add(struct rproc *rproc)
void rproc_char_device_remove(struct rproc *rproc) { - __unregister_chrdev(MAJOR(rproc->dev.devt), rproc->index, 1, "remoteproc"); + cdev_del(&rproc->cdev); }
void __init rproc_init_cdev(void) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index b65fce3..b874280 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -2619,7 +2619,6 @@ int rproc_del(struct rproc *rproc) mutex_unlock(&rproc->lock);
rproc_delete_debug_dir(rproc); - rproc_char_device_remove(rproc);
/* the rproc is downref'ed as soon as it's removed from the klist */ mutex_lock(&rproc_list_mutex); @@ -2630,6 +2629,7 @@ int rproc_del(struct rproc *rproc) synchronize_rcu();
device_del(&rproc->dev); + rproc_char_device_remove(rproc);
return 0; }
On Mon, Jun 14, 2021 at 07:21:10PM -0700, Siddharth Gupta wrote:
The rproc_char_device_remove() call currently unmaps the cdev region instead of simply deleting the cdev that was added as a part of the rproc_char_device_add() call. This change fixes that behaviour, and also fixes the order in which device_del() and cdev_del() need to be called.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org
drivers/remoteproc/remoteproc_cdev.c | 2 +- drivers/remoteproc/remoteproc_core.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
When a failure occurs in rproc_add() it returns an error, but does not cleanup after itself. This change adds the failure path in such cases.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org --- drivers/remoteproc/remoteproc_core.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index b874280..d823f70 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -2343,8 +2343,10 @@ int rproc_add(struct rproc *rproc) return ret;
ret = device_add(dev); - if (ret < 0) - return ret; + if (ret < 0) { + put_device(dev); + goto rproc_remove_cdev; + }
dev_info(dev, "%s is available\n", rproc->name);
@@ -2355,7 +2357,7 @@ int rproc_add(struct rproc *rproc) if (rproc->auto_boot) { ret = rproc_trigger_auto_boot(rproc); if (ret < 0) - return ret; + goto rproc_remove_dev; }
/* expose to rproc_get_by_phandle users */ @@ -2364,6 +2366,13 @@ int rproc_add(struct rproc *rproc) mutex_unlock(&rproc_list_mutex);
return 0; + +rproc_remove_dev: + rproc_delete_debug_dir(rproc); + device_del(dev); +rproc_remove_cdev: + rproc_char_device_remove(rproc); + return ret; } EXPORT_SYMBOL(rproc_add);
On Mon, Jun 14, 2021 at 07:21:11PM -0700, Siddharth Gupta wrote:
When a failure occurs in rproc_add() it returns an error, but does not cleanup after itself. This change adds the failure path in such cases.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org
drivers/remoteproc/remoteproc_core.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
linux-stable-mirror@lists.linaro.org