->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses") Reported-and-tested-by: Pi-Cheng Chen pi-cheng.chen@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org ---
drivers/base/bus.c | 55 ++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 35 insertions(+), 20 deletions(-)
diff --git a/drivers/base/bus.c b/drivers/base/bus.c index 79bc203f51ef..d92dc109ba51 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -1112,11 +1112,36 @@ void subsys_dev_iter_exit(struct subsys_dev_iter *iter) } EXPORT_SYMBOL_GPL(subsys_dev_iter_exit);
+static void __subsys_interface_unregister(struct subsys_interface *sif, + struct device *lastdev) +{ + struct bus_type *subsys = sif->subsys; + struct subsys_dev_iter iter; + struct device *dev; + + mutex_lock(&subsys->p->mutex); + list_del_init(&sif->node); + if (sif->remove_dev) { + subsys_dev_iter_init(&iter, subsys, NULL, NULL); + while ((dev = subsys_dev_iter_next(&iter))) { + if (dev == lastdev) + break; + + sif->remove_dev(dev, sif); + } + subsys_dev_iter_exit(&iter); + } + mutex_unlock(&subsys->p->mutex); + + bus_put(subsys); +} + int subsys_interface_register(struct subsys_interface *sif) { struct bus_type *subsys; struct subsys_dev_iter iter; struct device *dev; + int ret = 0;
if (!sif || !sif->subsys) return -ENODEV; @@ -1129,38 +1154,28 @@ int subsys_interface_register(struct subsys_interface *sif) list_add_tail(&sif->node, &subsys->p->interfaces); if (sif->add_dev) { subsys_dev_iter_init(&iter, subsys, NULL, NULL); - while ((dev = subsys_dev_iter_next(&iter))) - sif->add_dev(dev, sif); + while ((dev = subsys_dev_iter_next(&iter))) { + ret = sif->add_dev(dev, sif); + if (ret) + break; + } subsys_dev_iter_exit(&iter); } mutex_unlock(&subsys->p->mutex);
- return 0; + if (ret) + __subsys_interface_unregister(sif, dev); + + return ret; } EXPORT_SYMBOL_GPL(subsys_interface_register);
void subsys_interface_unregister(struct subsys_interface *sif) { - struct bus_type *subsys; - struct subsys_dev_iter iter; - struct device *dev; - if (!sif || !sif->subsys) return;
- subsys = sif->subsys; - - mutex_lock(&subsys->p->mutex); - list_del_init(&sif->node); - if (sif->remove_dev) { - subsys_dev_iter_init(&iter, subsys, NULL, NULL); - while ((dev = subsys_dev_iter_next(&iter))) - sif->remove_dev(dev, sif); - subsys_dev_iter_exit(&iter); - } - mutex_unlock(&subsys->p->mutex); - - bus_put(subsys); + __subsys_interface_unregister(sif, NULL); } EXPORT_SYMBOL_GPL(subsys_interface_unregister);
On Fri, Jun 26, 2015 at 02:32:47PM +0530, Viresh Kumar wrote:
->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses")
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
greg k-h
On Wednesday, July 29, 2015 02:19:16 PM Greg KH wrote:
On Fri, Jun 26, 2015 at 02:32:47PM +0530, Viresh Kumar wrote:
->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses")
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
Thanks, Rafael
On Thu, Jul 30, 2015 at 01:01:21AM +0200, Rafael J. Wysocki wrote:
On Wednesday, July 29, 2015 02:19:16 PM Greg KH wrote:
On Fri, Jun 26, 2015 at 02:32:47PM +0530, Viresh Kumar wrote:
->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses")
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
Oops, sorry, missed that response. I'll go drop this patch then, thanks for letting me know.
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
That's a total mess...
Given that there are almost no uses of this api, I think people should work it out before any more start to pop up :)
thanks,
greg k-h
On Wednesday, July 29, 2015 03:37:43 PM Greg KH wrote:
On Thu, Jul 30, 2015 at 01:01:21AM +0200, Rafael J. Wysocki wrote:
On Wednesday, July 29, 2015 02:19:16 PM Greg KH wrote:
On Fri, Jun 26, 2015 at 02:32:47PM +0530, Viresh Kumar wrote:
->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses")
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
Oops, sorry, missed that response. I'll go drop this patch then, thanks for letting me know.
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
That's a total mess...
Given that there are almost no uses of this api, I think people should work it out before any more start to pop up :)
cpufreq is one of the users and that's where the problem is, but in my opinion it should be addressed in a different way.
But while we are at it, should the ->add_dev and ->remove_dev callbacks in struct subsys_interface return an int if their return values are always ignored? Maybe it would be better to redefine them to be void to make it clear that they can't fail?
Thanks, Rafael
On Thu, Jul 30, 2015 at 01:29:01AM +0200, Rafael J. Wysocki wrote:
On Wednesday, July 29, 2015 03:37:43 PM Greg KH wrote:
On Thu, Jul 30, 2015 at 01:01:21AM +0200, Rafael J. Wysocki wrote:
On Wednesday, July 29, 2015 02:19:16 PM Greg KH wrote:
On Fri, Jun 26, 2015 at 02:32:47PM +0530, Viresh Kumar wrote:
->add_dev() may fail and the error returned from it can be useful for the caller.
For example, if some of the resources aren't ready yet and -EPROBE_DEFER is returned from ->add_dev(), then the owner of 'struct subsys_interface' may want to try probing again at a later point of time. And that requires a proper return value from ->add_dev().
Also, if we hit an error while registering subsys_interface, then we should stop proceeding further and rollback whatever has been done until then. Break part of subsys_interface_unregister() into another routine, which lets us call ->remove_dev() for all devices for which ->add_dev() is already called.
Cc: 3.3+ stable@vger.kernel.org # 3.3+ Fixes: ca22e56debc5 ("driver-core: implement 'sysdev' functionality for regular devices and buses")
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
Oops, sorry, missed that response. I'll go drop this patch then, thanks for letting me know.
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
That's a total mess...
Given that there are almost no uses of this api, I think people should work it out before any more start to pop up :)
cpufreq is one of the users and that's where the problem is, but in my opinion it should be addressed in a different way.
But while we are at it, should the ->add_dev and ->remove_dev callbacks in struct subsys_interface return an int if their return values are always ignored? Maybe it would be better to redefine them to be void to make it clear that they can't fail?
void makes sense to me.
On 30-07-15, 01:29, Rafael J. Wysocki wrote:
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
Oops, sorry, missed that response. I'll go drop this patch then, thanks for letting me know.
It was discussed in last 2-3 days over a cpufreq related thread, no way you could have caught that :)
But, I think we should keep this patch until the time we find a better solution to our problem.
You are the driver core maintainer and probably no one can give a better suggestion to fix our problem, so lemme explain that again here :)
A platform specific cpufreq driver may depend on a external driver (say i2c which may control access to regulators) for its working, and until the time regulator is up along with i2c we need to defer cpufreq driver from being registered. We already have an EPROBE_DEFER mechanism to handle such situations nicely.
cpufreq core calling sequence at a glance: cpufreq_register_driver() -> subsys_interface_register() -> subsys->add_dev() for all present CPUs one by one -> cpufreq_add_dev() -> cpufreq_driver->init()
Now init() is the only location where the drivers initialize per policy (group of CPUs that share clock/voltage rails) stuff and can check if all the resources like clk/regulator are available or not. And so -EPROBE_DEFER will be returned from here.
The only broken part here is the return value of subsys->add_dev() and that I tried to fix.
Another important part here is that the cpufreq driver isn't probed against a cpu device, but a dummy platform device to get the EPROBE_DEFER story right, plus there are few other issues that it solves, like probing the right cpufreq driver.
Now, please explain a sane way to get things working for such platforms.
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
That's a total mess...
Right and so modifying that to propagate return value wouldn't be that straight forward.
Given that there are almost no uses of this api, I think people should work it out before any more start to pop up :)
cpufreq is one of the users and that's where the problem is, but in my opinion it should be addressed in a different way.
But while we are at it, should the ->add_dev and ->remove_dev callbacks in struct subsys_interface return an int if their return values are always ignored? Maybe it would be better to redefine them to be void to make it clear that they can't fail?
For remove_dev(), surely a void return type will make sense. I can put some effort to get that done. But not sure about add_dev() yet.
On Thursday, July 30, 2015 09:14:31 AM Viresh Kumar wrote:
On 30-07-15, 01:29, Rafael J. Wysocki wrote:
There is a small problem with it that I've already pointed out to Viresh.
Namely, while changing subsys_interface_(un)register() to handle return values from ->add_dev(), it doesn't do the same thing in bus_probe_device() which I believe it should for consistency at least.
Oops, sorry, missed that response. I'll go drop this patch then, thanks for letting me know.
It was discussed in last 2-3 days over a cpufreq related thread, no way you could have caught that :)
But, I think we should keep this patch until the time we find a better solution to our problem.
You are the driver core maintainer and probably no one can give a better suggestion to fix our problem, so lemme explain that again here :)
A platform specific cpufreq driver may depend on a external driver (say i2c which may control access to regulators) for its working, and until the time regulator is up along with i2c we need to defer cpufreq driver from being registered. We already have an EPROBE_DEFER mechanism to handle such situations nicely.
cpufreq core calling sequence at a glance: cpufreq_register_driver() -> subsys_interface_register() -> subsys->add_dev() for all present CPUs one by one -> cpufreq_add_dev() -> cpufreq_driver->init()
Now init() is the only location where the drivers initialize per policy (group of CPUs that share clock/voltage rails) stuff and can check if all the resources like clk/regulator are available or not. And so -EPROBE_DEFER will be returned from here.
No, this is not the only place (see below).
The only broken part here is the return value of subsys->add_dev() and that I tried to fix.
Another important part here is that the cpufreq driver isn't probed against a cpu device, but a dummy platform device to get the EPROBE_DEFER story right,
So the big question here is: why? Why is that done the way it is done?
plus there are few other issues that it solves, like probing the right cpufreq driver.
Now, please explain a sane way to get things working for such platforms.
Well, on ACPI systems we actually do probe CPU devices. We have a processor driver there that binds to CPU devices and the cpufreq driver is just a frontend to that.
So question is what prevents DT-based systems from doing it analogously.
Now, even if you use a fake platform device for that (I'm sure there are reasons for doing that, but I'd very much like them to be explained), then all of the information on dependencies should already be available to the ->probe callback of that device's driver, so it can check them before registering the cpufreq interface, can't it?
Essentially, what you're suggesting to do is something like: Make the ->probe of one device's driver register a subsys interface for a specific bus type and check what ->add_dev of that interface returns for each device on that bus and if that is -EPROBE_DEFER, return it as its own return value. Do you honestly think this is a good design?
But then, the question is whether or not the probing should fail and what if device_attach() returns 0 and one of the ->add_dev() callbacks returns an error.
That's a total mess...
Right and so modifying that to propagate return value wouldn't be that straight forward.
That's the point. Doing it in one place only just because that covers your particular use case (which is questionable at best) is not exactly correct in my view.
Given that there are almost no uses of this api, I think people should work it out before any more start to pop up :)
cpufreq is one of the users and that's where the problem is, but in my opinion it should be addressed in a different way.
But while we are at it, should the ->add_dev and ->remove_dev callbacks in struct subsys_interface return an int if their return values are always ignored? Maybe it would be better to redefine them to be void to make it clear that they can't fail?
For remove_dev(), surely a void return type will make sense. I can put some effort to get that done. But not sure about add_dev() yet.
The way ->add_dev is used by the core today, it should be void as well.
Changing the way the core uses it would require figuring out how it should interact with device_attach().
Thanks, Rafael
On 30-07-15, 20:53, Rafael J. Wysocki wrote:
Well, on ACPI systems we actually do probe CPU devices. We have a processor driver there that binds to CPU devices and the cpufreq driver is just a frontend to that.
Hmm, maybe I need to look at that in detail..
So question is what prevents DT-based systems from doing it analogously.
Don't have an answer to it yet.
Now, even if you use a fake platform device for that (I'm sure there are reasons for doing that, but I'd very much like them to be explained),
The other reason apart from the EPROBE_DEFER thing was to identify the right driver for a platform. For multiplatform kernels, there can be multiple cpufreq drivers present in the kernel and there was no other way to identify the right driver platform wants to probe.
then all of the information on dependencies should already be available to the ->probe callback of that device's driver, so it can check them before registering the cpufreq interface, can't it?
That's what we try to do today for cpufreq-dt, for example. But that has to be done for every possible policy the system can have as all might have separate resources to allocate. For cpufreq-dt, we do it only for cpu0 today, and assume others will work as well if cpu0 can.
The real deal is that we need a probe() per policy here, for which init() fitted well :)
Essentially, what you're suggesting to do is something like: Make the ->probe of one device's driver register a subsys interface for a specific bus type and check what ->add_dev of that interface returns for each device on that bus and if that is -EPROBE_DEFER, return it as its own return value. Do you honestly think this is a good design?
No. I don't really thing so. That's why I was asking for suggestions to do it proper. Maybe processor driver is the way to look for, I will investigate further on that.
But until the time that is done, and I expect that to take some time, can't we check the return value of ->add_dev()?
On Friday, July 31, 2015 11:39:07 AM Viresh Kumar wrote:
On 30-07-15, 20:53, Rafael J. Wysocki wrote:
Well, on ACPI systems we actually do probe CPU devices. We have a processor driver there that binds to CPU devices and the cpufreq driver is just a frontend to that.
Hmm, maybe I need to look at that in detail..
So question is what prevents DT-based systems from doing it analogously.
Don't have an answer to it yet.
Now, even if you use a fake platform device for that (I'm sure there are reasons for doing that, but I'd very much like them to be explained),
The other reason apart from the EPROBE_DEFER thing was to identify the right driver for a platform. For multiplatform kernels, there can be multiple cpufreq drivers present in the kernel and there was no other way to identify the right driver platform wants to probe.
then all of the information on dependencies should already be available to the ->probe callback of that device's driver, so it can check them before registering the cpufreq interface, can't it?
That's what we try to do today for cpufreq-dt, for example. But that has to be done for every possible policy the system can have as all might have separate resources to allocate. For cpufreq-dt, we do it only for cpu0 today, and assume others will work as well if cpu0 can.
The real deal is that we need a probe() per policy here, for which init() fitted well :)
Essentially, what you're suggesting to do is something like: Make the ->probe of one device's driver register a subsys interface for a specific bus type and check what ->add_dev of that interface returns for each device on that bus and if that is -EPROBE_DEFER, return it as its own return value. Do you honestly think this is a good design?
No. I don't really thing so. That's why I was asking for suggestions to do it proper. Maybe processor driver is the way to look for, I will investigate further on that.
But until the time that is done, and I expect that to take some time, can't we check the return value of ->add_dev()?
As I said, either do it everywhere, or do it nowhere (in which case it can be void). Doing it in one place only is plain confusing and generally incorrect.
Thanks, Rafael
On 29-07-15, 14:19, Greg KH wrote:
I don't see how this is a stable bug fix, what is resolved by it that doesn't work today? Is there some code that is expecting this functionality that has never been present?
I'll go queue it up, but I don't think it is -stable material, but feel free to change my mind.
Yeah, probably its a bit overdone. Leave the stable thing for now..
linaro-kernel@lists.linaro.org