[PATCH 0/2] staging: greybus: loopback: Fix two oops paths in gb_loopback

List overview All Threads
Download

newer

older

[PATCH] staging: greybus: fix...

[PATCH] MAINTAINERS: add greybus...

Bryan O'Donoghue

8 Jan 2017 8 Jan '17

3:27 p.m.

Alexandre reported a bug in the loopback rmmod path subsequent to a failure in gb_pm_runtime_get_sync - something that can happen when you are developing device-side firmware easily. Doing some cursory rmmod testing on gb_loopback then showed a second (and long standing) error pertaining to removal of gb_dev.root.

This series fixes both issues.

Bryan O'Donoghue (2): staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling staging: greybus: loopback: fix oops on rmmod gb_loopback

drivers/staging/greybus/loopback.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)

-- 2.7.4

Show replies by date

Bryan O'Donoghue

8 Jan 8 Jan

3:27 p.m.

New subject: [PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

commit e854ff58ed70 ("greybus: loopback: add runtime pm support") introduces pm runtime support to the loopback code. If a gb_pm_runtime_get_sync() fails, currently we immediately return from the loopback thread.

Alexandre reports that later on, subsequent to the afore mentioned failure, doing an rmmod will trigger a kthread_stop() which will will generate a fault. The fault results from dereferencing gb->task in gb_loopback_disconnect().

One way to fix that is to have the loopback thread do_exit() instead of simply returning on gb_pm_runtime_get_sync() failure - however this will leave dangling sysfs entries with no loopback thread to take action based on the sysfs inputs.

This patch fixes the fault by ignoring the gb_pm_runtime_get_sync() result code, this will allow only gb_loopback_disconnect() to terminate the loopback thread. Fix validated in gbsim.

Next steps on this driver entail changing the relationship between /sysfs and the loopback thread - probably making the loopback thread dynamically started/stopped - as opposed to the pre-allocation model currently used but, those changes are for a future series to address.

Reported-by: Alexandre Bailon abailon@baylibre.com Signed-off-by: Bryan O'Donoghue pure.logic@nexus-software.ie --- drivers/staging/greybus/loopback.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/staging/greybus/loopback.c b/drivers/staging/greybus/loopback.c index 83f3b9f..88d4f5c 100644 --- a/drivers/staging/greybus/loopback.c +++ b/drivers/staging/greybus/loopback.c @@ -886,9 +886,7 @@ static int gb_loopback_fn(void *data) gb_pm_runtime_put_autosuspend(bundle); wait_event_interruptible(gb->wq, gb->type || kthread_should_stop()); - ret = gb_pm_runtime_get_sync(bundle); - if (ret) - return ret; + gb_pm_runtime_get_sync(bundle); }

if (kthread_should_stop())

-- 2.7.4

Johan Hovold

9 Jan 9 Jan

11:19 a.m.

New subject: [PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

On Sun, Jan 08, 2017 at 03:27:18PM +0000, Bryan O'Donoghue wrote:

...

commit e854ff58ed70 ("greybus: loopback: add runtime pm support") introduces pm runtime support to the loopback code. If a gb_pm_runtime_get_sync() fails, currently we immediately return from the loopback thread.

Alexandre reports that later on, subsequent to the afore mentioned failure, doing an rmmod will trigger a kthread_stop() which will will generate a fault. The fault results from dereferencing gb->task in gb_loopback_disconnect().

One way to fix that is to have the loopback thread do_exit() instead of simply returning on gb_pm_runtime_get_sync() failure - however this will leave dangling sysfs entries with no loopback thread to take action based on the sysfs inputs.

This patch fixes the fault by ignoring the gb_pm_runtime_get_sync() result code, this will allow only gb_loopback_disconnect() to terminate the loopback thread. Fix validated in gbsim.

No, you must check the return for resume errors, and not attempt any further I/O in case of failure.

Kill the thread, or somehow report a test failure and wait for user space to retry.

This is not directly related to the sysfs entries, they belong to the bundle and should stay while the device is present. As mentioned before, the sysfs-interface is not the right interface for this type of device, but that's a different story.

...

Next steps on this driver entail changing the relationship between /sysfs and the loopback thread - probably making the loopback thread dynamically started/stopped - as opposed to the pre-allocation model currently used but, those changes are for a future series to address.

You might get away with using a work struct scheduled from the current sysfs-interface, but long term this should probably be shifted over to using a character-device interface.

Thanks, Johan

Bryan O'Donoghue

11:29 a.m.

New subject: [PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

On January 9, 2017 11:19:09 AM GMT+00:00, Johan Hovold johan@kernel.org wrote:

...

On Sun, Jan 08, 2017 at 03:27:18PM +0000, Bryan O'Donoghue wrote:

...
commit e854ff58ed70 ("greybus: loopback: add runtime pm support") introduces pm runtime support to the loopback code. If a gb_pm_runtime_get_sync() fails, currently we immediately return from

the

...
loopback thread.

Alexandre reports that later on, subsequent to the afore mentioned

failure,

...
doing an rmmod will trigger a kthread_stop() which will will generate

a

...
fault. The fault results from dereferencing gb->task in gb_loopback_disconnect().

One way to fix that is to have the loopback thread do_exit() instead

of

...
simply returning on gb_pm_runtime_get_sync() failure - however this

will

...
leave dangling sysfs entries with no loopback thread to take action

based

...
on the sysfs inputs.

This patch fixes the fault by ignoring the gb_pm_runtime_get_sync()

result

...
code, this will allow only gb_loopback_disconnect() to terminate the loopback thread. Fix validated in gbsim.

No, you must check the return for resume errors, and not attempt any further I/O in case of failure.

The greybus operations themselves do report an error subsequent to the user-space tool.

...

Kill the thread, or somehow report a test failure and wait for user space to retry.

Killing the thread would be messy, however yes I take your general point.

It should be possible to use the gb_pm_runtime_get_sync as a determinant and cease further operations.

I spin that into a v2 this evening.

...

This is not directly related to the sysfs entries, they belong to the bundle and should stay while the device is present. As mentioned before, the sysfs-interface is not the right interface for this type of device, but that's a different story.

...
Next steps on this driver entail changing the relationship between

/sysfs

...
and the loopback thread - probably making the loopback thread

dynamically

...
started/stopped - as opposed to the pre-allocation model currently

used

...
but, those changes are for a future series to address.

You might get away with using a work struct scheduled from the current sysfs-interface, but long term this should probably be shifted over to using a character-device interface.

Thanks, Johan

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

Johan Hovold

11:57 a.m.

New subject: [PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

On Mon, Jan 09, 2017 at 11:29:50AM +0000, Bryan O'Donoghue wrote:

...

On January 9, 2017 11:19:09 AM GMT+00:00, Johan Hovold johan@kernel.org wrote:

...
On Sun, Jan 08, 2017 at 03:27:18PM +0000, Bryan O'Donoghue wrote:

...
commit e854ff58ed70 ("greybus: loopback: add runtime pm support") introduces pm runtime support to the loopback code. If a gb_pm_runtime_get_sync() fails, currently we immediately return from

the

...
loopback thread.

Alexandre reports that later on, subsequent to the afore mentioned

failure,

...
doing an rmmod will trigger a kthread_stop() which will will generate

a

...
fault. The fault results from dereferencing gb->task in gb_loopback_disconnect().

One way to fix that is to have the loopback thread do_exit() instead

of

...
simply returning on gb_pm_runtime_get_sync() failure - however this

will

...
leave dangling sysfs entries with no loopback thread to take action

based

...
on the sysfs inputs.

This patch fixes the fault by ignoring the gb_pm_runtime_get_sync()

result

...
code, this will allow only gb_loopback_disconnect() to terminate the loopback thread. Fix validated in gbsim.

No, you must check the return for resume errors, and not attempt any further I/O in case of failure.

The greybus operations themselves do report an error subsequent to the user-space tool.

Then you must already have a mechanism for reporting errors. Just set a flag and don't execute any further operations after resume failure.

...

...
Kill the thread, or somehow report a test failure and wait for user space to retry.

Killing the thread would be messy, however yes I take your general point.

Setting a flag and doing a clean exit should do right? Considering a resume-failure as a fatal error is reasonable.

...

It should be possible to use the gb_pm_runtime_get_sync as a determinant and cease further operations.

But cancelling the current test and allow user-space to restart it would be more well-behaved of course. Not sure it's worth it given the state of things though.

Johan

Bryan O'Donoghue

12:08 p.m.

New subject: [PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

On January 9, 2017 11:57:36 AM GMT+00:00, Johan Hovold johan@kernel.org wrote:

...

On Mon, Jan 09, 2017 at 11:29:50AM +0000, Bryan O'Donoghue wrote:

...
On January 9, 2017 11:19:09 AM GMT+00:00, Johan Hovold

johan@kernel.org wrote:

...
...
On Sun, Jan 08, 2017 at 03:27:18PM +0000, Bryan O'Donoghue wrote:

...
commit e854ff58ed70 ("greybus: loopback: add runtime pm support") introduces pm runtime support to the loopback code. If a gb_pm_runtime_get_sync() fails, currently we immediately return

from

...
...
the

...
loopback thread.

Alexandre reports that later on, subsequent to the afore mentioned

failure,

...
doing an rmmod will trigger a kthread_stop() which will will

generate

...
...
a

...
fault. The fault results from dereferencing gb->task in gb_loopback_disconnect().

One way to fix that is to have the loopback thread do_exit()

instead

...
...
of

...
simply returning on gb_pm_runtime_get_sync() failure - however

this

...
...
will

...
leave dangling sysfs entries with no loopback thread to take

action

...
...
based

...
on the sysfs inputs.

This patch fixes the fault by ignoring the

gb_pm_runtime_get_sync()

...
...
result

...
code, this will allow only gb_loopback_disconnect() to terminate

the

...
...
...
loopback thread. Fix validated in gbsim.

No, you must check the return for resume errors, and not attempt any further I/O in case of failure.

The greybus operations themselves do report an error subsequent to

the

...
user-space tool.

Then you must already have a mechanism for reporting errors. Just set a flag and don't execute any further operations after resume failure.

...
...
Kill the thread, or somehow report a test failure and wait for user space to retry.

Killing the thread would be messy, however yes I take your general point.

Setting a flag and doing a clean exit should do right? Considering a resume-failure as a fatal error is reasonable.

...
It should be possible to use the gb_pm_runtime_get_sync as a determinant and cease further operations.

But cancelling the current test and allow user-space to restart it would be more well-behaved of course. Not sure it's worth it given the state of things though.

Sorry yes, I'm not saying anything different.

...

Johan

--- bod

Bryan O'Donoghue

8 Jan 8 Jan

3:27 p.m.

New subject: [PATCH 2/2] staging: greybus: loopback: fix oops on rmmod gb_loopback

Doing an rmmod gb_loopback will cause an oops with the following backtrace.

Call Trace: debugfs_remove+0xaf/0x180 gb_loopback_disconnect+0x36/0x160 [gb_loopback] greybus_remove+0x87/0x1a0 [greybus] device_release_driver_internal+0x14a/0x200 driver_detach+0x39/0x60 bus_remove_driver+0x47/0xa0 driver_unregister+0x27/0x50 greybus_deregister_driver+0xd/0x10 [greybus] loopback_exit+0x1c/0x36 [gb_loopback] SyS_delete_module+0x170/0x1b0 entry_SYSCALL_64_fastpath+0x13/0x93

This happens because gb_dev.root is removed prior to greybus_deregister_driver which itself will trigger gb_loopback_disconnect to run - leading to the parent of gb->file being NULL.

Typically up to this point when tearing down greybus modules we have removed gb_es2 prior to removing gb_loopback. In this case gb_es2 will run subordinate greybus->disconnect() routines. A subsequent rmmod of gb_loopback then will not re-run gb_loopback_disconnect().

On project Ara the second module removal flow was the only practiced flow hence it's only now in gbsim where we have more freedom to rmmod that this bug is apparent. The fix itself is pretty straight-forward.

Signed-off-by: Bryan O'Donoghue pure.logic@nexus-software.ie --- drivers/staging/greybus/loopback.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/greybus/loopback.c b/drivers/staging/greybus/loopback.c index 88d4f5c..8f3eb33 100644 --- a/drivers/staging/greybus/loopback.c +++ b/drivers/staging/greybus/loopback.c @@ -1260,9 +1260,9 @@ module_init(loopback_init);

static void __exit loopback_exit(void) { - debugfs_remove_recursive(gb_dev.root); greybus_deregister(&gb_loopback_driver); class_unregister(&loopback_class); + debugfs_remove_recursive(gb_dev.root); ida_destroy(&loopback_ida); } module_exit(loopback_exit);

-- 2.7.4

Johan Hovold

9 Jan 9 Jan

11:51 a.m.

New subject: [PATCH 2/2] staging: greybus: loopback: fix oops on rmmod gb_loopback

On Sun, Jan 08, 2017 at 03:27:19PM +0000, Bryan O'Donoghue wrote:

...

Doing an rmmod gb_loopback will cause an oops with the following backtrace.

Call Trace: debugfs_remove+0xaf/0x180 gb_loopback_disconnect+0x36/0x160 [gb_loopback] greybus_remove+0x87/0x1a0 [greybus] device_release_driver_internal+0x14a/0x200 driver_detach+0x39/0x60 bus_remove_driver+0x47/0xa0 driver_unregister+0x27/0x50 greybus_deregister_driver+0xd/0x10 [greybus] loopback_exit+0x1c/0x36 [gb_loopback] SyS_delete_module+0x170/0x1b0 entry_SYSCALL_64_fastpath+0x13/0x93

This happens because gb_dev.root is removed prior to greybus_deregister_driver which itself will trigger gb_loopback_disconnect to run - leading to the parent of gb->file being NULL.

Typically up to this point when tearing down greybus modules we have removed gb_es2 prior to removing gb_loopback. In this case gb_es2 will run subordinate greybus->disconnect() routines. A subsequent rmmod of gb_loopback then will not re-run gb_loopback_disconnect().

On project Ara the second module removal flow was the only practiced flow hence it's only now in gbsim where we have more freedom to rmmod that this bug is apparent. The fix itself is pretty straight-forward.

There was nothing preventing you from unloading bundle-drivers while keeping the host-device driver (e.g. gb_es2) loaded on Project Ara, even if this typically wasn't done, yes.

...

drivers/staging/greybus/loopback.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/greybus/loopback.c b/drivers/staging/greybus/loopback.c index 88d4f5c..8f3eb33 100644 --- a/drivers/staging/greybus/loopback.c +++ b/drivers/staging/greybus/loopback.c @@ -1260,9 +1260,9 @@ module_init(loopback_init); static void __exit loopback_exit(void) {

debugfs_remove_recursive(gb_dev.root); greybus_deregister(&gb_loopback_driver); class_unregister(&loopback_class);

debugfs_remove_recursive(gb_dev.root);

This is just a plain bug, and in fact, you should only be using debugfs_remove here, as any bundle-specific files should have been cleaned up as part of disconnect.

...

ida_destroy(&loopback_ida); } module_exit(loopback_exit);

Thanks, Johan

3109

days inactive

3110

days old

greybus-dev@lists.linaro.org

7 comments

participants

tags (0)

participants (2)

Bryan O'Donoghue
Johan Hovold