Support ROHM BU27034 ALS sensor
This series adds support for ROHM BU27034 Ambient Light Sensor.
The BU27034 has configurable gain and measurement (integration) time settings. Both of these have inversely proportional relation to the sensor's intensity channel scale.
Many users only set the scale, which means that many drivers attempt to 'guess' the best gain+time combination to meet the scale. Usually this is the biggest integration time which allows setting the requested scale. Typically, increasing the integration time has better accuracy than increasing the gain, which often amplifies the noise as well as the real signal.
However, there may be cases where more responsive sensors are needed. So, in some cases the longest integration times may not be what the user prefers. The driver has no way of knowing this.
Hence, the approach taken by this series is to allow user to set both the scale and the integration time with following logic:
1. When scale is set, the existing integration time is tried to be maintained as a first priority. 1a) If the requested scale can't be met by current time, then also other time + gain combinations are searched. If scale can be met by some other integration time, then the new time may be applied. If the time setting is common for all channels, then also other channels must be able to maintain their scale with this new time (by changing their gain). The new times are scanned in the order of preference (typically the longest times first). 1b) If the requested scale can be met using current time, then only the gain for the channel is changed.
2. When the integration time change - scale is tried to be maintained. When integration time change is requested also gain for all impacted channels is adjusted so that the scale is not changed, or is chaned as little as possible. This is different from the RFCv1 where the request was rejected if suitable gain couldn't be found for some channel(s).
This logic is simple. When total gain (either caused by time or hw-gain) is doubled, the scale gets halved. Also, the supported times are given a 'multiplier' value which tells how much they increase the total gain.
However, when I wrote this logic in bu27034 driver, I made quite a few errors on the way - and driver got pretty big. As I am writing drivers for two other sensors (RGB C/IR + flicker BU27010 and RGB C/IR BU27008) with similar gain-time-scale logic I thought that adding common helpers for these computations might be wise. I hope this way all the bugs will be concentrated in one place and not in every individual driver ;)
Hence, this series also intriduces IIO gain-time-scale helpers (abbreviated as gts-helpers) + a couple of KUnit tests for the most hairy parts.
Speaking of which - testing the devm interfaces requires a 'dummy device'. There were neat helpers in DRM tests for creating and freeing such a device. This series moves those helpers to more generic location. What is worth noting is that there is something similar ongoing in the CCF territory: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/ These efforts should be somehow coordinated in order to avoid any avoid conflicts.
Finally, these added helpers do provide some value also for drivers which only: a) allow gain change or b) allow changing both the time and gain while trying to maintain the scale.
For a) we provide the gain - selector (register value) table format + selector to gain look-ups, gain <-> scale conversions and the available scales helpers.
For latter case we also provide the time-tables, and actually all the APIs should be usable by setting the time multiplier to 1. (not testeted thoroughly though).
The patch 1/8 introduces the helpers for creating/dropping a test device for devm-tests. It can be applied alone.
The patches 2/8 (convert DRM tests to use new helper) depends on patch 1/8 but is othervice not part of this series. It can be applied to DRM tree after the dependency to 1/8 is handled.
The patch 5/8 (IIO GTS tests) also depends on the patch 1/8 (and also other patches in the series).
Rest of the series should be Ok to be applied with/without the patches 1/8, 2/8, 5/8 - although the 5/8 would be "nice to have" together with the rest of the series for the testability reasons.
Revision history: v4 => v5: Mostly fixes to review comments from Andy and Jonathan. - more accurate change-log in individual patches - copy code from DRM test helper instead of moving it to simplify merging - document all exported GTS helpers. - inline a few GTS helpers - use again Milli lux for the bu27034 with RAW IIO_LIGHT channel and scale - Fix bug from added in v4 bu27034 time setting.
v3 => v4: (Still mostly fixes to review comments from Andy and Jonathan) - more accurate change-log in individual patches - dt-binding and maintainer patches unchanged. - dropped unused helpers and converted ones currently used only internally to static. - extracted "dummy device" creation helpers from DRM tests. - added tests for devm APIs - dropped scale for PROCESSED channel in BU27034 and converted mLux values to luxes - dropped channel 2 GAIN setting which can't be done due to HW limitations.
v2 => v3: (Mostly fixes to review comments from Andy and Jonathan) - dt-binding and maintainer patches unchanged. - iio-gts-helper tests: Use namespaces - iio-gts-helpers + bu27034 plenty of changes. See more comprehensive changelog in individual patches.
RFCv1 => v2: dt-bindings: - Fix binding file name and id by using comma instead of a hyphen to separate the vendor and part names. gts-helpers: - fix include guardian - Improve kernel doc for iio_init_iio_gts. - Add iio_gts_scale_to_total_gain - Add iio_gts_total_gain_to_scale - Fix review comments from Jonathan - add documentation to few functions - replace 0xffffffffffffffffLLU by U64_MAX - some styling fixes - drop unnecessary NULL checks - order function arguments by in / out purpose - drop GAIN_SCALE_ITIME_MS() - Add helpers for available scales and times - Rename to iio-gts-helpers gts-tests: - add tests for available scales/times helpers - adapt to renamed iio-gts-helpers.h header bu27034-driver: - (really) protect read-only registers - fix get and set gain - buffered mode - Protect the whole sequences including meas_en/meas_dis to avoid messing up the enable / disable order - typofixes / doc improvements - change dropped GAIN_SCALE_ITIME_MS() to GAIN_SCALE_ITIME_US() - use more accurate scale for lux channel (milli lux) - provide available scales / integration times (using helpers). - adapt to renamed iio-gts-helpers.h file - bu27034 - longer lines in Kconfig - Drop bu27034_meas_en and bu27034_meas_dis wrappers. - Change device-name from bu27034-als to bu27034 MAINTAINERS: - Add iio-list
---
Matti Vaittinen (8): drivers: kunit: Generic helpers for test device creation drm/tests: helpers: Use generic helpers dt-bindings: iio: light: Support ROHM BU27034 iio: light: Add gain-time-scale helpers iio: test: test gain-time-scale helpers MAINTAINERS: Add IIO gain-time-scale helpers iio: light: ROHM BU27034 Ambient Light Sensor MAINTAINERS: Add ROHM BU27034
.../bindings/iio/light/rohm,bu27034.yaml | 46 + MAINTAINERS | 14 + drivers/base/test/Kconfig | 5 + drivers/base/test/Makefile | 2 + drivers/base/test/test_kunit_device.c | 83 + drivers/gpu/drm/Kconfig | 2 + .../gpu/drm/tests/drm_client_modeset_test.c | 5 +- drivers/gpu/drm/tests/drm_kunit_helpers.c | 69 - drivers/gpu/drm/tests/drm_managed_test.c | 5 +- drivers/gpu/drm/tests/drm_modes_test.c | 5 +- drivers/gpu/drm/tests/drm_probe_helper_test.c | 5 +- drivers/gpu/drm/vc4/Kconfig | 1 + drivers/gpu/drm/vc4/tests/vc4_mock.c | 3 +- .../gpu/drm/vc4/tests/vc4_test_pv_muxing.c | 9 +- drivers/iio/Kconfig | 3 + drivers/iio/Makefile | 1 + drivers/iio/industrialio-gts-helper.c | 1064 ++++++++++++ drivers/iio/light/Kconfig | 14 + drivers/iio/light/Makefile | 1 + drivers/iio/light/rohm-bu27034.c | 1482 +++++++++++++++++ drivers/iio/test/Kconfig | 14 + drivers/iio/test/Makefile | 1 + drivers/iio/test/iio-test-gts.c | 542 ++++++ include/drm/drm_kunit_helpers.h | 7 +- include/kunit/platform_device.h | 13 + include/linux/iio/iio-gts-helper.h | 206 +++ 26 files changed, 3515 insertions(+), 87 deletions(-) create mode 100644 Documentation/devicetree/bindings/iio/light/rohm,bu27034.yaml create mode 100644 drivers/base/test/test_kunit_device.c create mode 100644 drivers/iio/industrialio-gts-helper.c create mode 100644 drivers/iio/light/rohm-bu27034.c create mode 100644 drivers/iio/test/iio-test-gts.c create mode 100644 include/kunit/platform_device.h create mode 100644 include/linux/iio/iio-gts-helper.h
base-commit: eeac8ede17557680855031c6f305ece2378af326
The creation of a dummy device in order to test managed interfaces is a generally useful test feature. The drm test helpers drm_kunit_helper_alloc_device() and drm_kunit_helper_free_device() are doing exactly this. It makes no sense that each and every component which intends to be testing managed interfaces will create similar helpers so stole these for more generic use.
Signed-off-by: Matti Vaittinen mazziesaccount@gmail.com
--- Changes: v4 => v5: - Add accidentally dropped header and email recipients - do not rename + move helpers from DRM but add temporary dublicates to simplify merging. (This patch does not touch DRM and can be merged separately. DRM patch and IIO test patch still depend on this one).
Please note that there's something similar ongoing in the CCF: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/
I do like the simplicity of these DRM-originated helpers so I think they're worth. I do equally like the Stephen's idea of having the "dummy platform device" related helpers under drivers/base and the header being in include/kunit/platform_device.h which is similar to real platform device under include/linux/platform_device.h --- drivers/base/test/Kconfig | 5 ++ drivers/base/test/Makefile | 2 + drivers/base/test/test_kunit_device.c | 83 +++++++++++++++++++++++++++ include/kunit/platform_device.h | 13 +++++ 4 files changed, 103 insertions(+) create mode 100644 drivers/base/test/test_kunit_device.c create mode 100644 include/kunit/platform_device.h
diff --git a/drivers/base/test/Kconfig b/drivers/base/test/Kconfig index 610a1ba7a467..642b5b183c10 100644 --- a/drivers/base/test/Kconfig +++ b/drivers/base/test/Kconfig @@ -1,4 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 + +config TEST_KUNIT_DEVICE_HELPERS + depends on KUNIT + tristate + config TEST_ASYNC_DRIVER_PROBE tristate "Build kernel module to test asynchronous driver probing" depends on m diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile index 7f76fee6f989..49926524ec6f 100644 --- a/drivers/base/test/Makefile +++ b/drivers/base/test/Makefile @@ -1,5 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE) += test_async_driver_probe.o
+obj-$(CONFIG_TEST_KUNIT_DEVICE_HELPERS) += test_kunit_device.o + obj-$(CONFIG_DRIVER_PE_KUNIT_TEST) += property-entry-test.o CFLAGS_property-entry-test.o += $(DISABLE_STRUCTLEAK_PLUGIN) diff --git a/drivers/base/test/test_kunit_device.c b/drivers/base/test/test_kunit_device.c new file mode 100644 index 000000000000..75790e15b85c --- /dev/null +++ b/drivers/base/test/test_kunit_device.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * These helpers have been extracted from drm test code at + * drm_kunit_helpers.c which was authored by + * Maxime Ripard maxime@cerno.tech + */ + +#include <linux/device.h> +#include <linux/platform_device.h> + +#include <kunit/platform_device.h> + +#define KUNIT_DEVICE_NAME "test-kunit-mock-device" + +static int fake_probe(struct platform_device *pdev) +{ + return 0; +} + +static int fake_remove(struct platform_device *pdev) +{ + return 0; +} + +static struct platform_driver fake_platform_driver = { + .probe = fake_probe, + .remove = fake_remove, + .driver = { + .name = KUNIT_DEVICE_NAME, + }, +}; + +/** + * test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test + * @test: The test context object + * + * This allocates a fake struct &device to create a mock for a KUnit + * test. The device will also be bound to a fake driver. It will thus be + * able to leverage the usual infrastructure and most notably the + * device-managed resources just like a "real" device. + * + * Callers need to make sure test_kunit_helper_free_device() on the + * device when done. + * + * Returns: + * A pointer to the new device, or an ERR_PTR() otherwise. + */ +struct device *test_kunit_helper_alloc_device(struct kunit *test) +{ + struct platform_device *pdev; + int ret; + + ret = platform_driver_register(&fake_platform_driver); + KUNIT_ASSERT_EQ(test, ret, 0); + + pdev = platform_device_alloc(KUNIT_DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + return &pdev->dev; +} +EXPORT_SYMBOL_GPL(test_kunit_helper_alloc_device); + +/** + * test_kunit_helper_free_device - Frees a mock device + * @test: The test context object + * @dev: The device to free + * + * Frees a device allocated with test_kunit_helper_alloc_device(). + */ +void test_kunit_helper_free_device(struct kunit *test, struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + + platform_device_unregister(pdev); + platform_driver_unregister(&fake_platform_driver); +} +EXPORT_SYMBOL_GPL(test_kunit_helper_free_device); + +MODULE_AUTHOR("Matti Vaittinen mazziesaccount@gmail.com"); +MODULE_LICENSE("GPL"); diff --git a/include/kunit/platform_device.h b/include/kunit/platform_device.h new file mode 100644 index 000000000000..2a9c7bdd75eb --- /dev/null +++ b/include/kunit/platform_device.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __KUNIT_PLATFORM_DEVICE__ +#define __KUNIT_PLATFORM_DEVICE__ + +#include <kunit/test.h> + +struct device; + +struct device *test_kunit_helper_alloc_device(struct kunit *test); +void test_kunit_helper_free_device(struct kunit *test, struct device *dev); + +#endif
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
--- /dev/null +++ b/include/kunit/platform_device.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __KUNIT_PLATFORM_DEVICE__ +#define __KUNIT_PLATFORM_DEVICE__
+#include <kunit/test.h>
+struct device;
+struct device *test_kunit_helper_alloc_device(struct kunit *test); +void test_kunit_helper_free_device(struct kunit *test, struct device *dev);
Why are you calling this a "platform_device" when it isn't a platform device at all?
Why not just say "device.h" here?
thanks,
greg k-h
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
--- /dev/null +++ b/drivers/base/test/test_kunit_device.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/*
- These helpers have been extracted from drm test code at
- drm_kunit_helpers.c which was authored by
- Maxime Ripard maxime@cerno.tech
- */
+#include <linux/device.h> +#include <linux/platform_device.h>
+#include <kunit/platform_device.h>
+#define KUNIT_DEVICE_NAME "test-kunit-mock-device"
+static int fake_probe(struct platform_device *pdev)
Please do not abuse platform devices and drivers for things that are not actually platform devices and drivers.
+{
- return 0;
+}
+static int fake_remove(struct platform_device *pdev) +{
- return 0;
+}
+static struct platform_driver fake_platform_driver = {
- .probe = fake_probe,
- .remove = fake_remove,
- .driver = {
.name = KUNIT_DEVICE_NAME,
- },
+};
Why do you need this fake platform driver at all?
Why not just use a virtual device?
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Again, please do not abuse the platform device infrastructure for things it was never ment to do (i.e. create fake devices that are not really a platform device.)
thanks,
greg k-h
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
--- /dev/null +++ b/drivers/base/test/test_kunit_device.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/*
- These helpers have been extracted from drm test code at
- drm_kunit_helpers.c which was authored by
- Maxime Ripard maxime@cerno.tech
- */
+#include <linux/device.h> +#include <linux/platform_device.h>
+#include <kunit/platform_device.h>
+#define KUNIT_DEVICE_NAME "test-kunit-mock-device"
+static int fake_probe(struct platform_device *pdev)
Please do not abuse platform devices and drivers for things that are not actually platform devices and drivers.
+{
- return 0;
+}
+static int fake_remove(struct platform_device *pdev) +{
- return 0;
+}
+static struct platform_driver fake_platform_driver = {
- .probe = fake_probe,
- .remove = fake_remove,
- .driver = {
.name = KUNIT_DEVICE_NAME,
- },
+};
Why do you need this fake platform driver at all?
Why not just use a virtual device?
I can only answer on my behalf. In my case the answer to why I used platform_devices is practicality. I wanted to test devm_ APIs using KUnit tests and I was pointed to an existing implementation in DRM (seen in these patches). It didn't seem to make any sense to re-invent the wheel by writing another implementation for the existing in-tree functionality.
Maybe Maxime had a better reason to go with the platform devices.
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it. And having them in drivers/base/test seemed like a correct place to me. What I really don't know is if there are legitimate use-cases for using platform_devices in DRM tests. Perhaps Maxime can shed light on that.
Yours, -- Matti
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
--- /dev/null +++ b/drivers/base/test/test_kunit_device.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/*
- These helpers have been extracted from drm test code at
- drm_kunit_helpers.c which was authored by
- Maxime Ripard maxime@cerno.tech
- */
+#include <linux/device.h> +#include <linux/platform_device.h>
+#include <kunit/platform_device.h>
+#define KUNIT_DEVICE_NAME "test-kunit-mock-device"
+static int fake_probe(struct platform_device *pdev)
Please do not abuse platform devices and drivers for things that are not actually platform devices and drivers.
+{
- return 0;
+}
+static int fake_remove(struct platform_device *pdev) +{
- return 0;
+}
+static struct platform_driver fake_platform_driver = {
- .probe = fake_probe,
- .remove = fake_remove,
- .driver = {
.name = KUNIT_DEVICE_NAME,
- },
+};
Why do you need this fake platform driver at all?
Why not just use a virtual device?
I can only answer on my behalf. In my case the answer to why I used platform_devices is practicality. I wanted to test devm_ APIs using KUnit tests and I was pointed to an existing implementation in DRM (seen in these patches). It didn't seem to make any sense to re-invent the wheel by writing another implementation for the existing in-tree functionality.
That's fine, but please, let's do this right if it's going to be in the driver core, that way we can actually test the driver core code as well.
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing) and then you can test the devm_* code, _AND_ you can test the driver core at the same time.
And having them in drivers/base/test seemed like a correct place to me. What I really don't know is if there are legitimate use-cases for using platform_devices in DRM tests. Perhaps Maxime can shed light on that.
I agree that this could be in drivers/base/test/ but then let's test the driver core, not just provide a dummy platform device.
If you want to test the platform driver/device api, that would be great too, that can be plaform device/driver specific, but don't use one for some other random driver core functionality please.
thanks,
greg k-h
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing)
Hm. The auxiliary_devices require parent. What would be the best way to deal with that in KUnit tests?
then you can test the devm_* code, _AND_ you can test the driver core at the same time.
And having them in drivers/base/test seemed like a correct place to me. What I really don't know is if there are legitimate use-cases for using platform_devices in DRM tests. Perhaps Maxime can shed light on that.
I agree that this could be in drivers/base/test/ but then let's test the driver core, not just provide a dummy platform device.
If you want to test the platform driver/device api, that would be great too, that can be plaform device/driver specific, but don't use one for some other random driver core functionality please.
I am very conservative what comes to adding unit tests due to the huge inertia they add to any further development. I usually only add tests to APIs which I know won't require changing (I don't know such in-kernel APIs) - or to functions which I think are error-prone. So, I am probably one of the last persons adding UTs to code I don't know :)
Yours, -- Matti
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing)
Hm. The auxiliary_devices require parent. What would be the best way to deal with that in KUnit tests?
If you use NULL as the parent, it goes into the root.
then you can test the devm_* code, _AND_ you can test the driver core at the same time.
And having them in drivers/base/test seemed like a correct place to me. What I really don't know is if there are legitimate use-cases for using platform_devices in DRM tests. Perhaps Maxime can shed light on that.
I agree that this could be in drivers/base/test/ but then let's test the driver core, not just provide a dummy platform device.
If you want to test the platform driver/device api, that would be great too, that can be plaform device/driver specific, but don't use one for some other random driver core functionality please.
I am very conservative what comes to adding unit tests due to the huge inertia they add to any further development. I usually only add tests to APIs which I know won't require changing (I don't know such in-kernel APIs)
So anything that is changing doesn't get a test? If you only test things that don't change then no tests fail, and so, why have the test at all?
On the contrary, tests should be used to verify things that are changing all the time, to ensure that we don't break things. That's why we need them, not to just validate that old code still is going ok.
The driver core is changing, and so, I would love to see tests for it to ensure that I don't break anything over time. That should NOT slow down development but rather, speed it up as it ensures that things still work properly.
- or to functions which I think are error-prone. So, I am probably
one of the last persons adding UTs to code I don't know :)
That's fine, you don't have to add test code for stuff you don't know.
But again, do NOT abuse a platform device for this, that's not ok, and the in-kernel code that does do this should be fixed up.
thanks,
greg k-h
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
I am very conservative what comes to adding unit tests due to the huge inertia they add to any further development. I usually only add tests to APIs which I know won't require changing (I don't know such in-kernel APIs)
So anything that is changing doesn't get a test?
No. I think you misread me. I didn't say I don't like adding tests to code which changes. I said, I don't like adding tests to APIs which change.
If you only test
things that don't change then no tests fail, and so, why have the test at all?
Because implementation cascading into functions below an API may change even if the API stays unchanged.
On the contrary, tests should be used to verify things that are changing all the time, to ensure that we don't break things.
This is only true when your test code stays valid. Problem with excessive amount of tests is that more we have callers for an API, harder changing that API becomes. I've seen a point where people stop fixing "unimportant" things just because the amount of work fixing all impacted UT-cases would take. I know that many things went wrong before that project ended up to the point - but what I picked up with me is that carelessly added UTs do really hinder further development.
That's why we need
them, not to just validate that old code still is going ok.
The driver core is changing, and so, I would love to see tests for it to ensure that I don't break anything over time. That should NOT slow down development but rather, speed it up as it ensures that things still work properly.
I agree that there are cases where UTs are very handy and can add confidence that things work as intended. Still, my strong opinion is that people should consider what parts of code are really worth testing - and how to do the tests so that the amount of maintenance required by the tests stays low. It's definitely _not fun_ to do refactoring for minor improvement when 400+ unit-test cases break. It's a point when many developers start seeing fixing this minor culprit much less important... And when people stop fixing minor things ... major things start to be just around the corner.
- or to functions which I think are error-prone. So, I am probably
one of the last persons adding UTs to code I don't know :)
That's fine, you don't have to add test code for stuff you don't know.
But again, do NOT abuse a platform device for this, that's not ok, and the in-kernel code that does do this should be fixed up.
As suggested by David in another mail - I'll go with the root_device_[un]register(). I'll drop this patch entirely. Thanks for help, this was once again very educating :)
Yours, -- Matti
On Thu, Mar 23, 2023 at 11:20:33AM +0200, Matti Vaittinen wrote:
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
I am very conservative what comes to adding unit tests due to the huge inertia they add to any further development. I usually only add tests to APIs which I know won't require changing (I don't know such in-kernel APIs)
So anything that is changing doesn't get a test?
No. I think you misread me. I didn't say I don't like adding tests to code which changes. I said, I don't like adding tests to APIs which change.
Then you should not be writing any in-kernel tests as all of our APIs change all the time.
If you only test
things that don't change then no tests fail, and so, why have the test at all?
Because implementation cascading into functions below an API may change even if the API stays unchanged.
Then it needs to be fixed.
On the contrary, tests should be used to verify things that are changing all the time, to ensure that we don't break things.
This is only true when your test code stays valid. Problem with excessive amount of tests is that more we have callers for an API, harder changing that API becomes. I've seen a point where people stop fixing "unimportant" things just because the amount of work fixing all impacted UT-cases would take. I know that many things went wrong before that project ended up to the point - but what I picked up with me is that carelessly added UTs do really hinder further development.
Again, in-kernel apis change at any moment. I just changed one that was over 15 years old. Don't get stuck into thinking that you can only write tests for stuff that is "stable" as nothing in the kernel is "stable" and can change at any point in time. You fix up all the in-kernel users of the api, and the tests, and all is good. That's how kernel development works.
That's why we need
them, not to just validate that old code still is going ok.
The driver core is changing, and so, I would love to see tests for it to ensure that I don't break anything over time. That should NOT slow down development but rather, speed it up as it ensures that things still work properly.
I agree that there are cases where UTs are very handy and can add confidence that things work as intended. Still, my strong opinion is that people should consider what parts of code are really worth testing - and how to do the tests so that the amount of maintenance required by the tests stays low. It's definitely _not fun_ to do refactoring for minor improvement when 400+ unit-test cases break. It's a point when many developers start seeing fixing this minor culprit much less important... And when people stop fixing minor things ... major things start to be just around the corner.
If people stop fixing minor things then the kernel development process is dead. Based on all the changes that go into it right now, we are far from having that problem.
So write valid tests, if we get to the point where we have too much of a problem fixing up the tests than the real users of apis, then we can revisit it. But for now, that's not an issue.
And again, remember, and api can, and will, change at any moment in time, you can never know what will be "stable" as we do not have such a thing.
thanks,
greg k-h
This is a low priority babbling - feel free to skip if busy.
On 3/23/23 12:25, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:20:33AM +0200, Matti Vaittinen wrote:
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote: > On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
I am very conservative what comes to adding unit tests due to the huge inertia they add to any further development. I usually only add tests to APIs which I know won't require changing (I don't know such in-kernel APIs)
So anything that is changing doesn't get a test?
No. I think you misread me. I didn't say I don't like adding tests to code which changes. I said, I don't like adding tests to APIs which change.
Then you should not be writing any in-kernel tests as all of our APIs change all the time.
If you only test
things that don't change then no tests fail, and so, why have the test at all?
Because implementation cascading into functions below an API may change even if the API stays unchanged.
Then it needs to be fixed.
On the contrary, tests should be used to verify things that are changing all the time, to ensure that we don't break things.
This is only true when your test code stays valid. Problem with excessive amount of tests is that more we have callers for an API, harder changing that API becomes. I've seen a point where people stop fixing "unimportant" things just because the amount of work fixing all impacted UT-cases would take. I know that many things went wrong before that project ended up to the point - but what I picked up with me is that carelessly added UTs do really hinder further development.
Again, in-kernel apis change at any moment.
I agree. This is why I initially wrote:
APIs which I know won't require changing (I don't know such in-kernel APIs)
Don't get stuck into thinking that you can only write tests for stuff that is "stable" as nothing in the kernel is "stable" and can change at any point in time.
I don't. But I don't either think that UTs come with no cost. Thus I do only write tests when I see a _real need_ for one. If the APIs would be guaranteed not to change, then I would understand writing the tests for each and every "thing" without much of thinking if "the thing" is worth the test.
You fix up all the in-kernel users of the api, and the tests, and all is good. That's how kernel development works.
Sure. This is how it works and how I think it should work. But I also have seen how this 'UT work overhead' has made people to decide not to touch things. Not in kernel but in other project. This is a real thing which can happen - many engineers like me are lazy bastards :)
That's why we need
them, not to just validate that old code still is going ok.
The driver core is changing, and so, I would love to see tests for it to ensure that I don't break anything over time. That should NOT slow down development but rather, speed it up as it ensures that things still work properly.
I agree that there are cases where UTs are very handy and can add confidence that things work as intended. Still, my strong opinion is that people should consider what parts of code are really worth testing - and how to do the tests so that the amount of maintenance required by the tests stays low. It's definitely _not fun_ to do refactoring for minor improvement when 400+ unit-test cases break. It's a point when many developers start seeing fixing this minor culprit much less important... And when people stop fixing minor things ... major things start to be just around the corner.
If people stop fixing minor things then the kernel development process is dead. Based on all the changes that go into it right now, we are far from having that problem.
And I am so happy for that. Kernel/drivers are still fun to work with. My personal preference is to keep it that way :)
So write valid tests, if we get to the point where we have too much of a problem fixing up the tests than the real users of apis, then we can revisit it. But for now, that's not an issue.
The beginning of your sentence hits the point. Write valid tests. I just encourage people to occasionally ask if the test they write is really a valid one. :)
And again, remember, and api can, and will, change at any moment in time, you can never know what will be "stable" as we do not have such a thing.
We agree on this.
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing)
Hm. The auxiliary_devices require parent. What would be the best way to deal with that in KUnit tests?
If you use NULL as the parent, it goes into the root.
As far as I read this is not the case with auxiliary devices. Judging the docs they were intended to be representing some part of a (parent) device. I see the auxiliary_device_init() has explicit check for parent being populated:
int auxiliary_device_init(struct auxiliary_device *auxdev) { struct device *dev = &auxdev->dev;
if (!dev->parent) { pr_err("auxiliary_device has a NULL dev->parent\n"); return -EINVAL; }
As I wrote in another mail, I thought of using a root_device for this IIO test as was suggested by David. To tell the truth, implementing a kunit bus device is starting to feel a bit overwhelming... I started just adding a driver for a light sensor, ended up adding a helper for IIO gain-time-scale conversions and I am slightly reluctant to going the extra-extra mile of adding some UT infrastructure in the context of this driver work...
Well, let's see. Maybe I change my mind after a good night's sleep :)
Yours, -- Matti
On Thu, Mar 23, 2023 at 12:01:15PM +0200, Matti Vaittinen wrote:
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing)
Hm. The auxiliary_devices require parent. What would be the best way to deal with that in KUnit tests?
If you use NULL as the parent, it goes into the root.
As far as I read this is not the case with auxiliary devices. Judging the docs they were intended to be representing some part of a (parent) device. I see the auxiliary_device_init() has explicit check for parent being populated:
int auxiliary_device_init(struct auxiliary_device *auxdev) { struct device *dev = &auxdev->dev;
if (!dev->parent) { pr_err("auxiliary_device has a NULL dev->parent\n"); return -EINVAL; }
Yes as it wants to "split" a device up into smaller devices. So make a real device that it can hang off of.
As I wrote in another mail, I thought of using a root_device for this IIO test as was suggested by David. To tell the truth, implementing a kunit bus device is starting to feel a bit overwhelming... I started just adding a driver for a light sensor, ended up adding a helper for IIO gain-time-scale conversions and I am slightly reluctant to going the extra-extra mile of adding some UT infrastructure in the context of this driver work...
I think it is worth it as the driver core has no tests. So it obviously must be correct, right? :)
thanks,
greg k-h
On 3/23/23 12:27, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 12:01:15PM +0200, Matti Vaittinen wrote:
On 3/23/23 10:58, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 07:17:40AM +0000, Vaittinen, Matti wrote:
On 3/22/23 20:57, Greg Kroah-Hartman wrote:
On Wed, Mar 22, 2023 at 03:48:00PM +0200, Matti Vaittinen wrote:
Hi Greg,
Thanks for looking at this.
On 3/22/23 14:07, Greg Kroah-Hartman wrote: > On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
The biggest thing for me is that I don't like the idea of creating own 'test device' in <add subsystem here> while we already have some in DRM (or others). Thus, I do see value in adding generic helpers for supporting running KUnit tests on devm_* APIs. Hence it'd be good to have _some_ support for it.
I agree, let's use a virtual device and a virtual bus (you can use the auxbus code for this as that's all there for this type of thing)
Hm. The auxiliary_devices require parent. What would be the best way to deal with that in KUnit tests?
If you use NULL as the parent, it goes into the root.
As far as I read this is not the case with auxiliary devices. Judging the docs they were intended to be representing some part of a (parent) device. I see the auxiliary_device_init() has explicit check for parent being populated:
int auxiliary_device_init(struct auxiliary_device *auxdev) { struct device *dev = &auxdev->dev;
if (!dev->parent) { pr_err("auxiliary_device has a NULL dev->parent\n"); return -EINVAL; }
Yes as it wants to "split" a device up into smaller devices. So make a real device that it can hang off of.
Yep. This is what led me to the root_device_register()... :rolleyes: And seein the root-device alone could do what I need - adding auxiliary device on top of it just for the sake of adding one seems a bit of an over-engineering to me :)
As I wrote in another mail, I thought of using a root_device for this IIO test as was suggested by David. To tell the truth, implementing a kunit bus device is starting to feel a bit overwhelming... I started just adding a driver for a light sensor, ended up adding a helper for IIO gain-time-scale conversions and I am slightly reluctant to going the extra-extra mile of adding some UT infrastructure in the context of this driver work...
I think it is worth it as the driver core has no tests. So it obviously must be correct, right? :)
Doh. Greg, I hate you :) How could one argue with something like this? I think I will submit the v6 with the root_device_register() due to the aux-device requiring it in any case. I know that will end up to your table still as IIO is going through your hands anyways.
I will however take a look at what Maxime said about devm unwinding not being done w/o a bus because I think I saw the unwinding done in these IIO tests even when using the root_device_register() root_device_unregister(). If the unwinding really is not done, then I will come back to this auxiliary device rehearsal
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
Maxime
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
thanks,
greg k-h
Hi Maxime, all
On 3/23/23 12:21, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
This is strange. I just ran a test on a beaglebone black using Linux 6.3.0-rc2 + the IIO patches we se here (but the IIO test patch modified to use the root_device_register() and root_device_unregister().
I passed the device pointer from root_device_register() to the devm_iio_init_iio_gts()
// snip dev = root_device_register(IIO_GTS_TEST_DEV); KUNIT_EXPECT_NOT_ERR_OR_NULL(test, dev); if (IS_ERR_OR_NULL(dev)) return NULL;
ret = devm_iio_init_iio_gts(dev, TEST_SCALE_1X, 0, g_table, num_g, i_table, num_i, gts);
- and saw the tables for available scales allocated:
if (gts.num_avail_all_scales) pr_info("GTS: table allocation succeeded\n"); else pr_info("GTS: table allocation failed\n");
pr_info("gts: num_avail_all_scales %d\n", gts.num_avail_all_scales);
(this printed: [ 52.132966] # Subtest: iio-gain-time-scale [ 52.132982] 1..7 [ 52.157455] GTS: table allocation succeeded [ 52.164077] gts: num_avail_all_scales 16
Next I unregister the root-device and check if the unwinding code which frees the tables and zeroes the scale count was ran:
root_device_unregister(dev); pr_info("gts: num_avail_all_scales %d\n", gts.num_avail_all_scales);
if (gts.num_avail_all_scales) pr_info("devm unwinding not done\n"); else pr_info("devm unwinding succeeded\n");
Which printed: [ 52.168101] gts: num_avail_all_scales 0 [ 52.171957] devm unwinding succeeded
I can send patch(es) just for testing this on other machines if someone want's to see if the lack of devm unwinding is somehow architecture specific (which sounds very strange to me) - although using this IIO series just for checking the unwinding is a bit of an overkill. I just happened to have these tests at my hands / in my tree for testing.
In any case, devm unwinding using root_device_[un]register() seems to "work on my machine".
Naxime, what was the environment where you observed lack of unwinding? (Huh, I am so afraid of sending this post out - I've experienced too many "Oh, boy - how I didn't notice THAT" moments in the past and maybe I am again overlooking something...)
Yours, -- Matti
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
Hi Maxime, all
On 3/23/23 12:21, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
> +/** > + * test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test > + * @test: The test context object > + * > + * This allocates a fake struct &device to create a mock for a KUnit > + * test. The device will also be bound to a fake driver. It will thus be > + * able to leverage the usual infrastructure and most notably the > + * device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
This is strange. I just ran a test on a beaglebone black using Linux 6.3.0-rc2 + the IIO patches we se here (but the IIO test patch modified to use the root_device_register() and root_device_unregister().
I passed the device pointer from root_device_register() to the devm_iio_init_iio_gts()
// snip dev = root_device_register(IIO_GTS_TEST_DEV); KUNIT_EXPECT_NOT_ERR_OR_NULL(test, dev); if (IS_ERR_OR_NULL(dev)) return NULL;
ret = devm_iio_init_iio_gts(dev, TEST_SCALE_1X, 0, g_table, num_g, i_table, num_i, gts);
- and saw the tables for available scales allocated:
if (gts.num_avail_all_scales) pr_info("GTS: table allocation succeeded\n"); else pr_info("GTS: table allocation failed\n");
pr_info("gts: num_avail_all_scales %d\n", gts.num_avail_all_scales);
(this printed: [ 52.132966] # Subtest: iio-gain-time-scale [ 52.132982] 1..7 [ 52.157455] GTS: table allocation succeeded [ 52.164077] gts: num_avail_all_scales 16
Next I unregister the root-device and check if the unwinding code which frees the tables and zeroes the scale count was ran:
root_device_unregister(dev); pr_info("gts: num_avail_all_scales %d\n", gts.num_avail_all_scales); if (gts.num_avail_all_scales) pr_info("devm unwinding not done\n"); else pr_info("devm unwinding succeeded\n");
Which printed: [ 52.168101] gts: num_avail_all_scales 0 [ 52.171957] devm unwinding succeeded
I can send patch(es) just for testing this on other machines if someone want's to see if the lack of devm unwinding is somehow architecture specific (which sounds very strange to me) - although using this IIO series just for checking the unwinding is a bit of an overkill. I just happened to have these tests at my hands / in my tree for testing.
In any case, devm unwinding using root_device_[un]register() seems to "work on my machine".
Naxime, what was the environment where you observed lack of unwinding? (Huh, I am so afraid of sending this post out - I've experienced too many "Oh, boy
- how I didn't notice THAT" moments in the past and maybe I am again
overlooking something...)
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Maxime
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
If this is the case, then it sounds like a DRM specific issue to me. Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
Well, It's almost the best season for ice-fishing in Finland so opening a can of worms is not that bad, right? :)
Thanks for the education people! I did learn a thing or two Today.
Yours, -- Matti
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging. I'm not sure how that helps. Those are common helpers which should accommodate every framework, and your second patch breaks the kunit tests for DRM anyway.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Maxime
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
and your second patch breaks the kunit tests for DRM anyway.
Oh, I must have made an error there. It was supposed to be just a refactoring with no functional changes. Sorry about that. Anyways, that patch can be forgotten as Greg opposes using the platform devices in generic helpers.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Thanks for the patience and the explanation. Now I understand a generic test device needs to sit on a bus.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
Yours, -- Matti
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
Thanks from me as well -- this has certainly helped me understand some of the details of the driver model that had slipped past me.
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
I think there is _some_ value in helpers even without a bus, but it's much more limited: - It's less confusing if KUnit test devices are using kunit labelled structs and functions. - Helpers could use KUnit's resource management API to ensure any device created is properly unregistered and removed when the test exits (particularly if it exits early due to, e.g., an assertion).
I've played around implementing those with a proper struct kunit_device and the automatic cleanup on test failure, and thus far it -- like root_device_register -- works for all of the tests except the drm-test-managed one.
So if we really wanted to, we could use KUnit-specific helpers for just those tests which currently work with root_device_register(), but if we're going to try to implement a KUnit bus -- which I think is at least worth investigating -- I'd rather not either hold up otherwise good tests on helper development, or rush a helper out only to have to change it a lot when we see exactly what the bus implementation would look like.
and your second patch breaks the kunit tests for DRM anyway.
Oh, I must have made an error there. It was supposed to be just a refactoring with no functional changes. Sorry about that. Anyways, that patch can be forgotten as Greg opposes using the platform devices in generic helpers.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Thanks for the patience and the explanation. Now I understand a generic test device needs to sit on a bus.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient. Certainly, the helpers I'm playing with at the moment have a few other dependencies I'd want to land first, so I suspect they wouldn't be done in time for 6.4. It'd also make sense to see if we can make sure any such helpers could either work well with (or at least not conflict with) tests which use devicetree.
Regardless, thanks very much for putting all of the effort in to working this out. I think we have a much better idea of the requirements for this sort of thing now.
Cheers, -- David
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
I think there is _some_ value in helpers even without a bus, but it's much more limited:
- It's less confusing if KUnit test devices are using kunit labelled
structs and functions.
- Helpers could use KUnit's resource management API to ensure any
device created is properly unregistered and removed when the test exits (particularly if it exits early due to, e.g., an assertion).
Ah. That's true. Being able to abort the test on error w/o being forced to do a clean-up dance for the dummy device would be convenient.
I've played around implementing those with a proper struct kunit_device and the automatic cleanup on test failure, and thus far it -- like root_device_register -- works for all of the tests except the drm-test-managed one.
So if we really wanted to, we could use KUnit-specific helpers for just those tests which currently work with root_device_register(), but if we're going to try to implement a KUnit bus -- which I think is at least worth investigating -- I'd rather not either hold up otherwise good tests on helper development, or rush a helper out only to have to change it a lot when we see exactly what the bus implementation would look like.
It's easy for me to agree.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Yours, -- Matti
On Fri, 24 Mar 2023 at 14:51, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
I think there is _some_ value in helpers even without a bus, but it's much more limited:
- It's less confusing if KUnit test devices are using kunit labelled
structs and functions.
- Helpers could use KUnit's resource management API to ensure any
device created is properly unregistered and removed when the test exits (particularly if it exits early due to, e.g., an assertion).
Ah. That's true. Being able to abort the test on error w/o being forced to do a clean-up dance for the dummy device would be convenient.
I've played around implementing those with a proper struct kunit_device and the automatic cleanup on test failure, and thus far it -- like root_device_register -- works for all of the tests except the drm-test-managed one.
So if we really wanted to, we could use KUnit-specific helpers for just those tests which currently work with root_device_register(), but if we're going to try to implement a KUnit bus -- which I think is at least worth investigating -- I'd rather not either hold up otherwise good tests on helper development, or rush a helper out only to have to change it a lot when we see exactly what the bus implementation would look like.
It's easy for me to agree.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Great idea.
The code I've been playing with has the following in include/kunit/device.h:
/* Register a new device against a KUnit test. */ struct device *kunit_device_register(struct kunit *test, const char *name); /* Unregister a device created by kunit_device_register() early (i.e., before test cleanup). */ void kunit_device_unregister(struct kunit *test, struct device *dev);
If we used the same names, and just forwarded them to root_device_register() and root_device_unregister() for now (discarding the struct kunit pointer), then I expect we could just swap out the implementation to gain the extra functionality.
It's a little less explicit, though, so I could see the value in using macros with "root_device" in the name to make the current implementation clearer, and the eventual change more obvious.
Cheers, -- David
On 3/24/23 11:52, David Gow wrote:
On Fri, 24 Mar 2023 at 14:51, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Great idea.
The code I've been playing with has the following in include/kunit/device.h:
/* Register a new device against a KUnit test. */ struct device *kunit_device_register(struct kunit *test, const char *name); /* Unregister a device created by kunit_device_register() early (i.e., before test cleanup). */ void kunit_device_unregister(struct kunit *test, struct device *dev);
If we used the same names, and just forwarded them to root_device_register() and root_device_unregister() for now (discarding the struct kunit pointer), then I expect we could just swap out the implementation to gain the extra functionality.
It's a little less explicit, though, so I could see the value in using macros with "root_device" in the name to make the current implementation clearer, and the eventual change more obvious.
I think it makes sense to avoid the mass-conversions if the signatures for kunit_device_register() and kunit_device_unregister() are expected to be known by now. If there is no objections I'll add those wrappers + the include/kunit/device.h to v6 of this series.
I think I'll try to hold back sending the v6 until next Monday - unless I get really bored during the weekend ;)
Yours, -- Matti
On 3/24/23 12:05, Matti Vaittinen wrote:
On 3/24/23 11:52, David Gow wrote:
On Fri, 24 Mar 2023 at 14:51, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Great idea.
The code I've been playing with has the following in include/kunit/device.h:
/* Register a new device against a KUnit test. */ struct device *kunit_device_register(struct kunit *test, const char *name); /* Unregister a device created by kunit_device_register() early (i.e., before test cleanup). */ void kunit_device_unregister(struct kunit *test, struct device *dev);
If we used the same names, and just forwarded them to root_device_register() and root_device_unregister() for now (discarding the struct kunit pointer), then I expect we could just swap out the implementation to gain the extra functionality.
There's one thing though. If the goal is to do a direct replacement and if automatic device deletion upon test completion / test abort is planned - then it should be there also for these initial wrappers.
If these wrappers don't yet include the automatic device clean-up - then it probably makes more sense to just do the kunit_root_device_* defines because the tests are likely to need removing the explicit device clean-ups when proper APIs are finished.
Yours, -- Matti
On Fri, 24 Mar 2023 at 18:17, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 12:05, Matti Vaittinen wrote:
On 3/24/23 11:52, David Gow wrote:
On Fri, 24 Mar 2023 at 14:51, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Great idea.
The code I've been playing with has the following in include/kunit/device.h:
/* Register a new device against a KUnit test. */ struct device *kunit_device_register(struct kunit *test, const char *name); /* Unregister a device created by kunit_device_register() early (i.e., before test cleanup). */ void kunit_device_unregister(struct kunit *test, struct device *dev);
If we used the same names, and just forwarded them to root_device_register() and root_device_unregister() for now (discarding the struct kunit pointer), then I expect we could just swap out the implementation to gain the extra functionality.
There's one thing though. If the goal is to do a direct replacement and if automatic device deletion upon test completion / test abort is planned - then it should be there also for these initial wrappers.
Yeah, that's an excellent point. It's a pretty subtle change in behaviour to suddenly introduce that, so changing it behind the scenes is probably unwise.
If these wrappers don't yet include the automatic device clean-up - then it probably makes more sense to just do the kunit_root_device_* defines because the tests are likely to need removing the explicit device clean-ups when proper APIs are finished.
I sent out my prototype implementation of this here, which does do the automatic cleanup: https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@go...
It's probably overkill to squeeze into your patch series, though, given it also adds and uses a whole new kunit_defer() API.
Cheers, -- David
On 3/25/23 06:35, David Gow wrote:
On Fri, 24 Mar 2023 at 18:17, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 12:05, Matti Vaittinen wrote:
On 3/24/23 11:52, David Gow wrote:
On Fri, 24 Mar 2023 at 14:51, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/24/23 08:34, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
I think that sounds like a good strategy for now, and we can work on a set of 'generic helpers' which have an associated bus and struct kunit_device in the meantime. If we can continue to use root_device_register until those are ready, that'd be very convenient.
Would it be a tiny bit more acceptable if we did add a very simple:
#define kunit_root_device_register(name) root_device_register(name) #define kunit_root_device_unregister(dev) root_device_unregister(dev)
to include/kunit/device.h (or somesuch)
This should help us later to at least spot the places where root_device_[un]register() is abused and (potentially mass-)covert them to use the proper helpers when they're available.
Great idea.
The code I've been playing with has the following in include/kunit/device.h:
/* Register a new device against a KUnit test. */ struct device *kunit_device_register(struct kunit *test, const char *name); /* Unregister a device created by kunit_device_register() early (i.e., before test cleanup). */ void kunit_device_unregister(struct kunit *test, struct device *dev);
If we used the same names, and just forwarded them to root_device_register() and root_device_unregister() for now (discarding the struct kunit pointer), then I expect we could just swap out the implementation to gain the extra functionality.
There's one thing though. If the goal is to do a direct replacement and if automatic device deletion upon test completion / test abort is planned - then it should be there also for these initial wrappers.
Yeah, that's an excellent point. It's a pretty subtle change in behaviour to suddenly introduce that, so changing it behind the scenes is probably unwise.
If these wrappers don't yet include the automatic device clean-up - then it probably makes more sense to just do the kunit_root_device_* defines because the tests are likely to need removing the explicit device clean-ups when proper APIs are finished.
I sent out my prototype implementation of this here, which does do the automatic cleanup: https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@go...
It's probably overkill to squeeze into your patch series, though, given it also adds and uses a whole new kunit_defer() API.
Thanks for letting me know. I did also prepare this commit yesterday: https://github.com/M-Vaittinen/linux/commit/b784a90f8cc64ff83e802ec818e662fa...
It does use the existing kunit resources for clean-up. I am not sure if it is worth a shot or should I just drop it and use the root-device API for now. Any educated opinions on that? :)
Yours, -- Matti
On Fri, Mar 24, 2023 at 02:34:19PM +0800, David Gow wrote:
On Fri, 24 Mar 2023 at 14:11, Matti Vaittinen mazziesaccount@gmail.com wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
Thanks from me as well -- this has certainly helped me understand some of the details of the driver model that had slipped past me.
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
I think there is _some_ value in helpers even without a bus, but it's much more limited:
- It's less confusing if KUnit test devices are using kunit labelled
structs and functions.
- Helpers could use KUnit's resource management API to ensure any
device created is properly unregistered and removed when the test exits (particularly if it exits early due to, e.g., an assertion).
I've played around implementing those with a proper struct kunit_device and the automatic cleanup on test failure, and thus far it -- like root_device_register -- works for all of the tests except the drm-test-managed one.
Yeah, like I said you need a device that has been bound to a driver for it to work at the moment.
I guess for driver mocks we could move to a setup where we get kunit-specific drivers like what Stephen has been implementing for the clocks but I guess we would need to register the kunit tests in the driver probe which doesn't look like it's possible at the moment?
Maxime
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
and your second patch breaks the kunit tests for DRM anyway.
Oh, I must have made an error there. It was supposed to be just a refactoring with no functional changes. Sorry about that. Anyways, that patch can be forgotten as Greg opposes using the platform devices in generic helpers.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Thanks for the patience and the explanation. Now I understand a generic test device needs to sit on a bus.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
So, I went around and created a bunch of kunit tests that shows the problem without DRM being involved at all.
It does three things:
- It registers a device, attaches a devm action, unregisters the device and then checks that the action has ran.
- It registers a device, gets a reference to it, attaches a devm action, puts back the reference, unregisters the device and then checks that the action has ran.
- It registers a device, gets a reference to it, attaches a devm action that will put back the reference, unregisters the device and then checks that the action has ran.
And in three cases: first with a root_device, then platform_device, then a platform_device that has been bound to a driver.
Once you've applied that patch, you can run it using:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/ devm-inconsistencies
You'll see that only the last case passes all the tests, even though the code itself is exactly the same.
Maxime
-- >8 -- diff --git a/drivers/base/test/.kunitconfig b/drivers/base/test/.kunitconfig new file mode 100644 index 000000000000..473923f0998b --- /dev/null +++ b/drivers/base/test/.kunitconfig @@ -0,0 +1,2 @@ +CONFIG_KUNIT=y +CONFIG_DM_KUNIT_TEST=y diff --git a/drivers/base/test/Kconfig b/drivers/base/test/Kconfig index 610a1ba7a467..9d42051f8f8e 100644 --- a/drivers/base/test/Kconfig +++ b/drivers/base/test/Kconfig @@ -9,6 +9,10 @@ config TEST_ASYNC_DRIVER_PROBE
If unsure say N.
+config DM_KUNIT_TEST + tristate "KUnit Tests for the device model" if !KUNIT_ALL_TESTS + depends on KUNIT + config DRIVER_PE_KUNIT_TEST bool "KUnit Tests for property entry API" if !KUNIT_ALL_TESTS depends on KUNIT=y diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile index 7f76fee6f989..31bcaa743e94 100644 --- a/drivers/base/test/Makefile +++ b/drivers/base/test/Makefile @@ -1,5 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE) += test_async_driver_probe.o
+obj-$(CONFIG_DM_KUNIT_TEST) += test-devm-cleanup.o + obj-$(CONFIG_DRIVER_PE_KUNIT_TEST) += property-entry-test.o CFLAGS_property-entry-test.o += $(DISABLE_STRUCTLEAK_PLUGIN) diff --git a/drivers/base/test/test-devm-cleanup.c b/drivers/base/test/test-devm-cleanup.c new file mode 100644 index 000000000000..bad3cb1385b1 --- /dev/null +++ b/drivers/base/test/test-devm-cleanup.c @@ -0,0 +1,353 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <kunit/resource.h> + +#include <linux/device.h> +#include <linux/platform_device.h> + +#define DEVICE_NAME "test" + +struct test_priv { + bool probe_done; + bool release_done; + wait_queue_head_t release_wq; + struct device *dev; +}; + +static void devm_device_action(void *ptr) +{ + struct test_priv *priv = ptr; + + priv->release_done = true; + wake_up_interruptible(&priv->release_wq); +} + +static void devm_put_device_action(void *ptr) +{ + struct test_priv *priv = ptr; + + put_device(priv->dev); + priv->release_done = true; + wake_up_interruptible(&priv->release_wq); +} + +#define RELEASE_TIMEOUT_MS 500 + +static void root_device_register_unregister_test(struct kunit *test) +{ + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + priv->dev = root_device_register(DEVICE_NAME); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + root_device_unregister(priv->dev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static void root_device_register_get_put_unregister_test(struct kunit *test) +{ + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + priv->dev = root_device_register(DEVICE_NAME); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv->dev); + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + put_device(priv->dev); + + root_device_unregister(priv->dev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static void root_device_register_get_unregister_with_devm_test(struct kunit *test) +{ + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + priv->dev = root_device_register(DEVICE_NAME); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv->dev); + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_put_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + root_device_unregister(priv->dev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static void platform_device_register_unregister_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + priv->dev = &pdev->dev; + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static void platform_device_register_get_put_unregister_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + priv->dev = &pdev->dev; + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + put_device(priv->dev); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static void platform_device_register_get_unregister_with_devm_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + priv->dev = &pdev->dev; + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_put_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); +} + +static int fake_probe(struct platform_device *pdev) +{ + struct test_priv *priv = platform_get_drvdata(pdev); + + priv->probe_done = true; + wake_up_interruptible(&priv->release_wq); + + return 0; +} + +static struct platform_driver fake_driver = { + .probe = fake_probe, + .driver = { + .name = DEVICE_NAME, + }, +}; + +static void probed_platform_device_register_unregister_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + ret = platform_driver_register(&fake_driver); + KUNIT_ASSERT_EQ(test, ret, 0); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + priv->dev = &pdev->dev; + platform_set_drvdata(pdev, priv); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->probe_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_ASSERT_GT(test, ret, 0); + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); + + platform_driver_unregister(&fake_driver); +} + +static void probed_platform_device_register_get_put_unregister_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + ret = platform_driver_register(&fake_driver); + KUNIT_ASSERT_EQ(test, ret, 0); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + priv->dev = &pdev->dev; + platform_set_drvdata(pdev, priv); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->probe_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_ASSERT_GT(test, ret, 0); + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + put_device(priv->dev); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); + + platform_driver_unregister(&fake_driver); +} + +static void probed_platform_device_register_get_unregister_with_devm_test(struct kunit *test) +{ + struct platform_device *pdev; + struct test_priv *priv; + int ret; + + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); + init_waitqueue_head(&priv->release_wq); + + ret = platform_driver_register(&fake_driver); + KUNIT_ASSERT_EQ(test, ret, 0); + + pdev = platform_device_alloc(DEVICE_NAME, PLATFORM_DEVID_NONE); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev); + + priv->dev = &pdev->dev; + platform_set_drvdata(pdev, priv); + + ret = platform_device_add(pdev); + KUNIT_ASSERT_EQ(test, ret, 0); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->probe_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_ASSERT_GT(test, ret, 0); + + get_device(priv->dev); + + ret = devm_add_action_or_reset(priv->dev, devm_put_device_action, priv); + KUNIT_ASSERT_EQ(test, ret, 0); + + platform_device_unregister(pdev); + + ret = wait_event_interruptible_timeout(priv->release_wq, priv->release_done, + msecs_to_jiffies(RELEASE_TIMEOUT_MS)); + KUNIT_EXPECT_GT(test, ret, 0); + + platform_driver_unregister(&fake_driver); +} + +static struct kunit_case devm_inconsistencies_tests[] = { + KUNIT_CASE(root_device_register_unregister_test), + KUNIT_CASE(root_device_register_get_put_unregister_test), + KUNIT_CASE(root_device_register_get_unregister_with_devm_test), + KUNIT_CASE(platform_device_register_unregister_test), + KUNIT_CASE(platform_device_register_get_put_unregister_test), + KUNIT_CASE(platform_device_register_get_unregister_with_devm_test), + KUNIT_CASE(probed_platform_device_register_unregister_test), + KUNIT_CASE(probed_platform_device_register_get_put_unregister_test), + KUNIT_CASE(probed_platform_device_register_get_unregister_with_devm_test), + {} +}; + +static struct kunit_suite devm_inconsistencies_test_suite = { + .name = "devm-inconsistencies", + .test_cases = devm_inconsistencies_tests, +}; + +kunit_test_suite(devm_inconsistencies_test_suite); -- >8 --
On Fri, 24 Mar 2023 at 20:32, Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
and your second patch breaks the kunit tests for DRM anyway.
Oh, I must have made an error there. It was supposed to be just a refactoring with no functional changes. Sorry about that. Anyways, that patch can be forgotten as Greg opposes using the platform devices in generic helpers.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Thanks for the patience and the explanation. Now I understand a generic test device needs to sit on a bus.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
So, I went around and created a bunch of kunit tests that shows the problem without DRM being involved at all.
It does three things:
It registers a device, attaches a devm action, unregisters the device and then checks that the action has ran.
It registers a device, gets a reference to it, attaches a devm action, puts back the reference, unregisters the device and then checks that the action has ran.
It registers a device, gets a reference to it, attaches a devm action that will put back the reference, unregisters the device and then checks that the action has ran.
And in three cases: first with a root_device, then platform_device, then a platform_device that has been bound to a driver.
Once you've applied that patch, you can run it using:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/ devm-inconsistencies
You'll see that only the last case passes all the tests, even though the code itself is exactly the same.
This illustrates the problem very nicely, thanks.
I played around a bit with this, and I'm definitely leaning towards this being a bug, rather than intentional behaviour. There's actually an explicit call to devres_release_all() in device_release() which seems to suggest that this should work: /* * Some platform devices are driven without driver attached * and managed resources may have been acquired. Make sure * all resources are released. * * Drivers still can add resources into device after device * is deleted but alive, so release devres here to avoid * possible memory leak. */
My "solution" is just to call devres_release_all() in device_del() as well, which fixes your tests (and the drm-test-managed one when ported to use root_device_register() or my kunit_device_register() API[1]).
--8<-- diff --git a/drivers/base/core.c b/drivers/base/core.c index 6878dfcbf0d6..adfec6185014 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -3778,6 +3778,17 @@ void device_del(struct device *dev) device_platform_notify_remove(dev); device_links_purge(dev);
+ /* + * If a device does not have a driver attached, we need to clean up any + * managed resources. We do this in device_release(), but it's never + * called (and we leak the device) if a managed resource holds a + * reference to the device. So release all managed resources here, + * like we do in driver_detach(). We still need to do so again in + * device_release() in case someone adds a new resource after this + * point, though. + */ + devres_release_all(dev); + bus_notify(dev, BUS_NOTIFY_REMOVED_DEVICE); kobject_uevent(&dev->kobj, KOBJ_REMOVE); glue_dir = get_glue_dir(dev);
-->8--
It doesn't _seem_ to break anything else, and I've managed to convince myself that it's probably the correct fix. (Albeit, as someone with a limited knowledge of this part of the code, who also hasn't had quite enough sleep, so take that with some salt.)
Still, I'd agree with Greg that it'd be great to have versions of your tests upstream before making any such radical changes.
Cheers, -- David
Hi David,
On Sat, Mar 25, 2023 at 01:40:01PM +0800, David Gow wrote:
On Fri, 24 Mar 2023 at 20:32, Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
I'm not sure how that helps. Those are common helpers which should accommodate every framework,
Ok. Fair enough. Besides, if the root-device was sufficient - then I would actually not see the need for a helper. People could in that case directly use the root_device_register(). So, if helpers are provided they should be backed up by a device with a bus then.
and your second patch breaks the kunit tests for DRM anyway.
Oh, I must have made an error there. It was supposed to be just a refactoring with no functional changes. Sorry about that. Anyways, that patch can be forgotten as Greg opposes using the platform devices in generic helpers.
Whether it is a feature or bug is beyond my knowledge. Still, I would not say using the root_device_[un]register() in generic code is not feasible - unless all other subsytems have similar refcount handling.
Sure thing using root_device_register() root_device_unregister() in DRM does not work as such. This, however, does not mean the generic kunit helpers should use platform_devices to force unwinding?
platform_devices were a quick way to get a device that would have a bus and a driver bound to fall into the right patch above. We probably shouldn't use platform_devices and a kunit_device sounds like the best idea, but the test linked in the original mail I pointed you to should work with whatever we come up with. It works with multiple (platform, PCI, USB, etc) buses, so the mock we create should behave like their real world equivalents.
Thanks for the patience and the explanation. Now I understand a generic test device needs to sit on a bus.
As I said, in my very specific IIO related test the test device does not need a bus. Hence I'll drop the 'generic helpers' from this series.
So, I went around and created a bunch of kunit tests that shows the problem without DRM being involved at all.
It does three things:
It registers a device, attaches a devm action, unregisters the device and then checks that the action has ran.
It registers a device, gets a reference to it, attaches a devm action, puts back the reference, unregisters the device and then checks that the action has ran.
It registers a device, gets a reference to it, attaches a devm action that will put back the reference, unregisters the device and then checks that the action has ran.
And in three cases: first with a root_device, then platform_device, then a platform_device that has been bound to a driver.
Once you've applied that patch, you can run it using:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/ devm-inconsistencies
You'll see that only the last case passes all the tests, even though the code itself is exactly the same.
This illustrates the problem very nicely, thanks.
I played around a bit with this, and I'm definitely leaning towards this being a bug, rather than intentional behaviour. There's actually an explicit call to devres_release_all() in device_release() which seems to suggest that this should work: /*
- Some platform devices are driven without driver attached
- and managed resources may have been acquired. Make sure
- all resources are released.
- Drivers still can add resources into device after device
- is deleted but alive, so release devres here to avoid
- possible memory leak.
*/
My "solution" is just to call devres_release_all() in device_del() as well, which fixes your tests (and the drm-test-managed one when ported to use root_device_register() or my kunit_device_register() API[1]).
--8<-- diff --git a/drivers/base/core.c b/drivers/base/core.c index 6878dfcbf0d6..adfec6185014 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -3778,6 +3778,17 @@ void device_del(struct device *dev) device_platform_notify_remove(dev); device_links_purge(dev);
/*
* If a device does not have a driver attached, we need to clean up any
* managed resources. We do this in device_release(), but it's never
* called (and we leak the device) if a managed resource holds a
* reference to the device. So release all managed resources here,
* like we do in driver_detach(). We still need to do so again in
* device_release() in case someone adds a new resource after this
* point, though.
*/
devres_release_all(dev);
bus_notify(dev, BUS_NOTIFY_REMOVED_DEVICE); kobject_uevent(&dev->kobj, KOBJ_REMOVE); glue_dir = get_glue_dir(dev);
-->8--
It doesn't _seem_ to break anything else, and I've managed to convince myself that it's probably the correct fix.
Yeah, as an outsider, I came to the same conclusion last time...
(Albeit, as someone with a limited knowledge of this part of the code, who also hasn't had quite enough sleep, so take that with some salt.)
... but as someone that also have a limited knowledge of that part of the code, I certainly wasn't sure it was the proper thing to do :)
Still, I'd agree with Greg that it'd be great to have versions of your tests upstream before making any such radical changes.
I just submitted them.
Thanks! Maxime
On Fri, 24 Mar 2023 13:31:57 +0100 Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
Lars-Peter Clausen (IIRC) fixed up the IIO handling of the similar cases a long time ago now. It's simpler that for some other subsystems as we don't have as many interdependencies as occur in DRM etc.
I 'think' we are fine in general with the IIO approach to this (I think we did have one report of a theoretical race condition in the remove path that was never fully addressed).
For IIO we also have fds that can be open but all accesses to them are proxied through the IIO core and one of the things iio_device_unregister() or the devm equivalent does is to set indio_dev->info = NULL (+ wake up anyone waiting on data etc). Alongside removing the callbacks, that is also used as a flag to indicate the device has gone.
Note that we keep a reference to the struct indio_dev->dev (rather that the underlying device) so that is not freed until the last fd is closed. Thus, although devm unwinding has occurred that doesn't mean all the data that was allocated with devm_xx calls is cleared up immediately.
On 3/25/23 10:50, Jonathan Cameron wrote:
On Fri, 24 Mar 2023 13:31:57 +0100 Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
Lars-Peter Clausen (IIRC) fixed up the IIO handling of the similar cases a long time ago now. It's simpler that for some other subsystems as we don't have as many interdependencies as occur in DRM etc.
I 'think' we are fine in general with the IIO approach to this (I think we did have one report of a theoretical race condition in the remove path that was never fully addressed).
For IIO we also have fds that can be open but all accesses to them are proxied through the IIO core and one of the things iio_device_unregister() or the devm equivalent does is to set indio_dev->info = NULL (+ wake up anyone waiting on data etc). Alongside removing the callbacks, that is also used as a flag to indicate the device has gone.
Note that we keep a reference to the struct indio_dev->dev (rather that the underlying device) so that is not freed until the last fd is closed. Thus, although devm unwinding has occurred that doesn't mean all the data that was allocated with devm_xx calls is cleared up immediately.
IIO is fully hot-plug and hot-unplug capable. And it will have the same issue. When using managed device registration that establishes a parent child relationship between the devices and in combination with a device where the managed unwinding does not happen on unbind, but rather on in the release callback you create a cyclic reference dependency. The child device holds a reference to the parent, but the reference is only released in the parents release callback. And since that release callback is not called until the last reference is dropped you end up with a resource leak.
There are even some other places where IIO drivers run into this. E.g. any driver that does `devm_iio_trigger_register(&indio_dev->dev, ...)` creates a resource leak on the trigger and the IIO device. The indio_dev is not a bus device, hence no unbind and the trigger holds a reference so the release callback will never be called either.
On Sun, 26 Mar 2023 10:16:54 -0700 Lars-Peter Clausen lars@metafoo.de wrote:
On 3/25/23 10:50, Jonathan Cameron wrote:
On Fri, 24 Mar 2023 13:31:57 +0100 Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote: > On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote: > > This is the description of what was happening: > https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/ Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
Lars-Peter Clausen (IIRC) fixed up the IIO handling of the similar cases a long time ago now. It's simpler that for some other subsystems as we don't have as many interdependencies as occur in DRM etc.
I 'think' we are fine in general with the IIO approach to this (I think we did have one report of a theoretical race condition in the remove path that was never fully addressed).
For IIO we also have fds that can be open but all accesses to them are proxied through the IIO core and one of the things iio_device_unregister() or the devm equivalent does is to set indio_dev->info = NULL (+ wake up anyone waiting on data etc). Alongside removing the callbacks, that is also used as a flag to indicate the device has gone.
Note that we keep a reference to the struct indio_dev->dev (rather that the underlying device) so that is not freed until the last fd is closed. Thus, although devm unwinding has occurred that doesn't mean all the data that was allocated with devm_xx calls is cleared up immediately.
IIO is fully hot-plug and hot-unplug capable. And it will have the same issue. When using managed device registration that establishes a parent child relationship between the devices and in combination with a device where the managed unwinding does not happen on unbind, but rather on in the release callback you create a cyclic reference dependency. The child device holds a reference to the parent, but the reference is only released in the parents release callback. And since that release callback is not called until the last reference is dropped you end up with a resource leak.
There are even some other places where IIO drivers run into this. E.g. any driver that does `devm_iio_trigger_register(&indio_dev->dev, ...)`
A driver should should not do that. Should be devm_iio_trigger_registers(parent device, ...)
There is a corner where you need to detach the trigger from userspace before release which is odd if it was attached by default. There are some left over corners there that I think still cause trouble.
creates a resource leak on the trigger and the IIO device. The indio_dev is not a bus device, hence no unbind and the trigger holds a reference so the release callback will never be called either.
On Sat, Mar 25, 2023 at 05:50:44PM +0000, Jonathan Cameron wrote:
On Fri, 24 Mar 2023 13:31:57 +0100 Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote:
This is the description of what was happening: https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
Lars-Peter Clausen (IIRC) fixed up the IIO handling of the similar cases a long time ago now. It's simpler that for some other subsystems as we don't have as many interdependencies as occur in DRM etc.
I 'think' we are fine in general with the IIO approach to this (I think we did have one report of a theoretical race condition in the remove path that was never fully addressed).
For IIO we also have fds that can be open but all accesses to them are proxied through the IIO core and one of the things iio_device_unregister() or the devm equivalent does is to set indio_dev->info = NULL (+ wake up anyone waiting on data etc). Alongside removing the callbacks, that is also used as a flag to indicate the device has gone.
Sorry if it came as trying to put IIO under a bad light, it certainly wasn't my intention. I was trying to come up with possible explanations as to why IIO's design was simpler than DRM is :)
Note that we keep a reference to the struct indio_dev->dev (rather that the underlying device) so that is not freed until the last fd is closed. Thus, although devm unwinding has occurred that doesn't mean all the data that was allocated with devm_xx calls is cleared up immediately.
I'm not sure I get that part though. devm unwinding can happen even if the refcount is > 1
Maxime
On Wed, 29 Mar 2023 21:46:09 +0200 Maxime Ripard maxime@cerno.tech wrote:
On Sat, Mar 25, 2023 at 05:50:44PM +0000, Jonathan Cameron wrote:
On Fri, 24 Mar 2023 13:31:57 +0100 Maxime Ripard maxime@cerno.tech wrote:
On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote:
On 3/23/23 18:36, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote:
On 3/23/23 14:29, Maxime Ripard wrote: > On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote: > > This is the description of what was happening: > https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiytl@houat/
Thanks Maxime. Do I read this correcty. The devm_ unwinding not being done when root_device_register() is used is not because root_device_unregister() would not trigger the unwinding - but rather because DRM code on top of this device keeps the refcount increased?
There's a difference of behaviour between a root_device and any device with a bus: the root_device will only release the devm resources when it's freed (in device_release), but a bus device will also do it in device_del (through bus_remove_device() -> device_release_driver() -> device_release_driver_internal() -> __device_release_driver() -> device_unbind_cleanup(), which are skipped (in multiple places) if there's no bus and no driver attached to the device).
It does affect DRM, but I'm pretty sure it will affect any framework that deals with device hotplugging by deferring the framework structure until the last (userspace) user closes its file descriptor. So I'd assume that v4l2 and cec at least are also affected, and most likely others.
Thanks for the explanation and patience :)
If this is the case, then it sounds like a DRM specific issue to me.
I mean, I guess. One could also argue that it's because IIO doesn't properly deal with hotplugging.
I must say I haven't been testing the IIO registration API. I've only tested the helper API which is not backed up by any "IIO device". (This is fine for the helper because it must by design be cleaned-up only after the IIO-deregistration).
After your explanation here, I am not convinced IIO wouldn't see the same issue if I was testing the devm_iio_device_alloc() & co.
It depends really. The issue DRM is trying to solve is that, when a device is gone, some application might still have an open FD and could still poke into the kernel, while all the resources would have been free'd if it was using devm.
So everything is kept around until the last fd is closed, so you still have a reference to the device (even though it's been removed from its bus) until that time.
It could be possible that IIO just doesn't handle that case at all. I guess most of the devices aren't hotpluggable, and there's not much to interact with from a userspace PoV iirc, so it might be why.
Lars-Peter Clausen (IIRC) fixed up the IIO handling of the similar cases a long time ago now. It's simpler that for some other subsystems as we don't have as many interdependencies as occur in DRM etc.
I 'think' we are fine in general with the IIO approach to this (I think we did have one report of a theoretical race condition in the remove path that was never fully addressed).
For IIO we also have fds that can be open but all accesses to them are proxied through the IIO core and one of the things iio_device_unregister() or the devm equivalent does is to set indio_dev->info = NULL (+ wake up anyone waiting on data etc). Alongside removing the callbacks, that is also used as a flag to indicate the device has gone.
Sorry if it came as trying to put IIO under a bad light, it certainly wasn't my intention. I was trying to come up with possible explanations as to why IIO's design was simpler than DRM is :)
No problem :) I'm sure there are gremlins hiding there. Part of the problem is that nothing prevents drivers doing 'wrong things' other than us noticing when it happens.
Note that we keep a reference to the struct indio_dev->dev (rather that the underlying device) so that is not freed until the last fd is closed. Thus, although devm unwinding has occurred that doesn't mean all the data that was allocated with devm_xx calls is cleared up immediately.
I'm not sure I get that part though. devm unwinding can happen even if the refcount is > 1
No IIO driver should be using devm on the indio_dev->dev, they should be doing it on the parent device. When the devm_iio_device_free() gets called, that doesn't actually free the device. Just decrements a reference count (earlier on we already ensured that it is just acting as a stub that provides no access to the underlying device for open FDs.).
There are probably more problems hiding though!
Jonathan
Maxime
On Thu, Mar 23, 2023 at 11:21:58AM +0100, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
Please have a look at: https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@houa...
I couldn't get an answer on whether it was considered a bug or not last time, but as you can see there's a clear difference between a root device and a platform device that has probed when it comes to resource cleanup.
Maxime
On Fri, Mar 24, 2023 at 01:36:32PM +0100, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 11:21:58AM +0100, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
> +/** > + * test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test > + * @test: The test context object > + * > + * This allocates a fake struct &device to create a mock for a KUnit > + * test. The device will also be bound to a fake driver. It will thus be > + * able to leverage the usual infrastructure and most notably the > + * device-managed resources just like a "real" device.
What specific "usual infrastructure" are you wanting to access here?
And again, if you want a fake device, make a virtual one, by just calling device_create().
Or are you wanting to do "more" with that device pointer than device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
Please have a look at: https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@houa...
I couldn't get an answer on whether it was considered a bug or not last time, but as you can see there's a clear difference between a root device and a platform device that has probed when it comes to resource cleanup.
Great, testing shows there are bugs! :)
That's a great start of a test, how about submitting that in a way that I can test it and we can go from there?
thanks,
greg k-h
On Fri, Mar 24, 2023 at 01:43:07PM +0100, Greg Kroah-Hartman wrote:
On Fri, Mar 24, 2023 at 01:36:32PM +0100, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 11:21:58AM +0100, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
> > +/** > > + * test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test > > + * @test: The test context object > > + * > > + * This allocates a fake struct &device to create a mock for a KUnit > > + * test. The device will also be bound to a fake driver. It will thus be > > + * able to leverage the usual infrastructure and most notably the > > + * device-managed resources just like a "real" device. > > What specific "usual infrastructure" are you wanting to access here? > > And again, if you want a fake device, make a virtual one, by just > calling device_create(). > > Or are you wanting to do "more" with that device pointer than > device_create() can give you?
Personally, I was (am) only interested in devm_ unwinding. I guess the device_create(), device_add(), device_remove()... (didn't study this sequence in details so sorry if there is errors) could've been sufficient for me. I haven't looked how much of the code that there is for 'platform devices' should be duplicated to support that sequence for testability purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
Please have a look at: https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@houa...
I couldn't get an answer on whether it was considered a bug or not last time, but as you can see there's a clear difference between a root device and a platform device that has probed when it comes to resource cleanup.
Great, testing shows there are bugs! :)
I mean, it wasn't clear to me that it was indeed a bug or the intent behind devm was that it would only work when probed. Both seemed reasonable.
That's a great start of a test, how about submitting that in a way that I can test it and we can go from there?
Ack.
I guess I'd need to arrange them somewhat differently for it to be useful and merge-able.
How would you prefer them to be submitted, in two different files testing both the root devices and platform devices?
Maxime
On Fri, Mar 24, 2023 at 02:02:06PM +0100, Maxime Ripard wrote:
On Fri, Mar 24, 2023 at 01:43:07PM +0100, Greg Kroah-Hartman wrote:
On Fri, Mar 24, 2023 at 01:36:32PM +0100, Maxime Ripard wrote:
On Thu, Mar 23, 2023 at 11:21:58AM +0100, Greg Kroah-Hartman wrote:
On Thu, Mar 23, 2023 at 11:12:16AM +0100, Maxime Ripard wrote:
On Wed, Mar 22, 2023 at 07:57:10PM +0100, Greg Kroah-Hartman wrote:
> > > +/** > > > + * test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test > > > + * @test: The test context object > > > + * > > > + * This allocates a fake struct &device to create a mock for a KUnit > > > + * test. The device will also be bound to a fake driver. It will thus be > > > + * able to leverage the usual infrastructure and most notably the > > > + * device-managed resources just like a "real" device. > > > > What specific "usual infrastructure" are you wanting to access here? > > > > And again, if you want a fake device, make a virtual one, by just > > calling device_create(). > > > > Or are you wanting to do "more" with that device pointer than > > device_create() can give you? > > Personally, I was (am) only interested in devm_ unwinding. I guess the > device_create(), device_add(), device_remove()... (didn't study this > sequence in details so sorry if there is errors) could've been sufficient > for me. I haven't looked how much of the code that there is for 'platform > devices' should be duplicated to support that sequence for testability > purposes.
Any device can access devm_ code, there's no need for it to be a platform device at all.
Sure but the resources are only released if the device is part of a bus, so it can't be a root_device (or bare device) either
The resources are not cleaned up when the device is freed no matter if it's on a bus or not? If so, then that's a bug that needs to be fixed, and tested :)
Please have a look at: https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@houa...
I couldn't get an answer on whether it was considered a bug or not last time, but as you can see there's a clear difference between a root device and a platform device that has probed when it comes to resource cleanup.
Great, testing shows there are bugs! :)
I mean, it wasn't clear to me that it was indeed a bug or the intent behind devm was that it would only work when probed. Both seemed reasonable.
That's a great start of a test, how about submitting that in a way that I can test it and we can go from there?
Ack.
I guess I'd need to arrange them somewhat differently for it to be useful and merge-able.
How would you prefer them to be submitted, in two different files testing both the root devices and platform devices?
root devices are rare, but yes, one for each would be good, thanks!
greg k-h
On Wed, Mar 22, 2023 at 11:05:55AM +0200, Matti Vaittinen wrote:
The creation of a dummy device in order to test managed interfaces is a generally useful test feature. The drm test helpers drm_kunit_helper_alloc_device() and drm_kunit_helper_free_device() are doing exactly this. It makes no sense that each and every component which intends to be testing managed interfaces will create similar helpers so stole these for more generic use.
Signed-off-by: Matti Vaittinen mazziesaccount@gmail.com
Changes: v4 => v5:
- Add accidentally dropped header and email recipients
- do not rename + move helpers from DRM but add temporary dublicates to simplify merging. (This patch does not touch DRM and can be merged separately. DRM patch and IIO test patch still depend on this one).
Please note that there's something similar ongoing in the CCF: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/
I do like the simplicity of these DRM-originated helpers so I think they're worth. I do equally like the Stephen's idea of having the "dummy platform device" related helpers under drivers/base and the header being in include/kunit/platform_device.h which is similar to real platform device under include/linux/platform_device.h
drivers/base/test/Kconfig | 5 ++ drivers/base/test/Makefile | 2 + drivers/base/test/test_kunit_device.c | 83 +++++++++++++++++++++++++++
By putting files in this directory, you are asking me to maintain this code, and right now, I can't agree with that, sorry. See my review comments on it for why.
thanks,
greg k-h
On Wed, 22 Mar 2023 at 17:06, Matti Vaittinen mazziesaccount@gmail.com wrote:
The creation of a dummy device in order to test managed interfaces is a generally useful test feature. The drm test helpers drm_kunit_helper_alloc_device() and drm_kunit_helper_free_device() are doing exactly this. It makes no sense that each and every component which intends to be testing managed interfaces will create similar helpers so stole these for more generic use.
Signed-off-by: Matti Vaittinen mazziesaccount@gmail.com
Changes: v4 => v5:
- Add accidentally dropped header and email recipients
- do not rename + move helpers from DRM but add temporary dublicates to simplify merging. (This patch does not touch DRM and can be merged separately. DRM patch and IIO test patch still depend on this one).
Please note that there's something similar ongoing in the CCF: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/
I do like the simplicity of these DRM-originated helpers so I think they're worth. I do equally like the Stephen's idea of having the "dummy platform device" related helpers under drivers/base and the header being in include/kunit/platform_device.h which is similar to real platform device under include/linux/platform_device.h
Thanks for sending this my way.
It's clear we need some way of creating "fake" devices for KUnit tests. Given that there are now (at least) three different drivers looking to use this, we'll ultimately need something which works for everyone.
I think the questions we therefore need to answer are: - Do we specifically need a platform_device (or, any other specific device struct), or would any old struct device work? I can see why we would need a platform device for cases where we're testing things like device-tree properties (or, in the future, having e.g. USB-specific helpers which create usb_device). Most tests just use root_device_register() thus far, though. - What helpers do we need to make creating, using, and cleaning up these devices as simple as possible.
I think that in this particular case, we don't actually need a struct platform_device. Replacing these helpers with simple calls to root_device_register() and root_device_unregister() seems to work fine with this patch series for me. (It does break the drm_test_managed_run_action test, though.) So I don't think having these helpers actually help this series at the moment.
That being said, if they used the KUnit resource system to automatically clean up the device when the test is finished (or otherwise exits), that would seem like a real advantage. The clk and drm examples both do this, and I'm hoping to add an API to make it even simpler going forward. (With the work-in-progress API described here[1], the whole thing should boil down to "struct device *dev = root_device_register(name); kunit_defer(root_device_unregister, dev);".)
So, I guess we have three cases we need to look at: - A test just needs any old struct device. Tests thus far have hardcoded, or had their own wrappers around root_device_register() for this. - A test needs a device attached to a bus (but doesn't care which bus). Thus far, people have used struct platform_device for this (see the DRM helpers, which use a platform device for managed resource tests[2]). Maybe the right solution here is something like a struct kunit_device? - A test needs a device attached to a specific bus. We'll probably need some more detailed faking of that bus. This covers cases like having fake USB devices, devicetree integration, etc.
I'd suggest that, for the majority of cases which only care about the first case, let's just use root_device_register() directly, or have a thin wrapper like the old root_device-based version of the DRM helpers[3]. This will probable serve us well enough while we work out how to handle the other two cases properly (which is already being looked at for the CLK/DeviceTree patches and the DRM stuff).
If the resulting helpers are generally useful enough, they can probably sit in either drivers/base or lib/kunit. I'd rather not have code that's really specific to certain busses sitting in lib/kunit rather than alongside the device/bus code in drivers/base or some other subsystem/driver path, but I can tolerate it for the very generic struct device.
Regardless, I've left a few notes on the patch itself below.
Cheers, -- David
[1]: https://kunit-review.googlesource.com/c/linux/+/5434/3/include/kunit/resourc... [2]: https://lore.kernel.org/all/20221123-rpi-kunit-tests-v3-8-4615a663a84a@cerno... [3]: https://elixir.bootlin.com/linux/v6.2/source/drivers/gpu/drm/tests/drm_kunit...
drivers/base/test/Kconfig | 5 ++ drivers/base/test/Makefile | 2 + drivers/base/test/test_kunit_device.c | 83 +++++++++++++++++++++++++++ include/kunit/platform_device.h | 13 +++++ 4 files changed, 103 insertions(+) create mode 100644 drivers/base/test/test_kunit_device.c create mode 100644 include/kunit/platform_device.h
diff --git a/drivers/base/test/Kconfig b/drivers/base/test/Kconfig index 610a1ba7a467..642b5b183c10 100644 --- a/drivers/base/test/Kconfig +++ b/drivers/base/test/Kconfig @@ -1,4 +1,9 @@ # SPDX-License-Identifier: GPL-2.0
+config TEST_KUNIT_DEVICE_HELPERS
depends on KUNIT
tristate
config TEST_ASYNC_DRIVER_PROBE tristate "Build kernel module to test asynchronous driver probing" depends on m diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile index 7f76fee6f989..49926524ec6f 100644 --- a/drivers/base/test/Makefile +++ b/drivers/base/test/Makefile @@ -1,5 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE) += test_async_driver_probe.o
+obj-$(CONFIG_TEST_KUNIT_DEVICE_HELPERS) += test_kunit_device.o
obj-$(CONFIG_DRIVER_PE_KUNIT_TEST) += property-entry-test.o CFLAGS_property-entry-test.o += $(DISABLE_STRUCTLEAK_PLUGIN) diff --git a/drivers/base/test/test_kunit_device.c b/drivers/base/test/test_kunit_device.c new file mode 100644 index 000000000000..75790e15b85c --- /dev/null +++ b/drivers/base/test/test_kunit_device.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/*
- These helpers have been extracted from drm test code at
- drm_kunit_helpers.c which was authored by
- Maxime Ripard maxime@cerno.tech
- */
+#include <linux/device.h> +#include <linux/platform_device.h>
+#include <kunit/platform_device.h>
+#define KUNIT_DEVICE_NAME "test-kunit-mock-device"
Personally, I'd really rather this be a name passed in by the test. What if a test needs to create multiple distinct devices?
+static int fake_probe(struct platform_device *pdev) +{
return 0;
+}
+static int fake_remove(struct platform_device *pdev) +{
return 0;
+}
+static struct platform_driver fake_platform_driver = {
.probe = fake_probe,
.remove = fake_remove,
.driver = {
.name = KUNIT_DEVICE_NAME,
},
+};
+/**
- test_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
- @test: The test context object
- This allocates a fake struct &device to create a mock for a KUnit
- test. The device will also be bound to a fake driver. It will thus be
- able to leverage the usual infrastructure and most notably the
- device-managed resources just like a "real" device.
- Callers need to make sure test_kunit_helper_free_device() on the
- device when done.
- Returns:
- A pointer to the new device, or an ERR_PTR() otherwise.
- */
+struct device *test_kunit_helper_alloc_device(struct kunit *test) +{
struct platform_device *pdev;
int ret;
ret = platform_driver_register(&fake_platform_driver);
KUNIT_ASSERT_EQ(test, ret, 0);
pdev = platform_device_alloc(KUNIT_DEVICE_NAME, PLATFORM_DEVID_NONE);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev);
ret = platform_device_add(pdev);
KUNIT_ASSERT_EQ(test, ret, 0);
return &pdev->dev;
+} +EXPORT_SYMBOL_GPL(test_kunit_helper_alloc_device);
+/**
- test_kunit_helper_free_device - Frees a mock device
- @test: The test context object
- @dev: The device to free
- Frees a device allocated with test_kunit_helper_alloc_device().
- */
This really should be automatically called when the test exits, probably using kunit reources. Ideally, there'd also be a function to free it earlier, which can be done by calling kunit_remove_resource() to lower the refcount.
+void test_kunit_helper_free_device(struct kunit *test, struct device *dev) +{
struct platform_device *pdev = to_platform_device(dev);
platform_device_unregister(pdev);
platform_driver_unregister(&fake_platform_driver);
+} +EXPORT_SYMBOL_GPL(test_kunit_helper_free_device);
+MODULE_AUTHOR("Matti Vaittinen mazziesaccount@gmail.com"); +MODULE_LICENSE("GPL"); diff --git a/include/kunit/platform_device.h b/include/kunit/platform_device.h new file mode 100644 index 000000000000..2a9c7bdd75eb --- /dev/null +++ b/include/kunit/platform_device.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __KUNIT_PLATFORM_DEVICE__ +#define __KUNIT_PLATFORM_DEVICE__
+#include <kunit/test.h>
+struct device;
+struct device *test_kunit_helper_alloc_device(struct kunit *test); +void test_kunit_helper_free_device(struct kunit *test, struct device *dev);
If these helpers are supposed to guarantee that the resulting device is a platform device, let's reflect that in their names. Otherwise, let's not put this in a platform_device.h header, but maybe something more general, like kunit/device.h.
+#endif
2.39.2
-- Matti Vaittinen, Linux device drivers ROHM Semiconductors, Finland SWDC Kiviharjunlenkki 1E 90220 OULU FINLAND
Simon says - in Latin please. ~~~ "non cogito me" dixit Rene Descarte, deinde evanescavit ~~~ Thanks to Simon Glass for the translation =]
Hi David, all,
On 3/23/23 09:30, David Gow wrote:
On Wed, 22 Mar 2023 at 17:06, Matti Vaittinen mazziesaccount@gmail.com wrote:
The creation of a dummy device in order to test managed interfaces is a generally useful test feature. The drm test helpers drm_kunit_helper_alloc_device() and drm_kunit_helper_free_device() are doing exactly this. It makes no sense that each and every component which intends to be testing managed interfaces will create similar helpers so stole these for more generic use.
Signed-off-by: Matti Vaittinen mazziesaccount@gmail.com
Changes: v4 => v5:
- Add accidentally dropped header and email recipients
- do not rename + move helpers from DRM but add temporary dublicates to simplify merging. (This patch does not touch DRM and can be merged separately. DRM patch and IIO test patch still depend on this one).
Please note that there's something similar ongoing in the CCF: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/
I do like the simplicity of these DRM-originated helpers so I think they're worth. I do equally like the Stephen's idea of having the "dummy platform device" related helpers under drivers/base and the header being in include/kunit/platform_device.h which is similar to real platform device under include/linux/platform_device.h
Thanks for sending this my way.
It's clear we need some way of creating "fake" devices for KUnit tests. Given that there are now (at least) three different drivers looking to use this, we'll ultimately need something which works for everyone.
I think the questions we therefore need to answer are:
- Do we specifically need a platform_device (or, any other specific
device struct), or would any old struct device work? I can see why we would need a platform device for cases where we're testing things like device-tree properties (or, in the future, having e.g. USB-specific helpers which create usb_device). Most tests just use root_device_register() thus far, though.
Funny timing. I just found the root_device_register() while wondering the parent for auxiliary_devices.
I think the root_device_[un]register() meets my current needs.
- What helpers do we need to make creating, using, and cleaning up
these devices as simple as possible.
I think that in this particular case, we don't actually need a struct platform_device. Replacing these helpers with simple calls to root_device_register() and root_device_unregister() seems to work fine with this patch series for me. (It does break the drm_test_managed_run_action test, though.) So I don't think having these helpers actually help this series at the moment.
I am afraid that further work with these helpers is out of the scope for me (at least for now). I'll drop the DRM and the helper patches from this series && go with the root_device_register(), root_device_unregister() in the IIO tests added in this series.
That being said, if they used the KUnit resource system to automatically clean up the device when the test is finished (or otherwise exits),
My 10 cents to this is that yes, automatic unwinding at test exit would be simple - and also correct for most of the time. However, I think there might also be tests that would like to verify the unwinding process has really managed to do what it was intended to do. That, I think would require being able to manually drop the device in the middle of the test.
So, I guess we have three cases we need to look at:
- A test just needs any old struct device. Tests thus far have
hardcoded, or had their own wrappers around root_device_register() for this.
As said above, my case fits this category.
- A test needs a device attached to a bus (but doesn't care which
bus). Thus far, people have used struct platform_device for this (see the DRM helpers, which use a platform device for managed resource tests[2]). Maybe the right solution here is something like a struct kunit_device?
This sounds like, how to put it, "architecturally correct". But...
- A test needs a device attached to a specific bus. We'll probably
need some more detailed faking of that bus. This covers cases like having fake USB devices, devicetree integration, etc.
...if we in any case need this, wouldn't the kunit_device just be unnecessary bloat because if the test does not care which bus it is sitting in, then it could probably use a bus-specific device as well?
I'd suggest that, for the majority of cases which only care about the first case, let's just use root_device_register() directly,
After finding the root_device_register() - I agree.
or have a thin wrapper like the old root_device-based version of the DRM helpers[3]. This will probable serve us well enough while we work out how to handle the other two cases properly (which is already being looked at for the CLK/DeviceTree patches and the DRM stuff).
If the resulting helpers are generally useful enough, they can probably sit in either drivers/base or lib/kunit. I'd rather not have code that's really specific to certain busses sitting in lib/kunit rather than alongside the device/bus code in drivers/base or some other subsystem/driver path, but I can tolerate it for the very generic struct device.
Regardless, I've left a few notes on the patch itself below.
Thanks but I guess I'll just drop this one :)
Cheers, -- David
Thanks for the thorough analysis and these links! This was enlightening :)
Yours, -- Matti
On Thu, Mar 23, 2023 at 03:30:34PM +0800, David Gow wrote:
On Wed, 22 Mar 2023 at 17:06, Matti Vaittinen mazziesaccount@gmail.com wrote:
The creation of a dummy device in order to test managed interfaces is a generally useful test feature. The drm test helpers drm_kunit_helper_alloc_device() and drm_kunit_helper_free_device() are doing exactly this. It makes no sense that each and every component which intends to be testing managed interfaces will create similar helpers so stole these for more generic use.
Signed-off-by: Matti Vaittinen mazziesaccount@gmail.com
Changes: v4 => v5:
- Add accidentally dropped header and email recipients
- do not rename + move helpers from DRM but add temporary dublicates to simplify merging. (This patch does not touch DRM and can be merged separately. DRM patch and IIO test patch still depend on this one).
Please note that there's something similar ongoing in the CCF: https://lore.kernel.org/all/20230302013822.1808711-1-sboyd@kernel.org/
I do like the simplicity of these DRM-originated helpers so I think they're worth. I do equally like the Stephen's idea of having the "dummy platform device" related helpers under drivers/base and the header being in include/kunit/platform_device.h which is similar to real platform device under include/linux/platform_device.h
Thanks for sending this my way.
It's clear we need some way of creating "fake" devices for KUnit tests. Given that there are now (at least) three different drivers looking to use this, we'll ultimately need something which works for everyone.
I think the questions we therefore need to answer are:
- Do we specifically need a platform_device (or, any other specific
device struct), or would any old struct device work? I can see why we would need a platform device for cases where we're testing things like device-tree properties (or, in the future, having e.g. USB-specific helpers which create usb_device). Most tests just use root_device_register() thus far, though.
Any struct device would work, so please do NOT use a platform device.
Use aux devices, or better yet, just a normal virtual device by calling device_create().
- What helpers do we need to make creating, using, and cleaning up
these devices as simple as possible.
Becides what the driver core already provides you? What exactly are you wanting to do here?
I think that in this particular case, we don't actually need a struct platform_device. Replacing these helpers with simple calls to root_device_register() and root_device_unregister() seems to work fine with this patch series for me. (It does break the drm_test_managed_run_action test, though.) So I don't think having these helpers actually help this series at the moment.
Why abuse root_device_*() for something that really is not a root device in the system? Why not use a virtual device? Or better yet, a kunit bus with devices on it that are just for testing? That way you can properly test the bus and driver and device code fully, which is what you are implying is needed here.
That being said, if they used the KUnit resource system to automatically clean up the device when the test is finished (or otherwise exits), that would seem like a real advantage. The clk and drm examples both do this, and I'm hoping to add an API to make it even simpler going forward. (With the work-in-progress API described here[1], the whole thing should boil down to "struct device *dev = root_device_register(name); kunit_defer(root_device_unregister, dev);".)
So, I guess we have three cases we need to look at:
- A test just needs any old struct device. Tests thus far have
hardcoded, or had their own wrappers around root_device_register() for this.
Again, make a kunit bus and devices, that might be "simplest" overall.
- A test needs a device attached to a bus (but doesn't care which
bus). Thus far, people have used struct platform_device for this (see the DRM helpers, which use a platform device for managed resource tests[2]). Maybe the right solution here is something like a struct kunit_device?
Yes.
- A test needs a device attached to a specific bus. We'll probably
need some more detailed faking of that bus. This covers cases like having fake USB devices, devicetree integration, etc.
Have your own bus. No need to mess with USB or any real bus UNLESS you want to actually test those busses, and if so, just create fake USB or clk or whatever devices so that you can test them.
Or just rely on the testing that some busses have for their devices (USB has this today and syzbot knows how to test it), as busses for hardware can be complex and usually require a "real" driver to be written for them to test things.
thanks,
gre gk-h
Hi David,
On Thu, Mar 23, 2023 at 03:30:34PM +0800, David Gow wrote:
I think the questions we therefore need to answer are:
- Do we specifically need a platform_device (or, any other specific
device struct), or would any old struct device work? I can see why we would need a platform device for cases where we're testing things like device-tree properties (or, in the future, having e.g. USB-specific helpers which create usb_device). Most tests just use root_device_register() thus far, though.
- What helpers do we need to make creating, using, and cleaning up
these devices as simple as possible.
I think that in this particular case, we don't actually need a struct platform_device. Replacing these helpers with simple calls to root_device_register() and root_device_unregister() seems to work fine with this patch series for me. (It does break the drm_test_managed_run_action test, though.) So I don't think having these helpers actually help this series at the moment.
I'm not sure that a root_device is a good option, see below.
That being said, if they used the KUnit resource system to automatically clean up the device when the test is finished (or otherwise exits), that would seem like a real advantage. The clk and drm examples both do this, and I'm hoping to add an API to make it even simpler going forward. (With the work-in-progress API described here[1], the whole thing should boil down to "struct device *dev = root_device_register(name); kunit_defer(root_device_unregister, dev);".)
Oh, yes, please make it happen :)
The current API is a bit of a pain compared to other managed APIs like devm or drmm.
So, I guess we have three cases we need to look at:
- A test just needs any old struct device. Tests thus far have
hardcoded, or had their own wrappers around root_device_register() for this.
- A test needs a device attached to a bus (but doesn't care which
bus). Thus far, people have used struct platform_device for this (see the DRM helpers, which use a platform device for managed resource tests[2]). Maybe the right solution here is something like a struct kunit_device?
- A test needs a device attached to a specific bus. We'll probably
need some more detailed faking of that bus. This covers cases like having fake USB devices, devicetree integration, etc.
Yeah, from the current discussion on the IIO and clk patchsets, and what we've done in DRM, I guess there's two major use cases:
* You test an (isolated) function that takes a device as an argument. Here you probably don't really care about what the device is as long as you can provide one. This is what is being done for the DRM helpers at the moment, even though it's not really isolated. The device is essentially mocked. This could be your points 1 and 2.
* You want to probe the driver with a fake context. The device and drivers are very much real, but the device tree (or whatever) is mocked. This is what the clocks patches from Stephen are doing. This could be covered by your point 3.
I'd suggest that, for the majority of cases which only care about the first case, let's just use root_device_register() directly, or have a thin wrapper like the old root_device-based version of the DRM helpers[3]. This will probable serve us well enough while we work out how to handle the other two cases properly (which is already being looked at for the CLK/DeviceTree patches and the DRM stuff).
I disagree, and I think your cases 1 and 2 should be merged. We were initially using a root_device in DRM. We had to switch to platform_device (but actually any device attached to a bus) because a root device isn't attached to a bus and thus the devm resources are never freed.
When a function takes a device as an argument, there's a good chance that it will use devm nowadays, so we ended up exhausting resources pools (like IDs iirc) when running a lot of tests in sequence.
So yeah, I think you can't just assume that a root device will do even for a true unit test that would test an isolated function. It either needs to be tied to a bus, or we need to force the devm resource release when the root device is removed somehow.
Maxime
On Wed, Mar 22, 2023 at 11:05:23AM +0200, Matti Vaittinen wrote:
Revision history: v4 => v5: Mostly fixes to review comments from Andy and Jonathan.
- more accurate change-log in individual patches
- copy code from DRM test helper instead of moving it to simplify merging
1) Why do you think this is a problem? 2) How would we avoid spreading more copies of the same code in the future?
1) Merge conflicts is not a bad thing. It shows that people tested their code in isolation and stabilized it before submitting to the upper maintainer.
https://yarchive.net/comp/linux/git_merges_from_upstream.html
2) Spreading the same code when we _know_ this, should be very well justified. Merge conflict is an administrative point, and not a technical obstacle to avoid.
- document all exported GTS helpers.
- inline a few GTS helpers
- use again Milli lux for the bu27034 with RAW IIO_LIGHT channel and scale
- Fix bug from added in v4 bu27034 time setting.
Andy Shevchenko andriy.shevchenko@linux.intel.com writes:
Hello Andy,
On Wed, Mar 22, 2023 at 11:05:23AM +0200, Matti Vaittinen wrote:
Revision history: v4 => v5: Mostly fixes to review comments from Andy and Jonathan.
- more accurate change-log in individual patches
- copy code from DRM test helper instead of moving it to simplify merging
Why do you think this is a problem?
How would we avoid spreading more copies of the same code in the future?
Merge conflicts is not a bad thing. It shows that people tested their code
in isolation and stabilized it before submitting to the upper maintainer.
https://yarchive.net/comp/linux/git_merges_from_upstream.html
- Spreading the same code when we _know_ this, should be very well justified.
Merge conflict is an administrative point, and not a technical obstacle to avoid.
I believe this was suggested by Maxime and the rationale is that by just copying the helpers for now, that would make it easier to land instead of requiring coordination between different subystems.
Otherwise the IIO tree will need to provide an inmutable branch for the DRM tree to merge and so on.
I agree with Maxime that a little bit of duplication (that can be cleaned up by each subsystem at their own pace) is the path of least resistance.
On 3/22/23 12:34, Javier Martinez Canillas wrote:
Andy Shevchenko andriy.shevchenko@linux.intel.com writes:
Hello Andy,
On Wed, Mar 22, 2023 at 11:05:23AM +0200, Matti Vaittinen wrote:
Revision history: v4 => v5: Mostly fixes to review comments from Andy and Jonathan.
- more accurate change-log in individual patches
- copy code from DRM test helper instead of moving it to simplify merging
Why do you think this is a problem?
How would we avoid spreading more copies of the same code in the future?
Merge conflicts is not a bad thing. It shows that people tested their code
in isolation and stabilized it before submitting to the upper maintainer.
https://yarchive.net/comp/linux/git_merges_from_upstream.html
- Spreading the same code when we _know_ this, should be very well justified.
Merge conflict is an administrative point, and not a technical obstacle to avoid.
I definitely agree. This is also why I did the renaming and not copying in the first version. In this version I did still add the subsequent patch 2/8 - which drops the duplicates from DRM tree.
I believe this was suggested by Maxime and the rationale is that by just copying the helpers for now, that would make it easier to land instead of requiring coordination between different subystems.
This is correct.
Otherwise the IIO tree will need to provide an inmutable branch for the DRM tree to merge and so on.
Or, if we carry the patch 1/8 via self-test tree, then we get even more players here.
Still, I am not opposing immutable branch because that would allow fast applying of the patch 2/8 as well. Longer that is delayed, more likely we will see more users of the DRM helpers and harder it gets to remove the duplicates later.
I agree with Maxime that a little bit of duplication (that can be cleaned up by each subsystem at their own pace) is the path of least resistance.
I'd say this depends. It probably is the path of least resistance for people maintaining the trees. It can also be the path of least resistance in general - but it depends on if there will be no new users for those DRM helpers while waiting the new APIs being merged in DRM tree. More users we see in DRM, more effort the clean-up requires.
I have no strong opinion on this specific case. I'd just be happy to see the IIO tests getting in preferably sooner than later - although 'soon' and 'late' does also depend on other factors besides these helpers...
Yours, -- Matti
On Wed, Mar 22, 2023 at 12:59:33PM +0200, Matti Vaittinen wrote:
On 3/22/23 12:34, Javier Martinez Canillas wrote:
On Wed, Mar 22, 2023 at 11:05:23AM +0200, Matti Vaittinen wrote:
...
- copy code from DRM test helper instead of moving it to simplify merging
Why do you think this is a problem?
How would we avoid spreading more copies of the same code in the future?
Merge conflicts is not a bad thing. It shows that people tested their code
in isolation and stabilized it before submitting to the upper maintainer.
https://yarchive.net/comp/linux/git_merges_from_upstream.html
- Spreading the same code when we _know_ this, should be very well justified.
Merge conflict is an administrative point, and not a technical obstacle to avoid.
I definitely agree. This is also why I did the renaming and not copying in the first version. In this version I did still add the subsequent patch 2/8
- which drops the duplicates from DRM tree.
I believe this was suggested by Maxime and the rationale is that by just copying the helpers for now, that would make it easier to land instead of requiring coordination between different subystems.
This is correct.
Otherwise the IIO tree will need to provide an inmutable branch for the DRM tree to merge and so on.
Or, if we carry the patch 1/8 via self-test tree, then we get even more players here.
Still, I am not opposing immutable branch because that would allow fast applying of the patch 2/8 as well. Longer that is delayed, more likely we will see more users of the DRM helpers and harder it gets to remove the duplicates later.
I agree with Maxime that a little bit of duplication (that can be cleaned up by each subsystem at their own pace) is the path of least resistance.
I'd say this depends. It probably is the path of least resistance for people maintaining the trees. It can also be the path of least resistance in general - but it depends on if there will be no new users for those DRM helpers while waiting the new APIs being merged in DRM tree. More users we see in DRM, more effort the clean-up requires.
I have no strong opinion on this specific case. I'd just be happy to see the IIO tests getting in preferably sooner than later - although 'soon' and 'late' does also depend on other factors besides these helpers...
Since I'm not a maintainer of either, and one of them requires something, I can't oppose.
On Wed, Mar 22, 2023 at 12:59:33PM +0200, Matti Vaittinen wrote:
I agree with Maxime that a little bit of duplication (that can be cleaned up by each subsystem at their own pace) is the path of least resistance.
I'd say this depends. It probably is the path of least resistance for people maintaining the trees. It can also be the path of least resistance in general - but it depends on if there will be no new users for those DRM helpers while waiting the new APIs being merged in DRM tree. More users we see in DRM, more effort the clean-up requires.
So far there's one user in DRM, and I'm not aware of any current work using it at the moment. Even if some show up in the short-term future, it's not going to be overwhelming.
Maxime
linux-kselftest-mirror@lists.linaro.org