This patch series is a result of discussion at the refcount_t BoF at the Linux Plumbers Conference. In this discussion, we identified a need to look closely at and investigate atomic_t usages in the kernel that are used strictly as counters, without controlling object lifetimes or state changes.
There are a number of atomic_t usages in the kernel where atomic_t api is used strictly for counting and not for managing object lifetime. In some cases, atomic_t might not even be needed.
The purpose of these counters is to clearly differentiate atomic_t counters from atomic_t usages that guard object lifetimes and are hence prone to overflow and underflow errors. This allows tools that scan atomic_t usages for overflow and underflow to restrict their scanning to just the cases that are prone to such errors.
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
The counter wraps around to INT_MIN when it overflows and should not be used to guard resource lifetimes, device usage and open counts that control state changes, or PM states. Overflowing to INT_MIN is consistent with the atomic_t API, which it is built on top of.
Using counter_atomic* to guard lifetimes could lead to use-after-free when it overflows, and to undefined behavior when used to manage state changes and device usage/open states.
This patch series introduces Simple atomic counters. Counter atomic ops leverage atomic_t and provide a sub-set of atomic_t ops.
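To give a rough feel for the shape of the API -- the authoritative definitions are in include/linux/counters.h added by patch 1/11; the sketch below is illustrative only and may not match the exact names and signatures in the series:

#include <linux/atomic.h>

struct counter_atomic32 {
        atomic_t cnt;
};

static inline void counter_atomic32_init(struct counter_atomic32 *cntr, int val)
{
        atomic_set(&cntr->cnt, val);
}

static inline int counter_atomic32_read(const struct counter_atomic32 *cntr)
{
        return atomic_read(&cntr->cnt);
}

static inline void counter_atomic32_inc(struct counter_atomic32 *cntr)
{
        atomic_inc(&cntr->cnt);
}

static inline int counter_atomic32_inc_return(struct counter_atomic32 *cntr)
{
        return atomic_inc_return(&cntr->cnt);
}

static inline void counter_atomic32_dec(struct counter_atomic32 *cntr)
{
        atomic_dec(&cntr->cnt);
}

/*
 * Deliberately absent: dec_and_test(), add()/sub(), cmpxchg() -- anything
 * that would let the counter decide when an object is freed.
 */

Because struct counter_atomic32 is a distinct type, a field declared with it cannot be passed to the wider atomic_t API by accident.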
In addition, this patch series converts a few drivers to use the new API. The following criteria were used to select variables for conversion:
1. The variable doesn't guard object lifetimes or manage state changes, e.g. device usage counts, device open counts, and PM states.
2. The variable is used for stats and counters.
3. The conversion doesn't change the overflow behavior.

A sketch of a typical conversion follows the list.
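Purely for illustration, a typical stats-only conversion looks roughly like this; the driver, struct, and function names below are made up rather than taken from any of the converted drivers:

#include <linux/counters.h>

/* Hypothetical driver stats; wrapping is acceptable for this field. */
struct foo_stats {
        struct counter_atomic32 rx_dropped;        /* was: atomic_t rx_dropped; */
};

static void foo_drop_packet(struct foo_stats *stats)
{
        counter_atomic32_inc(&stats->rx_dropped);        /* was: atomic_inc() */
}

static int foo_dropped_count(const struct foo_stats *stats)
{
        return counter_atomic32_read(&stats->rx_dropped);        /* was: atomic_read() */
}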
Note: Would like to get this into Linux 5.10-rc1 so we can continue updating drivers that can be updated to use this API. If this all looks good, Kees, would you like to take this through your tree, or would you like me to take it through mine?
Changes since Patch v2:
-- Thanks for the reviews and the Reviewed-by and Acked-by tags. Updated the patches with the tags.
-- Minor changes to address Greg's comment to remove default from Kconfig.
-- Added copyrights to new files.

Updates to address comments on v2 from Kees Cook:
-- Updated Patch 1/11 to make clear that the counter wraps around to INT_MIN and that this behavior is consistent with the atomic_t API, on top of which this counter API is built.
-- Other patch change logs updated with the correct wrap-around behavior.
-- Patch 1/11 is updated to add tests with constants for overflow and underflow.
-- Patch 8/11 - added inits for the stat counters.
-- Patch 10/11 - fixes the vmci_num_guest_devices != 0 check to > 0, which is safer than checking for != 0.
Changes since Patch v1:
-- Thanks for the reviews and the Reviewed-by and Acked-by tags. Updated the patches with the tags.
-- Addressed Kees's and Joel's comments:
   1. Removed dec_return interfaces.
   2. Removed counter_simple interfaces, to be added later with changes to drivers that use them (if any).
Changes since RFC:
-- Thanks for the reviews and the Reviewed-by and Acked-by tags. Updated the patches with the tags.
-- Addressed Kees's comments:
   1. Non-atomic counters renamed to counter_simple32 and counter_simple64 to clearly indicate size.
   2. Added a warning for counter_simple* usage; it should be used only when there is no need for atomicity.
   3. Renamed counter_atomic to counter_atomic32 to clearly indicate size.
   4. Renamed counter_atomic_long to counter_atomic64; it now uses atomic64_t ops and indicates size.
   5. Test updated for the API renames.
   6. Added helper functions for printing test results.
   7. Verified that the test module compiles in the kunit env. and that the test module can be loaded to run the test.
   8. Updated Documentation to reflect the intent to make the API restricted so it can never be used to guard object lifetimes and state management. I left the _return ops for now; inc_return is necessary as per the discussion we had on this topic.
-- Updated driver patches with the API name changes.
-- We discussed whether the binder counters can be non-atomic. For now I left them the same as in the RFC patch - using counter_atomic32.
-- Unrelated to this patch series: the patch series review uncovered improvements that could be made to test_async_driver_probe and vmw_vmci/vmci_guest. I will track these for fixing later.
Shuah Khan (11):
  counters: Introduce counter_atomic* counters
  selftests:lib:test_counters: add new test for counters
  drivers/base: convert deferred_trigger_count and probe_count to counter_atomic32
  drivers/base/devcoredump: convert devcd_count to counter_atomic32
  drivers/acpi: convert seqno counter_atomic32
  drivers/acpi/apei: convert seqno counter_atomic32
  drivers/android/binder: convert stats, transaction_log to counter_atomic32
  drivers/base/test/test_async_driver_probe: convert to use counter_atomic32
  drivers/char/ipmi: convert stats to use counter_atomic32
  drivers/misc/vmw_vmci: convert num guest devices counter to counter_atomic32
  drivers/edac: convert pci counters to counter_atomic32
 Documentation/core-api/counters.rst          | 109 ++++++++++++
 MAINTAINERS                                  |   8 +
 drivers/acpi/acpi_extlog.c                   |   5 +-
 drivers/acpi/apei/ghes.c                     |   5 +-
 drivers/android/binder.c                     |  41 ++---
 drivers/android/binder_internal.h            |   3 +-
 drivers/base/dd.c                            |  19 +-
 drivers/base/devcoredump.c                   |   5 +-
 drivers/base/test/test_async_driver_probe.c  |  26 +--
 drivers/char/ipmi/ipmi_msghandler.c          |   9 +-
 drivers/char/ipmi/ipmi_si_intf.c             |   9 +-
 drivers/edac/edac_pci.h                      |   5 +-
 drivers/edac/edac_pci_sysfs.c                |  28 +--
 drivers/misc/vmw_vmci/vmci_guest.c           |   9 +-
 include/linux/counters.h                     | 176 +++++++++++++++++++
 lib/Kconfig                                  |   9 +
 lib/Makefile                                 |   1 +
 lib/test_counters.c                          | 162 +++++++++++++++++
 tools/testing/selftests/lib/Makefile         |   1 +
 tools/testing/selftests/lib/config           |   1 +
 tools/testing/selftests/lib/test_counters.sh |  10 ++
 21 files changed, 567 insertions(+), 74 deletions(-)
 create mode 100644 Documentation/core-api/counters.rst
 create mode 100644 include/linux/counters.h
 create mode 100644 lib/test_counters.c
 create mode 100755 tools/testing/selftests/lib/test_counters.sh
Add a new selftest for testing the counter_atomic* Counters API. The test loads the test_counters test module and then unloads it.
The test module runs tests and prints results in dmesg.
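For reference, an individual check in the companion lib/test_counters.c module has roughly the following shape (illustrative only; the real module in patch 2/11 covers more cases, including overflow and underflow with constants):

static void test_counter_atomic32_inc(void)
{
        struct counter_atomic32 cnt;

        counter_atomic32_init(&cnt, 0);
        counter_atomic32_inc(&cnt);

        if (counter_atomic32_read(&cnt) == 1)
                pr_info("test_counters: inc test passed\n");
        else
                pr_err("test_counters: inc test failed\n");
}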
There are a number of atomic_t usages in the kernel where atomic_t api is used strictly for counting and not for managing object lifetime. In some cases, atomic_t might not even be needed.
The purpose of these counters is to clearly differentiate atomic_t counters from atomic_t usages that guard object lifetimes and are hence prone to overflow and underflow errors. This allows tools that scan atomic_t usages for overflow and underflow to restrict their scanning to just the cases that are prone to such errors.
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
The counter wraps around to INT_MIN when it overflows and should not be used to guard resource lifetimes, device usage and open counts that control state changes, or PM states. Overflowing to INT_MIN is consistent with the atomic_t API, which it is built on top of.
Using counter_atomic* to guard lifetimes could lead to use-after-free when it overflows, and to undefined behavior when used to manage state changes and device usage/open states.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
---
 MAINTAINERS                                  |  1 +
 tools/testing/selftests/lib/Makefile         |  1 +
 tools/testing/selftests/lib/config           |  1 +
 tools/testing/selftests/lib/test_counters.sh | 10 ++++++++++
 4 files changed, 13 insertions(+)
 create mode 100755 tools/testing/selftests/lib/test_counters.sh
diff --git a/MAINTAINERS b/MAINTAINERS
index 4e82d0ffcab0..26719b8dd48e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15845,6 +15845,7 @@ L: linux-kernel@vger.kernel.org
 S: Maintained
 F: include/linux/counters.h
 F: lib/test_counters.c
+F: tools/testing/selftests/lib/test_counters.sh
 
 SIMPLE FIRMWARE INTERFACE (SFI)
 S: Obsolete
diff --git a/tools/testing/selftests/lib/Makefile b/tools/testing/selftests/lib/Makefile
index a105f094676e..e8960d7934e2 100644
--- a/tools/testing/selftests/lib/Makefile
+++ b/tools/testing/selftests/lib/Makefile
@@ -5,5 +5,6 @@ all:
 
 TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh strscpy.sh
+TEST_PROGS += test_counters.sh
 
 include ../lib.mk
diff --git a/tools/testing/selftests/lib/config b/tools/testing/selftests/lib/config
index b80ee3f6e265..6ed25024d371 100644
--- a/tools/testing/selftests/lib/config
+++ b/tools/testing/selftests/lib/config
@@ -3,3 +3,4 @@ CONFIG_TEST_BITMAP=m
 CONFIG_PRIME_NUMBERS=m
 CONFIG_TEST_STRSCPY=m
 CONFIG_TEST_BITOPS=m
+CONFIG_TEST_COUNTERS=m
diff --git a/tools/testing/selftests/lib/test_counters.sh b/tools/testing/selftests/lib/test_counters.sh
new file mode 100755
index 000000000000..78726cad5c7a
--- /dev/null
+++ b/tools/testing/selftests/lib/test_counters.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2020 Shuah Khan <skhan@linuxfoundation.org>
+# Copyright (c) 2020 The Linux Foundation
+#
+# Tests the Simple Atomic Counters interfaces using test_counters
+# kernel module
+#
+$(dirname $0)/../kselftest/module.sh "test_counters" test_counters
On Fri, Oct 09, 2020 at 09:55:57AM -0600, Shuah Khan wrote:
Add a new selftest for testing the counter_atomic* Counters API. The test loads the test_counters test module and then unloads it.
The test module runs tests and prints results in dmesg.
There are a number of atomic_t usages in the kernel where atomic_t api is used strictly for counting and not for managing object lifetime. In some cases, atomic_t might not even be needed.
The purpose of these counters is to clearly differentiate atomic_t counters from atomic_t usages that guard object lifetimes and are hence prone to overflow and underflow errors. This allows tools that scan atomic_t usages for overflow and underflow to restrict their scanning to just the cases that are prone to such errors.
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
The counter wraps around to INT_MIN when it overflows and should not be used to guard resource lifetimes, device usage and open counts that control state changes, or PM states. Overflowing to INT_MIN is consistent with the atomic_t API, which it is built on top of.
Using counter_atomic* to guard lifetimes could lead to use-after-free when it overflows, and to undefined behavior when used to manage state changes and device usage/open states.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
Note: Would like to get this into Linux 5.10-rc1 so we can continue updating drivers that can be updated to use this API. If this all looks good, Kees, would you like to take this through your tree, or would you like me to take it through mine?
I'd mentioned this in the v2, but yes, please take via your trees. :)
I'm glad to see this landing!
On 10/9/20 12:03 PM, Kees Cook wrote:
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
Note: Would like to get this into Linux 5.10-rc1 so we can continue updating drivers that can be updated to use this API. If this all looks good, Kees, would you like to take this through your tree, or would you like me to take it through mine?
I'd mentioned this in the v2, but yes, please take via your trees. :)
Sorry. I missed that. I will take this through my trees.
I'm glad to see this landing!
Thanks for the reviews and brainstorming ideas. It has been a lot of fun doing this work.
thanks, -- Shuah
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
This patch series is a result of discussion at the refcount_t BOF the Linux Plumbers Conference. In this discussion, we identified a need for looking closely and investigating atomic_t usages in the kernel when it is used strictly as a counter without it controlling object lifetimes and state changes.
There are a number of atomic_t usages in the kernel where atomic_t api is used strictly for counting and not for managing object lifetime. In some cases, atomic_t might not even be needed.
Then the right patch is to not use atomic_t in those cases.
The purpose of these counters is to clearly differentiate atomic_t counters from atomic_t usages that guard object lifetimes and are hence prone to overflow and underflow errors. This allows tools that scan atomic_t usages for overflow and underflow to restrict their scanning to just the cases that are prone to such errors.
Guarding lifetimes is what we got refcount_t for.
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
To what actual purpose?!? AFAICT it's pointless wrappery, it gets us nothing.
The counter wraps around to INT_MIN when it overflows and should not be used to guard resource lifetimes, device usage and open counts that control state changes, or PM states. Overflowing to INT_MIN is consistent with the atomic_t API, which it is built on top of. Using counter_atomic* to guard lifetimes could lead to use-after-free when it overflows, and to undefined behavior when used to manage state changes and device usage/open states.
This patch series introduces Simple atomic counters. Counter atomic ops leverage atomic_t and provide a sub-set of atomic_t ops.
Thanks for Cc'ing the atomic maintainers :/
NAK.
On Fri, Oct 09, 2020 at 09:37:46PM +0200, Peter Zijlstra wrote:
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
To what actual purpose?!? AFAICT it's pointless wrappery, it gets us nothing.
It's not pointless. There is value in separating types for behavioral constraint to avoid flaws. atomic_t provides a native operation. We gained refcount_t for the "must not wrap" type, and this gets us the other side of that behavioral type, which is "wrapping is expected". Separating the atomic_t uses allows for a clearer path to being able to reason about code flow, whether it be a human or a static analyzer.
The counter wrappers add nothing to the image size, and only serve to confine the API to one that cannot be used for lifetime management.
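To make the confinement point concrete, here is a sketch (hypothetical struct and function names, using the counter type sketched earlier in the thread) of the kind of accidental mutation the separate type catches at build time:

struct foo_dev {
        struct counter_atomic32 err_count;        /* a statistic, nothing more */
};

static void foo_log_error(struct foo_dev *dev)
{
        counter_atomic32_inc(&dev->err_count);

        /*
         * If a later "cleanup" tries to turn the statistic into a
         * lifetime test, e.g.:
         *
         *        if (atomic_dec_and_test(&dev->err_count))
         *                kfree(dev);
         *
         * it no longer slips through: err_count is not an atomic_t, so
         * the compiler rejects the call, and there is no
         * counter_atomic32_dec_and_test() to reach for.  The mutation
         * has to happen as an explicit, reviewable type change.
         */
}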
Once conversions are done, we have a clean line between refcounting and statistical atomics, which means we have a much lower chance of introducing new flaws (and maybe we'll fix flaws during the conversion, which we've certainly seen before when doing this stricter type/language changes).
I don't see why this is an objectionable goal.
On Fri, Oct 09, 2020 at 01:45:43PM -0700, Kees Cook wrote:
On Fri, Oct 09, 2020 at 09:37:46PM +0200, Peter Zijlstra wrote:
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
To what actual purpose?!? AFAICT it's pointless wrappery, it gets us nothing.
It's not pointless. There is value in separating types for behavioral constraint to avoid flaws. atomic_t provides a native operation. We gained refcount_t for the "must not wrap" type, and this gets us the other side of that behavioral type, which is "wrapping is expected". Separating the atomic_t uses allows for a clearer path to being able to reason about code flow, whether it be a human or a static analyzer.
refcount_t got us actual runtime exceptions that atomic_t doesn't. This proposal gets us nothing.
atomic_t is very much expected to wrap.
The counter wrappers add nothing to the image size, and only serve to confine the API to one that cannot be used for lifetime management.
It doesn't add anything period. It doesn't get us new behaviour, it splits a 'can wrap' use-case from a 'can wrap' type. That's sodding pointless.
Worse, it mixes 2 unrelated cases into one type, which just makes a mockery of things (all the inc_return users are not statistics, some might even mis-behave if they wrap).
Once conversions are done, we have a clean line between refcounting and statistical atomics, which means we have a much lower chance of introducing new flaws (and maybe we'll fix flaws during the conversion, which we've certainly seen before when doing this stricter type/language changes).
I don't see why this is an objectionable goal.
People can and will always find a way to mess things up.
Only add types when you get behavioural changes, otherwise it's pointless noise.
My NAK stands.
On 10/10/20 5:09 AM, Peter Zijlstra wrote:
On Fri, Oct 09, 2020 at 01:45:43PM -0700, Kees Cook wrote:
On Fri, Oct 09, 2020 at 09:37:46PM +0200, Peter Zijlstra wrote:
On Fri, Oct 09, 2020 at 09:55:55AM -0600, Shuah Khan wrote:
Simple atomic counters api provides interfaces for simple atomic counters that just count, and don't guard resource lifetimes. The interfaces are built on top of atomic_t api, providing a smaller subset of atomic_t interfaces necessary to support simple counters.
To what actual purpose?!? AFAICT it's pointless wrappery, it gets us nothing.
It's not pointless. There is value in separating types for behavioral constraint to avoid flaws. atomic_t provides a native operation. We gained refcount_t for the "must not wrap" type, and this gets us the other side of that behavioral type, which is "wrapping is expected". Separating the atomic_t uses allows for a clearer path to being able to reason about code flow, whether it be a human or a static analyzer.
refcount_t got us actual runtime exceptions that atomic_t doesn't. This proposal gets us nothing.
atomic_t is very much expected to wrap.
The counter wrappers add nothing to the image size, and only serve to confine the API to one that cannot be used for lifetime management.
It doesn't add anything period. It doesn't get us new behaviour, it splits a 'can wrap' use-case from a 'can wrap' type. That's sodding pointless.
They don't add any new behavior. As Kees mentioned, they do give us a way to clearly differentiate atomic usages that can wrap.
Let's discuss the problem at hand before dismissing it as pointless.
Worse, it mixes 2 unrelated cases into one type, which just makes a mockery of things (all the inc_return users are not statistics, some might even mis-behave if they wrap).
You are right that not all inc_return usages are statistics. There are 3 distinct usages:
1. Stats
2. Cases where wrapping is fine
3. Cases where wrapping could be a problem, in which case this API shouldn't be used.
There is no need to keep inc_return in this API as such. I included it so it can be used for above cases 1 and 2, so the users don't have to call inc() followed by read(). It can be left out of the API.
The atomic_t usages in the kernel fall into the following categories:
1. Stats (tolerance for accuracy determines whether they need to be atomic or not). The RFC version included a non-atomic API for cases where lossiness is acceptable. All these cases use/need just init and inc. There are two variations in this case:

   a. No checks for wrapping; uses a signed value.
   b. No checks for wrapping, but returns unsigned.

2. Reference counters that release resources, where wrapping could result in use-after-free type problems. There are two variations in this case:

   a. Increments and decrements aren't bounded.
   b. Increments and decrements are bounded.
Currently tools that flag unsafe atomic_t usages that are candidates for refcount_t conversions don't make a distinction between the two.
In the second case, since increments and decrements are bounded, it is safe to continue using atomic_t. At the moment there is no good way to tell the two apart other than looking at each of these cases.
3. Reference counters that manage/control states. Wrapping is a problem in this case, as it could lead to undefined behavior. These cases don't use test-and-free; they use inc/dec. At the moment there is no good way to tell them apart other than looking at each of these cases. This is addressed by the REFCOUNT_SATURATED case.
This API addresses 1a: stats, with no checks for wrapping, using a signed value at the moment, with a plan to add unsigned support for cases where unsigned is being used.
It is possible to cover 2b in this API, so it becomes easier to make a clear distinction between the two cases and we can focus only on the atomic_t cases that need to be converted to refcount_t. This is easy to do by allowing a max threshold for the variable and checking against it, so the counter is never allowed to go above it.
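Not part of this series, but as a rough sketch (hypothetical name, built on the counter type sketched earlier) of what a bounded increment for case 2b could look like:

/*
 * Increment unless the counter has already reached 'max'.  Returns false
 * when the bound would be exceeded, so bounded uses (case 2b) become
 * self-documenting and mechanically distinguishable from plain wrapping
 * counters.
 */
static inline bool counter_atomic32_inc_below(struct counter_atomic32 *cntr,
                                              int max)
{
        int old = atomic_read(&cntr->cnt);

        do {
                if (old >= max)
                        return false;
        } while (!atomic_try_cmpxchg(&cntr->cnt, &old, old + 1));

        return true;
}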
There are several atomic_t usages that use just:
-- init or set and inc
-- init or set and inc/dec (including the ones that manage state)
-- Increments and decrements are bounded
Creating a subset of the atomic_t API would help us differentiate these cases and make it easy for us to identify and fix cases where refcount_t should be used.
Would you be open to considering a subset if it addresses 2b and unsigned returns for stats?
thanks, -- Shuah
On Tue, Oct 13, 2020 at 08:12:20PM -0600, Shuah Khan wrote:
They don't add any new behavior. As Kees mentioned, they do give us a way to clearly differentiate atomic usages that can wrap.
No it doesn't! atomic_t can wrap, this thing can wrap, no distinction.
All it does is fragment the API and sow confusion. FOR NO BENEFIT.
Worse, it mixes 2 unrelated cases into one type, which just makes a mockery of things (all the inc_return users are not statistics, some might even mis-behave if they wrap).
You are right that not all inc_return usages are statistics. There are 3 distinct usages:
- Stats
- Cases where wrapping is fine
- Cases where wrapping could be a problem. In which case, this API shouldn't be used.
And yet, afaict patch 4 is case 3...
There is no need to keep inc_return in this API as such. I included it so it can be used for above cases 1 and 2, so the users don't have to call inc() followed by read(). It can be left out of the API.
The atomic_t usages in the kernel fall into the following categories:
Stats (tolerance for accuracy determines whether they need to be atomic or not). RFC version included non-atomic API for cases when lossiness is acceptable. All these cases use/need just init and inc. There are two variations in this case:
a. No checks for wrapping; uses a signed value.
b. No checks for wrapping, but returns unsigned.
Reference counters that release resources, where wrapping could result in use-after-free type problems. There are two variations in this case:
a. Increments and decrements aren't bounded.
b. Increments and decrements are bounded.
Currently tools that flag unsafe atomic_t usages that are candidates for refcount_t conversions don't make a distinction between the two.
In the second case, since increments and decrements are bounded, it is safe to continue using atomic_t. At the moment there is no good way to tell the two apart other than looking at each of these cases.
Reference counters that manage/control states. Wrapping is a problem in this case, as it could lead to undefined behavior. These cases don't use test-and-free; they use inc/dec. At the moment there is no good way to tell them apart other than looking at each of these cases. This is addressed by the REFCOUNT_SATURATED case.
Wrong! The atomic usage in mutex doesn't fall in any of those categories.
The only thing you're all saying that makes sense is that unintentional wrapping can have bad consequences, the rest is pure confusion.
Focus on the non-wrapping cases, _everything_ else is not going anywhere.
So audit the kernel, find the cases that should not wrap, categorize and create APIs for them that trap the wrapping. But don't go around confusing things that don't need confusion.
On Wed, Oct 14, 2020 at 11:17:20AM +0200, Peter Zijlstra wrote:
On Tue, Oct 13, 2020 at 08:12:20PM -0600, Shuah Khan wrote:
They don't add any new behavior. As Kees mentioned, they do give us a way to clearly differentiate atomic usages that can wrap.
No it doesn't! atomic_t can wrap, this thing can wrap, no distinction.
All it does is fragment the API and sow confusion. FOR NO BENEFIT.
I really don't see it this way. It's a distinct subset of the atomic_t API. The trouble that has existed here has been with an atomic_t originally used NOT for lifetime management mutating into something like that because of the available API, without anyone realizing it. atomic_t gets used for all kinds of algorithms, and the "counter" type is way too easily accidentally transformed into a "lifetime tracker" and we get bugs.
If we have a distinct type for wrapping-counters that limits the API, then it is much harder for folks to shoot themselves in the foot. I don't see why this is so bad: we end up with safer usage, more easily auditable code behavior ("how was this atomic_t instance _intended_ to be used?"), and no change in binary size.
There is no need to keep inc_return in this API as such. I included it so it can be used for above cases 1 and 2, so the users don't have to call inc() followed by read(). It can be left out of the API.
I go back and forth on this, but after looking at these instances, it makes sense to have inc_return(), for where counters are actually "serial numbers". An argument could be made[1], however, that such uses should not end up in the position of _reusing_ earlier identifiers, which means it actually can't wrap. (And some cases just need u64 to make this happen[2] -- and in that specific case, don't even need to be atomic_t).
[1] https://lore.kernel.org/lkml/202010071334.8298F3FA7@keescook/ [2] https://git.kernel.org/linus/d1e7fd6462ca9fc76650fbe6ca800e35b24267da
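The "serial number" flavor being kept is roughly the following pattern (hypothetical names, using the counter type sketched earlier in the thread; as noted above, cases that truly must never reuse an identifier want a wider type instead):

static struct counter_atomic32 foo_log_seqno;        /* wrap to INT_MIN is tolerated */

struct foo_log_entry {
        int seqno;
        /* ... */
};

static void foo_log_event(struct foo_log_entry *e)
{
        /* Hand out the next number atomically; a separate inc() + read() would race. */
        e->seqno = counter_atomic32_inc_return(&foo_log_seqno);
}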
Wrong! The atomic usage in mutex doesn't fall in any of those categories.
But the atomic usage in mutex is *IN* mutex -- it's a separate data type, etc. We don't build mutexes manually, so why build counters manually?
The only thing you're all saying that makes sense is that unintentional wrapping can have bad consequences, the rest is pure confusion.
Focus on the non-wrapping cases, _everything_ else is not going anywhere.
I view this as a way to do so: this subset of wrapping cases is being identified and removed from the pool of all the atomic_t cases so that they will have been classified, and we can continue to narrow down all the atomic_t uses to find any potentially mis-used non-wrapping cases.
The other option is adding some kind of attribute to the declarations (which gets us the annotation) but doesn't provide a limit to the API. (e.g. no counter should ever call dec_return).
So audit the kernel, find the cases that should not wrap, categorize and create APIs for them that trap the wrapping. But don't go around confusing things that don't need confusion.
That's what's happening here. But as it turns out, it's easier to do this by employing both the process of elimination (mark the counters) and direct identification (mark the refcount_t). Then the pool of "unannotated" atomic_t instances continues to shrink.
On Wed, Oct 14, 2020 at 04:31:42PM -0700, Kees Cook wrote:
On Wed, Oct 14, 2020 at 11:17:20AM +0200, Peter Zijlstra wrote:
On Tue, Oct 13, 2020 at 08:12:20PM -0600, Shuah Khan wrote:
They don't add any new behavior. As Kees mentioned, they do give us a way to clearly differentiate atomic usages that can wrap.
No it doesn't! atomic_t can wrap, this thing can wrap, no distinction.
All it does is fragment the API and sow confusion. FOR NO BENEFIT.
I really don't see it this way. It's a distinct subset of the atomic_t API. The trouble that has existed here has been with an atomic_t originally used NOT for lifetime management mutating into something like that because of the available API, without anyone realizing it. atomic_t gets used for all kinds of algorithms, and the "counter" type is way too easily accidentally transformed into a "lifetime tracker" and we get bugs.
If we have a distinct type for wrapping-counters that limits the API, then it is much harder for folks to shoot themselves in the foot. I don't see why this is so bad: we end up with safer usage, more easily auditable code behavior ("how was this atomic_t instance _intended_ to be used?"), and no change in binary size.
I simply do not believe in that. Maybe I've seen too much code like:
atomic_dec_and_test(&counter->cnt);
or god forbid (thanks Will):
typedef union {
        refcount_t ref;
        counter_atomic_t cnt;
        atomic_t atm;
} my_awesome_type_t;
People simply aren't too fussed. They'll do whatever it takes (to get the job done with minimal effort).
You think you've cleaned things up, but the moment you turn your back, it'll descend into madness at break-neck speed. This is part economics and part the fact that C is crap (see the above two examples).
Can't we use static-analysis/robots for this? The constant feedback from those is the only thing that helps stem the tide of crap.
Create a variable attribute __wrap, and have the robot complain when atomic_*_and_test() are used on a variable declared with __wrap. Or better yet, when a __wrap heads a condition that leads to call_rcu() / *free*().
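A sketch of how that annotation route might look; the __wrap marker below is hypothetical and, like other checker-only annotations, would expand to nothing for the compiler and exist purely for a robot to match on:

#include <linux/atomic.h>
#include <linux/slab.h>

/* Hypothetical marker: no effect on generated code, visible to a checker. */
#define __wrap

struct foo_stats {
        atomic_t tx_errors __wrap;        /* wrapping is expected here */
};

static void foo_stats_teardown(struct foo_stats *stats)
{
        /*
         * The robot would flag this: *_and_test() on a __wrap-annotated
         * variable, with the result gating a free.
         */
        if (atomic_dec_and_test(&stats->tx_errors))
                kfree(stats);
}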
There is no need to keep inc_return in this API as such. I included it so it can be used for above cases 1 and 2, so the users don't have to call inc() followed by read(). It can be left out of the API.
I go back and forth on this, but after looking at these instances, it makes sense to have inc_return(), for where counters are actually "serial numbers". An argument could be made[1], however, that such uses should not end up in the position of _reusing_ earlier identifiers, which means it actually can't wrap. (And some cases just need u64 to make this happen[2] -- and in that specific case, don't even need to be atomic_t).
[1] https://lore.kernel.org/lkml/202010071334.8298F3FA7@keescook/ [2] https://git.kernel.org/linus/d1e7fd6462ca9fc76650fbe6ca800e35b24267da
Sure, there's valid cases where wrapping is ok, but it's a distinct use-case and really shouldn't be mixed up in the statistics one -- they're distinct.
The fact that patch #4 appears to get this wrong seems like a fairly big clue.
Wrong! The atomic usage in mutex doesn't fall in any of those categories.
But the atomic usage in mutex is *IN* mutex -- it's a separate data type, etc. We don't build mutexes manually, so why build counters manually?
The claim was: atomic usage in the kernel falls in these 3 categories.
- mutex uses atomic (atomic_long_t); - mutex is in the kernel; - mutex's use of atomic does not fit the listed categories.
Therefore, the claim is false. In fact most of the locking primitives use atomic one way or another. And yes, some people do build their own.
The only thing you're all saying that makes sense is that unintentional wrapping can have bad consequences, the rest is pure confusion.
Focus on the non-wrapping cases, _everything_ else is not going anywhere.
I view this as a way to do so: this subset of wrapping cases is being identified and removed from the pool of all the atomic_t cases so that they will have been classified, and we can continue to narrow down all the atomic_t uses to find any potentially mis-used non-wrapping cases.
The other option is adding some kind of attribute to the declarations (which gets us the annotation) but doesn't provide a limit to the API. (e.g. no counter should ever call dec_return).
See above; robots can do exactly that.
So audit the kernel, find the cases that should not wrap, categorize and create APIs for them that trap the wrapping. But don't go around confusing things that don't need confusion.
That's what's happening here. But as it turns out, it's easier to do this by employing both the process of elimination (mark the counters) and direct identification (mark the refcount_t). Then the pool of "unannotated" atomic_t instances continues to shrink.
That's like saying: "I'm too lazy to track what I've looked at already". You're basically proposing to graffiti "Kees was here -- 16/10/2020" all over the kernel. Just so you can see where you still need to go.
It says the code was (assuming your audit was correct) good at that date, but has no guarantees for any moment after that.
On Fri, Oct 16, 2020 at 12:53:13PM +0200, Peter Zijlstra wrote:
That's like saying: "I'm too lazy to track what I've looked at already". You're basically proposing to graffiti "Kees was here -- 16/10/2020" all over the kernel. Just so you can see where you still need to go.
It says the code was (assuming your audit was correct) good at that date, but has no guarantees for any moment after that.
That kind of bit-rot marking is exactly what I would like to avoid: just putting a comment in is pointless. Making the expectations of the usage become _enforced_ is the goal. And having it enforced by the _compiler_ is key. Just adding a meaningless attribute that a static checker will notice some time and hope people fix them doesn't scale either (just look at how many sparse warnings there are). So with C's limits, the API and type enforcement become the principle way to get this done.
So, if there are behavioral patterns we CAN carve away from atomic_t cleanly (and I think there are), those are the ones I want to work towards. The "corner case" uses of atomic_t are much less common than the "big" code patterns like lifetime management (now delegated to and enforced by refcount_t). My estimation was that _statistics_ (and not "serial identifiers") was the next biggest code pattern using atomic_t that could benefit from having its usage isolated. It is not itself a dangerous code pattern, but it can mask finding them.
Then, at the end of the day, only the corner cases remain, and those can be seen clearly as they change over time. Since we can never have a one-time audit be anything other than advisory, we need to make it EASY to do those kinds of audits so they can be done regularly.
On Fri, Oct 16, 2020 at 03:51:25PM -0700, Kees Cook wrote:
On Fri, Oct 16, 2020 at 12:53:13PM +0200, Peter Zijlstra wrote:
That's like saying: "I'm too lazy to track what I've looked at already". You're basically proposing to graffiti "Kees was here -- 16/10/2020" all over the kernel. Just so you can see where you still need to go.
It says the code was (assuming your audit was correct) good at that date, but has no guarantees for any moment after that.
That kind of bit-rot marking is exactly what I would like to avoid: just putting a comment in is pointless. Making the expectations of the usage become _enforced_ is the goal. And having it enforced by the _compiler_ is key. Just adding a meaningless attribute that a static checker will notice some time and hope people fix them doesn't scale either (just look at how many sparse warnings there are).
Most Sparse warnings are false positives. People do actually fix the ones which matter.
I think this patchset could be useful. I'm working on a refcounting check for Smatch. I want to warn about when we forget to drop a reference on an error path. Right now I just assume that anything with "error", "drop" or "->stats->" in the name is just a counter.
regards, dan carpenter
On 10/14/20 5:31 PM, Kees Cook wrote:
On Wed, Oct 14, 2020 at 11:17:20AM +0200, Peter Zijlstra wrote:
On Tue, Oct 13, 2020 at 08:12:20PM -0600, Shuah Khan wrote:
They don't add any new behavior. As Kees mentioned, they do give us a way to clearly differentiate atomic usages that can wrap.
No it doesn't! atomic_t can wrap, this thing can wrap, no distinction.
All it does is fragment the API and sow confusion. FOR NO BENEFIT.
I really don't see it this way. It's a distinct subset of the atomic_t API. The trouble that has existed here has been with an atomic_t originally used NOT for lifetime management mutating into something like that because of the available API, without anyone realizing it. atomic_t gets used for all kinds of algorithms, and the "counter" type is way too easily accidentally transformed into a "lifetime tracker" and we get bugs.
If we have a distinct type for wrapping-counters that limits the API, then it is much harder for folks to shoot themselves in the foot. I don't see why this is so bad: we end up with safer usage, more easily auditable code behavior ("how was this atomic_t instance _intended_ to be used?"), and no change in binary size.
There is no need to keep inc_return in this API as such. I included it so it can be used for above cases 1 and 2, so the users don't have to call inc() followed by read(). It can be left out of the API.
I go back and forth on this, but after looking at these instances, it makes sense to have inc_return(), for where counters are actually "serial numbers". An argument could be made[1], however, that such uses should not end up in the position of _reusing_ earlier identifiers, which means it actually can't wrap. (And some cases just need u64 to make this happen[2] -- and in that specific case, don't even need to be atomic_t).
[1] https://lore.kernel.org/lkml/202010071334.8298F3FA7@keescook/ [2] https://git.kernel.org/linus/d1e7fd6462ca9fc76650fbe6ca800e35b24267da
Wrong! The atomic usage in mutex doesn't fall in any of those categories.
But the atomic usage in mutex is *IN* mutex -- it's a separate data type, etc. We don't build mutexes manually, so why build counters manually?
The only thing you're all saying that makes sense is that unintentional wrapping can have bad consequences, the rest is pure confusion.
Focus on the non-wrapping cases, _everything_ else is not going anywhere.
I view this as a way to do so: this subset of wrapping cases is being identified and removed from the pool of all the atomic_t cases so that they will have been classified, and we can continue to narrow down all the atomic_t uses to find any potentially mis-used non-wrapping cases.
The other option is adding some kind of attribute to the declarations (which gets us the annotation) but doesn't provide a limit to the API. (e.g. no counter should ever call dec_return).
Not sure about that. We have more than dec_return to deal with. More on this below.
So audit the kernel, find the cases that should not wrap, categorize and create APIs for them that trap the wrapping. But don't go around confusing things that don't need confusion.
That's what's happening here. But as it turns out, it's easier to do this by employing both the process of elimination (mark the counters) and direct identification (mark the refcount_t). Then the pool of "unannotated" atomic_t instances continues to shrink.
Right, auditing is what is happening now.
Let me summarize the discussion:
The atomic_t API provides a wide range of atomic operations as a base API to implement atomic counters, bitops, and spinlock interfaces. Its usages also evolved into guarding resource lifetimes and state management. Then came the refcount_t API to address resource lifetime problems related to atomic_t wrapping.
There is a large overlap between atomic_t used for resource lifetimes and atomic_t used as plain counters. Not all counters used for resource lifetimes can be converted to refcount_t.
A few quick "git grep" numbers on atomic_t interfaces usage:
Common for all:
  atomic_set()        - 3418
  atomic_read()       - 5833
  atomic_inc()        - 3376
  atomic_dec()        - 2498
  atomic_inc_return() -  612
Counters don't need these:
  atomic_dec_return()   - 295
  atomic_add_return()   - 209
  atomic_sub_return()   - 144
  atomic_add()          - 744
  atomic_sub()          - 371
  atomic_dec_and_test() - 552
You can see from these numbers the volume of common usages that makes it difficult to separate out counter vs. non-counter usages.
The problem we are now running into is that it is becoming difficult to weed out candidates for refcount_t conversion in this noise.
Isolating a smaller subset of arithmetic atomic ops to address this specific counters use-case will help reduce the noise. This way we can go through this work once, convert all counters to use this narrowly scoped API, and what is left are the non-counter usages.
The current situation is more confusing, and adding a narrowly focused API for counters reduces the confusion and makes the audit easier.
thanks, -- Shuah