Currently, when a non-exclusive cpuset's "cpuset.cpus" overlaps with a partitioned sibling, the sibling's partition state becomes invalid. However, this invalidation is often unnecessary.
This can be observed in specific configuration sequences:
Case 1: Partition created first, then non-exclusive cpuset overlaps
  #1> mkdir -p /sys/fs/cgroup/A1
  #2> echo "0-1" > /sys/fs/cgroup/A1/cpuset.cpus
  #3> echo "root" > /sys/fs/cgroup/A1/cpuset.cpus.partition
  #4> mkdir -p /sys/fs/cgroup/B1
  #5> echo "0-3" > /sys/fs/cgroup/B1/cpuset.cpus
     // A1's partition becomes "root invalid" - this is unnecessary

Case 2: Non-exclusive cpuset exists first, then partition created
  #1> mkdir -p /sys/fs/cgroup/B1
  #2> echo "0-1" > /sys/fs/cgroup/B1/cpuset.cpus
  #3> mkdir -p /sys/fs/cgroup/A1
  #4> echo "0-1" > /sys/fs/cgroup/A1/cpuset.cpus
  #5> echo "root" > /sys/fs/cgroup/A1/cpuset.cpus.partition
     // A1's partition becomes "root invalid" - this is unnecessary
In Case 1, the effective CPU mask of B1 can differ from its requested mask. B1 can use CPUs 2-3 which don't overlap with A1's exclusive CPUs (0-1), thus not violating A1's exclusivity requirement.
In Case 2, B1 can inherit the effective CPUs from its parent, so there is no need to invalidate A1's partition state.
This patch relaxes the overlap check to only consider conflicts between partitioned siblings, not between a partitioned cpuset and a regular non-exclusive one.
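The two cases above can be illustrated with a small toy model. This is only a sketch under stated assumptions: a 4-CPU machine, CPU sets held as integer bitmasks, and hypothetical helper names (cpulist_to_mask, mask_to_cpulist, effective_cpus) that do not exist in the kernel. It models the cgroup v2 rule that a member cpuset's effective CPUs are its requested CPUs intersected with the parent's effective CPUs, falling back to the parent's effective CPUs when that intersection is empty.

```sh
#!/bin/sh
# Toy model only: CPU sets are bitmasks in shell integers.
# All function names here are hypothetical illustrations.

# Convert a cpulist such as "0-1,3" into a bitmask.
cpulist_to_mask() {
    mask=0
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        first=${range%-*}
        last=${range#*-}
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$((mask | (1 << cpu)))
            cpu=$((cpu + 1))
        done
    done
    IFS=$old_ifs
    echo "$mask"
}

# Convert a bitmask back into a comma-separated CPU list.
mask_to_cpulist() {
    mask=$1
    cpu=0
    out=
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="${out:+$out,}$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# effective = requested & parent's effective; if that is empty,
# inherit the parent's effective CPUs.
effective_cpus() {
    eff=$(( $1 & $2 ))
    if [ "$eff" -eq 0 ]; then
        eff=$2
    fi
    echo "$eff"
}

all=$(cpulist_to_mask 0-3)          # assumed 4-CPU machine
a1_excl=$(cpulist_to_mask 0-1)      # A1 is a partition root owning 0-1
parent_eff=$(( all & ~a1_excl ))    # what is left for A1's siblings

# Case 1: B1 requests 0-3
mask_to_cpulist "$(effective_cpus "$(cpulist_to_mask 0-3)" "$parent_eff")"
# Case 2: B1 requests 0-1
mask_to_cpulist "$(effective_cpus "$(cpulist_to_mask 0-1)" "$parent_eff")"
```

In this model both calls print 2,3: in Case 1 the request is trimmed to the CPUs A1 does not own, and in Case 2 the empty intersection falls back to the parent's effective CPUs. In neither case does B1 touch CPUs 0-1.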
Signed-off-by: Sun Shaojie <sunshaojie@kylinos.cn>
---
 kernel/cgroup/cpuset.c                            |  8 ++++----
 tools/testing/selftests/cgroup/test_cpuset_prs.sh | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 52468d2c178a..e0d27c9a101a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -586,14 +586,14 @@ static inline bool cpusets_are_exclusive(struct cpuset *cs1, struct cpuset *cs2)
  * Returns: true if CPU exclusivity conflict exists, false otherwise
  *
  * Conflict detection rules:
- * 1. If either cpuset is CPU exclusive, they must be mutually exclusive
+ * 1. If both cpusets are exclusive, they must be mutually exclusive
  * 2. exclusive_cpus masks cannot intersect between cpusets
  * 3. The allowed CPUs of one cpuset cannot be a subset of another's exclusive CPUs
  */
 static inline bool cpus_excl_conflict(struct cpuset *cs1, struct cpuset *cs2)
 {
-	/* If either cpuset is exclusive, check if they are mutually exclusive */
-	if (is_cpu_exclusive(cs1) || is_cpu_exclusive(cs2))
+	/* If both cpusets are exclusive, check if they are mutually exclusive */
+	if (is_cpu_exclusive(cs1) && is_cpu_exclusive(cs2))
 		return !cpusets_are_exclusive(cs1, cs2);
 
 	/* Exclusive_cpus cannot intersect */
@@ -695,7 +695,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 		goto out;
 
 	/*
-	 * If either I or some sibling (!= me) is exclusive, we can't
+	 * If both I and some sibling (!= me) are exclusive, we can't
 	 * overlap. exclusive_cpus cannot overlap with each other if set.
 	 */
 	ret = -EINVAL;
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index a17256d9f88a..903dddfe88d7 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -269,7 +269,7 @@ TEST_MATRIX=(
 	" C0-3:S+ C1-3:S+ C2-3 . X2-3 X3:P2 . . 0 A1:0-2|A2:3|A3:3 A1:P0|A2:P2 3"
 	" C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2 . 0 A1:0-1|A2:1|A3:2-3 A1:P0|A3:P2 2-3"
 	" C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2:C3 . 0 A1:0-1|A2:1|A3:2-3 A1:P0|A3:P2 2-3"
-	" C0-3:S+ C1-3:S+ C2-3 C2-3 . . . P2 0 A1:0-3|A2:1-3|A3:2-3|B1:2-3 A1:P0|A3:P0|B1:P-2"
+	" C0-3:S+ C1-3:S+ C2-3 C2-3 . . . P2 0 A1:0-1|A2:1|A3:1|B1:2-3 A1:P0|A3:P0|B1:P2 2-3"
 	" C0-3:S+ C1-3:S+ C2-3 C4-5 . . . P2 0 B1:4-5 B1:P2 4-5"
 	" C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3 X2-3:P2 P2 0 A3:2-3|B1:4 A3:P2|B1:P2 2-4"
 	" C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3 X2-3:P2:C1-3 P2 0 A3:2-3|B1:4 A3:P2|B1:P2 2-4"
@@ -318,7 +318,7 @@ TEST_MATRIX=(
 	# Invalid to valid local partition direct transition tests
 	" C1-3:S+:P2 X4:P2 . . . . . . 0 A1:1-3|XA1:1-3|A2:1-3:XA2: A1:P2|A2:P-2 1-3"
 	" C1-3:S+:P2 X4:P2 . . . X3:P2 . . 0 A1:1-2|XA1:1-3|A2:3:XA2:3 A1:P2|A2:P2 1-3"
-	" C0-3:P2 . . C4-6 C0-4 . . . 0 A1:0-4|B1:4-6 A1:P-2|B1:P0"
+	" C0-3:P2 . . C4-6 C0-4 . . . 0 A1:0-4|B1:5-6 A1:P2|B1:P0 0-4"
 	" C0-3:P2 . . C4-6 C0-4:C0-3 . . . 0 A1:0-3|B1:4-6 A1:P2|B1:P0 0-3"
 
 	# Local partition invalidation tests
@@ -388,10 +388,10 @@ TEST_MATRIX=(
 	" C0-1:S+ C1 . C2-3 . P2 . . 0 A1:0-1|A2:1 A1:P0|A2:P-2"
 	" C0-1:S+ C1:P2 . C2-3 P1 . . . 0 A1:0|A2:1 A1:P1|A2:P2 0-1|1"
 
-	# A non-exclusive cpuset.cpus change will invalidate partition and its siblings
-	" C0-1:P1 . . C2-3 C0-2 . . . 0 A1:0-2|B1:2-3 A1:P-1|B1:P0"
+	# A non-exclusive cpuset.cpus change will not invalidate partition and its siblings
+	" C0-1:P1 . . C2-3 C0-2 . . . 0 A1:0-2|B1:3 A1:P1|B1:P0"
 	" C0-1:P1 . . P1:C2-3 C0-2 . . . 0 A1:0-2|B1:2-3 A1:P-1|B1:P-1"
-	" C0-1 . . P1:C2-3 C0-2 . . . 0 A1:0-2|B1:2-3 A1:P0|B1:P-1"
+	" C0-1 . . P1:C2-3 C0-2 . . . 0 A1:0-1|B1:2-3 A1:P0|B1:P1"
 
 	# cpuset.cpus can overlap with sibling cpuset.cpus.exclusive but not subsumed by it
 	" C0-3 . . C4-5 X5 . . . 0 A1:0-3|B1:4-5"
On 2025/11/12 10:11, Sun Shaojie wrote:

Hello Shaojie,

> [...]
> This patch relaxes the overlap check to only consider conflicts between partitioned siblings, not between a partitioned cpuset and a regular non-exclusive one.
Does this rule have any negative impact on your products?
The CPUs specified by the user (including cpuset.cpus and cpuset.cpus.exclusive) can be treated as the dedicated exclusive CPUs for the partition. For the cases you provided, both siblings can be partitions. For example, in case 1, A1 can also be converted to a partition. If this rule is relaxed, I don’t see any check for exclusive conflicts when A1 becomes a partition.
Additionally, I think we should preserve the CPU affinity as the user intends as much as possible.
On 11/11/25 10:33 PM, Chen Ridong wrote:
> On 2025/11/12 10:11, Sun Shaojie wrote:
> [...]

Where was the original patch sent? I didn't see it.
Anyway it is late for me. I will take a further look tomorrow.
Cheers, Longman
Hi Ridong,
Thank you for your response.
From your reply "in case 1, A1 can also be converted to a partition," I realize there might be a misunderstanding. The scenario I'm addressing involves two sibling cgroups where one is an effective partition root and the other is not, and both have empty cpuset.cpus.exclusive. Let me explain the intention behind case 1 in detail, which will also illustrate why this has negative impacts on our product.
In case 1, after #3 completes, A1 is already a valid partition root - this is correct. After #4, B1 is created, and B1 is non-exclusive. After #5, A1 changes from "root" to "root invalid". But A1 becoming "root invalid" could be unnecessary, because having A1 remain as "root" might be more acceptable. Here's the analysis:
As documented in cgroup-v2.rst regarding cpuset.cpus: "The actual list of CPUs to be granted, however, is subjected to constraints imposed by its parent and can differ from the requested CPUs". This means that although we're requesting CPUs 0-3 for B1, we can accept that the actual available CPUs in B1 might not be 0-3.
Based on this characteristic, in our product's implementation for case 1, before writing to B1's cpuset.cpus in #5, we check B1's parent cpuset.cpus.effective and know that the CPUs available for B1 don't include 0-1 (since 0-1 are exclusively used by A1). However, we still want to set B1's cpuset.cpus to 0-3 because we hope that when 0-1 become available in the future, B1 can use them without affecting the normal operation of other cgroups.
The reality is that because B1's requested cpuset.cpus (0-3) conflicts with A1's exclusive CPUs (0-1) at that moment, it destroys the validity of A1's partition root. So why must the current rule sacrifice A1's validity to accommodate B1's CPU request? In this situation, B1 can clearly use 2-3 while A1 exclusively uses 0-1 - they don't need to conflict.
This patch narrows the exclusivity conflict check scope to only between partitions. Moreover, user-specified CPUs (including cpuset.cpus and cpuset.cpus.exclusive) only have true exclusive meaning within effective partitions. So why should the current rule perform exclusivity conflict checks between an exclusive partition and a non-exclusive member? This is clearly unnecessary.
Thanks Sun Shaojie
On 2025/11/12 17:46, Sun Shaojie wrote:
> Hi Ridong,
>
> Thank you for your response.
>
> From your reply "in case 1, A1 can also be converted to a partition," I realize there might be a misunderstanding. The scenario I'm addressing involves two sibling cgroups where one is an effective partition root and the other is not, and both have empty cpuset.cpus.exclusive. Let me explain the intention behind case 1 in detail, which will also illustrate why this has negative impacts on our product.

I think I understand what you mean.

> In case 1, after #3 completes, A1 is already a valid partition root - this is correct. After #4, B1 was generated, and B1 is no-exclusive. After #5, A1 changes from "root" to "root invalid". But A1 becoming "root invalid" could be unnecessary because having A1 remain as "root" might be more acceptable. Here's the analysis:

What I want to note is this: what if we run

    echo root > /sys/fs/cgroup/B1/cpuset.cpus.partition

after step #5? There’s no conflict check when enabling the partition.
On 11/12/25 4:46 AM, Sun Shaojie wrote:
> [...]
> This patch narrows the exclusivity conflict check scope to only between partitions. Moreover, user-specified CPUs (including cpuset.cpus and cpuset.cpus.exclusive) only have true exclusive meaning within effective partitions. So why should the current rule perform exclusivity conflict checks between an exclusive partition and a non-exclusive member? This is clearly unnecessary.
As I have said in the other thread, v2 exclusive cpuset checking follows the v1 rule. However, the behavior of setting cpuset.cpus differs between v1 and v2. In v1, setting cpuset.cpus can fail if there is some conflict. In v2, users are allowed to set whatever value they want without failure, but the effective CPUs granted will be subject to constraints and may differ from cpuset.cpus. So in that sense, I think it makes sense to relax the exclusive cpuset check for v2, but we still need to keep the current v1 behavior. Please update your patch to do that.
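A minimal sketch of what such gating could look like, written as a toy shell model rather than kernel code. Everything in it is an assumption made for illustration: the function name rule1_conflict, the is_v2 argument, and the simplification that "mutually exclusive" means the two bitmasks do not intersect. Only rule 1 of the conflict check is modeled; the exclusive_cpus rules are left out.

```sh
#!/bin/sh
# Toy model of rule 1 only. Names and arguments are hypothetical.
# Usage: rule1_conflict <excl1> <cpus1> <excl2> <cpus2> <is_v2>
#   excl1/excl2: 1 if that cpuset is CPU-exclusive, else 0
#   cpus1/cpus2: CPU bitmasks (e.g. 3 = CPUs 0-1, 15 = CPUs 0-3)
#   is_v2:       1 for the relaxed v2 rule, 0 for the v1 rule
# Prints 1 if rule 1 reports a conflict, 0 otherwise.
rule1_conflict() {
    excl1=$1
    cpus1=$2
    excl2=$3
    cpus2=$4
    is_v2=$5

    if [ "$is_v2" -eq 1 ]; then
        check=$(( excl1 & excl2 ))   # relaxed: both must be exclusive
    else
        check=$(( excl1 | excl2 ))   # v1: either being exclusive is enough
    fi

    if [ "$check" -eq 1 ] && [ $(( cpus1 & cpus2 )) -ne 0 ]; then
        echo 1
    else
        echo 0
    fi
}

# A1 is exclusive on CPUs 0-1 (mask 3); B1 is non-exclusive on 0-3 (mask 15).
rule1_conflict 1 3 0 15 0    # v1 rule
rule1_conflict 1 3 0 15 1    # relaxed v2 rule
```

With the v1 rule the overlap is reported as a conflict (1); with the relaxed rule it is not (0). Two exclusive siblings that overlap are still reported as a conflict under both rules.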
Cheers, Longman
On 2025/11/13 2:05, Waiman Long wrote:
> [...]
> As I have said in the other thread, v2 exclusive cpuset checking follows the v1 rule. However, the behavior of setting cpuset.cpus differs between v1 and v2. In v1, setting cpuset.cpus can fail if there is some conflict. In v2, users are allowed to set whatever value they want without failure, but the effective CPUs granted will be subject to constraints and may differ from cpuset.cpus. So in that sense, I think it makes sense to relax the exclusive cpuset check for v2, but we still need to keep the current v1 behavior. Please update your patch to do that.
Hi, Longman.
It did not fail to set cpuset.cpus, but invalidated the sibling cpuset partition.

If we relax this rule, we should consider:

> What I want to note is this: what if we run echo root > /sys/fs/cgroup/B1/cpuset.cpus.partition after step #5? There’s no conflict check when enabling the partition.
On 2025/11/13 09:21, Chen Ridong wrote:
> [...]
> If we relax this rule, we should consider:
>
>> What I want to note is this: what if we run echo root > /sys/fs/cgroup/B1/cpuset.cpus.partition after step #5? There’s no conflict check when enabling the partition.
>
> -- Best regards, Ridong
Hi, Ridong.
I understand your concern, and there is a conflict check when enabling partitions. Below, I will use two tables to show the partition states of A1 and B1 before and after applying this patch. (All steps in the tables are run under /sys/fs/cgroup.)
Table 1: Before applying the patch

                                            | A1's prstate | B1's prstate |
#1> mkdir -p A1                             | member       |              |
#2> echo "0-1" > A1/cpuset.cpus             | member       |              |
#3> echo "root" > A1/cpuset.cpus.partition  | root         |              |
#4> mkdir -p B1                             | root         | member       |
#5> echo "0-3" > B1/cpuset.cpus             | root invalid | member       |
#6> echo "root" > B1/cpuset.cpus.partition  | root invalid | root invalid |

Table 2: After applying the patch

                                            | A1's prstate | B1's prstate |
#1> mkdir -p A1                             | member       |              |
#2> echo "0-1" > A1/cpuset.cpus             | member       |              |
#3> echo "root" > A1/cpuset.cpus.partition  | root         |              |
#4> mkdir -p B1                             | root         | member       |
#5> echo "0-3" > B1/cpuset.cpus             | root         | member       |
#6> echo "root" > B1/cpuset.cpus.partition  | root         | root invalid |
As shown in Table 2, after step #6, B1's partition state becomes "root invalid". This confirms that conflict checks are performed when enabling partitions, and clearly, the check did not pass in this case. This is the expected result, since the CPUs (0-3) that B1 attempts to use exclusively conflict with those used by A1 (0-1).
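For anyone who wants to replay the steps in the tables, a rough helper along these lines could be used. It is only a sketch: trace_case1 and show_state are hypothetical names, and it assumes a cgroup v2 mount where the cpuset controller is already enabled for the children of the given root (via the parent's cgroup.subtree_control) and that it is run as root.

```sh
#!/bin/sh
# Hypothetical helpers for replaying steps #1-#6 from the tables.
# Nothing runs at load time; call trace_case1 with the cgroup root.

# Print the partition state of A1 and B1; "-" means the file is absent.
show_state() {   # <root> <step label>
    a1=$(cat "$1/A1/cpuset.cpus.partition" 2>/dev/null) || a1="-"
    b1=$(cat "$1/B1/cpuset.cpus.partition" 2>/dev/null) || b1="-"
    printf '%-3s A1=%s B1=%s\n' "$2" "$a1" "$b1"
}

# Run the six steps in order, printing the states after each one.
trace_case1() {  # <root>
    root=$1
    mkdir -p "$root/A1"
    show_state "$root" "#1"
    echo "0-1" > "$root/A1/cpuset.cpus"
    show_state "$root" "#2"
    echo "root" > "$root/A1/cpuset.cpus.partition"
    show_state "$root" "#3"
    mkdir -p "$root/B1"
    show_state "$root" "#4"
    echo "0-3" > "$root/B1/cpuset.cpus"
    show_state "$root" "#5"
    echo "root" > "$root/B1/cpuset.cpus.partition"
    show_state "$root" "#6"
}
```

Running trace_case1 /sys/fs/cgroup on kernels with and without the patch should allow the printed states to be compared against Table 1 and Table 2. The A1 and B1 directories can be removed with rmdir afterwards.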
The reviewer mentioned they couldn't see my original patch, so I'm re-quoting the key changes below for clarity:
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 52468d2c178a..e0d27c9a101a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -586,14 +586,14 @@ static inline bool cpusets_are_exclusive(struct cpuset *cs1, struct cpuset *cs2)
  * Returns: true if CPU exclusivity conflict exists, false otherwise
  *
  * Conflict detection rules:
- * 1. If either cpuset is CPU exclusive, they must be mutually exclusive
+ * 1. If both cpusets are exclusive, they must be mutually exclusive
  * 2. exclusive_cpus masks cannot intersect between cpusets
  * 3. The allowed CPUs of one cpuset cannot be a subset of another's exclusive CPUs
  */
 static inline bool cpus_excl_conflict(struct cpuset *cs1, struct cpuset *cs2)
 {
-	/* If either cpuset is exclusive, check if they are mutually exclusive */
-	if (is_cpu_exclusive(cs1) || is_cpu_exclusive(cs2))
+	/* If both cpusets are exclusive, check if they are mutually exclusive */
+	if (is_cpu_exclusive(cs1) && is_cpu_exclusive(cs2))
 		return !cpusets_are_exclusive(cs1, cs2);
 
 	/* Exclusive_cpus cannot intersect */
Here are the main changes, where the conflict check for step #6 in Table 2 is performed. And these changes have no effect on cgroup v1.
Thanks, Sun Shaojie
On 11/12/25 10:33 PM, Sun Shaojie wrote:
> [...]
> Here are the main changes, where the conflict check for step #6 in Table 2 is performed. And these changes have no effect on cgroup v1.
cpus_excl_conflict() is called by validate_change() which is used for both v1 and v2.
Cheers, Longman
On 2025/11/13 11:33, Sun Shaojie wrote:
> [...]
> I understand your concern, and there is a conflict check when enabling partitions.
> [...]
Thank you for your clarification.
I missed that the exclusive conflict will be checked via:

  update_prstate
    update_partition_exclusive_flag
      cpuset_update_flag
        validate_change
On 11/11/25 10:33 PM, Chen Ridong wrote:
> On 2025/11/12 10:11, Sun Shaojie wrote:
>> [...]
>> This patch relaxes the overlap check to only consider conflicts between partitioned siblings, not between a partitioned cpuset and a regular non-exclusive one.
The current cgroup v2 exclusive cpuset behavior follows the v1 behavior of cpuset.cpus.exclusive flag. Even if we want to relax the cgroup v2 behavior, we will still need to maintain the v1 behavior as we want to minimize any changes to cgroup v1. IOW, we have to gate this change specific to v2.
Cheers, Longman
On 2025/11/13 12:12, Waiman Long wrote:
> On 11/12/25 10:33 PM, Sun Shaojie wrote:
>> [...]
>> Here are the main changes, where the conflict check for step #6 in Table 2 is performed. And these changes have no effect on cgroup v1.
>
> cpus_excl_conflict() is called by validate_change() which is used for both v1 and v2.
Hi, Longman,
Thanks for pointing this out. I will make the necessary updates.
Thanks, Sun Shaojie
linux-kselftest-mirror@lists.linaro.org