The current cpuset code and the test_cpuset_prs.sh test do not fully account for the possibility of pre-isolated CPUs added by the "isolcpus" boot command line parameter. This patch series modifies them to do the right thing whether or not "isolcpus" is present.
The updated test_cpuset_prs.sh was run successfully with or without the "isolcpus" option.
Waiman Long (2):
  cgroup/cpuset: Account for boot time isolated CPUs
  selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs

 kernel/cgroup/cpuset.c                       | 23 +++++++---
 .../selftests/cgroup/test_cpuset_prs.sh      | 44 ++++++++++++++-----
 2 files changed, 51 insertions(+), 16 deletions(-)
With the "isolcpus" boot command line parameter, we are able to create isolated CPUs at boot time. These isolated CPUs aren't fully accounted for in the cpuset code. For instance, the root cgroup's "cpuset.cpus.isolated" control file does not include the boot time isolated CPUs. Fix that by looking for pre-isolated CPUs at init time.
The prstate_housekeeping_conflict() function checks the HK_TYPE_DOMAIN housekeeping cpumask to make sure that CPUs outside of it can only be used in an isolated partition. Given that we are going to make housekeeping cpumasks dynamic, the current check may not be right anymore. Save the boot time HK_TYPE_DOMAIN cpumask and check against it instead of the upcoming dynamic HK_TYPE_DOMAIN housekeeping cpumask.
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7db55eed63cf..8b40df89c3c1 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -224,6 +224,12 @@ static cpumask_var_t subpartitions_cpus;
  */
 static cpumask_var_t isolated_cpus;
 
+/*
+ * Housekeeping (HK_TYPE_DOMAIN) CPUs at boot
+ */
+static cpumask_var_t boot_hk_cpus;
+static bool have_boot_isolcpus;
+
 /* List of remote partition root children */
 static struct list_head remote_children;
 
@@ -1823,15 +1829,15 @@ static void remote_partition_check(struct cpuset *cs, struct cpumask *newmask,
  * @new_cpus: cpu mask
  * Return: true if there is conflict, false otherwise
  *
- * CPUs outside of housekeeping_cpumask(HK_TYPE_DOMAIN) can only be used in
- * an isolated partition.
+ * CPUs outside of boot_hk_cpus, if defined, can only be used in an
+ * isolated partition.
  */
 static bool prstate_housekeeping_conflict(int prstate, struct cpumask *new_cpus)
 {
-	const struct cpumask *hk_domain = housekeeping_cpumask(HK_TYPE_DOMAIN);
-	bool all_in_hk = cpumask_subset(new_cpus, hk_domain);
+	if (!have_boot_isolcpus)
+		return false;
 
-	if (!all_in_hk && (prstate != PRS_ISOLATED))
+	if ((prstate != PRS_ISOLATED) && !cpumask_subset(new_cpus, boot_hk_cpus))
 		return true;
 
 	return false;
@@ -4345,6 +4351,13 @@ int __init cpuset_init(void)
 
 	BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
 
+	have_boot_isolcpus = housekeeping_enabled(HK_TYPE_DOMAIN);
+	if (have_boot_isolcpus) {
+		BUG_ON(!alloc_cpumask_var(&boot_hk_cpus, GFP_KERNEL));
+		cpumask_copy(boot_hk_cpus, housekeeping_cpumask(HK_TYPE_DOMAIN));
+		cpumask_andnot(isolated_cpus, cpu_possible_mask, boot_hk_cpus);
+	}
+
 	return 0;
 }
Hi.
On Tue, Aug 20, 2024 at 03:55:35PM GMT, Waiman Long <longman@redhat.com> wrote:
The prstate_housekeeping_conflict() function checks the HK_TYPE_DOMAIN housekeeping cpumask to make sure that CPUs outside of it can only be used in an isolated partition. Given that we are going to make housekeeping cpumasks dynamic, the current check may not be right anymore. Save the boot time HK_TYPE_DOMAIN cpumask and check against it instead of the upcoming dynamic HK_TYPE_DOMAIN housekeeping cpumask.
Why is (will be) checking against the stored HK_TYPE_DOMAIN mask correct when this mask becomes dynamic?
Thanks, Michal
On 8/27/24 04:01, Michal Koutný wrote:
Hi.
On Tue, Aug 20, 2024 at 03:55:35PM GMT, Waiman Long <longman@redhat.com> wrote:
The prstate_housekeeping_conflict() function checks the HK_TYPE_DOMAIN housekeeping cpumask to make sure that CPUs outside of it can only be used in an isolated partition. Given that we are going to make housekeeping cpumasks dynamic, the current check may not be right anymore. Save the boot time HK_TYPE_DOMAIN cpumask and check against it instead of the upcoming dynamic HK_TYPE_DOMAIN housekeeping cpumask.
Why is (will be) checking against the stored HK_TYPE_DOMAIN mask correct when this mask becomes dynamic?
In terms of isolated CPUs, there are two categories - static and dynamic. Statically isolated CPUs are those designated as isolated at boot time by "isolcpus". Other isolated CPUs created by cpuset isolated partitions are considered dynamic in the sense that their state can change at run time. The degree of CPU isolation of dynamically isolated CPUs isn't as good as that of the statically isolated ones. So I want to handle them separately, which is what prstate_housekeeping_conflict() is intended to do.
As it is my intention to make the housekeeping cpumasks dynamic, I need to keep a copy of the statically isolated CPUs and check against that copy. There is no point in checking dynamically isolated CPUs, as that test may produce a false positive result.
In the future, when dynamic CPU isolation is almost as good as static isolation, we can get rid of this distinction and treat all of them as dynamic.
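A rough sketch of the user-visible consequence, assuming a kernel booted with a hypothetical "isolcpus=3", cgroup v2 mounted at /sys/fs/cgroup and the cpuset controller enabled; the cgroup name and the exact failure mode are illustrative assumptions, not taken from the patch:

  # Hypothetical setup; "part" is an example cgroup name.
  echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
  mkdir /sys/fs/cgroup/part
  echo 3 > /sys/fs/cgroup/part/cpuset.cpus
  # CPU 3 is statically isolated, so a non-isolated ("root") partition
  # request is expected to fail or be invalidated by
  # prstate_housekeeping_conflict():
  echo root > /sys/fs/cgroup/part/cpuset.cpus.partition
  # An isolated partition is the intended way to use a boot-isolated CPU:
  echo isolated > /sys/fs/cgroup/part/cpuset.cpus.partition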
Cheers, Longman
Since isolated CPUs can be reserved at boot time via the "isolcpus" boot command line option, these pre-isolated CPUs may interfere with testing done by test_cpuset_prs.sh.
With the previous commit that incorporates those boot time isolated CPUs into "cpuset.cpus.isolated", we can check for them before testing starts to make sure that there will be no interference. If the pre-isolated CPUs could cause incorrect test failures, the test is skipped instead.
As "cpuset.cpus.isolated" is now available in a non cgroup_debug kernel, we don't need to check for its existence anymore.
Signed-off-by: Waiman Long <longman@redhat.com>
---
 .../selftests/cgroup/test_cpuset_prs.sh | 44 ++++++++++++++-----
 1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 7295424502b9..03c1bdaed2c3 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -84,6 +84,20 @@ echo member > test/cpuset.cpus.partition
 echo "" > test/cpuset.cpus
 [[ $RESULT -eq 0 ]] && skip_test "Child cgroups are using cpuset!"
 
+#
+# If isolated CPUs have been reserved at boot time (as shown in
+# cpuset.cpus.isolated), these isolated CPUs should be outside of CPUs 0-7
+# that will be used by this script for testing purpose. If not, some of
+# the tests may fail incorrectly. These isolated CPUs will also be removed
+# before being compared with the expected results.
+#
+BOOT_ISOLCPUS=$(cat $CGROUP2/cpuset.cpus.isolated)
+if [[ -n "$BOOT_ISOLCPUS" ]]
+then
+	[[ $(echo $BOOT_ISOLCPUS | sed -e "s/[,-].*//") -le 7 ]] &&
+		skip_test "Pre-isolated CPUs ($BOOT_ISOLCPUS) overlap CPUs to be tested"
+	echo "Pre-isolated CPUs: $BOOT_ISOLCPUS"
+fi
 cleanup()
 {
 	online_cpus
@@ -642,7 +656,8 @@ check_cgroup_states()
 # Note that isolated CPUs from the sched/domains context include offline
 # CPUs as well as CPUs in non-isolated 1-CPU partition. Those CPUs may
 # not be included in the cpuset.cpus.isolated control file which contains
-# only CPUs in isolated partitions.
+# only CPUs in isolated partitions as well as those that are isolated at
+# boot time.
 #
 # $1 - expected isolated cpu list(s) <isolcpus1>{,<isolcpus2>}
 # <isolcpus1> - expected sched/domains value
@@ -669,18 +684,21 @@ check_isolcpus()
 	fi
 
 	#
-	# Check the debug isolated cpumask, if present
+	# Check cpuset.cpus.isolated cpumask
 	#
-	[[ -f $ISCPUS ]] && {
+	if [[ -z "$BOOT_ISOLCPUS" ]]
+	then
+		ISOLCPUS=$(cat $ISCPUS)
+	else
+		ISOLCPUS=$(cat $ISCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
+	fi
+	[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && {
+		# Take a 50ms pause and try again
+		pause 0.05
 		ISOLCPUS=$(cat $ISCPUS)
-		[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && {
-			# Take a 50ms pause and try again
-			pause 0.05
-			ISOLCPUS=$(cat $ISCPUS)
-		}
-		[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && return 1
-		ISOLCPUS=
 	}
+	[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && return 1
+	ISOLCPUS=
 
 	#
 	# Use the sched domain in debugfs to check isolated CPUs, if available
@@ -713,6 +731,9 @@ check_isolcpus()
 		fi
 	done
 	[[ "$ISOLCPUS" = *- ]] && ISOLCPUS=${ISOLCPUS}$LASTISOLCPU
+	[[ -n "$BOOT_ISOLCPUS" ]] &&
+		ISOLCPUS=$(echo $ISOLCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
+
 	[[ "$EXPECT_VAL" = "$ISOLCPUS" ]]
 }
@@ -730,7 +751,8 @@ test_fail()
 }
 
 #
-# Check to see if there are unexpected isolated CPUs left
+# Check to see if there are unexpected isolated CPUs left beyond the boot
+# time isolated ones.
 #
 null_isolcpus_check()
 {
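For reference, the filtering done in check_isolcpus() can be tried standalone with made-up values; it simply drops the boot-isolated CPUs (and a preceding comma) from the list read out of the kernel before comparing with the expected value:

  BOOT_ISOLCPUS="9-11"
  ISOLCPUS="2-3,9-11"      # e.g. content of cpuset.cpus.isolated during a test
  echo $ISOLCPUS | sed -e "s/,*$BOOT_ISOLCPUS//"    # -> 2-3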
On 8/20/24 15:55, Waiman Long wrote:
The current cpuset code and the test_cpuset_prs.sh test do not fully account for the possibility of pre-isolated CPUs added by the "isolcpus" boot command line parameter. This patch series modifies them to do the right thing whether or not "isolcpus" is present.
The updated test_cpuset_prs.sh was run successfully with or without the "isolcpus" option.
Waiman Long (2):
  cgroup/cpuset: Account for boot time isolated CPUs
  selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs

 kernel/cgroup/cpuset.c                       | 23 +++++++---
 .../selftests/cgroup/test_cpuset_prs.sh      | 44 ++++++++++++++-----
 2 files changed, 51 insertions(+), 16 deletions(-)
Ping! Any comment on these patches?
Thanks, Longman
On 8/26/24 15:05, Tejun Heo wrote:
Hello, Waiman.
On Mon, Aug 26, 2024 at 03:01:21PM -0400, Waiman Long wrote:
Ping! Any comment on these patches?
cgroup side looks fine to me. Ping isol people?
This patchset is specific to cpuset code and doesn't touch any isolation code at all. My other isolation-related patch series [1] does have a dependency on this one to work correctly.
[1] https://lore.kernel.org/lkml/20240821142312.236970-1-longman@redhat.com/
Cheers, Longman
On Mon, Aug 26, 2024 at 03:41:23PM -0400, Waiman Long wrote:
On 8/26/24 15:05, Tejun Heo wrote:
Hello, Waiman.
On Mon, Aug 26, 2024 at 03:01:21PM -0400, Waiman Long wrote:
Ping! Any comment on these patches?
cgroup side looks fine to me. Ping isol people?
This patchset is specific to cpuset code and doesn't touch any isolation code at all. My other isolation-related patch series [1] does have a dependency on this one to work correctly.
[1] https://lore.kernel.org/lkml/20240821142312.236970-1-longman@redhat.com/
Oh, I was confused by the two patchsets. I thought the second one was the replacement for this one. Will take another look.
Thanks.
On Tue, Aug 20, 2024 at 03:55:34PM -0400, Waiman Long wrote:
The current cpuset code and the test_cpuset_prs.sh test do not fully account for the possibility of pre-isolated CPUs added by the "isolcpus" boot command line parameter. This patch series modifies them to do the right thing whether or not "isolcpus" is present.
The updated test_cpuset_prs.sh was run successfully with or without the "isolcpus" option.
Waiman Long (2):
  cgroup/cpuset: Account for boot time isolated CPUs
  selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs
Applied 1-2 to cgroup/for-6.12.
Thanks.