Hello,
The cgroup v2 freezer controller is useful for freezing background applications so they don't contend with foreground tasks. However, freezing can disrupt any internal monitoring that an application performs, because the application has no way of knowing that it was frozen.
To illustrate, an application might implement a watchdog thread to monitor a high-priority task by periodically checking its state to ensure progress. The challenge is that the task only advances when the application is running, but watchdog timers are set relative to system time, not app time. If the app is frozen and misses the expected deadline, the watchdog, unaware of this pause, may kill a healthy process.
This series tracks the time that each cgroup spends "freezing" and exposes it via cgroup.freeze.stat.local. If others prefer, I can instead create cgroup.stat.local and allow the freeze time accounting to be accessed there instead.
This version includes several basic selftests. I would find feedback especially useful here! Along with testing basic functionality, I wanted to demonstrate the following relationships:

1. Freeze time will increase while a cgroup is freezing, regardless of
   whether it has reached the frozen state.
2. Each cgroup's freeze time is independent of those of the other
   cgroups in its hierarchy.
I was hoping to show (1.) with a test that freezes a cgroup and then checks its freeze time while cgroup.events still shows "frozen 0", but I am having trouble writing a case that can reliably cause this (even when letting a forkbomb grow for a while before attempting to freeze!). Ideally, I could populate a test cgroup with an unfreezable task. Is there an elegant way to create a process from a selftest that will become TASK_INTERRUPTIBLE?
The main challenge in establishing (2.) is that a meaningful comparison between two cgroups' freeze times requires obtaining both at around the same time. The test process may check one cgroup's freeze time, but then be preempted and delayed for a relatively "long" time before it can check the other's. I have tried using sleeps to widen the margin for what counts as a "long" time, but this possibility makes tests like test_cgfreezer_time_parent non-deterministic, so I am a bit squeamish about adding it here.
Any suggestions for better tests or anything else would be welcome.
Thank you! Tiffany
Signed-off-by: Tiffany Yang <ynaffit@google.com>
---
v3:
 * Use seqcount along with css_set_lock to guard freeze time accesses as
   suggested by Michal Koutný
 * Add selftests
v2: https://lore.kernel.org/lkml/20250714050008.2167786-2-ynaffit@google.com/
 * Track per-cgroup freezing time instead of per-task frozen time as
   suggested by Tejun Heo
v1: https://lore.kernel.org/lkml/20250603224304.3198729-3-ynaffit@google.com/
Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chen Ridong <chenridong@huawei.com>
Tiffany Yang (2):
  cgroup: cgroup.freeze.stat.local time accounting
  cgroup: selftests: Add tests for freezer time
 Documentation/admin-guide/cgroup-v2.rst       |  20 +
 include/linux/cgroup-defs.h                   |  17 +
 kernel/cgroup/cgroup.c                        |  28 +
 kernel/cgroup/freezer.c                       |  10 +-
 tools/testing/selftests/cgroup/test_freezer.c | 686 ++++++++++++++++++
 5 files changed, 759 insertions(+), 2 deletions(-)
There isn't yet a clear way to identify a set of "lost" time that everyone (or at least a wider group of users) cares about. However, users can perform some delay accounting by iterating over components of interest. This patch allows cgroup v2 freezing time to be one of those components.
Track the cumulative time that each v2 cgroup spends freezing and expose it to userland via a new core interface file in cgroupfs.
To access this value:

  $ mkdir /sys/fs/cgroup/test
  $ cat /sys/fs/cgroup/test/cgroup.freeze.stat.local
  freeze_time_total 0
Ensure consistent freeze time reads with freeze_seq, a per-cgroup sequence counter. Writes are serialized using the css_set_lock.
Signed-off-by: Tiffany Yang <ynaffit@google.com>
---
 Documentation/admin-guide/cgroup-v2.rst | 20 ++++++++++++++++++
 include/linux/cgroup-defs.h             | 17 +++++++++++++++
 kernel/cgroup/cgroup.c                  | 28 +++++++++++++++++++++++++
 kernel/cgroup/freezer.c                 | 10 +++++++--
 4 files changed, 73 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index d9d3cc7df348..e5bc463f8e05 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1027,6 +1027,26 @@ All cgroup core files are prefixed with "cgroup."
 	it's possible to delete a frozen (and empty) cgroup, as well
 	as create new sub-cgroups.
 
+  cgroup.freeze.stat.local
+	A read-only flat-keyed file which exists in non-root cgroups.
+	The following entry is defined:
+
+	  freeze_time_total
+		Cumulative time that this cgroup has spent between freezing and
+		thawing, regardless of whether by self or ancestor groups.
+		NB: (not) reaching "frozen" state is not accounted here.
+
+	Using the following ASCII representation of a cgroup's freezer
+	state, ::
+
+	          1    _____
+	  frozen  0 __/     \__
+	             ab     cd
+
+	.. Originally contributed by Michal Koutný <mkoutny@suse.com>
+
+	the duration being measured is the span between a and c.
+
   cgroup.kill
 	A write-only single value file which exists in non-root cgroups.
 	The only allowed value is "1".
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 6b93a64115fe..a4f9600fc101 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -433,6 +433,23 @@ struct cgroup_freezer_state {
 	 * frozen, SIGSTOPped, and PTRACEd.
 	 */
 	int nr_frozen_tasks;
+
+	/* Freeze time data consistency protection */
+	seqcount_t freeze_seq;
+
+	/*
+	 * Most recent time the cgroup was requested to freeze.
+	 * Accesses guarded by freeze_seq counter. Writes serialized
+	 * by css_set_lock.
+	 */
+	u64 freeze_time_start_ns;
+
+	/*
+	 * Total duration the cgroup has spent freezing.
+	 * Accesses guarded by freeze_seq counter. Writes serialized
+	 * by css_set_lock.
+	 */
+	u64 freeze_time_total_ns;
 };
 struct cgroup {
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 312c6a8b55bb..25e008b40992 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4055,6 +4055,27 @@ static ssize_t cgroup_freeze_write(struct kernfs_open_file *of,
 	return nbytes;
 }
 
+static int cgroup_freeze_local_stat_show(struct seq_file *seq, void *v)
+{
+	struct cgroup *cgrp = seq_css(seq)->cgroup;
+	unsigned int sequence;
+	u64 freeze_time;
+
+	do {
+		sequence = read_seqcount_begin(&cgrp->freezer.freeze_seq);
+		freeze_time = cgrp->freezer.freeze_time_total_ns;
+		/* Add in current freezer interval if the task is now frozen */
+		if (test_bit(CGRP_FREEZE, &cgrp->flags))
+			freeze_time += (ktime_get_ns() -
+					cgrp->freezer.freeze_time_start_ns);
+	} while (read_seqcount_retry(&cgrp->freezer.freeze_seq, sequence));
+
+	seq_printf(seq, "freeze_time_total %llu\n",
+		   (unsigned long long) freeze_time / NSEC_PER_USEC);
+
+	return 0;
+}
+
 static void __cgroup_kill(struct cgroup *cgrp)
 {
 	struct css_task_iter it;
@@ -5360,6 +5381,11 @@ static struct cftype cgroup_base_files[] = {
 		.seq_show = cgroup_freeze_show,
 		.write = cgroup_freeze_write,
 	},
+	{
+		.name = "cgroup.freeze.stat.local",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = cgroup_freeze_local_stat_show,
+	},
 	{
 		.name = "cgroup.kill",
 		.flags = CFTYPE_NOT_ON_ROOT,
@@ -5763,6 +5789,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 	 * if the parent has to be frozen, the child has too.
 	 */
 	cgrp->freezer.e_freeze = parent->freezer.e_freeze;
+	seqcount_init(&cgrp->freezer.freeze_seq);
 	if (cgrp->freezer.e_freeze) {
 		/*
 		 * Set the CGRP_FREEZE flag, so when a process will be
@@ -5771,6 +5798,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 		 * consider it frozen immediately.
 		 */
 		set_bit(CGRP_FREEZE, &cgrp->flags);
+		cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
 		set_bit(CGRP_FROZEN, &cgrp->flags);
 	}
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index bf1690a167dd..bbffad570ff7 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -179,10 +179,16 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze)
 	lockdep_assert_held(&cgroup_mutex);
 
 	spin_lock_irq(&css_set_lock);
-	if (freeze)
+	write_seqcount_begin(&cgrp->freezer.freeze_seq);
+	if (freeze) {
 		set_bit(CGRP_FREEZE, &cgrp->flags);
-	else
+		cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
+	} else {
 		clear_bit(CGRP_FREEZE, &cgrp->flags);
+		cgrp->freezer.freeze_time_total_ns += (ktime_get_ns() -
+				cgrp->freezer.freeze_time_start_ns);
+	}
+	write_seqcount_end(&cgrp->freezer.freeze_seq);
 	spin_unlock_irq(&css_set_lock);
 
 	if (freeze)
Hello,
Generally looks good to me. Some comments on cosmetics / interface.
On Mon, Aug 04, 2025 at 08:29:41PM -0700, Tiffany Yang wrote: ...
> +  cgroup.freeze.stat.local
This was mentioned before and maybe I missed the following discussions but given that cgroup.freeze is a part of core cgroup, cgroup.stat.local is probably the right place. It's not great that cgroup.stat wouldn't be a superset of cgroup.stat.local but we can add the hierarchical counter later if necessary.
> +	A read-only flat-keyed file which exists in non-root cgroups.
> +	The following entry is defined:
> +
> +	  freeze_time_total
How about just frozen_usec? "_usec" is what we used in cpu.stat for time stats.
> +		Cumulative time that this cgroup has spent between freezing and
> +		thawing, regardless of whether by self or ancestor groups.
> +		NB: (not) reaching "frozen" state is not accounted here.
> +
> +	Using the following ASCII representation of a cgroup's freezer
> +	state, ::
It's a bit odd to include credit in a doc file. Maybe move it to the description or add Co-developed-by: tag?
Thanks.
Tejun Heo <tj@kernel.org> writes:
> Hello,
>
> Generally looks good to me. Some comments on cosmetics / interface.
>
> On Mon, Aug 04, 2025 at 08:29:41PM -0700, Tiffany Yang wrote: ...
> > +  cgroup.freeze.stat.local
>
> This was mentioned before and maybe I missed the following discussions
> but given that cgroup.freeze is a part of core cgroup, cgroup.stat.local
> is probably the right place. It's not great that cgroup.stat wouldn't be
> a superset of cgroup.stat.local but we can add the hierarchical counter
> later if necessary.
Got it. I had ended up opting for the "freeze"-specific name because there was already a cgroup_local_stat_show that seemed to imply that cgroup.stat.local was reserved for controllers with a struct cgroup_subsys. I will update v4 with something similar for core!
> > +	A read-only flat-keyed file which exists in non-root cgroups.
> > +	The following entry is defined:
> > +
> > +	  freeze_time_total
>
> How about just frozen_usec? "_usec" is what we used in cpu.stat for
> time stats.
Ack.
> > +		Cumulative time that this cgroup has spent between freezing and
> > +		thawing, regardless of whether by self or ancestor groups.
> > +		NB: (not) reaching "frozen" state is not accounted here.
> > +
> > +	Using the following ASCII representation of a cgroup's freezer
> > +	state, ::
>
> It's a bit odd to include credit in a doc file. Maybe move it to the
> description or add Co-developed-by: tag?
Will do! Thanks for looking over this :).
Hello.
On Mon, Aug 04, 2025 at 08:29:41PM -0700, Tiffany Yang <ynaffit@google.com> wrote:
> +  cgroup.freeze.stat.local
> +	A read-only flat-keyed file which exists in non-root cgroups.
> +	The following entry is defined:
> +
> +	  freeze_time_total
> +		Cumulative time that this cgroup has spent between freezing and
> +		thawing, regardless of whether by self or ancestor groups.
> +		NB: (not) reaching "frozen" state is not accounted here.
> +
> +	Using the following ASCII representation of a cgroup's freezer
> +	state, ::
> +
> +	          1    _____
> +	  frozen  0 __/     \__
> +	             ab     cd
> +
> +	.. Originally contributed by Michal Koutný <mkoutny@suse.com>
> +
> +	the duration being measured is the span between a and c.
This is so little "artwork" that a mere mention in commit message is OK ;-)
> +static int cgroup_freeze_local_stat_show(struct seq_file *seq, void *v)
> +{
> +	struct cgroup *cgrp = seq_css(seq)->cgroup;
> +	unsigned int sequence;
> +	u64 freeze_time;
> +
> +	do {
> +		sequence = read_seqcount_begin(&cgrp->freezer.freeze_seq);
> +		freeze_time = cgrp->freezer.freeze_time_total_ns;
> +		/* Add in current freezer interval if the task is now frozen */
Nit: cgrp is frozen, not a task here
> @@ -179,10 +179,16 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze)
>  	lockdep_assert_held(&cgroup_mutex);
>
>  	spin_lock_irq(&css_set_lock);
> -	if (freeze)
> +	write_seqcount_begin(&cgrp->freezer.freeze_seq);
> +	if (freeze) {
>  		set_bit(CGRP_FREEZE, &cgrp->flags);
> -	else
> +		cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
I wonder whether it wouldn't achieve more stable results if the reference timestamp was taken only once in cgroup_freeze(). Measuring the rate of cgroup traversal only adds noise in this case.
Thanks, Michal
Test cgroup v2 freezer time stat. Freezer time accounting should be independent of other cgroups in the hierarchy and should increase iff a cgroup is CGRP_FREEZE (regardless of whether it reaches CGRP_FROZEN).
Skip these tests on systems without freeze time accounting.
Signed-off-by: Tiffany Yang <ynaffit@google.com>
---
 tools/testing/selftests/cgroup/test_freezer.c | 686 ++++++++++++++++++
 1 file changed, 686 insertions(+)
diff --git a/tools/testing/selftests/cgroup/test_freezer.c b/tools/testing/selftests/cgroup/test_freezer.c
index 8730645d363a..c0880ecfa814 100644
--- a/tools/testing/selftests/cgroup/test_freezer.c
+++ b/tools/testing/selftests/cgroup/test_freezer.c
@@ -804,6 +804,685 @@ static int test_cgfreezer_vfork(const char *root)
 	return ret;
 }
+/*
+ * Get the current freeze_time_total for the cgroup.
+ */
+static long cg_check_freezetime(const char *cgroup)
+{
+	return cg_read_key_long(cgroup, "cgroup.freeze.stat.local",
+				"freeze_time_total ");
+}
+
+/*
+ * Test that the freeze time will behave as expected for an empty cgroup.
+ */
+static int test_cgfreezer_time_empty(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cgroup = NULL;
+	long prev, curr;
+	int i;
+
+	cgroup = cg_name(root, "cg_time_test_empty");
+	if (!cgroup)
+		goto cleanup;
+
+	/*
+	 * 1) Create an empty cgroup and check that its freeze time
+	 * is 0.
+	 */
+	if (cg_create(cgroup))
+		goto cleanup;
+
+	curr = cg_check_freezetime(cgroup);
+	if (curr) {
+		if (curr < 0)
+			ret = KSFT_SKIP;
+		else
+			debug("Expect time (%ld) to be 0\n", curr);
+
+		goto cleanup;
+	}
+
+	/*
+	 * 2) Freeze the cgroup. Check that its freeze time is
+	 * larger than 0.
+	 */
+	if (cg_freeze_nowait(cgroup, true))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) > 0\n", curr);
+		goto cleanup;
+	}
+
+	/*
+	 * 3) Sleep for 100 us. Check that the freeze time is at
+	 * least 100 us larger than it was at 2).
+	 */
+	usleep(100);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if ((curr - prev) < 100) {
+		debug("Expect time (%ld) to be at least 100 us more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 4) Unfreeze the cgroup. Check that the freeze time is
+	 * larger than at 3).
+	 */
+	if (cg_freeze_nowait(cgroup, false))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 5) Check the freeze time again to ensure that it has not
+	 * changed.
+	 */
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr != prev) {
+		debug("Expect time (%ld) to be unchanged from previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (cgroup)
+		cg_destroy(cgroup);
+	free(cgroup);
+	return ret;
+}
+
+/*
+ * A simple test for cgroup freezer time accounting. This test follows
+ * the same flow as test_cgfreezer_time_empty, but with a single process
+ * in the cgroup.
+ */
+static int test_cgfreezer_time_simple(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cgroup = NULL;
+	long prev, curr;
+	int i;
+
+	cgroup = cg_name(root, "cg_time_test_simple");
+	if (!cgroup)
+		goto cleanup;
+
+	/*
+	 * 1) Create a cgroup and check that its freeze time is 0.
+	 */
+	if (cg_create(cgroup))
+		goto cleanup;
+
+	curr = cg_check_freezetime(cgroup);
+	if (curr) {
+		if (curr < 0)
+			ret = KSFT_SKIP;
+		else
+			debug("Expect time (%ld) to be 0\n", curr);
+
+		goto cleanup;
+	}
+
+	/*
+	 * 2) Populate the cgroup with one child and check that the
+	 * freeze time is still 0.
+	 */
+	cg_run_nowait(cgroup, child_fn, NULL);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr > prev) {
+		debug("Expect time (%ld) to be 0\n", curr);
+		goto cleanup;
+	}
+
+	/*
+	 * 3) Freeze the cgroup. Check that its freeze time is
+	 * larger than 0.
+	 */
+	if (cg_freeze_nowait(cgroup, true))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) > 0\n", curr);
+		goto cleanup;
+	}
+
+	/*
+	 * 4) Sleep for 100 us. Check that the freeze time is at
+	 * least 100 us larger than it was at 3).
+	 */
+	usleep(100);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if ((curr - prev) < 100) {
+		debug("Expect time (%ld) to be at least 100 us more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 5) Unfreeze the cgroup. Check that the freeze time is
+	 * larger than at 4).
+	 */
+	if (cg_freeze_nowait(cgroup, false))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 6) Sleep for 100 us. Check that the freeze time is the
+	 * same as at 5).
+	 */
+	usleep(100);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr != prev) {
+		debug("Expect time (%ld) to be unchanged from previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (cgroup)
+		cg_destroy(cgroup);
+	free(cgroup);
+	return ret;
+}
+
+/*
+ * Test that freezer time accounting works as expected, even while we're
+ * populating a cgroup with processes.
+ */
+static int test_cgfreezer_time_populate(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cgroup = NULL;
+	long prev, curr;
+	int i;
+
+	cgroup = cg_name(root, "cg_time_test_populate");
+	if (!cgroup)
+		goto cleanup;
+
+	if (cg_create(cgroup))
+		goto cleanup;
+
+	curr = cg_check_freezetime(cgroup);
+	if (curr) {
+		if (curr < 0)
+			ret = KSFT_SKIP;
+		else
+			debug("Expect time (%ld) to be 0\n", curr);
+
+		goto cleanup;
+	}
+
+	/*
+	 * 1) Populate the cgroup with 100 processes. Check that
+	 * the freeze time is 0.
+	 */
+	for (i = 0; i < 100; i++)
+		cg_run_nowait(cgroup, child_fn, NULL);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr != prev) {
+		debug("Expect time (%ld) to be 0\n", curr);
+		goto cleanup;
+	}
+
+	/*
+	 * 2) Wait for the group to become fully populated. Check
+	 * that the freeze time is 0.
+	 */
+	if (cg_wait_for_proc_count(cgroup, 100))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr != prev) {
+		debug("Expect time (%ld) to be 0\n", curr);
+		goto cleanup;
+	}
+
+	/*
+	 * 3) Freeze the cgroup and then populate it with 100 more
+	 * processes. Check that the freeze time continues to grow.
+	 */
+	if (cg_freeze_nowait(cgroup, true))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	for (i = 0; i < 100; i++)
+		cg_run_nowait(cgroup, child_fn, NULL);
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 4) Wait for the group to become fully populated. Check
+	 * that the freeze time is larger than at 3).
+	 */
+	if (cg_wait_for_proc_count(cgroup, 200))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 5) Unfreeze the cgroup. Check that the freeze time is
+	 * larger than at 4).
+	 */
+	if (cg_freeze_nowait(cgroup, false))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 6) Kill the processes. Check that the freeze time is the
+	 * same as it was at 5).
+	 */
+	if (cg_killall(cgroup))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr != prev) {
+		debug("Expect time (%ld) to be unchanged from previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	/*
+	 * 7) Freeze and unfreeze the cgroup. Check that the freeze
+	 * time is larger than it was at 6).
+	 */
+	if (cg_freeze_nowait(cgroup, true))
+		goto cleanup;
+	if (cg_freeze_nowait(cgroup, false))
+		goto cleanup;
+	prev = curr;
+	curr = cg_check_freezetime(cgroup);
+	if (curr <= prev) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr, prev);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (cgroup)
+		cg_destroy(cgroup);
+	free(cgroup);
+	return ret;
+}
+
+/*
+ * Test that frozen time for a cgroup continues to work as expected,
+ * even as processes are migrated. Frozen cgroup A's freeze time should
+ * continue to increase and running cgroup B's should stay 0.
+ */
+static int test_cgfreezer_time_migrate(const char *root)
+{
+	long prev_A, curr_A, curr_B;
+	char *cgroup[2] = {0};
+	int ret = KSFT_FAIL;
+	int pid, i;
+
+	cgroup[0] = cg_name(root, "cg_time_test_migrate_A");
+	if (!cgroup[0])
+		goto cleanup;
+
+	cgroup[1] = cg_name(root, "cg_time_test_migrate_B");
+	if (!cgroup[1])
+		goto cleanup;
+
+	if (cg_create(cgroup[0]))
+		goto cleanup;
+
+	if (cg_check_freezetime(cgroup[0]) < 0) {
+		ret = KSFT_SKIP;
+		goto cleanup;
+	}
+
+	if (cg_create(cgroup[1]))
+		goto cleanup;
+
+	pid = cg_run_nowait(cgroup[0], child_fn, NULL);
+	if (pid < 0)
+		goto cleanup;
+
+	if (cg_wait_for_proc_count(cgroup[0], 1))
+		goto cleanup;
+
+	curr_A = cg_check_freezetime(cgroup[0]);
+	if (curr_A) {
+		debug("Expect time (%ld) to be 0\n", curr_A);
+		goto cleanup;
+	}
+	curr_B = cg_check_freezetime(cgroup[1]);
+	if (curr_B) {
+		debug("Expect time (%ld) to be 0\n", curr_B);
+		goto cleanup;
+	}
+
+	/*
+	 * Freeze cgroup A.
+	 */
+	if (cg_freeze_wait(cgroup[0], true))
+		goto cleanup;
+	prev_A = curr_A;
+	curr_A = cg_check_freezetime(cgroup[0]);
+	if (curr_A <= prev_A) {
+		debug("Expect time (%ld) to be > 0\n", curr_A);
+		goto cleanup;
+	}
+
+	/*
+	 * Migrate from A (frozen) to B (running).
+	 */
+	if (cg_enter(cgroup[1], pid))
+		goto cleanup;
+
+	usleep(1000);
+	curr_B = cg_check_freezetime(cgroup[1]);
+	if (curr_B) {
+		debug("Expect time (%ld) to be 0\n", curr_B);
+		goto cleanup;
+	}
+
+	prev_A = curr_A;
+	curr_A = cg_check_freezetime(cgroup[0]);
+	if (curr_A <= prev_A) {
+		debug("Expect time (%ld) to be more than previous check (%ld)\n",
+		      curr_A, prev_A);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (cgroup[0])
+		cg_destroy(cgroup[0]);
+	free(cgroup[0]);
+	if (cgroup[1])
+		cg_destroy(cgroup[1]);
+	free(cgroup[1]);
+	return ret;
+}
+
+/*
+ * The test creates a cgroup and freezes it. Then it creates a child cgroup.
+ * After that it checks that the child cgroup has a non-zero freeze time
+ * that is less than the parent's. Next, it freezes the child, unfreezes
+ * the parent, and sleeps. Finally, it checks that the child's freeze
+ * time has grown larger than the parent's.
+ */
+static int test_cgfreezer_time_parent(const char *root)
+{
+	char *parent, *child = NULL;
+	int ret = KSFT_FAIL;
+	long ptime, ctime;
+
+	parent = cg_name(root, "cg_test_parent_A");
+	if (!parent)
+		goto cleanup;
+
+	child = cg_name(parent, "cg_test_parent_B");
+	if (!child)
+		goto cleanup;
+
+	if (cg_create(parent))
+		goto cleanup;
+
+	if (cg_check_freezetime(parent) < 0) {
+		ret = KSFT_SKIP;
+		goto cleanup;
+	}
+
+	if (cg_freeze_wait(parent, true))
+		goto cleanup;
+
+	usleep(1000);
+	if (cg_create(child))
+		goto cleanup;
+
+	if (cg_check_frozen(child, true))
+		goto cleanup;
+
+	/*
+	 * Since the parent was frozen the entire time the child cgroup
+	 * was being created, we expect the parent's freeze time to be
+	 * larger than the child's.
+	 *
+	 * Ideally, we would be able to check both times simultaneously,
+	 * but here we get the child's after we get the parent's.
+	 */
+	ptime = cg_check_freezetime(parent);
+	ctime = cg_check_freezetime(child);
+	if (ptime <= ctime) {
+		debug("Expect ptime (%ld) > ctime (%ld)\n", ptime, ctime);
+		goto cleanup;
+	}
+
+	if (cg_freeze_nowait(child, true))
+		goto cleanup;
+
+	if (cg_freeze_wait(parent, false))
+		goto cleanup;
+
+	if (cg_check_frozen(child, true))
+		goto cleanup;
+
+	usleep(100000);
+
+	ctime = cg_check_freezetime(child);
+	ptime = cg_check_freezetime(parent);
+
+	if (ctime <= ptime) {
+		debug("Expect ctime (%ld) > ptime (%ld)\n", ctime, ptime);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (child)
+		cg_destroy(child);
+	free(child);
+	if (parent)
+		cg_destroy(parent);
+	free(parent);
+	return ret;
+}
+
+/*
+ * The test creates a parent cgroup and a child cgroup. Then, it freezes
+ * the child and checks that the child's freeze time is greater than the
+ * parent's, which should be zero.
+ */
+static int test_cgfreezer_time_child(const char *root)
+{
+	char *parent, *child = NULL;
+	int ret = KSFT_FAIL;
+	long ptime, ctime;
+
+	parent = cg_name(root, "cg_test_child_A");
+	if (!parent)
+		goto cleanup;
+
+	child = cg_name(parent, "cg_test_child_B");
+	if (!child)
+		goto cleanup;
+
+	if (cg_create(parent))
+		goto cleanup;
+
+	if (cg_check_freezetime(parent) < 0) {
+		ret = KSFT_SKIP;
+		goto cleanup;
+	}
+
+	if (cg_create(child))
+		goto cleanup;
+
+	if (cg_freeze_wait(child, true))
+		goto cleanup;
+
+	ctime = cg_check_freezetime(child);
+	ptime = cg_check_freezetime(parent);
+	if (ptime != 0) {
+		debug("Expect ptime (%ld) to be 0\n", ptime);
+		goto cleanup;
+	}
+
+	if (ctime <= ptime) {
+		debug("Expect ctime (%ld) > ptime (%ld)\n", ctime, ptime);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (child)
+		cg_destroy(child);
+	free(child);
+	if (parent)
+		cg_destroy(parent);
+	free(parent);
+	return ret;
+}
+
+/*
+ * The test creates the following hierarchy:
+ *       A
+ *       |
+ *       B
+ *       |
+ *       C
+ *
+ * Then it freezes the cgroups in the order C, B, A.
+ * Then it unfreezes the cgroups in the order A, B, C.
+ * Then it checks that C's freeze time is larger than B's and
+ * that B's is larger than A's.
+ */
+static int test_cgfreezer_time_nested(const char *root)
+{
+	char *cgroup[3] = {0};
+	int ret = KSFT_FAIL;
+	long time[3] = {0};
+	int i;
+
+	cgroup[0] = cg_name(root, "cg_test_time_A");
+	if (!cgroup[0])
+		goto cleanup;
+
+	cgroup[1] = cg_name(cgroup[0], "B");
+	if (!cgroup[1])
+		goto cleanup;
+
+	cgroup[2] = cg_name(cgroup[1], "C");
+	if (!cgroup[2])
+		goto cleanup;
+
+	if (cg_create(cgroup[0]))
+		goto cleanup;
+
+	if (cg_check_freezetime(cgroup[0]) < 0) {
+		ret = KSFT_SKIP;
+		goto cleanup;
+	}
+
+	if (cg_create(cgroup[1]))
+		goto cleanup;
+
+	if (cg_create(cgroup[2]))
+		goto cleanup;
+
+	if (cg_freeze_nowait(cgroup[2], true))
+		goto cleanup;
+
+	if (cg_freeze_nowait(cgroup[1], true))
+		goto cleanup;
+
+	if (cg_freeze_nowait(cgroup[0], true))
+		goto cleanup;
+
+	usleep(1000);
+
+	if (cg_freeze_nowait(cgroup[0], false))
+		goto cleanup;
+
+	if (cg_freeze_nowait(cgroup[1], false))
+		goto cleanup;
+
+	if (cg_freeze_nowait(cgroup[2], false))
+		goto cleanup;
+
+	time[2] = cg_check_freezetime(cgroup[2]);
+	time[1] = cg_check_freezetime(cgroup[1]);
+	time[0] = cg_check_freezetime(cgroup[0]);
+
+	if (time[2] <= time[1]) {
+		debug("Expect C's time (%ld) > B's time (%ld)", time[2], time[1]);
+		goto cleanup;
+	}
+
+	if (time[1] <= time[0]) {
+		debug("Expect B's time (%ld) > A's time (%ld)", time[1], time[0]);
+		goto cleanup;
+	}
+
+	ret = KSFT_PASS;
+
+cleanup:
+	for (i = 2; i >= 0 && cgroup[i]; i--) {
+		cg_destroy(cgroup[i]);
+		free(cgroup[i]);
+	}
+
+	return ret;
+}
+
 #define T(x) { x, #x }
 struct cgfreezer_test {
 	int (*fn)(const char *root);
@@ -819,6 +1498,13 @@ struct cgfreezer_test {
 	T(test_cgfreezer_stopped),
 	T(test_cgfreezer_ptraced),
 	T(test_cgfreezer_vfork),
+	T(test_cgfreezer_time_empty),
+	T(test_cgfreezer_time_simple),
+	T(test_cgfreezer_time_populate),
+	T(test_cgfreezer_time_migrate),
+	T(test_cgfreezer_time_parent),
+	T(test_cgfreezer_time_child),
+	T(test_cgfreezer_time_nested),
 };
 #undef T
On Mon, Aug 04, 2025 at 08:29:42PM -0700, Tiffany Yang <ynaffit@google.com> wrote:
> +static int test_cgfreezer_time_empty(const char *root)
> +{
> +	int ret = KSFT_FAIL;
> +	char *cgroup = NULL;
> +	long prev, curr;
> +	int i;
> +
> +	cgroup = cg_name(root, "cg_time_test_empty");
> +	if (!cgroup)
> +		goto cleanup;
> +
> +	/*
> +	 * 1) Create an empty cgroup and check that its freeze time
> +	 * is 0.
> +	 */
> +	if (cg_create(cgroup))
> +		goto cleanup;
> +
> +	curr = cg_check_freezetime(cgroup);
> +	if (curr) {
> +		if (curr < 0)
> +			ret = KSFT_SKIP;
> +		else
> +			debug("Expect time (%ld) to be 0\n", curr);
> +
> +		goto cleanup;
> +	}
	if (curr < 0) {
		ret = KSFT_SKIP;
		goto cleanup;
	}
	if (curr > 0) {
		debug("Expect time (%ld) to be 0\n", curr);
		goto cleanup;
	}
I might prefer the version with less indentation and explicit guards. It's only a minor stylistic issue.
> +	/*
> +	 * 2) Freeze the cgroup. Check that its freeze time is
> +	 * larger than 0.
> +	 */
> +	if (cg_freeze_nowait(cgroup, true))
> +		goto cleanup;
> +	prev = curr;
> +	curr = cg_check_freezetime(cgroup);
> +	if (curr <= prev) {
Here and...
> +		debug("Expect time (%ld) > 0\n", curr);
> +		goto cleanup;
> +	}
> +	/*
> +	 * 3) Sleep for 100 us. Check that the freeze time is at
> +	 * least 100 us larger than it was at 2).
> +	 */
> +	usleep(100);
> +	prev = curr;
> +	curr = cg_check_freezetime(cgroup);
> +	if ((curr - prev) < 100) {
...here I'm slightly worried it may cause test flakiness on systems with too coarse clock granularity.
Is the first check meaningful at all? (I think it's only as strong as checking the return value of the preceding write(2) to cgroup.freeze.)
Would it compromise your use case if the latter check was at least 1000 μs (based on other usleeps in cgroup selftests)? (Ditto for other 100 μs checks.)
Or does anything guarantee the minimal precision in common selftest environments?
Thanks, Michal