Changelog:
v7:
* Added the mem_cgroup_iter_online() function to the API for the new
behavior (suggested by Andrew Morton) (patch 2)
* Fixed a missing list_lru_del -> list_lru_del_obj (patch 1)
v6:
* Rebase on top of latest mm-unstable.
* Fix/improve the in-code documentation of the new list_lru
manipulation functions (patch 1)
v5:
* Replace reference getting with an rcu_read_lock() section for
zswap lru modifications (suggested by Yosry)
* Add a new prep patch that allows mem_cgroup_iter() to return
only online cgroups.
* Add a callback that updates pool->next_shrink when the cgroup is
offlined (suggested by Yosry Ahmed, Johannes Weiner)
v4:
* Rename list_lru_add to list_lru_add_obj and __list_lru_add to
list_lru_add (patch 1) (suggested by Johannes Weiner and
Yosry Ahmed)
* Some cleanups on the memcg aware LRU patch (patch 2)
(suggested by Yosry Ahmed)
* Use event interface for the new per-cgroup writeback counters.
(patch 3) (suggested by Yosry Ahmed)
* Abstract zswap's lruvec states and handling into
zswap_lruvec_state (patch 5) (suggested by Yosry Ahmed)
v3:
* Add a patch to export per-cgroup zswap writeback counters
* Add a patch to update zswap's kselftest
* Separate the new list_lru functions into its own prep patch
* Do not start from the top of the hierarchy when encountering a memcg
that is not online for the global limit zswap writeback (patch 2)
(suggested by Yosry Ahmed)
* Do not remove the swap entry from list_lru in
__read_swapcache_async() (patch 2) (suggested by Yosry Ahmed)
* Removed a redundant zswap pool getting (patch 2)
(reported by Ryan Roberts)
* Use atomic for the nr_zswap_protected (instead of lruvec's lock)
(patch 5) (suggested by Yosry Ahmed)
* Remove the per-cgroup zswap shrinker knob (patch 5)
(suggested by Yosry Ahmed)
v2:
* Fix loongarch compiler errors
* Use pool stats instead of memcg stats when !CONFIG_MEMCG_KMEM
There are currently several issues with zswap writeback:
1. There is only a single global LRU for zswap, making it impossible to
perform workload-specific shrinking - a memcg under memory pressure
cannot determine which pages in the pool it owns, and often ends up
writing pages from other memcgs. This issue has been previously
observed in practice and mitigated by simply disabling
memcg-initiated shrinking:
https://lore.kernel.org/all/20230530232435.3097106-1-nphamcs@gmail.com/T/#u
But this solution leaves a lot to be desired, as we still do not
have an avenue for a memcg to free up its own memory locked up in
the zswap pool.
2. We only shrink the zswap pool when the user-defined limit is hit.
This means that if we set the limit too high, cold data that are
unlikely to be used again will reside in the pool, wasting precious
memory. It is hard to predict how much zswap space will be needed
ahead of time, as this depends on the workload (specifically, on
factors such as memory access patterns and compressibility of the
memory pages).
This patch series solves these issues by separating the global zswap
LRU into per-memcg and per-NUMA LRUs, and by performing workload-specific
(i.e. memcg- and NUMA-aware) zswap writeback under memory pressure. The
new shrinker does not have any parameter that must be tuned by the
user, and can be opted in or out on a per-memcg basis.
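
To illustrate the idea only, here is a minimal userspace sketch of
per-memcg, per-node LRUs. It is not the kernel's list_lru or zswap code:
the names lrus, lru_add and lru_shrink and the NR_MEMCGS/NR_NODES
constants are hypothetical, and "writeback" is modeled as simply freeing
the entry. The point it shows is that shrinking one (memcg, node) pair
never touches entries owned by another memcg.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_MEMCGS 2	/* hypothetical: cgroups in this toy model */
#define NR_NODES  2	/* hypothetical: NUMA nodes in this toy model */

struct entry {
	int id;
	struct entry *next;	/* next entry, toward the hot end */
};

struct lru {
	struct entry *head;	/* coldest entry */
	struct entry *tail;	/* hottest entry */
	long nr_items;
};

/* One LRU per (memcg, node) pair instead of a single global list. */
struct lru lrus[NR_MEMCGS][NR_NODES];

/* Append a new entry at the hot end of the selected LRU. */
int lru_add(int memcg, int nid, int id)
{
	struct lru *l;
	struct entry *e;

	if (memcg < 0 || memcg >= NR_MEMCGS || nid < 0 || nid >= NR_NODES)
		return -1;

	e = malloc(sizeof(*e));
	if (!e)
		return -1;

	l = &lrus[memcg][nid];
	e->id = id;
	e->next = NULL;
	if (l->tail)
		l->tail->next = e;
	else
		l->head = e;
	l->tail = e;
	l->nr_items++;
	return 0;
}

/*
 * Reclaim up to nr_to_scan of the coldest entries of one (memcg, node)
 * LRU. Returns the number of entries reclaimed. Other LRUs are untouched.
 */
long lru_shrink(int memcg, int nid, long nr_to_scan)
{
	struct lru *l;
	long freed = 0;

	if (memcg < 0 || memcg >= NR_MEMCGS || nid < 0 || nid >= NR_NODES)
		return 0;

	l = &lrus[memcg][nid];
	while (freed < nr_to_scan && l->head) {
		struct entry *e = l->head;

		l->head = e->next;
		if (!l->head)
			l->tail = NULL;
		free(e);	/* stand-in for writing the entry back */
		l->nr_items--;
		freed++;
	}
	return freed;
}
```

With a single global list, a shrink request from memcg 0 could reclaim
entries added by memcg 1. With the split above, lru_shrink(0, 0, n) can
only reclaim what memcg 0 added on node 0.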
As a proof of concept, we ran the following synthetic benchmark:
build the Linux kernel in a memory-limited cgroup, and allocate some
cold data in tmpfs to see if the shrinker can write them out and
improve the overall performance. Depending on the amount of cold data
generated, we observe a 14% to 35% reduction in kernel CPU time used
in the kernel builds.
Domenico Cerasuolo (3):
zswap: make shrinking memcg-aware
mm: memcg: add per-memcg zswap writeback stat
selftests: cgroup: update per-memcg zswap writeback selftest
Nhat Pham (3):
list_lru: allows explicit memcg and NUMA node selection
memcontrol: add a new function to traverse online-only memcg hierarchy
zswap: shrinks zswap pool based on memory pressure
Documentation/admin-guide/mm/zswap.rst | 7 +
drivers/android/binder_alloc.c | 7 +-
fs/dcache.c | 8 +-
fs/gfs2/quota.c | 6 +-
fs/inode.c | 4 +-
fs/nfs/nfs42xattr.c | 8 +-
fs/nfsd/filecache.c | 4 +-
fs/xfs/xfs_buf.c | 6 +-
fs/xfs/xfs_dquot.c | 2 +-
fs/xfs/xfs_qm.c | 2 +-
include/linux/list_lru.h | 54 ++-
include/linux/memcontrol.h | 18 +
include/linux/mmzone.h | 2 +
include/linux/vm_event_item.h | 1 +
include/linux/zswap.h | 27 +-
mm/list_lru.c | 48 ++-
mm/memcontrol.c | 32 +-
mm/mmzone.c | 1 +
mm/swap.h | 3 +-
mm/swap_state.c | 26 +-
mm/vmstat.c | 1 +
mm/workingset.c | 4 +-
mm/zswap.c | 426 +++++++++++++++++---
tools/testing/selftests/cgroup/test_zswap.c | 74 ++--
24 files changed, 641 insertions(+), 130 deletions(-)
base-commit: 5cdba94229e58a39ca389ad99763af29e6b0c5a5
--
2.34.1
DAMON provides almost all control to the user via its sysfs interface.
For that, the interface provides plenty of files and hierarchies. The
interface is simple enough to be controlled by shell commands including
'cat', 'echo', and redirection. However, due to the number of files and
the hierarchies, doing that repeatedly is quite tedious. As a result,
DAMON selftests contain only simple test cases rather than real
functionality tests. Having a wrapper script that can be reused to
implement more functionality tests could be helpful. Writing such a
wrapper as a shell script might be challenging, though, and the result
would be hard to maintain and extend for future DAMON interface extensions.
To this end, implement a Python-written DAMON sysfs interface wrapper
that could be easily managed and extended for future DAMON interface
extensions. Further implement one simple functionality test and a
corner case regression test for a previously found bug, using the
wrapper module. In fact, the bug was found by the test this patchset is
introducing.
Note that the Python wrapper does not support all features of the DAMON
interface, only those that are essential for the tests that this
patchset is introducing. The wrapper will be extended to support more
features, but only those essential for such future tests. The
wrapper will hence stay simple, small, and constrained. For
convenient and general use of DAMON, users should use DAMON
user-space tools such as damo[1].
[1] https://github.com/damonitor/damo
Patches Sequence
----------------
This patchset consists of five patches. The first three patches
implement the Python-written DAMON sysfs interface wrapper in small
steps: the basic data structure (first patch), the kdamond startup command
(second patch), and finally the DAMOS tried bytes command (third patch).
Two patches that add selftests using the wrapper follow. The
fourth patch implements a basic functionality test of DAMON for working
set estimation accuracy. Finally, the fifth patch implements a corner
case test for a previously found bug.
SeongJae Park (5):
selftests/damon: add a DAMON interface wrapper python module
selftests/damon/_damon: implement sysfs-based kdamonds start function
selftests/damon/_damon: implement sysfs updat_schemes_tried_bytes
command
selftests/damon: add a test for update_schemes_tried_regions sysfs
command
selftests/damon: add a test for update_schemes_tried_regions hang bug
tools/testing/selftests/damon/Makefile | 3 +
tools/testing/selftests/damon/_damon.py | 322 ++++++++++++++++++
tools/testing/selftests/damon/access_memory.c | 41 +++
...sysfs_update_schemes_tried_regions_hang.py | 33 ++
...te_schemes_tried_regions_wss_estimation.py | 48 +++
5 files changed, 447 insertions(+)
create mode 100644 tools/testing/selftests/damon/_damon.py
create mode 100644 tools/testing/selftests/damon/access_memory.c
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_hang.py
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_wss_estimation.py
base-commit: 1be383c41197b82cfd51b2edc7ee515c0b786496
--
2.34.1
As Guillaume pointed out, many selftests create namespaces with very common
names (like "client" or "server") or even (partially) run directly in init_net.
This makes these tests prone to failure if another namespace with the same
name already exists. It also makes it impossible to run several instances
of these tests in parallel.
This patch set intends to convert all the net selftests to run in unique
namespaces, so that we can update the selftest framework to run each test in
its own namespace, in parallel. After that update, we only need to wait for
the test that takes the longest time.
As the full patch set is too large, I have broken it into several parts. This
is the first part.
v1 -> v2:
- Split the large patch set into smaller parts for easier review (Paolo Abeni)
- Move busywait from forwarding/lib.sh to net/lib.sh directly (Petr Machata)
- Update setup_ns/cleanup_ns struct (Petr Machata)
- Remove default trap in lib.sh (Petr Machata)
Hangbin Liu (14):
selftests/net: add lib.sh
selftests/net: convert arp_ndisc_evict_nocarrier.sh to run it in
unique namespace
selftests/net: specify the interface when do arping
selftests/net: convert arp_ndisc_untracked_subnets.sh to run it in
unique namespace
selftests/net: convert cmsg tests to make them run in unique namespace
selftests/net: convert drop_monitor_tests.sh to run it in unique
namespace
selftests/net: convert traceroute.sh to run it in unique namespace
selftests/net: convert icmp_redirect.sh to run it in unique namespace
sleftests/net: convert icmp.sh to run it in unique namespace
selftests/net: convert ioam6.sh to run it in unique namespace
selftests/net: convert l2tp.sh to run it in unique namespace
selftests/net: convert ndisc_unsolicited_na_test.sh to run it in
unique namespace
selftests/net: convert sctp_vrf.sh to run it in unique namespace
selftests/net: convert unicast_extensions.sh to run it in unique
namespace
tools/testing/selftests/net/Makefile | 2 +-
.../net/arp_ndisc_evict_nocarrier.sh | 46 ++--
.../net/arp_ndisc_untracked_subnets.sh | 20 +-
tools/testing/selftests/net/cmsg_ipv6.sh | 10 +-
tools/testing/selftests/net/cmsg_so_mark.sh | 7 +-
tools/testing/selftests/net/cmsg_time.sh | 7 +-
.../selftests/net/drop_monitor_tests.sh | 21 +-
tools/testing/selftests/net/forwarding/lib.sh | 27 +-
tools/testing/selftests/net/icmp.sh | 10 +-
tools/testing/selftests/net/icmp_redirect.sh | 182 +++++++------
tools/testing/selftests/net/ioam6.sh | 247 +++++++++---------
tools/testing/selftests/net/l2tp.sh | 130 +++++----
tools/testing/selftests/net/lib.sh | 85 ++++++
.../net/ndisc_unsolicited_na_test.sh | 19 +-
tools/testing/selftests/net/sctp_vrf.sh | 12 +-
tools/testing/selftests/net/traceroute.sh | 82 +++---
.../selftests/net/unicast_extensions.sh | 99 ++++---
17 files changed, 500 insertions(+), 506 deletions(-)
create mode 100644 tools/testing/selftests/net/lib.sh
--
2.41.0
In kunit_debugfs_create_suite() give up and skip creating the debugfs
file if any of the alloc_string_stream() calls return an error or NULL.
Only put a value in the log pointer of kunit_suite and kunit_test if it
is a valid pointer to a log.
This prevents the potential invalid dereference reported by smatch:
lib/kunit/debugfs.c:115 kunit_debugfs_create_suite() error: 'suite->log'
dereferencing possible ERR_PTR()
lib/kunit/debugfs.c:119 kunit_debugfs_create_suite() error: 'test_case->log'
dereferencing possible ERR_PTR()
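
The allocate, check, then publish pattern described above can be
sketched in userspace as follows. This is only an illustration under
stated assumptions: ERR_PTR() and IS_ERR_OR_NULL() here are simplified
stand-ins for the kernel helpers of the same name, and struct stream,
struct suite, alloc_stream(), suite_init_log() and the fail_alloc knob
are hypothetical names, not the KUnit API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical types standing in for string_stream and kunit_suite. */
struct stream {
	int append_newlines;
};

struct suite {
	struct stream *log;	/* NULL means "no log" */
};

int fail_alloc;	/* demo knob: non-zero forces an allocation failure */

/* Returns a valid stream, or an error pointer on failure. */
struct stream *alloc_stream(void)
{
	struct stream *s;

	if (fail_alloc)
		return ERR_PTR(-ENOMEM);

	s = calloc(1, sizeof(*s));
	return s ? s : ERR_PTR(-ENOMEM);
}

/*
 * Allocate into a local variable first and store it in suite->log only
 * once it is known to be valid, so suite->log is never an error pointer.
 */
int suite_init_log(struct suite *suite)
{
	struct stream *stream = alloc_stream();

	if (IS_ERR_OR_NULL(stream))
		return -1;	/* suite->log keeps its previous value */

	stream->append_newlines = 1;
	suite->log = stream;
	return 0;
}
```

Because the error pointer never reaches suite->log, later code that
treats a non-NULL log pointer as usable stays correct.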
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Fixes: 05e2006ce493 ("kunit: Use string_stream for test log")
---
lib/kunit/debugfs.c | 30 +++++++++++++++++++++++++-----
1 file changed, 25 insertions(+), 5 deletions(-)
diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
index 270d185737e6..9d167adfa746 100644
--- a/lib/kunit/debugfs.c
+++ b/lib/kunit/debugfs.c
@@ -109,14 +109,28 @@ static const struct file_operations debugfs_results_fops = {
void kunit_debugfs_create_suite(struct kunit_suite *suite)
{
struct kunit_case *test_case;
+ struct string_stream *stream;
- /* Allocate logs before creating debugfs representation. */
- suite->log = alloc_string_stream(GFP_KERNEL);
- string_stream_set_append_newlines(suite->log, true);
+ /*
+ * Allocate logs before creating debugfs representation.
+ * The suite->log and test_case->log pointer are expected to be NULL
+ * if there isn't a log, so only set it if the log stream was created
+ * successfully.
+ */
+ stream = alloc_string_stream(GFP_KERNEL);
+ if (IS_ERR_OR_NULL(stream))
+ return;
+
+ string_stream_set_append_newlines(stream, true);
+ suite->log = stream;
kunit_suite_for_each_test_case(suite, test_case) {
- test_case->log = alloc_string_stream(GFP_KERNEL);
- string_stream_set_append_newlines(test_case->log, true);
+ stream = alloc_string_stream(GFP_KERNEL);
+ if (IS_ERR_OR_NULL(stream))
+ goto err;
+
+ string_stream_set_append_newlines(stream, true);
+ test_case->log = stream;
}
suite->debugfs = debugfs_create_dir(suite->name, debugfs_rootdir);
@@ -124,6 +138,12 @@ void kunit_debugfs_create_suite(struct kunit_suite *suite)
debugfs_create_file(KUNIT_DEBUGFS_RESULTS, S_IFREG | 0444,
suite->debugfs,
suite, &debugfs_results_fops);
+ return;
+
+err:
+ string_stream_destroy(suite->log);
+ kunit_suite_for_each_test_case(suite, test_case)
+ string_stream_destroy(test_case->log);
}
void kunit_debugfs_destroy_suite(struct kunit_suite *suite)
--
2.30.2