Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 967dffffcc16 - mm/mremap: Add comment explaining the untagging behaviour of mremap()
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking MACsec: sanity
✅ Networking socket: fuzz
✅ Networking sctp-auth: sockopts test
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ httpd: mod_ssl smoke sanity
✅ tuned: tune-processes-through-perf
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ LTP: openposix test suite
🚧 ✅ Networking vnic: ipvlan/basic
🚧 ✅ audit: audit testsuite test
🚧 ✅ iotop: sanity
🚧 ✅ storage: dm/common
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking MACsec: sanity
✅ Networking socket: fuzz
✅ Networking sctp-auth: sockopts test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ httpd: mod_ssl smoke sanity
✅ tuned: tune-processes-through-perf
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ Usex - version 1.9-29
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ❌ LTP: openposix test suite
🚧 ✅ Networking vnic: ipvlan/basic
🚧 ❌ audit: audit testsuite test
🚧 ✅ iotop: sanity
🚧 ✅ storage: dm/common
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
Probable cause: Problem connecting to Beaker
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ Storage blktests
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
Probable cause: Problem connecting to Beaker
⚡⚡⚡ Boot test
⚡⚡⚡ Podman system integration test - as root
⚡⚡⚡ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking MACsec: sanity
⚡⚡⚡ Networking sctp-auth: sockopts test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ httpd: mod_ssl smoke sanity
⚡⚡⚡ tuned: tune-processes-through-perf
⚡⚡⚡ Usex - version 1.9-29
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ LTP: openposix test suite
🚧 ⚡⚡⚡ Networking vnic: ipvlan/basic
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ iotop: sanity
🚧 ⚡⚡⚡ storage: dm/common
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
Probable cause: Problem connecting to Beaker
⚡⚡⚡ Boot test
⚡⚡⚡ Podman system integration test - as root
⚡⚡⚡ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking MACsec: sanity
⚡⚡⚡ Networking sctp-auth: sockopts test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ httpd: mod_ssl smoke sanity
⚡⚡⚡ tuned: tune-processes-through-perf
⚡⚡⚡ Usex - version 1.9-29
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ LTP: openposix test suite
🚧 ⚡⚡⚡ Networking vnic: ipvlan/basic
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ iotop: sanity
🚧 ⚡⚡⚡ storage: dm/common
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
Probable cause: Problem connecting to Beaker
⚡⚡⚡ Boot test
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ Storage SAN device stress - qedf driver
Host 2:
✅ Boot test
✅ Storage SAN device stress - mpt3sas_gen1
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 4:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking MACsec: sanity
✅ Networking socket: fuzz
✅ Networking sctp-auth: sockopts test
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ httpd: mod_ssl smoke sanity
✅ tuned: tune-processes-through-perf
✅ pciutils: sanity smoke test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ LTP: openposix test suite
🚧 ✅ Networking vnic: ipvlan/basic
🚧 ❌ audit: audit testsuite test
🚧 ✅ iotop: sanity
🚧 ✅ storage: dm/common
🚧 ✅ trace: ftrace/tracer
Host 5:
⏱ Boot test
⏱ Storage SAN device stress - megaraid_sas
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
From: Marcos Paulo de Souza <mpdesouza(a)suse.com>
[PROBLEM]
Whenever a chown is executed, all capabilities of the file being touched are
lost. When doing incremental send with a file with capabilities, there is a
situation where the capability can be lost in the receiving side. The
sequence of actions bellow shows the problem:
$ mount /dev/sda fs1
$ mount /dev/sdb fs2
$ touch fs1/foo.bar
$ setcap cap_sys_nice+ep fs1/foo.bar
$ btrfs subvol snap -r fs1 fs1/snap_init
$ btrfs send fs1/snap_init | btrfs receive fs2
$ chgrp adm fs1/foo.bar
$ setcap cap_sys_nice+ep fs1/foo.bar
$ btrfs subvol snap -r fs1 fs1/snap_complete
$ btrfs subvol snap -r fs1 fs1/snap_incremental
$ btrfs send fs1/snap_complete | btrfs receive fs2
$ btrfs send -p fs1/snap_init fs1/snap_incremental | btrfs receive fs2
At this point, only a chown was emitted by "btrfs send" since only the group
was changed. This makes the cap_sys_nice capability to be dropped from
fs2/snap_incremental/foo.bar
[FIX]
Only emit capabilities after chown is emitted. The current code
first checks for xattrs that are new/changed, emits them, and later emit
the chown. Now, __process_new_xattr skips capabilities, letting only
finish_inode_if_needed to emit them, if they exist, for the inode being
processed.
This behavior was being worked around in "btrfs receive"
side by caching the capability and only applying it after chown. Now,
xattrs are only emmited _after_ chown, making that hack not needed
anymore.
Link: https://github.com/kdave/btrfs-progs/issues/202
Suggested-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Changes from v1:
* Constify name var in send_capabilities function (suggested by Filipe)
* Changed btrfs_alloc_path -> alloc_path_for_send() in send_capabilities
(suggested by Filipe)
* Removed a redundant sentence in the commit message (suggested by Filipe)
* Added the Reviewed-By tag from Filipe
Changes from RFC:
* Explained about chown + drop capabilities problem in the commit message (suggested
by Filipe and David)
* Changed the commit message to show describe the fix (suggested by Filipe)
* Skip the xattr in __process_new_xattr if it's a capability, since it'll be
handled in finish_inode_if_needed now (suggested by Filipe).
* Created function send_capabilities to query if the inode has caps, and if
yes, emit them.
* Call send_capabilities in finish_inode_if_needed _after_ the needs_chown
check. (suggested by Filipe)
fs/btrfs/send.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 6b86841315be..2b378e32e7dc 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -23,6 +23,7 @@
#include "btrfs_inode.h"
#include "transaction.h"
#include "compression.h"
+#include "xattr.h"
/*
* Maximum number of references an extent can have in order for us to attempt to
@@ -4545,6 +4546,10 @@ static int __process_new_xattr(int num, struct btrfs_key *di_key,
struct fs_path *p;
struct posix_acl_xattr_header dummy_acl;
+ /* capabilities are emitted by finish_inode_if_needed */
+ if (!strncmp(name, XATTR_NAME_CAPS, name_len))
+ return 0;
+
p = fs_path_alloc();
if (!p)
return -ENOMEM;
@@ -5107,6 +5112,66 @@ static int send_extent_data(struct send_ctx *sctx,
return 0;
}
+/*
+ * Search for a capability xattr related to sctx->cur_ino. If the capability if
+ * found, call send_set_xattr function to emit it.
+ *
+ * Return %0 if there isn't a capability, or when the capability was emitted
+ * successfully, or < %0 if an error occurred.
+ */
+static int send_capabilities(struct send_ctx *sctx)
+{
+ struct fs_path *fspath = NULL;
+ struct btrfs_path *path;
+ struct btrfs_dir_item *di;
+ struct extent_buffer *leaf;
+ unsigned long data_ptr;
+ const char *name = XATTR_NAME_CAPS;
+ char *buf = NULL;
+ int buf_len;
+ int ret = 0;
+
+ path = alloc_path_for_send();
+ if (!path)
+ return -ENOMEM;
+
+ di = btrfs_lookup_xattr(NULL, sctx->send_root, path, sctx->cur_ino,
+ name, strlen(name), 0);
+ if (!di) {
+ /* there is no xattr for this inode */
+ goto out;
+ } else if (IS_ERR(di)) {
+ ret = PTR_ERR(di);
+ goto out;
+ }
+
+ leaf = path->nodes[0];
+ buf_len = btrfs_dir_data_len(leaf, di);
+
+ fspath = fs_path_alloc();
+ buf = kmalloc(buf_len, GFP_KERNEL);
+ if (!fspath || !buf) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, fspath);
+ if (ret < 0)
+ goto out;
+
+ data_ptr = (unsigned long)((char *)(di + 1) +
+ btrfs_dir_name_len(leaf, di));
+ read_extent_buffer(leaf, buf, data_ptr,
+ btrfs_dir_data_len(leaf, di));
+
+ ret = send_set_xattr(sctx, fspath, name, strlen(name), buf, buf_len);
+out:
+ kfree(buf);
+ fs_path_free(fspath);
+ btrfs_free_path(path);
+ return ret;
+}
+
static int clone_range(struct send_ctx *sctx,
struct clone_root *clone_root,
const u64 disk_byte,
@@ -6010,6 +6075,10 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
goto out;
}
+ ret = send_capabilities(sctx);
+ if (ret < 0)
+ goto out;
+
/*
* If other directory inodes depended on our current directory
* inode's move/rename, now do their move/rename operations.
--
2.25.1
Assume we have kmem configured and loaded:
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory$
140000000-1481fffff : namespace0.0
150000000-33fffffff : dax0.0
150000000-33fffffff : System RAM
Assume we try to unload kmem. This force-unloading will work, even if
memory cannot get removed from the system.
[root@localhost ~]# rmmod kmem
[ 86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined
...
[ 86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot
Now, we can reconfigure the namespace:
[root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax
[ 131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]dax
[ 131.410147] nd_pmem: probe of namespace0.0 failed with error -16namespace0.0 --mode=devdax
...
This fails as expected due to the busy memory resource, and the memory
cannot be used. However, the dax0.0 device is removed, and along its name.
The name of the memory resource now points at freed memory (name of the
device).
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory
140000000-1481fffff : namespace0.0
150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ...
150000000-33fffffff : System RAM
We have to make sure to duplicate the string. While at it, remove the
superfluous setting of the name and fixup a stale comment.
Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM")
Cc: stable(a)vger.kernel.org # v5.3
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
drivers/dax/kmem.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 3d0a7e702c94..1e678bdf5aed 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -22,6 +22,7 @@ int dev_dax_kmem_probe(struct device *dev)
resource_size_t kmem_size;
resource_size_t kmem_end;
struct resource *new_res;
+ const char *new_res_name;
int numa_node;
int rc;
@@ -48,11 +49,16 @@ int dev_dax_kmem_probe(struct device *dev)
kmem_size &= ~(memory_block_size_bytes() - 1);
kmem_end = kmem_start + kmem_size;
- /* Region is permanently reserved. Hot-remove not yet implemented. */
- new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+ new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+ if (!new_res_name)
+ return -ENOMEM;
+
+ /* Region is permanently reserved if hotremove fails. */
+ new_res = request_mem_region(kmem_start, kmem_size, new_res_name);
if (!new_res) {
dev_warn(dev, "could not reserve region [%pa-%pa]\n",
&kmem_start, &kmem_end);
+ kfree(new_res_name);
return -EBUSY;
}
@@ -63,12 +69,12 @@ int dev_dax_kmem_probe(struct device *dev)
* unknown to us that will break add_memory() below.
*/
new_res->flags = IORESOURCE_SYSTEM_RAM;
- new_res->name = dev_name(dev);
rc = add_memory(numa_node, new_res->start, resource_size(new_res));
if (rc) {
release_resource(new_res);
kfree(new_res);
+ kfree(new_res_name);
return rc;
}
dev_dax->dax_kmem_res = new_res;
@@ -83,6 +89,7 @@ static int dev_dax_kmem_remove(struct device *dev)
struct resource *res = dev_dax->dax_kmem_res;
resource_size_t kmem_start = res->start;
resource_size_t kmem_size = resource_size(res);
+ const char *res_name = res->name;
int rc;
/*
@@ -102,6 +109,7 @@ static int dev_dax_kmem_remove(struct device *dev)
/* Release and free dax resources */
release_resource(res);
kfree(res);
+ kfree(res_name);
dev_dax->dax_kmem_res = NULL;
return 0;
--
2.25.4
The patch titled
Subject: device-dax: don't leak kernel memory to user space after unloading kmem
has been added to the -mm tree. Its filename is
device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/device-dax-dont-leak-kernel-memory…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/device-dax-dont-leak-kernel-memory…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: device-dax: don't leak kernel memory to user space after unloading kmem
Assume we have kmem configured and loaded:
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory$
140000000-1481fffff : namespace0.0
150000000-33fffffff : dax0.0
150000000-33fffffff : System RAM
Assume we try to unload kmem. This force-unloading will work, even if
memory cannot get removed from the system.
[root@localhost ~]# rmmod kmem
[ 86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined
...
[ 86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot
Now, we can reconfigure the namespace:
[root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax
[ 131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]dax
[ 131.410147] nd_pmem: probe of namespace0.0 failed with error -16namespace0.0 --mode=devdax
...
This fails as expected due to the busy memory resource, and the memory
cannot be used. However, the dax0.0 device is removed, and along its name.
The name of the memory resource now points at freed memory (name of the
device).
[root@localhost ~]# cat /proc/iomem
...
140000000-33fffffff : Persistent Memory
140000000-1481fffff : namespace0.0
150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ...
150000000-33fffffff : System RAM
We have to make sure to duplicate the string. While at it, remove the
superfluous setting of the name and fixup a stale comment.
Link: http://lkml.kernel.org/r/20200508084217.9160-2-david@redhat.com
Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org> [5.3]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/dax/kmem.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/dax/kmem.c~device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem
+++ a/drivers/dax/kmem.c
@@ -22,6 +22,7 @@ int dev_dax_kmem_probe(struct device *de
resource_size_t kmem_size;
resource_size_t kmem_end;
struct resource *new_res;
+ const char *new_res_name;
int numa_node;
int rc;
@@ -48,11 +49,16 @@ int dev_dax_kmem_probe(struct device *de
kmem_size &= ~(memory_block_size_bytes() - 1);
kmem_end = kmem_start + kmem_size;
- /* Region is permanently reserved. Hot-remove not yet implemented. */
- new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+ new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+ if (!new_res_name)
+ return -ENOMEM;
+
+ /* Region is permanently reserved if hotremove fails. */
+ new_res = request_mem_region(kmem_start, kmem_size, new_res_name);
if (!new_res) {
dev_warn(dev, "could not reserve region [%pa-%pa]\n",
&kmem_start, &kmem_end);
+ kfree(new_res_name);
return -EBUSY;
}
@@ -63,12 +69,12 @@ int dev_dax_kmem_probe(struct device *de
* unknown to us that will break add_memory() below.
*/
new_res->flags = IORESOURCE_SYSTEM_RAM;
- new_res->name = dev_name(dev);
rc = add_memory(numa_node, new_res->start, resource_size(new_res));
if (rc) {
release_resource(new_res);
kfree(new_res);
+ kfree(new_res_name);
return rc;
}
dev_dax->dax_kmem_res = new_res;
@@ -83,6 +89,7 @@ static int dev_dax_kmem_remove(struct de
struct resource *res = dev_dax->dax_kmem_res;
resource_size_t kmem_start = res->start;
resource_size_t kmem_size = resource_size(res);
+ const char *res_name = res->name;
int rc;
/*
@@ -102,6 +109,7 @@ static int dev_dax_kmem_remove(struct de
/* Release and free dax resources */
release_resource(res);
kfree(res);
+ kfree(res_name);
dev_dax->dax_kmem_res = NULL;
return 0;
_
Patches currently in -mm which might be from david(a)redhat.com are
device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem.patch
drivers-base-memoryc-cache-memory-blocks-in-xarray-to-accelerate-lookup-fix.patch
powerpc-pseries-hotplug-memory-stop-checking-is_mem_section_removable.patch
mm-memory_hotplug-remove-is_mem_section_removable.patch
mm-memory_hotplug-set-node_start_pfn-of-hotadded-pgdat-to-0.patch
mm-memory_hotplug-handle-memblocks-only-with-config_arch_keep_memblock.patch
Ich bin Jeff Lindsay, ein älterer Bürger aus Kalifornien, USA. Ich habe einen Jackpot von 447,8 Millionen Dollar gewonnen, der größte Lotterie-Jackpot. Im Namen meiner Familie und aus gutem Willen spenden wir Ihnen und Ihrer Familie einen Betrag von (€ 2.000.000,00 EUR). Ich versuche, die öffentlichen Waisenhäuser zu erreichen. Tragen Sie zur Armutsbekämpfung bei und sorgen Sie für eine angemessene Gesundheitsversorgung für Einzelpersonen, insbesondere während dieser Welt. Pandemic Covid 19. Ich möchte auch, dass Sie einen Teil dieser Spende in die öffentliche Infrastruktur investieren, um Arbeitslosen in Ihrem Land Arbeitsplätze zu bieten. Ich habe dich gewählt, weil ich an dich glaube. Ich brauche Ihre uneingeschränkte Mitarbeit in Bezug auf diese Spende. Bitte kontaktieren Sie mich hier zurück unter meiner privaten E-Mail: povertysolutionsorg(a)gmail.com