Here is a series with some fixes and cleanups to resctrl selftests. In
v3, the rewritten CAT test is not included, as an issue was discovered
in one of its components that requires further work before it can be
merged into mainline.
v3:
- Don't include the rewritten CAT test in this series!
- Tweak wildcard style in Makefile
- Fix many changelog typos, remove some wrong claims, and generally
improve them.
- Add fix to PARENT_EXIT() to unmount resctrl FS
- Add unmounting resctrl FS before starting any tests
- Add fix for buf leak
- Add fix for perf fd closing
- Split mount/remount/umount patches differently
- Use size_t and %zu for span
- Keep MBM print as MB, only internally use span in bytes
- Drop start_buf global from fill_buf
v2 (was sent with CAT test rewrite which is no longer included in v3):
- Rebased on top of next to solve the conflicts
- Added 2 patches related to resctrl FS mount/umount (fix + cleanup)
- Consistently use "alloc" in cache_alloc_size()
- CAT test error handling tweaked
- Remove a spurious newline change from the CAT patch
- Small improvements to changelogs
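To illustrate the span items in the v3 list (size_t with %zu, bytes
internally, MB only when printing), here is a minimal sketch. It is not code
from the series: MB, span_from_mb(), span_to_mb(), and span_checks() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MB ((size_t)1024 * 1024)

/* Hypothetical helper: turn a size given in MB into a span in bytes. */
static size_t span_from_mb(size_t mb)
{
	assert(mb <= SIZE_MAX / MB);	/* refuse values that would overflow */
	return mb * MB;
}

/* Hypothetical helper: convert back to MB, only for user-visible output. */
static size_t span_to_mb(size_t span)
{
	return span / MB;
}

/* Usage: keep bytes internally, print MB with the size_t specifier %zu. */
static void span_checks(void)
{
	size_t span = span_from_mb(250);

	printf("Span (MB): %zu\n", span_to_mb(span));
	assert(span == (size_t)262144000);
}
```

Keeping one unit (bytes) internally and converting only at the print site
avoids mixing MB and byte values in the same variable.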
Ilpo Järvinen (19):
selftests/resctrl: Add resctrl.h into build deps
selftests/resctrl: Don't leak buffer in fill_cache()
selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
selftests/resctrl: Close perf value read fd on errors
selftests/resctrl: Unmount resctrl FS before starting the first test
selftests/resctrl: Move resctrl FS mount/umount to higher level
selftests/resctrl: Refactor remount_resctrl(bool mum_resctrlfs) to
mount_resctrl()
selftests/resctrl: Remove mum_resctrlfs from struct resctrl_val_param
selftests/resctrl: Convert span to size_t
selftests/resctrl: Express span internally in bytes
selftests/resctrl: Remove duplicated preparation for span arg
selftests/resctrl: Remove "malloc_and_init_memory" param from
run_fill_buf()
selftests/resctrl: Remove unnecessary startptr global from fill_buf
selftests/resctrl: Improve parameter consistency in fill_buf
selftests/resctrl: Don't pass test name to fill_buf
selftests/resctrl: Don't use variable argument list for ->setup()
selftests/resctrl: Move CAT/CMT test global vars to function they are
used in
selftests/resctrl: Pass the real number of tests to show_cache_info()
selftests/resctrl: Remove test type checks from cat_val()
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/cache.c | 64 +++++++-------
tools/testing/selftests/resctrl/cat_test.c | 28 ++----
tools/testing/selftests/resctrl/cmt_test.c | 29 ++-----
tools/testing/selftests/resctrl/fill_buf.c | 87 +++++++------------
tools/testing/selftests/resctrl/mba_test.c | 9 +-
tools/testing/selftests/resctrl/mbm_test.c | 17 ++--
tools/testing/selftests/resctrl/resctrl.h | 17 ++--
.../testing/selftests/resctrl/resctrl_tests.c | 82 +++++++++++------
tools/testing/selftests/resctrl/resctrl_val.c | 7 +-
tools/testing/selftests/resctrl/resctrlfs.c | 57 ++++++------
11 files changed, 169 insertions(+), 230 deletions(-)
--
2.30.2
Hi,
The test failed with the latest torvalds tree kernel 6.4-rc5-00305-g022ce8862dff
on AMD Ryzen 9 and Ubuntu 22.04 Jammy.
The config is a merge of Ubuntu generic config and selftest config files.
Debug output with `set -x` is [edited]:
root@host:selftests/drivers/net/bonding# ./bond-arp-interval-causes-panic.sh
Cannot find device "link1_1"
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/drivers/net/bonding# vi !$
vi ./bond-arp-interval-causes-panic.sh
root@host:selftests/drivers/net/bonding# ./bond-arp-interval-causes-panic.sh
+ test 0 -ne 0
+ trap finish EXIT
+ client_ip4=192.168.1.198
+ server_ip4=192.168.1.254
+ echo 180
+ ip link add dev link1_1 type veth peer name link1_2
+ ip netns add server
+ ip link set dev link1_2 netns server up name eth0
+ ip netns exec server ip addr add 192.168.1.254/24 dev eth0
+ ip netns add client
+ ip link set dev link1_1 netns client down name eth0
+ ip netns exec client ip link add dev bond0 down type bond mode 1 miimon 100 all_slaves_active 1
+ ip netns exec client ip link set dev eth0 down master bond0
+ ip netns exec client ip link set dev bond0 up
+ ip netns exec client ip addr add 192.168.1.198/24 dev bond0
+ ip netns exec client ping -c 5 192.168.1.254
+ finish
+ ip netns delete server
+ ip netns delete client
+ ip link del link1_1
Cannot find device "link1_1"
+ true
root@host:testing/selftests/drivers/net/bonding# uname -rms
Linux 6.4.0-rc5-kmlk-netdbg-iwlwifi-00305-g022ce8862dff x86_64
root@host:testing/selftests/drivers/net/bonding#
Some debugging:
I have added some "ip link show" commands in the finish() function:
finish()
{
ip link show
ip netns delete server || true
ip netns delete client || true
ip link show
ip link del link1_1 || true
}
Now the debug output is like this:
root@host:selftests/drivers/net/bonding# ./bond-arp-interval-causes-panic.sh
+ test 0 -ne 0
+ trap finish EXIT
+ client_ip4=192.168.1.198
+ server_ip4=192.168.1.254
+ echo 180
+ ip link add dev link1_1 type veth peer name link1_2
+ ip netns add server
+ ip link set dev link1_2 netns server up name eth0
+ ip netns exec server ip addr add 192.168.1.254/24 dev eth0
+ ip netns add client
+ ip link set dev link1_1 netns client down name eth0
+ ip netns exec client ip link add dev bond0 down type bond mode 1 miimon 100 all_slaves_active 1
+ ip netns exec client ip link set dev eth0 down master bond0
+ ip netns exec client ip link set dev bond0 up
+ ip netns exec client ip addr add 192.168.1.198/24 dev bond0
+ ip netns exec client ping -c 5 192.168.1.254
+ finish
+ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:fc:ca:49:e2:d4 brd ff:ff:ff:ff:ff:ff
3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre 0.0.0.0 brd 0.0.0.0
5: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
6: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: ip_vti0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
8: ip6_vti0@NONE: <NOARP> mtu 1332 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 325b:a7df:c8db::
9: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 76d3:be76:4187::
11: ip6gre0@NONE: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre6 :: brd :: permaddr 569b:65fd:b94b::
12: enp16s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 9c:6b:00:01:fb:80 brd ff:ff:ff:ff:ff:ff
+ ip netns delete server
+ ip netns delete client
+ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:fc:ca:49:e2:d4 brd ff:ff:ff:ff:ff:ff
3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre 0.0.0.0 brd 0.0.0.0
5: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
6: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: ip_vti0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
8: ip6_vti0@NONE: <NOARP> mtu 1332 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 325b:a7df:c8db::
9: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 76d3:be76:4187::
11: ip6gre0@NONE: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre6 :: brd :: permaddr 569b:65fd:b94b::
12: enp16s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 9c:6b:00:01:fb:80 brd ff:ff:ff:ff:ff:ff
+ ip link del link1_1
Cannot find device "link1_1"
+ true
root@host:selftests/drivers/net/bonding#
Adding more `ip link show` calls before and after the operations on link1_1
showed that once `ip link set dev link1_1 netns client down name eth0` has run,
link1_1 is no longer listed in the current namespace, so `ip link del link1_1`
doesn't succeed, as seen here:
+ ip netns exec server ip addr add 192.168.1.254/24 dev eth0
+ ip netns add client
+ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:fc:ca:49:e2:d4 brd ff:ff:ff:ff:ff:ff
3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre 0.0.0.0 brd 0.0.0.0
5: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
6: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: ip_vti0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
8: ip6_vti0@NONE: <NOARP> mtu 1332 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 325b:a7df:c8db::
9: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 76d3:be76:4187::
11: ip6gre0@NONE: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre6 :: brd :: permaddr 569b:65fd:b94b::
12: enp16s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 9c:6b:00:01:fb:80 brd ff:ff:ff:ff:ff:ff
64: link1_1@if63: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 32:d6:de:9f:5d:e2 brd ff:ff:ff:ff:ff:ff link-netns server
+ ip link set dev link1_1 netns client down name eth0
+ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:fc:ca:49:e2:d4 brd ff:ff:ff:ff:ff:ff
3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre 0.0.0.0 brd 0.0.0.0
5: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
6: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: ip_vti0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
8: ip6_vti0@NONE: <NOARP> mtu 1332 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 325b:a7df:c8db::
9: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd :: permaddr 76d3:be76:4187::
11: ip6gre0@NONE: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/gre6 :: brd :: permaddr 569b:65fd:b94b::
12: enp16s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 9c:6b:00:01:fb:80 brd ff:ff:ff:ff:ff:ff
+ ip netns exec client ip link add dev bond0 down type bond mode 1 miimon 100 all_slaves_active 1
Hope this helps.
I am not sure what the right thing to do with this test is, or whether this is
the expected behaviour of the kernel.
Best regards,
Mirsad
Hi,
For some time, on several platforms, I've been seeing the alsa: pcm-test
selftest end in TIMEOUT. I have tried increasing the timeout in
selftests/alsa/settings to timeout=300, and I don't think there is any point in
increasing it further, as something appears to be genuinely stuck.
The test runs up to the "default.time4.1.8" section, where it hangs for more
than 200 seconds, possibly indefinitely.
The output of the selftest is:
# # default.time3.1.8.0.PLAYBACK - 44.1kHz stereo large periods
# # default.time3.1.8.0.PLAYBACK hw_params.RW_INTERLEAVED.S16_LE.44100.2.22496.202464 sw_params.202464
# ok 61 default.time3.0.3.0.PLAYBACK
# # default.time4.0.3.0.PLAYBACK - 48kHz stereo small periods
# # default.time4.0.3.0.PLAYBACK hw_params.RW_INTERLEAVED.S16_LE.48000.2.512.4096 sw_params.4096
# ok 62 default.time4.0.3.0.PLAYBACK
# # default.time5.0.3.0.PLAYBACK - 48kHz stereo large periods
# # default.time5.0.3.0.PLAYBACK hw_params.RW_INTERLEAVED.S16_LE.48000.2.24000.192000 sw_params.192000
# ok 63 default.time5.0.3.0.PLAYBACK
# # default.time6.0.3.0.PLAYBACK - 48kHz 6 channel large periods
# # default.time6.0.3.0.PLAYBACK hw_params.RW_INTERLEAVED.S16_LE.48000.2.48000.576000 sw_params.576000
# ok 64 default.time6.0.3.0.PLAYBACK
# # default.time7.0.3.0.PLAYBACK - 96kHz stereo large periods
# # default.time7.0.3.0.PLAYBACK hw_params.RW_INTERLEAVED.S16_LE.96000.2.48000.192000 sw_params.192000
# not ok 65 default.time3.1.8.0.PLAYBACK
# # time mismatch: expected 4000ms got 17005
# # default.time4.1.8.#
not ok 2 selftests: alsa: pcm-test # TIMEOUT 300 seconds
The platform is a self-assembled AMD Ryzen 9 box with an ASRock mainboard. Config and lshw output attached.
CONTINUED:
To test further, I increased the timeout again, to 400 seconds.
Only then did the test get past the timeout, but with numerous errors, and this
is a Ryzen 9, so I guess it can only be worse on hardware like an i3 or i5.
Since many subtests failed, I am submitting the entire test log (compressed,
because the mailing list limits attachments to 100K).
Best regards,
Mirsad
--------------
diff -u /dev/null tools/testing/selftests/alsa/settings
--- /dev/null 2023-06-11 00:36:30.651447094 +0200
+++ tools/testing/selftests/alsa/settings 2023-06-11 00:37:32.067504069 +0200
@@ -0,0 +1 @@
+timeout=400
Willy, Thomas
This is a revision of the v1 syscall helpers [1], rebased on
20230606-nolibc-rv32+stkp7a of [2]. It doesn't conflict with the -ENOSYS
patchset [3], so it is OK to simply merge both of them.
This revision mainly applies your suggestions on v1; both the syscall
return and call helpers are simplified or cleaned up.
Changes from v1 -> v2:
* tools/nolibc: sys.h: add __syscall() and __sysret() helpers
* Use inline function instead of macro for the syscall return helper
(Suggestion from Thomas)
* Rename syscall return helper from __syscall_ret to __sysret
(align with __syscall and it is not that long now)
* Make __sysret() be always inline
(Suggestion from Willy)
* Simplify the whole __syscall() macro to one line of code
(Benefit from the fixed 'long' return type of syscalls)
* tools/nolibc: unistd.h: apply __sysret() helper
* Convert the whole _syscall() macro to one line of code
* tools/nolibc: sys.h: apply __sysret() helper
* Further convert both brk() and getpagesize() to one line of code
* tools/nolibc: sys.h: apply __syscall() helper
* Keep the same as v1, because the __syscall() usage has not changed
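
As a rough illustration of the two helpers described above (an always-inline
return helper plus a one-line call macro), here is a userspace sketch. It is
not the code from the patches: sysret_sketch(), syscall_sketch(), sys_demo(),
demo_call(), and sysret_checks() are hypothetical names, and the fake
sys_demo() only mimics the "negative errno on failure" convention.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a raw syscall wrapper: -errno on failure. */
static long sys_demo(long arg)
{
	return arg < 0 ? -EINVAL : arg;
}

/* Return helper sketch: map a negative raw result to errno and -1. */
static inline __attribute__((always_inline)) long sysret_sketch(long ret)
{
	if (ret < 0) {
		errno = (int)-ret;
		return -1;
	}
	return ret;
}

/* Call helper sketch: paste the name onto sys_ and wrap the result. */
#define syscall_sketch(name, ...) sysret_sketch(sys_##name(__VA_ARGS__))

/* Example libc-style wrapper reduced to a single line. */
static long demo_call(long arg)
{
	return syscall_sketch(demo, arg);
}

/* Usage: success passes the value through, failure sets errno. */
static void sysret_checks(void)
{
	errno = 0;
	assert(demo_call(7) == 7 && errno == 0);
	assert(demo_call(-1) == -1 && errno == EINVAL);
}
```

The one-line macro works here only because every raw wrapper returns 'long';
the names avoid the double-underscore prefix since that is reserved outside
the C library itself.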
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1685856497.git.falcon@tinylab.org/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[3]: https://lore.kernel.org/linux-riscv/cover.1685780412.git.falcon@tinylab.org/
Zhangjin Wu (4):
tools/nolibc: sys.h: add __syscall() and __sysret() helpers
tools/nolibc: unistd.h: apply __sysret() helper
tools/nolibc: sys.h: apply __sysret() helper
tools/nolibc: sys.h: apply __syscall() helper
tools/include/nolibc/sys.h | 366 ++++++----------------------------
tools/include/nolibc/unistd.h | 11 +-
2 files changed, 57 insertions(+), 320 deletions(-)
--
2.25.1
On Sun, Jun 04, 2023 at 10:41:05PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "sysctl_could_not_get_directory" on:
>
> commit: 1997935e918fa4c07b70be47ef8f37622df427bd ("[PATCH 6/8] test_sysclt: Test for registering a mount point")
> url: https://protect2.fireeye.com/v1/url?k=ee66a422-8f1d0eab-ee672f6d-74fe486000…
> base: https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git sysctl-next
> patch link: https://lore.kernel.org/all/20230602110638.789426-7-j.granados@samsung.com/
> patch subject: [PATCH 6/8] test_sysclt: Test for registering a mount point
>
> in testcase: boot
>
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
> If you fix the issue, kindly add following tag
> | Reported-by: kernel test robot <oliver.sang(a)intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202306042234.f2d7beff-oliver.sang@intel.com
>
>
> [ 15.271017][ T1] initcall io_uring_init+0x0/0x40 returned 0 after 87 usecs
> [ 15.272122][ T1] calling test_firmware_init+0x0/0x190 @ 1
> [ 15.274422][ T1] test_firmware: interface ready
> [ 15.275240][ T1] initcall test_firmware_init+0x0/0x190 returned 0 after 2200 usecs
> [ 15.276480][ T1] calling test_sysctl_init+0x0/0x630 @ 1
> [ 15.277687][ T1] sysctl could not get directory: /debug/test_sysctl/mnt/mnt_error -30
This is precisely what I'm trying to test. I'm trying to create a
directory on top of a permanently empty directory and expecting the
failure and checking to see that the mnt_error directory was not
created.
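
For readers decoding the quoted log line: the trailing -30 is a negated errno
value, and on Linux errno 30 is EROFS. A small userspace sketch to check this
follows; kernel_err_str() and print_mnt_error() are hypothetical helper names,
not part of the test module.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: describe a kernel-style negative error code. */
static const char *kernel_err_str(int neg_err)
{
	return strerror(-neg_err);
}

/* Print the message for the -30 seen in the quoted dmesg line. */
static void print_mnt_error(void)
{
	printf("%d: %s\n", -30, kernel_err_str(-30));
	assert(EROFS == 30);	/* holds on Linux; an assumption elsewhere */
}
```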
@mcgrof: Can we just ignore this 0-day report as a false positive?
Best
> [ 15.279055][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc2-00016-g1997935e918f #1
> [ 15.280027][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 15.280027][ T1] Call Trace:
> [ 15.280027][ T1] <TASK>
> [ 15.280027][ T1] dump_stack_lvl (kbuild/src/consumer/lib/dump_stack.c:107)
> [ 15.280027][ T1] __register_sysctl_table (kbuild/src/consumer/fs/proc/proc_sysctl.c:1379)
> [ 15.280027][ T1] test_sysctl_init (kbuild/src/consumer/lib/test_sysctl.c:220 kbuild/src/consumer/lib/test_sysctl.c:235)
> [ 15.280027][ T1] ? test_firmware_init (kbuild/src/consumer/lib/test_sysctl.c:224)
>
>
> To reproduce:
>
> # build kernel
> cd linux
> cp config-6.4.0-rc2-00016-g1997935e918f .config
> make HOSTCC=gcc-12 CC=gcc-12 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
> make HOSTCC=gcc-12 CC=gcc-12 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> cd <mod-install-dir>
> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>
>
> git clone https://protect2.fireeye.com/v1/url?k=739e8a44-12e520cd-739f010b-74fe486000…
> cd lkp-tests
> bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>
> # if come across any failure that blocks the test,
> # please remove ~/.lkp and /lkp dir to run from a clean state.
>
>
>
> --
> 0-DAY CI Kernel Test Service
> https://protect2.fireeye.com/v1/url?k=0a5d5e5e-6b26f4d7-0a5cd511-74fe486000…
>
>
> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/x86_64 6.4.0-rc2 Kernel Configuration
> #
> CONFIG_CC_VERSION_TEXT="gcc-12 (Debian 12.2.0-14) 12.2.0"
> CONFIG_CC_IS_GCC=y
> CONFIG_GCC_VERSION=120200
> CONFIG_CLANG_VERSION=0
> CONFIG_AS_IS_GNU=y
> CONFIG_AS_VERSION=24000
> CONFIG_LD_IS_BFD=y
> CONFIG_LD_VERSION=24000
> CONFIG_LLD_VERSION=0
> CONFIG_CC_CAN_LINK=y
> CONFIG_CC_CAN_LINK_STATIC=y
> CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
> CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
> CONFIG_TOOLS_SUPPORT_RELR=y
> CONFIG_CC_HAS_ASM_INLINE=y
> CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
> CONFIG_PAHOLE_VERSION=125
> CONFIG_CONSTRUCTORS=y
> CONFIG_IRQ_WORK=y
> CONFIG_BUILDTIME_TABLE_SORT=y
> CONFIG_THREAD_INFO_IN_TASK=y
>
> #
> # General setup
> #
> CONFIG_INIT_ENV_ARG_LIMIT=32
> # CONFIG_COMPILE_TEST is not set
> # CONFIG_WERROR is not set
> CONFIG_LOCALVERSION=""
> CONFIG_LOCALVERSION_AUTO=y
> CONFIG_BUILD_SALT=""
> CONFIG_HAVE_KERNEL_GZIP=y
> CONFIG_HAVE_KERNEL_BZIP2=y
> CONFIG_HAVE_KERNEL_LZMA=y
> CONFIG_HAVE_KERNEL_XZ=y
> CONFIG_HAVE_KERNEL_LZO=y
> CONFIG_HAVE_KERNEL_LZ4=y
> CONFIG_HAVE_KERNEL_ZSTD=y
> CONFIG_KERNEL_GZIP=y
> # CONFIG_KERNEL_BZIP2 is not set
> # CONFIG_KERNEL_LZMA is not set
> # CONFIG_KERNEL_XZ is not set
> # CONFIG_KERNEL_LZO is not set
> # CONFIG_KERNEL_LZ4 is not set
> # CONFIG_KERNEL_ZSTD is not set
> CONFIG_DEFAULT_INIT=""
> CONFIG_DEFAULT_HOSTNAME="(none)"
> CONFIG_SYSVIPC=y
> CONFIG_SYSVIPC_SYSCTL=y
> CONFIG_SYSVIPC_COMPAT=y
> CONFIG_POSIX_MQUEUE=y
> CONFIG_POSIX_MQUEUE_SYSCTL=y
> # CONFIG_WATCH_QUEUE is not set
> CONFIG_CROSS_MEMORY_ATTACH=y
> # CONFIG_USELIB is not set
> CONFIG_AUDIT=y
> CONFIG_HAVE_ARCH_AUDITSYSCALL=y
> CONFIG_AUDITSYSCALL=y
>
> #
> # IRQ subsystem
> #
> CONFIG_GENERIC_IRQ_PROBE=y
> CONFIG_GENERIC_IRQ_SHOW=y
> CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
> CONFIG_GENERIC_PENDING_IRQ=y
> CONFIG_GENERIC_IRQ_MIGRATION=y
> CONFIG_GENERIC_IRQ_INJECTION=y
> CONFIG_HARDIRQS_SW_RESEND=y
> CONFIG_IRQ_DOMAIN=y
> CONFIG_IRQ_SIM=y
> CONFIG_IRQ_DOMAIN_HIERARCHY=y
> CONFIG_GENERIC_MSI_IRQ=y
> CONFIG_IRQ_MSI_IOMMU=y
> CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
> CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
> CONFIG_IRQ_FORCED_THREADING=y
> CONFIG_SPARSE_IRQ=y
> # CONFIG_GENERIC_IRQ_DEBUGFS is not set
> # end of IRQ subsystem
>
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_ARCH_CLOCKSOURCE_INIT=y
> CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
> CONFIG_GENERIC_CMOS_UPDATE=y
> CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
> CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
> CONFIG_CONTEXT_TRACKING=y
> CONFIG_CONTEXT_TRACKING_IDLE=y
>
> #
> # Timers subsystem
> #
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> # CONFIG_NO_HZ_IDLE is not set
> CONFIG_NO_HZ_FULL=y
> CONFIG_CONTEXT_TRACKING_USER=y
> # CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
> CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=125
> # end of Timers subsystem
>
> CONFIG_BPF=y
> CONFIG_HAVE_EBPF_JIT=y
> CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
>
> #
> # BPF subsystem
> #
> CONFIG_BPF_SYSCALL=y
> CONFIG_BPF_JIT=y
> CONFIG_BPF_JIT_ALWAYS_ON=y
> CONFIG_BPF_JIT_DEFAULT_ON=y
> CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
> # CONFIG_BPF_PRELOAD is not set
> # CONFIG_BPF_LSM is not set
> # end of BPF subsystem
>
> CONFIG_PREEMPT_BUILD=y
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set
> CONFIG_PREEMPT_COUNT=y
> CONFIG_PREEMPTION=y
> CONFIG_PREEMPT_DYNAMIC=y
> # CONFIG_SCHED_CORE is not set
>
> #
> # CPU/Task time and stats accounting
> #
> CONFIG_VIRT_CPU_ACCOUNTING=y
> CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> CONFIG_IRQ_TIME_ACCOUNTING=y
> CONFIG_HAVE_SCHED_AVG_IRQ=y
> CONFIG_BSD_PROCESS_ACCT=y
> CONFIG_BSD_PROCESS_ACCT_V3=y
> CONFIG_TASKSTATS=y
> CONFIG_TASK_DELAY_ACCT=y
> CONFIG_TASK_XACCT=y
> CONFIG_TASK_IO_ACCOUNTING=y
> # CONFIG_PSI is not set
> # end of CPU/Task time and stats accounting
>
> CONFIG_CPU_ISOLATION=y
>
> #
> # RCU Subsystem
> #
> CONFIG_TREE_RCU=y
> CONFIG_PREEMPT_RCU=y
> # CONFIG_RCU_EXPERT is not set
> CONFIG_TREE_SRCU=y
> CONFIG_TASKS_RCU_GENERIC=y
> CONFIG_TASKS_RCU=y
> CONFIG_TASKS_RUDE_RCU=y
> CONFIG_TASKS_TRACE_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_NEED_SEGCBLIST=y
> CONFIG_RCU_NOCB_CPU=y
> # CONFIG_RCU_NOCB_CPU_DEFAULT_ALL is not set
> # CONFIG_RCU_LAZY is not set
> # end of RCU Subsystem
>
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> # CONFIG_IKHEADERS is not set
> CONFIG_LOG_BUF_SHIFT=20
> CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
> # CONFIG_PRINTK_INDEX is not set
> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
>
> #
> # Scheduler features
> #
> # CONFIG_UCLAMP_TASK is not set
> # end of Scheduler features
>
> CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
> CONFIG_CC_HAS_INT128=y
> CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
> CONFIG_GCC11_NO_ARRAY_BOUNDS=y
> CONFIG_CC_NO_ARRAY_BOUNDS=y
> CONFIG_ARCH_SUPPORTS_INT128=y
> CONFIG_NUMA_BALANCING=y
> CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
> CONFIG_CGROUPS=y
> CONFIG_PAGE_COUNTER=y
> # CONFIG_CGROUP_FAVOR_DYNMODS is not set
> CONFIG_MEMCG=y
> CONFIG_MEMCG_KMEM=y
> CONFIG_BLK_CGROUP=y
> CONFIG_CGROUP_WRITEBACK=y
> CONFIG_CGROUP_SCHED=y
> CONFIG_FAIR_GROUP_SCHED=y
> CONFIG_CFS_BANDWIDTH=y
> CONFIG_RT_GROUP_SCHED=y
> CONFIG_SCHED_MM_CID=y
> CONFIG_CGROUP_PIDS=y
> CONFIG_CGROUP_RDMA=y
> CONFIG_CGROUP_FREEZER=y
> CONFIG_CGROUP_HUGETLB=y
> CONFIG_CPUSETS=y
> CONFIG_PROC_PID_CPUSET=y
> CONFIG_CGROUP_DEVICE=y
> CONFIG_CGROUP_CPUACCT=y
> CONFIG_CGROUP_PERF=y
> CONFIG_CGROUP_BPF=y
> # CONFIG_CGROUP_MISC is not set
> # CONFIG_CGROUP_DEBUG is not set
> CONFIG_SOCK_CGROUP_DATA=y
> CONFIG_NAMESPACES=y
> CONFIG_UTS_NS=y
> CONFIG_TIME_NS=y
> CONFIG_IPC_NS=y
> CONFIG_USER_NS=y
> CONFIG_PID_NS=y
> CONFIG_NET_NS=y
> CONFIG_CHECKPOINT_RESTORE=y
> CONFIG_SCHED_AUTOGROUP=y
> CONFIG_RELAY=y
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_INITRAMFS_SOURCE=""
> CONFIG_RD_GZIP=y
> CONFIG_RD_BZIP2=y
> CONFIG_RD_LZMA=y
> CONFIG_RD_XZ=y
> CONFIG_RD_LZO=y
> CONFIG_RD_LZ4=y
> CONFIG_RD_ZSTD=y
> # CONFIG_BOOT_CONFIG is not set
> CONFIG_INITRAMFS_PRESERVE_MTIME=y
> CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
> # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
> CONFIG_LD_ORPHAN_WARN=y
> CONFIG_LD_ORPHAN_WARN_LEVEL="warn"
> CONFIG_SYSCTL=y
> CONFIG_HAVE_UID16=y
> CONFIG_SYSCTL_EXCEPTION_TRACE=y
> CONFIG_HAVE_PCSPKR_PLATFORM=y
> CONFIG_EXPERT=y
> CONFIG_UID16=y
> CONFIG_MULTIUSER=y
> CONFIG_SGETMASK_SYSCALL=y
> CONFIG_SYSFS_SYSCALL=y
> CONFIG_FHANDLE=y
> CONFIG_POSIX_TIMERS=y
> CONFIG_PRINTK=y
> CONFIG_BUG=y
> CONFIG_ELF_CORE=y
> CONFIG_PCSPKR_PLATFORM=y
> CONFIG_BASE_FULL=y
> CONFIG_FUTEX=y
> CONFIG_FUTEX_PI=y
> CONFIG_EPOLL=y
> CONFIG_SIGNALFD=y
> CONFIG_TIMERFD=y
> CONFIG_EVENTFD=y
> CONFIG_SHMEM=y
> CONFIG_AIO=y
> CONFIG_IO_URING=y
> CONFIG_ADVISE_SYSCALLS=y
> CONFIG_MEMBARRIER=y
> CONFIG_KALLSYMS=y
> # CONFIG_KALLSYMS_SELFTEST is not set
> CONFIG_KALLSYMS_ALL=y
> CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
> CONFIG_KALLSYMS_BASE_RELATIVE=y
> CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
> CONFIG_KCMP=y
> CONFIG_RSEQ=y
> # CONFIG_DEBUG_RSEQ is not set
> CONFIG_EMBEDDED=y
> CONFIG_HAVE_PERF_EVENTS=y
> CONFIG_GUEST_PERF_EVENTS=y
> # CONFIG_PC104 is not set
>
> #
> # Kernel Performance Events And Counters
> #
> CONFIG_PERF_EVENTS=y
> # CONFIG_DEBUG_PERF_USE_VMALLOC is not set
> # end of Kernel Performance Events And Counters
>
> CONFIG_SYSTEM_DATA_VERIFICATION=y
> CONFIG_PROFILING=y
> CONFIG_TRACEPOINTS=y
> # end of General setup
>
> CONFIG_64BIT=y
> CONFIG_X86_64=y
> CONFIG_X86=y
> CONFIG_INSTRUCTION_DECODER=y
> CONFIG_OUTPUT_FORMAT="elf64-x86-64"
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_MMU=y
> CONFIG_ARCH_MMAP_RND_BITS_MIN=28
> CONFIG_ARCH_MMAP_RND_BITS_MAX=32
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_GENERIC_CSUM=y
> CONFIG_GENERIC_BUG=y
> CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
> CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> CONFIG_GENERIC_CALIBRATE_DELAY=y
> CONFIG_ARCH_HAS_CPU_RELAX=y
> CONFIG_ARCH_HIBERNATION_POSSIBLE=y
> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> CONFIG_AUDIT_ARCH=y
> CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
> CONFIG_HAVE_INTEL_TXT=y
> CONFIG_X86_64_SMP=y
> CONFIG_ARCH_SUPPORTS_UPROBES=y
> CONFIG_FIX_EARLYCON_MEM=y
> CONFIG_DYNAMIC_PHYSICAL_MASK=y
> CONFIG_PGTABLE_LEVELS=5
> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
>
> #
> # Processor type and features
> #
> CONFIG_SMP=y
> CONFIG_X86_FEATURE_NAMES=y
> CONFIG_X86_X2APIC=y
> CONFIG_X86_MPPARSE=y
> # CONFIG_GOLDFISH is not set
> CONFIG_X86_CPU_RESCTRL=y
> CONFIG_X86_EXTENDED_PLATFORM=y
> # CONFIG_X86_NUMACHIP is not set
> # CONFIG_X86_VSMP is not set
> CONFIG_X86_UV=y
> # CONFIG_X86_GOLDFISH is not set
> # CONFIG_X86_INTEL_MID is not set
> CONFIG_X86_INTEL_LPSS=y
> # CONFIG_X86_AMD_PLATFORM_DEVICE is not set
> CONFIG_IOSF_MBI=y
> # CONFIG_IOSF_MBI_DEBUG is not set
> CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
> # CONFIG_SCHED_OMIT_FRAME_POINTER is not set
> CONFIG_HYPERVISOR_GUEST=y
> CONFIG_PARAVIRT=y
> # CONFIG_PARAVIRT_DEBUG is not set
> CONFIG_PARAVIRT_SPINLOCKS=y
> CONFIG_X86_HV_CALLBACK_VECTOR=y
> # CONFIG_XEN is not set
> CONFIG_KVM_GUEST=y
> CONFIG_ARCH_CPUIDLE_HALTPOLL=y
> # CONFIG_PVH is not set
> CONFIG_PARAVIRT_TIME_ACCOUNTING=y
> CONFIG_PARAVIRT_CLOCK=y
> # CONFIG_JAILHOUSE_GUEST is not set
> # CONFIG_ACRN_GUEST is not set
> CONFIG_INTEL_TDX_GUEST=y
> # CONFIG_MK8 is not set
> # CONFIG_MPSC is not set
> # CONFIG_MCORE2 is not set
> # CONFIG_MATOM is not set
> CONFIG_GENERIC_CPU=y
> CONFIG_X86_INTERNODE_CACHE_SHIFT=6
> CONFIG_X86_L1_CACHE_SHIFT=6
> CONFIG_X86_TSC=y
> CONFIG_X86_CMPXCHG64=y
> CONFIG_X86_CMOV=y
> CONFIG_X86_MINIMUM_CPU_FAMILY=64
> CONFIG_X86_DEBUGCTLMSR=y
> CONFIG_IA32_FEAT_CTL=y
> CONFIG_X86_VMX_FEATURE_NAMES=y
> CONFIG_PROCESSOR_SELECT=y
> CONFIG_CPU_SUP_INTEL=y
> # CONFIG_CPU_SUP_AMD is not set
> # CONFIG_CPU_SUP_HYGON is not set
> # CONFIG_CPU_SUP_CENTAUR is not set
> # CONFIG_CPU_SUP_ZHAOXIN is not set
> CONFIG_HPET_TIMER=y
> CONFIG_HPET_EMULATE_RTC=y
> CONFIG_DMI=y
> CONFIG_BOOT_VESA_SUPPORT=y
> CONFIG_MAXSMP=y
> CONFIG_NR_CPUS_RANGE_BEGIN=8192
> CONFIG_NR_CPUS_RANGE_END=8192
> CONFIG_NR_CPUS_DEFAULT=8192
> CONFIG_NR_CPUS=8192
> CONFIG_SCHED_CLUSTER=y
> CONFIG_SCHED_SMT=y
> CONFIG_SCHED_MC=y
> CONFIG_SCHED_MC_PRIO=y
> CONFIG_X86_LOCAL_APIC=y
> CONFIG_X86_IO_APIC=y
> CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
> CONFIG_X86_MCE=y
> CONFIG_X86_MCELOG_LEGACY=y
> CONFIG_X86_MCE_INTEL=y
> CONFIG_X86_MCE_THRESHOLD=y
> CONFIG_X86_MCE_INJECT=m
>
> #
> # Performance monitoring
> #
> CONFIG_PERF_EVENTS_INTEL_UNCORE=m
> CONFIG_PERF_EVENTS_INTEL_RAPL=m
> CONFIG_PERF_EVENTS_INTEL_CSTATE=m
> # end of Performance monitoring
>
> CONFIG_X86_16BIT=y
> CONFIG_X86_ESPFIX64=y
> CONFIG_X86_VSYSCALL_EMULATION=y
> CONFIG_X86_IOPL_IOPERM=y
> CONFIG_MICROCODE=y
> CONFIG_MICROCODE_INTEL=y
> CONFIG_MICROCODE_LATE_LOADING=y
> CONFIG_X86_MSR=y
> CONFIG_X86_CPUID=y
> CONFIG_X86_5LEVEL=y
> CONFIG_X86_DIRECT_GBPAGES=y
> # CONFIG_X86_CPA_STATISTICS is not set
> CONFIG_X86_MEM_ENCRYPT=y
> CONFIG_NUMA=y
> # CONFIG_AMD_NUMA is not set
> CONFIG_X86_64_ACPI_NUMA=y
> CONFIG_NUMA_EMU=y
> CONFIG_NODES_SHIFT=10
> CONFIG_ARCH_SPARSEMEM_ENABLE=y
> CONFIG_ARCH_SPARSEMEM_DEFAULT=y
> # CONFIG_ARCH_MEMORY_PROBE is not set
> CONFIG_ARCH_PROC_KCORE_TEXT=y
> CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
> CONFIG_X86_PMEM_LEGACY_DEVICE=y
> CONFIG_X86_PMEM_LEGACY=m
> CONFIG_X86_CHECK_BIOS_CORRUPTION=y
> # CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
> CONFIG_MTRR=y
> CONFIG_MTRR_SANITIZER=y
> CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
> CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
> CONFIG_X86_PAT=y
> CONFIG_ARCH_USES_PG_UNCACHED=y
> CONFIG_X86_UMIP=y
> CONFIG_CC_HAS_IBT=y
> CONFIG_X86_KERNEL_IBT=y
> CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
> CONFIG_X86_INTEL_TSX_MODE_OFF=y
> # CONFIG_X86_INTEL_TSX_MODE_ON is not set
> # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
> CONFIG_X86_SGX=y
> CONFIG_EFI=y
> CONFIG_EFI_STUB=y
> CONFIG_EFI_HANDOVER_PROTOCOL=y
> CONFIG_EFI_MIXED=y
> # CONFIG_EFI_FAKE_MEMMAP is not set
> CONFIG_EFI_RUNTIME_MAP=y
> # CONFIG_HZ_100 is not set
> # CONFIG_HZ_250 is not set
> # CONFIG_HZ_300 is not set
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
> CONFIG_SCHED_HRTICK=y
> CONFIG_KEXEC=y
> CONFIG_KEXEC_FILE=y
> CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
> # CONFIG_KEXEC_SIG is not set
> CONFIG_CRASH_DUMP=y
> CONFIG_KEXEC_JUMP=y
> CONFIG_PHYSICAL_START=0x1000000
> CONFIG_RELOCATABLE=y
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x200000
> CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
> # CONFIG_ADDRESS_MASKING is not set
> CONFIG_HOTPLUG_CPU=y
> CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
> # CONFIG_DEBUG_HOTPLUG_CPU0 is not set
> # CONFIG_COMPAT_VDSO is not set
> CONFIG_LEGACY_VSYSCALL_XONLY=y
> # CONFIG_LEGACY_VSYSCALL_NONE is not set
> # CONFIG_CMDLINE_BOOL is not set
> CONFIG_MODIFY_LDT_SYSCALL=y
> # CONFIG_STRICT_SIGALTSTACK_SIZE is not set
> CONFIG_HAVE_LIVEPATCH=y
> CONFIG_LIVEPATCH=y
> # end of Processor type and features
>
> CONFIG_CC_HAS_SLS=y
> CONFIG_CC_HAS_RETURN_THUNK=y
> CONFIG_CC_HAS_ENTRY_PADDING=y
> CONFIG_FUNCTION_PADDING_CFI=11
> CONFIG_FUNCTION_PADDING_BYTES=16
> CONFIG_SPECULATION_MITIGATIONS=y
> CONFIG_PAGE_TABLE_ISOLATION=y
> # CONFIG_RETPOLINE is not set
> CONFIG_CPU_IBRS_ENTRY=y
> # CONFIG_SLS is not set
> CONFIG_ARCH_HAS_ADD_PAGES=y
> CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
>
> #
> # Power management and ACPI options
> #
> CONFIG_ARCH_HIBERNATION_HEADER=y
> CONFIG_SUSPEND=y
> CONFIG_SUSPEND_FREEZER=y
> # CONFIG_SUSPEND_SKIP_SYNC is not set
> CONFIG_HIBERNATE_CALLBACKS=y
> CONFIG_HIBERNATION=y
> CONFIG_HIBERNATION_SNAPSHOT_DEV=y
> CONFIG_PM_STD_PARTITION=""
> CONFIG_PM_SLEEP=y
> CONFIG_PM_SLEEP_SMP=y
> # CONFIG_PM_AUTOSLEEP is not set
> # CONFIG_PM_USERSPACE_AUTOSLEEP is not set
> # CONFIG_PM_WAKELOCKS is not set
> CONFIG_PM=y
> CONFIG_PM_DEBUG=y
> # CONFIG_PM_ADVANCED_DEBUG is not set
> # CONFIG_PM_TEST_SUSPEND is not set
> CONFIG_PM_SLEEP_DEBUG=y
> # CONFIG_DPM_WATCHDOG is not set
> # CONFIG_PM_TRACE_RTC is not set
> CONFIG_PM_CLK=y
> # CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
> # CONFIG_ENERGY_MODEL is not set
> CONFIG_ARCH_SUPPORTS_ACPI=y
> CONFIG_ACPI=y
> CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
> CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
> CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
> # CONFIG_ACPI_DEBUGGER is not set
> CONFIG_ACPI_SPCR_TABLE=y
> # CONFIG_ACPI_FPDT is not set
> CONFIG_ACPI_LPIT=y
> CONFIG_ACPI_SLEEP=y
> CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
> CONFIG_ACPI_EC_DEBUGFS=m
> CONFIG_ACPI_AC=y
> CONFIG_ACPI_BATTERY=y
> CONFIG_ACPI_BUTTON=y
> CONFIG_ACPI_VIDEO=m
> CONFIG_ACPI_FAN=y
> CONFIG_ACPI_TAD=m
> CONFIG_ACPI_DOCK=y
> CONFIG_ACPI_CPU_FREQ_PSS=y
> CONFIG_ACPI_PROCESSOR_CSTATE=y
> CONFIG_ACPI_PROCESSOR_IDLE=y
> CONFIG_ACPI_CPPC_LIB=y
> CONFIG_ACPI_PROCESSOR=y
> CONFIG_ACPI_IPMI=m
> CONFIG_ACPI_HOTPLUG_CPU=y
> CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
> CONFIG_ACPI_THERMAL=y
> CONFIG_ACPI_PLATFORM_PROFILE=m
> CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
> CONFIG_ACPI_TABLE_UPGRADE=y
> # CONFIG_ACPI_DEBUG is not set
> CONFIG_ACPI_PCI_SLOT=y
> CONFIG_ACPI_CONTAINER=y
> CONFIG_ACPI_HOTPLUG_MEMORY=y
> CONFIG_ACPI_HOTPLUG_IOAPIC=y
> CONFIG_ACPI_SBS=m
> CONFIG_ACPI_HED=y
> # CONFIG_ACPI_CUSTOM_METHOD is not set
> CONFIG_ACPI_BGRT=y
> # CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
> CONFIG_ACPI_NFIT=m
> # CONFIG_NFIT_SECURITY_DEBUG is not set
> CONFIG_ACPI_NUMA=y
> CONFIG_ACPI_HMAT=y
> CONFIG_HAVE_ACPI_APEI=y
> CONFIG_HAVE_ACPI_APEI_NMI=y
> CONFIG_ACPI_APEI=y
> CONFIG_ACPI_APEI_GHES=y
> CONFIG_ACPI_APEI_PCIEAER=y
> CONFIG_ACPI_APEI_MEMORY_FAILURE=y
> CONFIG_ACPI_APEI_EINJ=m
> # CONFIG_ACPI_APEI_ERST_DEBUG is not set
> # CONFIG_ACPI_DPTF is not set
> CONFIG_ACPI_WATCHDOG=y
> CONFIG_ACPI_EXTLOG=m
> CONFIG_ACPI_ADXL=y
> # CONFIG_ACPI_CONFIGFS is not set
> # CONFIG_ACPI_PFRUT is not set
> CONFIG_ACPI_PCC=y
> # CONFIG_ACPI_FFH is not set
> # CONFIG_PMIC_OPREGION is not set
> CONFIG_ACPI_PRMT=y
> CONFIG_X86_PM_TIMER=y
>
> #
> # CPU Frequency scaling
> #
> CONFIG_CPU_FREQ=y
> CONFIG_CPU_FREQ_GOV_ATTR_SET=y
> CONFIG_CPU_FREQ_GOV_COMMON=y
> CONFIG_CPU_FREQ_STAT=y
> CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
> CONFIG_CPU_FREQ_GOV_USERSPACE=y
> CONFIG_CPU_FREQ_GOV_ONDEMAND=y
> CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
> CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
>
> #
> # CPU frequency scaling drivers
> #
> CONFIG_X86_INTEL_PSTATE=y
> # CONFIG_X86_PCC_CPUFREQ is not set
> # CONFIG_X86_AMD_PSTATE is not set
> # CONFIG_X86_AMD_PSTATE_UT is not set
> CONFIG_X86_ACPI_CPUFREQ=m
> # CONFIG_X86_POWERNOW_K8 is not set
> # CONFIG_X86_SPEEDSTEP_CENTRINO is not set
> CONFIG_X86_P4_CLOCKMOD=m
>
> #
> # shared options
> #
> CONFIG_X86_SPEEDSTEP_LIB=m
> # end of CPU Frequency scaling
>
> #
> # CPU Idle
> #
> CONFIG_CPU_IDLE=y
> # CONFIG_CPU_IDLE_GOV_LADDER is not set
> CONFIG_CPU_IDLE_GOV_MENU=y
> # CONFIG_CPU_IDLE_GOV_TEO is not set
> CONFIG_CPU_IDLE_GOV_HALTPOLL=y
> CONFIG_HALTPOLL_CPUIDLE=y
> # end of CPU Idle
>
> CONFIG_INTEL_IDLE=y
> # end of Power management and ACPI options
>
> #
> # Bus options (PCI etc.)
> #
> CONFIG_PCI_DIRECT=y
> CONFIG_PCI_MMCONFIG=y
> CONFIG_MMCONF_FAM10H=y
> # CONFIG_PCI_CNB20LE_QUIRK is not set
> # CONFIG_ISA_BUS is not set
> CONFIG_ISA_DMA_API=y
> # end of Bus options (PCI etc.)
>
> #
> # Binary Emulations
> #
> CONFIG_IA32_EMULATION=y
> # CONFIG_X86_X32_ABI is not set
> CONFIG_COMPAT_32=y
> CONFIG_COMPAT=y
> CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
> # end of Binary Emulations
>
> CONFIG_HAVE_KVM=y
> CONFIG_HAVE_KVM_PFNCACHE=y
> CONFIG_HAVE_KVM_IRQCHIP=y
> CONFIG_HAVE_KVM_IRQFD=y
> CONFIG_HAVE_KVM_IRQ_ROUTING=y
> CONFIG_HAVE_KVM_DIRTY_RING=y
> CONFIG_HAVE_KVM_DIRTY_RING_TSO=y
> CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y
> CONFIG_HAVE_KVM_EVENTFD=y
> CONFIG_KVM_MMIO=y
> CONFIG_KVM_ASYNC_PF=y
> CONFIG_HAVE_KVM_MSI=y
> CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
> CONFIG_KVM_VFIO=y
> CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
> CONFIG_KVM_COMPAT=y
> CONFIG_HAVE_KVM_IRQ_BYPASS=y
> CONFIG_HAVE_KVM_NO_POLL=y
> CONFIG_KVM_XFER_TO_GUEST_WORK=y
> CONFIG_HAVE_KVM_PM_NOTIFIER=y
> CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y
> CONFIG_VIRTUALIZATION=y
> CONFIG_KVM=m
> # CONFIG_KVM_WERROR is not set
> CONFIG_KVM_INTEL=m
> # CONFIG_X86_SGX_KVM is not set
> # CONFIG_KVM_AMD is not set
> CONFIG_KVM_SMM=y
> # CONFIG_KVM_XEN is not set
> CONFIG_AS_AVX512=y
> CONFIG_AS_SHA1_NI=y
> CONFIG_AS_SHA256_NI=y
> CONFIG_AS_TPAUSE=y
> CONFIG_AS_GFNI=y
>
> #
> # General architecture-dependent options
> #
> CONFIG_CRASH_CORE=y
> CONFIG_KEXEC_CORE=y
> CONFIG_HAVE_IMA_KEXEC=y
> CONFIG_HOTPLUG_SMT=y
> CONFIG_GENERIC_ENTRY=y
> CONFIG_KPROBES=y
> CONFIG_JUMP_LABEL=y
> # CONFIG_STATIC_KEYS_SELFTEST is not set
> # CONFIG_STATIC_CALL_SELFTEST is not set
> CONFIG_OPTPROBES=y
> CONFIG_KPROBES_ON_FTRACE=y
> CONFIG_UPROBES=y
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
> CONFIG_ARCH_USE_BUILTIN_BSWAP=y
> CONFIG_KRETPROBES=y
> CONFIG_KRETPROBE_ON_RETHOOK=y
> CONFIG_USER_RETURN_NOTIFIER=y
> CONFIG_HAVE_IOREMAP_PROT=y
> CONFIG_HAVE_KPROBES=y
> CONFIG_HAVE_KRETPROBES=y
> CONFIG_HAVE_OPTPROBES=y
> CONFIG_HAVE_KPROBES_ON_FTRACE=y
> CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
> CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
> CONFIG_HAVE_NMI=y
> CONFIG_TRACE_IRQFLAGS_SUPPORT=y
> CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> CONFIG_HAVE_DMA_CONTIGUOUS=y
> CONFIG_GENERIC_SMP_IDLE_THREAD=y
> CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
> CONFIG_ARCH_HAS_SET_MEMORY=y
> CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
> CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
> CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
> CONFIG_ARCH_WANTS_NO_INSTR=y
> CONFIG_HAVE_ASM_MODVERSIONS=y
> CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
> CONFIG_HAVE_RSEQ=y
> CONFIG_HAVE_RUST=y
> CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
> CONFIG_HAVE_HW_BREAKPOINT=y
> CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
> CONFIG_HAVE_USER_RETURN_NOTIFIER=y
> CONFIG_HAVE_PERF_EVENTS_NMI=y
> CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
> CONFIG_HAVE_PERF_REGS=y
> CONFIG_HAVE_PERF_USER_STACK_DUMP=y
> CONFIG_HAVE_ARCH_JUMP_LABEL=y
> CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
> CONFIG_MMU_GATHER_TABLE_FREE=y
> CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
> CONFIG_MMU_GATHER_MERGE_VMAS=y
> CONFIG_MMU_LAZY_TLB_REFCOUNT=y
> CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
> CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y
> CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
> CONFIG_HAVE_CMPXCHG_LOCAL=y
> CONFIG_HAVE_CMPXCHG_DOUBLE=y
> CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
> CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
> CONFIG_HAVE_ARCH_SECCOMP=y
> CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
> CONFIG_SECCOMP=y
> CONFIG_SECCOMP_FILTER=y
> # CONFIG_SECCOMP_CACHE_DEBUG is not set
> CONFIG_HAVE_ARCH_STACKLEAK=y
> CONFIG_HAVE_STACKPROTECTOR=y
> CONFIG_STACKPROTECTOR=y
> CONFIG_STACKPROTECTOR_STRONG=y
> CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
> CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
> CONFIG_LTO_NONE=y
> CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
> CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
> CONFIG_HAVE_CONTEXT_TRACKING_USER=y
> CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y
> CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
> CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
> CONFIG_HAVE_MOVE_PUD=y
> CONFIG_HAVE_MOVE_PMD=y
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
> CONFIG_HAVE_ARCH_HUGE_VMAP=y
> CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
> CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
> CONFIG_HAVE_ARCH_SOFT_DIRTY=y
> CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
> CONFIG_MODULES_USE_ELF_RELA=y
> CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
> CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
> CONFIG_SOFTIRQ_ON_OWN_STACK=y
> CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
> CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
> CONFIG_HAVE_EXIT_THREAD=y
> CONFIG_ARCH_MMAP_RND_BITS=28
> CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
> CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
> CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
> CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
> CONFIG_HAVE_OBJTOOL=y
> CONFIG_HAVE_JUMP_LABEL_HACK=y
> CONFIG_HAVE_NOINSTR_HACK=y
> CONFIG_HAVE_NOINSTR_VALIDATION=y
> CONFIG_HAVE_UACCESS_VALIDATION=y
> CONFIG_HAVE_STACK_VALIDATION=y
> CONFIG_HAVE_RELIABLE_STACKTRACE=y
> CONFIG_OLD_SIGSUSPEND3=y
> CONFIG_COMPAT_OLD_SIGACTION=y
> CONFIG_COMPAT_32BIT_TIME=y
> CONFIG_HAVE_ARCH_VMAP_STACK=y
> CONFIG_VMAP_STACK=y
> CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
> CONFIG_RANDOMIZE_KSTACK_OFFSET=y
> CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
> CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
> CONFIG_STRICT_KERNEL_RWX=y
> CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
> CONFIG_STRICT_MODULE_RWX=y
> CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
> CONFIG_ARCH_USE_MEMREMAP_PROT=y
> # CONFIG_LOCK_EVENT_COUNTS is not set
> CONFIG_ARCH_HAS_MEM_ENCRYPT=y
> CONFIG_ARCH_HAS_CC_PLATFORM=y
> CONFIG_HAVE_STATIC_CALL=y
> CONFIG_HAVE_STATIC_CALL_INLINE=y
> CONFIG_HAVE_PREEMPT_DYNAMIC=y
> CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
> CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
> CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
> CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
> CONFIG_DYNAMIC_SIGFRAME=y
> CONFIG_HAVE_ARCH_NODE_DEV_GROUP=y
> CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y
>
> #
> # GCOV-based kernel profiling
> #
> # CONFIG_GCOV_KERNEL is not set
> CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
> # end of GCOV-based kernel profiling
>
> CONFIG_HAVE_GCC_PLUGINS=y
> CONFIG_GCC_PLUGINS=y
> # CONFIG_GCC_PLUGIN_LATENT_ENTROPY is not set
> CONFIG_FUNCTION_ALIGNMENT_4B=y
> CONFIG_FUNCTION_ALIGNMENT_16B=y
> CONFIG_FUNCTION_ALIGNMENT=16
> # end of General architecture-dependent options
>
> CONFIG_RT_MUTEXES=y
> CONFIG_BASE_SMALL=0
> CONFIG_MODULE_SIG_FORMAT=y
> CONFIG_MODULES=y
> # CONFIG_MODULE_DEBUG is not set
> CONFIG_MODULE_FORCE_LOAD=y
> CONFIG_MODULE_UNLOAD=y
> # CONFIG_MODULE_FORCE_UNLOAD is not set
> # CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set
> # CONFIG_MODVERSIONS is not set
> # CONFIG_MODULE_SRCVERSION_ALL is not set
> CONFIG_MODULE_SIG=y
> # CONFIG_MODULE_SIG_FORCE is not set
> CONFIG_MODULE_SIG_ALL=y
> # CONFIG_MODULE_SIG_SHA1 is not set
> # CONFIG_MODULE_SIG_SHA224 is not set
> CONFIG_MODULE_SIG_SHA256=y
> # CONFIG_MODULE_SIG_SHA384 is not set
> # CONFIG_MODULE_SIG_SHA512 is not set
> CONFIG_MODULE_SIG_HASH="sha256"
> CONFIG_MODULE_COMPRESS_NONE=y
> # CONFIG_MODULE_COMPRESS_GZIP is not set
> # CONFIG_MODULE_COMPRESS_XZ is not set
> # CONFIG_MODULE_COMPRESS_ZSTD is not set
> # CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
> CONFIG_MODPROBE_PATH="/sbin/modprobe"
> # CONFIG_TRIM_UNUSED_KSYMS is not set
> CONFIG_MODULES_TREE_LOOKUP=y
> CONFIG_BLOCK=y
> CONFIG_BLOCK_LEGACY_AUTOLOAD=y
> CONFIG_BLK_CGROUP_RWSTAT=y
> CONFIG_BLK_CGROUP_PUNT_BIO=y
> CONFIG_BLK_DEV_BSG_COMMON=y
> CONFIG_BLK_ICQ=y
> CONFIG_BLK_DEV_BSGLIB=y
> CONFIG_BLK_DEV_INTEGRITY=y
> CONFIG_BLK_DEV_INTEGRITY_T10=m
> # CONFIG_BLK_DEV_ZONED is not set
> CONFIG_BLK_DEV_THROTTLING=y
> # CONFIG_BLK_DEV_THROTTLING_LOW is not set
> CONFIG_BLK_WBT=y
> CONFIG_BLK_WBT_MQ=y
> # CONFIG_BLK_CGROUP_IOLATENCY is not set
> # CONFIG_BLK_CGROUP_IOCOST is not set
> # CONFIG_BLK_CGROUP_IOPRIO is not set
> CONFIG_BLK_DEBUG_FS=y
> # CONFIG_BLK_SED_OPAL is not set
> # CONFIG_BLK_INLINE_ENCRYPTION is not set
>
> #
> # Partition Types
> #
> # CONFIG_PARTITION_ADVANCED is not set
> CONFIG_MSDOS_PARTITION=y
> CONFIG_EFI_PARTITION=y
> # end of Partition Types
>
> CONFIG_BLK_MQ_PCI=y
> CONFIG_BLK_MQ_VIRTIO=y
> CONFIG_BLK_PM=y
> CONFIG_BLOCK_HOLDER_DEPRECATED=y
> CONFIG_BLK_MQ_STACKING=y
>
> #
> # IO Schedulers
> #
> CONFIG_MQ_IOSCHED_DEADLINE=y
> CONFIG_MQ_IOSCHED_KYBER=y
> CONFIG_IOSCHED_BFQ=y
> CONFIG_BFQ_GROUP_IOSCHED=y
> # CONFIG_BFQ_CGROUP_DEBUG is not set
> # end of IO Schedulers
>
> CONFIG_PREEMPT_NOTIFIERS=y
> CONFIG_PADATA=y
> CONFIG_ASN1=y
> CONFIG_UNINLINE_SPIN_UNLOCK=y
> CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
> CONFIG_MUTEX_SPIN_ON_OWNER=y
> CONFIG_RWSEM_SPIN_ON_OWNER=y
> CONFIG_LOCK_SPIN_ON_OWNER=y
> CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
> CONFIG_QUEUED_SPINLOCKS=y
> CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
> CONFIG_QUEUED_RWLOCKS=y
> CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
> CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
> CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
> CONFIG_FREEZER=y
>
> #
> # Executable file formats
> #
> CONFIG_BINFMT_ELF=y
> CONFIG_COMPAT_BINFMT_ELF=y
> CONFIG_ELFCORE=y
> CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
> CONFIG_BINFMT_SCRIPT=y
> CONFIG_BINFMT_MISC=m
> CONFIG_COREDUMP=y
> # end of Executable file formats
>
> #
> # Memory Management options
> #
> CONFIG_ZPOOL=y
> CONFIG_SWAP=y
> CONFIG_ZSWAP=y
> # CONFIG_ZSWAP_DEFAULT_ON is not set
> # CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set
> CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y
> # CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set
> # CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set
> # CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set
> # CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set
> CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo"
> CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
> # CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD is not set
> # CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set
> CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud"
> CONFIG_ZBUD=y
> # CONFIG_Z3FOLD is not set
> CONFIG_ZSMALLOC=y
> CONFIG_ZSMALLOC_STAT=y
> CONFIG_ZSMALLOC_CHAIN_SIZE=8
>
> #
> # SLAB allocator options
> #
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y
> # CONFIG_SLUB_TINY is not set
> CONFIG_SLAB_MERGE_DEFAULT=y
> CONFIG_SLAB_FREELIST_RANDOM=y
> CONFIG_SLAB_FREELIST_HARDENED=y
> # CONFIG_SLUB_STATS is not set
> CONFIG_SLUB_CPU_PARTIAL=y
> # end of SLAB allocator options
>
> CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
> # CONFIG_COMPAT_BRK is not set
> CONFIG_SPARSEMEM=y
> CONFIG_SPARSEMEM_EXTREME=y
> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> CONFIG_SPARSEMEM_VMEMMAP=y
> CONFIG_ARCH_WANT_OPTIMIZE_VMEMMAP=y
> CONFIG_HAVE_FAST_GUP=y
> CONFIG_NUMA_KEEP_MEMINFO=y
> CONFIG_MEMORY_ISOLATION=y
> CONFIG_EXCLUSIVE_SYSTEM_RAM=y
> CONFIG_HAVE_BOOTMEM_INFO_NODE=y
> CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
> CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
> CONFIG_MEMORY_HOTPLUG=y
> # CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
> CONFIG_MEMORY_HOTREMOVE=y
> CONFIG_MHP_MEMMAP_ON_MEMORY=y
> CONFIG_SPLIT_PTLOCK_CPUS=4
> CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
> CONFIG_MEMORY_BALLOON=y
> CONFIG_BALLOON_COMPACTION=y
> CONFIG_COMPACTION=y
> CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
> CONFIG_PAGE_REPORTING=y
> CONFIG_MIGRATION=y
> CONFIG_DEVICE_MIGRATION=y
> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
> CONFIG_ARCH_ENABLE_THP_MIGRATION=y
> CONFIG_CONTIG_ALLOC=y
> CONFIG_PHYS_ADDR_T_64BIT=y
> CONFIG_MMU_NOTIFIER=y
> CONFIG_KSM=y
> CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
> CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
> CONFIG_MEMORY_FAILURE=y
> CONFIG_HWPOISON_INJECT=m
> CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
> CONFIG_ARCH_WANTS_THP_SWAP=y
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_THP_SWAP=y
> # CONFIG_READ_ONLY_THP_FOR_FS is not set
> CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> CONFIG_USE_PERCPU_NUMA_NODE_ID=y
> CONFIG_HAVE_SETUP_PER_CPU_AREA=y
> CONFIG_FRONTSWAP=y
> # CONFIG_CMA is not set
> CONFIG_MEM_SOFT_DIRTY=y
> CONFIG_GENERIC_EARLY_IOREMAP=y
> CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
> CONFIG_PAGE_IDLE_FLAG=y
> CONFIG_IDLE_PAGE_TRACKING=y
> CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
> CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
> CONFIG_ARCH_HAS_PTE_DEVMAP=y
> CONFIG_ARCH_HAS_ZONE_DMA_SET=y
> CONFIG_ZONE_DMA=y
> CONFIG_ZONE_DMA32=y
> CONFIG_ZONE_DEVICE=y
> CONFIG_HMM_MIRROR=y
> CONFIG_GET_FREE_REGION=y
> CONFIG_DEVICE_PRIVATE=y
> CONFIG_VMAP_PFN=y
> CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
> CONFIG_ARCH_HAS_PKEYS=y
> CONFIG_VM_EVENT_COUNTERS=y
> # CONFIG_PERCPU_STATS is not set
> CONFIG_GUP_TEST=y
> # CONFIG_DMAPOOL_TEST is not set
> CONFIG_ARCH_HAS_PTE_SPECIAL=y
> CONFIG_SECRETMEM=y
> CONFIG_ANON_VMA_NAME=y
> CONFIG_USERFAULTFD=y
> CONFIG_HAVE_ARCH_USERFAULTFD_WP=y
> CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y
> CONFIG_PTE_MARKER_UFFD_WP=y
> # CONFIG_LRU_GEN is not set
> CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y
> CONFIG_PER_VMA_LOCK=y
>
> #
> # Data Access Monitoring
> #
> CONFIG_DAMON=y
> CONFIG_DAMON_VADDR=y
> CONFIG_DAMON_PADDR=y
> CONFIG_DAMON_SYSFS=y
> CONFIG_DAMON_DBGFS=y
> # CONFIG_DAMON_RECLAIM is not set
> # CONFIG_DAMON_LRU_SORT is not set
> # end of Data Access Monitoring
> # end of Memory Management options
>
> CONFIG_NET=y
> CONFIG_NET_INGRESS=y
> CONFIG_NET_EGRESS=y
> CONFIG_NET_REDIRECT=y
> CONFIG_SKB_EXTENSIONS=y
>
> #
> # Networking options
> #
> CONFIG_PACKET=y
> CONFIG_PACKET_DIAG=m
> CONFIG_UNIX=y
> CONFIG_UNIX_SCM=y
> CONFIG_AF_UNIX_OOB=y
> CONFIG_UNIX_DIAG=m
> CONFIG_TLS=m
> CONFIG_TLS_DEVICE=y
> # CONFIG_TLS_TOE is not set
> CONFIG_XFRM=y
> CONFIG_XFRM_OFFLOAD=y
> CONFIG_XFRM_ALGO=y
> CONFIG_XFRM_USER=y
> # CONFIG_XFRM_USER_COMPAT is not set
> # CONFIG_XFRM_INTERFACE is not set
> CONFIG_XFRM_SUB_POLICY=y
> CONFIG_XFRM_MIGRATE=y
> CONFIG_XFRM_STATISTICS=y
> CONFIG_XFRM_AH=m
> CONFIG_XFRM_ESP=m
> CONFIG_XFRM_IPCOMP=m
> # CONFIG_NET_KEY is not set
> CONFIG_XDP_SOCKETS=y
> # CONFIG_XDP_SOCKETS_DIAG is not set
> CONFIG_NET_HANDSHAKE=y
> CONFIG_INET=y
> CONFIG_IP_MULTICAST=y
> CONFIG_IP_ADVANCED_ROUTER=y
> CONFIG_IP_FIB_TRIE_STATS=y
> CONFIG_IP_MULTIPLE_TABLES=y
> CONFIG_IP_ROUTE_MULTIPATH=y
> CONFIG_IP_ROUTE_VERBOSE=y
> CONFIG_IP_ROUTE_CLASSID=y
> CONFIG_IP_PNP=y
> CONFIG_IP_PNP_DHCP=y
> # CONFIG_IP_PNP_BOOTP is not set
> # CONFIG_IP_PNP_RARP is not set
> CONFIG_NET_IPIP=m
> CONFIG_NET_IPGRE_DEMUX=m
> CONFIG_NET_IP_TUNNEL=m
> CONFIG_NET_IPGRE=m
> CONFIG_NET_IPGRE_BROADCAST=y
> CONFIG_IP_MROUTE_COMMON=y
> CONFIG_IP_MROUTE=y
> CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
> CONFIG_IP_PIMSM_V1=y
> CONFIG_IP_PIMSM_V2=y
> CONFIG_SYN_COOKIES=y
> CONFIG_NET_IPVTI=m
> CONFIG_NET_UDP_TUNNEL=m
> CONFIG_NET_FOU=m
> CONFIG_NET_FOU_IP_TUNNELS=y
> CONFIG_INET_AH=m
> CONFIG_INET_ESP=m
> CONFIG_INET_ESP_OFFLOAD=m
> # CONFIG_INET_ESPINTCP is not set
> CONFIG_INET_IPCOMP=m
> CONFIG_INET_TABLE_PERTURB_ORDER=16
> CONFIG_INET_XFRM_TUNNEL=m
> CONFIG_INET_TUNNEL=m
> CONFIG_INET_DIAG=m
> CONFIG_INET_TCP_DIAG=m
> CONFIG_INET_UDP_DIAG=m
> CONFIG_INET_RAW_DIAG=m
> # CONFIG_INET_DIAG_DESTROY is not set
> CONFIG_TCP_CONG_ADVANCED=y
> CONFIG_TCP_CONG_BIC=m
> CONFIG_TCP_CONG_CUBIC=y
> CONFIG_TCP_CONG_WESTWOOD=m
> CONFIG_TCP_CONG_HTCP=m
> CONFIG_TCP_CONG_HSTCP=m
> CONFIG_TCP_CONG_HYBLA=m
> CONFIG_TCP_CONG_VEGAS=m
> CONFIG_TCP_CONG_NV=m
> CONFIG_TCP_CONG_SCALABLE=m
> CONFIG_TCP_CONG_LP=m
> CONFIG_TCP_CONG_VENO=m
> CONFIG_TCP_CONG_YEAH=m
> CONFIG_TCP_CONG_ILLINOIS=m
> CONFIG_TCP_CONG_DCTCP=m
> # CONFIG_TCP_CONG_CDG is not set
> CONFIG_TCP_CONG_BBR=m
> CONFIG_DEFAULT_CUBIC=y
> # CONFIG_DEFAULT_RENO is not set
> CONFIG_DEFAULT_TCP_CONG="cubic"
> CONFIG_TCP_MD5SIG=y
> CONFIG_IPV6=y
> CONFIG_IPV6_ROUTER_PREF=y
> CONFIG_IPV6_ROUTE_INFO=y
> CONFIG_IPV6_OPTIMISTIC_DAD=y
> CONFIG_INET6_AH=m
> CONFIG_INET6_ESP=m
> CONFIG_INET6_ESP_OFFLOAD=m
> # CONFIG_INET6_ESPINTCP is not set
> CONFIG_INET6_IPCOMP=m
> CONFIG_IPV6_MIP6=m
> # CONFIG_IPV6_ILA is not set
> CONFIG_INET6_XFRM_TUNNEL=m
> CONFIG_INET6_TUNNEL=m
> CONFIG_IPV6_VTI=m
> CONFIG_IPV6_SIT=m
> CONFIG_IPV6_SIT_6RD=y
> CONFIG_IPV6_NDISC_NODETYPE=y
> CONFIG_IPV6_TUNNEL=m
> CONFIG_IPV6_GRE=m
> CONFIG_IPV6_FOU=m
> CONFIG_IPV6_FOU_TUNNEL=m
> CONFIG_IPV6_MULTIPLE_TABLES=y
> # CONFIG_IPV6_SUBTREES is not set
> CONFIG_IPV6_MROUTE=y
> CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
> CONFIG_IPV6_PIMSM_V2=y
> # CONFIG_IPV6_SEG6_LWTUNNEL is not set
> # CONFIG_IPV6_SEG6_HMAC is not set
> # CONFIG_IPV6_RPL_LWTUNNEL is not set
> CONFIG_IPV6_IOAM6_LWTUNNEL=y
> CONFIG_NETLABEL=y
> CONFIG_MPTCP=y
> CONFIG_INET_MPTCP_DIAG=m
> CONFIG_MPTCP_IPV6=y
> CONFIG_NETWORK_SECMARK=y
> CONFIG_NET_PTP_CLASSIFY=y
> CONFIG_NETWORK_PHY_TIMESTAMPING=y
> CONFIG_NETFILTER=y
> CONFIG_NETFILTER_ADVANCED=y
> CONFIG_BRIDGE_NETFILTER=m
>
> #
> # Core Netfilter Configuration
> #
> CONFIG_NETFILTER_INGRESS=y
> CONFIG_NETFILTER_EGRESS=y
> CONFIG_NETFILTER_SKIP_EGRESS=y
> CONFIG_NETFILTER_NETLINK=m
> CONFIG_NETFILTER_FAMILY_BRIDGE=y
> CONFIG_NETFILTER_FAMILY_ARP=y
> CONFIG_NETFILTER_BPF_LINK=y
> # CONFIG_NETFILTER_NETLINK_HOOK is not set
> # CONFIG_NETFILTER_NETLINK_ACCT is not set
> CONFIG_NETFILTER_NETLINK_QUEUE=m
> CONFIG_NETFILTER_NETLINK_LOG=m
> CONFIG_NETFILTER_NETLINK_OSF=m
> CONFIG_NF_CONNTRACK=m
> CONFIG_NF_LOG_SYSLOG=m
> CONFIG_NETFILTER_CONNCOUNT=m
> CONFIG_NF_CONNTRACK_MARK=y
> CONFIG_NF_CONNTRACK_SECMARK=y
> CONFIG_NF_CONNTRACK_ZONES=y
> CONFIG_NF_CONNTRACK_PROCFS=y
> CONFIG_NF_CONNTRACK_EVENTS=y
> CONFIG_NF_CONNTRACK_TIMEOUT=y
> CONFIG_NF_CONNTRACK_TIMESTAMP=y
> CONFIG_NF_CONNTRACK_LABELS=y
> CONFIG_NF_CONNTRACK_OVS=y
> CONFIG_NF_CT_PROTO_DCCP=y
> CONFIG_NF_CT_PROTO_GRE=y
> CONFIG_NF_CT_PROTO_SCTP=y
> CONFIG_NF_CT_PROTO_UDPLITE=y
> CONFIG_NF_CONNTRACK_AMANDA=m
> CONFIG_NF_CONNTRACK_FTP=m
> CONFIG_NF_CONNTRACK_H323=m
> CONFIG_NF_CONNTRACK_IRC=m
> CONFIG_NF_CONNTRACK_BROADCAST=m
> CONFIG_NF_CONNTRACK_NETBIOS_NS=m
> CONFIG_NF_CONNTRACK_SNMP=m
> CONFIG_NF_CONNTRACK_PPTP=m
> CONFIG_NF_CONNTRACK_SANE=m
> CONFIG_NF_CONNTRACK_SIP=m
> CONFIG_NF_CONNTRACK_TFTP=m
> CONFIG_NF_CT_NETLINK=m
> CONFIG_NF_CT_NETLINK_TIMEOUT=m
> CONFIG_NF_CT_NETLINK_HELPER=m
> CONFIG_NETFILTER_NETLINK_GLUE_CT=y
> CONFIG_NF_NAT=m
> CONFIG_NF_NAT_AMANDA=m
> CONFIG_NF_NAT_FTP=m
> CONFIG_NF_NAT_IRC=m
> CONFIG_NF_NAT_SIP=m
> CONFIG_NF_NAT_TFTP=m
> CONFIG_NF_NAT_REDIRECT=y
> CONFIG_NF_NAT_MASQUERADE=y
> CONFIG_NF_NAT_OVS=y
> CONFIG_NETFILTER_SYNPROXY=m
> CONFIG_NF_TABLES=m
> CONFIG_NF_TABLES_INET=y
> CONFIG_NF_TABLES_NETDEV=y
> CONFIG_NFT_NUMGEN=m
> CONFIG_NFT_CT=m
> CONFIG_NFT_FLOW_OFFLOAD=m
> CONFIG_NFT_CONNLIMIT=m
> CONFIG_NFT_LOG=m
> CONFIG_NFT_LIMIT=m
> CONFIG_NFT_MASQ=m
> CONFIG_NFT_REDIR=m
> CONFIG_NFT_NAT=m
> # CONFIG_NFT_TUNNEL is not set
> CONFIG_NFT_QUEUE=m
> CONFIG_NFT_QUOTA=m
> CONFIG_NFT_REJECT=m
> CONFIG_NFT_REJECT_INET=m
> CONFIG_NFT_COMPAT=m
> CONFIG_NFT_HASH=m
> CONFIG_NFT_FIB=m
> CONFIG_NFT_FIB_INET=m
> # CONFIG_NFT_XFRM is not set
> CONFIG_NFT_SOCKET=m
> # CONFIG_NFT_OSF is not set
> CONFIG_NFT_TPROXY=m
> CONFIG_NFT_SYNPROXY=m
> CONFIG_NF_DUP_NETDEV=m
> CONFIG_NFT_DUP_NETDEV=m
> CONFIG_NFT_FWD_NETDEV=m
> CONFIG_NFT_FIB_NETDEV=m
> # CONFIG_NFT_REJECT_NETDEV is not set
> CONFIG_NF_FLOW_TABLE_INET=m
> CONFIG_NF_FLOW_TABLE=m
> # CONFIG_NF_FLOW_TABLE_PROCFS is not set
> CONFIG_NETFILTER_XTABLES=y
> # CONFIG_NETFILTER_XTABLES_COMPAT is not set
>
> #
> # Xtables combined modules
> #
> CONFIG_NETFILTER_XT_MARK=m
> CONFIG_NETFILTER_XT_CONNMARK=m
> CONFIG_NETFILTER_XT_SET=m
>
> #
> # Xtables targets
> #
> CONFIG_NETFILTER_XT_TARGET_AUDIT=m
> CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
> CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
> CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
> CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
> CONFIG_NETFILTER_XT_TARGET_CT=m
> CONFIG_NETFILTER_XT_TARGET_DSCP=m
> CONFIG_NETFILTER_XT_TARGET_HL=m
> CONFIG_NETFILTER_XT_TARGET_HMARK=m
> CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
> # CONFIG_NETFILTER_XT_TARGET_LED is not set
> CONFIG_NETFILTER_XT_TARGET_LOG=m
> CONFIG_NETFILTER_XT_TARGET_MARK=m
> CONFIG_NETFILTER_XT_NAT=m
> CONFIG_NETFILTER_XT_TARGET_NETMAP=m
> CONFIG_NETFILTER_XT_TARGET_NFLOG=m
> CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
> CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
> CONFIG_NETFILTER_XT_TARGET_RATEEST=m
> CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
> CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
> CONFIG_NETFILTER_XT_TARGET_TEE=m
> CONFIG_NETFILTER_XT_TARGET_TPROXY=m
> CONFIG_NETFILTER_XT_TARGET_TRACE=m
> CONFIG_NETFILTER_XT_TARGET_SECMARK=m
> CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
> CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
>
> #
> # Xtables matches
> #
> CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
> CONFIG_NETFILTER_XT_MATCH_BPF=m
> CONFIG_NETFILTER_XT_MATCH_CGROUP=m
> CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
> CONFIG_NETFILTER_XT_MATCH_COMMENT=m
> CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
> CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
> CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
> CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
> CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
> CONFIG_NETFILTER_XT_MATCH_CPU=m
> CONFIG_NETFILTER_XT_MATCH_DCCP=m
> CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
> CONFIG_NETFILTER_XT_MATCH_DSCP=m
> CONFIG_NETFILTER_XT_MATCH_ECN=m
> CONFIG_NETFILTER_XT_MATCH_ESP=m
> CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
> CONFIG_NETFILTER_XT_MATCH_HELPER=m
> CONFIG_NETFILTER_XT_MATCH_HL=m
> # CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
> CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
> CONFIG_NETFILTER_XT_MATCH_IPVS=m
> # CONFIG_NETFILTER_XT_MATCH_L2TP is not set
> CONFIG_NETFILTER_XT_MATCH_LENGTH=m
> CONFIG_NETFILTER_XT_MATCH_LIMIT=m
> CONFIG_NETFILTER_XT_MATCH_MAC=m
> CONFIG_NETFILTER_XT_MATCH_MARK=m
> CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
> # CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
> CONFIG_NETFILTER_XT_MATCH_OSF=m
> CONFIG_NETFILTER_XT_MATCH_OWNER=m
> CONFIG_NETFILTER_XT_MATCH_POLICY=m
> CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
> CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
> CONFIG_NETFILTER_XT_MATCH_QUOTA=m
> CONFIG_NETFILTER_XT_MATCH_RATEEST=m
> CONFIG_NETFILTER_XT_MATCH_REALM=m
> CONFIG_NETFILTER_XT_MATCH_RECENT=m
> CONFIG_NETFILTER_XT_MATCH_SCTP=m
> CONFIG_NETFILTER_XT_MATCH_SOCKET=m
> CONFIG_NETFILTER_XT_MATCH_STATE=m
> CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
> CONFIG_NETFILTER_XT_MATCH_STRING=m
> CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
> # CONFIG_NETFILTER_XT_MATCH_TIME is not set
> # CONFIG_NETFILTER_XT_MATCH_U32 is not set
> # end of Core Netfilter Configuration
>
> CONFIG_IP_SET=m
> CONFIG_IP_SET_MAX=256
> CONFIG_IP_SET_BITMAP_IP=m
> CONFIG_IP_SET_BITMAP_IPMAC=m
> CONFIG_IP_SET_BITMAP_PORT=m
> CONFIG_IP_SET_HASH_IP=m
> CONFIG_IP_SET_HASH_IPMARK=m
> CONFIG_IP_SET_HASH_IPPORT=m
> CONFIG_IP_SET_HASH_IPPORTIP=m
> CONFIG_IP_SET_HASH_IPPORTNET=m
> CONFIG_IP_SET_HASH_IPMAC=m
> CONFIG_IP_SET_HASH_MAC=m
> CONFIG_IP_SET_HASH_NETPORTNET=m
> CONFIG_IP_SET_HASH_NET=m
> CONFIG_IP_SET_HASH_NETNET=m
> CONFIG_IP_SET_HASH_NETPORT=m
> CONFIG_IP_SET_HASH_NETIFACE=m
> CONFIG_IP_SET_LIST_SET=m
> CONFIG_IP_VS=m
> CONFIG_IP_VS_IPV6=y
> # CONFIG_IP_VS_DEBUG is not set
> CONFIG_IP_VS_TAB_BITS=12
>
> #
> # IPVS transport protocol load balancing support
> #
> CONFIG_IP_VS_PROTO_TCP=y
> CONFIG_IP_VS_PROTO_UDP=y
> CONFIG_IP_VS_PROTO_AH_ESP=y
> CONFIG_IP_VS_PROTO_ESP=y
> CONFIG_IP_VS_PROTO_AH=y
> CONFIG_IP_VS_PROTO_SCTP=y
>
> #
> # IPVS scheduler
> #
> CONFIG_IP_VS_RR=m
> CONFIG_IP_VS_WRR=m
> CONFIG_IP_VS_LC=m
> CONFIG_IP_VS_WLC=m
> CONFIG_IP_VS_FO=m
> CONFIG_IP_VS_OVF=m
> CONFIG_IP_VS_LBLC=m
> CONFIG_IP_VS_LBLCR=m
> CONFIG_IP_VS_DH=m
> CONFIG_IP_VS_SH=m
> # CONFIG_IP_VS_MH is not set
> CONFIG_IP_VS_SED=m
> CONFIG_IP_VS_NQ=m
> # CONFIG_IP_VS_TWOS is not set
>
> #
> # IPVS SH scheduler
> #
> CONFIG_IP_VS_SH_TAB_BITS=8
>
> #
> # IPVS MH scheduler
> #
> CONFIG_IP_VS_MH_TAB_INDEX=12
>
> #
> # IPVS application helper
> #
> CONFIG_IP_VS_FTP=m
> CONFIG_IP_VS_NFCT=y
> CONFIG_IP_VS_PE_SIP=m
>
> #
> # IP: Netfilter Configuration
> #
> CONFIG_NF_DEFRAG_IPV4=m
> CONFIG_NF_SOCKET_IPV4=m
> CONFIG_NF_TPROXY_IPV4=m
> CONFIG_NF_TABLES_IPV4=y
> CONFIG_NFT_REJECT_IPV4=m
> CONFIG_NFT_DUP_IPV4=m
> CONFIG_NFT_FIB_IPV4=m
> CONFIG_NF_TABLES_ARP=y
> CONFIG_NF_DUP_IPV4=m
> CONFIG_NF_LOG_ARP=m
> CONFIG_NF_LOG_IPV4=m
> CONFIG_NF_REJECT_IPV4=m
> CONFIG_NF_NAT_SNMP_BASIC=m
> CONFIG_NF_NAT_PPTP=m
> CONFIG_NF_NAT_H323=m
> CONFIG_IP_NF_IPTABLES=m
> CONFIG_IP_NF_MATCH_AH=m
> CONFIG_IP_NF_MATCH_ECN=m
> CONFIG_IP_NF_MATCH_RPFILTER=m
> CONFIG_IP_NF_MATCH_TTL=m
> CONFIG_IP_NF_FILTER=m
> CONFIG_IP_NF_TARGET_REJECT=m
> CONFIG_IP_NF_TARGET_SYNPROXY=m
> CONFIG_IP_NF_NAT=m
> CONFIG_IP_NF_TARGET_MASQUERADE=m
> CONFIG_IP_NF_TARGET_NETMAP=m
> CONFIG_IP_NF_TARGET_REDIRECT=m
> CONFIG_IP_NF_MANGLE=m
> CONFIG_IP_NF_TARGET_ECN=m
> CONFIG_IP_NF_TARGET_TTL=m
> CONFIG_IP_NF_RAW=m
> CONFIG_IP_NF_SECURITY=m
> CONFIG_IP_NF_ARPTABLES=m
> CONFIG_IP_NF_ARPFILTER=m
> CONFIG_IP_NF_ARP_MANGLE=m
> # end of IP: Netfilter Configuration
>
> #
> # IPv6: Netfilter Configuration
> #
> CONFIG_NF_SOCKET_IPV6=m
> CONFIG_NF_TPROXY_IPV6=m
> CONFIG_NF_TABLES_IPV6=y
> CONFIG_NFT_REJECT_IPV6=m
> CONFIG_NFT_DUP_IPV6=m
> CONFIG_NFT_FIB_IPV6=m
> CONFIG_NF_DUP_IPV6=m
> CONFIG_NF_REJECT_IPV6=m
> CONFIG_NF_LOG_IPV6=m
> CONFIG_IP6_NF_IPTABLES=m
> CONFIG_IP6_NF_MATCH_AH=m
> CONFIG_IP6_NF_MATCH_EUI64=m
> CONFIG_IP6_NF_MATCH_FRAG=m
> CONFIG_IP6_NF_MATCH_OPTS=m
> CONFIG_IP6_NF_MATCH_HL=m
> CONFIG_IP6_NF_MATCH_IPV6HEADER=m
> CONFIG_IP6_NF_MATCH_MH=m
> CONFIG_IP6_NF_MATCH_RPFILTER=m
> CONFIG_IP6_NF_MATCH_RT=m
> # CONFIG_IP6_NF_MATCH_SRH is not set
> # CONFIG_IP6_NF_TARGET_HL is not set
> CONFIG_IP6_NF_FILTER=m
> CONFIG_IP6_NF_TARGET_REJECT=m
> CONFIG_IP6_NF_TARGET_SYNPROXY=m
> CONFIG_IP6_NF_MANGLE=m
> CONFIG_IP6_NF_RAW=m
> CONFIG_IP6_NF_SECURITY=m
> CONFIG_IP6_NF_NAT=m
> CONFIG_IP6_NF_TARGET_MASQUERADE=m
> CONFIG_IP6_NF_TARGET_NPT=m
> # end of IPv6: Netfilter Configuration
>
> CONFIG_NF_DEFRAG_IPV6=m
> CONFIG_NF_TABLES_BRIDGE=m
> # CONFIG_NFT_BRIDGE_META is not set
> CONFIG_NFT_BRIDGE_REJECT=m
> # CONFIG_NF_CONNTRACK_BRIDGE is not set
> CONFIG_BRIDGE_NF_EBTABLES=m
> CONFIG_BRIDGE_EBT_BROUTE=m
> CONFIG_BRIDGE_EBT_T_FILTER=m
> CONFIG_BRIDGE_EBT_T_NAT=m
> CONFIG_BRIDGE_EBT_802_3=m
> CONFIG_BRIDGE_EBT_AMONG=m
> CONFIG_BRIDGE_EBT_ARP=m
> CONFIG_BRIDGE_EBT_IP=m
> CONFIG_BRIDGE_EBT_IP6=m
> CONFIG_BRIDGE_EBT_LIMIT=m
> CONFIG_BRIDGE_EBT_MARK=m
> CONFIG_BRIDGE_EBT_PKTTYPE=m
> CONFIG_BRIDGE_EBT_STP=m
> CONFIG_BRIDGE_EBT_VLAN=m
> CONFIG_BRIDGE_EBT_ARPREPLY=m
> CONFIG_BRIDGE_EBT_DNAT=m
> CONFIG_BRIDGE_EBT_MARK_T=m
> CONFIG_BRIDGE_EBT_REDIRECT=m
> CONFIG_BRIDGE_EBT_SNAT=m
> CONFIG_BRIDGE_EBT_LOG=m
> CONFIG_BRIDGE_EBT_NFLOG=m
> # CONFIG_BPFILTER is not set
> # CONFIG_IP_DCCP is not set
> CONFIG_IP_SCTP=m
> # CONFIG_SCTP_DBG_OBJCNT is not set
> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
> CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1=y
> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
> CONFIG_SCTP_COOKIE_HMAC_MD5=y
> CONFIG_SCTP_COOKIE_HMAC_SHA1=y
> CONFIG_INET_SCTP_DIAG=m
> # CONFIG_RDS is not set
> # CONFIG_TIPC is not set
> # CONFIG_ATM is not set
> # CONFIG_L2TP is not set
> CONFIG_STP=y
> CONFIG_GARP=y
> CONFIG_MRP=y
> CONFIG_BRIDGE=m
> CONFIG_BRIDGE_IGMP_SNOOPING=y
> CONFIG_BRIDGE_VLAN_FILTERING=y
> # CONFIG_BRIDGE_MRP is not set
> # CONFIG_BRIDGE_CFM is not set
> # CONFIG_NET_DSA is not set
> CONFIG_VLAN_8021Q=y
> CONFIG_VLAN_8021Q_GVRP=y
> CONFIG_VLAN_8021Q_MVRP=y
> CONFIG_LLC=y
> # CONFIG_LLC2 is not set
> # CONFIG_ATALK is not set
> # CONFIG_X25 is not set
> # CONFIG_LAPB is not set
> # CONFIG_PHONET is not set
> # CONFIG_6LOWPAN is not set
> # CONFIG_IEEE802154 is not set
> CONFIG_NET_SCHED=y
>
> #
> # Queueing/Scheduling
> #
> CONFIG_NET_SCH_HTB=m
> CONFIG_NET_SCH_HFSC=m
> CONFIG_NET_SCH_PRIO=m
> CONFIG_NET_SCH_MULTIQ=m
> CONFIG_NET_SCH_RED=m
> CONFIG_NET_SCH_SFB=m
> CONFIG_NET_SCH_SFQ=m
> CONFIG_NET_SCH_TEQL=m
> CONFIG_NET_SCH_TBF=m
> CONFIG_NET_SCH_CBS=m
> CONFIG_NET_SCH_ETF=m
> CONFIG_NET_SCH_MQPRIO_LIB=m
> CONFIG_NET_SCH_TAPRIO=m
> CONFIG_NET_SCH_GRED=m
> CONFIG_NET_SCH_NETEM=y
> CONFIG_NET_SCH_DRR=m
> CONFIG_NET_SCH_MQPRIO=m
> CONFIG_NET_SCH_SKBPRIO=m
> CONFIG_NET_SCH_CHOKE=m
> CONFIG_NET_SCH_QFQ=m
> CONFIG_NET_SCH_CODEL=m
> CONFIG_NET_SCH_FQ_CODEL=y
> CONFIG_NET_SCH_CAKE=m
> CONFIG_NET_SCH_FQ=m
> CONFIG_NET_SCH_HHF=m
> CONFIG_NET_SCH_PIE=m
> CONFIG_NET_SCH_FQ_PIE=m
> CONFIG_NET_SCH_INGRESS=y
> CONFIG_NET_SCH_PLUG=m
> CONFIG_NET_SCH_ETS=m
> CONFIG_NET_SCH_DEFAULT=y
> # CONFIG_DEFAULT_FQ is not set
> # CONFIG_DEFAULT_CODEL is not set
> CONFIG_DEFAULT_FQ_CODEL=y
> # CONFIG_DEFAULT_FQ_PIE is not set
> # CONFIG_DEFAULT_SFQ is not set
> # CONFIG_DEFAULT_PFIFO_FAST is not set
> CONFIG_DEFAULT_NET_SCH="fq_codel"
>
> #
> # Classification
> #
> CONFIG_NET_CLS=y
> CONFIG_NET_CLS_BASIC=m
> CONFIG_NET_CLS_ROUTE4=m
> CONFIG_NET_CLS_FW=m
> CONFIG_NET_CLS_U32=m
> CONFIG_CLS_U32_PERF=y
> CONFIG_CLS_U32_MARK=y
> CONFIG_NET_CLS_FLOW=m
> CONFIG_NET_CLS_CGROUP=y
> CONFIG_NET_CLS_BPF=m
> CONFIG_NET_CLS_FLOWER=m
> CONFIG_NET_CLS_MATCHALL=m
> CONFIG_NET_EMATCH=y
> CONFIG_NET_EMATCH_STACK=32
> CONFIG_NET_EMATCH_CMP=m
> CONFIG_NET_EMATCH_NBYTE=m
> CONFIG_NET_EMATCH_U32=m
> CONFIG_NET_EMATCH_META=m
> CONFIG_NET_EMATCH_TEXT=m
> CONFIG_NET_EMATCH_CANID=m
> CONFIG_NET_EMATCH_IPSET=m
> CONFIG_NET_EMATCH_IPT=m
> CONFIG_NET_CLS_ACT=y
> CONFIG_NET_ACT_POLICE=m
> CONFIG_NET_ACT_GACT=m
> CONFIG_GACT_PROB=y
> CONFIG_NET_ACT_MIRRED=m
> CONFIG_NET_ACT_SAMPLE=m
> CONFIG_NET_ACT_IPT=m
> CONFIG_NET_ACT_NAT=m
> CONFIG_NET_ACT_PEDIT=m
> CONFIG_NET_ACT_SIMP=m
> CONFIG_NET_ACT_SKBEDIT=m
> CONFIG_NET_ACT_CSUM=m
> CONFIG_NET_ACT_MPLS=m
> CONFIG_NET_ACT_VLAN=m
> CONFIG_NET_ACT_BPF=m
> CONFIG_NET_ACT_CONNMARK=m
> CONFIG_NET_ACT_CTINFO=m
> CONFIG_NET_ACT_SKBMOD=m
> CONFIG_NET_ACT_IFE=m
> CONFIG_NET_ACT_TUNNEL_KEY=m
> CONFIG_NET_ACT_CT=m
> CONFIG_NET_ACT_GATE=m
> CONFIG_NET_IFE_SKBMARK=m
> CONFIG_NET_IFE_SKBPRIO=m
> CONFIG_NET_IFE_SKBTCINDEX=m
> # CONFIG_NET_TC_SKB_EXT is not set
> CONFIG_NET_SCH_FIFO=y
> CONFIG_DCB=y
> CONFIG_DNS_RESOLVER=m
> # CONFIG_BATMAN_ADV is not set
> CONFIG_OPENVSWITCH=m
> CONFIG_OPENVSWITCH_GRE=m
> CONFIG_OPENVSWITCH_VXLAN=m
> CONFIG_VSOCKETS=m
> CONFIG_VSOCKETS_DIAG=m
> CONFIG_VSOCKETS_LOOPBACK=m
> CONFIG_VIRTIO_VSOCKETS=m
> CONFIG_VIRTIO_VSOCKETS_COMMON=m
> CONFIG_HYPERV_VSOCKETS=m
> CONFIG_NETLINK_DIAG=m
> CONFIG_MPLS=y
> CONFIG_NET_MPLS_GSO=y
> CONFIG_MPLS_ROUTING=m
> CONFIG_MPLS_IPTUNNEL=m
> CONFIG_NET_NSH=y
> # CONFIG_HSR is not set
> CONFIG_NET_SWITCHDEV=y
> CONFIG_NET_L3_MASTER_DEV=y
> # CONFIG_QRTR is not set
> # CONFIG_NET_NCSI is not set
> CONFIG_PCPU_DEV_REFCNT=y
> CONFIG_MAX_SKB_FRAGS=17
> CONFIG_RPS=y
> CONFIG_RFS_ACCEL=y
> CONFIG_SOCK_RX_QUEUE_MAPPING=y
> CONFIG_XPS=y
> CONFIG_CGROUP_NET_PRIO=y
> CONFIG_CGROUP_NET_CLASSID=y
> CONFIG_NET_RX_BUSY_POLL=y
> CONFIG_BQL=y
> CONFIG_BPF_STREAM_PARSER=y
> CONFIG_NET_FLOW_LIMIT=y
>
> #
> # Network testing
> #
> CONFIG_NET_PKTGEN=m
> CONFIG_NET_DROP_MONITOR=y
> # end of Network testing
> # end of Networking options
>
> # CONFIG_HAMRADIO is not set
> CONFIG_CAN=m
> CONFIG_CAN_RAW=m
> CONFIG_CAN_BCM=m
> CONFIG_CAN_GW=m
> # CONFIG_CAN_J1939 is not set
> # CONFIG_CAN_ISOTP is not set
> # CONFIG_BT is not set
> # CONFIG_AF_RXRPC is not set
> # CONFIG_AF_KCM is not set
> CONFIG_STREAM_PARSER=y
> # CONFIG_MCTP is not set
> CONFIG_FIB_RULES=y
> CONFIG_WIRELESS=y
> CONFIG_CFG80211=m
> # CONFIG_NL80211_TESTMODE is not set
> # CONFIG_CFG80211_DEVELOPER_WARNINGS is not set
> # CONFIG_CFG80211_CERTIFICATION_ONUS is not set
> CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y
> CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y
> CONFIG_CFG80211_DEFAULT_PS=y
> # CONFIG_CFG80211_DEBUGFS is not set
> CONFIG_CFG80211_CRDA_SUPPORT=y
> # CONFIG_CFG80211_WEXT is not set
> CONFIG_MAC80211=m
> CONFIG_MAC80211_HAS_RC=y
> CONFIG_MAC80211_RC_MINSTREL=y
> CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
> CONFIG_MAC80211_RC_DEFAULT="minstrel_ht"
> # CONFIG_MAC80211_MESH is not set
> CONFIG_MAC80211_LEDS=y
> CONFIG_MAC80211_DEBUGFS=y
> # CONFIG_MAC80211_MESSAGE_TRACING is not set
> # CONFIG_MAC80211_DEBUG_MENU is not set
> CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
> CONFIG_RFKILL=m
> CONFIG_RFKILL_LEDS=y
> CONFIG_RFKILL_INPUT=y
> # CONFIG_RFKILL_GPIO is not set
> CONFIG_NET_9P=y
> CONFIG_NET_9P_FD=y
> CONFIG_NET_9P_VIRTIO=y
> # CONFIG_NET_9P_DEBUG is not set
> # CONFIG_CAIF is not set
> CONFIG_CEPH_LIB=m
> # CONFIG_CEPH_LIB_PRETTYDEBUG is not set
> CONFIG_CEPH_LIB_USE_DNS_RESOLVER=y
> CONFIG_NFC=m
> # CONFIG_NFC_DIGITAL is not set
> CONFIG_NFC_NCI=m
> # CONFIG_NFC_NCI_SPI is not set
> # CONFIG_NFC_NCI_UART is not set
> # CONFIG_NFC_HCI is not set
>
> #
> # Near Field Communication (NFC) devices
> #
> CONFIG_NFC_VIRTUAL_NCI=m
> # CONFIG_NFC_FDP is not set
> # CONFIG_NFC_PN533_USB is not set
> # CONFIG_NFC_PN533_I2C is not set
> # CONFIG_NFC_MRVL_USB is not set
> # CONFIG_NFC_ST_NCI_I2C is not set
> # CONFIG_NFC_ST_NCI_SPI is not set
> # CONFIG_NFC_NXP_NCI is not set
> # CONFIG_NFC_S3FWRN5_I2C is not set
> # end of Near Field Communication (NFC) devices
>
> CONFIG_PSAMPLE=m
> CONFIG_NET_IFE=m
> CONFIG_LWTUNNEL=y
> CONFIG_LWTUNNEL_BPF=y
> CONFIG_DST_CACHE=y
> CONFIG_GRO_CELLS=y
> CONFIG_SOCK_VALIDATE_XMIT=y
> CONFIG_NET_SELFTESTS=y
> CONFIG_NET_SOCK_MSG=y
> CONFIG_NET_DEVLINK=y
> CONFIG_PAGE_POOL=y
> CONFIG_PAGE_POOL_STATS=y
> CONFIG_FAILOVER=m
> CONFIG_ETHTOOL_NETLINK=y
>
> #
> # Device Drivers
> #
> CONFIG_HAVE_EISA=y
> # CONFIG_EISA is not set
> CONFIG_HAVE_PCI=y
> CONFIG_PCI=y
> CONFIG_PCI_DOMAINS=y
> CONFIG_PCIEPORTBUS=y
> CONFIG_HOTPLUG_PCI_PCIE=y
> CONFIG_PCIEAER=y
> CONFIG_PCIEAER_INJECT=m
> CONFIG_PCIE_ECRC=y
> CONFIG_PCIEASPM=y
> CONFIG_PCIEASPM_DEFAULT=y
> # CONFIG_PCIEASPM_POWERSAVE is not set
> # CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
> # CONFIG_PCIEASPM_PERFORMANCE is not set
> CONFIG_PCIE_PME=y
> CONFIG_PCIE_DPC=y
> # CONFIG_PCIE_PTM is not set
> # CONFIG_PCIE_EDR is not set
> CONFIG_PCI_MSI=y
> CONFIG_PCI_QUIRKS=y
> # CONFIG_PCI_DEBUG is not set
> # CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
> CONFIG_PCI_STUB=y
> CONFIG_PCI_PF_STUB=m
> CONFIG_PCI_ATS=y
> CONFIG_PCI_LOCKLESS_CONFIG=y
> CONFIG_PCI_IOV=y
> CONFIG_PCI_PRI=y
> CONFIG_PCI_PASID=y
> # CONFIG_PCI_P2PDMA is not set
> CONFIG_PCI_LABEL=y
> CONFIG_PCI_HYPERV=m
> # CONFIG_PCIE_BUS_TUNE_OFF is not set
> CONFIG_PCIE_BUS_DEFAULT=y
> # CONFIG_PCIE_BUS_SAFE is not set
> # CONFIG_PCIE_BUS_PERFORMANCE is not set
> # CONFIG_PCIE_BUS_PEER2PEER is not set
> CONFIG_VGA_ARB=y
> CONFIG_VGA_ARB_MAX_GPUS=64
> CONFIG_HOTPLUG_PCI=y
> CONFIG_HOTPLUG_PCI_ACPI=y
> CONFIG_HOTPLUG_PCI_ACPI_IBM=m
> # CONFIG_HOTPLUG_PCI_CPCI is not set
> CONFIG_HOTPLUG_PCI_SHPC=y
>
> #
> # PCI controller drivers
> #
> CONFIG_VMD=y
> CONFIG_PCI_HYPERV_INTERFACE=m
>
> #
> # Cadence-based PCIe controllers
> #
> # end of Cadence-based PCIe controllers
>
> #
> # DesignWare-based PCIe controllers
> #
> # CONFIG_PCI_MESON is not set
> # CONFIG_PCIE_DW_PLAT_HOST is not set
> # end of DesignWare-based PCIe controllers
>
> #
> # Mobiveil-based PCIe controllers
> #
> # end of Mobiveil-based PCIe controllers
> # end of PCI controller drivers
>
> #
> # PCI Endpoint
> #
> # CONFIG_PCI_ENDPOINT is not set
> # end of PCI Endpoint
>
> #
> # PCI switch controller drivers
> #
> # CONFIG_PCI_SW_SWITCHTEC is not set
> # end of PCI switch controller drivers
>
> # CONFIG_CXL_BUS is not set
> # CONFIG_PCCARD is not set
> # CONFIG_RAPIDIO is not set
>
> #
> # Generic Driver Options
> #
> CONFIG_AUXILIARY_BUS=y
> # CONFIG_UEVENT_HELPER is not set
> CONFIG_DEVTMPFS=y
> CONFIG_DEVTMPFS_MOUNT=y
> # CONFIG_DEVTMPFS_SAFE is not set
> CONFIG_STANDALONE=y
> CONFIG_PREVENT_FIRMWARE_BUILD=y
>
> #
> # Firmware loader
> #
> CONFIG_FW_LOADER=y
> CONFIG_FW_LOADER_DEBUG=y
> CONFIG_FW_LOADER_PAGED_BUF=y
> CONFIG_FW_LOADER_SYSFS=y
> CONFIG_EXTRA_FIRMWARE=""
> CONFIG_FW_LOADER_USER_HELPER=y
> # CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
> # CONFIG_FW_LOADER_COMPRESS is not set
> CONFIG_FW_CACHE=y
> CONFIG_FW_UPLOAD=y
> # end of Firmware loader
>
> CONFIG_ALLOW_DEV_COREDUMP=y
> # CONFIG_DEBUG_DRIVER is not set
> # CONFIG_DEBUG_DEVRES is not set
> # CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
> CONFIG_HMEM_REPORTING=y
> # CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
> CONFIG_GENERIC_CPU_AUTOPROBE=y
> CONFIG_GENERIC_CPU_VULNERABILITIES=y
> CONFIG_REGMAP=y
> CONFIG_REGMAP_I2C=m
> CONFIG_REGMAP_SPI=m
> CONFIG_DMA_SHARED_BUFFER=y
> # CONFIG_DMA_FENCE_TRACE is not set
> # CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set
> # end of Generic Driver Options
>
> #
> # Bus devices
> #
> # CONFIG_MHI_BUS is not set
> # CONFIG_MHI_BUS_EP is not set
> # end of Bus devices
>
> CONFIG_CONNECTOR=y
> CONFIG_PROC_EVENTS=y
>
> #
> # Firmware Drivers
> #
>
> #
> # ARM System Control and Management Interface Protocol
> #
> # end of ARM System Control and Management Interface Protocol
>
> CONFIG_EDD=m
> # CONFIG_EDD_OFF is not set
> CONFIG_FIRMWARE_MEMMAP=y
> CONFIG_DMIID=y
> CONFIG_DMI_SYSFS=y
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
> # CONFIG_ISCSI_IBFT is not set
> CONFIG_FW_CFG_SYSFS=y
> # CONFIG_FW_CFG_SYSFS_CMDLINE is not set
> CONFIG_SYSFB=y
> # CONFIG_SYSFB_SIMPLEFB is not set
> # CONFIG_GOOGLE_FIRMWARE is not set
>
> #
> # EFI (Extensible Firmware Interface) Support
> #
> CONFIG_EFI_ESRT=y
> CONFIG_EFI_VARS_PSTORE=y
> CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
> CONFIG_EFI_SOFT_RESERVE=y
> CONFIG_EFI_DXE_MEM_ATTRIBUTES=y
> CONFIG_EFI_RUNTIME_WRAPPERS=y
> # CONFIG_EFI_BOOTLOADER_CONTROL is not set
> # CONFIG_EFI_CAPSULE_LOADER is not set
> # CONFIG_EFI_TEST is not set
> # CONFIG_APPLE_PROPERTIES is not set
> # CONFIG_RESET_ATTACK_MITIGATION is not set
> # CONFIG_EFI_RCI2_TABLE is not set
> # CONFIG_EFI_DISABLE_PCI_DMA is not set
> CONFIG_EFI_EARLYCON=y
> CONFIG_EFI_CUSTOM_SSDT_OVERLAYS=y
> # CONFIG_EFI_DISABLE_RUNTIME is not set
> # CONFIG_EFI_COCO_SECRET is not set
> # end of EFI (Extensible Firmware Interface) Support
>
> CONFIG_UEFI_CPER=y
> CONFIG_UEFI_CPER_X86=y
>
> #
> # Tegra firmware driver
> #
> # end of Tegra firmware driver
> # end of Firmware Drivers
>
> # CONFIG_GNSS is not set
> # CONFIG_MTD is not set
> # CONFIG_OF is not set
> CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
> CONFIG_PARPORT=m
> CONFIG_PARPORT_PC=m
> CONFIG_PARPORT_SERIAL=m
> # CONFIG_PARPORT_PC_FIFO is not set
> # CONFIG_PARPORT_PC_SUPERIO is not set
> CONFIG_PARPORT_1284=y
> CONFIG_PNP=y
> # CONFIG_PNP_DEBUG_MESSAGES is not set
>
> #
> # Protocols
> #
> CONFIG_PNPACPI=y
> CONFIG_BLK_DEV=y
> CONFIG_BLK_DEV_NULL_BLK=m
> # CONFIG_BLK_DEV_FD is not set
> CONFIG_CDROM=m
> # CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
> CONFIG_ZRAM=m
> CONFIG_ZRAM_DEF_COMP_LZORLE=y
> # CONFIG_ZRAM_DEF_COMP_LZO is not set
> CONFIG_ZRAM_DEF_COMP="lzo-rle"
> CONFIG_ZRAM_WRITEBACK=y
> # CONFIG_ZRAM_MEMORY_TRACKING is not set
> # CONFIG_ZRAM_MULTI_COMP is not set
> CONFIG_BLK_DEV_LOOP=m
> CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
> # CONFIG_BLK_DEV_DRBD is not set
> CONFIG_BLK_DEV_NBD=m
> CONFIG_BLK_DEV_RAM=m
> CONFIG_BLK_DEV_RAM_COUNT=16
> CONFIG_BLK_DEV_RAM_SIZE=16384
> CONFIG_CDROM_PKTCDVD=m
> CONFIG_CDROM_PKTCDVD_BUFFERS=8
> # CONFIG_CDROM_PKTCDVD_WCACHE is not set
> # CONFIG_ATA_OVER_ETH is not set
> CONFIG_VIRTIO_BLK=m
> CONFIG_BLK_DEV_RBD=m
> # CONFIG_BLK_DEV_UBLK is not set
>
> #
> # NVME Support
> #
> CONFIG_NVME_CORE=m
> CONFIG_BLK_DEV_NVME=m
> CONFIG_NVME_MULTIPATH=y
> # CONFIG_NVME_VERBOSE_ERRORS is not set
> # CONFIG_NVME_HWMON is not set
> # CONFIG_NVME_FC is not set
> # CONFIG_NVME_TCP is not set
> # CONFIG_NVME_AUTH is not set
> # CONFIG_NVME_TARGET is not set
> # end of NVME Support
>
> #
> # Misc devices
> #
> # CONFIG_AD525X_DPOT is not set
> # CONFIG_DUMMY_IRQ is not set
> # CONFIG_IBM_ASM is not set
> # CONFIG_PHANTOM is not set
> CONFIG_TIFM_CORE=m
> CONFIG_TIFM_7XX1=m
> # CONFIG_ICS932S401 is not set
> CONFIG_ENCLOSURE_SERVICES=m
> # CONFIG_SGI_XP is not set
> CONFIG_HP_ILO=m
> # CONFIG_SGI_GRU is not set
> CONFIG_APDS9802ALS=m
> CONFIG_ISL29003=m
> CONFIG_ISL29020=m
> CONFIG_SENSORS_TSL2550=m
> CONFIG_SENSORS_BH1770=m
> CONFIG_SENSORS_APDS990X=m
> # CONFIG_HMC6352 is not set
> # CONFIG_DS1682 is not set
> # CONFIG_LATTICE_ECP3_CONFIG is not set
> # CONFIG_SRAM is not set
> # CONFIG_DW_XDATA_PCIE is not set
> # CONFIG_PCI_ENDPOINT_TEST is not set
> # CONFIG_XILINX_SDFEC is not set
> # CONFIG_C2PORT is not set
>
> #
> # EEPROM support
> #
> # CONFIG_EEPROM_AT24 is not set
> # CONFIG_EEPROM_AT25 is not set
> CONFIG_EEPROM_LEGACY=m
> CONFIG_EEPROM_MAX6875=m
> CONFIG_EEPROM_93CX6=m
> # CONFIG_EEPROM_93XX46 is not set
> # CONFIG_EEPROM_IDT_89HPESX is not set
> # CONFIG_EEPROM_EE1004 is not set
> # end of EEPROM support
>
> # CONFIG_CB710_CORE is not set
>
> #
> # Texas Instruments shared transport line discipline
> #
> # CONFIG_TI_ST is not set
> # end of Texas Instruments shared transport line discipline
>
> # CONFIG_SENSORS_LIS3_I2C is not set
> # CONFIG_ALTERA_STAPL is not set
> CONFIG_INTEL_MEI=m
> CONFIG_INTEL_MEI_ME=m
> # CONFIG_INTEL_MEI_TXE is not set
> # CONFIG_INTEL_MEI_GSC is not set
> # CONFIG_INTEL_MEI_HDCP is not set
> # CONFIG_INTEL_MEI_PXP is not set
> # CONFIG_VMWARE_VMCI is not set
> # CONFIG_GENWQE is not set
> # CONFIG_ECHO is not set
> # CONFIG_BCM_VK is not set
> # CONFIG_MISC_ALCOR_PCI is not set
> # CONFIG_MISC_RTSX_PCI is not set
> # CONFIG_MISC_RTSX_USB is not set
> # CONFIG_UACCE is not set
> CONFIG_PVPANIC=y
> # CONFIG_PVPANIC_MMIO is not set
> # CONFIG_PVPANIC_PCI is not set
> # CONFIG_GP_PCI1XXXX is not set
> # end of Misc devices
>
> #
> # SCSI device support
> #
> CONFIG_SCSI_MOD=y
> CONFIG_RAID_ATTRS=m
> CONFIG_SCSI_COMMON=y
> CONFIG_SCSI=y
> CONFIG_SCSI_DMA=y
> CONFIG_SCSI_NETLINK=y
> CONFIG_SCSI_PROC_FS=y
>
> #
> # SCSI support type (disk, tape, CD-ROM)
> #
> CONFIG_BLK_DEV_SD=m
> CONFIG_CHR_DEV_ST=m
> CONFIG_BLK_DEV_SR=m
> CONFIG_CHR_DEV_SG=m
> CONFIG_BLK_DEV_BSG=y
> CONFIG_CHR_DEV_SCH=m
> CONFIG_SCSI_ENCLOSURE=m
> CONFIG_SCSI_CONSTANTS=y
> CONFIG_SCSI_LOGGING=y
> CONFIG_SCSI_SCAN_ASYNC=y
>
> #
> # SCSI Transports
> #
> CONFIG_SCSI_SPI_ATTRS=m
> CONFIG_SCSI_FC_ATTRS=m
> CONFIG_SCSI_ISCSI_ATTRS=m
> CONFIG_SCSI_SAS_ATTRS=m
> CONFIG_SCSI_SAS_LIBSAS=m
> CONFIG_SCSI_SAS_ATA=y
> CONFIG_SCSI_SAS_HOST_SMP=y
> CONFIG_SCSI_SRP_ATTRS=m
> # end of SCSI Transports
>
> CONFIG_SCSI_LOWLEVEL=y
> # CONFIG_ISCSI_TCP is not set
> # CONFIG_ISCSI_BOOT_SYSFS is not set
> # CONFIG_SCSI_CXGB3_ISCSI is not set
> # CONFIG_SCSI_CXGB4_ISCSI is not set
> # CONFIG_SCSI_BNX2_ISCSI is not set
> # CONFIG_BE2ISCSI is not set
> # CONFIG_BLK_DEV_3W_XXXX_RAID is not set
> # CONFIG_SCSI_HPSA is not set
> # CONFIG_SCSI_3W_9XXX is not set
> # CONFIG_SCSI_3W_SAS is not set
> # CONFIG_SCSI_ACARD is not set
> # CONFIG_SCSI_AACRAID is not set
> # CONFIG_SCSI_AIC7XXX is not set
> # CONFIG_SCSI_AIC79XX is not set
> # CONFIG_SCSI_AIC94XX is not set
> # CONFIG_SCSI_MVSAS is not set
> # CONFIG_SCSI_MVUMI is not set
> # CONFIG_SCSI_ADVANSYS is not set
> # CONFIG_SCSI_ARCMSR is not set
> # CONFIG_SCSI_ESAS2R is not set
> CONFIG_MEGARAID_NEWGEN=y
> CONFIG_MEGARAID_MM=m
> CONFIG_MEGARAID_MAILBOX=m
> CONFIG_MEGARAID_LEGACY=m
> CONFIG_MEGARAID_SAS=m
> CONFIG_SCSI_MPT3SAS=m
> CONFIG_SCSI_MPT2SAS_MAX_SGE=128
> CONFIG_SCSI_MPT3SAS_MAX_SGE=128
> # CONFIG_SCSI_MPT2SAS is not set
> # CONFIG_SCSI_MPI3MR is not set
> # CONFIG_SCSI_SMARTPQI is not set
> # CONFIG_SCSI_HPTIOP is not set
> # CONFIG_SCSI_BUSLOGIC is not set
> # CONFIG_SCSI_MYRB is not set
> # CONFIG_SCSI_MYRS is not set
> # CONFIG_VMWARE_PVSCSI is not set
> CONFIG_HYPERV_STORAGE=m
> # CONFIG_LIBFC is not set
> # CONFIG_SCSI_SNIC is not set
> # CONFIG_SCSI_DMX3191D is not set
> # CONFIG_SCSI_FDOMAIN_PCI is not set
> CONFIG_SCSI_ISCI=m
> # CONFIG_SCSI_IPS is not set
> # CONFIG_SCSI_INITIO is not set
> # CONFIG_SCSI_INIA100 is not set
> # CONFIG_SCSI_PPA is not set
> # CONFIG_SCSI_IMM is not set
> # CONFIG_SCSI_STEX is not set
> # CONFIG_SCSI_SYM53C8XX_2 is not set
> # CONFIG_SCSI_IPR is not set
> # CONFIG_SCSI_QLOGIC_1280 is not set
> # CONFIG_SCSI_QLA_FC is not set
> # CONFIG_SCSI_QLA_ISCSI is not set
> # CONFIG_SCSI_LPFC is not set
> # CONFIG_SCSI_DC395x is not set
> # CONFIG_SCSI_AM53C974 is not set
> # CONFIG_SCSI_WD719X is not set
> CONFIG_SCSI_DEBUG=m
> # CONFIG_SCSI_PMCRAID is not set
> # CONFIG_SCSI_PM8001 is not set
> # CONFIG_SCSI_BFA_FC is not set
> # CONFIG_SCSI_VIRTIO is not set
> # CONFIG_SCSI_CHELSIO_FCOE is not set
> CONFIG_SCSI_DH=y
> CONFIG_SCSI_DH_RDAC=y
> CONFIG_SCSI_DH_HP_SW=y
> CONFIG_SCSI_DH_EMC=y
> CONFIG_SCSI_DH_ALUA=y
> # end of SCSI device support
>
> CONFIG_ATA=m
> CONFIG_SATA_HOST=y
> CONFIG_PATA_TIMINGS=y
> CONFIG_ATA_VERBOSE_ERROR=y
> CONFIG_ATA_FORCE=y
> CONFIG_ATA_ACPI=y
> # CONFIG_SATA_ZPODD is not set
> CONFIG_SATA_PMP=y
>
> #
> # Controllers with non-SFF native interface
> #
> CONFIG_SATA_AHCI=m
> CONFIG_SATA_MOBILE_LPM_POLICY=0
> CONFIG_SATA_AHCI_PLATFORM=m
> # CONFIG_AHCI_DWC is not set
> # CONFIG_SATA_INIC162X is not set
> # CONFIG_SATA_ACARD_AHCI is not set
> # CONFIG_SATA_SIL24 is not set
> CONFIG_ATA_SFF=y
>
> #
> # SFF controllers with custom DMA interface
> #
> # CONFIG_PDC_ADMA is not set
> # CONFIG_SATA_QSTOR is not set
> # CONFIG_SATA_SX4 is not set
> CONFIG_ATA_BMDMA=y
>
> #
> # SATA SFF controllers with BMDMA
> #
> CONFIG_ATA_PIIX=m
> # CONFIG_SATA_DWC is not set
> # CONFIG_SATA_MV is not set
> # CONFIG_SATA_NV is not set
> # CONFIG_SATA_PROMISE is not set
> # CONFIG_SATA_SIL is not set
> # CONFIG_SATA_SIS is not set
> # CONFIG_SATA_SVW is not set
> # CONFIG_SATA_ULI is not set
> # CONFIG_SATA_VIA is not set
> # CONFIG_SATA_VITESSE is not set
>
> #
> # PATA SFF controllers with BMDMA
> #
> # CONFIG_PATA_ALI is not set
> # CONFIG_PATA_AMD is not set
> # CONFIG_PATA_ARTOP is not set
> # CONFIG_PATA_ATIIXP is not set
> # CONFIG_PATA_ATP867X is not set
> # CONFIG_PATA_CMD64X is not set
> # CONFIG_PATA_CYPRESS is not set
> # CONFIG_PATA_EFAR is not set
> # CONFIG_PATA_HPT366 is not set
> # CONFIG_PATA_HPT37X is not set
> # CONFIG_PATA_HPT3X2N is not set
> # CONFIG_PATA_HPT3X3 is not set
> # CONFIG_PATA_IT8213 is not set
> # CONFIG_PATA_IT821X is not set
> # CONFIG_PATA_JMICRON is not set
> # CONFIG_PATA_MARVELL is not set
> # CONFIG_PATA_NETCELL is not set
> # CONFIG_PATA_NINJA32 is not set
> # CONFIG_PATA_NS87415 is not set
> # CONFIG_PATA_OLDPIIX is not set
> # CONFIG_PATA_OPTIDMA is not set
> # CONFIG_PATA_PDC2027X is not set
> # CONFIG_PATA_PDC_OLD is not set
> # CONFIG_PATA_RADISYS is not set
> # CONFIG_PATA_RDC is not set
> # CONFIG_PATA_SCH is not set
> # CONFIG_PATA_SERVERWORKS is not set
> # CONFIG_PATA_SIL680 is not set
> # CONFIG_PATA_SIS is not set
> # CONFIG_PATA_TOSHIBA is not set
> # CONFIG_PATA_TRIFLEX is not set
> # CONFIG_PATA_VIA is not set
> # CONFIG_PATA_WINBOND is not set
>
> #
> # PIO-only SFF controllers
> #
> # CONFIG_PATA_CMD640_PCI is not set
> # CONFIG_PATA_MPIIX is not set
> # CONFIG_PATA_NS87410 is not set
> # CONFIG_PATA_OPTI is not set
> # CONFIG_PATA_RZ1000 is not set
> # CONFIG_PATA_PARPORT is not set
>
> #
> # Generic fallback / legacy drivers
> #
> # CONFIG_PATA_ACPI is not set
> CONFIG_ATA_GENERIC=m
> # CONFIG_PATA_LEGACY is not set
> CONFIG_MD=y
> CONFIG_BLK_DEV_MD=y
> CONFIG_MD_AUTODETECT=y
> CONFIG_MD_LINEAR=m
> CONFIG_MD_RAID0=m
> CONFIG_MD_RAID1=m
> CONFIG_MD_RAID10=m
> CONFIG_MD_RAID456=m
> # CONFIG_MD_MULTIPATH is not set
> CONFIG_MD_FAULTY=m
> # CONFIG_BCACHE is not set
> CONFIG_BLK_DEV_DM_BUILTIN=y
> CONFIG_BLK_DEV_DM=m
> CONFIG_DM_DEBUG=y
> CONFIG_DM_BUFIO=m
> # CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
> CONFIG_DM_BIO_PRISON=m
> CONFIG_DM_PERSISTENT_DATA=m
> # CONFIG_DM_UNSTRIPED is not set
> CONFIG_DM_CRYPT=m
> CONFIG_DM_SNAPSHOT=m
> CONFIG_DM_THIN_PROVISIONING=m
> CONFIG_DM_CACHE=m
> CONFIG_DM_CACHE_SMQ=m
> CONFIG_DM_WRITECACHE=m
> # CONFIG_DM_EBS is not set
> CONFIG_DM_ERA=m
> # CONFIG_DM_CLONE is not set
> CONFIG_DM_MIRROR=m
> CONFIG_DM_LOG_USERSPACE=m
> CONFIG_DM_RAID=m
> CONFIG_DM_ZERO=m
> CONFIG_DM_MULTIPATH=m
> CONFIG_DM_MULTIPATH_QL=m
> CONFIG_DM_MULTIPATH_ST=m
> # CONFIG_DM_MULTIPATH_HST is not set
> # CONFIG_DM_MULTIPATH_IOA is not set
> CONFIG_DM_DELAY=m
> # CONFIG_DM_DUST is not set
> CONFIG_DM_UEVENT=y
> CONFIG_DM_FLAKEY=m
> CONFIG_DM_VERITY=m
> # CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
> # CONFIG_DM_VERITY_FEC is not set
> CONFIG_DM_SWITCH=m
> CONFIG_DM_LOG_WRITES=m
> CONFIG_DM_INTEGRITY=m
> CONFIG_DM_AUDIT=y
> # CONFIG_TARGET_CORE is not set
> # CONFIG_FUSION is not set
>
> #
> # IEEE 1394 (FireWire) support
> #
> CONFIG_FIREWIRE=m
> CONFIG_FIREWIRE_OHCI=m
> CONFIG_FIREWIRE_SBP2=m
> CONFIG_FIREWIRE_NET=m
> # CONFIG_FIREWIRE_NOSY is not set
> # end of IEEE 1394 (FireWire) support
>
> CONFIG_MACINTOSH_DRIVERS=y
> CONFIG_MAC_EMUMOUSEBTN=y
After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the last version of the selftests is supposed to run on old
kernels without issues.
Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace.
In addition to that, the current MPTCP selftests run a lot of different
sub-tests but the TAP13 protocol used in the selftests doesn't support
sub-tests: one failure in sub-tests implies that the whole selftest is
seen as failed at the end because sub-tests are not tracked. It is then
important to skip sub-tests not supported by old kernels.
To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftests or just some sub-tests before starting them. This cannot be
applied in all cases.
This second part focuses on marking different sub-tests as skipped if
some MPTCP features are not supported. A few techniques are used here:
- Before starting some tests:
- Check if a file (sysctl knob) is present: that's what patch 13/14 is
doing for the userspace PM feature.
- Check if a symbol is present in /proc/kallsyms: patch 1/14 adds some
helpers in mptcp_lib.sh to ease its use. Then these helpers are used
in patches 2, 3, 4, 10, 11 and 14/14.
- Set a flag and get the status to check if a feature is supported:
patch 8/14 is doing that with the 'fullmesh' flag.
- After having launched the tests:
- Retrieve the counters after a test and check if they are different
than 0. Similar to the check with the flag, that's not ideal but in
this case, the counters were already present before the introduction
of MPTCP but they have been supported by MPTCP sockets only later.
Patches 5 and 6/14 are using this technique.
Before skipping tests, SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var
value is checked: if it is set to 1, the test is marked as "failed"
instead of "skipped". MPTCP public CI expects to have all features
supported and it sets this env var to 1 to catch regressions in these
new checks.
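
The checks described above can be sketched roughly as follows. This is only
an illustration in Python, not the helpers added to mptcp_lib.sh: the function
names are hypothetical, and it assumes the usual /proc/kallsyms layout of
"address type name [module]" per line. The values 1 and 4 are the kselftest
exit codes for "fail" and "skip".

```python
import os

KSFT_FAIL = 1  # kselftest exit code for a failed test
KSFT_SKIP = 4  # kselftest exit code for a skipped test


def kallsyms_has(symbol, lines):
    """Return True if `symbol` is an exact symbol name in kallsyms-style lines."""
    for line in lines:
        fields = line.split()
        # Assumed layout: "<address> <type> <name> [module]"
        if len(fields) >= 3 and fields[2] == symbol:
            return True
    return False


def expect_all_features(env=None):
    """True when SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES is set to 1."""
    env = os.environ if env is None else env
    return env.get("SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES") == "1"


def outcome_if_missing(env=None):
    """A missing feature is a failure on CI, a skip everywhere else."""
    return KSFT_FAIL if expect_all_features(env) else KSFT_SKIP


def check_feature(symbol, lines, env=None):
    """Return None if the sub-test can run, else the exit code to report."""
    if kallsyms_has(symbol, lines):
        return None
    return outcome_if_missing(env)
```

On a real system the lines would come from open("/proc/kallsyms"); passing
them in as an argument keeps the sketch testable without root access.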
Patches 7/14 and 9/14 are a bit different because they don't skip tests:
- Patch 7/14 retrieves the default values instead of using hardcoded
ones because these default values have been modified at some points.
Then the comparisons are done with the default values.
- Patch 9/14 relaxes the expected returned size from MPTCP's getsockopt
because the different structures gathering various info can get new
fields and get bigger over time. We cannot expect that the userspace
is using the same structure as the kernel.
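
A relaxed size check of the kind described for patch 9/14 might look like the
sketch below. It is an assumption about the shape of such a check, not the
code in mptcp_sockopt.c: the helper name and the byte counts are hypothetical.
The idea is that an older kernel may fill in fewer bytes than the structure
known to userspace, so any length between the oldest known layout and the
local structure size is accepted.

```python
def returned_size_ok(returned, minimum, local_size):
    """Accept any length from the oldest known layout up to the local struct size."""
    return minimum <= returned <= local_size


# Hypothetical numbers: the oldest layout is 16 bytes, userspace knows 24.
for returned in (8, 16, 20, 24, 32):
    print(returned, returned_size_ok(returned, minimum=16, local_size=24))
```

A strict equality check would reject the 16 and 20 byte cases even though the
kernel behaved correctly for its version.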
Patch 12/14 marks the test as "skipped" instead of "failed" if the "ip"
tool is not available.
In this second part, the "mptcp_join" selftest is not modified yet. This
will come soon after in the third part with quite a few patches.
Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrb… [1]
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (14):
selftests: mptcp: lib: skip if missing symbol
selftests: mptcp: connect: skip transp tests if not supported
selftests: mptcp: connect: skip disconnect tests if not supported
selftests: mptcp: connect: skip TFO tests if not supported
selftests: mptcp: diag: skip listen tests if not supported
selftests: mptcp: diag: skip inuse tests if not supported
selftests: mptcp: pm nl: remove hardcoded default limits
selftests: mptcp: pm nl: skip fullmesh flag checks if not supported
selftests: mptcp: sockopt: relax expected returned size
selftests: mptcp: sockopt: skip getsockopt checks if not supported
selftests: mptcp: sockopt: skip TCP_INQ checks if not supported
selftests: mptcp: userspace pm: skip if 'ip' tool is unavailable
selftests: mptcp: userspace pm: skip if not supported
selftests: mptcp: userspace pm: skip PM listener events tests if unavailable
tools/testing/selftests/net/mptcp/config | 1 +
tools/testing/selftests/net/mptcp/diag.sh | 42 +++++++++-------------
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 20 +++++++++++
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 38 ++++++++++++++++++++
tools/testing/selftests/net/mptcp/mptcp_sockopt.c | 18 ++++++----
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 20 +++++++++--
tools/testing/selftests/net/mptcp/pm_netlink.sh | 27 ++++++++------
tools/testing/selftests/net/mptcp/userspace_pm.sh | 13 ++++++-
8 files changed, 135 insertions(+), 44 deletions(-)
---
base-commit: 6c0ec7ab5aaff3706657dd4946798aed483b9471
change-id: 20230608-upstream-net-20230608-mptcp-selftests-support-old-kernels-part-2-6e337e1f047d
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Hi,
Enclosed are a pair of patches for an oops that can occur if an exception is
generated while a bpf subprogram is running. One of the bpf_prog_aux entries
for the subprograms is missing an extable. This can turn an exception that
would otherwise be handled into a NULL pointer bug.
These changes were tested via the verifier and progs selftests and no
regressions were observed.
Changes from v3:
- Selftest style fixups (Feedback from Yonghong Song)
- Selftest needs to assert that test bpf program executed (Feedback from
Yonghong Song)
- Selftest should combine open and load using open_and_load (Feedback from
Yonghong Song)
Changes from v2:
- Insert only the main program's kallsyms (Feedback from Yonghong Song and
Alexei Starovoitov)
- Selftest should use ASSERT instead of CHECK (Feedback from Yonghong Song)
- Selftest needs some cleanup (Feedback from Yonghong Song)
- Switch patch order (Feedback from Alexei Starovoitov)
Changes from v1:
- Add a selftest (Feedback From Alexei Starovoitov)
- Move to a 1-line verifier change instead of searching multiple extables
Krister Johansen (2):
bpf: ensure main program has an extable
selftests/bpf: add a test for subprogram extables
kernel/bpf/verifier.c | 6 ++-
.../bpf/prog_tests/subprogs_extable.c | 29 +++++++++++
.../bpf/progs/test_subprogs_extable.c | 51 +++++++++++++++++++
3 files changed, 84 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
--
2.25.1
For cases like IPv6 addresses, having a means to supply tracing
predicates for fields with more than 8 bytes would be convenient.
This series provides a simple way to support this by allowing
simple ==, != memory comparison with the predicate supplied when
the size of the field exceeds 8 bytes. For example, to trace
::1, the predicate
"dst == 0x00000000000000000000000000000001"
..could be used. Patch 1 implements this.
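
As an illustration, such a hex predicate can be generated from a textual
address rather than typed by hand. The sketch below uses Python's standard
ipaddress module; the helper name is hypothetical, and it assumes the hex
digits are compared byte by byte in memory (network) order, which is how the
example predicate above reads.

```python
import ipaddress


def ipv6_hex_predicate(field, address, op="=="):
    """Build a '<field> <op> 0x<32 hex digits>' filter string for an IPv6 address."""
    if op not in ("==", "!="):
        raise ValueError("only == and != are supported for fields over 8 bytes")
    packed = ipaddress.IPv6Address(address).packed  # 16 bytes, network order
    return f"{field} {op} 0x{packed.hex()}"


print(ipv6_hex_predicate("dst", "::1"))
```

The 16 packed bytes always expand to 32 hex digits, which avoids miscounting
zeros when writing the predicate manually.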
As a convenience, support for IPv4, IPv6 and MAC addresses is
also included; patches 2-4 cover these and allow simpler
comparisons which do not require getting the exact number of
bytes right; for example
"dst == ::1"
"src != 127.0.0.1"
"mac_addr == ab:cd:ef:01:23:45"
Patch 5 adds tests for existing and new filter predicates, and patch 6
documents that only == and != are supported for the various address
types and for the >8 byte memory comparison.
Changes since v1 [1]:
- added support for IPv4, IPv6 and MAC addresses (patches 2-4)
(Masami and Steven)
- added selftests for IPv4, IPv6 and MAC addresses and updated
docs accordingly (patches 5,6)
Changes since RFC [2]:
- originally a fix was intermixed with the new functionality as
patch 1 in series [2]; the fix landed separately
- small tweaks to how filter predicates are defined via fn_num as
opposed to via fn directly
[1] https://lore.kernel.org/linux-trace-kernel/1682414197-13173-1-git-send-emai…
[2] https://lore.kernel.org/lkml/1659910883-18223-1-git-send-email-alan.maguire…
Alan Maguire (6):
tracing: support > 8 byte array filter predicates
tracing: support IPv4 address filter predicate
tracing: support IPv6 filter predicates
tracing: support MAC address filter predicates
selftests/ftrace: add test coverage for filter predicates
tracing: document IPv4, IPv6, MAC address and > 8 byte numeric
filtering support
Documentation/trace/events.rst | 21 +++
kernel/trace/trace_events_filter.c | 164 +++++++++++++++++-
.../selftests/ftrace/test.d/event/filter.tc | 91 ++++++++++
3 files changed, 275 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc
--
2.31.1
Some test cases from net/tls, net/fcnal-test and net/vrf-xfrm-tests
rely on cryptographic functions and use algorithms that are not
FIPS-compliant, so they fail in FIPS mode.
In order to allow these tests to pass in a wider set of kernels,
- for net/tls, skip the test variants that use the ChaCha20-Poly1305
and SM4 algorithms, when FIPS mode is enabled;
- for net/fcnal-test, skip the MD5 tests, when FIPS mode is enabled;
- for net/vrf-xfrm-tests, replace the algorithms that are not
FIPS-compliant with compliant ones.
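
The skip logic above can be sketched as follows. This is a Python
illustration, not the code in tls.c: the function names and the variant table
are hypothetical. It assumes /proc/sys/crypto/fips_enabled contains "1" when
FIPS mode is on, and it treats a missing file as "not in FIPS mode".

```python
def read_fips_enabled(path="/proc/sys/crypto/fips_enabled"):
    """Return True if the kernel reports FIPS mode; False if off or unknown."""
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False  # file absent or unreadable: assume FIPS mode is off


def variants_to_run(variants, fips_enabled):
    """Drop variants flagged fips_non_compliant when FIPS mode is on."""
    return [name for name, fips_non_compliant in variants
            if not (fips_enabled and fips_non_compliant)]


# Hypothetical variant table: (name, fips_non_compliant)
VARIANTS = [
    ("aes_gcm", False),
    ("chacha20_poly1305", True),
    ("sm4_gcm", True),
]

FIPS_ENABLED = read_fips_enabled()  # read once, reuse for every variant
print(variants_to_run(VARIANTS, FIPS_ENABLED))
```

Keeping the flag next to each variant means the decision to skip is made in
one place instead of being repeated in every test body.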
Changes in v2:
- Add R-b tags.
- Put fips_non_compliant into the variants.
- Turn fips_enabled into a static global variable.
- Read /proc/sys/crypto/fips_enabled only once at main().
v1: https://lore.kernel.org/netdev/20230607174302.19542-1-magali.lemes@canonica…
Magali Lemes (3):
selftests: net: tls: check if FIPS mode is enabled
selftests: net: vrf-xfrm-tests: change authentication and encryption
algos
selftests: net: fcnal-test: check if FIPS mode is enabled
tools/testing/selftests/net/fcnal-test.sh | 27 ++-
tools/testing/selftests/net/tls.c | 175 +++++++++++++++++-
tools/testing/selftests/net/vrf-xfrm-tests.sh | 32 ++--
3 files changed, 209 insertions(+), 25 deletions(-)
--
2.34.1
Currently the config fragment for cpufreq enables a lot of generic
lock debugging. While these options are useful when testing cpufreq
they aren't actually required to run the tests and are therefore out of
scope for the cpufreq fragment; they are more of a thing that it's good
to enable while doing testing than an actual requirement for cpufreq
testing specifically. Having these debugging options enabled,
especially the mutex and spinlock instrumentation, means that any build
that includes the cpufreq fragment is both very much larger than a
standard defconfig (eg, I'm seeing 35% on x86_64) and also slower at
runtime.
This is causing real problems for CI systems. In order to avoid
building large numbers of kernels they try to group kselftest fragments
together, frequently just grouping all the kselftest fragments into a
single block. The increased size is an issue for memory constrained
systems and is also problematic for systems with fixed storage
allocations for kernel images (eg, typical u-boot systems) where it
frequently causes the kernel to overflow the storage space allocated for
kernels. The reduced performance isn't too bad with real hardware but
can be disruptive on emulated platforms.
In order to avoid these issues remove these generic instrumentation
options from the cpufreq fragment, bringing the cpufreq fragment into
line with other fragments which generally set requirements for testing
rather than nice to haves.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/cpufreq/config | 8 --------
1 file changed, 8 deletions(-)
diff --git a/tools/testing/selftests/cpufreq/config b/tools/testing/selftests/cpufreq/config
index 75e900793e8a..ce5068f5a6a2 100644
--- a/tools/testing/selftests/cpufreq/config
+++ b/tools/testing/selftests/cpufreq/config
@@ -5,11 +5,3 @@ CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
-CONFIG_DEBUG_RT_MUTEXES=y
-CONFIG_DEBUG_PLIST=y
-CONFIG_DEBUG_SPINLOCK=y
-CONFIG_DEBUG_MUTEXES=y
-CONFIG_DEBUG_LOCK_ALLOC=y
-CONFIG_PROVE_LOCKING=y
-CONFIG_LOCKDEP=y
-CONFIG_DEBUG_ATOMIC_SLEEP=y
---
base-commit: ac9a78681b921877518763ba0e89202254349d1b
change-id: 20230605-kselftest-cpufreq-options-2fd6d4742333
Best regards,
--
Mark Brown <broonie(a)kernel.org>
While KUnit tests that cannot be built as a loadable module must depend
on "KUNIT=y", this is not true for modular tests, where it adds an
unnecessary limitation.
Fix this by relaxing the dependency to "KUNIT".
Fixes: 08809e482a1c44d9 ("HID: uclogic: KUnit best practices and naming conventions")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Reviewed-by: David Gow <davidgow(a)google.com>
Reviewed-by: José Expósito <jose.exposito89(a)gmail.com>
---
v2:
- Add Reviewed-by.
---
drivers/hid/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
index 4ce012f83253ec9f..b977450cac75265d 100644
--- a/drivers/hid/Kconfig
+++ b/drivers/hid/Kconfig
@@ -1285,7 +1285,7 @@ config HID_MCP2221
config HID_KUNIT_TEST
tristate "KUnit tests for HID" if !KUNIT_ALL_TESTS
- depends on KUNIT=y
+ depends on KUNIT
depends on HID_BATTERY_STRENGTH
depends on HID_UCLOGIC
default KUNIT_ALL_TESTS
--
2.34.1
From: Menglong Dong <imagedong(a)tencent.com>
For now, BPF programs of type BPF_PROG_TYPE_TRACING can only be attached
to kernel functions whose argument count is less than 6. This is not
friendly at all, as many functions take more than 6 arguments.
Therefore, let's enhance it by increasing the function argument count
allowed in arch_prepare_bpf_trampoline(), for now only on x86_64.
In the 1st patch, we make arch_prepare_bpf_trampoline() support copying
function arguments that live on the stack for the x86 arch. Therefore, the
maximum argument count can be up to MAX_BPF_FUNC_ARGS for FENTRY and FEXIT.
In the 2nd patch, we clean the garbage value in the upper bytes of the
trampoline stack when we store the arguments from regs onto the stack.
The 3rd patch adds the testcases for the 1st patch.
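To make the register/stack split behind the 1st patch concrete, here is a
small sketch of where integer arguments live under the x86-64 System V calling
convention. It does not describe the trampoline's own stack layout, only what
a callee sees after "push rbp; mov rbp, rsp". The helper name and table are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_REG_ARGS 6

/* Integer argument registers of the x86-64 System V calling convention. */
static const char *const demo_arg_regs[DEMO_NR_REG_ARGS] = {
	"rdi", "rsi", "rdx", "rcx", "r8", "r9",
};

/*
 * Hypothetical helper: for the idx-th (0-based) integer argument, return
 * the register name, or NULL when the argument is passed on the stack.
 * For stack arguments, *rbp_off receives the offset from rbp in a callee
 * that has executed "push rbp; mov rbp, rsp".
 */
static const char *demo_arg_location(int idx, int *rbp_off)
{
	assert(idx >= 0 && rbp_off != NULL);
	if (idx < DEMO_NR_REG_ARGS) {
		*rbp_off = 0;
		return demo_arg_regs[idx];
	}
	/* rbp+0: saved rbp, rbp+8: return address, rbp+16: first stack arg */
	*rbp_off = 16 + 8 * (idx - DEMO_NR_REG_ARGS);
	return NULL;
}
```

So the 7th argument (idx 6) sits at rbp+16 and the 8th at rbp+24, which is
the kind of data a trampoline has to copy once more than six arguments are
allowed.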
Changes since v2:
- keep MAX_BPF_FUNC_ARGS still
- clean garbage value in upper bytes in the 2nd patch
- move bpf_fentry_test{7,12} to bpf_testmod.c and rename them to
bpf_testmod_fentry_test{7,12} meanwhile in the 3rd patch
Changes since v1:
- change the maximum function arguments to 14 from 12
- add testcases (Jiri Olsa)
- use EMIT3_off32 instead of EMIT4 for "lea" to prevent overflow
Menglong Dong (3):
bpf, x86: allow function arguments up to 12 for TRACING
bpf, x86: clean garbage value in the stack of trampoline
selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
arch/x86/net/bpf_jit_comp.c | 105 +++++++++++++++---
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 19 +++-
.../selftests/bpf/prog_tests/fentry_fexit.c | 4 +-
.../selftests/bpf/prog_tests/fentry_test.c | 2 +
.../selftests/bpf/prog_tests/fexit_test.c | 2 +
.../testing/selftests/bpf/progs/fentry_test.c | 21 ++++
.../testing/selftests/bpf/progs/fexit_test.c | 33 ++++++
7 files changed, 169 insertions(+), 17 deletions(-)
--
2.40.1
Some test cases from net/tls, net/fcnal-test and net/vrf-xfrm-tests
that rely on cryptographic functions to work and use non-compliant FIPS
algorithms fail in FIPS mode.
In order to allow these tests to pass in a wider set of kernels,
- for net/tls, skip the test variants that use the ChaCha20-Poly1305
and SM4 algorithms, when FIPS mode is enabled;
- for net/fcnal-test, skip the MD5 tests, when FIPS mode is enabled;
- for net/vrf-xfrm-tests, replace the algorithms that are not
FIPS-compliant with compliant ones.
Magali Lemes (3):
selftests: net: tls: check if FIPS mode is enabled
selftests: net: vrf-xfrm-tests: change authentication and encryption
algos
selftests: net: fcnal-test: check if FIPS mode is enabled
tools/testing/selftests/net/fcnal-test.sh | 27 +-
tools/testing/selftests/net/tls.c | 265 +++++++++++++++++-
tools/testing/selftests/net/vrf-xfrm-tests.sh | 32 +--
3 files changed, 298 insertions(+), 26 deletions(-)
--
2.34.1
KVM_GET_REG_LIST dumps all register IDs that are available to
KVM_GET/SET_ONE_REG, which is very useful for identifying platform
regression issues during VM migration.
Patches 1-7 restructured the get-reg-list test in aarch64 to turn some
of the code into a common test framework that can be shared by riscv.
Patch 8 enabled the KVM_GET_REG_LIST API in riscv and patch 9-11 added
the corresponding kselftest for checking possible register regressions.
The get-reg-list kvm selftest was ported from aarch64 and tested with
Linux 6.4-rc1 on a Qemu riscv virt machine.
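For readers unfamiliar with the API, the usual two-step calling pattern of
KVM_GET_REG_LIST can be sketched as below: probe with n == 0, let the kernel
report the required count via E2BIG, then call again with a large enough
buffer. This is a minimal sketch and not code from the selftest;
demo_get_reg_list() is a hypothetical helper, and vcpu_fd is assumed to be a
vCPU file descriptor obtained through KVM_CREATE_VCPU.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: fetch the register ID list of a vCPU.
 * On success returns 0 and stores a malloc()ed list in *out (caller frees).
 * On failure returns a negative errno value and leaves *out untouched.
 */
static int demo_get_reg_list(int vcpu_fd, struct kvm_reg_list **out)
{
	struct kvm_reg_list probe = { .n = 0 };
	struct kvm_reg_list *list;

	assert(out != NULL);

	/* Probe: with n == 0 the kernel fails with E2BIG and fills in n. */
	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, &probe) == 0)
		return -ENODATA;	/* no registers reported at all */
	if (errno != E2BIG)
		return -errno;

	list = calloc(1, sizeof(*list) + probe.n * sizeof(list->reg[0]));
	if (!list)
		return -ENOMEM;
	list->n = probe.n;

	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, list) < 0) {
		int err = errno;

		free(list);
		return -err;
	}
	*out = list;
	return 0;
}
```

Each entry in list->reg[] is an ID that can then be passed to
KVM_GET_ONE_REG or KVM_SET_ONE_REG, so comparing two such lists is one way to
spot registers that appeared or vanished between kernels.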
---
Changed since v1:
* rebase to Andrew's changes
* fix coding style
Andrew Jones (7):
KVM: arm64: selftests: Replace str_with_index with strdup_printf
KVM: arm64: selftests: Drop SVE cap check in print_reg
KVM: arm64: selftests: Remove print_reg's dependency on vcpu_config
KVM: arm64: selftests: Rename vcpu_config and add to kvm_util.h
KVM: arm64: selftests: Delete core_reg_fixup
KVM: arm64: selftests: Split get-reg-list test code
KVM: arm64: selftests: Finish generalizing get-reg-list
Haibo Xu (4):
KVM: riscv: Add KVM_GET_REG_LIST API support
KVM: riscv: selftests: Make check_supported arch specific
KVM: riscv: selftests: Skip some registers set operation
KVM: riscv: selftests: Add get-reg-list test
Documentation/virt/kvm/api.rst | 2 +-
arch/riscv/kvm/vcpu.c | 372 ++++++++++++
tools/testing/selftests/kvm/Makefile | 13 +-
.../selftests/kvm/aarch64/get-reg-list.c | 540 ++----------------
tools/testing/selftests/kvm/get-reg-list.c | 426 ++++++++++++++
.../selftests/kvm/include/kvm_util_base.h | 16 +
.../selftests/kvm/include/riscv/processor.h | 3 +
.../testing/selftests/kvm/include/test_util.h | 2 +
tools/testing/selftests/kvm/lib/test_util.c | 15 +
.../selftests/kvm/riscv/get-reg-list.c | 539 +++++++++++++++++
10 files changed, 1428 insertions(+), 500 deletions(-)
create mode 100644 tools/testing/selftests/kvm/get-reg-list.c
create mode 100644 tools/testing/selftests/kvm/riscv/get-reg-list.c
--
2.34.1
Hi,
Enclosed are a pair of patches for an oops that can occur if an exception is
generated while a bpf subprogram is running. One of the bpf_prog_aux entries
for the subprograms is missing an extable. This can lead to an exception that
would otherwise be handled turning into a NULL pointer bug.
The bulk of the change here is simply adding a pair of programs for the
selftest. The proposed fix in this iteration is a 1-line change.
These changes were tested via the verifier and progs selftests and no
regressions were observed.
Changes from v1:
- Add a selftest (Feedback From Alexei Starovoitov)
- Move to a 1-line verifier change instead of searching multiple extables
Krister Johansen (2):
Add a selftest for subprogram extables
bpf: ensure main program has an extable
kernel/bpf/verifier.c | 1 +
.../bpf/prog_tests/subprogs_extable.c | 35 +++++++++
.../bpf/progs/test_subprogs_extable.c | 71 +++++++++++++++++++
3 files changed, 107 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
--
2.25.1
Hi,
This series is on top of kvmarm/next as I needed to also modify Eager
page splitting logic in clear-dirty-log API. Eager page splitting is not
present in Linux 6.4-rc4.
Also, I had to change selftests patches (1 to 5) as some commits were
removed from kvm/queue remote. This caused issue due to different APIs
being present in dirty_log_perf_test when I was rebasing v2. Those
removed commits are now back in kvm-x86 branch of Sean [1] but not in
kvmarm/next or kvm/queue. I didn't want to wait for review of v2, so I
changed dirty_log_perf_test to work with kvmarm/next branch. When Sean's
kvm-x86 branch is merged, the selftests in this patch series will need to be
modified to use the new APIs, or whoever merges last will need to take care of
that.
This patch series modifies clear-dirty-log operation to run under MMU
read lock. It write-protects SPTEs and splits huge pages using the MMU read
lock instead of the MMU write lock.
Use of MMU read lock is made possible by using shared page table
walkers. Currently only page fault handlers use shared page table
walkers, with this series, clear-dirty-log operation will also use
shared page table walkers.
Patches 1 to 5:
These patches are modifying dirty_log_perf_test. Intent is to mimic
production scenarios where guest keeps on executing while userspace
thread collects and clears dirty logs independently.
Three new command line options are added:
1. j: Allows to run guest vCPUs and main thread collecting dirty logs
independently of each other after initialization is complete.
2. k: Allows to clear dirty logs in smaller chunks compared to existing
whole memslot clear in one call.
3. l: Allows to add customizable wait time between consecutive clear
dirty log calls to mimic sending dirty memory to destination.
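The chunked clearing behind option 2 can be pictured with the sketch below:
a memslot is walked in fixed-size chunks instead of being cleared in one
call. This is not the selftest's code. The callback type and helper name are
hypothetical, and the 64-page alignment requirement is an assumption based on
the alignment rule documented for KVM_CLEAR_DIRTY_LOG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: clear dirty state for [first_page, first_page + num_pages). */
typedef void (*demo_clear_fn)(uint64_t first_page, uint64_t num_pages, void *ctx);

/*
 * Hypothetical helper: walk a memslot of slot_pages pages in chunks of at
 * most chunk_pages pages and invoke clear() for each chunk, if non-NULL.
 * chunk_pages must be a non-zero multiple of 64 so that every chunk start
 * stays 64-page aligned; only the final chunk may be shorter.
 * Returns the number of chunks visited.
 */
static unsigned int demo_clear_in_chunks(uint64_t slot_pages, uint64_t chunk_pages,
					 demo_clear_fn clear, void *ctx)
{
	uint64_t first;
	unsigned int calls = 0;

	assert(chunk_pages != 0 && chunk_pages % 64 == 0);
	for (first = 0; first < slot_pages; first += chunk_pages) {
		uint64_t left = slot_pages - first;
		uint64_t count = left < chunk_pages ? left : chunk_pages;

		if (clear)
			clear(first, count, ctx);
		calls++;
	}
	return calls;
}
```

A wait inserted between two clear() calls would correspond to option 3,
mimicking the time spent sending dirty memory to the destination.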
Patch 7-16:
These patches refactor code to move MMU lock operations to arch specific
code, refactor Arm's page table walker APIs, and change MMU write lock
for clearing dirty logs to read lock. Patch 16 has results showing
improvements based on dirty_log_perf_test.
1. https://lore.kernel.org/lkml/168565341087.666819.6731422637224460050.b4-ty@…
v2:
- Fix compile warning for mips and riscv.
- Added logic to continue or retry shared page walk which are not fault
handler.
- Huge page split also changed to run under MMU read lock.
- Added more explanations in commit logs.
- Selftests is modified because a commit series was reverted back in
dirty_log_perf_test on kvm/queue.
v1: https://lore.kernel.org/lkml/20230421165305.804301-1-vipinsh@google.com/
Vipin Sharma (16):
KVM: selftests: Clear dirty logs in user defined chunks sizes in
dirty_log_perf_test
KVM: selftests: Add optional delay between consecutive clear-dirty-log
calls
KVM: selftests: Pass the count of read and write accesses from guest
to host
KVM: selftests: Print read-write progress by vCPUs in
dirty_log_perf_test
KVM: selftests: Allow independent execution of vCPUs in
dirty_log_perf_test
KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation
KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
KMV: arm64: Pass page table walker flags to stage2_apply_range_*()
KVM: arm64: Document the page table walker actions based on the
callback's return value
KVM: arm64: Return -ENOENT if PTE is not valid in stage2_attr_walker
KVM: arm64: Use KVM_PGTABLE_WALK_SHARED flag instead of
KVM_PGTABLE_WALK_HANDLE_FAULT
KVM: arm64: Retry shared page table walks outside of fault handler
KVM: arm64: Run clear-dirty-log under MMU read lock
KVM: arm64: Pass page walker flags from callers of stage 2 split
walker
KVM: arm64: Provide option to pass page walker flag for huge page
splits
KVM: arm64: Split huge pages during clear-dirty-log under MMU read
lock
arch/arm64/include/asm/kvm_pgtable.h | 42 +++--
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 4 +-
arch/arm64/kvm/hyp/pgtable.c | 68 ++++++--
arch/arm64/kvm/mmu.c | 65 +++++---
arch/mips/kvm/mmu.c | 2 +
arch/riscv/kvm/mmu.c | 2 +
arch/x86/kvm/mmu/mmu.c | 3 +
.../selftests/kvm/dirty_log_perf_test.c | 147 ++++++++++++++----
tools/testing/selftests/kvm/lib/memstress.c | 13 +-
virt/kvm/dirty_ring.c | 2 -
virt/kvm/kvm_main.c | 4 -
11 files changed, 265 insertions(+), 87 deletions(-)
base-commit: 532b2ecfa547f02b1825108711565eff026bce5a
--
2.41.0.rc0.172.g3f132b7071-goog
Hello Paul,
Thomas and Zhangjin have provided significant nolibc cleanups, and
fixes, as well as preparation work to later support riscv32.
These consist in the following main series:
- generalization of stackprotector to other archs that were not
previously supported (riscv, mips, loongarch, arm, arm64)
- general cleanups of the makefile, test report output, deduplication
of certain tests
- slightly better compliance of some tests performed on certain syscalls
(e.g. no longer pass (void*)1 to gettimeofday() since glibc hates it).
- add support for nanoseconds in stat() and statx()
- fixes for some syscalls (e.g. ppoll() has 5 arguments not 4)
- fixes around limits.h and INT_MAX / INT_FAST64_MAX
I rebased the whole series on top of your latest dev branch (d19a9ca3d5)
and it works fine for all archs.
I don't know if you're still planning on merging new stuff in this area
for 6.5 or not (since I know that it involves new series of tests on your
side as well), but given that Zhangjin will engage into deeper changes
later for riscv32 that will likely imply to update more syscalls to use
the time64 ones, I would prefer to split the cleanups from the hard stuff,
but I'll let you judge based on the current state of what's pending for
6.5.
In any case I'm putting all this here for now (not for merge yet):
git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git 20230604-nolibc-rv32+stkp6
I'd like Thomas and Zhangjin to perform a last check to confirm they're
OK with this final integration.
Thanks!
Willy
Fixes: 8e3ab529bef9 ("tools/nolibc/unistd: add syscall()")
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---
Hi, Willy
Since this may be OK for v6.5, I based it directly on your
20230606-nolibc-rv32+stkp7a branch.
This may conflict with the reviewed series [1]; if required, I can renew
that series too.
[1]: https://lore.kernel.org/linux-riscv/cover.1686135913.git.falcon@tinylab.org/
tools/include/nolibc/unistd.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/include/nolibc/unistd.h b/tools/include/nolibc/unistd.h
index c20b2fbf065e..0e832e10a0b2 100644
--- a/tools/include/nolibc/unistd.h
+++ b/tools/include/nolibc/unistd.h
@@ -66,10 +66,10 @@ int tcsetpgrp(int fd, pid_t pid)
_ret; \
})
-#define _sycall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
+#define _syscall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
#define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
-#define syscall(...) _syscall_n(_sycall_narg(__VA_ARGS__), ##__VA_ARGS__)
+#define syscall(...) _syscall_n(_syscall_narg(__VA_ARGS__), ##__VA_ARGS__)
/* make sure to include all global symbols */
#include "nolibc.h"
--
2.25.1
User space applications watch for timestamp changes on character device
files in order to determine idle time of a given terminal session. For
example, "w" program uses this information to populate the IDLE column
of its output [1]. Similarly, systemd-logind has optional feature where
it uses atime of the tty character device to determine if there was
activity on the terminal associated with the logind's session object. If
there was no activity for a configured period of time then logind will
terminate such session [2].
Now, usually (e.g. bash running on the terminal) the use of the terminal
will update timestamps (atime and mtime) on the corresponding terminal
character device. However, if access to the terminal, e.g. /dev/pts/0,
is performed through magic character device /dev/tty then such access
obviously changes the state of the terminal, however timestamps on the
device that correspond to the terminal (/dev/pts/0) are not updated.
This patch makes sure that we update timestamps on *all* character
devices that correspond to the given tty, because outside observers (w,
systemd-logind) are maybe checking these timestamps. Obviously, they can
not check timestamps on /dev/tty as that has per-process meaning.
[1] https://gitlab.com/procps-ng/procps/-/blob/v4.0.0/w.c#L286
[2] https://github.com/systemd/systemd/blob/v252/NEWS#L477
Signed-off-by: Michal Sekletar <msekleta(a)redhat.com>
---
drivers/tty/tty_io.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945fdad4..48e0148b0f3e 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -101,6 +101,7 @@
#include <linux/compat.h>
#include <linux/uaccess.h>
#include <linux/termios_internal.h>
+#include <linux/fs.h>
#include <linux/kbd_kern.h>
#include <linux/vt_kern.h>
@@ -811,18 +812,27 @@ void start_tty(struct tty_struct *tty)
}
EXPORT_SYMBOL(start_tty);
-static void tty_update_time(struct timespec64 *time)
+static void tty_update_time(struct tty_struct *tty, int tstamp)
{
+ struct tty_file_private *priv;
time64_t sec = ktime_get_real_seconds();
- /*
- * We only care if the two values differ in anything other than the
- * lower three bits (i.e every 8 seconds). If so, then we can update
- * the time of the tty device, otherwise it could be construded as a
- * security leak to let userspace know the exact timing of the tty.
- */
- if ((sec ^ time->tv_sec) & ~7)
- time->tv_sec = sec;
+ spin_lock(&tty->files_lock);
+ list_for_each_entry(priv, &tty->tty_files, list) {
+ struct file *filp = priv->file;
+ struct inode *inode = file_inode(filp);
+ struct timespec64 *time = tstamp == S_MTIME ? &inode->i_mtime : &inode->i_atime;
+
+ /*
+ * We only care if the two values differ in anything other than the
+ * lower three bits (i.e every 8 seconds). If so, then we can update
+ * the time of the tty device, otherwise it could be construded as a
+ * security leak to let userspace know the exact timing of the tty.
+ */
+ if ((sec ^ time->tv_sec) & ~7)
+ time->tv_sec = sec;
+ }
+ spin_unlock(&tty->files_lock);
}
/*
@@ -928,7 +938,7 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
tty_ldisc_deref(ld);
if (i > 0)
- tty_update_time(&inode->i_atime);
+ tty_update_time(tty, S_ATIME);
return i;
}
@@ -1036,7 +1046,7 @@ static inline ssize_t do_tty_write(
cond_resched();
}
if (written) {
- tty_update_time(&file_inode(file)->i_mtime);
+ tty_update_time(tty, S_MTIME);
ret = written;
}
out:
--
2.39.2
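The granularity check that tty_update_time() applies in the patch above can be
tried out in isolation. The sketch below uses plain integers instead of
struct timespec64 and a hypothetical demo_inode type instead of the kernel's
inode, so it only illustrates the "(sec ^ tv_sec) & ~7" logic and the idea of
walking every file attached to the tty, without any locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timestamp stored in an inode. */
struct demo_inode {
	int64_t tv_sec;
};

/*
 * Non-zero when now_sec and stored_sec differ in anything other than the
 * lower three bits, i.e. when they fall into different 8-second buckets.
 */
static int demo_time_needs_update(int64_t now_sec, int64_t stored_sec)
{
	return ((now_sec ^ stored_sec) & ~(int64_t)7) != 0;
}

/*
 * Apply the check to every inode in the array, mirroring the loop over
 * tty->tty_files. Returns how many timestamps were actually rewritten.
 */
static size_t demo_update_all(struct demo_inode *inodes, size_t n, int64_t now_sec)
{
	size_t i, updated = 0;

	assert(inodes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (demo_time_needs_update(now_sec, inodes[i].tv_sec)) {
			inodes[i].tv_sec = now_sec;
			updated++;
		}
	}
	return updated;
}
```

With now_sec = 16, an inode stamped at 8 is updated while one stamped at 17 is
left alone, since 16 and 17 share the same 8-second bucket.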
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
See the details in the commit message (TL/DR: under CAP_BPF, the
verifier can incorrectly conclude that a scalar is zero while in
fact it can be crafted to a predefined number.)
v1 and v2 were sent off-list.
v2 changes:
Added more tests, migrated them to inline asm, started using
bpf_get_prandom_u32, switched to a more bulletproof dead branch check
and modified the failing spill test scenarios so that an unauthorized
access attempt is performed in both branches.
v3 changes:
Dropped an improvement not necessary for the fix, changed the Fixes tag.
v4 changes:
Dropped supposedly redundant tests, kept the ones that result in
different verifier verdicts. Dropped the variable that is not yet
useful in this patch. Rephrased the commit message with Daniel's
suggestions.
Maxim Mikityanskiy (2):
bpf: Fix verifier id tracking of scalars on spill
selftests/bpf: Add test cases to assert proper ID tracking on spill
kernel/bpf/verifier.c | 3 +
.../selftests/bpf/progs/verifier_spill_fill.c | 79 +++++++++++++++++++
2 files changed, 82 insertions(+)
--
2.40.1
Willy, Thomas
This is the revision of the v2 syscall helpers [1], it is based on
20230606-nolibc-rv32+stkp7a of [2]. It doesn't conflict with the v4 of
-ENOSYS patchset [3], so, it is ok to simply merge both of them.
This revision mainly applies Thomas' method: it removes the __syscall()
helper and replaces it with __sysret(), because __syscall()
looks like _syscall() and syscall() and may mislead developers.
Changes from v2 -> v3:
* tools/nolibc: sys.h: add a syscall return helper
* The __syscall() is removed.
* Align the code style of __sysret() with the others, and use
__inline__ instead of inline (like stdlib.h) to let it work with
the default -std=c89 in tools/testing/selftests/nolibc/Makefile
* tools/nolibc: unistd.h: apply __sysret() helper
As v2.
* tools/nolibc: sys.h: apply __sysret() helper
Replaced __syscall() with __sysret() and merged two separate patches of v2 into one.
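The idea of a syscall return helper can be sketched as follows: a raw Linux
syscall reports failure as a negative errno value, and the helper converts
that into the usual libc convention of returning -1 with errno set. This is
only a sketch under those assumptions, not the helper from the series;
demo_sysret() is a hypothetical name, and it writes to the hosted libc errno
rather than using nolibc's own mechanism.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: map a raw syscall return value to the libc
 * convention. Non-negative values pass through unchanged; a negative
 * value is stored as a positive code in errno and -1 is returned.
 */
static long demo_sysret(long ret)
{
	if (ret < 0) {
		/* Sketch-only sanity check: raw errors lie in -4095..-1. */
		assert(ret >= -4095);
		errno = (int)-ret;
		return -1;
	}
	return ret;
}
```

With such a helper, a wrapper body shrinks to a single
"return demo_sysret(raw_call(...));" line, which would be consistent with
the large number of deletions in the sys.h diffstat.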
Did run-user tests for rv32 (with [3]), rv64 and arm64.
BTW, two questions for Thomas,
* This commit 659a49abc9c2 ("tools/nolibc: validate C89 compatibility")
enables -std=c89, why not gnu11 used by kernel ? ;-)
* Do we need to tune the order of the macros in unistd.h like this:
#define _syscall(N, ...) __sysret(my_syscall##N(__VA_ARGS__))
#define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
#define _sycall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
#define syscall(...) _syscall_n(_sycall_narg(__VA_ARGS__), ##__VA_ARGS__)
The current order works, but the macros are not listed in the order they are used:
#define _syscall(N, ...) __sysret(my_syscall##N(__VA_ARGS__))
#define _sycall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
#define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
#define syscall(...) _syscall_n(_sycall_narg(__VA_ARGS__), ##__VA_ARGS__)
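For reference, the argument-counting trick these macros rely on can be
exercised outside nolibc. In the sketch below all DEMO_* and demo_* names are
hypothetical, and demo_call0() to demo_call2() merely stand in for
my_syscall0() to my_syscall2(); it also skips the ##__VA_ARGS__ GNU extension
because a syscall number is always passed.

```c
#include <assert.h>

/* Hypothetical stand-ins for my_syscallN(): encode the number and arity. */
static long demo_record(long nr, int nargs)
{
	assert(nargs >= 0 && nargs <= 6);
	return nr * 10 + nargs;
}

static long demo_call0(long nr) { return demo_record(nr, 0); }
static long demo_call1(long nr, long a) { (void)a; return demo_record(nr, 1); }
static long demo_call2(long nr, long a, long b) { (void)a; (void)b; return demo_record(nr, 2); }

/* The shifted list 6..0 makes N equal the argument count minus the number. */
#define DEMO_NARG_(_0, _1, _2, _3, _4, _5, _6, N, ...) N
#define DEMO_NARG(...) DEMO_NARG_(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)

/* Two levels so that N is fully expanded before it is pasted with ##. */
#define DEMO_PASTE(N, ...) demo_call##N(__VA_ARGS__)
#define DEMO_DISPATCH(N, ...) DEMO_PASTE(N, __VA_ARGS__)
#define demo_syscall(...) DEMO_DISPATCH(DEMO_NARG(__VA_ARGS__), __VA_ARGS__)
```

Here demo_syscall(7, 1, 2) expands to demo_call2(7, 1, 2). Because macros are
only expanded at the point of use, the order of the #define lines does not
change the result as long as all of them precede the first use, so the
ordering question is one of readability only.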
Thanks.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1686036862.git.falcon@tinylab.org/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[3]: https://lore.kernel.org/linux-riscv/cover.1686128703.git.falcon@tinylab.org…
Zhangjin Wu (3):
tools/nolibc: sys.h: add a syscall return helper
tools/nolibc: unistd.h: apply __sysret() helper
tools/nolibc: sys.h: apply __sysret() helper
tools/include/nolibc/sys.h | 364 +++++-----------------------------
tools/include/nolibc/unistd.h | 11 +-
2 files changed, 55 insertions(+), 320 deletions(-)
--
2.25.1
*Changes in v17*
- Rebase on top of next-20230606
- Minor improvements in PAGEMAP_SCAN IOCTL patch
*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() syscall [1]. GetWriteWatch() retrieves the addresses of
the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications and games etc. It is
currently being emulated in a pretty slow manner in userspace. Our purpose is
to enhance the kernel such that it can be emulated efficiently.
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches.
So the whole gaming on Linux can effectively get benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use an ioctl on the pagemap file to read and/or reset the
soft-dirty flag. But using the soft-dirty flag, sometimes we get extra pages
which weren't even written. They had become soft-dirty because of VMA merging
and the VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We
were able to bypass this shortcoming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed whether we could revert these patches, but we could not reach any
conclusion. So at this point, I made a couple of attempts to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had the potential to cause
regressions. We left it behind.
* [8] Keep a list of the soft-dirty parts of a VMA across splits and merges. The
reply I got was not to increase the size of the VMA by 8 bytes.
At this point, we left soft-dirty behind, considering it too delicate, and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on the userfaultfd wp feature, where the
kernel resolves the faults itself when the WP_ASYNC feature is used. It was
straightforward to add the WP_ASYNC feature to userfaultfd. Now only those
pages which have really been written are reported as dirty or written-to. (PS:
another userfaultfd feature, WP_UNPOPULATED, is also required; it is
needed to avoid pre-faulting memory before write-protecting [9].)
All the different masks were added at the request of the CRIU devs to make
the interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
*Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As the
kernel already has a soft-dirty feature, which we have given up on
using, we use the written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic "get soft-dirty/written-to status and clear" operation
present in the kernel.
- The pages which have been written-to cannot be found in an accurate way.
(Kernel's soft-dirty PTE bit + soft-dirty VMA bit show more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
GetWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirty feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago,
and its behaviour shouldn't be changed now. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is an
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written to can be found by reading
the pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to since the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
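As a rough userspace illustration of the pagemap check described above,
here is a minimal sketch. It assumes the bit layout documented in
Documentation/admin-guide/mm/pagemap.rst (bit 55 soft-dirty, bit 57
uffd-wp, bit 63 present). All demo_* names are hypothetical helpers made
up for this example, not part of the series, and the "written" result is
only meaningful for pages in a range that was write protected with the
async UFFD WP mode beforehand.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions as documented in pagemap.rst; the names are hypothetical. */
#define DEMO_PM_SOFT_DIRTY	(1ULL << 55)
#define DEMO_PM_UFFD_WP		(1ULL << 57)
#define DEMO_PM_PRESENT		(1ULL << 63)

/* True if the soft-dirty bit is set in a pagemap entry. */
static inline bool demo_pm_soft_dirty(uint64_t entry)
{
	return (entry & DEMO_PM_SOFT_DIRTY) != 0;
}

/*
 * True if the uffd-wp bit is clear. For a page that was write protected
 * with async UFFD WP earlier, this means it has been written to since.
 */
static inline bool demo_pm_written_since_wp(uint64_t entry)
{
	return (entry & DEMO_PM_UFFD_WP) == 0;
}

/*
 * Read the 64-bit pagemap entry of the calling process for the page
 * containing vaddr. Returns 0 on success, -1 on failure.
 */
static inline int demo_pagemap_read(uintptr_t vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by page number. */
	offset = (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(*entry);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would loop over the pages of the region of interest, call
demo_pagemap_read() for each one and test the entry with
demo_pm_written_since_wp(). This per-page read is exactly the kind of
overhead the new IOCTL is meant to avoid.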
Information about whether a page is file mapped, present or swapped is
required for the CRIU project [5][6]. The required mask, any mask,
excluded mask and return mask are also required for the CRIU
project [5].
The IOCTL returns the addresses of the pages which match the specified
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where the user only wants
to get a specific number of pages, so there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of pages has been found. The max_pages is
optional. If max_pages is specified, it must be equal to or greater than the
vec_size. This restriction is needed to handle the worst case, when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
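To make the mask matching and the compact output easier to picture, here
is a small userspace model. It is only a sketch under assumptions: the
demo_* types and functions are hypothetical and are not the UAPI structs
from this series, the matching rules are one plausible reading of
"required", "any" and "excluded", and the max_pages >= vec_size
restriction is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one compacted output entry. */
struct demo_region {
	uint64_t start;		/* index of the first page in the run */
	uint64_t len;		/* number of pages in the run */
	uint64_t bitmap;	/* returned flags shared by the whole run */
};

/* Hypothetical stand-in for the four masks mentioned above. */
struct demo_masks {
	uint64_t required;	/* all of these bits must be set */
	uint64_t anyof;		/* if non-zero, at least one must be set */
	uint64_t excluded;	/* none of these bits may be set */
	uint64_t ret;		/* bits reported back to the caller */
};

static inline bool demo_page_matches(uint64_t flags,
				     const struct demo_masks *m)
{
	if ((flags & m->required) != m->required)
		return false;
	if (m->anyof && !(flags & m->anyof))
		return false;
	if (flags & m->excluded)
		return false;
	return true;
}

/*
 * Walk flags[0..npages), report matching pages into vec and merge a page
 * into the previous region when it is adjacent and has the same returned
 * bits. Stops when vec is full or when max_pages pages were reported
 * (0 means no limit). Returns the number of regions used.
 */
static inline size_t demo_scan(const uint64_t *flags, size_t npages,
			       const struct demo_masks *m,
			       struct demo_region *vec, size_t vec_size,
			       size_t max_pages)
{
	size_t used = 0, found = 0, i;

	for (i = 0; i < npages; i++) {
		struct demo_region *last;
		uint64_t bits;

		if (max_pages && found == max_pages)
			break;
		if (!demo_page_matches(flags[i], m))
			continue;

		bits = flags[i] & m->ret;
		last = used ? &vec[used - 1] : NULL;

		if (last && last->bitmap == bits &&
		    last->start + last->len == i) {
			last->len++;		/* extend the current run */
		} else {
			if (used == vec_size)
				break;		/* output vector is full */
			vec[used].start = i;
			vec[used].len = 1;
			vec[used].bitmap = bits;
			used++;
		}
		found++;
	}

	return used;
}
```

In the worst case no two reported pages can be merged, so every page
needs its own region. That is the situation the max_pages/vec_size
restriction in the text refers to.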
The patch series includes a detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usage as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 58 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 505 ++++++
fs/userfaultfd.c | 26 +-
include/linux/hugetlb.h | 1 +
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 53 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 34 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1459 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
16 files changed, 2266 insertions(+), 24 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
Hi,
This follows the discussion here:
https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@hou…
This shows a couple of inconsistencies with regard to how device-managed
resources are cleaned up. Basically, devm resources will only be cleaned up
if the device is attached to a bus and bound to a driver. Failing any of
these cases, a call to device_unregister will not end up in the devm
resources being released.
We had to work around it in DRM to provide helpers to create a device for
kunit tests, but the current discussion around creating similar, generic,
helpers for kunit resumed interest in fixing this.
This can be tested using the command:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
---
Maxime Ripard (2):
drivers: base: Add basic devm tests for root devices
drivers: base: Add basic devm tests for platform devices
drivers/base/test/.kunitconfig | 2 +
drivers/base/test/Kconfig | 4 +
drivers/base/test/Makefile | 3 +
drivers/base/test/platform-device-test.c | 278 +++++++++++++++++++++++++++++++
drivers/base/test/root-device-test.c | 120 +++++++++++++
5 files changed, 407 insertions(+)
---
base-commit: a6faf7ea9fcb7267d06116d4188947f26e00e57e
change-id: 20230329-kunit-devm-inconsistencies-test-5e5a7d01e60d
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
Add documentation for the new Virtual PCM Test Driver. It covers all
possible usage cases: errors and delay injections, random and
pattern-based data generation, playback and ioctl redefinition
functionalities testing.
We have a lot of different virtual media drivers, which can be used for
testing of the userspace applications and media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating the video devices. For audio devices we have only the snd-dummy
module, which is good at simulating the correct behavior of an ALSA device.
I decided to write a tool, which would help to test the userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual PCM
Test Driver.
This new Virtual PCM Test Driver has several features which can be useful
during the userspace ALSA applications testing/fuzzing, or testing/fuzzing
of the PCM middle layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what this
driver can do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
The driver supports both interleaved and non-interleaved access modes, and
has separate pattern buffers for each channel. The driver supports up to
4 channels and up to 8 substreams.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
V1 -> V2:
- Rename the driver from 'valsa' to 'pcmtest'.
- Implement support for interleaved and non-interleaved access modes
- Add support for 8 substreams and 4 channels
- Extend supported formats
- Extend and rewrite in C the selftest for the driver
V2 -> V3:
- Add separate pattern buffers for each channel
- Speed up the capture data generation when using interleaved access mode
- Extend the corresponding selftest to cover the multiple channels
capturing and playback functionalities when using interleaved access mode.
- Fix documentation issues
Documentation/sound/cards/index.rst | 1 +
Documentation/sound/cards/pcmtest.rst | 120 ++++++++++++++++++++++++++
2 files changed, 121 insertions(+)
create mode 100644 Documentation/sound/cards/pcmtest.rst
diff --git a/Documentation/sound/cards/index.rst b/Documentation/sound/cards/index.rst
index c016f8c3b88b..49c1f2f688f8 100644
--- a/Documentation/sound/cards/index.rst
+++ b/Documentation/sound/cards/index.rst
@@ -17,3 +17,4 @@ Card-Specific Information
hdspm
serial-u16550
img-spdif-in
+ pcmtest
diff --git a/Documentation/sound/cards/pcmtest.rst b/Documentation/sound/cards/pcmtest.rst
new file mode 100644
index 000000000000..e163522f3205
--- /dev/null
+++ b/Documentation/sound/cards/pcmtest.rst
@@ -0,0 +1,120 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual PCM Test Driver
+===========================
+
+The Virtual PCM Test Driver emulates a generic PCM device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the PCM middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+It supports up to 8 substreams and 4 channels. Also it supports both interleaved and
+non-interleaved access modes.
+
+Also, this driver can check the playback stream for containing the predefined pattern,
+which is used in the corresponding selftest (alsa/pcmtest-test.sh) to check the PCM middle
+layer data transferring functionality. Additionally, this driver redefines the default
+RESET ioctl, and the selftest covers this PCM API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file. There are pattern buffer debugfs entries
+for each channel, as well as entries which contain the pattern buffer length.
+
+ * /sys/kernel/debug/pcmtest/fill_pattern[0-3]
+ * /sys/kernel/debug/pcmtest/fill_pattern[0-3]_len
+
+To set the pattern for the channel 0 you can execute the following command:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/pcmtest/fill_pattern0
+
+Then, after every capture action performed on the 'pcmtest' device the buffer for the
+channel 0 will contain 'mycoolpatternmycoolpatternmycoolpatternmy...'.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has 'inject_delay' parameter, which has very self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has integer type, and
+it means the delay added between module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower, if it is
+negative - faster. You can try it yourself by starting a recording in any
+audiorecording application (like Audacity) and selecting the 'pcmtest' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_pcmtest/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+Playback test
+-------------
+
+This driver can be also used for the playback functionality testing - every time you
+write the playback data to the 'pcmtest' PCM device and close it, the driver checks the
+buffer for containing the looped pattern (which is specified in the fill_pattern
+debugfs file for each channel). If the playback buffer content represents the looped
+pattern, 'pc_test' debugfs entry is set into '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/pcmtest/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
From: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 upstream.
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	u8 val;
	int ret;

	ret = kstrtou8(buf, 10, &val);
	if (ret)
		return ret;

	mutex_lock(&test_fw_mutex);
	*(u8 *)cfg = val;
	mutex_unlock(&test_fw_mutex);

	/* Always return full write size even if we didn't consume all */
	return size;
}

static ssize_t config_num_requests_store(struct device *dev,
					 struct device_attribute *attr,
					 const char *buf, size_t count)
{
	int rc;

	mutex_lock(&test_fw_mutex);
	if (test_fw_config->reqs) {
		pr_err("Must call release_all_firmware prior to changing config\n");
		rc = -EINVAL;
		mutex_unlock(&test_fw_mutex);
		goto out;
	}
	mutex_unlock(&test_fw_mutex);

	rc = test_dev_config_update_u8(buf, count,
				       &test_fw_config->num_requests);

out:
	return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
					struct device_attribute *attr,
					const char *buf, size_t count)
{
	return test_dev_config_update_u8(buf, count,
					 &test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	int ret;

	mutex_lock(&test_fw_mutex);
	ret = __test_dev_config_update_u8(buf, size, cfg);
	mutex_unlock(&test_fw_mutex);

	return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
The unlocked versions of the functions, __test_dev_config_update_bool(),
__test_dev_config_update_u8() and __test_dev_config_update_size_t(),
were introduced to be called from the locked contexts as a workaround,
without releasing the main driver's lock and thereby causing a race
condition.
The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.
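The locked/unlocked split described above can be modeled outside the
kernel as well. The following is only a userspace sketch using a pthread
mutex in place of test_fw_mutex; every demo_* name is hypothetical and
the parsing is a simplified stand-in for kstrtou8().

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical config protected by demo_lock. */
static struct {
	uint8_t num_requests;
	int busy;	/* stands in for "requests are outstanding" */
} demo_cfg;

/* Unlocked primitive: the caller must already hold demo_lock. */
static inline int demo_update_u8_locked(const char *buf, uint8_t *cfg)
{
	unsigned long val;
	char *end;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf || val > UINT8_MAX)
		return -EINVAL;

	*cfg = (uint8_t)val;
	return 0;
}

/* Locked wrapper for callers that do not hold demo_lock. */
static inline int demo_update_u8(const char *buf, uint8_t *cfg)
{
	int ret;

	pthread_mutex_lock(&demo_lock);
	ret = demo_update_u8_locked(buf, cfg);
	pthread_mutex_unlock(&demo_lock);

	return ret;
}

/*
 * The check and the update happen in one critical section, so no other
 * thread can change demo_cfg.busy between them.
 */
static inline int demo_num_requests_store(const char *buf)
{
	int ret;

	pthread_mutex_lock(&demo_lock);
	if (demo_cfg.busy)
		ret = -EINVAL;
	else
		ret = demo_update_u8_locked(buf, &demo_cfg.num_requests);
	pthread_mutex_unlock(&demo_lock);

	return ret;
}
```

Dropping the lock between the busy check and the update, as the old code
did, would reopen the window in which another thread could set busy.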
Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
lib/test_firmware.c | 52 +++++++++++++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 17 deletions(-)
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,16 +353,26 @@ static ssize_t config_test_show_str(char
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (kstrtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -373,7 +383,8 @@ static ssize_t test_dev_config_show_bool
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+ const char *buf,
size_t size,
size_t *cfg)
{
@@ -384,9 +395,7 @@ static int test_dev_config_update_size_t
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(size_t *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
@@ -402,7 +411,7 @@ static ssize_t test_dev_config_show_int(
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
@@ -411,14 +420,23 @@ static int test_dev_config_update_u8(con
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 val)
{
return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +489,10 @@ static ssize_t config_num_requests_store
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -518,10 +536,10 @@ static ssize_t config_buf_size_store(str
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->buf_size);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->buf_size);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -548,10 +566,10 @@ static ssize_t config_file_offset_store(
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->file_offset);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->file_offset);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
Hi, Willy
Thanks very much for merging the v3 generic part1 of rv32. I just
tested your latest 20230604-nolibc-rv32+stkp6 branch; everything works
well except a trivial test report regression on the 'run' target.
Besides the fixup, a standalone test-report target is added so that it can
be shared among run, run-user and re-run, and to allow an independent test
report check via a direct 'make test-report'.
Best regards,
Zhangjin
---
Zhangjin Wu (4):
selftests/nolibc: add a test-report target
selftests/nolibc: allow run test-report directly
selftests/nolibc: always print the log file
selftests/nolibc: fix up test-report for run target
tools/testing/selftests/nolibc/Makefile | 30 ++++++++++++-------------
1 file changed, 15 insertions(+), 15 deletions(-)
--
2.25.1
Add documentation for the new Virtual PCM Test Driver. It covers all
possible usage cases: errors and delay injections, random and
pattern-based data generation, playback and ioctl redefinition
functionalities testing.
We have a lot of different virtual media drivers, which can be used for
testing of the userspace applications and media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating the video devices. For audio devices we have only the snd-dummy
module, which is good at simulating the correct behavior of an ALSA device.
I decided to write a tool, which would help to test the userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual PCM
Test Driver.
This new Virtual PCM Test Driver has several features which can be useful
during the userspace ALSA applications testing/fuzzing, or testing/fuzzing
of the PCM middle layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what this
driver can do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
The driver supports both interleaved and non-interleaved access modes, and
has separate pattern buffers for each channel. The driver supports up to
4 channels and up to 8 substreams.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
V1 -> V2:
- Rename the driver from 'valsa' to 'pcmtest'.
- Implement support for interleaved and non-interleaved access modes
- Add support for 8 substreams and 4 channels
- Extend supported formats
- Extend and rewrite in C the selftest for the driver
V2 -> V3:
- Add separate pattern buffers for each channel
- Speed up the capture data generation when using interleaved access mode
- Extend the corresponding selftest to cover the multiple channels
capturing and playback functionalities when using interleaved access mode.
- Fix documentation issues
V3 -> V4:
- Fix issue in the selftest: there was a typo in the fscanf argument.
Documentation/sound/cards/index.rst | 1 +
Documentation/sound/cards/pcmtest.rst | 120 ++++++++++++++++++++++++++
2 files changed, 121 insertions(+)
create mode 100644 Documentation/sound/cards/pcmtest.rst
diff --git a/Documentation/sound/cards/index.rst b/Documentation/sound/cards/index.rst
index c016f8c3b88b..49c1f2f688f8 100644
--- a/Documentation/sound/cards/index.rst
+++ b/Documentation/sound/cards/index.rst
@@ -17,3 +17,4 @@ Card-Specific Information
hdspm
serial-u16550
img-spdif-in
+ pcmtest
diff --git a/Documentation/sound/cards/pcmtest.rst b/Documentation/sound/cards/pcmtest.rst
new file mode 100644
index 000000000000..e163522f3205
--- /dev/null
+++ b/Documentation/sound/cards/pcmtest.rst
@@ -0,0 +1,120 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual PCM Test Driver
+===========================
+
+The Virtual PCM Test Driver emulates a generic PCM device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the PCM middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+It supports up to 8 substreams and 4 channels. Also it supports both interleaved and
+non-interleaved access modes.
+
+Also, this driver can check the playback stream for containing the predefined pattern,
+which is used in the corresponding selftest (alsa/pcmtest-test.sh) to check the PCM middle
+layer data transferring functionality. Additionally, this driver redefines the default
+RESET ioctl, and the selftest covers this PCM API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file. There are pattern buffer debugfs entries
+for each channel, as well as entries which contain the pattern buffer length.
+
+ * /sys/kernel/debug/pcmtest/fill_pattern[0-3]
+ * /sys/kernel/debug/pcmtest/fill_pattern[0-3]_len
+
+To set the pattern for the channel 0 you can execute the following command:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/pcmtest/fill_pattern0
+
+Then, after every capture action performed on the 'pcmtest' device the buffer for the
+channel 0 will contain 'mycoolpatternmycoolpatternmycoolpatternmy...'.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has 'inject_delay' parameter, which has very self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has integer type, and
+it means the delay added between module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower, if it is
+negative - faster. You can try it yourself by starting a recording in any
+audiorecording application (like Audacity) and selecting the 'pcmtest' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_pcmtest/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+Playback test
+-------------
+
+This driver can be also used for the playback functionality testing - every time you
+write the playback data to the 'pcmtest' PCM device and close it, the driver checks the
+buffer for containing the looped pattern (which is specified in the fill_pattern
+debugfs file for each channel). If the playback buffer content represents the looped
+pattern, 'pc_test' debugfs entry is set into '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/pcmtest/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
Hi, Willy
This is the v3 part2 of support for rv32. It differs from the v2 part2 [1]
in that we only fix up compile issues in this patchset.
With the v3 generic part1 [2] and this patchset, we can compile nolibc
for rv32 now.
This is based on a suggestion from Arnd [3]: instead of an
'#error' for a syscall that is unsupported on a target platform, a 'return
-ENOSYS' allows us to compile first and then fix up the
test failures reported by nolibc-test one by one.
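As a rough illustration of that idea (not the actual nolibc code), the
sketch below uses a made-up syscall number macro, DEMO_NR_frobnicate,
that is deliberately left undefined, plus hypothetical demo_* helpers.
The point is only that the missing syscall becomes a runtime -ENOSYS
instead of a build failure.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the raw syscall entry point. It is only
 * referenced when the (made-up) syscall number is defined.
 */
static inline long demo_raw_syscall(long nr)
{
	(void)nr;
	return 0;
}

/*
 * Low-level wrapper: with '#error' in the #else branch, the whole library
 * would fail to build on a platform lacking the syscall. Returning
 * -ENOSYS keeps the build working and defers the failure to run time.
 */
static inline long demo_sys_frobnicate(void)
{
#ifdef DEMO_NR_frobnicate
	return demo_raw_syscall(DEMO_NR_frobnicate);
#else
	return -ENOSYS;
#endif
}

/* libc-style wrapper: turn a negative return into errno and -1. */
static inline int demo_frobnicate(void)
{
	long ret = demo_sys_frobnicate();

	if (ret < 0) {
		errno = (int)-ret;
		return -1;
	}
	return (int)ret;
}
```

A test calling demo_frobnicate() on such a platform sees -1 with errno
set to ENOSYS, which is the kind of failure nolibc-test can then report
for later fixing.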
The first two patches fix up all of the compile failures with '-ENOSYS'
(and '#ifdef' if required):
tools/nolibc: fix up #error compile failures with -ENOSYS
tools/nolibc: fix up undeclared syscall macros with #ifdef and -ENOSYS
The last one enables rv32 compile support:
selftests/nolibc: riscv: customize makefile for rv32
The compile support patch above is currently only for testing; as
Thomas suggested, full rv32 support should wait for the remaining
parts.
Feedback is welcome. I will wait for enough discussion on this patchset
and then send the remaining parts one by one to fix up the test failures
about waitid, llseek and time64 syscalls: ppoll_time64, clock_gettime64,
pselect6_time64.
So, I do recommend applying this patchset: it allows us to send the
remaining parts independently; otherwise, all of them would have to be
sent out for review together. With this patchset, rv32 users may be able
to use nolibc although some syscalls are still missing :-)
Or at least apply the first two, so that I can manually cherry-pick the
compile support patch to do my local tests, and other platform
developers may also benefit from them.
I'm cleaning up the remaining parts, but this still requires some time.
I plan to split them into the following parts:
* part3: waitid, prepared, will send out later
* part4: llseek, prepared, will send out later
* part5: time64 syscalls, ppoll_time64 ok, will finish them next week
(It is a little hard to split them)
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1685387484.git.falcon@tinylab.org…
[2]: https://lore.kernel.org/linux-riscv/cover.1685777982.git.falcon@tinylab.org…
[3]: https://lore.kernel.org/linux-riscv/5e7d2adf-e96f-41ca-a4c6-5c87a25d4c9c@ap…
Zhangjin Wu (3):
tools/nolibc: fix up #error compile failures with -ENOSYS
tools/nolibc: fix up undeclared syscall macros with #ifdef and -ENOSYS
selftests/nolibc: riscv: customize makefile for rv32
tools/include/nolibc/sys.h | 38 ++++++++++++++++---------
tools/testing/selftests/nolibc/Makefile | 11 +++++--
2 files changed, 34 insertions(+), 15 deletions(-)
--
2.25.1
Hi, Willy
This is the v4 part2 of support for rv32 (v3 [1]). It applies the
suggestions from Thomas, Arnd [2] and you [3]. Now, the rv32 compile
support is almost aligned with x86, except for the extra KARCH to make the
kernel happy. Thanks very much for your nice review!
Since the 'override' method mentioned in [4] splits the whole Makefile
context into two parts, it may make the code harder to maintain,
so this patchset goes back to the KARCH passing method (the name was
suggested by Willy; before, I used something like _ARCH). As suggested by
Willy, we also aligned the KARCH assignment with the other variables.
Changes from v3 -> v4:
* No new changes in the first two except a new Reviewed-by line from Arnd
* selftests/nolibc: riscv: customize makefile for rv32
Do it like the other architectures, especially like x86.
The difference from x86 is that the top-level kernel Makefile doesn't
accept riscv32 and riscv64; it only accepts riscv. To make the kernel happy,
a KARCH variable is added for riscv32 and riscv64, and then passed to the
kernel with ARCH=$(KARCH).
Since tools/include/nolibc/Makefile shares arch-riscv.h between riscv32
and riscv64, and there is a headers_standalone target which calls the kernel
headers and headers_install targets, ARCH=$(KARCH) is passed to it too.
Compile-tested for aarch64, rv32 and rv64, including run-user and run.
Note, this is required with the default config from the
20230606-nolibc-rv32+stkp7a branch of [5]:
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index ce02bb09651b..72bd8fe0cad6 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1934,11 +1934,13 @@ void show_rcu_tasks_gp_kthreads(void)
}
#endif /* #ifndef CONFIG_TINY_RCU */
+#ifdef CONFIG_TASKS_RCU
struct task_struct *get_rcu_tasks_gp_kthread(void)
{
return rcu_tasks.kthread_ptr;
}
EXPORT_SYMBOL_GPL(get_rcu_tasks_gp_kthread);
+#endif
#ifdef CONFIG_PROVE_RCU
struct rcu_tasks_test_desc {
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1685780412.git.falcon@tinylab.org/
[2]: https://lore.kernel.org/linux-riscv/d1c83340-af4c-4780-a101-b9d22b47379c@ap…
[3]: https://lore.kernel.org/lkml/ZIAywHvr6UB1J4of@1wt.eu/
[4]: https://lore.kernel.org/lkml/20230607063314.671429-1-falcon@tinylab.org/
[5]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
Zhangjin Wu (3):
tools/nolibc: fix up #error compile failures with -ENOSYS
tools/nolibc: fix up undeclared syscall macros with #ifdef and -ENOSYS
selftests/nolibc: riscv: customize makefile for rv32
tools/include/nolibc/sys.h | 38 ++++++++++++++++---------
tools/testing/selftests/nolibc/Makefile | 20 +++++++++++--
2 files changed, 42 insertions(+), 16 deletions(-)
--
2.25.1
Resending as the first set got mangled with smtp error.
This is part of the effort to remove the empty element of the ctl_table
structures (used to calculate size) and replace it with an ARRAY_SIZE call. By
replacing the child element in struct ctl_table with a flags element we make
sure that there are no forward recursions on child nodes and therefore set
ourselves up for just using an ARRAY_SIZE. We also added some self tests to
make sure that we do not break anything.
Patchset is separated in 4: parport fixes, selftests fixes, selftests additions and
replacement of child element. Tested everything with sysctl self tests and everything
seems "ok".
1. parport fixes: @mcgrof: this is related to my previous series and it plugs a
sysctl table leak in the parport driver. Please tell me if you want me to repost
the parport series with this one stitched in.
2. Selftests fixes: Remove the prefixed zeros when passing an awk field to the
awk print command because it was causing $0009 to be interpreted as $0.
Replaced continue with return in sysctl.sh(test_case) so the test actually
gets skipped. The skip decision is now in sysctl.sh(skip_test).
3. Selftest additions: New test to confirm that unregister actually removes
targets. New test to confirm that permanently empty targets are indeed
created and that no other targets can be created "on top".
4. Replaced the child pointer in struct ctl_table with a u8 flag. The flag
is used to differentiate between permanently empty targets and non-empty ones.
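
The difference between sizing a table through an empty terminating element
and sizing it with ARRAY_SIZE can be sketched in plain user-space C. This is
only an illustration: struct demo_ctl_table, DEMO_ARRAY_SIZE and both tables
are hypothetical stand-ins, not the kernel's struct ctl_table or its macro.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct demo_ctl_table {
	const char *procname;	/* NULL marks the terminator in the old scheme */
	int *data;
};

/* Hypothetical equivalent of the kernel's ARRAY_SIZE(). */
#define DEMO_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

static int demo_a, demo_b;

/* Old style: the table carries an empty element used only to find its end. */
static struct demo_ctl_table table_with_sentinel[] = {
	{ "a", &demo_a },
	{ "b", &demo_b },
	{ NULL, NULL }
};

/* New style: no terminator, the size is known at compile time. */
static struct demo_ctl_table table_sized[] = {
	{ "a", &demo_a },
	{ "b", &demo_b },
};

/* Walk the table until the empty element is reached. */
static size_t count_by_sentinel(const struct demo_ctl_table *t)
{
	size_t n = 0;

	while (t[n].procname)
		n++;
	return n;
}

/* Size taken directly from the array type, no walk needed. */
static size_t count_by_array_size(void)
{
	return DEMO_ARRAY_SIZE(table_sized);
}
```

Both functions report two entries, but the second table is one element
smaller and needs no walk to find its end.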
Comments/feedback greatly appreciated
Best
Joel
Joel Granados (8):
parport: plug a sysctl register leak
test_sysctl: Fix test metadata getters
test_sysctl: Group node sysctl test under one func
test_sysctl: Add an unregister sysctl test
test_sysctl: Add an option to prevent test skip
test_sysclt: Test for registering a mount point
sysctl: Remove debugging dump_stack
sysctl: replace child with a flags var
drivers/parport/procfs.c | 23 ++---
fs/proc/proc_sysctl.c | 82 ++++------------
include/linux/sysctl.h | 4 +-
lib/test_sysctl.c | 91 ++++++++++++++++--
tools/testing/selftests/sysctl/sysctl.sh | 115 +++++++++++++++++------
5 files changed, 204 insertions(+), 111 deletions(-)
--
2.30.2
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
See the details in the commit message (TL/DR: under CAP_BPF, the
verifier can be fooled to think that a scalar is zero while in fact it's
your predefined number.)
v1 and v2 were sent off-list.
v2 changes:
Added more tests, migrated them to inline asm, started using
bpf_get_prandom_u32, switched to a more bulletproof dead branch check
and modified the failing spill test scenarios so that an unauthorized
access attempt is performed in both branches.
v3 changes:
Dropped an improvement not necessary for the fix, changed the Fixes tag.
Maxim Mikityanskiy (2):
bpf: Fix verifier tracking scalars on spill
selftests/bpf: Add test cases to assert proper ID tracking on spill
kernel/bpf/verifier.c | 7 +
.../selftests/bpf/progs/verifier_spill_fill.c | 198 ++++++++++++++++++
2 files changed, 205 insertions(+)
--
2.40.1
Let's add some selftests to make sure that:
* R/O long-term pinning always works in file mappings
* R/W long-term pinning always works in MAP_PRIVATE file mappings
* R/W long-term pinning only works in MAP_SHARED mappings with special
filesystems (shmem, hugetlb) and fails with other filesystems (ext4, btrfs,
xfs).
The tests make use of the gup_test kernel module to trigger ordinary GUP
and GUP-fast, and liburing (similar to our COW selftests). Test with memfd,
memfd hugetlb, tmpfile() and mkstemp(). The latter usually gives us a
"real" filesystem (ext4, btrfs, xfs) where long-term pinning is
expected to fail.
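
As a rough sketch of the expectation described above, a test could classify
a file descriptor by its filesystem type. This is an assumption made for
illustration: the cover letter does not say how gup_longterm.c decides, and
the function names below are hypothetical.

```c
#include <stdbool.h>
#include <sys/vfs.h>		/* fstatfs() */
#include <linux/magic.h>	/* TMPFS_MAGIC, HUGETLBFS_MAGIC, EXT4_SUPER_MAGIC */

/*
 * Is a R/W long-term pin of a MAP_SHARED mapping expected to work on a
 * filesystem with this magic number? Only the "special" filesystems named
 * in the text (shmem, hugetlb) are accepted in this sketch.
 */
static bool rw_longterm_pin_expected(unsigned long fs_magic)
{
	switch (fs_magic) {
	case TMPFS_MAGIC:	/* shmem, e.g. memfd */
	case HUGETLBFS_MAGIC:	/* memfd hugetlb */
		return true;
	default:		/* ext4, btrfs, xfs, ... */
		return false;
	}
}

/* Returns 1 (expected to work), 0 (expected to fail) or -1 on error. */
static int rw_longterm_pin_expected_fd(int fd)
{
	struct statfs fs;

	if (fstatfs(fd, &fs) != 0)
		return -1;
	return rw_longterm_pin_expected((unsigned long)fs.f_type) ? 1 : 0;
}
```

Whether a tmpfile() descriptor lands in the "works" or "fails" group then
depends on which filesystem backs the temporary directory.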
Note that these selftests don't contain any actual reproducers for data
corruptions in case R/W long-term pinning on problematic filesystems
"would" work.
Maybe we can later come up with a racy !FOLL_LONGTERM reproducer that can
reuse an existing interface to trigger short-term pinning (I'll look into
that next).
On current mm/mm-unstable:
# ./gup_longterm
# [INFO] detected hugetlb page size: 2048 KiB
# [INFO] detected hugetlb page size: 1048576 KiB
TAP version 13
1..50
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 1 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 2 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 3 Should have failed
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 4 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 5 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 6 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 7 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 8 Should have failed
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 9 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 10 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 11 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 12 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 13 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 14 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 15 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 16 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 17 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 18 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 19 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 20 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 21 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 22 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 23 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 24 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 25 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 26 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 27 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 28 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 29 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 30 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 31 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 32 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 33 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 34 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 35 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 36 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 37 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 38 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 39 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 40 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd
ok 41 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with tmpfile
ok 42 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with local tmpfile
ok 43 Should have failed
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 44 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 45 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd
ok 46 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with tmpfile
ok 47 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with local tmpfile
ok 48 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 49 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 50 Should have worked
# Totals: pass:50 fail:0 xfail:0 xpass:0 skip:0 error:0
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Jan Kara <jack(a)suse.cz>
David Hildenbrand (3):
selftests/mm: factor out detection of hugetlb page sizes into vm_util
selftests/mm: gup_longterm: new functional test for FOLL_LONGTERM
selftests/mm: gup_longterm: add liburing tests
tools/testing/selftests/mm/Makefile | 3 +
tools/testing/selftests/mm/cow.c | 29 +-
tools/testing/selftests/mm/gup_longterm.c | 459 ++++++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +-
tools/testing/selftests/mm/vm_util.c | 27 ++
tools/testing/selftests/mm/vm_util.h | 1 +
6 files changed, 495 insertions(+), 28 deletions(-)
create mode 100644 tools/testing/selftests/mm/gup_longterm.c
--
2.40.1
Hello, Waiman.
On Wed, Apr 12, 2023 at 03:52:36PM -0400, Waiman Long wrote:
> There is still a distribution hierarchy as the list of isolation CPUs have
> to be distributed down to the target cgroup through the hierarchy. For
> example,
>
> cgroup root
> +- isolcpus (cpus 8,9; isolcpus)
> +- user.slice (cpus 1-9; ecpus 1-7; member)
> +- user-x.slice (cpus 8,9; ecpus 8,9; isolated)
> +- user-y.slice (cpus 1,2; ecpus 1,2; member)
>
> OTOH, I do agree that this can be somewhat hacky. That is why I post it as a
> RFC to solicit feedback.
Wouldn't it be possible to make it hierarchical by adding another cpumask to
cpuset which lists the cpus which are allowed in the hierarchy but not used
unless claimed by an isolated domain?
Thanks.
--
tejun
Hi, Willy, Thomas
This is not really meant for merging; it is only demo code to test
whether it is possible to recover and continue with the next test after
a bad pointer access in user space [1].
Besides, a new 'run' command is added to the 'NOLIBC_TEST' environment
variable or the arguments to control the number of iterations. This may
be used to test for reentrancy issues, but no failures have been found
so far ;-)
With glibc, it works as follows:
$ ./nolibc-test run:2,syscall:28-30,stdlib:1
Running iteration(s): 2
Current iteration: 1
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler ! 11 SIGSEGV [OK]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib'
1 getenv_blah = <(null)> [OK]
Errors during this test: 0
Total number of errors in the 1 iteration(s): 0
Current iteration: 2
Running test 'syscall'
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler ! 11 SIGSEGV [OK]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib'
1 getenv_blah = <(null)> [OK]
Errors during this test: 0
Total number of errors in the 2 iteration(s): 0
With nolibc, it will be skipped (run:2,syscall:28-30,stdlib:10):
Running iteration(s): 2
Current iteration: 1
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler [SKIPPED]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib', from 10 to 10
10 strrchr_foobar_o = <obar> [OK]
Errors during this test: 0
Total number of errors in the 1 iteration(s): 0
Current iteration: 2
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler [SKIPPED]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib', from 10 to 10
10 strrchr_foobar_o = <obar> [OK]
Errors during this test: 0
Total number of errors in the 2 iteration(s): 0
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/20230529113143.GB2762@1wt.eu/
Zhangjin Wu (4):
selftests/nolibc: allow rerun with the same settings
selftests/nolibc: add rerun support
selftests/nolibc: add user space efault handler
selftests/nolibc: add user-space efault restore test case
tools/testing/selftests/nolibc/nolibc-test.c | 247 +++++++++++++++++--
1 file changed, 221 insertions(+), 26 deletions(-)
--
2.25.1
From: Menglong Dong <imagedong(a)tencent.com>
For now, a BPF program of type BPF_PROG_TYPE_TRACING can only be used
on kernel functions whose argument count is less than 6. This is not
friendly at all, as too many functions have more than 6 arguments.
Therefore, let's enhance it by increasing the function argument count
allowed in arch_prepare_bpf_trampoline(), for now only on x86_64.
In the 1st patch, we make MAX_BPF_FUNC_ARGS 14, according to our
statistics.
In the 2nd patch, we make arch_prepare_bpf_trampoline() support copying
function arguments that are passed on the stack on x86. Therefore, the
maximum number of arguments can be up to MAX_BPF_FUNC_ARGS for FENTRY
and FEXIT.
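
To see why stack copying is needed, here is a small arithmetic sketch. It
assumes the x86-64 System V calling convention and that every argument is an
integer or pointer occupying one register or one stack slot; the names are
hypothetical and not taken from the patches.

```c
/* Integer argument registers in the x86-64 System V calling convention. */
#define DEMO_X86_64_REG_ARGS 6

struct demo_arg_split {
	int in_regs;	/* arguments passed in registers */
	int on_stack;	/* arguments the caller pushed on the stack */
};

/* Split an argument count into register-passed and stack-passed parts. */
static struct demo_arg_split demo_split_args(int nr_args)
{
	struct demo_arg_split s;

	s.in_regs = nr_args < DEMO_X86_64_REG_ARGS ?
		    nr_args : DEMO_X86_64_REG_ARGS;
	s.on_stack = nr_args - s.in_regs;
	return s;
}
```

With 14 arguments, 6 travel in registers and the remaining 8 sit on the
caller's stack, which is the part a trampoline has to copy.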
Patches 3-5 add test cases for the 2nd patch.
Changes since v1:
- change the maximum number of function arguments from 12 to 14
- add testcases (Jiri Olsa)
- use EMIT3_off32 instead of EMIT4 for "lea" to prevent overflow
Menglong Dong (5):
bpf: make MAX_BPF_FUNC_ARGS 14
bpf, x86: allow function arguments up to 14 for TRACING
libbpf: make BPF_PROG support 15 function arguments
selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr*
selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++---
include/linux/bpf.h | 9 +-
net/bpf/test_run.c | 40 ++++++--
tools/lib/bpf/bpf_helpers.h | 9 +-
tools/lib/bpf/bpf_tracing.h | 10 +-
.../selftests/bpf/prog_tests/bpf_cookie.c | 24 ++---
.../bpf/prog_tests/kprobe_multi_test.c | 16 ++--
.../testing/selftests/bpf/progs/fentry_test.c | 50 ++++++++--
.../testing/selftests/bpf/progs/fexit_test.c | 51 ++++++++--
.../selftests/bpf/progs/get_func_ip_test.c | 2 +-
.../selftests/bpf/progs/kprobe_multi.c | 12 +--
.../bpf/progs/verifier_btf_ctx_access.c | 2 +-
.../selftests/bpf/verifier/atomic_fetch_add.c | 4 +-
13 files changed, 249 insertions(+), 76 deletions(-)
--
2.40.1
Hi,
This is v2 of a series that fixes up build errors and warnings for at
least the 64-bit builds on x86 with clang.
There are lots of changes since v1 [1], thanks to reviews from Peter Xu, David
Hildenbrand, and Muhammad Usama Anjum. These include:
* Using "make headers", and documenting that prerequisite as well.
* Better ways to avoid clang's Wformat-security warnings
* Added Cc's, ack-by's, reviewed-by's.
* Updated commit log messages.
The series also includes an optional "improvement" of moving some uffd
code into uffd-common.[ch], which is proving to be somewhat
controversial, and so if that doesn't get resolved, then patches 9 and
10 may just get dropped. They are not required in order to get a clean
build, now that "make headers" is happening.
[1]: https://lore.kernel.org/all/20230602013358.900637-1-jhubbard@nvidia.com/
thanks,
John Hubbard
NVIDIA
John Hubbard (11):
selftests/mm: fix uffd-stress unused function warning
selftests/mm: fix unused variable warnings in hugetlb-madvise.c,
migration.c
selftests/mm: fix "warning: expression which evaluates to zero..." in
mlock2-tests.c
selftests/mm: fix invocation of tests that are run via shell scripts
selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
selftests/mm: fix two -Wformat-security warnings in uffd builds
selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
selftests/mm: fix uffd-unit-tests.c build failure due to missing
MADV_COLLAPSE
selftests/mm: move psize(), pshift() into vm_utils.c
selftests/mm: move uffd* routines from vm_util.c to uffd-common.c
Documentation: kselftest: "make headers" is a prerequisite
Documentation/dev-tools/kselftest.rst | 1 +
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 7 +-
tools/testing/selftests/mm/cow.c | 7 --
tools/testing/selftests/mm/hugepage-mremap.c | 2 +-
tools/testing/selftests/mm/hugetlb-madvise.c | 8 +-
tools/testing/selftests/mm/khugepaged.c | 10 --
.../selftests/mm/ksm_functional_tests.c | 2 +-
tools/testing/selftests/mm/migration.c | 5 +-
tools/testing/selftests/mm/mlock2-tests.c | 1 -
tools/testing/selftests/mm/pkey-x86.h | 2 +-
tools/testing/selftests/mm/run_vmtests.sh | 6 +-
tools/testing/selftests/mm/uffd-common.c | 105 +++++++++++++++++
tools/testing/selftests/mm/uffd-common.h | 12 +-
tools/testing/selftests/mm/uffd-stress.c | 10 --
tools/testing/selftests/mm/uffd-unit-tests.c | 16 +--
tools/testing/selftests/mm/vm_util.c | 106 ++----------------
tools/testing/selftests/mm/vm_util.h | 36 ++----
18 files changed, 165 insertions(+), 173 deletions(-)
base-commit: 929ed21dfdb6ee94391db51c9eedb63314ef6847
--
2.40.1
On Tue, May 23, 2023 at 11:22:07PM +0000, Ziqi Zhao wrote:
> An output message:
>
> > # # waitpid WEXITSTATUS=0
>
> will be printed for 30,000+ times in the `pidfd_test` selftest, which
> does not seem ideal. This patch removes the print logic in the
> `wait_for_pid` function, so each call to this function does not output
> a line by default. Any existing call sites where the extra line might
> be beneficial have been modified to include extra print statements
> outside of the function calls.
>
> Signed-off-by: Ziqi Zhao <astrajoan(a)yahoo.com>
> ---
Fine by me,
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
The default timeout for selftests is 45 seconds. Although 13 of the
roughly 96 selftests already have a settings file that uses a timeout
greater than this, we want to avoid encouraging more tests to force a
higher timeout, as selftests strives to run all tests quickly. Selftests
also treats a timeout as a non-fatal error. Only test runners which have
control over a system would know whether to treat a timeout as fatal.
To help with all this:
o Enhance documentation to avoid future increases of insane timeouts
o Add the option to allow overriding the default timeout with test
runners with a command line option
Suggested-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof(a)kernel.org>
---
Documentation/dev-tools/kselftest.rst | 22 +++++++++++++++++++++
tools/testing/selftests/kselftest/runner.sh | 11 ++++++++++-
tools/testing/selftests/run_kselftest.sh | 5 +++++
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index 12b575b76b20..dd214af7b7ff 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -168,6 +168,28 @@ the `-t` option for specific single tests. Either can be used multiple times::
For other features see the script usage output, seen with the `-h` option.
+Timeout for selftests
+=====================
+
+Selftests are designed to be quick and so a default timeout is used of 45
+seconds for each test. Tests can override the default timeout by adding
+a settings file in their directory and set a timeout variable there to the
+configured a desired upper timeout for the test. Only a few tests override
+the timeout with a value higher than 45 seconds, selftests strives to keep
+it that way. Timeouts in selftests are not considered fatal because the
+system under which a test runs may change and this can also modify the
+expected time it takes to run a test. If you have control over the systems
+which will run the tests you can configure a test runner on those systems to
+use a greater or lower timeout on the command line as with the `-o` or
+the `--override-timeout` argument. For example to use 165 seconds instead
+one would use:
+
+ $ ./run_kselftest.sh --override-timeout 165
+
+You can look at the TAP output to see if you ran into the timeout. Test
+runners which know a test must run under a specific time can then optionally
+treat these timeouts then as fatal.
+
Packaging selftests
===================
diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 294619ade49f..1c952d1401d4 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -8,7 +8,8 @@ export logfile=/dev/stdout
export per_test_logging=
# Defaults for "settings" file fields:
-# "timeout" how many seconds to let each test run before failing.
+# "timeout" how many seconds to let each test run before running
+# over our soft timeout limit.
export kselftest_default_timeout=45
# There isn't a shell-agnostic way to find the path of a sourced file,
@@ -90,6 +91,14 @@ run_one()
done < "$settings"
fi
+ # Command line timeout overrides the settings file
+ if [ -n "$kselftest_override_timeout" ]; then
+ kselftest_timeout="$kselftest_override_timeout"
+ echo "# overriding timeout to $kselftest_timeout" >> "$logfile"
+ else
+ echo "# timeout set to $kselftest_timeout" >> "$logfile"
+ fi
+
TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST"
echo "# $TEST_HDR_MSG"
if [ ! -e "$TEST" ]; then
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 97165a83df63..9a981b36bd7f 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -26,6 +26,7 @@ Usage: $0 [OPTIONS]
-l | --list List the available collection:test entries
-d | --dry-run Don't actually run any tests
-h | --help Show this usage info
+ -o | --override-timeout Number of seconds after which we timeout
EOF
exit $1
}
@@ -33,6 +34,7 @@ EOF
COLLECTIONS=""
TESTS=""
dryrun=""
+kselftest_override_timeout=""
while true; do
case "$1" in
-s | --summary)
@@ -51,6 +53,9 @@ while true; do
-d | --dry-run)
dryrun="echo"
shift ;;
+ -o | --override-timeout)
+ kselftest_override_timeout="$2"
+ shift 2 ;;
-h | --help)
usage 0 ;;
"")
--
2.39.2
It turned out that an even dozen patches were required in order to get the
selftests building cleanly, and all running, once again. I made it worse on
myself by insisting on using clang, which seems to uncover a few more warnings
than gcc these days.
So I still haven't gotten to my original goal of running a new HMM test that
Alistair handed me (it's not here yet), but at least this fixes everything I ran
into just now.
John Hubbard (12):
selftests/mm: fix uffd-stress unused function warning
selftests/mm: fix unused variable warning in hugetlb-madvise.c
selftests/mm: fix unused variable warning in migration.c
selftests/mm: fix a char* assignment in mlock2-tests.c
selftests/mm: fix invocation of tests that are run via shell scripts
selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
selftests/mm: set -Wno-format-security to avoid uffd build warnings
selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
selftests/mm: move psize(), pshift() into vm_utils.c
selftests/mm: move uffd* routines from vm_util.c to uffd-common.c
selftests/mm: fix missing UFFDIO_CONTINUE_MODE_WP and similar build
failures
selftests/mm: fix uffd-unit-tests.c build failure due to missing
MADV_COLLAPSE
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 9 +-
tools/testing/selftests/mm/cow.c | 7 --
tools/testing/selftests/mm/hugepage-mremap.c | 2 +-
tools/testing/selftests/mm/hugetlb-madvise.c | 2 +-
tools/testing/selftests/mm/khugepaged.c | 10 --
.../selftests/mm/ksm_functional_tests.c | 2 +-
tools/testing/selftests/mm/migration.c | 2 +-
tools/testing/selftests/mm/mlock2-tests.c | 2 +-
tools/testing/selftests/mm/pkey-x86.h | 2 +-
tools/testing/selftests/mm/run_vmtests.sh | 6 +-
tools/testing/selftests/mm/uffd-common.c | 105 +++++++++++++++++
tools/testing/selftests/mm/uffd-common.h | 29 ++++-
tools/testing/selftests/mm/uffd-stress.c | 10 --
tools/testing/selftests/mm/vm_util.c | 106 ++----------------
tools/testing/selftests/mm/vm_util.h | 36 ++----
16 files changed, 170 insertions(+), 162 deletions(-)
base-commit: 929ed21dfdb6ee94391db51c9eedb63314ef6847
--
2.40.1
On Sun, Jun 4, 2023, at 10:29, 吴章金 wrote:
>
> Sorry for missing part of your feedbacks, I will check if -nostdlib
> stops the linking of libgcc_s or my own separated test script forgot
> linking the libgcc_s manually.
According to the gcc documentation, -nostdlib drops libgcc.a, but
adding -lgcc is the recommended way to bring it back.
> And as suggestion from Thomas' reply,
>
>>> Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the
>>> ones for the other architectures, or get one from lib/math/div64.c.
>
>>No, these ones come from the compiler via libgcc_s, we must not try to
> reimplement them. And we should do our best to avoid depending on them
> to avoid the error you got above.
>
> So, the explicit conversion is used instead in the patch.
I think a cast to a 32-bit type is ideal when converting the
clock_gettime() result into microseconds, since the kernel guarantees
that the timespec value is normalized, with all zeroes in the
upper 34 bits. Going through __aeabi_ldivmod would make the
conversion much slower.
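
A minimal sketch of that cast, assuming a normalized nanosecond value (the
function name is hypothetical):

```c
#include <stdint.h>

/*
 * Convert the nanosecond field of a normalized timespec to microseconds.
 * A normalized value lies in [0, 999999999] and therefore fits in 32 bits,
 * so the cast loses nothing and the division is a plain 32-bit one. On a
 * 32-bit target this avoids pulling in a 64-bit division helper.
 */
static uint32_t demo_nsec_to_usec(int64_t tv_nsec)
{
	return (uint32_t)tv_nsec / 1000;
}
```

The cast is only safe because of the normalization guarantee; a value
outside that range would be silently truncated.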
For user-supplied non-normalized timeval values, it's not obvious
whether we need the full 64-bit division.
Arnd
This patchset consolidates a number of disparate items that can all be
considered cleanups. They are all related to mlxsw in that they are
directly in mlxsw code, or in selftests that mlxsw heavily uses.
- patch #1 fixes a comment, patch #2 propagates an extack
- patches #3 and #4 tweak several loops to query a resource once and cache
in a local variable instead of querying on each iteration
- patches #5 and #6 fix selftest diagrams, and #7 adds a missing diagram
into an existing test
- patch #8 disables a PVID on a bridge in a selftest that should not need
said PVID
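
The loop change in patches #3 and #4 can be illustrated with a small
stand-alone sketch. demo_query_max_rifs() is a hypothetical stand-in for the
resource query; the counter only exists to make the number of queries
visible.

```c
static int demo_query_count;

/* Hypothetical stand-in for a resource query such as reading MAX_RIFS. */
static int demo_query_max_rifs(void)
{
	demo_query_count++;
	return 8;
}

/* Before: the loop bound is queried again on every iteration. */
static int demo_count_used_requery(const int *used)
{
	int i, n = 0;

	for (i = 0; i < demo_query_max_rifs(); i++)
		n += used[i] != 0;
	return n;
}

/* After: query once and keep the result in a local variable. */
static int demo_count_used_cached(const int *used)
{
	int max_rifs = demo_query_max_rifs();
	int i, n = 0;

	for (i = 0; i < max_rifs; i++)
		n += used[i] != 0;
	return n;
}
```

Both variants return the same count; the cached one issues a single query
instead of one per loop condition check.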
Petr Machata (8):
mlxsw: spectrum_router: Clarify a comment
mlxsw: spectrum_router: Use extack in
mlxsw_sp~_rif_ipip_lb_configure()
mlxsw: spectrum_router: Do not query MAX_RIFS on each iteration
mlxsw: spectrum_router: Do not query MAX_VRS on each iteration
selftests: mlxsw: ingress_rif_conf_1d: Fix the diagram
selftests: mlxsw: egress_vid_classification: Fix the diagram
selftests: router_bridge_vlan: Add a diagram
selftests: router_bridge_vlan: Set vlan_default_pvid 0 on the bridge
.../ethernet/mellanox/mlxsw/spectrum_router.c | 26 ++++++++++++-------
.../net/mlxsw/egress_vid_classification.sh | 5 ++--
.../drivers/net/mlxsw/ingress_rif_conf_1d.sh | 5 ++--
.../net/forwarding/router_bridge_vlan.sh | 24 ++++++++++++++++-
4 files changed, 43 insertions(+), 17 deletions(-)
--
2.40.1
Hi, Willy
While working on adding new syscalls and the related library routines,
I noticed that most of the library routines share the same syscall call
and return logic. This patchset adds two macros to simplify and shrink them.
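
The shared call-and-return pattern can be sketched as follows. All names
here (demo_sys_op, demo_syscall_ret, demo_syscall) are hypothetical; the
real __syscall() and __syscall_ret() definitions are in the patches and are
not reproduced in this cover letter.

```c
#include <errno.h>

/* Hypothetical raw syscall wrapper: returns >= 0 or a negative errno. */
static long demo_sys_op(int fail)
{
	return fail ? -EBADF : 3;
}

/* Fold the "negative means error" convention into errno and -1. */
static long demo_syscall_ret(long ret)
{
	if (ret < 0) {
		errno = (int)-ret;
		return -1;
	}
	return ret;
}

/* Call demo_sys_<name>() and convert its result in one step. */
#define demo_syscall(name, ...) \
	demo_syscall_ret(demo_sys_##name(__VA_ARGS__))

/* A library routine then shrinks to a one-liner. */
static int demo_op(int fail)
{
	return (int)demo_syscall(op, fail);
}
```

Each library routine that previously repeated the errno handling by hand
reduces to a single call through the macro.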
All of them have been tested on arm, aarch64, rv32 and rv64, no new
regressions found.
If this is ok, I will rebase the new syscalls and library routines on
this patchset.
Best regards,
Zhangjin
---
Zhangjin Wu (4):
tools/nolibc: unistd.h: add __syscall() and __syscall_ret() helpers
tools/nolibc: unistd.h: apply __syscall_ret() helper
tools/nolibc: sys.h: apply __syscall_ret() helper
tools/nolibc: sys.h: apply __syscall() helper
tools/include/nolibc/sys.h | 369 ++++++----------------------------
tools/include/nolibc/unistd.h | 12 +-
2 files changed, 65 insertions(+), 316 deletions(-)
--
2.25.1
Hi, Willy
This is v3 of the generic part 1 for rv32. All of the issues found in v2
part 1 [1] have been fixed, several generic patches from v2 part 2 [2]
have been fixed up and merged into this series, and the standalone
test_fork patch [4] is merged into this series too, with a Reviewed-by line.
This series is based on 20230528-nolibc-rv32+stkp5 branch of [5].
Changes from v2 -> v3:
* selftests/nolibc: fix up compile warning with glibc on x86_64
Use simpler 'long long' conversion instead of old #ifdef ...
(Suggestion from Willy)
* tools/nolibc: add missing nanoseconds support for __NR_statx
Split the compound assignment into two single assignments
(Suggestion from Thomas)
* selftests/nolibc: add new gettimeofday test cases
Removed the gettimeofday(NULL, &tz)
(Suggestion from Thomas)
All of the commit messages have been re-checked, some missing
Suggested-by lines are added.
The whole patchset has been tested on arm, aarch64, rv32 and rv64, with no
regressions (the next compile patchset is required to do the rv32 test).
The nolibc-test has been tested with glibc on x86_64 too.
Btw, we have found the following poll failures on arm (not introduced by
this patchset); they will be fixed in our coming ppoll_time64 patchset:
48 poll_null = -1 ENOSYS [FAIL]
49 poll_stdout = -1 ENOSYS [FAIL]
50 poll_fault = -1 ENOSYS != (-1 EFAULT) [FAIL]
And the gettimeofday_null removal patch from Thomas [3] may conflict
with the gettimeofday removal and addition patches, but it is not hard
to fix.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1685362482.git.falcon@tinylab.org…
[2]: https://lore.kernel.org/linux-riscv/cover.1685387484.git.falcon@tinylab.org…
[3]: https://lore.kernel.org/lkml/20230530-nolibc-gettimeofday-v1-1-7307441a002b…
[4]: https://lore.kernel.org/lkml/61bdfe7bacebdef8aa9195f6f2550a5b0d33aab3.16854…
[5]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
Zhangjin Wu (12):
selftests/nolibc: syscall_args: use generic __NR_statx
tools/nolibc: add missing nanoseconds support for __NR_statx
selftests/nolibc: allow specify extra arguments for qemu
selftests/nolibc: fix up compile warning with glibc on x86_64
selftests/nolibc: not include limits.h for nolibc
selftests/nolibc: use INT_MAX instead of __INT_MAX__
tools/nolibc: arm: add missing my_syscall6
tools/nolibc: open: fix up compile warning for arm
selftests/nolibc: support two errnos with EXPECT_SYSER2()
selftests/nolibc: remove gettimeofday_bad1/2 completely
selftests/nolibc: add new gettimeofday test cases
selftests/nolibc: test_fork: fix up duplicated print
tools/include/nolibc/arch-arm.h | 23 +++++++++++
tools/include/nolibc/stdint.h | 14 +++++++
tools/include/nolibc/sys.h | 39 +++++++++---------
tools/testing/selftests/nolibc/Makefile | 2 +-
tools/testing/selftests/nolibc/nolibc-test.c | 42 ++++++++++++--------
5 files changed, 85 insertions(+), 35 deletions(-)
--
2.25.1
Running nolibc-test with glibc on x86_64 shows the following print issue:
29 execve_root = -1 EACCES [OK]
30 fork30 fork = 0 [OK]
31 getdents64_root = 712 [OK]
The fork test case has three printf calls:
(1) llen += printf("%d %s", test, #name);
(2) llen += printf(" = %d %s ", expr, errorname(errno));
(3) llen += pad_spc(llen, 64, "[FAIL]\n"); --> vfprintf()
The issue happens in the following scenario:
(a) The parent calls (1)
(b) The parent calls fork()
(c) The child runs and inherits a copy of the print buffer of (1)
(d) The child exits, flushes the print buffer and closes its own stdout/stderr
* "30 fork" is printed for the first time.
(e) The parent calls (2) and (3); the "\n" in (3) flushes the whole buffer
* "30 fork = 0 ..." is printed
Therefore, "30 fork" appears twice in stdout.
If stdout (and stderr) are flushed between (a) and (b), the child in
stage (c) will not be able to 'see' the print buffer.
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---
tools/testing/selftests/nolibc/nolibc-test.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 7de46305f419..88323a60aa4a 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -486,7 +486,13 @@ static int test_getpagesize(void)
static int test_fork(void)
{
int status;
- pid_t pid = fork();
+ pid_t pid;
+
+ /* flush the printf buffer to avoid child flush it */
+ fflush(stdout);
+ fflush(stderr);
+
+ pid = fork();
switch (pid) {
case -1:
--
2.25.1
Hi,
This is somewhat related to the 11-patch series that I just posted [1],
but I hadn't originally planned to go this far with it. But since David
Hildenbrand asked if we could warn in this case [2], here it is.
It turns out that automatically doing the "make headers" correctly is
much harder than just warning, so I stopped at that point. It works well,
though.
[1] https://lore.kernel.org/all/20230603021558.95299-1-jhubbard@nvidia.com/
[2] https://lore.kernel.org/all/a4fbc191-9acb-5db8-a375-96c0c1ba3fcd@redhat.com/
thanks,
John Hubbard
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: linux-doc(a)vger.kernel.org
John Hubbard (1):
selftests: error out if kernel header files are not yet built
tools/testing/selftests/lib.mk | 36 +++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
base-commit: e5282a7d8f6b604f2bb6a06457734b8cf1e2f8f2
--
2.40.1
Adamos found an issue with the cached XFD state [1]. Although the XFD
state is reset on CPU hotplug, the per-CPU XFD cache is not reset along
with it. When an AMX thread then runs on that CPU, the stale cached value
causes the kernel to kill the thread.
This is reproducible by moving an AMX thread to the hot-plugged CPU.
So, add a test case to ensure there is no issue with that.
The test is repeated because of possible inconsistencies. Together with
the hotplug cost, this brings a noticeable runtime increase, but the
overall test still has a quick turnaround time.
Link: https://lore.kernel.org/lkml/20230519112315.30616-1-attofari@amazon.de/ [1]
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Adamos Ttofari <attofari(a)amazon.de>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
The overall x86 selftest run via "$ make TARGETS='x86' kselftest" goes
from about 3.5 to about 5.5 seconds. 'amx_64' itself takes about 1.5
seconds more than its previous 0.x seconds.
But the overall runtime is still a matter of seconds, which I think
should be fine.
---
tools/testing/selftests/x86/amx.c | 133 ++++++++++++++++++++++++++++--
1 file changed, 126 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c
index d884fd69dd51..6f2f0598c706 100644
--- a/tools/testing/selftests/x86/amx.c
+++ b/tools/testing/selftests/x86/amx.c
@@ -3,6 +3,7 @@
#define _GNU_SOURCE
#include <err.h>
#include <errno.h>
+#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <stdio.h>
@@ -25,6 +26,8 @@
# error This test is 64-bit only
#endif
+#define BUF_LEN 1000
+
#define XSAVE_HDR_OFFSET 512
#define XSAVE_HDR_SIZE 64
@@ -239,11 +242,10 @@ static inline uint64_t get_fpx_sw_bytes_features(void *buffer)
}
/* Work around printf() being unsafe in signals: */
-#define SIGNAL_BUF_LEN 1000
-char signal_message_buffer[SIGNAL_BUF_LEN];
+char signal_message_buffer[BUF_LEN];
void sig_print(char *msg)
{
- int left = SIGNAL_BUF_LEN - strlen(signal_message_buffer) - 1;
+ int left = BUF_LEN - strlen(signal_message_buffer) - 1;
strncat(signal_message_buffer, msg, left);
}
@@ -767,15 +769,15 @@ static int create_threads(int num, struct futex_info *finfo)
return 0;
}
-static void affinitize_cpu0(void)
+static inline void affinitize_cpu(int cpu)
{
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
- CPU_SET(0, &cpuset);
+ CPU_SET(cpu, &cpuset);
if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0)
- fatal_error("sched_setaffinity to CPU 0");
+ fatal_error("sched_setaffinity to CPU %d", cpu);
}
static void test_context_switch(void)
@@ -784,7 +786,7 @@ static void test_context_switch(void)
int i;
/* Affinitize to one CPU to force context switches */
- affinitize_cpu0();
+ affinitize_cpu(0);
req_xtiledata_perm();
@@ -926,6 +928,121 @@ static void test_ptrace(void)
err(1, "ptrace test");
}
+/* CPU Hotplug test */
+
+static void __hotplug_cpu(int online, int cpu)
+{
+ char buf[BUF_LEN] = {};
+ int fd, rc;
+
+ strncat(buf, "/sys/devices/system/cpu/cpu", BUF_LEN);
+ snprintf(buf + strlen(buf), BUF_LEN - strlen(buf), "%d", cpu);
+ strncat(buf, "/online", BUF_LEN - strlen(buf));
+
+ fd = open(buf, O_RDWR);
+ if (fd == -1)
+ fatal_error("open()");
+
+ snprintf(buf, BUF_LEN, "%d", online);
+ rc = write(fd, buf, strlen(buf));
+ if (rc == -1)
+ fatal_error("write()");
+
+ rc = close(fd);
+ if (rc == -1)
+ fatal_error("close()");
+}
+
+static void offline_cpu(int cpu)
+{
+ __hotplug_cpu(0, cpu);
+}
+
+static void online_cpu(int cpu)
+{
+ __hotplug_cpu(1, cpu);
+}
+
+static jmp_buf jmpbuf;
+
+static void handle_sigsegv(int sig, siginfo_t *si, void *ctx_void)
+{
+ siglongjmp(jmpbuf, 1);
+}
+
+#define RETRY 5
+
+/*
+ * Sanity-check the hotplug CPU for its (re-)initialization.
+ *
+ * Create an AMX thread on a CPU, while the hotplug CPU went offline.
+ * Then, plug the offlined back, and move the thread to run on it.
+ *
+ * Repeat this multiple times to ensure no inconsistent failure.
+ * If something goes wrong, the thread will get a signal or killed.
+ */
+static void *switch_cpus(void *arg)
+{
+ int *result = (int *)arg;
+ int i = 0;
+
+ affinitize_cpu(0);
+ offline_cpu(1);
+ load_rand_tiledata(stashed_xsave);
+
+ sethandler(SIGSEGV, handle_sigsegv, SA_ONSTACK);
+ for (i = 0; i < RETRY; i++) {
+ if (i > 0) {
+ affinitize_cpu(0);
+ offline_cpu(1);
+ }
+ if (sigsetjmp(jmpbuf, 1) == 0) {
+ online_cpu(1);
+ affinitize_cpu(1);
+ } else {
+ *result = 1;
+ goto out;
+ }
+ }
+ *result = 0;
+out:
+ clearhandler(SIGSEGV);
+ return result;
+}
+
+static void test_cpuhp(void)
+{
+ int max_cpu_num = sysconf(_SC_NPROCESSORS_ONLN) - 1;
+ void *thread_retval;
+ pthread_t thread;
+ int result, rc;
+
+ if (!max_cpu_num) {
+ printf("[SKIP]\tThe running system has no more CPU for the hotplug test.\n");
+ return;
+ }
+
+ printf("[RUN]\tTest AMX use with the CPU hotplug.\n");
+
+ if (pthread_create(&thread, NULL, switch_cpus, &result))
+ fatal_error("pthread_create()");
+
+ rc = pthread_join(thread, &thread_retval);
+
+ if (rc)
+ fatal_error("pthread_join()");
+
+ /*
+ * Either an invalid retval or a failed result indicates
+ * the test failure.
+ */
+ if (thread_retval != &result || result != 0)
+ printf("[FAIL]\tThe AMX thread had an issue (%s).\n",
+ thread_retval != &result ? "killed" : "signaled");
+ else
+ printf("[OK]\tThe AMX thread had no issue.\n");
+}
+
int main(void)
{
/* Check hardware availability at first */
@@ -948,6 +1065,8 @@ int main(void)
test_ptrace();
+ test_cpuhp();
+
clearhandler(SIGILL);
free_stashed_xsave();
base-commit: 7877cb91f1081754a1487c144d85dc0d2e2e7fc4
--
2.17.1
Hi,
This follows the discussion here:
https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@hou…
This shows a couple of inconsistencies with regard to how device-managed
resources are cleaned up. Basically, devm resources will only be cleaned up
if the device is attached to a bus and bound to a driver. If either
condition is not met, a call to device_unregister will not result in the
devm resources being released.
We had to work around it in DRM to provide helpers to create a device for
kunit tests, but the current discussion around creating similar, generic,
helpers for kunit revived interest in fixing this.
This can be tested using the command:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
---
Maxime Ripard (2):
drivers: base: Add basic devm tests for root devices
drivers: base: Add basic devm tests for platform devices
drivers/base/test/.kunitconfig | 2 +
drivers/base/test/Kconfig | 4 +
drivers/base/test/Makefile | 3 +
drivers/base/test/platform-device-test.c | 278 +++++++++++++++++++++++++++++++
drivers/base/test/root-device-test.c | 120 +++++++++++++
5 files changed, 407 insertions(+)
---
base-commit: a6faf7ea9fcb7267d06116d4188947f26e00e57e
change-id: 20230329-kunit-devm-inconsistencies-test-5e5a7d01e60d
Best regards,
--
Maxime Ripard <maxime(a)cerno.tech>
Commit d937bc3449fa ("bpf: make uniform use of array->elem_size
everywhere in arraymap.c") changed array_map_gen_lookup to use
array->elem_size instead of round_up(map->value_size, 8) as the element
size when generating code to access a value in an array map.
array->elem_size, however, is not set by bpf_map_meta_alloc when
initializing a BPF_MAP_TYPE_ARRAY_OF_MAPS or BPF_MAP_TYPE_HASH_OF_MAPS.
This results in array_map_gen_lookup incorrectly outputting code that
always accesses index 0 in the array (as the index will be calculated
via a multiplication with the element size, which is incorrectly set to
0).
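The arithmetic behind the bug can be shown with a small user-space model.
This is only a sketch under stated assumptions: round_up_pow2() and
value_offset() are illustrative names, not kernel functions, and the model
only mirrors the "index times element size" calculation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of align (align must be a power of two). */
uint32_t round_up_pow2(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Byte offset of element `index` in a flat array of `elem_size`-byte slots. */
size_t value_offset(uint32_t index, uint32_t elem_size)
{
	return (size_t)index * elem_size;
}
```

With a 12-byte value, round_up_pow2(12, 8) gives an element size of 16, so
index 3 lands at offset 48. With an element size that was never set (0),
value_offset(3, 0) is 0, and every lookup ends up at the first element.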
This patchset sets elem_size on the bpf_array object when allocating an
array or hash of maps to fix this and adds a selftest that accesses an
array map nested within a hash of maps at a nonzero index to prevent
regressions.
v1: https://lore.kernel.org/bpf/95b5da7c-ee52-3ecb-0a4e-f6a7a114f269@linux.dev/
Changelog:
v1 -> v2:
Address comments by Martin KaFai Lau:
- Directly use inner_array->elem_size instead of using round_up
- Move selftests to a new patch
- Use ASSERT_* macros instead of CHECK and remove duration
- Remove unnecessary usleep
- Shorten selftest name
Rhys Rustad-Elliott (2):
bpf: Fix elem_size not being set for inner maps
selftests/bpf: Add access_inner_map selftest
kernel/bpf/map_in_map.c | 8 +++-
.../bpf/prog_tests/inner_array_lookup.c | 31 +++++++++++++
.../bpf/progs/test_inner_array_lookup.c | 45 +++++++++++++++++++
3 files changed, 82 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/inner_array_lookup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_inner_array_lookup.c
--
2.40.1
*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding the PAGEMAP_SCAN IOCTL is to emulate the
Windows GetWriteWatch() syscall [1]. GetWriteWatch() retrieves the
addresses of the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications, games, etc. It is currently
emulated in userspace in a pretty slow manner. Our purpose is to enhance
the kernel so that it can be emulated efficiently.
Currently some out-of-tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches,
so that gaming on Linux as a whole can benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we felt that the kernel's soft-dirty
feature could be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use an ioctl on the pagemap file to read and/or reset the
soft-dirty flag. But when using the soft-dirty flag, we sometimes get extra
pages which weren't even written. They had become soft-dirty because of VMA
merging and the VM_SOFTDIRTY flag. This breaks the definition of
GetWriteWatch(). We were able to bypass this shortcoming by ignoring
VM_SOFTDIRTY until David reported that mprotect etc. messes up the
soft-dirty flag while ignoring VM_SOFTDIRTY [5]. This wasn't happening until
[6] got introduced. We discussed whether we could revert these patches, but
we could not reach any conclusion. So at this point, I made a couple of
attempts to solve this whole VM_SOFTDIRTY issue by correcting the soft-dirty
implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had the potential to
cause regressions. We left it behind.
* [8] Keep a list of the soft-dirty parts of a VMA across splits and merges.
The reply I got was not to increase the size of the VMA by 8 bytes.
At this point, we gave up on soft-dirty, considering it too delicate, and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on the userfaultfd wp feature, where
the kernel resolves the faults itself when the WP_ASYNC feature is used. It
was straightforward to add the WP_ASYNC feature to userfaultfd. Now only
those pages which have really been written to are reported as dirty or
written-to. (PS: another userfaultfd feature, WP_UNPOPULATED, is required to
avoid pre-faulting memory before write-protecting [9].)
All the different masks were added at the request of the CRIU devs to make
the interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
*Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As the
kernel already has a soft-dirty feature, which we have given up on using,
we use the written-to terminology when UFFD async WP is used under the
hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- An atomic get-and-clear of the soft-dirty/written-to status does not
exist in the kernel.
- The pages which have been written-to cannot be found in an accurate way.
(The kernel's soft-dirty PTE bit + soft-dirty VMA bit show more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
GetWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of the soft-dirty feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
Its behaviour shouldn't be changed now. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is an
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information on whether a page is file mapped, present or swapped is
required for the CRIU project [5][6]. The addition of the required mask,
any mask, excluded mask and return mask is also required for the CRIU
project [5].
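To make the four masks more concrete, here is a small user-space model of
one plausible reading of their names. It is a sketch only: the DEMO_* bit
values, struct demo_masks and both functions are hypothetical and are not
taken from the uapi header; the authoritative semantics are whatever the
patch and its documentation define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical category bits; the real values live in the uapi header. */
#define DEMO_PAGE_WRITTEN	(1u << 0)
#define DEMO_PAGE_FILE		(1u << 1)
#define DEMO_PAGE_PRESENT	(1u << 2)
#define DEMO_PAGE_SWAPPED	(1u << 3)

struct demo_masks {
	uint32_t required;	/* every bit here must be set on the page */
	uint32_t anyof;		/* if non-zero, at least one bit must be set */
	uint32_t excluded;	/* no bit here may be set on the page */
	uint32_t ret;		/* bits reported back for matching pages */
};

/* Assumed filter rule: required AND (any-of, if given) AND NOT excluded. */
bool demo_page_matches(uint32_t page_bits, const struct demo_masks *m)
{
	assert(m != NULL);
	if ((page_bits & m->required) != m->required)
		return false;
	if (m->anyof && !(page_bits & m->anyof))
		return false;
	if (page_bits & m->excluded)
		return false;
	return true;
}

/* Bits reported for a page: only the ones selected by the return mask. */
uint32_t demo_reported_bits(uint32_t page_bits, const struct demo_masks *m)
{
	return demo_page_matches(page_bits, m) ? (page_bits & m->ret) : 0;
}
```

Under this reading, a request with required = WRITTEN and excluded = FILE
selects written anonymous pages and skips written file-backed ones.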
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where the user only
wants to get a specific number of pages. So there is no need to find all
the pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of pages is found. The max_pages is
optional. If max_pages is specified, it must be equal to or greater than
the vec_size. This restriction is needed to handle the worst case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows GetWriteWatch() syscall.
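The idea of returning pages "in a compact form" and the worst case
mentioned above can be illustrated with a user-space model. This is a
hedged sketch: struct demo_region and demo_compact() are hypothetical
stand-ins, not the patch's struct page_region or its kernel code, and the
stop conditions are an assumption about how vec_size and max_pages
interact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a compact region: a run of adjacent pages. */
struct demo_region {
	uint64_t start;	/* address of the first page in the run */
	uint64_t len;	/* number of pages in the run */
};

/*
 * Fold a sorted list of page addresses into runs of adjacent pages.
 * Stops when a new region is needed but `vec_size` regions are already in
 * use, or when `max_pages` pages were recorded (0 means "no limit").
 * Returns the number of regions written to `vec`.
 */
size_t demo_compact(const uint64_t *pages, size_t n, uint64_t page_size,
		    struct demo_region *vec, size_t vec_size, size_t max_pages)
{
	size_t used = 0, recorded = 0, i;

	assert(page_size != 0);
	for (i = 0; i < n; i++) {
		if (max_pages && recorded == max_pages)
			break;
		if (used && vec[used - 1].start +
			    vec[used - 1].len * page_size == pages[i]) {
			vec[used - 1].len++;	/* extends the current run */
		} else {
			if (used == vec_size)
				break;		/* no room for another region */
			vec[used].start = pages[i];
			vec[used].len = 1;
			used++;
		}
		recorded++;
	}
	return used;
}
```

Three adjacent pages collapse into one region, while pages that are not
adjacent need one region each, which is the worst case where nothing can
be compacted.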
The patch series includes a detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 58 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 503 ++++++
fs/userfaultfd.c | 26 +-
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 53 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 32 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1459 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
15 files changed, 2262 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
This is part of the effort to remove the empty element of the ctl_table
structures (used to calculate size) and replace it with an ARRAY_SIZE call. By
replacing the child element in struct ctl_table with a flags element we make
sure that there are no forward recursions on child nodes and therefore set
ourselves up for just using an ARRAY_SIZE. We also added some self tests to
make sure that we do not break anything.
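The difference between sizing a table by an empty trailing element and
sizing it with ARRAY_SIZE can be sketched in plain C. This is an
illustration under assumptions, not kernel code: struct demo_entry is a
much reduced, hypothetical stand-in for struct ctl_table, and
DEMO_ARRAY_SIZE only mimics what the kernel's ARRAY_SIZE macro computes.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, much reduced stand-in for a table entry. */
struct demo_entry {
	const char *procname;	/* NULL marks the sentinel in the old scheme */
	int mode;
};

/* Old scheme: walk the table until the empty sentinel element. */
size_t demo_count_by_sentinel(const struct demo_entry *table)
{
	size_t n = 0;

	assert(table != NULL);
	while (table[n].procname)
		n++;
	return n;
}

/* Table with a trailing empty element that only marks the end. */
const struct demo_entry demo_with_sentinel[] = {
	{ "int_0001", 0644 },
	{ "int_0002", 0644 },
	{ NULL, 0 },
};

/* Table without the sentinel: the size comes from the array type. */
const struct demo_entry demo_sized[] = {
	{ "int_0001", 0644 },
	{ "int_0002", 0644 },
};
```

Here demo_count_by_sentinel(demo_with_sentinel) and
DEMO_ARRAY_SIZE(demo_sized) both describe two real entries, but only the
first table has to carry an extra empty element to do so.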
The patchset is separated into 4 parts: parport fixes, selftests fixes,
selftests additions and replacement of the child element. Everything was
tested with the sysctl selftests and seems "ok".
1. parport fixes: @mcgrof: this is related to my previous series and it plugs a
sysctl table leak in the parport driver. Please tell me if you want me to repost
the parport series with this one stitched in.
2. Selftests fixes: Remove the prefixed zeros when passing an awk field to the
awk print command because it was causing $0009 to be interpreted as $0.
Replaced continue with return in sysctl.sh(test_case) so the test actually
gets skipped. The skip decision is now in sysctl.sh(skip_test).
3. Selftest additions: New test to confirm that unregister actually removes
targets. New test to confirm that permanently empty targets are indeed
created and that no other targets can be created "on top".
4. Replaced the child pointer in struct ctl_table with a u8 flag. The flag
is used to differentiate between permanently empty targets and non-empty ones.
Comments/feedback greatly appreciated
Best
Joel
Joel Granados (8):
parport: plug a sysctl register leak
test_sysctl: Fix test metadata getters
test_sysctl: Group node sysctl test under one func
test_sysctl: Add an unregister sysctl test
test_sysctl: Add an option to prevent test skip
test_sysclt: Test for registering a mount point
sysctl: Remove debugging dump_stack
sysctl: replace child with a flags var
drivers/parport/procfs.c | 23 ++---
fs/proc/proc_sysctl.c | 82 ++++------------
include/linux/sysctl.h | 4 +-
lib/test_sysctl.c | 91 ++++++++++++++++--
tools/testing/selftests/sysctl/sysctl.sh | 115 +++++++++++++++++------
5 files changed, 204 insertions(+), 111 deletions(-)
--
2.30.2
From: Jeff Xu <jeffxu(a)google.com>
This is the first set of Memory mapping (VMA) protection patches using PKU.
* * *
Background:
As discussed previously in the kernel mailing list [1], V8 CFI [2] uses
PKU to protect memory, and Stephen Röttger proposes to extend the PKU to
memory mapping [3].
We're using PKU for in-process isolation to enforce control-flow integrity
for a JIT compiler. In our threat model, an attacker exploits a
vulnerability and has arbitrary read/write access to the whole process
space concurrently to other threads being executed. This attacker can
manipulate some arguments to syscalls from some threads.
Under such a powerful attack, we want to create a “safe/isolated”
thread environment. We assign dedicated PKUs to this thread,
and use those PKUs to protect the threads’ runtime environment.
The thread has exclusive access to its run-time memory. This
includes modifying the protection of the memory mapping, or
munmap the memory mapping after use. And the other threads
won’t be able to access the memory or modify the memory mapping
(VMA) belonging to the thread.
* * *
Proposed changes:
This patch introduces a new flag, PKEY_ENFORCE_API, to the pkey_alloc()
function. When a PKEY is created with this flag, it is enforced that any
thread that wants to make changes to the memory mapping (such as mprotect)
of the memory must have write access to the PKEY. PKEYs created without
this flag will continue to work as they do now, for backwards
compatibility.
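The rule described above can be modelled in user space as follows. This is
a sketch under assumptions, not the patch's implementation: it assumes the
x86 PKRU layout (two bits per key, access-disable then write-disable), and
demo_pkru_allows_write() and demo_may_change_mapping() are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PKRU_AD 0x1u	/* access-disable bit of a key's 2-bit field */
#define DEMO_PKRU_WD 0x2u	/* write-disable bit of a key's 2-bit field */

/* True if the PKRU value leaves `pkey` (0..15) accessible and writable. */
bool demo_pkru_allows_write(uint32_t pkru, unsigned int pkey)
{
	assert(pkey < 16);
	return ((pkru >> (2 * pkey)) & (DEMO_PKRU_AD | DEMO_PKRU_WD)) == 0;
}

/*
 * Hypothetical model of the proposed check: a mapping change is refused
 * only when the key was allocated with the enforcing flag and the calling
 * thread lacks write access to that key.
 */
bool demo_may_change_mapping(uint32_t pkru, unsigned int pkey, bool enforced)
{
	return !enforced || demo_pkru_allows_write(pkru, pkey);
}
```

In this model, a thread whose PKRU write-disables key 1 can still change
mappings tagged with a non-enforcing key 1, but is refused once the key is
marked as enforcing.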
Only PKEYs created from user space can have the new flag set; PKEYs
allocated by the kernel internally will not have it. In other words,
ARCH_DEFAULT_PKEY(0) and execute_only_pkey won’t have this flag set,
and continue to work as they do today.
This flag is checked only at syscall entry, such as mprotect/munmap in
this set of patches. It does not apply to other call paths. In other
words, if the kernel wants to change the attributes of a VMA for some
reason, it is free to do so and is not affected by this new flag.
This set of patches covers mprotect/munmap; I plan to work on other
syscalls after this.
* * *
Testing:
I have tested this patch on Linux kernels 5.15, 6.1, and 6.4-rc1;
a new selftest is added in pkey_enforce_api.c
* * *
Discussion:
We believe that this patch provides a valuable security feature.
It allows us to create “safe/isolated” thread environments that are
protected from attackers with arbitrary read/write access to
the process space.
We believe that the interface change and the patch don't
introduce backwards compatibility risk.
We would like to discuss this patch with the Linux kernel community
for feedback and support.
* * *
Reference:
[1] https://lore.kernel.org/all/202208221331.71C50A6F@keescook/
[2] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyX…
[3] https://docs.google.com/document/d/1qqVoVfRiF2nRylL3yjZyCQvzQaej1HRPh3f5w…
Best Regards,
-Jeff Xu
Jeff Xu (6):
PKEY: Introduce PKEY_ENFORCE_API flag
PKEY: Add arch_check_pkey_enforce_api()
PKEY: Apply PKEY_ENFORCE_API to mprotect
PKEY:selftest pkey_enforce_api for mprotect
KEY: Apply PKEY_ENFORCE_API to munmap
PKEY:selftest pkey_enforce_api for munmap
arch/powerpc/include/asm/pkeys.h | 19 +-
arch/x86/include/asm/mmu.h | 7 +
arch/x86/include/asm/pkeys.h | 92 +-
arch/x86/mm/pkeys.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/pkeys.h | 18 +-
include/uapi/linux/mman.h | 5 +
mm/mmap.c | 34 +-
mm/mprotect.c | 31 +-
mm/mremap.c | 6 +-
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/pkey_enforce_api.c | 1312 +++++++++++++++++
12 files changed, 1507 insertions(+), 22 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_enforce_api.c
base-commit: ba0ad6ed89fd5dada3b7b65ef2b08e95d449d4ab
--
2.40.1.606.ga4b1b128d6-goog
This patch updates the cgroup-v2.rst file to include information about
the new "cpuset.cpus.reserve" control file as well as the new remote
partition.
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
Documentation/admin-guide/cgroup-v2.rst | 92 +++++++++++++++++++++----
1 file changed, 79 insertions(+), 13 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index f67c0829350b..3e9351c2cd27 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2215,6 +2215,38 @@ Cpuset Interface Files
Its value will be affected by memory nodes hotplug events.
+ cpuset.cpus.reserve
+ A read-write multiple values file which exists only on root
+ cgroup.
+
+ It lists all the CPUs that are reserved for adjacent and remote
+ partitions created in the system. See the next section for
+ more information on what an adjacent or remote partitions is.
+
+ Creation of adjacent partition does not require touching this
+ control file as CPU reservation will be done automatically.
+ In order to create a remote partition, the CPUs needed by the
+ remote partition has to be written to this file first.
+
+ Due to the fact that "cpuset.cpus.reserve" holds reserve CPUs
+ that can be used by multiple partitions and automatic reservation
+ may also race with manual reservation, an extension prefixes of
+ "+" and "-" are allowed for this file to reduce race.
+
+ A "+" prefix can be used to indicate a list of additional
+ CPUs that are to be added without disturbing the CPUs that are
+ originally there. For example, if its current value is "3-4",
+ echoing ""+5" to it will change it to "3-5".
+
+ Once a remote partition is destroyed, its CPUs have to be
+ removed from this file or no other process can use them. A "-"
+ prefix can be used to remove a list of CPUs from it. However,
+ removing CPUs that are currently used in existing partitions
+ may cause those partitions to become invalid. A single "-"
+ character without any number can be used to indicate removal
+ of all the free CPUs not yet allocated to any partitions to
+ avoid accidental partition invalidation.
+
cpuset.cpus.partition
A read-write single value file which exists on non-root
cpuset-enabled cgroups. This flag is owned by the parent cgroup
@@ -2228,25 +2260,49 @@ Cpuset Interface Files
"isolated" Partition root without load balancing
========== =====================================
- The root cgroup is always a partition root and its state
- cannot be changed. All other non-root cgroups start out as
- "member".
+ A cpuset partition is a collection of cgroups with a partition
+ root at the top of the hierarchy and its descendants except
+ those that are separate partition roots themselves and their
+ descendants. A partition has exclusive access to the set of
+ CPUs allocated to it. Other cgroups outside of that partition
+ cannot use any CPUs in that set.
+
+ There are two types of partitions - adjacent and remote. The
+ parent of an adjacent partition must be a valid partition root.
+ Partition roots of adjacent partitions are all clustered around
+ the root cgroup. Creation of an adjacent partition is done by
+ writing the desired partition type into "cpuset.cpus.partition".
+
+ A remote partition does not require a partition root parent.
+ So a remote partition can be formed far from the root cgroup.
+ However, its creation is a 2-step process. The CPUs needed
+ by a remote partition ("cpuset.cpus" of the partition root)
+ have to be written into "cpuset.cpus.reserve" of the root
+ cgroup first. After that, "isolated" can be written into
+ "cpuset.cpus.partition" of the partition root to form a remote
+ isolated partition, which is the only supported remote partition
+ type for now.
+
+ All remote partitions are terminal, as an adjacent partition
+ cannot be created underneath one. Given the way a remote
+ partition is formed, it is also not possible to create another
+ valid remote partition underneath it.
+
+ The root cgroup is always a partition root and its state cannot
+ be changed. All other non-root cgroups start out as "member".
When set to "root", the current cgroup is the root of a new
- partition or scheduling domain that comprises itself and all
- its descendants except those that are separate partition roots
- themselves and their descendants.
+ partition or scheduling domain.
- When set to "isolated", the CPUs in that partition root will
+ When set to "isolated", the CPUs in that partition will
be in an isolated state without any load balancing from the
scheduler. Tasks placed in such a partition with multiple
CPUs should be carefully distributed and bound to each of the
individual CPUs for optimal performance.
- The value shown in "cpuset.cpus.effective" of a partition root
- is the CPUs that the partition root can dedicate to a potential
- new child partition root. The new child subtracts available
- CPUs from its parent "cpuset.cpus.effective".
+ The value shown in "cpuset.cpus.effective" of a partition root is
+ the CPUs that are dedicated to that partition and not available
+ to cgroups outside of that partition.
A partition root ("root" or "isolated") can be in one of the
two possible states - valid or invalid. An invalid partition
@@ -2270,8 +2326,8 @@ Cpuset Interface Files
In the case of an invalid partition root, a descriptive string on
why the partition is invalid is included within parentheses.
- For a partition root to become valid, the following conditions
- must be met.
+ For an adjacent partition root to be valid, the following
+ conditions must be met.
1) The "cpuset.cpus" is exclusive with its siblings , i.e. they
are not shared by any of its siblings (exclusivity rule).
@@ -2281,6 +2337,16 @@ Cpuset Interface Files
4) The "cpuset.cpus.effective" cannot be empty unless there is
no task associated with this partition.
+ For a remote partition root to be valid, the following conditions
+ must be met.
+
+ 1) The same exclusivity rule as for an adjacent partition root.
+ 2) The "cpuset.cpus" is not empty and all the CPUs must be
+ present in "cpuset.cpus.reserve" of the root cgroup and none
+ of them are allocated to another partition.
+ 3) The "cpuset.cpus" value must be present in all its ancestors
+ to ensure proper hierarchical cpu distribution.
+
External events like hotplug or changes to "cpuset.cpus" can
cause a valid partition root to become invalid and vice versa.
Note that a task cannot be moved to a cgroup with empty
--
2.31.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output; they produce a
custom format that is much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default. Users running ftracetest
directly will continue to get the existing output.
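As a rough illustration of the per-test result lines a KTAP consumer expects, the sketch below maps each ftracetest outcome onto an "ok" or "not ok" status plus an optional "# comment". It is written in C purely for illustration, since the actual change lives in the ftracetest shell script. The names `enum ft_result` and `ktap_line()` are hypothetical. The mapping follows the cases in the patch's eval_result(), and the line layout assumes the usual KTAP ordering of status first, then test number.

```c
#include <stdio.h>

/* Hypothetical result kinds, one per case handled in eval_result(). */
enum ft_result {
	FT_PASS, FT_FAIL, FT_UNRESOLVED, FT_UNTESTED,
	FT_UNSUPPORTED, FT_XFAIL, FT_UNDEFINED
};

/*
 * Format one result line as "<ok|not ok> <number> <name>[ # <comment>]".
 * Returns what snprintf() returns (the length that was or would be written).
 */
static int ktap_line(char *buf, size_t len, int caseno,
		     enum ft_result res, const char *name)
{
	const char *status = "not ok";
	const char *comment = "";

	switch (res) {
	case FT_PASS:
		status = "ok";
		break;
	case FT_FAIL:
		break;
	case FT_UNRESOLVED:
		comment = " # UNRESOLVED";
		break;
	case FT_UNTESTED:
	case FT_UNSUPPORTED:
		/* Both are reported as skips. */
		status = "ok";
		comment = " # SKIP";
		break;
	case FT_XFAIL:
		status = "ok";
		comment = " # XFAIL";
		break;
	default:
		/* Undefined result: treated as a test bug. */
		comment = " # error";
		break;
	}
	return snprintf(buf, len, "%s %d %s%s", status, caseno, name, comment);
}
```

Keeping the mapping in a single function mirrors the patch's choice of one ktaptest() helper called from every eval_result() branch, so the human-readable and KTAP outputs cannot drift apart.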
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c40890..2506621e75dfb 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $result $CASENO $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " fail:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
Good morning,
I have looked at your offer and am pleased to say that it catches the eye and invites further discussion.
I thought I might be able to contribute to your growth and help this offer reach a wider audience. I do search engine optimization for websites, which brings them excellent web traffic.
Could we talk sometime soon?
Best regards,
Adam Charachuta
One can use "cpuset.cpus.partition" to create multiple scheduling domains
or to produce a set of isolated CPUs where load balancing is disabled.
The former use case is less common but the latter one can be frequently
used especially for the Telco use cases like DPDK.
The existing "isolated" partition can be used to produce isolated
CPUs if the applications have full control of a system. However, in a
containerized environment where all the apps are run in a container,
it is hard to distribute out isolated CPUs from the root down given
the unified hierarchy nature of cgroup v2.
The container running on isolated CPUs can be several layers down from
the root. The current partition feature requires that all the ancestors
of a leaf partition root must be partition roots themselves. This can
be hard to configure.
This patch introduces a new type of partition called remote partition.
A remote partition is a partition whose parent is not a partition root
itself and its CPUs are acquired directly from available CPUs in the
top cpuset's cpuset.cpus.reserve. By contrast, the existing type of
partition, whose parent has to be a valid partition root, is
referred to as an adjacent partition, as such partitions have to be
clustered around the cgroup root.
This patch enables only the creation of remote isolated partitions
for now.
The creation of a remote isolated partition is a 2-step process.
1) Reserve the CPUs needed by the remote partition by adding CPUs to
cpuset.cpus.reserve of the top cpuset.
2) Enable an isolated partition by
# echo isolated > cpuset.cpus.partition
Such a remote isolated partition P will only be valid if the following
conditions are true.
1) P/cpuset.cpus is a subset of top cpuset's cpuset.cpus.reserve.
2) All the CPUs in P/cpuset.cpus are present in the cpuset.cpus of
all its ancestors to ensure that those CPUs are properly granted
to P in a hierarchical manner.
3) None of the CPUs in P/cpuset.cpus have been acquired by other valid
partitions.
Like adjacent partitions, a remote partition has exclusive access to the
CPUs allocated to that partition. Because of the exclusive nature, none
of the cpuset.cpus of its sibling cpusets can contain any CPUs allocated
to the remote partition or the partition creation process will fail.
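The validity conditions above can be sketched as a small userspace model. This is not the kernel code: it assumes at most 64 CPUs held in a `uint64_t` (bit n is CPU n), and `cpu_bits_t`, `mask_subset()` and `remote_partition_acquire()` are hypothetical names. Conditions 1 and 3 collapse into one check against the free part of the reserve, because CPUs already acquired by another partition are no longer free. The sibling exclusivity rule is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per CPU; a stand-in for a kernel cpumask (assumes <= 64 CPUs). */
typedef uint64_t cpu_bits_t;

/* True if every CPU in a is also in b. */
static bool mask_subset(cpu_bits_t a, cpu_bits_t b)
{
	return (a & ~b) == 0;
}

/*
 * Model of step 2: try to turn a cpuset whose cpuset.cpus is "cpus" into
 * a remote isolated partition. ancestors[] holds the cpuset.cpus of each
 * ancestor below the top cpuset. On success the CPUs are removed from
 * *free_reserve, so no other partition can acquire them.
 */
static bool remote_partition_acquire(cpu_bits_t cpus, cpu_bits_t *free_reserve,
				     const cpu_bits_t *ancestors, size_t n)
{
	size_t i;

	/* An empty cpuset.cpus can never form a partition. */
	if (cpus == 0)
		return false;

	/* Conditions 1 and 3: reserved and not yet taken by anyone else. */
	if (!mask_subset(cpus, *free_reserve))
		return false;

	/* Condition 2: granted by every ancestor in the hierarchy. */
	for (i = 0; i < n; i++)
		if (!mask_subset(cpus, ancestors[i]))
			return false;

	*free_reserve &= ~cpus;
	return true;
}
```

For example, with CPUs 2-3 free in the reserve and ancestors that both contain CPUs 2-3, acquiring CPUs 2-3 succeeds once and leaves nothing free, so a second request for CPU 2 is refused.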
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 306 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 291 insertions(+), 15 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 69abe95a9969..280018cddaba 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -98,6 +98,7 @@ enum prs_errcode {
PERR_NOCPUS,
PERR_HOTPLUG,
PERR_CPUSEMPTY,
+ PERR_RMTPARENT,
};
static const char * const perr_strings[] = {
@@ -108,6 +109,7 @@ static const char * const perr_strings[] = {
[PERR_NOCPUS] = "Parent unable to distribute cpu downstream",
[PERR_HOTPLUG] = "No cpu available due to hotplug",
[PERR_CPUSEMPTY] = "cpuset.cpus is empty",
+ [PERR_RMTPARENT] = "New partition not allowed under remote partition",
};
struct cpuset {
@@ -206,6 +208,9 @@ struct cpuset {
/* Handle for cpuset.cpus.partition */
struct cgroup_file partition_file;
+
+ /* Remote partition sibling list anchored at remote_children */
+ struct list_head remote_sibling;
};
/*
@@ -236,6 +241,9 @@ static cpumask_var_t cs_reserve_cpus; /* Reserved CPUs */
static cpumask_var_t cs_free_reserve_cpus; /* Unallocated reserved CPUs */
static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
+/* List of remote partition root children */
+static struct list_head remote_children;
+
/*
* Partition root states:
*
@@ -385,6 +393,8 @@ static struct cpuset top_cpuset = {
.flags = ((1 << CS_ONLINE) | (1 << CS_CPU_EXCLUSIVE) |
(1 << CS_MEM_EXCLUSIVE)),
.partition_root_state = PRS_ROOT,
+ .remote_sibling = LIST_HEAD_INIT(top_cpuset.remote_sibling),
+
};
/**
@@ -1385,6 +1395,209 @@ static void update_partition_sd_lb(struct cpuset *cs, int old_prs)
rebuild_sched_domains_locked();
}
+static inline bool is_remote_partition(struct cpuset *cs)
+{
+ return !list_empty(&cs->remote_sibling);
+}
+
+/*
+ * update_isolated_cpumasks_hier - Update effective cpumasks and tasks
+ * @cs: the cpuset to consider
+ * @lb: load balance flag
+ *
+ * This is called for descendant cpusets when a cpuset switches to or
+ * from an isolated remote partition. There can't be any remote partitions
+ * underneath it.
+ */
+static void update_isolated_cpumasks_hier(struct cpuset *cs, bool lb)
+{
+ struct cpuset *cp;
+ struct cgroup_subsys_state *pos_css;
+
+ rcu_read_lock();
+ cpuset_for_each_descendant_pre(cp, pos_css, cs) {
+ struct cpuset *parent = parent_cs(cp);
+
+ if (cp == cs)
+ continue; /* Skip partition root */
+
+ WARN_ON_ONCE(is_partition_valid(cp));
+ spin_lock_irq(&callback_lock);
+
+ if (cpumask_and(cp->effective_cpus, cp->cpus_allowed,
+ parent->effective_cpus)) {
+ if (cp->use_parent_ecpus) {
+ WARN_ON_ONCE(--parent->child_ecpus_count < 0);
+ cp->use_parent_ecpus = false;
+ }
+ } else {
+ cpumask_copy(cp->effective_cpus, parent->effective_cpus);
+ if (!cp->use_parent_ecpus) {
+ parent->child_ecpus_count++;
+ cp->use_parent_ecpus = true;
+ }
+ }
+ if (lb)
+ set_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
+ else
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
+
+ spin_unlock_irq(&callback_lock);
+ }
+ rcu_read_unlock();
+}
+
+/*
+ * isolated_cpus_acquire - Acquire isolated CPUs from cpuset.cpus.reserve
+ * @cs: the cpuset to update
+ * Return: 1 if successful, 0 if error
+ *
+ * Acquire isolated CPUs from cpuset.cpus.reserve and become an isolated
+ * partition root. cpuset_mutex must be held by the caller.
+ *
+ * Note that freely available reserve CPUs have already been isolated, so
+ * we don't need to rebuild sched domains. Since the cpuset is likely
+ * using effective_cpus from its parent before the conversion, we have to
+ * update parent's child_ecpus_count accordingly.
+ */
+static int isolated_cpus_acquire(struct cpuset *cs)
+{
+ struct cpuset *ancestor, *parent;
+
+ ancestor = parent = parent_cs(cs);
+
+ /*
+ * To enable acquiring of isolated CPUs from cpuset.cpus.reserve,
+ * cpus_allowed must be a subset of both its ancestor's cpus_allowed
+ * and cs_free_reserve_cpus and the user must have sysadmin privilege.
+ */
+ if (!capable(CAP_SYS_ADMIN) ||
+ !cpumask_subset(cs->cpus_allowed, cs_free_reserve_cpus))
+ return 0;
+
+ /*
+ * Check cpus_allowed of all its ancestors, except top_cpuset.
+ */
+ while (ancestor != &top_cpuset) {
+ if (!cpumask_subset(cs->cpus_allowed, ancestor->cpus_allowed))
+ return 0;
+ ancestor = parent_cs(ancestor);
+ }
+
+ spin_lock_irq(&callback_lock);
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, cs->cpus_allowed);
+ cpumask_and(cs->effective_cpus, cs->cpus_allowed, cpu_active_mask);
+
+ if (cs->use_parent_ecpus) {
+ cs->use_parent_ecpus = false;
+ parent->child_ecpus_count--;
+ }
+ list_add(&cs->remote_sibling, &remote_children);
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+ spin_unlock_irq(&callback_lock);
+
+ if (!list_empty(&cs->css.children))
+ update_isolated_cpumasks_hier(cs, false);
+
+ return 1;
+}
+
+/*
+ * isolated_cpus_release - Release isolated CPUs back to cpuset.cpus.reserve
+ * @cs: the cpuset to update
+ *
+ * Release isolated CPUs back to cpuset.cpus.reserve.
+ * cpuset_mutex must be held by the caller.
+ */
+static void isolated_cpus_release(struct cpuset *cs)
+{
+ struct cpuset *parent = parent_cs(cs);
+
+ if (!is_remote_partition(cs))
+ return;
+
+ /*
+ * This can be called when the cpu list in cs_reserve_cpus
+ * is reduced. So not all the cpus should be returned back to
+ * cs_free_reserve_cpus.
+ */
+ WARN_ON_ONCE(cs->partition_root_state != PRS_ISOLATED);
+ WARN_ON_ONCE(!cpumask_subset(cs->cpus_allowed, cs_reserve_cpus));
+ spin_lock_irq(&callback_lock);
+ if (!cpumask_and(cs->effective_cpus,
+ parent->effective_cpus, cs->cpus_allowed)) {
+ cs->use_parent_ecpus = true;
+ parent->child_ecpus_count++;
+ cpumask_copy(cs->effective_cpus, parent->effective_cpus);
+ }
+ list_del_init(&cs->remote_sibling);
+ cs->partition_root_state = PRS_INVALID_ISOLATED;
+ if (!cs->prs_err)
+ cs->prs_err = PERR_INVCPUS;
+
+ /* Add the CPUs back to cs_free_reserve_cpus */
+ cpumask_or(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, cs->cpus_allowed);
+
+ /*
+ * There is no change in the CPU load balance state that requires
+ * rebuilding sched domains. So the flags bits can be set directly.
+ */
+ set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+ clear_bit(CS_CPU_EXCLUSIVE, &cs->flags);
+ spin_unlock_irq(&callback_lock);
+
+ if (!list_empty(&cs->css.children))
+ update_isolated_cpumasks_hier(cs, true);
+}
+
+/*
+ * isolated_cpus_update - cpuset.cpus change in a remote isolated partition
+ *
+ * Return: 1 if successful, 0 if it needs to become invalid.
+ */
+static int isolated_cpus_update(struct cpuset *cs, struct cpumask *newmask,
+ struct tmpmasks *tmp)
+{
+ bool adding, deleting;
+
+ if (WARN_ON_ONCE((cs->partition_root_state != PRS_ISOLATED) ||
+ !is_remote_partition(cs)))
+ return 0;
+
+ if (cpumask_empty(newmask))
+ goto invalidate;
+
+ adding = cpumask_andnot(tmp->addmask, newmask, cs->cpus_allowed);
+ deleting = cpumask_andnot(tmp->delmask, cs->cpus_allowed, newmask);
+
+ /*
+ * Addition of isolated CPUs is only allowed if those CPUs are
+ * in cs_free_reserve_cpus and the caller has sysadmin privilege.
+ */
+ if (adding && (!capable(CAP_SYS_ADMIN) ||
+ !cpumask_subset(tmp->addmask, cs_free_reserve_cpus)))
+ goto invalidate;
+
+ spin_lock_irq(&callback_lock);
+ if (adding)
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->addmask);
+ if (deleting)
+ cpumask_or(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->delmask);
+ cpumask_copy(cs->cpus_allowed, newmask);
+ cpumask_andnot(cs->effective_cpus, newmask, cs->subparts_cpus);
+ cpumask_and(cs->effective_cpus, cs->effective_cpus, cpu_active_mask);
+ spin_unlock_irq(&callback_lock);
+ return 1;
+
+invalidate:
+ isolated_cpus_release(cs);
+ return 0;
+}
+
/**
* update_parent_subparts_cpumask - update subparts_cpus mask of parent cpuset
* @cs: The cpuset that requests change in partition root state
@@ -1457,9 +1670,12 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
if (cmd == partcmd_enable) {
/*
* Enabling partition root is not allowed if cpus_allowed
- * doesn't overlap parent's cpus_allowed.
+ * doesn't overlap parent's cpus_allowed or if it intersects
+ * cs_free_reserve_cpus since it needs to be a remote partition
+ * in this case.
*/
- if (!cpumask_intersects(cs->cpus_allowed, parent->cpus_allowed))
+ if (!cpumask_intersects(cs->cpus_allowed, parent->cpus_allowed) ||
+ cpumask_intersects(cs->cpus_allowed, cs_free_reserve_cpus))
return PERR_INVCPUS;
/*
@@ -1694,6 +1910,15 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
struct cpuset *parent = parent_cs(cp);
bool update_parent = false;
+ /*
+ * Skip remote partition that acquires isolated CPUs directly
+ * from cs_reserve_cpus.
+ */
+ if (is_remote_partition(cp)) {
+ pos_css = css_rightmost_descendant(pos_css);
+ continue;
+ }
+
compute_effective_cpumask(tmp->new_cpus, cp, parent);
/*
@@ -1804,7 +2029,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
WARN_ON(!is_in_v2_mode() &&
!cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
- update_tasks_cpumask(cp, tmp->new_cpus);
+ update_tasks_cpumask(cp, cp->effective_cpus);
/*
* On legacy hierarchy, if the effective cpumask of any non-
@@ -1946,6 +2171,14 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
return retval;
if (cs->partition_root_state) {
+ /*
+ * Call isolated_cpus_update() to handle valid remote partition
+ */
+ if (is_remote_partition(cs)) {
+ isolated_cpus_update(cs, cs_tmp_cpus, &tmp);
+ goto update_hier;
+ }
+
if (invalidate)
update_parent_subparts_cpumask(cs, partcmd_invalidate,
NULL, &tmp);
@@ -1980,10 +2213,11 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
}
spin_unlock_irq(&callback_lock);
+update_hier:
/* effective_cpus will be updated here */
update_cpumasks_hier(cs, &tmp, false);
- if (cs->partition_root_state) {
+ if (cs->partition_root_state && !is_remote_partition(cs)) {
struct cpuset *parent = parent_cs(cs);
/*
@@ -2072,7 +2306,13 @@ static int update_reserve_cpumask(struct cpuset *trialcs, const char *buf)
* Invalidate remote partitions if necessary
*/
if (deleting) {
- /* TODO */
+ struct cpuset *child, *next;
+
+ list_for_each_entry_safe(child, next, &remote_children,
+ remote_sibling) {
+ if (cpumask_intersects(child->cpus_allowed, tmp.delmask))
+ isolated_cpus_release(child);
+ }
}
/*
@@ -2539,21 +2779,32 @@ static int update_prstate(struct cpuset *cs, int new_prs)
return 0;
/*
- * For a previously invalid partition root, leave it at being
- * invalid if new_prs is not "member".
+ * For a previously invalid partition root, treat it like a "member".
*/
- if (new_prs && is_prs_invalid(old_prs)) {
- cs->partition_root_state = -new_prs;
- return 0;
- }
+ if (new_prs && is_prs_invalid(old_prs))
+ old_prs = PRS_MEMBER;
if (alloc_cpumasks(NULL, &tmpmask))
return -ENOMEM;
+ if ((old_prs == PRS_ISOLATED) && is_remote_partition(cs)) {
+ /* Pre-invalidate a remote isolated partition */
+ isolated_cpus_release(cs);
+ old_prs = PRS_MEMBER;
+ }
+
err = update_partition_exclusive(cs, new_prs);
if (err)
goto out;
+ /*
+ * New partition is not allowed under a remote partition
+ */
+ if (new_prs && is_remote_partition(parent)) {
+ err = PERR_RMTPARENT;
+ goto out;
+ }
+
if (!old_prs) {
/*
* cpus_allowed cannot be empty.
@@ -2565,6 +2816,12 @@ static int update_prstate(struct cpuset *cs, int new_prs)
err = update_parent_subparts_cpumask(cs, partcmd_enable,
NULL, &tmpmask);
+ /*
+ * If an attempt to become an adjacent isolated partition fails,
+ * try to become a remote isolated partition instead.
+ */
+ if (err && (new_prs == PRS_ISOLATED) && isolated_cpus_acquire(cs))
+ err = 0; /* Become remote isolated partition */
} else if (old_prs && new_prs) {
/*
* A change in load balance state only, no change in cpumasks.
@@ -3462,6 +3719,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
nodes_clear(cs->effective_mems);
fmeter_init(&cs->fmeter);
cs->relax_domain_level = -1;
+ INIT_LIST_HEAD(&cs->remote_sibling);
/* Set CS_MEMORY_MIGRATE for default hierarchy */
if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
@@ -3497,6 +3755,11 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
cs->effective_mems = parent->effective_mems;
cs->use_parent_ecpus = true;
parent->child_ecpus_count++;
+ /*
+ * Clear CS_SCHED_LOAD_BALANCE if parent is isolated
+ */
+ if (!is_sched_load_balance(parent))
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
}
spin_unlock_irq(&callback_lock);
@@ -3741,6 +4004,7 @@ int __init cpuset_init(void)
fmeter_init(&top_cpuset.fmeter);
set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
top_cpuset.relax_domain_level = -1;
+ INIT_LIST_HEAD(&remote_children);
BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
@@ -3873,9 +4137,20 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
}
parent = parent_cs(cs);
- compute_effective_cpumask(&new_cpus, cs, parent);
nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
+ /*
+ * In the special case of a valid remote isolated partition,
+ * we just need to mask offline cpus from cpus_allowed unless
+ * all the isolated cpus are gone.
+ */
+ if (is_remote_partition(cs)) {
+ if (!cpumask_and(&new_cpus, cs->cpus_allowed, cpu_active_mask))
+ isolated_cpus_release(cs);
+ } else {
+ compute_effective_cpumask(&new_cpus, cs, parent);
+ }
+
if (cs->nr_subparts_cpus)
/*
* Make sure that CPUs allocated to child partitions
@@ -3906,10 +4181,11 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
* the following conditions hold:
* 1) empty effective cpus but not valid empty partition.
* 2) parent is invalid or doesn't grant any cpus to child
- * partitions.
+ * partitions and not a remote partition.
*/
- if (is_partition_valid(cs) && (!parent->nr_subparts_cpus ||
- (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) {
+ if (is_partition_valid(cs) &&
+ ((!parent->nr_subparts_cpus && !is_remote_partition(cs)) ||
+ (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) {
int old_prs, parent_prs;
update_parent_subparts_cpumask(cs, partcmd_disable, NULL, tmp);
--
2.31.1
A cpuset partition is a collection of cpusets with a partition root
and its descendants from that root downward excluding any cpusets that
are part of other partitions. A partition has exclusive access to a set
of CPUs granted to it. Other cpusets outside of a partition cannot use
any CPUs in that set.
Currently, creation of partitions requires a hierarchical CPUs
distribution model where the parent of a partition root has to be
a partition root itself. Hence all the partition roots have to be
clustered around the cgroup root.
To enable the creation of a remote partition down in the hierarchy
without a parental partition root, we need a way to reserve the CPUs
that will be used in a remote partition. Introduce a new root-only
"cpuset.cpus.reserve" control file in the top cpuset for this particular
purpose.
By default, the new "cpuset.cpus.reserve" control file will track
the subparts_cpus cpumask in the top cpuset. By writing into this new
control file, however, we can reserve additional CPUs that can be used
in a remote partition. Any CPUs that are in "cpuset.cpus.reserve" will
have to be removed from the effective_cpus of all the cpusets that are
not part of valid partitions.
The prefixes "+" and "-" can be used to indicate addition to or
subtraction from the existing CPUs in "cpuset.cpus.reserve". A single
"-" character indicates the deletion of all the free reserve CPUs not
allocated to any existing partition.
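The effect of the "+" and "-" prefixes can be sketched with a small userspace model. This is an assumption-laden illustration rather than the kernel implementation: CPUs are bits in a `uint64_t` (bit n is CPU n), `struct reserve_state` and the three helpers are hypothetical names, and side effects such as invalidating partitions that use removed CPUs are not modeled.

```c
#include <stdint.h>

/* Hypothetical model of the reserve bookkeeping, one bit per CPU. */
struct reserve_state {
	uint64_t reserve;	/* all reserved CPUs */
	uint64_t free;		/* reserved CPUs not yet used by any partition */
};

/* "+<list>": add CPUs without disturbing those already reserved. */
static void reserve_add(struct reserve_state *rs, uint64_t cpus)
{
	uint64_t added = cpus & ~rs->reserve;	/* only the new CPUs */

	rs->reserve |= added;
	rs->free |= added;	/* newly reserved CPUs start out free */
}

/* "-<list>": remove the listed CPUs, whether free or in use. */
static void reserve_del(struct reserve_state *rs, uint64_t cpus)
{
	rs->reserve &= ~cpus;
	rs->free &= ~cpus;
}

/* "-" alone: drop only the free CPUs, leaving in-use ones reserved. */
static void reserve_del_free(struct reserve_state *rs)
{
	rs->reserve &= ~rs->free;
	rs->free = 0;
}
```

Every helper keeps the free set a subset of the reserve set. The bare "-" form is the safe way to shrink the reserve, because it never touches CPUs that a partition currently depends on.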
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 253 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 239 insertions(+), 14 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8604c919e1e4..69abe95a9969 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -208,7 +208,33 @@ struct cpuset {
struct cgroup_file partition_file;
};
-static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
+/*
+ * Reserved CPUs for partitions.
+ *
+ * By default, CPUs used in partitions are tracked in the parent's
+ * subparts_cpus mask following a hierarchical CPUs distribution model.
+ * To enable the creation of a remote partition down in the hierarchy
+ * without a parental partition root, one can write directly to
+ * cpuset.cpus.reserve in the root cgroup to allocate more CPUs that can
+ * be used by remote partitions. Removal of existing reserved CPUs may
+ * also cause some existing partitions to become invalid.
+ *
+ * All the cpumasks below should only be used with cpuset_mutex held.
+ * Modification of cs_reserve_cpus & cs_free_reserve_cpus also requires
+ * holding the callback_lock.
+ *
+ * Relationship among cs_reserve_cpus, cs_free_reserve_cpus and
+ * top_cpuset.subparts_cpus are:
+ *
+ * top_cpuset.subparts_cpus ⊆ cs_reserve_cpus
+ * cs_free_reserve_cpus ⊆ cs_reserve_cpus
+ * top_cpuset.subparts_cpus ∩ cs_free_reserve_cpus = ∅
+ * cs_reserve_cpus - cs_free_reserve_cpus - top_cpuset.subparts_cpus
+ * = CPUs dedicated to remote partitions
+ */
+static cpumask_var_t cs_reserve_cpus; /* Reserved CPUs */
+static cpumask_var_t cs_free_reserve_cpus; /* Unallocated reserved CPUs */
+static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
/*
* Partition root states:
@@ -1202,13 +1228,13 @@ static void rebuild_sched_domains_locked(void)
* should be the same as the active CPUs, so checking only top_cpuset
* is enough to detect racing CPU offlines.
*/
- if (!top_cpuset.nr_subparts_cpus &&
+ if (cpumask_empty(cs_reserve_cpus) &&
!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
return;
/*
* With subpartition CPUs, however, the effective CPUs of a partition
- * root should be only a subset of the active CPUs. Since a CPU in any
+ * root should only be a subset of the active CPUs. Since a CPU in any
* partition root could be offlined, all must be checked.
*/
if (top_cpuset.nr_subparts_cpus) {
@@ -1275,7 +1301,7 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
*/
if ((task->flags & PF_KTHREAD) && kthread_is_per_cpu(task))
continue;
- cpumask_andnot(new_cpus, possible_mask, cs->subparts_cpus);
+ cpumask_andnot(new_cpus, possible_mask, cs_reserve_cpus);
} else {
cpumask_and(new_cpus, possible_mask, cs->effective_cpus);
}
@@ -1406,6 +1432,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
int deleting; /* Moving cpus from subparts_cpus to effective_cpus */
int old_prs, new_prs;
int part_error = PERR_NONE; /* Partition error? */
+ bool update_reserve = (parent == &top_cpuset);
lockdep_assert_held(&cpuset_mutex);
@@ -1576,7 +1603,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
}
/*
- * Change the parent's subparts_cpus.
+ * Change the parent's subparts_cpus and maybe cs_reserve_cpus.
* Newly added CPUs will be removed from effective_cpus and
* newly deleted ones will be added back to effective_cpus.
*/
@@ -1586,10 +1613,25 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
parent->subparts_cpus, tmp->addmask);
cpumask_andnot(parent->effective_cpus,
parent->effective_cpus, tmp->addmask);
+ if (update_reserve) {
+ cpumask_or(cs_reserve_cpus,
+ cs_reserve_cpus, tmp->addmask);
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->addmask);
+ }
}
if (deleting) {
cpumask_andnot(parent->subparts_cpus,
parent->subparts_cpus, tmp->delmask);
+ /*
+ * The automatic cpu reservation of adjacent partition
+ * won't add back the deleted CPUs to cs_free_reserve_cpus.
+ * Instead, they are returned back to effective_cpus of top
+ * cpuset.
+ */
+ if (update_reserve)
+ cpumask_andnot(cs_reserve_cpus,
+ cs_reserve_cpus, tmp->delmask);
/*
* Some of the CPUs in subparts_cpus might have been offlined.
*/
@@ -1783,6 +1825,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
if (need_rebuild_sched_domains)
rebuild_sched_domains_locked();
+
+ return;
}
/**
@@ -1955,6 +1999,167 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
return 0;
}
+/**
+ * update_reserve_cpumask - update cs_reserve_cpus
+ * @trialcs: trial cpuset
+ * @buf: buffer of cpu numbers written to this cpuset
+ * Return: 0 if successful, < 0 if error
+ */
+static int update_reserve_cpumask(struct cpuset *trialcs, const char *buf)
+{
+ struct cgroup_subsys_state *css;
+ struct cpuset *cs;
+ bool adding, deleting;
+ struct tmpmasks tmp;
+
+ adding = deleting = false;
+ if (*buf == '+') {
+ adding = true;
+ buf++;
+ } else if (*buf == '-') {
+ deleting = true;
+ buf++;
+ }
+
+ if (!*buf) {
+ if (adding)
+ return -EINVAL;
+
+ if (deleting) {
+ if (cpumask_empty(cs_free_reserve_cpus))
+ return 0;
+ cpumask_copy(trialcs->cpus_allowed, cs_free_reserve_cpus);
+ } else {
+ cpumask_clear(trialcs->cpus_allowed);
+ }
+ } else {
+ int retval = cpulist_parse(buf, trialcs->cpus_allowed);
+
+ if (retval < 0)
+ return retval;
+ }
+
+ if (!adding && !deleting &&
+ cpumask_equal(trialcs->cpus_allowed, cs_reserve_cpus))
+ return 0;
+
+ /* Preserve trialcs->cpus_allowed for now */
+ init_tmpmasks(&tmp, NULL, trialcs->subparts_cpus,
+ trialcs->effective_cpus);
+
+ /*
+ * Compute the addition and removal of CPUs to/from cs_reserve_cpus
+ */
+ if (!adding && !deleting) {
+ adding = cpumask_andnot(tmp.addmask, trialcs->cpus_allowed,
+ cs_reserve_cpus);
+ deleting = cpumask_andnot(tmp.delmask, cs_reserve_cpus,
+ trialcs->cpus_allowed);
+ } else if (adding) {
+ adding = cpumask_andnot(tmp.addmask,
+ trialcs->cpus_allowed, cs_reserve_cpus);
+ cpumask_or(trialcs->cpus_allowed, cs_reserve_cpus, tmp.addmask);
+ } else { /* deleting */
+ deleting = cpumask_and(tmp.delmask,
+ trialcs->cpus_allowed, cs_reserve_cpus);
+ cpumask_andnot(trialcs->cpus_allowed, cs_reserve_cpus, tmp.delmask);
+ }
+
+ if (!adding && !deleting)
+ return 0;
+
+ /*
+ * Invalidate remote partitions if necessary
+ */
+ if (deleting) {
+ /* TODO */
+ }
+
+ /*
+ * Cannot use up all the CPUs in top_cpuset.effective_cpus
+ */
+ if (!deleting && adding &&
+ cpumask_subset(top_cpuset.effective_cpus, tmp.addmask))
+ return -EINVAL;
+
+ spin_lock_irq(&callback_lock);
+ /*
+ * Update top_cpuset.effective_cpus, cs_reserve_cpus &
+ * cs_free_reserve_cpus.
+ */
+ if (adding)
+ cpumask_or(cs_free_reserve_cpus, cs_free_reserve_cpus,
+ tmp.addmask);
+ cpumask_copy(cs_reserve_cpus, trialcs->cpus_allowed);
+ cpumask_andnot(top_cpuset.effective_cpus,
+ cpu_active_mask, cs_reserve_cpus);
+
+ /*
+ * Remove CPUs from cs_free_reserve_cpus first. Anything left
+ * means some partitions have to be made invalid.
+ */
+ if (deleting && cpumask_and(cs_tmp_cpus, cs_free_reserve_cpus,
+ tmp.delmask)) {
+ cpumask_andnot(cs_free_reserve_cpus, cs_free_reserve_cpus,
+ cs_tmp_cpus);
+ deleting = cpumask_andnot(tmp.delmask, tmp.delmask,
+ cs_tmp_cpus);
+ }
+ spin_unlock_irq(&callback_lock);
+
+ /*
+ * Invalidate some adjacent partitions under top cpuset, if necessary
+ */
+ if (deleting && cpumask_and(cs_tmp_cpus, tmp.delmask,
+ top_cpuset.subparts_cpus)) {
+ struct cgroup_subsys_state *css;
+ struct cpuset *cp;
+
+ /*
+ * Temporarily save the remaining CPUs to be deleted in
+ * trialcs->cpus_allowed to be restored back to tmp.delmask
+ * later.
+ */
+ deleting = cpumask_andnot(trialcs->cpus_allowed, tmp.delmask,
+ cs_tmp_cpus);
+ rcu_read_lock();
+ cpuset_for_each_child(cp, css, &top_cpuset)
+ if (is_partition_valid(cp) &&
+ cpumask_intersects(cs_tmp_cpus, cp->cpus_allowed)) {
+ rcu_read_unlock();
+ update_parent_subparts_cpumask(cp, partcmd_invalidate, NULL, &tmp);
+ rcu_read_lock();
+ }
+ rcu_read_unlock();
+ if (deleting)
+ cpumask_copy(tmp.delmask, trialcs->cpus_allowed);
+ }
+
+ /* Can now use all of trialcs */
+ init_tmpmasks(&tmp, trialcs->cpus_allowed, trialcs->subparts_cpus,
+ trialcs->effective_cpus);
+
+ /*
+ * Update effective_cpus of all descendants that are not in
+ * partitions and rebuild sched domains.
+ */
+ rcu_read_lock();
+ cpuset_for_each_child(cs, css, &top_cpuset) {
+ compute_effective_cpumask(tmp.new_cpus, cs, &top_cpuset);
+ if (cpumask_equal(tmp.new_cpus, cs->effective_cpus))
+ continue;
+ if (!css_tryget_online(&cs->css))
+ continue;
+ rcu_read_unlock();
+ update_cpumasks_hier(cs, &tmp, false);
+ rcu_read_lock();
+ css_put(&cs->css);
+ }
+ rcu_read_unlock();
+ rebuild_sched_domains_locked();
+ return 0;
+}
+
/*
* Migrate memory region from one set of nodes to another. This is
* performed asynchronously as it can be called from process migration path
@@ -2743,6 +2948,7 @@ typedef enum {
FILE_EFFECTIVE_CPULIST,
FILE_EFFECTIVE_MEMLIST,
FILE_SUBPARTS_CPULIST,
+ FILE_RESERVE_CPULIST,
FILE_CPU_EXCLUSIVE,
FILE_MEM_EXCLUSIVE,
FILE_MEM_HARDWALL,
@@ -2880,6 +3086,9 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
case FILE_CPULIST:
retval = update_cpumask(cs, trialcs, buf);
break;
+ case FILE_RESERVE_CPULIST:
+ retval = update_reserve_cpumask(trialcs, buf);
+ break;
case FILE_MEMLIST:
retval = update_nodemask(cs, trialcs, buf);
break;
@@ -2927,6 +3136,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
case FILE_EFFECTIVE_MEMLIST:
seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
break;
+ case FILE_RESERVE_CPULIST:
+ seq_printf(sf, "%*pbl\n", cpumask_pr_args(cs_reserve_cpus));
+ break;
case FILE_SUBPARTS_CPULIST:
seq_printf(sf, "%*pbl\n", cpumask_pr_args(cs->subparts_cpus));
break;
@@ -3200,6 +3412,14 @@ static struct cftype dfl_files[] = {
.file_offset = offsetof(struct cpuset, partition_file),
},
+ {
+ .name = "cpus.reserve",
+ .seq_show = cpuset_common_seq_show,
+ .write = cpuset_write_resmask,
+ .private = FILE_RESERVE_CPULIST,
+ .flags = CFTYPE_ONLY_ON_ROOT,
+ },
+
{
.name = "cpus.subpartitions",
.seq_show = cpuset_common_seq_show,
@@ -3510,6 +3730,8 @@ int __init cpuset_init(void)
BUG_ON(!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL));
BUG_ON(!zalloc_cpumask_var(&top_cpuset.subparts_cpus, GFP_KERNEL));
BUG_ON(!zalloc_cpumask_var(&cs_tmp_cpus, GFP_KERNEL));
+ BUG_ON(!zalloc_cpumask_var(&cs_reserve_cpus, GFP_KERNEL));
+ BUG_ON(!zalloc_cpumask_var(&cs_free_reserve_cpus, GFP_KERNEL));
cpumask_setall(top_cpuset.cpus_allowed);
nodes_setall(top_cpuset.mems_allowed);
@@ -3788,10 +4010,10 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
/*
- * In the rare case that hotplug removes all the cpus in subparts_cpus,
+ * In the rare case that hotplug removes all the reserve cpus,
* we assumed that cpus are updated.
*/
- if (!cpus_updated && top_cpuset.nr_subparts_cpus)
+ if (!cpus_updated && !cpumask_empty(cs_reserve_cpus))
cpus_updated = true;
/* synchronize cpus_allowed to cpu_active_mask */
@@ -3801,18 +4023,21 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
/*
* Make sure that CPUs allocated to child partitions
- * do not show up in effective_cpus. If no CPU is left,
- * we clear the subparts_cpus & let the child partitions
- * fight for the CPUs again.
+ * do not show up in top_cpuset's effective_cpus. In the
+ * unlikely event that no effective CPU is left in top_cpuset,
+ * we clear all the reserve cpus and let the non-remote child
+ * partitions fight for the CPUs again.
*/
- if (top_cpuset.nr_subparts_cpus) {
- if (cpumask_subset(&new_cpus,
- top_cpuset.subparts_cpus)) {
+ if (!cpumask_empty(cs_reserve_cpus)) {
+
+ if (cpumask_subset(&new_cpus, cs_reserve_cpus)) {
top_cpuset.nr_subparts_cpus = 0;
cpumask_clear(top_cpuset.subparts_cpus);
+ cpumask_clear(cs_free_reserve_cpus);
+ cpumask_clear(cs_reserve_cpus);
} else {
cpumask_andnot(&new_cpus, &new_cpus,
- top_cpuset.subparts_cpus);
+ cs_reserve_cpus);
}
}
cpumask_copy(top_cpuset.effective_cpus, &new_cpus);
--
2.31.1
v2:
- [v1] https://lore.kernel.org/lkml/20230412153758.3088111-1-longman@redhat.com/
- Dropped the special "isolcpus" partition in v1
- Add the root only "cpuset.cpus.reserve" control file for reserving
CPUs used for remote isolated partitions.
- Update the test_cpuset_prs.sh test script and documentation
accordingly.
This patch series introduces a new category of cpuset partition called
remote partitions. The existing partition category where the partition
roots have to be clustered around the root cgroup in a hierarchical way
is now referred to as adjacent partitions.
A remote partition can be formed far from the root cgroup with no
partition root parent. The only commonality is that the CPUs that are
used in the partition as specified in "cpuset.cpus" have to be present
in the "cpuset.cpus" of all its ancestors.
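The ancestor requirement above can be illustrated with a small userspace
sketch. It is not taken from the patches: struct cs_node and
cpus_present_in_all_ancestors() are hypothetical names, and a 64-bit
word stands in for a cpumask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpuset: one bit per CPU, up to 64 CPUs. */
struct cs_node {
	uint64_t cpus;           /* models the "cpuset.cpus" of this cgroup */
	struct cs_node *parent;  /* NULL for the root cgroup */
};

/* Return true if every CPU in cs->cpus is also in each ancestor's cpus. */
static bool cpus_present_in_all_ancestors(const struct cs_node *cs)
{
	const struct cs_node *p;

	for (p = cs->parent; p; p = p->parent)
		if (cs->cpus & ~p->cpus)   /* some CPU is missing in an ancestor */
			return false;
	return true;
}
```

With a root of CPUs 0-7 and a middle cgroup of CPUs 2-5, a leaf using
CPUs 2-3 passes the check, while a leaf using CPUs 0 and 6 does not.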
It is relatively rare to have applications that require creation of
a separate scheduling domain (root). However, it is more common to
have applications that require the use of isolated CPUs (isolated),
e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command options
to get that statically. Of course, the "isolated" partition is another
way to achieve that dynamically.
Modern container orchestration tools like Kubernetes use the cgroup
hierarchy to manage different containers, and they rely on other
middleware like systemd to help manage it. If a container needs to
use isolated CPUs, it is hard to get those with adjacent partitions,
as that requires the administrative parent cgroup to be a partition
root too, which tools like systemd may not be ready to manage.
With this patch series, a new root cgroup only "cpuset.cpus.reserve"
file is added to specify the set of CPUs that can be used in partitions
(whether remote or adjacent). To create a remote partition, the set
of CPUs to be used in that partition (the "cpuset.cpus" file of the
partition root) has to be reserved by manually adding them to that
control file first. Then that partition can be activated by writing
"isolated" into its "cpuset.cpus.partition". CPU reservation of adjacent
partitions is done automatically without touching "cpuset.cpus.reserve"
at all.
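The reservation bookkeeping can be modelled in userspace as a rough
sketch. The three fields mirror cs_reserve_cpus, cs_free_reserve_cpus
and top_cpuset.subparts_cpus from the cpuset.c patch, but the struct and
function names are hypothetical, 64-bit words stand in for cpumasks, and
remote_partition_enable() only assumes that a remote partition claims
its CPUs from the free reserved set (that patch is not shown here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three masks, one bit per CPU. */
struct reserve_state {
	uint64_t reserve;       /* models cs_reserve_cpus */
	uint64_t free_reserve;  /* models cs_free_reserve_cpus */
	uint64_t subparts;      /* models top_cpuset.subparts_cpus */
};

/* Check the three set relationships listed in the patch comment. */
static bool reserve_state_valid(const struct reserve_state *s)
{
	return !(s->subparts & ~s->reserve) &&      /* subparts is a subset of reserve */
	       !(s->free_reserve & ~s->reserve) &&  /* free is a subset of reserve */
	       !(s->subparts & s->free_reserve);    /* subparts and free are disjoint */
}

/* CPUs dedicated to remote partitions: reserve - free - subparts. */
static uint64_t remote_partition_cpus(const struct reserve_state *s)
{
	return s->reserve & ~s->free_reserve & ~s->subparts;
}

/* Manually reserve CPUs, as a "+<cpulist>" write to the control file would. */
static void reserve_add(struct reserve_state *s, uint64_t cpus)
{
	uint64_t added = cpus & ~s->reserve;  /* only CPUs not yet reserved */

	s->reserve |= added;
	s->free_reserve |= added;
}

/* Assumption: activation succeeds only if all CPUs are free reserved CPUs. */
static bool remote_partition_enable(struct reserve_state *s, uint64_t cpus)
{
	if (!cpus || (cpus & ~s->free_reserve))
		return false;
	s->free_reserve &= ~cpus;
	return true;
}
```

In this model, reserving CPUs 4-5 and then enabling a partition on CPU 4
leaves CPU 5 free and makes remote_partition_cpus() report CPU 4.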
Currently only remote isolated partitions are supported. We could
support a scheduling partition ("root") in the future if the need arises.
Additional isolation attributes like those with the "isolcpus" or "nohz"
boot command line options may be supported in the isolated partitions
in the future.
Waiman Long (6):
cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE
handling
cgroup/cpuset: Improve temporary cpumasks handling
cgroup/cpuset: Add cpuset.cpus.reserve for top cpuset
cgroup/cpuset: Introduce remote isolated partition
cgroup/cpuset: Documentation update for partition
cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition
Documentation/admin-guide/cgroup-v2.rst | 92 ++-
kernel/cgroup/cpuset.c | 749 +++++++++++++++---
.../selftests/cgroup/test_cpuset_prs.sh | 403 ++++++----
3 files changed, 988 insertions(+), 256 deletions(-)
--
2.31.1
From: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ Upstream commit 976d3c6778e99390c6d854d140b746d12ea18a51 ]
According to Mirsad the gpio-sim.sh test appears to FAIL in a wrong way
due to missing initialisation of shell variables:
4.2. Bias settings work correctly
cat: /sys/devices/platform/gpio-sim.0/gpiochip18/sim_gpio0/value: No such file or directory
./gpio-sim.sh: line 393: test: =: unary operator expected
bias setting does not work
GPIO gpio-sim test FAIL
After this change the test passed:
4.2. Bias settings work correctly
GPIO gpio-sim test PASS
His testing environment is AlmaLinux 8.7 on Lenovo desktop box with
the latest Linux kernel based on v6.2:
Linux 6.2.0-mglru-kmlk-andy-09238-gd2980d8d8265 x86_64
Suggested-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/gpio/gpio-sim.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/gpio/gpio-sim.sh b/tools/testing/selftests/gpio/gpio-sim.sh
index 341e3de008968..bf67b23ed29ac 100755
--- a/tools/testing/selftests/gpio/gpio-sim.sh
+++ b/tools/testing/selftests/gpio/gpio-sim.sh
@@ -389,6 +389,9 @@ create_chip chip
create_bank chip bank
set_num_lines chip bank 8
enable_chip chip
+DEVNAME=`configfs_dev_name chip`
+CHIPNAME=`configfs_chip_name chip bank`
+SYSFS_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio0/value"
$BASE_DIR/gpio-mockup-cdev -b pull-up /dev/`configfs_chip_name chip bank` 0
test `cat $SYSFS_PATH` = "1" || fail "bias setting does not work"
remove_chip chip
--
2.39.2
From: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ Upstream commit 976d3c6778e99390c6d854d140b746d12ea18a51 ]
According to Mirsad the gpio-sim.sh test appears to FAIL in a wrong way
due to missing initialisation of shell variables:
4.2. Bias settings work correctly
cat: /sys/devices/platform/gpio-sim.0/gpiochip18/sim_gpio0/value: No such file or directory
./gpio-sim.sh: line 393: test: =: unary operator expected
bias setting does not work
GPIO gpio-sim test FAIL
After this change the test passed:
4.2. Bias settings work correctly
GPIO gpio-sim test PASS
His testing environment is AlmaLinux 8.7 on Lenovo desktop box with
the latest Linux kernel based on v6.2:
Linux 6.2.0-mglru-kmlk-andy-09238-gd2980d8d8265 x86_64
Suggested-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/gpio/gpio-sim.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/gpio/gpio-sim.sh b/tools/testing/selftests/gpio/gpio-sim.sh
index 9f539d454ee4d..fa2ce2b9dd5fc 100755
--- a/tools/testing/selftests/gpio/gpio-sim.sh
+++ b/tools/testing/selftests/gpio/gpio-sim.sh
@@ -389,6 +389,9 @@ create_chip chip
create_bank chip bank
set_num_lines chip bank 8
enable_chip chip
+DEVNAME=`configfs_dev_name chip`
+CHIPNAME=`configfs_chip_name chip bank`
+SYSFS_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio0/value"
$BASE_DIR/gpio-mockup-cdev -b pull-up /dev/`configfs_chip_name chip bank` 0
test `cat $SYSFS_PATH` = "1" || fail "bias setting does not work"
remove_chip chip
--
2.39.2
The kunit_add_action() and related functions named the kunit_action_t
parameter 'func' in early drafts, which was later renamed to 'action'.
However, the doc comments were not properly updated.
Fix these to avoid confusion and 'make htmldocs' warnings.
Fixes: b9dce8a1ed3e ("kunit: Add kunit_add_action() to defer a call until test exit")
Reported-by: Stephen Rothwell <sfr(a)canb.auug.org.au>
Closes: https://lore.kernel.org/lkml/20230530151840.16a56460@canb.auug.org.au/
Signed-off-by: David Gow <davidgow(a)google.com>
---
include/kunit/resource.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index b64eb783b1bc..c7383e90f5c9 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -393,7 +393,7 @@ typedef void (kunit_action_t)(void *);
/**
* kunit_add_action() - Call a function when the test ends.
* @test: Test case to associate the action with.
- * @func: The function to run on test exit
+ * @action: The function to run on test exit
* @ctx: Data passed into @func
*
* Defer the execution of a function until the test exits, either normally or
@@ -415,7 +415,7 @@ int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
/**
* kunit_add_action_or_reset() - Call a function when the test ends.
* @test: Test case to associate the action with.
- * @func: The function to run on test exit
+ * @action: The function to run on test exit
* @ctx: Data passed into @func
*
* Defer the execution of a function until the test exits, either normally or
@@ -441,7 +441,7 @@ int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
/**
* kunit_remove_action() - Cancel a matching deferred action.
* @test: Test case the action is associated with.
- * @func: The deferred function to cancel.
+ * @action: The deferred function to cancel.
* @ctx: The context passed to the deferred function to trigger.
*
* Prevent an action deferred via kunit_add_action() from executing when the
@@ -459,7 +459,7 @@ void kunit_remove_action(struct kunit *test,
/**
* kunit_release_action() - Run a matching action call immediately.
* @test: Test case the action is associated with.
- * @func: The deferred function to trigger.
+ * @action: The deferred function to trigger.
* @ctx: The context passed to the deferred function to trigger.
*
* Execute a function deferred via kunit_add_action()) immediately, rather than
--
2.41.0.rc0.172.g3f132b7071-goog
The sample code has a Kconfig entry with a tristate option, so it can be
built as a loadable kernel module. In that case modpost reports an error
if MODULE_LICENSE is missing, so add MODULE_LICENSE to the sample code to
make it friendlier to developers.
Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
---
Documentation/dev-tools/kunit/start.rst | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index c736613c9b19..d4f99ef94f71 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -250,6 +250,8 @@ Now we are ready to write the test cases.
};
kunit_test_suite(misc_example_test_suite);
+ MODULE_LICENSE("GPL");
+
2. Add the following lines to ``drivers/misc/Kconfig``:
.. code-block:: kconfig
--
2.39.2
User processes register events by name_args. If an event is registered
with the same name as an existing event but with different args, the trace
output of the second event is printed in the format of the first event.
This is incorrect.
Return EADDRINUSE to the user process if an event with the same name but
different args has already been registered.
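The intended behaviour can be sketched in userspace as follows. This is
only an illustration under assumptions: register_event() and the
events[] table are hypothetical, and the args are compared as raw
strings, whereas the kernel patch compares the parsed fields with
user_fields_match().

```c
#include <errno.h>
#include <string.h>

#define MAX_EVENTS 16

/* Hypothetical registry entry: an event name plus its argument string. */
struct event_entry {
	const char *name;  /* caller keeps the string alive */
	const char *args;  /* caller keeps the string alive */
};

static struct event_entry events[MAX_EVENTS];
static int nr_events;

/*
 * Register name with args. Returns the event index on success,
 * -EADDRINUSE if name already exists with different args, and
 * -ENOSPC if the table is full.
 */
static int register_event(const char *name, const char *args)
{
	int i;

	for (i = 0; i < nr_events; i++) {
		if (strcmp(events[i].name, name))
			continue;
		/* Same name: only reuse the event when the args match too. */
		if (strcmp(events[i].args, args))
			return -EADDRINUSE;
		return i;
	}
	if (nr_events == MAX_EVENTS)
		return -ENOSPC;
	events[nr_events].name = name;
	events[nr_events].args = args;
	return nr_events++;
}
```

Registering the same name and args twice returns the same index, while a
second registration with different args is rejected instead of silently
reusing the first event's format.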
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 36 +++++++++++++++----
.../selftests/user_events/ftrace_test.c | 6 ++++
2 files changed, 36 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index b1ecd7677642..e90161294698 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1753,6 +1753,8 @@ static int user_event_parse(struct user_event_group *group, char *name,
int ret;
u32 key;
struct user_event *user;
+ int argc = 0;
+ char **argv;
/* Prevent dyn_event from racing */
mutex_lock(&event_mutex);
@@ -1760,13 +1762,35 @@ static int user_event_parse(struct user_event_group *group, char *name,
mutex_unlock(&event_mutex);
if (user) {
- *newuser = user;
- /*
- * Name is allocated by caller, free it since it already exists.
- * Caller only worries about failure cases for freeing.
- */
- kfree(name);
+ if (args) {
+ argv = argv_split(GFP_KERNEL, args, &argc);
+ if (!argv) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ ret = user_fields_match(user, argc, (const char **)argv);
+ argv_free(argv);
+
+ } else
+ ret = list_empty(&user->fields);
+
+ if (ret) {
+ *newuser = user;
+ /*
+ * Name is allocated by caller, free it since it already exists.
+ * Caller only worries about failure cases for freeing.
+ */
+ kfree(name);
+ } else {
+ ret = -EADDRINUSE;
+ goto error;
+ }
+
return 0;
+error:
+ refcount_dec(&user->refcnt);
+ return ret;
}
user = kzalloc(sizeof(*user), GFP_KERNEL_ACCOUNT);
diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index 7c99cef94a65..6e8c4b47281c 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -228,6 +228,12 @@ TEST_F(user, register_events) {
ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
ASSERT_EQ(0, reg.write_index);
+ /* Multiple registers to same name but different args should fail */
+ reg.enable_bit = 29;
+ reg.name_args = (__u64)"__test_event u32 field1;";
+ ASSERT_EQ(-1, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
+ ASSERT_EQ(EADDRINUSE, errno);
+
/* Ensure disabled */
self->enable_fd = open(enable_file, O_RDWR);
ASSERT_NE(-1, self->enable_fd);
--
2.25.1
After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the latest version of the selftests is supposed to run on old
kernels without issues.
Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace. In addition, the current selftests
run a lot of different sub-tests but the TAP13 protocol used in the
selftests doesn't support sub-tests: in other words, one failure in
sub-tests implies that the whole selftest is seen as failed at the end
because sub-tests are not tracked. It is then important to skip
sub-tests not supported by old kernels.
To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftests or just some sub-tests before starting them.
This first part focuses on marking the different selftests as skipped
if MPTCP is not even supported. That's what is done in patches 2 to 8.
Patch 2/8 introduces a new file (mptcp_lib.sh) to be able to re-use some
helpers in the different selftests. The first MPTCP selftest has been
introduced in v5.6.
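The "skip if MPTCP is not supported" idea could look like the following
in C. This is only a sketch under assumptions: the series implements the
check in shell (mptcp_lib.sh, not shown here), the function names are
hypothetical, and probing with socket() is just one possible external
sign. The value 4 is the exit code kselftest treats as "skipped".

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262  /* value from the kernel UAPI, for older libc headers */
#endif

#define KSFT_PASS 0
#define KSFT_SKIP 4  /* kselftest exit code meaning "test skipped" */

/* Return 1 if an MPTCP socket can be created, 0 otherwise. */
static int mptcp_supported(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		return 0;  /* any failure is treated as "not supported" here */
	close(fd);
	return 1;
}

/* Exit code a test could return before running any sub-test. */
static int mptcp_precheck(void)
{
	return mptcp_supported() ? KSFT_PASS : KSFT_SKIP;
}
```

A test program would call mptcp_precheck() first and exit with its
return value when it is not KSFT_PASS.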
Patch 1/8 is a bit different but still linked: it modifies mptcp_join.sh
selftest not to use 'cmp --bytes' which is not supported by the BusyBox
implementation. It is apparently quite common to use BusyBox in CI
environments. This tool is needed for a subtest introduced in v6.1.
Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrb… [1]
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (8):
selftests: mptcp: join: avoid using 'cmp --bytes'
selftests: mptcp: connect: skip if MPTCP is not supported
selftests: mptcp: pm nl: skip if MPTCP is not supported
selftests: mptcp: join: skip if MPTCP is not supported
selftests: mptcp: diag: skip if MPTCP is not supported
selftests: mptcp: simult flows: skip if MPTCP is not supported
selftests: mptcp: sockopt: skip if MPTCP is not supported
selftests: mptcp: userspace pm: skip if MPTCP is not supported
tools/testing/selftests/net/mptcp/Makefile | 2 +-
tools/testing/selftests/net/mptcp/diag.sh | 4 +++
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 4 +++
tools/testing/selftests/net/mptcp/mptcp_join.sh | 17 +++++++--
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 40 ++++++++++++++++++++++
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 4 +++
tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 +++
tools/testing/selftests/net/mptcp/simult_flows.sh | 4 +++
tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 +++
9 files changed, 80 insertions(+), 3 deletions(-)
---
base-commit: 9b9e46aa07273ceb96866b2e812b46f1ee0b8d2f
change-id: 20230528-upstream-net-20230528-mptcp-selftests-support-old-kernels-part-1-305638f4dbc0
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Hi, Willy
Thanks very much for your kind review, discussion and suggestions; now we
get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about
_start and __NR_llseek for rv32, but left compile errors about tons of
time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
latest system call ABI")) and the missing fstat in nolibc-test.c [2],
now we have fixed up all of them.
Introduction
============
This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it
includes 3 parts, they work together to add full rv32 support:
* Revert two old out-of-date patches
* Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
* Revert "selftests/nolibc: Fix up compile error for rv32"
(these two and the reverted ones:
* commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32")
* commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")
can be removed from the git repo completely, there are two new ones to replace
them)
* Compile and test support patches
* selftests/nolibc: print name instead of number for EOVERFLOW
* selftests/nolibc: syscall_args: use __NR_statx for rv32
* --> replace the old one 606343b7478, use statx instead of read
* selftests/nolibc: riscv: customize makefile for rv32
* selftests/nolibc: allow specify a bios for qemu
* selftests/nolibc: remove the duplicated gettimeofday_bad2
* Fix up some missing syscalls, mainly time32 syscalls
* tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
* --> replace the old one d2c3acba6d66, cleaned up
* tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
* tools/nolibc: ppoll/ppoll_time64: Add a missing argument
* tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
* tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
* tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
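To illustrate the general pattern behind these syscall patches, here is
a rough userspace sketch of selecting a replacement syscall at compile
time. It is not the nolibc code from the series: sketch_lseek() is a
hypothetical name, and it uses the libc syscall() wrapper and the SYS_*
macros from <sys/syscall.h> rather than nolibc's own macros.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek on fd, using whichever syscall the target architecture provides. */
static off_t sketch_lseek(int fd, off_t offset, int whence)
{
#if defined(SYS_lseek)
	/* 64-bit targets (and older 32-bit ABIs) have a plain lseek. */
	return (off_t)syscall(SYS_lseek, fd, offset, whence);
#elif defined(SYS_llseek)
	/* rv32-style ABI: the offset is passed as two 32-bit halves. */
	long long result;
	unsigned long long off = (unsigned long long)offset;
	long ret = syscall(SYS_llseek, fd, (unsigned long)(off >> 32),
			   (unsigned long)(off & 0xffffffffUL), &result, whence);

	return ret ? (off_t)-1 : (off_t)result;
#else
#error "no lseek-like syscall known for this architecture"
#endif
}
```

The other entries in the list follow the same shape: when the time32
syscall number is absent, the wrapper falls back to the time64 variant
and converts the arguments.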
Compile
=======
For rv64:
$ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
For rv32:
$ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
Testing
=======
Environment:
// gcc toolchain
$ riscv64-linux-gnu-gcc --version
riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
// glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h
$ dpkg -l | grep libc6-dev | grep riscv
ii libc6-dev-riscv64-cross 2.31-0ubuntu7cross1
// glibc include/bits/wordsize.h: manually upgraded to >= 2.33
// without this, can not build tools/testing/selftests/nolibc/nolibc-test.c
$ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h
#if __riscv_xlen == (__SIZEOF_POINTER__ * 8)
# define __WORDSIZE __riscv_xlen
#else
# error unsupported ABI
#endif
# define __WORDSIZE_TIME64_COMPAT32 1
#if __WORDSIZE == 32
# define __WORDSIZE32_SIZE_ULONG 0
# define __WORDSIZE32_PTRDIFF_LONG 0
#endif
// higher qemu version is better, latest version is v8.0.0+
$ qemu-system-riscv64 --version
QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
// opensbi version, higher is better, must match kernel version and qemu version
// rv64: used version is 1.2, latest is 1.2
$ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
OpenSBI v1.2-116-g7919530
// rv32: used version is v0.9, latest is 1.2
$ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
OpenSBI v0.9-152-g754d511
For rv64:
$ pwd
/labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run
MKDIR sysroot/riscv/include
make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
CC nolibc-test
MKDIR initramfs
INSTALL initramfs/init
make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
...
LD vmlinux
NM System.map
SORTTAB vmlinux
OBJCOPY arch/riscv/boot/Image
Kernel: arch/riscv/boot/Image is ready
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
135 test(s) passed.
$ file ../../../../vmlinux
../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped
For rv32:
$ pwd
/labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run
MKDIR sysroot/riscv/include
make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
CC nolibc-test
MKDIR initramfs
INSTALL initramfs/init
make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
CALL scripts/checksyscalls.sh
GEN usr/initramfs_data.cpio
COPY usr/initramfs_inc_data
AS usr/initramfs_data.o
AR usr/built-in.a
GEN security/selinux/flask.h security/selinux/av_permissions.h
CC security/selinux/avc.o
CC security/selinux/hooks.o
CC security/selinux/selinuxfs.o
CC security/selinux/nlmsgtab.o
CC security/selinux/netif.o
CC security/selinux/netnode.o
CC security/selinux/netport.o
CC security/selinux/status.o
CC security/selinux/ss/services.o
AR security/selinux/built-in.a
AR security/built-in.a
AR built-in.a
AR vmlinux.a
LD vmlinux.o
OBJCOPY modules.builtin.modinfo
GEN modules.builtin
MODPOST vmlinux.symvers
UPD include/generated/utsversion.h
CC init/version-timestamp.o
LD .tmp_vmlinux.kallsyms1
NM .tmp_vmlinux.kallsyms1.syms
KSYMS .tmp_vmlinux.kallsyms1.S
AS .tmp_vmlinux.kallsyms1.S
LD .tmp_vmlinux.kallsyms2
NM .tmp_vmlinux.kallsyms2.syms
KSYMS .tmp_vmlinux.kallsyms2.S
AS .tmp_vmlinux.kallsyms2.S
LD vmlinux
NM System.map
SORTTAB vmlinux
OBJCOPY arch/riscv/boot/Image
Kernel: arch/riscv/boot/Image is ready
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
135 test(s) passed.
$ file ../../../../vmlinux
../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
The full rv64 testing result (run.out) is uploaded at [4].
The full rv32 testing result (run.out) is uploaded at [5].
That's all, thanks!
Best regards,
Zhangjin Wu
---
[1]: https://lore.kernel.org/linux-riscv/20230520143154.68663-1-falcon@tinylab.o…
[2]: https://lore.kernel.org/linux-riscv/20230520135235.68155-1-falcon@tinylab.o…
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[4]: https://pastebin.com/3L0nV78u
[5]: https://pastebin.com/RadrXdta
Zhangjin Wu (13):
Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
Revert "selftests/nolibc: Fix up compile error for rv32"
selftests/nolibc: print name instead of number for EOVERFLOW
selftests/nolibc: syscall_args: use __NR_statx for rv32
selftests/nolibc: riscv: customize makefile for rv32
selftests/nolibc: allow specify a bios for qemu
selftests/nolibc: remove the duplicated gettimeofday_bad2
tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
tools/nolibc: ppoll/ppoll_time64: Add a missing argument
tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for
rv32
tools/include/nolibc/std.h | 1 +
tools/include/nolibc/sys.h | 135 +++++++++++++++++--
tools/include/nolibc/types.h | 21 ++-
tools/testing/selftests/nolibc/Makefile | 14 +-
tools/testing/selftests/nolibc/nolibc-test.c | 15 ++-
5 files changed, 167 insertions(+), 19 deletions(-)
--
2.25.1
From: Jeff Xu <jeffxu(a)google.com>
This is the first set of Memory mapping (VMA) protection patches using PKU.
* * *
Background:
As discussed previously in the kernel mailing list [1], V8 CFI [2] uses
PKU to protect memory, and Stephen Röttger proposes to extend the PKU to
memory mapping [3].
We're using PKU for in-process isolation to enforce control-flow integrity
for a JIT compiler. In our threat model, an attacker exploits a
vulnerability and has arbitrary read/write access to the whole process
space concurrently to other threads being executed. This attacker can
manipulate some arguments to syscalls from some threads.
Under such a powerful attack, we want to create a “safe/isolated”
thread environment. We assign dedicated PKEYs to this thread,
and use those PKEYs to protect the thread’s runtime environment.
The thread has exclusive access to its run-time memory. This
includes modifying the protection of the memory mapping, or
unmapping it with munmap after use. The other threads
won’t be able to access the memory or modify the memory mapping
(VMA) belonging to the thread.
* * *
Proposed changes:
This patch introduces a new flag, PKEY_ENFORCE_API, to the pkey_alloc()
function. When a PKEY is created with this flag, it is enforced that any
thread that wants to make changes to the memory mapping (such as mprotect)
of the memory must have write access to the PKEY. PKEYs created without
this flag will continue to work as they do now, for backwards
compatibility.
Only PKEYs created from user space can have the new flag set; PKEYs
allocated by the kernel internally will not have it. In other words,
ARCH_DEFAULT_PKEY(0) and execute_only_pkey won’t have this flag set,
and will continue to work as they do today.
This flag is checked only at syscall entry, such as mprotect/munmap in
this set of patches. It does not apply to other call paths. In other
words, if the kernel wants to change the attributes of a VMA for some
reason, it is free to do so and is not affected by this new flag.
This set of patches covers mprotect/munmap; I plan to work on other
syscalls after this.
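The rule described above can be sketched as a small user-space model. This is only an illustration and not the kernel code from the series: struct pkey_model, its enforce field (standing in for PKEY_ENFORCE_API) and may_change_mapping() are hypothetical names. The PKRU layout used (two bits per key, access-disable followed by write-disable) is the x86 one.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PKRU layout: two bits per key, AD (access-disable) then WD (write-disable). */
#define PKRU_AD_BIT(pkey) (1u << (2 * (pkey)))
#define PKRU_WD_BIT(pkey) (1u << (2 * (pkey) + 1))

/* Hypothetical model of a key; 'enforce' stands in for PKEY_ENFORCE_API. */
struct pkey_model {
	int id;		/* 0..15 on x86 */
	bool enforce;
};

/* A thread can write through a key only if neither AD nor WD is set for it. */
static bool pkru_allows_write(uint32_t pkru, int pkey)
{
	return !(pkru & (PKRU_AD_BIT(pkey) | PKRU_WD_BIT(pkey)));
}

/*
 * Model of the syscall-entry check: keys without the flag behave as
 * before, keys with the flag require write access in the caller's PKRU.
 */
static bool may_change_mapping(const struct pkey_model *key, uint32_t pkru)
{
	if (!key->enforce)
		return true;
	return pkru_allows_write(pkru, key->id);
}
```

In this model, a thread whose PKRU has the write-disable bit set for an enforcing key would be refused by mprotect/munmap, while the same call on a key created without the flag is unaffected.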
* * *
Testing:
I have tested this patch on Linux kernels 5.15, 6.1, and 6.4-rc1.
A new selftest is added in pkey_enforce_api.c.
* * *
Discussion:
We believe that this patch provides a valuable security feature.
It allows us to create “safe/isolated” thread environments that are
protected from attackers with arbitrary read/write access to
the process space.
We believe that the interface change and the patch don't
introduce backwards compatibility risk.
We would like to discuss this patch with the Linux kernel community
for feedback and support.
* * *
Reference:
[1]https://lore.kernel.org/all/202208221331.71C50A6F@keescook/
[2]https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyX…
[3]https://docs.google.com/document/d/1qqVoVfRiF2nRylL3yjZyCQvzQaej1HRPh3f5w…
* * *
Current status:
There are ongoing discussions related to the threat model and io_uring; we will continue the discussion in the v0 thread.
* * *
PATCH history:
v1: update code per review comments:
mprotect.c:
remove syscall from do_mprotect_pkey()
remove pr_warn_ratelimited
munmap.c:
change syscall to enum caller_origin
remove pr_warn_ratelimited
v0:
https://lore.kernel.org/linux-mm/20230515130553.2311248-1-jeffxu@chromium.o…
Best Regards,
-Jeff Xu
Jeff Xu (6):
PKEY: Introduce PKEY_ENFORCE_API flag
PKEY: Add arch_check_pkey_enforce_api()
PKEY: Apply PKEY_ENFORCE_API to mprotect
PKEY:selftest pkey_enforce_api for mprotect
PKEY: Apply PKEY_ENFORCE_API to munmap
PKEY:selftest pkey_enforce_api for munmap
arch/powerpc/include/asm/pkeys.h | 19 +-
arch/x86/include/asm/mmu.h | 7 +
arch/x86/include/asm/pkeys.h | 92 +-
arch/x86/mm/pkeys.c | 2 +-
include/linux/mm.h | 8 +-
include/linux/pkeys.h | 18 +-
include/uapi/linux/mman.h | 5 +
mm/mmap.c | 31 +-
mm/mprotect.c | 17 +-
mm/mremap.c | 6 +-
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/pkey_enforce_api.c | 1312 +++++++++++++++++
12 files changed, 1499 insertions(+), 19 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_enforce_api.c
base-commit: ba0ad6ed89fd5dada3b7b65ef2b08e95d449d4ab
--
2.40.1.606.ga4b1b128d6-goog
Hi, All
Thanks very much for your review suggestions on the v1 series [1]; this
is the generic part 1 of the v2 revision.
* selftests/nolibc: syscall_args: use generic __NR_statx
A more generic statx is used instead of fstat
(Review suggestions from Willy, Arnd)
* selftests/nolibc: allow specify extra arguments for qemu
Besides BIOS, a generic QEMU_ARGS_EXTRA covers more requirements
(Review suggestions from Thomas, Willy)
* selftests/nolibc: fix up compile warning with glibc on x86_64
The definition of uint64_t differs between glibc and nolibc, so use the
right print format here
* selftests/nolibc: not include limits.h for nolibc
Removing the limits.h requirement for nolibc lets us use an older
glibc for rv32
(Review suggestions from Thomas)
* selftests/nolibc: use INT_MAX instead of __INT_MAX__
A trivial cleanup, based on the previous patch
* tools/nolibc: arm: add missing my_syscall6
Required by future forced pselect6/pselect6_time64, tested on arm/vexpress-a9
(Review suggestions from Arnd)
* tools/nolibc: open: fix up compile warning for arm
A trivial fixup based on compiler's suggestion and glibc code
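The uint64_t format warning mentioned in the list above usually comes from uint64_t being unsigned long in one libc and unsigned long long in another. A common way to print such a value without warnings in either case is an explicit cast. The sketch below shows that generic technique; format_u64() is a hypothetical helper, and this is not necessarily the change made in the patch. It assumes the printf implementation in use supports %llu.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Print a uint64_t with one format string regardless of how the libc
 * defines the type: cast to unsigned long long and use %llu.
 * Returns the number of characters written (as snprintf does).
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%llu", (unsigned long long)value);
}
```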
Best regards,
Zhangjin
----
[1]: https://lore.kernel.org/linux-riscv/20230529113143.GB2762@1wt.eu/T/#t
Zhangjin Wu (7):
selftests/nolibc: syscall_args: use __NR_statx for rv32
selftests/nolibc: allow specify extra arguments for qemu
selftests/nolibc: fix up compile warning with glibc on x86_64
selftests/nolibc: not include limits.h for nolibc
selftests/nolibc: use INT_MAX instead of __INT_MAX__
tools/nolibc: arm: add missing my_syscall6
tools/nolibc: open: fix up compile warning for arm
tools/include/nolibc/arch-arm.h | 23 ++++++++++++++++++++
tools/include/nolibc/stdint.h | 14 ++++++++++++
tools/include/nolibc/sys.h | 2 +-
tools/testing/selftests/nolibc/Makefile | 2 +-
tools/testing/selftests/nolibc/nolibc-test.c | 14 +++++++-----
5 files changed, 47 insertions(+), 8 deletions(-)
--
2.25.1
When a user event registered from dyn_events has no arguments, it passes the
matching check against any existing user event with the same name, even one
that has arguments. Add a matching check for the case where the user event
being registered has no arguments.
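The intended matching rule can be modelled in user space as follows. This is a simplified sketch: struct event_model, fields_match() and event_match() are hypothetical stand-ins, and the field comparison is a plain string compare rather than the kernel's user_fields_match().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered user event. */
struct event_model {
	const char *name;
	const char *const *fields;
	int nfields;
};

/* Simplified field comparison: same count and same strings, in order. */
static bool fields_match(const struct event_model *ev, int argc,
			 const char *const *argv)
{
	if (argc != ev->nfields)
		return false;
	for (int i = 0; i < argc; i++)
		if (strcmp(ev->fields[i], argv[i]) != 0)
			return false;
	return true;
}

static bool event_match(const struct event_model *ev, const char *name,
			int argc, const char *const *argv)
{
	bool match = strcmp(ev->name, name) == 0;

	if (match && argc > 0)
		match = fields_match(ev, argc, argv);
	else if (match && argc == 0)
		/* No arguments given: only match an event without fields. */
		match = ev->nfields == 0;

	return match;
}
```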
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index e90161294698..0d91dac206ff 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1712,6 +1712,8 @@ static bool user_event_match(const char *system, const char *event,
if (match && argc > 0)
match = user_fields_match(user, argc, argv);
+ else if (match && argc == 0)
+ match = list_empty(&user->fields);
return match;
}
--
2.25.1
Partially backport v6.3 commit 11f75a01448f ("selftests/memfd: add
tests for MFD_NOEXEC_SEAL MFD_EXEC") to fix an unknown type name
build error.
In some systems, the __u64 typedef is not present due to differences
in system headers, causing compilation errors like this one:
fuse_test.c:64:8: error: unknown type name '__u64'
64 | static __u64 mfd_assert_get_seals(int fd)
Including <linux/types.h> provides the __u64 typedef, which increases the
likelihood of successful compilation on a wider variety of systems.
Signed-off-by: Hardik Garg <hargar(a)linux.microsoft.com>
---
tools/testing/selftests/memfd/fuse_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/fuse_test.c b/tools/testing/selftests/memfd/fuse_test.c
index be675002f918..93798c8c5d54 100644
--- a/tools/testing/selftests/memfd/fuse_test.c
+++ b/tools/testing/selftests/memfd/fuse_test.c
@@ -22,6 +22,7 @@
#include <linux/falloc.h>
#include <fcntl.h>
#include <linux/memfd.h>
+#include <linux/types.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
--
2.25.1
Small optimization to avoid coredump writing during the stack protector
tests.
Adds prctl() as prerequisite.
This series is based on nolibc/20230524-nolibc-rv32+stkp4
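For readers unfamiliar with the mechanism, here is a minimal sketch of how a process can opt out of core dumps with prctl() on Linux. It uses the regular libc prctl() and the PR_SET_DUMPABLE option; disable_coredumps() is a hypothetical helper, and whether the selftest uses exactly this call is an assumption, since the patch body is not shown here.

```c
#include <sys/prctl.h>

/*
 * Mark the calling process as non-dumpable so that a fatal signal
 * (such as the one raised by a stack protector failure) does not
 * write a core file. Returns 0 on success, -1 with errno set on error.
 */
static int disable_coredumps(void)
{
	return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}
```

The setting can be read back with prctl(PR_GET_DUMPABLE, 0, 0, 0, 0), which returns 0 once dumping is disabled.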
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Fix compilation warning in prctl() testcase
- Link to v1: https://lore.kernel.org/r/20230526-nolibc-test-no-dump-v1-0-62e724a96db2@we…
---
Thomas Weißschuh (2):
tools/nolibc: add support for prctl()
selftests/nolibc: prevent coredumps during test execution
tools/include/nolibc/sys.h | 27 +++++++++++++++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 3 +++
2 files changed, 30 insertions(+)
---
base-commit: 1974a2b5fd434812b32952b09df7b79fdee8104d
change-id: 20230526-nolibc-test-no-dump-a1b1d9557df8
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	u8 val;
	int ret;

	ret = kstrtou8(buf, 10, &val);
	if (ret)
		return ret;

	mutex_lock(&test_fw_mutex);
	*(u8 *)cfg = val;
	mutex_unlock(&test_fw_mutex);

	/* Always return full write size even if we didn't consume all */
	return size;
}

static ssize_t config_num_requests_store(struct device *dev,
					 struct device_attribute *attr,
					 const char *buf, size_t count)
{
	int rc;

	mutex_lock(&test_fw_mutex);
	if (test_fw_config->reqs) {
		pr_err("Must call release_all_firmware prior to changing config\n");
		rc = -EINVAL;
		mutex_unlock(&test_fw_mutex);
		goto out;
	}
	mutex_unlock(&test_fw_mutex);

	rc = test_dev_config_update_u8(buf, count,
				       &test_fw_config->num_requests);

out:
	return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
					struct device_attribute *attr,
					const char *buf, size_t count)
{
	return test_dev_config_update_u8(buf, count,
					 &test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context: config_num_requests_store() and
config_read_fw_idx_store() can both be called asynchronously as they are
driver methods, while test_dev_config_update_u8() and its siblings change
the value pointed to by their u8 *cfg (or similar) argument.
To avoid a deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't ensure race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	int ret;

	mutex_lock(&test_fw_mutex);
	ret = __test_dev_config_update_u8(buf, size, cfg);
	mutex_unlock(&test_fw_mutex);

	return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
A similar approach was applied to all functions called from both the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
The unlocked versions __test_dev_config_update_bool(),
__test_dev_config_update_u8() and __test_dev_config_update_size_t()
were introduced to be called from the locked contexts, without
releasing the main driver's lock and thereby causing a race
condition.
The locked versions test_dev_config_update_bool(),
test_dev_config_update_u8() and test_dev_config_update_size_t()
are called from driver methods, which avoids repeating the locking
and unlocking code in each method and complicating the code with
saving the return value across the lock.
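The same locked/unlocked split can be tried out in user space with a pthread mutex. The sketch below is an analogy only: cfg_model, cfg_update_u8_locked(), cfg_update_u8() and cfg_set_num_requests() are hypothetical names, and the helpers return 0 on success instead of the write size that the driver returns.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t cfg_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical configuration; 'busy' plays the role of test_fw_config->reqs. */
struct cfg_model {
	unsigned char num_requests;
	unsigned char read_idx;
	int busy;
};
static struct cfg_model cfg;

/* Unlocked primitive: the caller must already hold cfg_mutex. */
static int cfg_update_u8_locked(const char *buf, unsigned char *field)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf || val > 255)
		return -EINVAL;
	*field = (unsigned char)val;
	return 0;
}

/* Locked wrapper for callers that do not hold cfg_mutex. */
static int cfg_update_u8(const char *buf, unsigned char *field)
{
	int ret;

	pthread_mutex_lock(&cfg_mutex);
	ret = cfg_update_u8_locked(buf, field);
	pthread_mutex_unlock(&cfg_mutex);
	return ret;
}

/* The check and the update share one critical section, so no race window. */
static int cfg_set_num_requests(const char *buf)
{
	int ret;

	pthread_mutex_lock(&cfg_mutex);
	if (cfg.busy)
		ret = -EINVAL;
	else
		ret = cfg_update_u8_locked(buf, &cfg.num_requests);
	pthread_mutex_unlock(&cfg_mutex);
	return ret;
}
```

Because cfg_set_num_requests() never drops the mutex between testing busy and writing the field, another thread cannot change the state in between.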
Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
lib/test_firmware.c | 52 ++++++++++++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..35417e0af3f4 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,16 +353,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (kstrtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -373,7 +383,8 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+ const char *buf,
size_t size,
size_t *cfg)
{
@@ -384,9 +395,7 @@ static int test_dev_config_update_size_t(const char *buf,
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(size_t *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
@@ -402,7 +411,7 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
@@ -411,14 +420,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 val)
{
return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +489,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -518,10 +536,10 @@ static ssize_t config_buf_size_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->buf_size);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->buf_size);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -548,10 +566,10 @@ static ssize_t config_file_offset_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->file_offset);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->file_offset);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.30.2
KUnit aborts the current thread when an assertion fails. Currently, this
is done conditionally as part of the kunit_do_failed_assertion()
function, but this hides the kunit_abort() call from the compiler
(particularly if it's in another module). This, in turn, can lead to
both suboptimal code generation (the compiler can't know if
kunit_do_failed_assertion() will return), and to static analysis tools
like smatch giving false positives.
Moving the kunit_abort() call into the macro should give the compiler
and tools a better chance at understanding what's going on. Doing so
requires exporting kunit_abort(), though it's recommended to continue to
use assertions in lieu of aborting directly.
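The effect of making the abort visible at the call site can be illustrated outside the kernel. In the sketch below, all names (TEST_ASSERT, test_abort, report_failure, run_case) are hypothetical, and setjmp/longjmp stands in for KUnit's try-catch mechanism. Since the macro itself calls a _Noreturn function, the compiler can see that the code following a failed assertion is unreachable.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static jmp_buf test_env;
static int failures;

/* Reporting is an ordinary function: it returns to the caller. */
static void report_failure(const char *expr)
{
	failures++;
	fprintf(stderr, "assertion failed: %s\n", expr);
}

/* Visible to the compiler as never returning. */
static _Noreturn void test_abort(void)
{
	longjmp(test_env, 1);
}

/* The abort is part of the macro, so every call site sees it. */
#define TEST_ASSERT(cond) do {			\
	if (!(cond)) {				\
		report_failure(#cond);		\
		test_abort();			\
	}					\
} while (0)

static int checked_value(const int *p)
{
	TEST_ASSERT(p != NULL);
	return *p;	/* only reached when p is non-NULL */
}

/* Runs one case; returns false if an assertion aborted it. */
static bool run_case(const int *p, int *out)
{
	if (setjmp(test_env) != 0)
		return false;
	*out = checked_value(p);
	return true;
}
```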
Suggested-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: David Gow <davidgow(a)google.com>
---
include/kunit/test.h | 4 ++++
lib/kunit/test.c | 5 +----
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 2f23d6efa505..6a35e3e2a1e5 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -481,6 +481,8 @@ void __printf(2, 3) kunit_log_append(char *log, const char *fmt, ...);
*/
#define KUNIT_SUCCEED(test) do {} while (0)
+void __noreturn kunit_abort(struct kunit *test);
+
void kunit_do_failed_assertion(struct kunit *test,
const struct kunit_loc *loc,
enum kunit_assert_type type,
@@ -498,6 +500,8 @@ void kunit_do_failed_assertion(struct kunit *test,
assert_format, \
fmt, \
##__VA_ARGS__); \
+ if (assert_type == KUNIT_ASSERTION) \
+ kunit_abort(test); \
} while (0)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index d3fb93a23ccc..3b350e50cab9 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -310,7 +310,7 @@ static void kunit_fail(struct kunit *test, const struct kunit_loc *loc,
string_stream_destroy(stream);
}
-static void __noreturn kunit_abort(struct kunit *test)
+void __noreturn kunit_abort(struct kunit *test)
{
kunit_try_catch_throw(&test->try_catch); /* Does not return. */
@@ -340,9 +340,6 @@ void kunit_do_failed_assertion(struct kunit *test,
kunit_fail(test, loc, type, assert, assert_format, &message);
va_end(args);
-
- if (type == KUNIT_ASSERTION)
- kunit_abort(test);
}
EXPORT_SYMBOL_GPL(kunit_do_failed_assertion);
--
2.41.0.rc0.172.g3f132b7071-goog
User processes register name_args for events. If an event with the same name
but different args is registered, the trace output of the second event is
printed as if it were the first event. This is incorrect.
Return EADDRINUSE to the user process if an event with the same name but
different args has already been registered.
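The behaviour seen from user space can be modelled with a tiny registry. This is a sketch under simplifying assumptions: register_event(), the fixed-size table and the -ENOSPC result for a full table are hypothetical, and arguments are compared as whole strings rather than parsed into fields.

```c
#include <errno.h>
#include <string.h>

#define MAX_EVENTS 8

/* Hypothetical registry entry; the strings are borrowed, not copied. */
struct reg_entry {
	const char *name;
	const char *args;
};

static struct reg_entry registry[MAX_EVENTS];
static int nregistered;

/*
 * Returns 0 when the event is new or already registered with the same
 * args, and -EADDRINUSE when the name exists with different args.
 */
static int register_event(const char *name, const char *args)
{
	for (int i = 0; i < nregistered; i++) {
		if (strcmp(registry[i].name, name) != 0)
			continue;
		return strcmp(registry[i].args, args) == 0 ? 0 : -EADDRINUSE;
	}

	if (nregistered == MAX_EVENTS)
		return -ENOSPC;	/* model-only limit, not from the patch */

	registry[nregistered].name = name;
	registry[nregistered].args = args;
	nregistered++;
	return 0;
}
```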
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 34 +++++++++++++++----
.../selftests/user_events/ftrace_test.c | 6 ++++
2 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index b1ecd7677642..bd455052ccd0 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1753,6 +1753,8 @@ static int user_event_parse(struct user_event_group *group, char *name,
int ret;
u32 key;
struct user_event *user;
+ int argc = 0;
+ char **argv;
/* Prevent dyn_event from racing */
mutex_lock(&event_mutex);
@@ -1760,13 +1762,31 @@ static int user_event_parse(struct user_event_group *group, char *name,
mutex_unlock(&event_mutex);
if (user) {
- *newuser = user;
- /*
- * Name is allocated by caller, free it since it already exists.
- * Caller only worries about failure cases for freeing.
- */
- kfree(name);
- return 0;
+ if (args) {
+ argv = argv_split(GFP_KERNEL, args, &argc);
+ if (!argv)
+ return -ENOMEM;
+
+ ret = user_fields_match(user, argc, (const char **)argv);
+ argv_free(argv);
+
+ } else
+ ret = list_empty(&user->fields);
+
+ if (ret) {
+ *newuser = user;
+ /*
+ * Name is allocated by caller, free it since it already exists.
+ * Caller only worries about failure cases for freeing.
+ */
+ kfree(name);
+ ret = 0;
+ } else {
+ refcount_dec(&user->refcnt);
+ ret = -EADDRINUSE;
+ }
+
+ return ret;
}
user = kzalloc(sizeof(*user), GFP_KERNEL_ACCOUNT);
diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index 7c99cef94a65..6e8c4b47281c 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -228,6 +228,12 @@ TEST_F(user, register_events) {
ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
ASSERT_EQ(0, reg.write_index);
+ /* Multiple registers to same name but different args should fail */
+ reg.enable_bit = 29;
+ reg.name_args = (__u64)"__test_event u32 field1;";
+ ASSERT_EQ(-1, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
+ ASSERT_EQ(EADDRINUSE, errno);
+
/* Ensure disabled */
self->enable_fd = open(enable_file, O_RDWR);
ASSERT_NE(-1, self->enable_fd);
--
2.25.1
There was a report that the hardware breakpoints and watchpoints weren't
reporting the debug architecture version as expected; they were reporting
a version of 0, which is not defined in the architecture. This happens
when running in a KVM guest if the host has a debug architecture version
not supported by KVM. It in turn confuses GDB, which rejects any debug
architecture version it does not know about.
Add a test that covers that situation and, while we're at it, reports the
debug architecture version and number of slots available to aid with
figuring out problems that may arise.
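The decoding that the new test applies to the dbg_info word can be shown in isolation. The bit layout (low 8 bits for the slot count, next 4 bits for the architecture version) is the one stated in the patch below; struct hw_debug_info and decode_dbg_info() are hypothetical helper names used only for this sketch.

```c
#include <stdint.h>

/* Decoded view of the dbg_info word returned for a hw debug regset. */
struct hw_debug_info {
	unsigned int slots;	/* number of breakpoint/watchpoint slots */
	unsigned int arch;	/* debug architecture version, 0 is invalid */
};

static struct hw_debug_info decode_dbg_info(uint32_t dbg_info)
{
	struct hw_debug_info info = {
		.slots = dbg_info & 0xff,
		.arch = (dbg_info >> 8) & 0xf,
	};

	return info;
}
```

A decoded arch of 0 corresponds to the failure case that the test reports.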
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.4-rc3.
- Link to v1: https://lore.kernel.org/r/20230414-arm64-test-hw-breakpoint-v1-1-14162c8e5b…
---
tools/testing/selftests/arm64/abi/ptrace.c | 32 +++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/abi/ptrace.c b/tools/testing/selftests/arm64/abi/ptrace.c
index be952511af22..abe4d58d731d 100644
--- a/tools/testing/selftests/arm64/abi/ptrace.c
+++ b/tools/testing/selftests/arm64/abi/ptrace.c
@@ -20,7 +20,7 @@
#include "../../kselftest.h"
-#define EXPECTED_TESTS 7
+#define EXPECTED_TESTS 11
#define MAX_TPIDRS 2
@@ -132,6 +132,34 @@ static void test_tpidr(pid_t child)
}
}
+static void test_hw_debug(pid_t child, int type, const char *type_name)
+{
+ struct user_hwdebug_state state;
+ struct iovec iov;
+ int slots, arch, ret;
+
+ iov.iov_len = sizeof(state);
+ iov.iov_base = &state;
+
+ /* Should be able to read the values */
+ ret = ptrace(PTRACE_GETREGSET, child, type, &iov);
+ ksft_test_result(ret == 0, "read_%s\n", type_name);
+
+ if (ret == 0) {
+ /* Low 8 bits is the number of slots, next 4 bits the arch */
+ slots = state.dbg_info & 0xff;
+ arch = (state.dbg_info >> 8) & 0xf;
+
+ ksft_print_msg("%s version %d with %d slots\n", type_name,
+ arch, slots);
+
+ /* Zero is not currently architecturally valid */
+ ksft_test_result(arch, "%s_arch_set\n", type_name);
+ } else {
+ ksft_test_result_skip("%s_arch_set\n", type_name);
+ }
+}
+
static int do_child(void)
{
if (ptrace(PTRACE_TRACEME, -1, NULL, NULL))
@@ -207,6 +235,8 @@ static int do_parent(pid_t child)
ksft_print_msg("Parent is %d, child is %d\n", getpid(), child);
test_tpidr(child);
+ test_hw_debug(child, NT_ARM_HW_WATCH, "NT_ARM_HW_WATCH");
+ test_hw_debug(child, NT_ARM_HW_BREAK, "NT_ARM_HW_BREAK");
ret = EXIT_SUCCESS;
---
base-commit: 44c026a73be8038f03dbdeef028b642880cf1511
change-id: 20230414-arm64-test-hw-breakpoint-83fe02f607fc
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Due to the lack of the SKIP directive in the output, if any of the
parameterized tests was skipped, the parser could not recognize that
correctly and marked the test as PASSED.
This can easily be seen by running the new subtest from patch 1:
$ ./tools/testing/kunit/kunit.py run \
--kunitconfig ./lib/kunit/.kunitconfig *.example_params*
[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] =================== example_params_test ===================
[ ] [PASSED] example value 2
[ ] [PASSED] example value 1
[ ] [PASSED] example value 0
[ ] =============== [PASSED] example_params_test ===============
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 3 tests: passed: 3
$ ./tools/testing/kunit/kunit.py run \
--kunitconfig ./lib/kunit/.kunitconfig *.example_params* \
--raw_output
[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
# example: initializing suite
KTAP version 1
# Subtest: example
1..1
KTAP version 1
# Subtest: example_params_test
# example_params_test: initializing
ok 1 example value 2
# example_params_test: initializing
ok 2 example value 1
# example_params_test: initializing
ok 3 example value 0
# example_params_test: pass:2 fail:0 skip:1 total:3
ok 1 example_params_test
# Totals: pass:2 fail:0 skip:1 total:3
ok 1 example
After adding the SKIP directive, the report looks as expected:
[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] =================== example_params_test ===================
[ ] [PASSED] example value 2
[ ] [PASSED] example value 1
[ ] [SKIPPED] example value 0
[ ] =============== [PASSED] example_params_test ===============
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 3 tests: passed: 2, skipped: 1
[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
# example: initializing suite
KTAP version 1
# Subtest: example
1..1
KTAP version 1
# Subtest: example_params_test
# example_params_test: initializing
ok 1 example value 2
# example_params_test: initializing
ok 2 example value 1
# example_params_test: initializing
ok 3 example value 0 # SKIP unsupported param value
# example_params_test: pass:2 fail:0 skip:1 total:3
ok 1 example_params_test
# Totals: pass:2 fail:0 skip:1 total:3
ok 1 example
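Why the directive matters to a parser can be seen from a minimal classifier for a single KTAP result line. The real parser is the Python code behind kunit.py; the C function below, classify_ktap_line(), is a hypothetical sketch of the rule only: without "# SKIP" an "ok" line can only be counted as passed.

```c
#include <string.h>

enum ktap_status { KTAP_PASS, KTAP_FAIL, KTAP_SKIP, KTAP_OTHER };

/* Classify one KTAP line; lines that are not results return KTAP_OTHER. */
static enum ktap_status classify_ktap_line(const char *line)
{
	int ok;

	/* Nested subtests are indented, so skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;

	if (strncmp(line, "ok ", 3) == 0)
		ok = 1;
	else if (strncmp(line, "not ok ", 7) == 0)
		ok = 0;
	else
		return KTAP_OTHER;

	/* The SKIP directive follows the description after a '#'. */
	if (strstr(line, "# SKIP"))
		return KTAP_SKIP;

	return ok ? KTAP_PASS : KTAP_FAIL;
}
```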
v2: better align with future support for arbitrary levels of testing
v3: rebased on kunit tree [1]
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/l…
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Michal Wajdeczko (3):
kunit/test: Add example test showing parameterized testing
kunit: Fix reporting of the skipped parameterized tests
kunit: Update kunit_print_ok_not_ok function
include/kunit/test.h | 1 +
lib/kunit/kunit-example-test.c | 34 +++++++++++++++++++++++++++
lib/kunit/test.c | 43 ++++++++++++++++++++++------------
3 files changed, 63 insertions(+), 15 deletions(-)
--
2.25.1
Hello!
Here is v3 of the mremap start address optimization / fix for exec warning.
The main changes are:
1. Care is now taken for moves purely within a VMA; in other words, this check
in call_align_down():
if (vma->vm_start <= addr_masked)
return false;
As an example of why this is needed:
Consider the following range which is 2MB aligned and is
a part of a larger 10MB range which is not shown. Each
character is 256KB below making the source and destination
2MB each. The lower case letters are moved (s to d) and the
upper case letters are not moved.
|DDDDddddSSSSssss|
If we align down 'ssss' to start from the 'SSSS', we will end up destroying
SSSS. The above if statement prevents that and I verified it.
I also added a test for this in the last patch.
2. Handle the stack case separately. We do not care about #1 for stack movement
because the 'SSSS' does not matter during this move. Further we need to do this
to prevent the stack move warning.
if (!for_stack && vma->vm_start <= addr_masked)
return false;
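As a rough illustration of the two checks above, here is a userspace C sketch of the same condition. It is not the kernel code: struct sketch_vma, sketch_can_align_down() and the fixed 2MB SKETCH_PMD_SIZE are names and values assumed here purely for illustration.

```c
#include <stdbool.h>

/* Assumption: a 2MB PMD, matching the example in the text. */
#define SKETCH_PMD_SIZE (2UL << 20)
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

/* Hypothetical stand-in for the VMA bounds consulted by the check. */
struct sketch_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Decide whether addr may be rounded down to the PMD boundary.
 * unsigned long is used for the masked address to avoid truncation.
 */
static bool sketch_can_align_down(const struct sketch_vma *vma,
				  unsigned long addr, bool for_stack)
{
	unsigned long addr_masked = addr & SKETCH_PMD_MASK;

	/*
	 * If the VMA already covers the aligned-down address, the extra
	 * range belongs to the same mapping (the 'SSSS' part) and would be
	 * clobbered. The stack move is exempt from this check.
	 */
	if (!for_stack && vma->vm_start <= addr_masked)
		return false;

	return true;
}
```

With a 10MB VMA starting at 0 and a source address of 5MB, the masked address is 4MB, so the sketch refuses to align down unless for_stack is set.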
History of patches
==================
v2->v3:
1. Masked address was stored in int, fixed it to unsigned long to avoid truncation.
2. We now handle moves happening purely within a VMA; a new test is added to cover this.
3. More code comments.
v1->v2:
1. Trigger the optimization for mremaps smaller than a PMD. I tested by tracing
that it works correctly.
2. Fix issue with bogus return value found by Linus if we broke out of the
above loop for the first PMD itself.
v1: Initial RFC.
Description of patches
======================
These patches optimize the start addresses in move_page_tables() and test the
changes. It addresses a warning [1] that occurs due to a downward, overlapping
move on a mutually-aligned offset within a PMD during exec. By initiating the
copy process at the PMD level when such alignment is present, we can prevent
this warning and speed up the copying process at the same time. Linus Torvalds
suggested this idea.
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (6):
mm/mremap: Optimize the start addresses in move_page_tables()
mm/mremap: Allow moves within the same VMA
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
selftests: mm: Add a test for remapping within a range
fs/exec.c | 2 +-
include/linux/mm.h | 2 +-
mm/mremap.c | 69 ++++++++++-
tools/testing/selftests/mm/mremap_test.c | 148 +++++++++++++++++++++--
4 files changed, 209 insertions(+), 12 deletions(-)
--
2.40.1.698.g37aff9b760-goog
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output; they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default; users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post-processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index 8ec1922e974eb..9a73a110a8bfc 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $result $CASENO $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " fail:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
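For readers unfamiliar with KTAP, a result line generally has the shape "<ok|not ok> <number> <description>", optionally followed by " # <directive>" such as SKIP. The C sketch below formats one such line. The function sketch_ktap_line() is hypothetical and is not part of ftracetest, which is a shell script; it only illustrates the field order a KTAP consumer expects.

```c
#include <stdio.h>

/*
 * Format one KTAP result line into buf.
 * passed:    non-zero for "ok", zero for "not ok"
 * num:       1-based test number
 * desc:      test description
 * directive: optional text such as "SKIP reason"; NULL or "" for none
 * Returns the snprintf() result.
 */
static int sketch_ktap_line(char *buf, size_t len, int passed,
			    unsigned int num, const char *desc,
			    const char *directive)
{
	const char *status = passed ? "ok" : "not ok";

	if (directive && directive[0] != '\0')
		return snprintf(buf, len, "%s %u %s # %s",
				status, num, desc, directive);

	return snprintf(buf, len, "%s %u %s", status, num, desc);
}
```

A skipped case is reported as "ok" with a SKIP directive, which is the same convention the ktaptest calls for UNTESTED and UNSUPPORTED follow.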
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output; they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default; users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post-processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c40890..2506621e75dfb 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $result $CASENO $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " fail:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
From: Jinrong Liang <cloudliang(a)tencent.com>
Hi,
This patch set adds some tests to ensure consistent PMU performance event
filter behavior. Specifically, the patches aim to improve KVM's PMU event
filter by strengthening the test coverage, adding documentation, and making
other small changes.
The first patch replaces int with uint32_t for nevents to ensure consistency
and readability in the code. The second patch adds fixed_counter_bitmap to
create_pmu_event_filter() to support the use of the same creator to control
the use of guest fixed counters. The third patch adds test cases for
unsupported input values in PMU filter, including unsupported "action"
values, unsupported "flags" values, and unsupported "nevents" values. Also,
it tests that setting non-existent fixed counters in the fixed bitmap doesn't
fail.
The fourth patch updates the documentation for KVM_SET_PMU_EVENT_FILTER ioctl
to include a detailed description of how fixed performance events are handled
in the pmu filter. The fifth patch adds tests to cover that pmu_event_filter
works as expected when applied to fixed performance counters, even if no
fixed counter exists. The sixth patch adds a test to ensure that setting
both generic and fixed performance event filters does not affect the consistency
of the fixed performance filter behavior in KVM. The seventh patch adds a test
to verify the behavior of the pmu event filter when an incomplete
kvm_pmu_event_filter structure is used.
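To make the role of the fixed counter bitmap concrete, here is a minimal C sketch of building a filter with both an event list and a fixed-counter bitmap. The struct below is a local mirror written for illustration and should be checked against the x86 uapi definition of struct kvm_pmu_event_filter; sketch_create_filter() is a hypothetical helper, not the selftest's create_pmu_event_filter().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of the filter layout (assumption; verify against the uapi header). */
struct sketch_pmu_event_filter {
	uint32_t action;
	uint32_t nevents;
	uint32_t fixed_counter_bitmap;
	uint32_t flags;
	uint32_t pad[4];
	uint64_t events[];	/* nevents entries follow the header */
};

/*
 * Allocate and fill a filter. Returns NULL on allocation failure.
 * The caller frees the result with free().
 */
static struct sketch_pmu_event_filter *
sketch_create_filter(const uint64_t *events, uint32_t nevents,
		     uint32_t action, uint32_t flags,
		     uint32_t fixed_counter_bitmap)
{
	size_t size = sizeof(struct sketch_pmu_event_filter) +
		      (size_t)nevents * sizeof(uint64_t);
	struct sketch_pmu_event_filter *f = calloc(1, size);

	if (!f)
		return NULL;

	f->action = action;
	f->nevents = nevents;	/* uint32_t, matching the first patch */
	f->flags = flags;
	f->fixed_counter_bitmap = fixed_counter_bitmap;
	if (nevents)
		memcpy(f->events, events, (size_t)nevents * sizeof(uint64_t));

	return f;
}
```

Using one creator for both generic events and the fixed bitmap lets a test vary either input without duplicating the allocation logic.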
These changes help to ensure that KVM's PMU event filter functions as expected
in all supported use cases. These patches have been tested and verified to
function properly.
Thanks for your review and feedback.
Sincerely,
Jinrong Liang
Previous:
https://lore.kernel.org/kvm/20230414110056.19665-1-cloudliang@tencent.com
v2:
- Wrap the code from the documentation in a block of code; (Bagas Sanjaya)
Jinrong Liang (7):
KVM: selftests: Replace int with uint32_t for nevents
KVM: selftests: Apply create_pmu_event_filter() to fixed ctrs
KVM: selftests: Test unavailable event filters are rejected
KVM: x86/pmu: Add documentation for fixed ctr on PMU filter
KVM: selftests: Check if pmu_event_filter meets expectations on fixed
ctrs
KVM: selftests: Check gp event filters without affecting fixed event
filters
KVM: selftests: Test pmu event filter with incompatible
kvm_pmu_event_filter
Documentation/virt/kvm/api.rst | 21 ++
.../kvm/x86_64/pmu_event_filter_test.c | 239 ++++++++++++++++--
2 files changed, 243 insertions(+), 17 deletions(-)
base-commit: a25497a280bbd7bbcc08c87ddb2b3909affc8402
--
2.31.1
This is the v7 of this series which tries to implement the fd-based KVM
guest private memory. The patches are based on latest kvm/queue branch
commit:
b9b71f43683a (kvm/queue) KVM: x86/mmu: Buffer nested MMU
split_desc_cache only by default capacity
Introduction
------------
In general, this patch series introduces fd-based memslots, which provide
guest memory through a memory file descriptor fd[offset,size] instead of
hva/size. The fd can be created from a supported memory filesystem
like tmpfs/hugetlbfs etc., which we refer to as the memory backing store. KVM
and the memory backing store exchange callbacks when such a memslot
gets created. At runtime, KVM will call into callbacks provided by the
backing store to get the pfn with the fd+offset. The memory backing store
will also call into KVM callbacks when userspace punches a hole in the fd,
to notify KVM to unmap secondary MMU page table entries.
Compared to the existing hva-based memslot, this new type of memslot allows
guest memory to be unmapped from host userspace like QEMU and even the kernel
itself, thereby reducing the attack surface and preventing bugs.
Based on this fd-based memslot, we can build guest private memory that
is going to be used in confidential computing environments such as Intel
TDX and AMD SEV. When supported, the memory backing store can provide
more enforcement on the fd and KVM can use a single memslot to hold both
the private and shared part of the guest memory.
mm extension
------------
Introduces a new MFD_INACCESSIBLE flag for memfd_create(). A file
created with this flag cannot be accessed with read(), write(), mmap() etc. via normal
MMU operations. The file content can only be used with the newly
introduced memfile_notifier extension.
The memfile_notifier extension provides two sets of callbacks for KVM to
interact with the memory backing store:
- memfile_notifier_ops: callbacks for memory backing store to notify
KVM when memory gets invalidated.
- backing store callbacks: callbacks for KVM to call into memory
backing store to request memory pages for guest private memory.
The memfile_notifier extension also provides APIs for memory backing
store to register/unregister itself and to trigger the notifier when the
bookmarked memory gets invalidated.
The patchset also introduces a new memfd seal F_SEAL_AUTO_ALLOCATE to
prevent double allocation caused unintentionally by the guest when only
one side of the shared/private memfds is effective.
memslot extension
-----------------
Add the private fd and the fd offset to existing 'shared' memslot so
that both private/shared guest memory can live in one single memslot.
A page in the memslot is either private or shared. Whether a guest page
is private or shared is maintained through reusing existing SEV ioctls
KVM_MEMORY_ENCRYPT_{UN,}REG_REGION.
Test
----
To test the new functionality of this patchset, the TDX patchset is needed.
Since the TDX patchset has not been merged, I did two kinds of tests:
- Regression test on kvm/queue (this patchset)
Most of the new code is not covered. The code is also in the repo below:
https://github.com/chao-p/linux/tree/privmem-v7
- New functional test on the latest TDX code
The patch is rebased onto the latest TDX code and the new
functionality was tested. See the repos below:
Linux: https://github.com/chao-p/linux/tree/privmem-v7-tdx
QEMU: https://github.com/chao-p/qemu/tree/privmem-v7
An example QEMU command line for TDX test:
-object tdx-guest,id=tdx,debug=off,sept-ve-disable=off \
-machine confidential-guest-support=tdx \
-object memory-backend-memfd-private,id=ram1,size=${mem} \
-machine memory-backend=ram1
Changelog
----------
v7:
- Move the private/shared info from backing store to KVM.
- Introduce F_SEAL_AUTO_ALLOCATE to avoid double allocation.
- Rework on the sync mechanism between zap/page fault paths.
- Addressed other comments in v6.
v6:
- Re-organized patch for both mm/KVM parts.
- Added flags for memfile_notifier so its consumers can state their
features and memory backing store can check against these flags.
- Put a backing store reference in the memfile_notifier and move pfn_ops
into backing store.
- Only support registering the backing store at boot time.
- Overall KVM part improvement suggested by Sean and some others.
v5:
- Removed userspace visible F_SEAL_INACCESSIBLE, instead using an
in-kernel flag (SHM_F_INACCESSIBLE for shmem). Private fd can only
be created by MFD_INACCESSIBLE.
- Introduced new APIs for backing store to register itself to
memfile_notifier instead of direct function call.
- Added the accounting and restriction for MFD_INACCESSIBLE memory.
- Added KVM API doc for new memslot extensions and man page for the new
MFD_INACCESSIBLE flag.
- Removed the overlap check for mapping the same file+offset into
multiple gfns due to perf consideration, warned in document.
- Addressed other comments in v4.
v4:
- Decoupled the callbacks between KVM/mm from memfd and use new
name 'memfile_notifier'.
- Supported registering multiple memslots to the same backing store.
- Added per-memslot pfn_ops instead of per-system.
- Reworked the invalidation part.
- Improved new KVM uAPIs (private memslot extension and memory
error) per Sean's suggestions.
- Addressed many other minor fixes for comments from v3.
v3:
- Added locking protection when calling
invalidate_page_range/fallocate callbacks.
- Changed memslot structure to keep using useraddr for shared memory.
- Re-organized F_SEAL_INACCESSIBLE and MEMFD_OPS.
- Added MFD_INACCESSIBLE flag to force F_SEAL_INACCESSIBLE.
- Commit message improvement.
- Many small fixes for comments from the last version.
Links to previous discussions
-----------------------------
[1] Original design proposal:
https://lkml.kernel.org/kvm/20210824005248.200037-1-seanjc@google.com/
[2] Updated proposal and RFC patch v1:
https://lkml.kernel.org/linux-fsdevel/20211111141352.26311-1-chao.p.peng@li…
[3] Patch v5: https://lkml.org/lkml/2022/5/19/861
Chao Peng (12):
mm: Add F_SEAL_AUTO_ALLOCATE seal to memfd
selftests/memfd: Add tests for F_SEAL_AUTO_ALLOCATE
mm: Introduce memfile_notifier
mm/memfd: Introduce MFD_INACCESSIBLE flag
KVM: Rename KVM_PRIVATE_MEM_SLOTS to KVM_INTERNAL_MEM_SLOTS
KVM: Use gfn instead of hva for mmu_notifier_retry
KVM: Rename mmu_notifier_*
KVM: Extend the memslot to support fd-based private memory
KVM: Add KVM_EXIT_MEMORY_FAULT exit
KVM: Register/unregister the guest private memory regions
KVM: Handle page fault for private memory
KVM: Enable and expose KVM_MEM_PRIVATE
Kirill A. Shutemov (1):
mm/shmem: Support memfile_notifier
Documentation/virt/kvm/api.rst | 77 +++++-
arch/arm64/kvm/mmu.c | 8 +-
arch/mips/include/asm/kvm_host.h | 2 +-
arch/mips/kvm/mmu.c | 10 +-
arch/powerpc/include/asm/kvm_book3s_64.h | 2 +-
arch/powerpc/kvm/book3s_64_mmu_host.c | 4 +-
arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 +-
arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 8 +-
arch/powerpc/kvm/e500_mmu_host.c | 4 +-
arch/riscv/kvm/mmu.c | 4 +-
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/Kconfig | 3 +
arch/x86/kvm/mmu.h | 2 -
arch/x86/kvm/mmu/mmu.c | 74 +++++-
arch/x86/kvm/mmu/mmu_internal.h | 18 ++
arch/x86/kvm/mmu/mmutrace.h | 1 +
arch/x86/kvm/mmu/paging_tmpl.h | 4 +-
arch/x86/kvm/x86.c | 2 +-
include/linux/kvm_host.h | 105 +++++---
include/linux/memfile_notifier.h | 91 +++++++
include/linux/shmem_fs.h | 2 +
include/uapi/linux/fcntl.h | 1 +
include/uapi/linux/kvm.h | 37 +++
include/uapi/linux/memfd.h | 1 +
mm/Kconfig | 4 +
mm/Makefile | 1 +
mm/memfd.c | 18 +-
mm/memfile_notifier.c | 123 ++++++++++
mm/shmem.c | 125 +++++++++-
tools/testing/selftests/memfd/memfd_test.c | 166 +++++++++++++
virt/kvm/Kconfig | 3 +
virt/kvm/kvm_main.c | 272 ++++++++++++++++++---
virt/kvm/pfncache.c | 14 +-
35 files changed, 1074 insertions(+), 127 deletions(-)
create mode 100644 include/linux/memfile_notifier.h
create mode 100644 mm/memfile_notifier.c
--
2.25.1
Many uses of the KUnit resource system are intended to simply defer
calling a function until the test exits (be it due to success or
failure). The existing kunit_alloc_resource() function is often used for
this, but it is awkward to use (requiring passing NULL init functions, etc),
and it returns a resource without incrementing its reference count, which
-- while okay for this use-case -- could cause problems in others.
Instead, introduce a simple kunit_add_action() API: a function
(returning nothing, accepting a single void* argument) can be scheduled
to be called when the test exits. Deferred actions are called in the
reverse of the order in which they were registered.
This mimics the devres API, devm_add_action(), and also provides
kunit_remove_action(), to cancel a deferred action, and
kunit_release_action() to trigger one early.
This is implemented as a resource under the hood, so the ordering
between resource cleanup and deferred functions is maintained.
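As a rough illustration of the semantics described above, here is a small userspace model, not the KUnit implementation. All names in it (action_stack, stack_add(), stack_remove(), stack_release(), stack_cleanup(), order_log, log_one(), log_two()) are hypothetical, and the fixed-size array stands in for the resource list the real code uses.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names are KUnit APIs. */
typedef void (action_fn)(void *);

#define MAX_ACTIONS 16

struct deferred {
	action_fn *func;
	void *ctx;
};

struct action_stack {
	struct deferred slots[MAX_ACTIONS];
	size_t count;
};

/* Defer func(ctx); returns 0 on success, -1 if the stack is full. */
int stack_add(struct action_stack *s, action_fn *func, void *ctx)
{
	assert(func != NULL);
	if (s->count == MAX_ACTIONS)
		return -1;
	s->slots[s->count].func = func;
	s->slots[s->count].ctx = ctx;
	s->count++;
	return 0;
}

/* Drop the most recently added matching entry; returns 1 if one was found. */
int stack_remove(struct action_stack *s, action_fn *func, void *ctx)
{
	for (size_t i = s->count; i-- > 0; ) {
		if (s->slots[i].func == func && s->slots[i].ctx == ctx) {
			/* Close the gap so the remaining order is preserved. */
			for (size_t j = i; j + 1 < s->count; j++)
				s->slots[j] = s->slots[j + 1];
			s->count--;
			return 1;
		}
	}
	return 0;
}

/* Run the most recent matching entry now and cancel its deferred run. */
void stack_release(struct action_stack *s, action_fn *func, void *ctx)
{
	if (stack_remove(s, func, ctx))
		func(ctx);
}

/* Test exit: run everything still pending, newest first. */
void stack_cleanup(struct action_stack *s)
{
	while (s->count > 0) {
		struct deferred d = s->slots[--s->count];

		d.func(d.ctx);
	}
}

/* Example actions: append an id to a log so the run order is visible. */
struct order_log {
	int ids[8];
	size_t len;
};

void log_one(void *ctx)
{
	struct order_log *log = ctx;

	log->ids[log->len++] = 1;
}

void log_two(void *ctx)
{
	struct order_log *log = ctx;

	log->ids[log->len++] = 2;
}
```

In this model, adding log_one, log_two, log_one, log_two, then removing log_one, releasing log_two and finally cleaning up records the ids 2, 2, 1, which is the order the action-ordering test in this patch expects.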
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
No changes since v2:
https://lore.kernel.org/linux-kselftest/20230518083849.2631178-1-davidgow@g…
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230421084226.2278282-2-davidgow@g…
- Some small documentation updates (Thanks Daniel)
- Reinstate a typedef for the action function.
- This time, it's called kunit_action_t
- Thanks Maxime!
Changes since RFC v2:
https://lore.kernel.org/linux-kselftest/20230331080411.981038-2-davidgow@go…
- Got rid of internal_gfp
- everything uses GFP_KERNEL now
- This includes kunit_kzalloc() and friends, which still allocate the
returned memory with the provided GFP, but use GFP_KERNEL for
internal bookkeeping data.
- Thanks Maxime & Benjamin!
- Got rid of cancellation tokens.
- kunit_add_action() now returns 0 on success, otherwise an error
- Note that this can quite easily lead to a memory leak, so look at
kunit_add_action_or_reset()
- Thanks Maxime & Benjamin!
- Added kunit_add_action_or_reset
- Matches devm_add_action_or_reset()
- Returns 0 on success.
- Thanks Maxime & Benjamin!
- Got rid of the kunit_defer_func_t typedef.
- I liked it, but it is probably pushing the boundaries of kernel
style.
- Use (void (*)(void *)) instead.
Changes since RFC v1:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-2-davidgow@g…
- Rename functions to better match the devm_* APIs. (Thanks Maxime)
- Embed the kunit_resource in struct kunit_action_ctx to avoid an extra
allocation (Thanks Benjamin)
- Use 'struct kunit_action_ctx' as the type for cancellation tokens
(Thanks Benjamin)
- Add tests.
---
include/kunit/resource.h | 92 +++++++++++++++++++++++++++++++++++++
lib/kunit/kunit-test.c | 88 ++++++++++++++++++++++++++++++++++-
lib/kunit/resource.c | 99 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 278 insertions(+), 1 deletion(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index c0d88b318e90..b64eb783b1bc 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -387,4 +387,96 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
*/
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
+/* A 'deferred action' function to be used with kunit_add_action. */
+typedef void (kunit_action_t)(void *);
+
+/**
+ * kunit_add_action() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @action: The function to run on test exit
+ * @ctx: Data passed into @action
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the reverse of the order
+ * in which they were registered.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * See also: devm_add_action() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
+
+/**
+ * kunit_add_action_or_reset() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @action: The function to run on test exit
+ * @ctx: Data passed into @action
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the reverse of the order
+ * in which they were registered.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * If the action cannot be created (e.g., due to the system being out of memory),
+ * then action(ctx) will be called immediately, and an error will be returned.
+ *
+ * See also: devm_add_action_or_reset() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_remove_action() - Cancel a matching deferred action.
+ * @test: Test case the action is associated with.
+ * @action: The deferred function to cancel.
+ * @ctx: The context passed to the deferred function to cancel.
+ *
+ * Prevent an action deferred via kunit_add_action() from executing when the
+ * test terminates.
+ *
+ * If the function/context pair was deferred multiple times, only the most
+ * recent one will be cancelled.
+ *
+ * See also: devm_remove_action() for the devres equivalent.
+ */
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_release_action() - Run a matching action call immediately.
+ * @test: Test case the action is associated with.
+ * @action: The deferred function to trigger.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Execute a function deferred via kunit_add_action() immediately, rather than
+ * when the test ends.
+ *
+ * If the function/context pair was deferred multiple times, it will only be
+ * executed once here. The most recent deferral will no longer execute when
+ * the test ends.
+ *
+ * kunit_release_action(test, func, ctx);
+ * is equivalent to
+ * func(ctx);
+ * kunit_remove_action(test, func, ctx);
+ *
+ * See also: devm_release_action() for the devres equivalent.
+ */
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
#endif /* _KUNIT_RESOURCE_H */
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 42e44caa1bdd..83d8e90ca7a2 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -112,7 +112,7 @@ struct kunit_test_resource_context {
struct kunit test;
bool is_resource_initialized;
int allocate_order[2];
- int free_order[2];
+ int free_order[4];
};
static int fake_resource_init(struct kunit_resource *res, void *context)
@@ -403,6 +403,88 @@ static void kunit_resource_test_named(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty(&test->resources));
}
+static void increment_int(void *ctx)
+{
+ int *i = (int *)ctx;
+ (*i)++;
+}
+
+static void kunit_resource_test_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Once we've cleaned up, the action queue is empty. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Check the same function can be deferred multiple times. */
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 3);
+}
+static void kunit_resource_test_remove_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+
+ kunit_remove_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+}
+static void kunit_resource_test_release_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ /* Runs immediately on trigger. */
+ kunit_release_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Doesn't run again on test exit. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+}
+static void action_order_1(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 1);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_1");
+}
+static void action_order_2(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 2);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_2");
+}
+static void kunit_resource_test_action_ordering(struct kunit *test)
+{
+ struct kunit_test_resource_context *ctx = test->priv;
+
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_remove_action(test, action_order_1, ctx);
+ kunit_release_action(test, action_order_2, ctx);
+ kunit_cleanup(test);
+
+ /* [2 is triggered] [2], [(1 is cancelled)] [1] */
+ KUNIT_EXPECT_EQ(test, ctx->free_order[0], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[1], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[2], 1);
+}
+
static int kunit_resource_test_init(struct kunit *test)
{
struct kunit_test_resource_context *ctx =
@@ -434,6 +516,10 @@ static struct kunit_case kunit_resource_test_cases[] = {
KUNIT_CASE(kunit_resource_test_proper_free_ordering),
KUNIT_CASE(kunit_resource_test_static),
KUNIT_CASE(kunit_resource_test_named),
+ KUNIT_CASE(kunit_resource_test_action),
+ KUNIT_CASE(kunit_resource_test_remove_action),
+ KUNIT_CASE(kunit_resource_test_release_action),
+ KUNIT_CASE(kunit_resource_test_action_ordering),
{}
};
diff --git a/lib/kunit/resource.c b/lib/kunit/resource.c
index c414df922f34..f0209252b179 100644
--- a/lib/kunit/resource.c
+++ b/lib/kunit/resource.c
@@ -77,3 +77,102 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
return 0;
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
+
+struct kunit_action_ctx {
+ struct kunit_resource res;
+ kunit_action_t *func;
+ void *ctx;
+};
+
+static void __kunit_action_free(struct kunit_resource *res)
+{
+ struct kunit_action_ctx *action_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ action_ctx->func(action_ctx->ctx);
+}
+
+
+int kunit_add_action(struct kunit *test, void (*action)(void *), void *ctx)
+{
+ struct kunit_action_ctx *action_ctx;
+
+ KUNIT_ASSERT_NOT_NULL_MSG(test, action, "Tried to action a NULL function!");
+
+ action_ctx = kzalloc(sizeof(*action_ctx), GFP_KERNEL);
+ if (!action_ctx)
+ return -ENOMEM;
+
+ action_ctx->func = action;
+ action_ctx->ctx = ctx;
+
+ action_ctx->res.should_kfree = true;
+ /* As init is NULL, this cannot fail. */
+ __kunit_add_resource(test, NULL, __kunit_action_free, &action_ctx->res, action_ctx);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action);
+
+int kunit_add_action_or_reset(struct kunit *test, void (*action)(void *),
+ void *ctx)
+{
+ int res = kunit_add_action(test, action, ctx);
+
+ if (res)
+ action(ctx);
+ return res;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action_or_reset);
+
+static bool __kunit_action_match(struct kunit *test,
+ struct kunit_resource *res, void *match_data)
+{
+ struct kunit_action_ctx *match_ctx = (struct kunit_action_ctx *)match_data;
+ struct kunit_action_ctx *res_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ /* Make sure this is a free function. */
+ if (res->free != __kunit_action_free)
+ return false;
+
+ /* Both the function and context data should match. */
+ return (match_ctx->func == res_ctx->func) && (match_ctx->ctx == res_ctx->ctx);
+}
+
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ /* Remove the free function so we don't run the action. */
+ res->free = NULL;
+ kunit_remove_resource(test, res);
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_remove_action);
+
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ kunit_remove_resource(test, res);
+ /* We have to put() this here, else free won't be called. */
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_release_action);
--
2.41.0.rc0.172.g3f132b7071-goog
Many uses of the KUnit resource system are intended to simply defer
calling a function until the test exits (be it due to success or
failure). The existing kunit_alloc_resource() function is often used for
this, but it is awkward to use (requiring passing NULL init functions, etc),
and it returns a resource without incrementing its reference count, which
-- while okay for this use-case -- could cause problems in others.
Instead, introduce a simple kunit_add_action() API: a function
(returning nothing, accepting a single void* argument) can be scheduled
to be called when the test exits. Deferred actions are called in the
reverse of the order in which they were registered.
This mimics the devres API, devm_add_action(), and also provides
kunit_remove_action(), to cancel a deferred action, and
kunit_release_action() to trigger one early.
This is implemented as a resource under the hood, so the ordering
between resource cleanup and deferred functions is maintained.
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230421084226.2278282-2-davidgow@g…
- Some small documentation updates (Thanks Daniel)
- Reinstate a typedef for the action function.
- This time, it's called kunit_action_t
- Thanks Maxime!
Changes since RFC v2:
https://lore.kernel.org/linux-kselftest/20230331080411.981038-2-davidgow@go…
- Got rid of internal_gfp
- everything uses GFP_KERNEL now
- This includes kunit_kzalloc() and friends, which still allocate the
returned memory with the provided GFP, but use GFP_KERNEL for
internal bookkeeping data.
- Thanks Maxime & Benjamin!
- Got rid of cancellation tokens.
- kunit_add_action() now returns 0 on success, otherwise an error
- Note that this can quite easily lead to a memory leak, so look at
kunit_add_action_or_reset()
- Thanks Maxime & Benjamin!
- Added kunit_add_action_or_reset
- Matches devm_add_action_or_reset()
- Returns 0 on success.
- Thanks Maxime & Benjamin!
- Got rid of the kunit_defer_func_t typedef.
- I liked it, but it is probably pushing the boundaries of kernel
style.
- Use (void (*)(void *)) instead.
Changes since RFC v1:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-2-davidgow@g…
- Rename functions to better match the devm_* APIs. (Thanks Maxime)
- Embed the kunit_resource in struct kunit_action_ctx to avoid an extra
allocation (Thanks Benjamin)
- Use 'struct kunit_action_ctx' as the type for cancellation tokens
(Thanks Benjamin)
- Add tests.
---
include/kunit/resource.h | 92 +++++++++++++++++++++++++++++++++++++
lib/kunit/kunit-test.c | 88 ++++++++++++++++++++++++++++++++++-
lib/kunit/resource.c | 99 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 278 insertions(+), 1 deletion(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index c0d88b318e90..b64eb783b1bc 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -387,4 +387,96 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
*/
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
+/* A 'deferred action' function to be used with kunit_add_action. */
+typedef void (kunit_action_t)(void *);
+
+/**
+ * kunit_add_action() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @action: The function to run on test exit
+ * @ctx: Data passed into @action
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the reverse of the order
+ * in which they were registered.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * See also: devm_add_action() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
+
+/**
+ * kunit_add_action_or_reset() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @action: The function to run on test exit
+ * @ctx: Data passed into @action
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the reverse of the order
+ * in which they were registered.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * If the action cannot be created (e.g., due to the system being out of memory),
+ * then action(ctx) will be called immediately, and an error will be returned.
+ *
+ * See also: devm_add_action_or_reset() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_remove_action() - Cancel a matching deferred action.
+ * @test: Test case the action is associated with.
+ * @action: The deferred function to cancel.
+ * @ctx: The context passed to the deferred function to cancel.
+ *
+ * Prevent an action deferred via kunit_add_action() from executing when the
+ * test terminates.
+ *
+ * If the function/context pair was deferred multiple times, only the most
+ * recent one will be cancelled.
+ *
+ * See also: devm_remove_action() for the devres equivalent.
+ */
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_release_action() - Run a matching action call immediately.
+ * @test: Test case the action is associated with.
+ * @action: The deferred function to trigger.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Execute a function deferred via kunit_add_action() immediately, rather than
+ * when the test ends.
+ *
+ * If the function/context pair was deferred multiple times, it will only be
+ * executed once here. The most recent deferral will no longer execute when
+ * the test ends.
+ *
+ * kunit_release_action(test, func, ctx);
+ * is equivalent to
+ * func(ctx);
+ * kunit_remove_action(test, func, ctx);
+ *
+ * See also: devm_release_action() for the devres equivalent.
+ */
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
#endif /* _KUNIT_RESOURCE_H */
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 42e44caa1bdd..83d8e90ca7a2 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -112,7 +112,7 @@ struct kunit_test_resource_context {
struct kunit test;
bool is_resource_initialized;
int allocate_order[2];
- int free_order[2];
+ int free_order[4];
};
static int fake_resource_init(struct kunit_resource *res, void *context)
@@ -403,6 +403,88 @@ static void kunit_resource_test_named(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty(&test->resources));
}
+static void increment_int(void *ctx)
+{
+ int *i = (int *)ctx;
+ (*i)++;
+}
+
+static void kunit_resource_test_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Once we've cleaned up, the action queue is empty. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Check the same function can be deferred multiple times. */
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 3);
+}
+static void kunit_resource_test_remove_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+
+ kunit_remove_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+}
+static void kunit_resource_test_release_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ /* Runs immediately on trigger. */
+ kunit_release_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Doesn't run again on test exit. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+}
+static void action_order_1(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 1);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_1");
+}
+static void action_order_2(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 2);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_2");
+}
+static void kunit_resource_test_action_ordering(struct kunit *test)
+{
+ struct kunit_test_resource_context *ctx = test->priv;
+
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_remove_action(test, action_order_1, ctx);
+ kunit_release_action(test, action_order_2, ctx);
+ kunit_cleanup(test);
+
+ /* [2 is triggered] [2], [(1 is cancelled)] [1] */
+ KUNIT_EXPECT_EQ(test, ctx->free_order[0], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[1], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[2], 1);
+}
+
static int kunit_resource_test_init(struct kunit *test)
{
struct kunit_test_resource_context *ctx =
@@ -434,6 +516,10 @@ static struct kunit_case kunit_resource_test_cases[] = {
KUNIT_CASE(kunit_resource_test_proper_free_ordering),
KUNIT_CASE(kunit_resource_test_static),
KUNIT_CASE(kunit_resource_test_named),
+ KUNIT_CASE(kunit_resource_test_action),
+ KUNIT_CASE(kunit_resource_test_remove_action),
+ KUNIT_CASE(kunit_resource_test_release_action),
+ KUNIT_CASE(kunit_resource_test_action_ordering),
{}
};
diff --git a/lib/kunit/resource.c b/lib/kunit/resource.c
index c414df922f34..f0209252b179 100644
--- a/lib/kunit/resource.c
+++ b/lib/kunit/resource.c
@@ -77,3 +77,102 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
return 0;
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
+
+struct kunit_action_ctx {
+ struct kunit_resource res;
+ kunit_action_t *func;
+ void *ctx;
+};
+
+static void __kunit_action_free(struct kunit_resource *res)
+{
+ struct kunit_action_ctx *action_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ action_ctx->func(action_ctx->ctx);
+}
+
+
+int kunit_add_action(struct kunit *test, void (*action)(void *), void *ctx)
+{
+ struct kunit_action_ctx *action_ctx;
+
+ KUNIT_ASSERT_NOT_NULL_MSG(test, action, "Tried to action a NULL function!");
+
+ action_ctx = kzalloc(sizeof(*action_ctx), GFP_KERNEL);
+ if (!action_ctx)
+ return -ENOMEM;
+
+ action_ctx->func = action;
+ action_ctx->ctx = ctx;
+
+ action_ctx->res.should_kfree = true;
+ /* As init is NULL, this cannot fail. */
+ __kunit_add_resource(test, NULL, __kunit_action_free, &action_ctx->res, action_ctx);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action);
+
+int kunit_add_action_or_reset(struct kunit *test, void (*action)(void *),
+ void *ctx)
+{
+ int res = kunit_add_action(test, action, ctx);
+
+ if (res)
+ action(ctx);
+ return res;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action_or_reset);
+
+static bool __kunit_action_match(struct kunit *test,
+ struct kunit_resource *res, void *match_data)
+{
+ struct kunit_action_ctx *match_ctx = (struct kunit_action_ctx *)match_data;
+ struct kunit_action_ctx *res_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ /* Make sure this is a free function. */
+ if (res->free != __kunit_action_free)
+ return false;
+
+ /* Both the function and context data should match. */
+ return (match_ctx->func == res_ctx->func) && (match_ctx->ctx == res_ctx->ctx);
+}
+
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ /* Remove the free function so we don't run the action. */
+ res->free = NULL;
+ kunit_remove_resource(test, res);
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_remove_action);
+
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ kunit_remove_resource(test, res);
+ /* We have to put() this here, else free won't be called. */
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_release_action);
--
2.40.1.698.g37aff9b760-goog
The basic idea here is to "simulate" memory poisoning for VMs. A VM
running on some host might encounter a memory error, after which some
page(s) are poisoned (i.e., future accesses SIGBUS). The guest expects
that once poisoned, pages can never become "un-poisoned". So, when we live
migrate the VM, we need to preserve the poisoned status of these pages.
When live migrating, we try to get the guest running on its new host as
quickly as possible. So, we start it running before all memory has been
copied, and before we're certain which pages should be poisoned or not.
So the basic way to use this new feature is:
- On the new host, the guest's memory is registered with userfaultfd, in
either MISSING or MINOR mode (doesn't really matter for this purpose).
- On any first access, we get a userfaultfd event. At this point we can
communicate with the old host to find out if the page was poisoned.
- If so, we can respond with a UFFDIO_SIGBUS - this places a swap marker
so any future accesses will SIGBUS. Because the pte is now "present",
future accesses won't generate more userfaultfd events, they'll just
SIGBUS directly.
UFFDIO_SIGBUS does not handle unmapping previously-present PTEs. This
isn't needed, because during live migration we want to intercept
all accesses with userfaultfd (not just writes, so WP mode isn't useful
for this). So whether minor or missing mode is being used (or both), the
PTE won't be present in any case, so handling that case isn't needed.
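The flow above can be modelled in plain userspace C to show the intended behaviour. This is only a sketch under stated assumptions: it makes no userfaultfd calls, and every name in it (dest_vm, page_state, guest_access(), poisoned_on_source) is hypothetical. The lookup in poisoned_on_source stands in for asking the old host, and setting PAGE_POISONED stands in for responding with UFFDIO_SIGBUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these names are illustrative, not kernel or uffd APIs. */
enum page_state { PAGE_UNTOUCHED, PAGE_MAPPED, PAGE_POISONED };
enum access_result { ACCESS_OK, ACCESS_SIGBUS };

#define NPAGES 4

struct dest_vm {
	enum page_state state[NPAGES];
	const bool *poisoned_on_source;	/* stands in for the old host's answer */
	unsigned int fault_events;	/* number of events the handler saw */
};

/* Model one guest access to a page on the destination host. */
enum access_result guest_access(struct dest_vm *vm, size_t page)
{
	assert(page < NPAGES);

	if (vm->state[page] == PAGE_UNTOUCHED) {
		/* First access: the handler gets an event and resolves it once. */
		vm->fault_events++;
		vm->state[page] = vm->poisoned_on_source[page] ?
				  PAGE_POISONED : PAGE_MAPPED;
	}

	/* Later accesses never reach the handler again. */
	return vm->state[page] == PAGE_POISONED ? ACCESS_SIGBUS : ACCESS_OK;
}
```

The point of the model is the last property in the list: once a page has been marked poisoned, repeated accesses keep failing without producing further events for the migration handler.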
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
fs/userfaultfd.c | 63 ++++++++++++++++++++++++++++++++
include/linux/swapops.h | 3 +-
include/linux/userfaultfd_k.h | 4 ++
include/uapi/linux/userfaultfd.h | 25 +++++++++++--
mm/memory.c | 4 ++
mm/userfaultfd.c | 62 ++++++++++++++++++++++++++++++-
6 files changed, 156 insertions(+), 5 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 0fd96d6e39ce..edc2928dae2b 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1966,6 +1966,66 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
return ret;
}
+static inline int userfaultfd_sigbus(struct userfaultfd_ctx *ctx, unsigned long arg)
+{
+ __s64 ret;
+ struct uffdio_sigbus uffdio_sigbus;
+ struct uffdio_sigbus __user *user_uffdio_sigbus;
+ struct userfaultfd_wake_range range;
+
+ user_uffdio_sigbus = (struct uffdio_sigbus __user *)arg;
+
+ ret = -EAGAIN;
+ if (atomic_read(&ctx->mmap_changing))
+ goto out;
+
+ ret = -EFAULT;
+ if (copy_from_user(&uffdio_sigbus, user_uffdio_sigbus,
+ /* don't copy the output fields */
+ sizeof(uffdio_sigbus) - (sizeof(__s64))))
+ goto out;
+
+ ret = validate_range(ctx->mm, uffdio_sigbus.range.start,
+ uffdio_sigbus.range.len);
+ if (ret)
+ goto out;
+
+ ret = -EINVAL;
+ /* double check for wraparound just in case. */
+ if (uffdio_sigbus.range.start + uffdio_sigbus.range.len <=
+ uffdio_sigbus.range.start) {
+ goto out;
+ }
+ if (uffdio_sigbus.mode & ~UFFDIO_SIGBUS_MODE_DONTWAKE)
+ goto out;
+
+ if (mmget_not_zero(ctx->mm)) {
+ ret = mfill_atomic_sigbus(ctx->mm, uffdio_sigbus.range.start,
+ uffdio_sigbus.range.len,
+ &ctx->mmap_changing, 0);
+ mmput(ctx->mm);
+ } else {
+ return -ESRCH;
+ }
+
+ if (unlikely(put_user(ret, &user_uffdio_sigbus->updated)))
+ return -EFAULT;
+ if (ret < 0)
+ goto out;
+
+ /* len == 0 would wake all */
+ BUG_ON(!ret);
+ range.len = ret;
+ if (!(uffdio_sigbus.mode & UFFDIO_SIGBUS_MODE_DONTWAKE)) {
+ range.start = uffdio_sigbus.range.start;
+ wake_userfault(ctx, &range);
+ }
+ ret = range.len == uffdio_sigbus.range.len ? 0 : -EAGAIN;
+
+out:
+ return ret;
+}
+
static inline unsigned int uffd_ctx_features(__u64 user_features)
{
/*
@@ -2067,6 +2127,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
case UFFDIO_CONTINUE:
ret = userfaultfd_continue(ctx, arg);
break;
+ case UFFDIO_SIGBUS:
+ ret = userfaultfd_sigbus(ctx, arg);
+ break;
}
return ret;
}
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 3a451b7afcb3..fa778a0ae730 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -405,7 +405,8 @@ typedef unsigned long pte_marker;
#define PTE_MARKER_UFFD_WP BIT(0)
#define PTE_MARKER_SWAPIN_ERROR BIT(1)
-#define PTE_MARKER_MASK (BIT(2) - 1)
+#define PTE_MARKER_UFFD_SIGBUS BIT(2)
+#define PTE_MARKER_MASK (BIT(3) - 1)
static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
{
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index d78b01524349..6de1084939c5 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -46,6 +46,7 @@ enum mfill_atomic_mode {
MFILL_ATOMIC_COPY,
MFILL_ATOMIC_ZEROPAGE,
MFILL_ATOMIC_CONTINUE,
+ MFILL_ATOMIC_SIGBUS,
NR_MFILL_ATOMIC_MODES,
};
@@ -83,6 +84,9 @@ extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm,
extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start,
unsigned long len, atomic_t *mmap_changing,
uffd_flags_t flags);
+extern ssize_t mfill_atomic_sigbus(struct mm_struct *dst_mm, unsigned long start,
+ unsigned long len, atomic_t *mmap_changing,
+ uffd_flags_t flags);
extern int mwriteprotect_range(struct mm_struct *dst_mm,
unsigned long start, unsigned long len,
bool enable_wp, atomic_t *mmap_changing);
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 66dd4cd277bd..616e33d3db97 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -39,7 +39,8 @@
UFFD_FEATURE_MINOR_SHMEM | \
UFFD_FEATURE_EXACT_ADDRESS | \
UFFD_FEATURE_WP_HUGETLBFS_SHMEM | \
- UFFD_FEATURE_WP_UNPOPULATED)
+ UFFD_FEATURE_WP_UNPOPULATED | \
+ UFFD_FEATURE_SIGBUS_IOCTL)
#define UFFD_API_IOCTLS \
((__u64)1 << _UFFDIO_REGISTER | \
(__u64)1 << _UFFDIO_UNREGISTER | \
@@ -49,12 +50,14 @@
(__u64)1 << _UFFDIO_COPY | \
(__u64)1 << _UFFDIO_ZEROPAGE | \
(__u64)1 << _UFFDIO_WRITEPROTECT | \
- (__u64)1 << _UFFDIO_CONTINUE)
+ (__u64)1 << _UFFDIO_CONTINUE | \
+ (__u64)1 << _UFFDIO_SIGBUS)
#define UFFD_API_RANGE_IOCTLS_BASIC \
((__u64)1 << _UFFDIO_WAKE | \
(__u64)1 << _UFFDIO_COPY | \
+ (__u64)1 << _UFFDIO_WRITEPROTECT | \
(__u64)1 << _UFFDIO_CONTINUE | \
- (__u64)1 << _UFFDIO_WRITEPROTECT)
+ (__u64)1 << _UFFDIO_SIGBUS)
/*
* Valid ioctl command number range with this API is from 0x00 to
@@ -71,6 +74,7 @@
#define _UFFDIO_ZEROPAGE (0x04)
#define _UFFDIO_WRITEPROTECT (0x06)
#define _UFFDIO_CONTINUE (0x07)
+#define _UFFDIO_SIGBUS (0x08)
#define _UFFDIO_API (0x3F)
/* userfaultfd ioctl ids */
@@ -91,6 +95,8 @@
struct uffdio_writeprotect)
#define UFFDIO_CONTINUE _IOWR(UFFDIO, _UFFDIO_CONTINUE, \
struct uffdio_continue)
+#define UFFDIO_SIGBUS _IOWR(UFFDIO, _UFFDIO_SIGBUS, \
+ struct uffdio_sigbus)
/* read() structure */
struct uffd_msg {
@@ -225,6 +231,7 @@ struct uffdio_api {
#define UFFD_FEATURE_EXACT_ADDRESS (1<<11)
#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM (1<<12)
#define UFFD_FEATURE_WP_UNPOPULATED (1<<13)
+#define UFFD_FEATURE_SIGBUS_IOCTL (1<<14)
__u64 features;
__u64 ioctls;
@@ -321,6 +328,18 @@ struct uffdio_continue {
__s64 mapped;
};
+struct uffdio_sigbus {
+ struct uffdio_range range;
+#define UFFDIO_SIGBUS_MODE_DONTWAKE ((__u64)1<<0)
+ __u64 mode;
+
+ /*
+ * Fields below here are written by the ioctl and must be at the end:
+ * the copy_from_user will not read past here.
+ */
+ __s64 updated;
+};
+
/*
* Flags for the userfaultfd(2) system call itself.
*/
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..e4b4207c2590 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3675,6 +3675,10 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
if (WARN_ON_ONCE(!marker))
return VM_FAULT_SIGBUS;
+ /* SIGBUS explicitly requested for this PTE. */
+ if (marker & PTE_MARKER_UFFD_SIGBUS)
+ return VM_FAULT_SIGBUS;
+
/* Higher priority than uffd-wp when data corrupted */
if (marker & PTE_MARKER_SWAPIN_ERROR)
return VM_FAULT_SIGBUS;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e97a0b4889fc..933587eebd5d 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -278,6 +278,51 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
goto out;
}
+/* Handles UFFDIO_SIGBUS for all non-hugetlb VMAs. */
+static int mfill_atomic_pte_sigbus(pmd_t *dst_pmd,
+ struct vm_area_struct *dst_vma,
+ unsigned long dst_addr,
+ uffd_flags_t flags)
+{
+ int ret;
+ struct mm_struct *dst_mm = dst_vma->vm_mm;
+ pte_t _dst_pte, *dst_pte;
+ spinlock_t *ptl;
+
+ _dst_pte = make_pte_marker(PTE_MARKER_UFFD_SIGBUS);
+ dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+
+ if (vma_is_shmem(dst_vma)) {
+ struct inode *inode;
+ pgoff_t offset, max_off;
+
+ /* serialize against truncate with the page table lock */
+ inode = dst_vma->vm_file->f_inode;
+ offset = linear_page_index(dst_vma, dst_addr);
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ ret = -EFAULT;
+ if (unlikely(offset >= max_off))
+ goto out_unlock;
+ }
+
+ ret = -EEXIST;
+ /*
+ * For now, we don't handle unmapping pages, so only support filling in
+ * none PTEs, or replacing PTE markers.
+ */
+ if (!pte_none_mostly(*dst_pte))
+ goto out_unlock;
+
+ set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+
+ /* No need to invalidate - it was non-present before */
+ update_mmu_cache(dst_vma, dst_addr, dst_pte);
+ ret = 0;
+out_unlock:
+ pte_unmap_unlock(dst_pte, ptl);
+ return ret;
+}
+
static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
{
pgd_t *pgd;
@@ -328,8 +373,12 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
* supported by hugetlb. A PMD_SIZE huge pages may exist as used
* by THP. Since we can not reliably insert a zero page, this
* feature is not supported.
+ *
+ * PTE marker handling for hugetlb is a bit special, so for now
+ * UFFDIO_SIGBUS is not supported.
*/
- if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) {
+ if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE) ||
+ uffd_flags_mode_is(flags, MFILL_ATOMIC_SIGBUS)) {
mmap_read_unlock(dst_mm);
return -EINVAL;
}
@@ -473,6 +522,9 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE)) {
return mfill_atomic_pte_continue(dst_pmd, dst_vma,
dst_addr, flags);
+ } else if (uffd_flags_mode_is(flags, MFILL_ATOMIC_SIGBUS)) {
+ return mfill_atomic_pte_sigbus(dst_pmd, dst_vma,
+ dst_addr, flags);
}
/*
@@ -694,6 +746,14 @@ ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start,
uffd_flags_set_mode(flags, MFILL_ATOMIC_CONTINUE));
}
+ssize_t mfill_atomic_sigbus(struct mm_struct *dst_mm, unsigned long start,
+ unsigned long len, atomic_t *mmap_changing,
+ uffd_flags_t flags)
+{
+ return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
+ uffd_flags_set_mode(flags, MFILL_ATOMIC_SIGBUS));
+}
+
long uffd_wp_range(struct vm_area_struct *dst_vma,
unsigned long start, unsigned long len, bool enable_wp)
{
--
2.40.1.606.ga4b1b128d6-goog
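The userfaultfd_sigbus() handler in the patch above copies only the input part of struct uffdio_sigbus from userspace and rejects ranges whose end wraps around. The following is a minimal sketch of that arithmetic, not the kernel code itself. The sketch_* structs and helpers are hypothetical local mirrors of the layout shown in the patch, so the example builds without the patched UAPI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the layouts in the patch; the names are illustrative only. */
struct sketch_range {
	uint64_t start;
	uint64_t len;
};

struct sketch_sigbus {
	struct sketch_range range;
	uint64_t mode;
	/* Output field: written by the ioctl, never read from userspace. */
	int64_t updated;
};

#define SKETCH_MODE_DONTWAKE ((uint64_t)1 << 0)

/* Number of bytes the handler reads: everything except the trailing output. */
static size_t sketch_input_size(void)
{
	/* The trick only works if the output field really is the last member. */
	assert(offsetof(struct sketch_sigbus, updated) ==
	       sizeof(struct sketch_sigbus) - sizeof(int64_t));
	return sizeof(struct sketch_sigbus) - sizeof(int64_t);
}

/* Mirrors the handler's sanity checks: no wraparound, no unknown mode bits. */
static int sketch_args_valid(const struct sketch_sigbus *req)
{
	if (req->range.start + req->range.len <= req->range.start)
		return 0;
	if (req->mode & ~SKETCH_MODE_DONTWAKE)
		return 0;
	return 1;
}
```

Note that a zero-length range also fails the wraparound test, since start + 0 is not greater than start.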
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give up on using uffd_wp_range() and write new helpers; flush the TLB only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding the PAGEMAP_SCAN IOCTL is to emulate the Windows
GetWriteWatch() syscall [1]. GetWriteWatch() retrieves the addresses of
the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications, games, etc. It is currently
emulated in userspace in a rather slow manner. Our goal is to enhance the
kernel so that it can be emulated efficiently.
Currently, some out-of-tree hack patches are used to emulate it efficiently
in some kernels. We intend to replace those with these patches, so that
gaming on Linux as a whole can benefit. This means there would be
tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use an ioctl on the pagemap file to read and/or reset the
soft-dirty flag. But with the soft-dirty flag, we sometimes get extra pages
which weren't even written. They had become soft-dirty because of VMA merging
and the VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We
were able to bypass this shortcoming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc. messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] was introduced. We
discussed whether we could revert these patches, but could not reach any
conclusion. So at this point, I made a couple of attempts to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had the potential to
cause regressions. We left it behind.
* [8] Keep a list of the soft-dirty parts of a VMA across splits and merges.
The reply was not to increase the size of the VMA by 8 bytes.
At this point, we gave up on soft-dirty, considering it too delicate, and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on the userfaultfd wp feature, where
the kernel resolves the faults itself when the WP_ASYNC feature is used. It
was straightforward to add the WP_ASYNC feature to userfaultfd. Now only
those pages which were really written are reported as dirty or written-to.
(PS: another userfaultfd feature, WP_UNPOPULATED, is required to avoid
pre-faulting memory before write-protecting [9].)
All the different masks were added at the request of the CRIU devs to make
the interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
*Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As the
kernel already has a soft-dirty feature, which we have given up on
using, we use the written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace,
but it isn't efficient:
- mprotect and a SIGSEGV handler for bookkeeping
- userfaultfd wp (synchronous) with a handler for bookkeeping
Some benchmarks can be seen here [1]. This series adds features that weren't
present earlier:
- There is no atomic get of the soft-dirty/written-to status plus clear in
the kernel.
- The pages which have been written-to cannot be found in an accurate way.
(The kernel's soft-dirty PTE bit + soft-dirty VMA bit show more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
GetWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of the soft-dirty feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
Its behaviour shouldn't be changed now. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
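As a rough illustration of the "!PM_UFFD_WP" test described above, the sketch below decodes 64-bit pagemap entries. It assumes the bit layout documented in Documentation/admin-guide/mm/pagemap.rst (bit 57 for uffd-wp, bits 61 to 63 for file/swap/present). The macros are local definitions and the helper names are hypothetical, not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the documented pagemap entry bits. */
#define PM_UFFD_WP  (UINT64_C(1) << 57)
#define PM_FILE     (UINT64_C(1) << 61)
#define PM_SWAP     (UINT64_C(1) << 62)
#define PM_PRESENT  (UINT64_C(1) << 63)

/* A page counts as written-to once its uffd-wp bit has been cleared. */
static int page_written_since_wp(uint64_t entry)
{
	return !(entry & PM_UFFD_WP);
}

/* Count written-to pages in an array of entries read from the pagemap file. */
static unsigned int count_written(const uint64_t *entries, unsigned int n)
{
	unsigned int written = 0;

	assert(entries != NULL || n == 0);
	for (unsigned int i = 0; i < n; i++)
		if (page_written_since_wp(entries[i]))
			written++;
	return written;
}
```

This only makes sense for a range that was write-protected beforehand; an entry that was never protected also has the bit clear.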
Information about whether a page is file mapped, present or swapped is
required by the CRIU project [5][6]. The required mask, any mask, excluded
mask and return mask were also added because they are required by the CRIU
project [5].
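One plausible reading of how the four masks combine is sketched below. This is an assumption for illustration only: the flag values, the sk_masks struct and the helper names are hypothetical, and the authoritative semantics are those of the fs/proc/task_mmu.c patch in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits; the real values live in the series' UAPI header. */
#define SK_PAGE_IS_WRITTEN (UINT64_C(1) << 0)
#define SK_PAGE_IS_FILE    (UINT64_C(1) << 1)
#define SK_PAGE_IS_PRESENT (UINT64_C(1) << 2)
#define SK_PAGE_IS_SWAPPED (UINT64_C(1) << 3)

struct sk_masks {
	uint64_t required;	/* all of these bits must be set */
	uint64_t anyof;		/* if non-zero, at least one must be set */
	uint64_t excluded;	/* none of these bits may be set */
	uint64_t ret;		/* bits reported back for matching pages */
};

/* Return 1 if a page with the given flags passes all three filters. */
static int sk_page_matches(uint64_t flags, const struct sk_masks *m)
{
	assert(m != NULL);
	if ((flags & m->required) != m->required)
		return 0;
	if (m->anyof && !(flags & m->anyof))
		return 0;
	if (flags & m->excluded)
		return 0;
	return 1;
}

/* Flags that would be returned to the caller for a matching page. */
static uint64_t sk_reported_flags(uint64_t flags, const struct sk_masks *m)
{
	assert(m != NULL);
	return flags & m->ret;
}
```

With this reading, "written, present, but not file mapped" is expressed as required = WRITTEN | PRESENT and excluded = FILE.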
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where the user only wants
to get a specific number of pages, so there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of pages has been found. The max_pages is
optional. If max_pages is specified, it must be equal to or greater than
vec_size. This restriction is needed to handle the worst case, when each
page_region contains info about only one page and cannot be compacted.
This is needed to emulate the Windows GetWriteWatch() syscall.
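The compaction and the max_pages limit described above can be sketched as follows. This is a userspace model under stated assumptions, not the kernel implementation: struct sk_region is a hypothetical stand-in for struct page_region, and sk_compact() is an illustrative helper that merges adjacent pages carrying identical flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page_region. */
struct sk_region {
	uint64_t start;		/* address of the first page */
	uint64_t npages;	/* number of consecutive pages */
	uint64_t flags;		/* flags shared by all pages in the region */
};

/*
 * Walk per-page flags and merge adjacent pages with equal, non-zero flags.
 * Stops when vec_size regions are in use and a new one is needed, or when
 * max_pages pages have been reported (max_pages == 0 means no limit).
 * Returns the number of regions written to vec.
 */
static size_t sk_compact(uint64_t base, uint64_t page_size,
			 const uint64_t *flags, size_t n,
			 struct sk_region *vec, size_t vec_size,
			 uint64_t max_pages)
{
	size_t used = 0;
	uint64_t reported = 0;

	assert(page_size != 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t addr = base + i * page_size;

		if (!flags[i])
			continue;	/* page is not of interest */
		if (max_pages && reported == max_pages)
			break;
		if (used && vec[used - 1].flags == flags[i] &&
		    vec[used - 1].start + vec[used - 1].npages * page_size == addr) {
			vec[used - 1].npages++;	/* extend the previous region */
		} else {
			if (used == vec_size)
				break;	/* output vector is full */
			vec[used].start = addr;
			vec[used].npages = 1;
			vec[used].flags = flags[i];
			used++;
		}
		reported++;
	}
	return used;
}
```

In the worst case no two pages can be merged, so every reported page consumes one region.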
The patch series includes a detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usage as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 56 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 481 +++++++
fs/userfaultfd.c | 26 +-
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 53 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 32 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1326 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
15 files changed, 2105 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
Hi,
On vanilla AlmaLinux 8.7 (CentOS fork) selftests/net/af_unix/diag_uid.c doesn't
compile out of the box, giving the errors:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix'
gcc diag_uid.c -o /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
diag_uid.c:36:16: error: ‘UDIAG_SHOW_UID’ undeclared here (not in a function); did you mean ‘UDIAG_SHOW_VFS’?
.udiag_show = UDIAG_SHOW_UID
^~~~~~~~~~~~~~
UDIAG_SHOW_VFS
In file included from diag_uid.c:17:
diag_uid.c: In function ‘render_response’:
diag_uid.c:128:28: error: ‘UNIX_DIAG_UID’ undeclared (first use in this function); did you mean ‘UNIX_DIAG_VFS’?
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:128:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
diag_uid.c:128:28: note: each undeclared identifier is reported only once for each function it appears in
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:128:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
make[2]: *** [../../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid] Error 1
The correct value is in <uapi/linux/unix_diag.h>:
include/uapi/linux/unix_diag.h:23:#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
The fix is as follows:
---
tools/testing/selftests/net/af_unix/diag_uid.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/net/af_unix/diag_uid.c b/tools/testing/selftests/net/af_unix/diag_uid.c
index 5b88f7129fea..66d75b646d35 100644
--- a/tools/testing/selftests/net/af_unix/diag_uid.c
+++ b/tools/testing/selftests/net/af_unix/diag_uid.c
@@ -16,6 +16,10 @@
#include "../../kselftest_harness.h"
+#ifndef UDIAG_SHOW_UID
+#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
+#endif
+
FIXTURE(diag_uid)
{
int netlink_fd;
--
However, this patch reveals another undefined value:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix'
gcc diag_uid.c -o /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
In file included from diag_uid.c:17:
diag_uid.c: In function ‘render_response’:
diag_uid.c:132:28: error: ‘UNIX_DIAG_UID’ undeclared (first use in this function); did you mean ‘UNIX_DIAG_VFS’?
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:132:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
diag_uid.c:132:28: note: each undeclared identifier is reported only once for each function it appears in
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:132:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
make[2]: *** [../../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid] Error 1
Apparently, AlmaLinux 8.7 lacks this enum UNIX_DIAG_UID:
diff -u /usr/include/linux/unix_diag.h include/uapi/linux/unix_diag.h
--- /usr/include/linux/unix_diag.h 2023-05-16 13:47:51.000000000 +0200
+++ include/uapi/linux/unix_diag.h 2022-10-12 07:35:58.253481367 +0200
@@ -20,6 +20,7 @@
#define UDIAG_SHOW_ICONS 0x00000008 /* show pending connections */
#define UDIAG_SHOW_RQLEN 0x00000010 /* show skb receive queue len */
#define UDIAG_SHOW_MEMINFO 0x00000020 /* show memory info of a socket */
+#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
struct unix_diag_msg {
__u8 udiag_family;
@@ -40,6 +41,7 @@
UNIX_DIAG_RQLEN,
UNIX_DIAG_MEMINFO,
UNIX_DIAG_SHUTDOWN,
+ UNIX_DIAG_UID,
__UNIX_DIAG_MAX,
};
Now, this is a change in enums and there doesn't seem to be an easy way out
here. (I think I saw an example, but I cannot recall which thread. I will do
more research.)
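One possible workaround, offered only as a sketch: enum constants are invisible to the preprocessor, but the header diff above shows UNIX_DIAG_UID appearing together with the UDIAG_SHOW_UID macro, directly after UNIX_DIAG_SHUTDOWN. Assuming the two always come as a pair, the macro can serve as a proxy for both. The helper check_diag_uid_fallback() is hypothetical, and whether such a fallback would be acceptable in the selftest is an open question.

```c
#include <assert.h>
#include <linux/unix_diag.h>

/*
 * UNIX_DIAG_UID is an enum constant, so "#ifndef UNIX_DIAG_UID" cannot detect
 * it. Assumption: it was added together with the UDIAG_SHOW_UID macro, so the
 * macro's absence implies the enum constant is missing as well.
 */
#ifndef UDIAG_SHOW_UID
#define UDIAG_SHOW_UID	0x00000040	/* show socket's UID */
#define UNIX_DIAG_UID	(UNIX_DIAG_SHUTDOWN + 1)
#endif

/* Sanity check that holds with both old and new system headers. */
static void check_diag_uid_fallback(void)
{
	assert(UDIAG_SHOW_UID == 0x00000040);
	assert(UNIX_DIAG_UID == UNIX_DIAG_SHUTDOWN + 1);
}
```

On systems with up-to-date headers the block is skipped entirely and the real enum constant is used.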
When I included
# gcc -I ../../../../include diag_uid.c
I've got the following error:
[marvin@pc-mtodorov linux_torvalds]$ cd tools/testing/selftests/net/af_unix/
[marvin@pc-mtodorov af_unix]$ gcc -I ../../../../../include diag_uid.c -o
/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
In file included from ../../../../../include/linux/build_bug.h:5,
from ../../../../../include/linux/bits.h:21,
from ../../../../../include/linux/capability.h:18,
from ../../../../../include/linux/netlink.h:6,
from diag_uid.c:8:
../../../../../include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory
#include <asm/rwonce.h>
^~~~~~~~~~~~~~
compilation terminated.
[marvin@pc-mtodorov af_unix]$
At this point I gave up, as it would be overkill to change a kernel system
header to make a test pass, and this probably wouldn't be accepted upstream?
Hope this helps. (If we still want to build on CentOS/AlmaLinux/Rocky 8?)
Best regards,
Mirsad
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
Add documentation for the new Virtual PCM Test Driver. It covers all
possible usage cases: errors and delay injections, random and
pattern-based data generation, playback and ioctl redefinition
functionalities testing.
We have a lot of different virtual media drivers, which can be used for
testing of the userspace applications and media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating the video devices. For audio devices we have only snd-dummy
module, which is good in simulating the correct behavior of an ALSA device.
I decided to write a tool, which would help to test the userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual PCM
Test Driver.
This new Virtual PCM Test Driver has several features which can be useful
during the userspace ALSA applications testing/fuzzing, or testing/fuzzing
of the PCM middle layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what this
driver can do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
V1 -> V2:
- Rename the driver from 'valsa' to 'pcmtest'.
- Implement support for interleaved and non-interleaved access modes
- Add support for 8 substreams and 4 channels
- Extend supported formats
- Extend and rewrite in C the selftest for the driver
Documentation/sound/cards/index.rst | 1 +
Documentation/sound/cards/pcmtest.rst | 119 ++++++++++++++++++++++++++
2 files changed, 120 insertions(+)
create mode 100644 Documentation/sound/cards/pcmtest.rst
diff --git a/Documentation/sound/cards/index.rst b/Documentation/sound/cards/index.rst
index c016f8c3b88b..49c1f2f688f8 100644
--- a/Documentation/sound/cards/index.rst
+++ b/Documentation/sound/cards/index.rst
@@ -17,3 +17,4 @@ Card-Specific Information
hdspm
serial-u16550
img-spdif-in
+ pcmtest
diff --git a/Documentation/sound/cards/pcmtest.rst b/Documentation/sound/cards/pcmtest.rst
new file mode 100644
index 000000000000..ea8070eaa44e
--- /dev/null
+++ b/Documentation/sound/cards/pcmtest.rst
@@ -0,0 +1,119 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual PCM Test Driver
+===========================
+
+The Virtual PCM Test Driver emulates a generic PCM device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the PCM middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+It supports up to 8 substreams and 4 channels. Also it supports both interleaved and
+non-interleaved access modes.
+
+Also, this driver can check the playback stream for containing the predefined pattern,
+which is used in the corresponding selftest (alsa/test-pcmtest-driver.c) to check the
+PCM middle layer data transferring functionality. Additionally, this driver redefines
+the default RESET ioctl, and the selftest covers this PCM API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file (/sys/kernel/debug/pcmtest/fill_pattern).
+Like that:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/pcmtest/fill_pattern
+
+After this, every capture action performed on the 'pcmtest' device will return
+'mycoolpatternmycoolpatternmycoolpatternmy...' for every channel buffer.
+
+In case of interleaved access, the capture buffer will contain the repeated pattern
+for every channel. Otherwise, every channel buffer will contain the repeated pattern.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has 'inject_delay' parameter, which has very self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has integer type, and
+it means the delay added between module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower, if it is
+negative - faster. You can try it yourself by starting a recording in any
+audio recording application (like Audacity) and selecting the 'pcmtest' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_pcmtest/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+
+Playback test
+-------------
+
+This driver can be also used for the playback functionality testing - every time you
+write the playback data to the 'pcmtest' PCM device and close it, the driver checks the
+buffer of each channel for containing the looped pattern (which is specified in the
+fill_pattern debugfs file). If the playback buffer content represents the looped pattern,
+'pc_test' debugfs entry is set to '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/pcmtest/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
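The pattern-based capture fill and the playback check described in the pcmtest documentation above can be modelled in userspace as follows. This is a sketch under assumptions: the sk_* helpers are hypothetical, the pattern always restarts at offset 0 of the buffer, and the driver's actual bookkeeping may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill buf with the pattern repeated from offset 0, as with fill_mode = 1. */
static void sk_fill_pattern(char *buf, size_t len,
			    const char *pat, size_t pat_len)
{
	assert(pat_len > 0);
	for (size_t i = 0; i < len; i++)
		buf[i] = pat[i % pat_len];
}

/* Return 1 if buf consists of the pattern looped from offset 0, else 0. */
static int sk_is_looped_pattern(const char *buf, size_t len,
				const char *pat, size_t pat_len)
{
	assert(pat_len > 0);
	for (size_t i = 0; i < len; i++)
		if (buf[i] != pat[i % pat_len])
			return 0;
	return 1;
}
```

A selftest in this style would fill a buffer, play it back, and then read the debugfs result entry instead of calling the checker directly.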
iommufd gives userspace the capability to manipulate the iommu subsystem,
e.g. DMA map/unmap. In the near future, it will support iommu nested
translation. Different platform vendors have different implementations of
nested translation. So before setting up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using the guest I/O page table as the stage-1 translation table. This
requires the guest I/O page table to be compatible with the hardware IOMMU.
This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate a hwpt for a given device, as in the nested translation support [1].
This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to the user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to the user, so that the user can decode it. This
series only adds the framework for iommu hw info reporting; the complete
reporting path needs vendor-specific definitions and driver support. The
full picture is available in [1] as well.
base-commit: 35db4f4dac813ffaa987cf633694107fabf3aff5
[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
Change log:
v3:
- Add r-b from Baolu
- Rename IOMMU_HW_INFO_TYPE_DEFAULT to be IOMMU_HW_INFO_TYPE_NONE to
better suit what it means
- Let IOMMU_DEVICE_GET_HW_INFO succeed even if the underlying iommu driver
does not have driver-specific data to report, per the remark below.
https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/
v2: https://lore.kernel.org/linux-iommu/20230309075358.571567-1-yi.l.liu@intel.…
- Drop patch 05 of v1 as it is already covered by other series
- Rename the capability info to be iommu hardware info
v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.co…
Regards,
Yi Liu
Lu Baolu (1):
iommu: Add new iommu op to get iommu hardware information
Nicolin Chen (1):
iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
Yi Liu (2):
iommu: Move dev_iommu_ops() to private header
iommufd: Add IOMMU_DEVICE_GET_HW_INFO
drivers/iommu/iommu-priv.h | 11 +++
drivers/iommu/iommu.c | 2 +
drivers/iommu/iommufd/device.c | 73 +++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 1 +
drivers/iommu/iommufd/iommufd_test.h | 9 +++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 16 ++++
include/linux/iommu.h | 27 ++++---
include/uapi/linux/iommufd.h | 44 +++++++++++
tools/testing/selftests/iommu/iommufd.c | 17 ++++-
tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
11 files changed, 217 insertions(+), 12 deletions(-)
--
2.34.1
As suggested by Willy it is possible to detect the availability of
stackprotector via preprocessor defines.
Make use of that to simplify the code and interface of nolibc.
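As a rough sketch of the detection idea (not the nolibc implementation itself): GCC and clang predefine macros such as `__SSP__`, `__SSP_STRONG__`, `__SSP_ALL__` and `__SSP_EXPLICIT__` when one of the `-fstack-protector*` options is active, so code can test for them at preprocessing time. The names `DEMO_HAVE_STACKPROTECTOR` and `demo_stackprotector_state()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/*
 * One of these macros is predefined by the compiler when a
 * -fstack-protector* option is in effect for this translation unit.
 */
#if defined(__SSP__) || defined(__SSP_STRONG__) || \
    defined(__SSP_ALL__) || defined(__SSP_EXPLICIT__)
#define DEMO_HAVE_STACKPROTECTOR 1
#else
#define DEMO_HAVE_STACKPROTECTOR 0
#endif

/* Hypothetical helper: report what the preprocessor detected. */
const char *demo_stackprotector_state(void)
{
	const char *state = DEMO_HAVE_STACKPROTECTOR ? "enabled" : "disabled";

	assert(state != 0);
	return state;
}
```

Building the same file with and without `-fstack-protector-strong` should flip the reported state, which is what lets a library drop a separate "stack protector enabled" build flag.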
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (7):
tools/nolibc: fix typo pint -> point
tools/nolibc: x86_64: disable stack protector for _start
tools/nolibc: ensure stack protector guard is never zero
tools/nolibc: add test for __stack_chk_guard initialization
tools/nolibc: reformat list of headers to be installed
tools/nolibc: add autodetection for stackprotector support
tools/nolibc: simplify stackprotector compiler flags
tools/include/nolibc/Makefile | 19 +++++++++++++++++--
tools/include/nolibc/arch-aarch64.h | 6 +++---
tools/include/nolibc/arch-arm.h | 6 +++---
tools/include/nolibc/arch-i386.h | 6 +++---
tools/include/nolibc/arch-loongarch.h | 6 +++---
tools/include/nolibc/arch-mips.h | 6 +++---
tools/include/nolibc/arch-riscv.h | 6 +++---
tools/include/nolibc/arch-x86_64.h | 8 ++++----
tools/include/nolibc/arch.h | 2 +-
tools/include/nolibc/compiler.h | 15 +++++++++++++++
tools/include/nolibc/stackprotector.h | 15 ++++++---------
tools/testing/selftests/nolibc/Makefile | 13 ++-----------
tools/testing/selftests/nolibc/nolibc-test.c | 10 +++++++++-
13 files changed, 72 insertions(+), 46 deletions(-)
---
base-commit: 606343b7478c319cb30291a39ecbceddb42229d6
change-id: 20230521-nolibc-automatic-stack-protector-b4f7fab9e625
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi,
On AlmaLinux 8.7, make kselftest-all fails at memfd/memfd_test.c:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd'
gcc -D_FILE_OFFSET_BITS=64 -isystem /home/marvin/linux/kernel/linux_torvalds/usr/include memfd_test.c common.c -o
/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd/memfd_test
memfd_test.c: In function ‘test_seal_future_write’:
memfd_test.c:916:27: error: ‘F_SEAL_FUTURE_WRITE’ undeclared (first use in this function); did you mean ‘F_SEAL_WRITE’?
mfd_assert_add_seals(fd, F_SEAL_FUTURE_WRITE);
^~~~~~~~~~~~~~~~~~~
F_SEAL_WRITE
memfd_test.c:916:27: note: each undeclared identifier is reported only once for each function it appears in
memfd_test.c: In function ‘test_exec_seal’:
memfd_test.c:36:7: error: ‘F_SEAL_FUTURE_WRITE’ undeclared (first use in this function); did you mean ‘F_SEAL_WRITE’?
F_SEAL_FUTURE_WRITE | \
^~~~~~~~~~~~~~~~~~~
memfd_test.c:1058:27: note: in expansion of macro ‘F_WX_SEALS’
mfd_assert_has_seals(fd, F_WX_SEALS);
^~~~~~~~~~
make[2]: *** [../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd/memfd_test] Error 1
make[2]: Leaving directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd'
Apparently, the include file include/uapi/linux/fcntl.h defines this
F_SEAL_FUTURE_WRITE as 0x0010:
include/uapi/linux/fcntl.h:45:#define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
This patch fixed the issue:
---
tools/testing/selftests/memfd/memfd_test.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index dba0e8ba002f..868f17c02e32 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -28,7 +28,13 @@
#define MFD_DEF_SIZE 8192
#define STACK_SIZE 65536
-#define F_SEAL_EXEC 0x0020
+#ifndef F_SEAL_FUTURE_WRITE
+#define F_SEAL_FUTURE_WRITE 0x0010
+#endif
+
+#ifndef F_SEAL_EXEC
+#define F_SEAL_EXEC 0x0020
+#endif
#define F_WX_SEALS (F_SEAL_SHRINK | \
F_SEAL_GROW | \
Hope this helps.
Best regards,
Mirsad
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
Changes since RFC v1:
* add two kselftests (patch 10-11)
* set virtual MSRs also on APs [Pawan]
* enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
some bits of IA32_SPEC_CTRL (patch 4)
* other misc cleanup and cosmetic changes
RFC v1: https://lore.kernel.org/lkml/20221210160046.2608762-1-chen.zhang@intel.com/
This series introduces "virtualize IA32_SPEC_CTRL" support. Below is an
introduction to this new feature and its use cases.
### Virtualize IA32_SPEC_CTRL
"Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
pass-thru'd to a guest.
### Use cases of "virtualize IA32_SPEC_CTRL" [2]
Software mitigations like Retpoline and the software BHB-clearing sequence depend
on the CPU microarchitecture, and a guest cannot know the underlying
microarchitecture exactly. When a guest is migrated between processors of
different microarchitectures, software mitigations which work perfectly on the
previous microarchitecture may not be effective on the new one. To fix the
problem, some hardware mitigations should be used in conjunction with software
mitigations. Using virtual IA32_SPEC_CTRL, the VMM can enforce hardware
mitigations transparently to guests and prevent those hardware mitigations from
being unintentionally disabled when a guest changes the IA32_SPEC_CTRL MSR.
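To make the "locked bits" idea concrete, here is a deliberately simplified model in C. It is an assumption-laden sketch, not the architectural definition: it assumes the VMM supplies a mask of locked bits and that a guest write leaves those bits at their current value while all other bits follow the guest. `DEMO_HW_MITIGATION_BIT` and `demo_guest_write()` are hypothetical names, and the bit position is purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit standing in for a hardware mitigation control. */
#define DEMO_HW_MITIGATION_BIT (UINT64_C(1) << 10)

/*
 * Simplified model of a guest write to a pass-through MSR whose
 * `locked_mask` bits are controlled by the VMM: locked bits keep the
 * current value, every other bit takes the value the guest wrote.
 */
uint64_t demo_guest_write(uint64_t current, uint64_t guest_val,
			  uint64_t locked_mask)
{
	uint64_t result = (current & locked_mask) | (guest_val & ~locked_mask);

	/* Locked bits must be unchanged by construction. */
	assert((result & locked_mask) == (current & locked_mask));
	return result;
}
```

In this model a guest that writes 0 to the register cannot clear a mitigation bit the VMM has locked, yet no write needs to be intercepted.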
### Intention of this series
This series adds to KVM the capability of enforcing hardware mitigations for
guests transparently and efficiently (i.e., without intercepting IA32_SPEC_CTRL
MSR accesses). The capability can be used to solve the VM migration issue in
a pool consisting of processors of different microarchitectures.
Specifically, below are two target scenarios of this series:
Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, the VMM can
set RRSBA_DIS_S on parts that enumerate RRSBA. Note that the VM is presented
with a microarchitecture that doesn't enumerate RRSBA.
Scenario 2: If a VM uses the software BHB-clearing sequence on transitions into
CPL0 to mitigate BHI, the VMM can use "virtualize IA32_SPEC_CTRL" to set
BHI_DIS_S on new parts which don't enumerate BHI_NO.
Intel defines some virtual MSRs [2] for guests to report in-use software
mitigations. This allows guests to opt in to the VMM deploying hardware
mitigations for them if the guests are either running on, or later migrated to,
a system on which the in-use software mitigations are not effective. The virtual
MSRs interface is also added in this series.
### Organization of this series
1. Patch 1-3 Advertise RRSBA_CTRL and BHI_CTRL to guest
2. Patch 4 Add "virtualize IA32_SPEC_CTRL" support
3. Patch 5-9 Allow guests to report in-use software mitigations to KVM so
that KVM can enable hardware mitigations for guests.
4. Patch 10-11 Add kselftest for virtual MSRs and IA32_SPEC_CTRL
[1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
[2]: https://www.intel.com/content/www/us/en/developer/articles/technical/softwa…
Chao Gao (3):
KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
KVM: selftests: Add tests for IA32_SPEC_CTRL MSR
Pawan Gupta (1):
x86/bugs: Use Virtual MSRs to request hardware mitigations
Zhang Chen (7):
x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
KVM: x86: Advertise BHI_CTRL support
KVM: VMX: Add IA32_SPEC_CTRL virtualization support
KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
KVM: VMX: Advertise MITIGATION_CTRL support
KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
arch/x86/include/asm/msr-index.h | 33 +++-
arch/x86/include/asm/vmx.h | 5 +
arch/x86/include/asm/vmxfeatures.h | 2 +
arch/x86/kernel/cpu/bugs.c | 25 +++
arch/x86/kvm/cpuid.c | 22 ++-
arch/x86/kvm/reverse_cpuid.h | 8 +
arch/x86/kvm/svm/svm.c | 3 +
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 13 ++
arch/x86/kvm/vmx/vmcs.h | 2 +
arch/x86/kvm/vmx/vmx.c | 112 ++++++++++-
arch/x86/kvm/vmx/vmx.h | 43 ++++-
arch/x86/kvm/x86.c | 19 +-
tools/arch/x86/include/asm/msr-index.h | 37 +++-
tools/testing/selftests/kvm/Makefile | 2 +
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
.../kvm/x86_64/virtual_mitigation_msr_test.c | 175 +++++++++++++++++
18 files changed, 676 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46
--
2.40.0
syscall() is used by "normal" libcs to allow users to directly call
syscalls.
By having the same syntax inside nolibc users can more easily write code
that works with different libcs.
The macro logic is adapted from systemtap's STAP_PROBEV() macro that is
released in the public domain / CC0.
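The argument-counting trick behind such a macro can be sketched in isolation. The block below is a minimal illustration, not the nolibc code: `demo_call0()` to `demo_call2()` are hypothetical stand-ins for the per-arity `my_syscallN()` helpers, and they only add their arguments so the dispatch is easy to observe.

```c
#include <assert.h>

/* Hypothetical stand-ins for per-arity syscall helpers. */
long demo_call0(long nr) { return nr; }
long demo_call1(long nr, long a) { return nr + a; }
long demo_call2(long nr, long a, long b) { return nr + a + b; }

/*
 * Count the arguments that follow the first one (the "number").
 * Each extra argument shifts the padding list right by one slot, so
 * the value that lands in position N is the count. The trailing 0
 * keeps the variadic tail non-empty.
 */
#define DEMO_NARG(...) DEMO_NARG_PICK(__VA_ARGS__, 2, 1, 0, 0)
#define DEMO_NARG_PICK(_0, _1, _2, N, ...) N

/* Two levels so that N is expanded to a digit before ## pastes it. */
#define DEMO_PASTE(N, ...) demo_call##N(__VA_ARGS__)
#define DEMO_DISPATCH(N, ...) DEMO_PASTE(N, __VA_ARGS__)
#define demo_syscall(...) DEMO_DISPATCH(DEMO_NARG(__VA_ARGS__), __VA_ARGS__)

/* Small self-check: three arguments must select demo_call2(). */
void demo_selfcheck(void)
{
	assert(demo_syscall(7, 1, 2) == 10);
}
```

The extra `DEMO_DISPATCH` level matters: pasting directly in `demo_syscall` would glue the unexpanded text `DEMO_NARG` onto the function name instead of the computed digit.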
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
tools/include/nolibc/unistd.h | 15 +++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 2 ++
2 files changed, 17 insertions(+)
diff --git a/tools/include/nolibc/unistd.h b/tools/include/nolibc/unistd.h
index ac7d53d986cd..6773e83c16a0 100644
--- a/tools/include/nolibc/unistd.h
+++ b/tools/include/nolibc/unistd.h
@@ -56,6 +56,21 @@ int tcsetpgrp(int fd, pid_t pid)
return ioctl(fd, TIOCSPGRP, &pid);
}
+#define _syscall(N, ...) \
+({ \
+ int _ret = my_syscall##N(__VA_ARGS__); \
+ if (_ret < 0) { \
+ SET_ERRNO(-_ret); \
+ _ret = -1; \
+ } \
+ _ret; \
+})
+
+#define _sycall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
+#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
+#define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
+#define syscall(...) _syscall_n(_sycall_narg(__VA_ARGS__), ##__VA_ARGS__)
+
/* make sure to include all global symbols */
#include "nolibc.h"
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index f042a6436b6b..54bf91847af3 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -588,6 +588,8 @@ int run_syscall(int min, int max)
CASE_TEST(waitpid_child); EXPECT_SYSER(1, waitpid(getpid(), &tmp, WNOHANG), -1, ECHILD); break;
CASE_TEST(write_badf); EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
CASE_TEST(write_zero); EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
+ CASE_TEST(syscall_noargs); EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
+ CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
case __LINE__:
return ret; /* must be last */
/* note: do not set any defaults so as to permit holes above */
---
base-commit: 063dcc53b416ae1e89f767330feab3d0842943ed
change-id: 20230517-nolibc-syscall-bd13da6468c6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hello,
Here is v2 of the mremap start address optimization / fix for exec warning.
v1->v2:
1. Trigger the optimization for mremaps smaller than a PMD. I tested by tracing
that it works correctly.
2. Fix issue with bogus return value found by Linus if we broke out of the
above loop for the first PMD itself.
Description of patches:
These patches optimize the start addresses in move_page_tables() and test the
changes. They address a warning [1] that occurs due to a downward, overlapping
move on a mutually-aligned offset within a PMD during exec. By initiating the
copy process at the PMD level when such alignment is present, we can prevent
this warning and speed up the copying process at the same time. Linus Torvalds
suggested this idea.
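A small userspace sketch of what "mutually aligned within a PMD" means may help. It assumes a 2 MiB PMD size (which varies by architecture and configuration), and `demo_mutually_aligned()` and `demo_pmd_round_down()` are hypothetical helpers for illustration, not functions from mm/mremap.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD, as on x86-64 with 4 KiB pages. */
#define DEMO_PMD_SIZE ((uintptr_t)1 << 21)
#define DEMO_PMD_MASK (~(DEMO_PMD_SIZE - 1))

/* True when both addresses sit at the same offset inside their PMD. */
bool demo_mutually_aligned(uintptr_t old_addr, uintptr_t new_addr)
{
	return (old_addr & ~DEMO_PMD_MASK) == (new_addr & ~DEMO_PMD_MASK);
}

/* Round an address down to the start of its PMD. */
uintptr_t demo_pmd_round_down(uintptr_t addr)
{
	uintptr_t base = addr & DEMO_PMD_MASK;

	assert(base <= addr);
	return base;
}
```

When the source and destination share the same in-PMD offset, rounding both down moves them by the same amount, so a copy could in principle start at the PMD boundary for both ranges.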
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (4):
mm/mremap: Optimize the start addresses in move_page_tables()
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
mm/mremap.c | 56 +++++++++++++++++++
tools/testing/selftests/mm/mremap_test.c | 69 +++++++++++++++++++++---
2 files changed, 119 insertions(+), 6 deletions(-)
--
2.40.1.698.g37aff9b760-goog
> From: Tian, Kevin
> Sent: Friday, May 19, 2023 4:42 PM
> > +struct iommu_hw_info {
> > + __u32 size;
> > + __u32 flags;
> > + __u32 dev_id;
> > + __u32 data_len;
> > + __aligned_u64 data_ptr;
> > + __u32 out_data_type;
> > + __u32 __reserved;
>
> it's unusual to have reserved field in the end. It makes more sense
> to move data_ptr to the end to make it meaningful.
>
Please ignore this comment. typed too fast...
At the end of the test, there will be an error message caused by the
`ip netns del ns1` command in cleanup():
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
This can even be reproduced with just `./fib_tests.sh -h` as we're
calling cleanup() on exit.
Redirect the error message to /dev/null to mute it.
V2: Update commit message and fixes tag.
V3: resubmit due to missing netdev ML in V2
Fixes: b60417a9f2b8 ("selftest: fib_tests: Always cleanup before exit")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
Hello,
I am posting this as an RFC for any feedback. I have tested them suitably and I
am continuing to test them.
These patches optimize the start addresses in move_page_tables(). They address a
warning [1] that occurs due to a downward, overlapping move on a mutually-aligned
offset within a PMD during exec. By initiating the copy process at the PMD
level when such alignment is present, we can prevent this warning and speed up
the copying process at the same time. Linus Torvalds suggested this idea.
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (4):
mm/mremap: Optimize the start addresses in move_page_tables()
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
mm/mremap.c | 49 +++++++++++++++++
tools/testing/selftests/mm/mremap_test.c | 69 +++++++++++++++++++++---
2 files changed, 112 insertions(+), 6 deletions(-)
--
2.40.1.606.ga4b1b128d6-goog
KVM_GET_REG_LIST dumps all register IDs that are available to
KVM_GET/SET_ONE_REG, which is very useful for identifying platform
regression issues during VM migration.
Patch 1 enabled the KVM_GET_REG_LIST API in riscv and patch 2 added
the corresponding kselftest for checking possible register regressions.
Both patches were ported from arm64 and tested with Linux 6.4-rc1 on a
Qemu riscv virt machine.
Haibo Xu (2):
riscv: kvm: Add KVM_GET_REG_LIST API support
KVM: selftests: Add riscv get-reg-list test
Documentation/virt/kvm/api.rst | 2 +-
arch/riscv/kvm/vcpu.c | 346 +++++++
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/include/riscv/processor.h | 3 +
.../selftests/kvm/riscv/get-reg-list.c | 869 ++++++++++++++++++
5 files changed, 1222 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/riscv/get-reg-list.c
--
2.34.1
At the end of the test, there will be an error message caused by the
`ip netns del ns1` command in cleanup():
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
This can even be reproduced with just `./fib_tests.sh -h` as we're
calling cleanup() on exit.
Redirect the error message to /dev/null to mute it.
V2: Update commit message and fixes tag.
Fixes: b60417a9f2b8 ("selftest: fib_tests: Always cleanup before exit")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
This patchset adds a stress test for kprobes and a test for checking
optimized probes.
The two tests are being added based on the below discussion:
https://lore.kernel.org/all/20230128101622.ce6f8e64d929e29d36b08b73@kernel.…
kprobe_opt_types.tc is modified as per the below review comments:
https://lore.kernel.org/all/1682506809.uus6y0ir3i.naveen@linux.ibm.com/#t
Changelog:
v3:
* Add Acked-by for kprobe_insn_boundary.tc
* Simplify test for optimized probe, as suggested by Masami
* Add exit_unresolved to exit as unresolved in case no probe was optimized
v2:
* Add an explicit fork after enabling the events ( echo "forked" )
* Remove the extended test from multiple_kprobe_types.tc which adds
multiple consecutive probes in a function and add it as a
separate test case.
* Add new test case which checks for optimized probes.
Akanksha J N (2):
selftests/ftrace: Add new test case which adds multiple consecutive
probes in a function
selftests/ftrace: Add new test case which checks for optimized probes
.../test.d/kprobe/kprobe_insn_boundary.tc | 19 +++++++++++
.../ftrace/test.d/kprobe/kprobe_opt_types.tc | 34 +++++++++++++++++++
2 files changed, 53 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_insn_boundary.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_opt_types.tc
--
2.31.1
At the end of the test, there will be an error message caused by the
`ip netns del ns1` command in cleanup():
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
Redirect the error message to /dev/null to mute it.
Fixes: a0e11da78f48 ("fib_tests: Add tests for metrics on routes")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
Hi Linus,
Please pull the following Kselftest fixes update for Linux 6.4-rc3.
This Kselftest fixes update for Linux 6.4-rc3 consists of:
- sgx test fix for false negatives.
- ftrace output is hard to parse and it masks inappropriate skips etc.
This fix addresses the problems by integrating with kselftest runner.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit ac9a78681b921877518763ba0e89202254349d1b:
Linux 6.4-rc1 (2023-05-07 13:34:35 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.4-rc3
for you to fetch changes up to dbcf76390eb9a65d5d0c37b0cd57335218564e37:
selftests/ftrace: Improve integration with kselftest runner (2023-05-08 11:10:13 -0600)
----------------------------------------------------------------
linux-kselftest-fixes-6.4-rc3
This Kselftest fixes update for Linux 6.4-rc3 consists of:
- sgx test fix for false negatives.
- ftrace output is hard to parse and it masks inappropriate skips etc.
This fix addresses the problems by integrating with kselftest runner.
----------------------------------------------------------------
Mark Brown (1):
selftests/ftrace: Improve integration with kselftest runner
Yi Lai (1):
selftests/sgx: Add "test_encl.elf" to TEST_FILES
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
tools/testing/selftests/sgx/Makefile | 1 +
4 files changed, 71 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
----------------------------------------------------------------
v2:
---
* swap order of patches (thanks Claudio)
* add r-b
* add comment why memslots are zeroed
Add a new selftest for CMMA migration. Also fix a small issue found during
development of the test.
Nico Boehr (2):
KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes
KVM: s390: selftests: add selftest for CMMA migration
arch/s390/kvm/kvm-s390.c | 4 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/s390x/cmma_test.c | 680 ++++++++++++++++++
3 files changed, 685 insertions(+)
create mode 100644 tools/testing/selftests/kvm/s390x/cmma_test.c
--
2.39.1
The cited commit added a stray colon to the 'v' option. That makes the
option work incorrectly.
ex:
tools/testing/selftests/net# ./fib_nexthops.sh -v
(should enable verbose mode, instead it shows help text due to missing arg)
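The effect of the stray colon can be reproduced with the C library's getopt(3), used here as an analogue of the shell's `getopts` builtin (the two are closely related but not identical). `demo_first_opt()` is a hypothetical helper; the option strings are the ones from the diff below.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/*
 * Return what getopt() reports for the first option in argv.
 * With a leading ':' in optstring, a missing option argument is
 * reported as ':' instead of '?'.
 */
int demo_first_opt(const char *optstring, int argc, char *const argv[])
{
	assert(argc >= 1);
	optind = 1;	/* restart scanning from the first argument */
	opterr = 0;	/* keep getopt() from printing diagnostics */
	return getopt(argc, argv, optstring);
}
```

With `":t:pP46hv:w:"` a lone `-v` is reported as `':'` (missing argument), which a script's catch-all case would treat as a usage error; with `":t:pP46hvw:"` the same input is reported as `'v'`.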
Fixes: 5feba4727395 ("selftests: fib_nexthops: Make ping timeout configurable")
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier(a)nvidia.com>
---
tools/testing/selftests/net/fib_nexthops.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index a47b26ab48f2..0f5e88c8f4ff 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -2283,7 +2283,7 @@ EOF
################################################################################
# main
-while getopts :t:pP46hv:w: o
+while getopts :t:pP46hvw: o
do
case $o in
t) TESTS=$OPTARG;;
--
2.40.1
While KUnit tests that cannot be built as a loadable module must depend
on "KUNIT=y", this is not true for modular tests, where it adds an
unnecessary limitation.
Fix this by relaxing the dependency to "KUNIT".
Fixes: 08809e482a1c44d9 ("HID: uclogic: KUnit best practices and naming conventions")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
drivers/hid/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
index 4ce012f83253ec9f..b977450cac75265d 100644
--- a/drivers/hid/Kconfig
+++ b/drivers/hid/Kconfig
@@ -1285,7 +1285,7 @@ config HID_MCP2221
config HID_KUNIT_TEST
tristate "KUnit tests for HID" if !KUNIT_ALL_TESTS
- depends on KUNIT=y
+ depends on KUNIT
depends on HID_BATTERY_STRENGTH
depends on HID_UCLOGIC
default KUNIT_ALL_TESTS
--
2.34.1
Add documentation for the new Virtual ALSA driver. It covers all possible
use cases: error and delay injection, random and pattern-based data
generation, and testing of the playback and ioctl redefinition functionality.
We have a lot of different virtual media drivers, which can be used for
testing userspace applications and the media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating video devices. For audio devices we have only the snd-dummy
module, which is good at simulating the correct behavior of an ALSA device.
I decided to write a tool that would help to test userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual ALSA
Driver.
This new Virtual ALSA Driver has several features which can be useful
when testing/fuzzing userspace ALSA applications or the PCM middle
layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what this
driver can do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
Documentation/admin-guide/index.rst | 1 +
Documentation/admin-guide/valsa.rst | 114 ++++++++++++++++++++++++++++
2 files changed, 115 insertions(+)
create mode 100644 Documentation/admin-guide/valsa.rst
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 43ea35613dfc..328cc59275a1 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -131,6 +131,7 @@ configure specific aspects of kernel behavior to your liking.
thunderbolt
ufs
unicode
+ valsa
vga-softcursor
video-output
xfs
diff --git a/Documentation/admin-guide/valsa.rst b/Documentation/admin-guide/valsa.rst
new file mode 100644
index 000000000000..64ffc130fb4c
--- /dev/null
+++ b/Documentation/admin-guide/valsa.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual ALSA Driver
+=======================
+
+The Virtual ALSA Driver emulates a generic ALSA device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the ALSA middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+Also, this driver can check the playback stream for containing the
+predefined pattern, which is used in the corresponding selftest (alsa/valsa-test.sh)
+to check the PCM middle layer data transferring functionality. Additionally, this
+driver redefines the default RESET ioctl, and the selftest covers this PCM
+API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file (/sys/kernel/debug/valsa/fill_pattern).
+Like that:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/valsa/fill_pattern
+
+After this, every capture action performed on the 'valsa' device will return
+'mycoolpatternmycoolpatternmycoolpatternmy...' in the capturing buffer.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has an 'inject_delay' parameter, which has a self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has an integer type, and
+it specifies the delay added between the module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower; if it is
+negative, faster. You can try it yourself by starting a recording in any
+audio recording application (like Audacity) and selecting the 'valsa' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_valsa/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+
+Playback test
+-------------
+
+This driver can also be used for testing the playback functionality: every time you
+write the playback data to the 'valsa' PCM device and close it, the driver checks
+whether the buffer contains the looped pattern (which is specified in the fill_pattern
+debugfs file). If the playback buffer content matches the looped pattern, the 'pc_test'
+debugfs entry is set to '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/valsa/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
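The playback test described in the documentation above boils down to checking
that a buffer is one pattern repeated back to back. The following is only a
minimal userspace sketch of that check, not the driver's code: the helper name
is_looped_pattern() and its signature are made up for illustration, and the
sketch assumes the last repetition may be cut short by the end of the buffer.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the driver: returns true when buf
 * consists of `pattern` repeated back to back, where the final repetition
 * may be truncated by the end of the buffer.
 */
static bool is_looped_pattern(const unsigned char *buf, size_t buf_len,
			      const unsigned char *pattern, size_t pattern_len)
{
	size_t i;

	/* An empty pattern cannot be looped. */
	if (pattern_len == 0)
		return false;

	for (i = 0; i < buf_len; i++) {
		/* Byte i must equal the pattern byte at the wrapped offset. */
		if (buf[i] != pattern[i % pattern_len])
			return false;
	}
	return true;
}
```

With the pattern 'mycoolpattern', a buffer holding
'mycoolpatternmycoolpatternmy' passes the check, while a buffer with any byte
that breaks the repetition does not.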
This patchset aims to improve the selftests that check whether the SRv6
End.DT4 behavior works as expected under different system configurations, and
to make them more robust.
Some Linux distributions enable Duplicate Address Detection (DAD) and Reverse
Path Filtering mechanisms by default, which can interfere with SRv6 End.DT4
behavior and cause selftests to fail.
The following patches improve selftests for End.DT4 by taking these two
mechanisms into account. Specifically:
- patch 1/2: selftests: seg6: disable DAD on IPv6 router cfg for
srv6_end_dt4_l3vpn_test
- patch 2/2: selftets: seg6: disable rp_filter by default in
srv6_end_dt4_l3vpn_test
Thank you all,
Andrea
Andrea Mayer (2):
selftests: seg6: disable DAD on IPv6 router cfg for
srv6_end_dt4_l3vpn_test
selftets: seg6: disable rp_filter by default in
srv6_end_dt4_l3vpn_test
.../selftests/net/srv6_end_dt4_l3vpn_test.sh | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
--
2.20.1
Good morning,
under the new edition of the Czyste Powietrze (Clean Air) program for individual customers, you can receive up to 135 thousand PLN in support for the purchase of a heat pump.
Besides the higher subsidy, the program provides for, among other things, raised income thresholds and the option to submit another subsidy application for those who have already benefited from the Program.
As a company specializing in the supply, installation, and servicing of heat pumps, we will help you obtain the subsidy along with comprehensive delivery of the whole project.
Are you interested?
Regards,
Damian Hordych
KUnit tests run in a kthread, with the current->kunit_test pointer set
to the test's context. This allows the kunit_get_current_test() and
kunit_fail_current_test() macros to work. Normally, this pointer is
still valid during test shutdown (i.e., the suite->exit function, and
any resource cleanup). However, if the test has exited early (e.g., due
to a failed assertion), the cleanup is done in the parent KUnit thread,
which does not have an active context.
Instead, in the event the test terminates early, run the test exit and
cleanup from a new 'cleanup' kthread, which sets current->kunit_test,
and better isolates the rest of KUnit from issues which arise in test
cleanup.
If a test cleanup function itself aborts (e.g., due to an assertion
failing), there will be no further attempts to clean up: an error will
be logged and the test failed. For example:
# example_simple_test: test aborted during cleanup. continuing without cleaning up
This should also make it easier to get access to the KUnit context,
particularly from within resource cleanup functions, which may, for
example, need access to data in test->priv.
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
This is an updated version of / replacement for "kunit: Set the current
KUnit context when cleaning up", which instead creates a new kthread
for cleanup tasks if the original test kthread is aborted. This protects
us from failed assertions during cleanup, if the test exited early.
Changes since v3:
https://lore.kernel.org/all/20230421040218.2156548-1-davidgow@google.com/
- Get rid of an unused 'suite' variable (kernel test robot)
- Add Benjamin and Maxime's Reviewed-by tags.
Changes since v2:
https://lore.kernel.org/linux-kselftest/20230419085426.1671703-1-davidgow@g…
- Always run cleanup in its own kthread
- Therefore, never attempt to re-run it if it exits
- Thanks, Benjamin.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230415091401.681395-1-davidgow@go…
- Move cleanup execution to another kthread
- (Thanks, Benjamin, for pointing out the assertion issues)
---
lib/kunit/test.c | 56 +++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 48 insertions(+), 8 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index e2910b261112..f5e4ceffd282 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -419,15 +419,54 @@ static void kunit_try_run_case(void *data)
* thread will resume control and handle any necessary clean up.
*/
kunit_run_case_internal(test, suite, test_case);
- /* This line may never be reached. */
+}
+
+static void kunit_try_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ struct kunit_suite *suite = ctx->suite;
+
+ current->kunit_test = test;
+
kunit_run_case_cleanup(test, suite);
}
+static void kunit_catch_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ int try_exit_code = kunit_try_catch_get_result(&test->try_catch);
+
+ /* It is always a failure if cleanup aborts. */
+ kunit_set_failure(test);
+
+ if (try_exit_code) {
+ /*
+ * Test case could not finish, we have no idea what state it is
+ * in, so don't do clean up.
+ */
+ if (try_exit_code == -ETIMEDOUT) {
+ kunit_err(test, "test case cleanup timed out\n");
+ /*
+ * Unknown internal error occurred preventing test case from
+ * running, so there is nothing to clean up.
+ */
+ } else {
+ kunit_err(test, "internal error occurred during test case cleanup: %d\n",
+ try_exit_code);
+ }
+ return;
+ }
+
+ kunit_err(test, "test aborted during cleanup. continuing without cleaning up\n");
+}
+
+
static void kunit_catch_run_case(void *data)
{
struct kunit_try_catch_context *ctx = data;
struct kunit *test = ctx->test;
- struct kunit_suite *suite = ctx->suite;
int try_exit_code = kunit_try_catch_get_result(&test->try_catch);
if (try_exit_code) {
@@ -448,12 +487,6 @@ static void kunit_catch_run_case(void *data)
}
return;
}
-
- /*
- * Test case was run, but aborted. It is the test case's business as to
- * whether it failed or not, we just need to clean up.
- */
- kunit_run_case_cleanup(test, suite);
}
/*
@@ -478,6 +511,13 @@ static void kunit_run_case_catch_errors(struct kunit_suite *suite,
context.test_case = test_case;
kunit_try_catch_run(try_catch, &context);
+ /* Now run the cleanup */
+ kunit_try_catch_init(try_catch,
+ test,
+ kunit_try_run_case_cleanup,
+ kunit_catch_run_case_cleanup);
+ kunit_try_catch_run(try_catch, &context);
+
/* Propagate the parameter result to the test case. */
if (test->status == KUNIT_FAILURE)
test_case->status = KUNIT_FAILURE;
--
2.40.1.521.gf1e218fcd8-goog
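The idea in the patch above (run cleanup in its own thread and set the
per-thread test context there) can be illustrated outside the kernel. The
following is only a loose userspace analogy using pthreads, not KUnit code:
struct fake_test, current_test, run_test() and the example callbacks are all
hypothetical names, and an early return from the body stands in for a test
kthread exiting on a failed assertion.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kunit. */
struct fake_test {
	bool failed;
	bool cleaned_up;
	void (*body)(struct fake_test *test);
	void (*cleanup)(struct fake_test *test);
};

/* Plays the role of current->kunit_test: one pointer per thread. */
static _Thread_local struct fake_test *current_test;

static void *body_thread(void *arg)
{
	struct fake_test *test = arg;

	current_test = test;
	test->body(test);
	return NULL;
}

/* Cleanup always gets a fresh thread with the context set again. */
static void *cleanup_thread(void *arg)
{
	struct fake_test *test = arg;

	current_test = test;
	test->cleanup(test);
	return NULL;
}

static int run_in_thread(void *(*fn)(void *), struct fake_test *test)
{
	pthread_t thread;

	if (pthread_create(&thread, NULL, fn, test) != 0)
		return -1;
	return pthread_join(thread, NULL) == 0 ? 0 : -1;
}

/* Run the body, then the cleanup, each in its own thread. */
static int run_test(struct fake_test *test)
{
	if (run_in_thread(body_thread, test) != 0)
		return -1;
	return run_in_thread(cleanup_thread, test);
}

/* Example body: marks the test failed and ends early. */
static void failing_body(struct fake_test *test)
{
	test->failed = true;
}

/* Example cleanup: only succeeds if the per-thread context is valid. */
static void example_cleanup(struct fake_test *test)
{
	test->cleaned_up = (current_test == test);
}
```

In this sketch the thread calling run_test() never has current_test set, which
mirrors the parent KUnit thread lacking an active context; the cleanup
callback still sees a valid context because it runs in a thread of its own.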
Hi All,
In a TDX guest, the attestation process is used to verify the TDX guest's
trustworthiness to other entities before provisioning secrets to the
guest.
The TDX guest attestation process consists of two steps:
1. TDREPORT generation
2. Quote generation.
The first step (TDREPORT generation) involves getting the TDX guest
measurement data in the format of TDREPORT which is further used to
validate the authenticity of the TDX guest. The second step involves
sending the TDREPORT to a Quoting Enclave (QE) server to generate a
remotely verifiable Quote. TDREPORT by design can only be verified on
the local platform. To support remote verification of the TDREPORT,
TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
locally and convert it to a remotely verifiable Quote. Although
attestation software can use communication methods like TCP/IP or
vsock to send the TDREPORT to QE, not all platforms support these
communication models. So TDX GHCI specification [1] defines a method
for Quote generation via hypercalls. Please check the discussion from
Google [2] and Alibaba [3] which clarifies the need for hypercall based
Quote generation support. This patch set adds this support.
Support for TDREPORT generation already exists in the TDX guest driver.
This patchset extends the same driver to add the Quote generation
support.
Following are the details of the patch set:
Patch 1/3 -> Adds event notification IRQ support.
Patch 2/3 -> Adds Quote generation support.
Patch 3/3 -> Adds selftest support for Quote generation feature.
[1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
[2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_k…
[3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.ali…
Kuppuswamy Sathyanarayanan (3):
x86/tdx: Add TDX Guest event notify interrupt support
virt: tdx-guest: Add Quote generation support
selftests/tdx: Test GetQuote TDX attestation feature
Documentation/virt/coco/tdx-guest.rst | 11 ++
arch/x86/coco/tdx/tdx.c | 196 +++++++++++++++++++
arch/x86/include/asm/tdx.h | 8 +
drivers/virt/coco/tdx-guest/tdx-guest.c | 168 +++++++++++++++-
include/uapi/linux/tdx-guest.h | 43 ++++
tools/testing/selftests/tdx/tdx_guest_test.c | 68 ++++++-
6 files changed, 487 insertions(+), 7 deletions(-)
--
2.34.1
From: Ivan Orlov <ivan.orlov0322(a)gmail.com>
[ Upstream commit 735b0e0f2d001b7ed9486db84453fb860e764a4d ]
The 'malloc' call in the vcpu_save_state() function can fail. Add a
check for malloc failure to avoid a possible NULL dereference and to
give more information about why the test failed.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Link: https://lore.kernel.org/r/20230322144528.704077-1-ivan.orlov0322@gmail.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/x86_64/processor.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index acfa1d01e7df0..d9365a9d1c490 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -950,6 +950,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vcpu *vcpu)
vcpu_run_complete_io(vcpu);
state = malloc(sizeof(*state) + msr_list->nmsrs * sizeof(state->msrs.entries[0]));
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
vcpu_events_get(vcpu, &state->events);
vcpu_mp_state_get(vcpu, &state->mp_state);
--
2.39.2
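The fix above follows a common pattern: allocate a header plus a
variable-length tail in one malloc() call and stop immediately if the
allocation fails. The following is a minimal userspace sketch of that pattern,
not the KVM selftest code: struct saved_state, struct msr_entry and
alloc_saved_state() are hypothetical stand-ins, and abort() with a message
takes the place of TEST_ASSERT().

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the KVM selftest types; only the shape matters. */
struct msr_entry {
	uint32_t index;
	uint64_t data;
};

struct saved_state {
	uint32_t nmsrs;
	struct msr_entry entries[];	/* flexible array member */
};

/*
 * Allocate the header plus nmsrs trailing entries in one block, and fail
 * loudly on allocation failure instead of dereferencing NULL later.
 */
static struct saved_state *alloc_saved_state(uint32_t nmsrs)
{
	struct saved_state *state;

	state = malloc(sizeof(*state) + nmsrs * sizeof(state->entries[0]));
	if (!state) {
		fprintf(stderr, "-ENOMEM when allocating saved state\n");
		abort();
	}

	state->nmsrs = nmsrs;
	return state;
}
```

Checking right after the allocation keeps the failure report next to its
cause; without the check, the first write through the pointer would crash with
no hint that memory ran out.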
From: Ivan Orlov <ivan.orlov0322(a)gmail.com>
[ Upstream commit 735b0e0f2d001b7ed9486db84453fb860e764a4d ]
The 'malloc' call in the vcpu_save_state() function can fail. Add a
check for malloc failure to avoid a possible NULL dereference and to
give more information about why the test failed.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Link: https://lore.kernel.org/r/20230322144528.704077-1-ivan.orlov0322@gmail.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/x86_64/processor.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index c39a4353ba194..827647ff3d41b 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -954,6 +954,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vcpu *vcpu)
vcpu_run_complete_io(vcpu);
state = malloc(sizeof(*state) + msr_list->nmsrs * sizeof(state->msrs.entries[0]));
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
vcpu_events_get(vcpu, &state->events);
vcpu_mp_state_get(vcpu, &state->mp_state);
--
2.39.2
The generic fork() implementation in nolibc falls back to the clone()
syscall. On s390 the first two arguments to clone() are swapped compared
to other architectures, breaking the implementation in nolibc.
Add a custom implementation of fork() to s390 that works.
While at it also add a testcase for fork().
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (2):
tools/nolibc: s390: provide custom implementation for sys_fork
tools/nolibc: add testcase for fork()/waitpid()
tools/include/nolibc/arch-s390.h | 8 ++++++++
tools/include/nolibc/sys.h | 2 ++
tools/testing/selftests/nolibc/nolibc-test.c | 20 ++++++++++++++++++++
3 files changed, 30 insertions(+)
---
base-commit: c1c4f33b6be9b3412d9e0ba01b367f4ffe47c379
change-id: 20230415-nolibc-fork-b7087a345166
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
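The argument-order difference described in the cover letter can be sketched
with the libc syscall() wrapper instead of nolibc's own syscall macros. This is
an illustration under that assumption, not the code from the series:
fork_via_clone() and child_exit_status() are hypothetical names, and the
trailing zero arguments are simply unused clone() parameters.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * fork() emulated with the raw clone() syscall. On s390 the first two
 * arguments are (stack, flags); on most other architectures they are
 * (flags, stack). A zero stack pointer keeps the parent's stack layout,
 * and SIGCHLD is the signal the parent receives when the child exits.
 */
static pid_t fork_via_clone(void)
{
#if defined(__s390__) || defined(__s390x__)
	return (pid_t)syscall(SYS_clone, 0UL, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL);
#else
	return (pid_t)syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);
#endif
}

/*
 * Fork a child that exits with `code` and return the exit status the
 * parent observes through waitpid(), or -1 on any error.
 */
static int child_exit_status(int code)
{
	int status;
	pid_t pid = fork_via_clone();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: leave without running atexit handlers */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A fork()/waitpid() round trip like child_exit_status() is also the kind of
check a testcase can use: if the arguments are passed in the wrong order for
the architecture, the clone() call fails or the child never reports the
expected status.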