Add support for a new kind of kunit_suite registration macro called
kunit_test_init_suite(); this new registration macro allows the
registration of kunit_suites that reference functions marked __init and
data marked __initdata.
Signed-off-by: Brendan Higgins <brendanhiggins(a)google.com>
Tested-by: Martin Fernandez <martin.fernandez(a)eclypsium.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: David Gow <davidgow(a)google.com>
---
This is a follow-up to the RFC here[1].
This patch is in response to a KUnit user issue[2] in which the user was
attempting to test some init functions; although this is a functional
solution as long as KUnit tests only run during the init phase, we will
need to do more work if we ever allow tests to run after the init phase
is over; it is for this reason that this patch adds a new registration
macro rather than simply modifying the existing macros.
Changes since last version:
- I added more to the kunit_test_init_suites() kernel-doc comment
detailing "how" the modpost warnings are suppressed in addition to
the existing information regarding "why" it is OK for the modpost
warnings to be suppressed.
[1] https://lore.kernel.org/linux-kselftest/20220310210210.2124637-1-brendanhig…
[2] https://groups.google.com/g/kunit-dev/c/XDjieRHEneg/m/D0rFCwVABgAJ
---
include/kunit/test.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index b26400731c02..7f303a06bc97 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -379,6 +379,32 @@ static inline int kunit_run_all_tests(void)
#define kunit_test_suite(suite) kunit_test_suites(&suite)
+/**
+ * kunit_test_init_suites() - used to register one or more &struct kunit_suite
+ * containing init functions or init data.
+ *
+ * @__suites: a statically allocated list of &struct kunit_suite.
+ *
+ * This functions identically as &kunit_test_suites() except that it suppresses
+ * modpost warnings for referencing functions marked __init or data marked
+ * __initdata; this is OK because currently KUnit only runs tests upon boot
+ * during the init phase or upon loading a module during the init phase.
+ *
+ * NOTE TO KUNIT DEVS: If we ever allow KUnit tests to be run after boot, these
+ * tests must be excluded.
+ *
+ * The only thing this macro does that's different from kunit_test_suites is
+ * that it suffixes the array and suite declarations it makes with _probe;
+ * modpost suppresses warnings about referencing init data for symbols named in
+ * this manner.
+ */
+#define kunit_test_init_suites(__suites...) \
+ __kunit_test_suites(CONCATENATE(__UNIQUE_ID(array), _probe), \
+ CONCATENATE(__UNIQUE_ID(suites), _probe), \
+ ##__suites)
+
+#define kunit_test_init_suite(suite) kunit_test_init_suites(&suite)
+
#define kunit_suite_for_each_test_case(suite, test_case) \
for (test_case = suite->test_cases; test_case->run_case; test_case++)
base-commit: 330f4c53d3c2d8b11d86ec03a964b86dc81452f5
--
2.35.1.723.g4982287a31-goog
Context:
When using a non-UML arch, kunit.py will boot the test kernel with these
options by default:
> mem=1G console=tty kunit_shutdown=halt console=ttyS0 kunit_shutdown=reboot
For QEMU, we need to use 'reboot', and for UML we need to use 'halt'.
If you switch them, kunit.py will hang until the --timeout expires.
So the code currently unconditionally adds 'kunit_shutdown=halt' but
then appends 'reboot' when using QEMU (which overwrites it).
This patch:
Having these duplicate options is a bit noisy.
Switch so we only add 'halt' for UML.
I.e. we now get
UML: 'mem=1G console=tty console=ttyS0 kunit_shutdown=halt'
QEMU: 'mem=1G console=tty console=ttyS0 kunit_shutdown=reboot'
Side effect: you can't overwrite kunit_shutdown on UML w/ --kernel_arg.
But you already couldn't for QEMU, and why would you want to?
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
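As a rough illustration of the behaviour described above, the following
Python sketch builds the kernel command line for each backend. The function
name kernel_args and the 'um'/'qemu' strings are hypothetical and are not
part of kunit.py. console=ttyS0 is treated as a common argument only because
it appears in both example command lines above; where it is actually added
is not shown in this patch.

```python
from typing import List, Optional


def kernel_args(arch: str, extra: Optional[List[str]] = None) -> List[str]:
    """Return the kernel command line for a hypothetical 'um' or 'qemu' run."""
    # User-supplied arguments (e.g. from --kernel_arg) come first.
    args = list(extra or [])
    # Arguments common to every backend; kunit_shutdown is no longer here.
    args.extend(['mem=1G', 'console=tty', 'console=ttyS0'])
    # Exactly one kunit_shutdown option, chosen by backend and appended last.
    if arch == 'um':
        args.append('kunit_shutdown=halt')
    else:
        args.append('kunit_shutdown=reboot')
    return args


print('UML: ', ' '.join(kernel_args('um')))
print('QEMU:', ' '.join(kernel_args('qemu')))
```

Because the backend-specific option is appended after any user-supplied
arguments, a kunit_shutdown value passed by the user would come earlier on
the command line, which matches the side effect noted above.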
tools/testing/kunit/kunit_kernel.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 483f78e15ce9..9731ceb7ad92 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -158,7 +158,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
"""Runs the Linux UML binary. Must be named 'linux'."""
linux_bin = os.path.join(build_dir, 'linux')
- return subprocess.Popen([linux_bin] + params,
+ return subprocess.Popen([linux_bin] + params + ['kunit_shutdown=halt'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
@@ -332,7 +332,7 @@ class LinuxSourceTree(object):
def run_kernel(self, args=None, build_dir='', filter_glob='', timeout=None) -> Iterator[str]:
if not args:
args = []
- args.extend(['mem=1G', 'console=tty', 'kunit_shutdown=halt'])
+ args.extend(['mem=1G', 'console=tty'])
if filter_glob:
args.append('kunit.filter_glob='+filter_glob)
base-commit: b04d1a8dc7e7ff7ca91a20bef053bcc04265d83a
--
2.35.1.1178.g4f1659d476-goog
The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is later used in pthread_attr_setaffinity_np
and by pthread_create. However, the current code never frees the
allocated CPU set.
Fix this by calling CPU_FREE in the "shutdown" function, which is
called on most of the error/exit paths for cleanup. Also call
CPU_FREE on the error paths where shutdown is not called.
Fixes: 7820b0715b6f ("tools/selftests: add mq_perf_tests")
Signed-off-by: Athira Rajeev <atrajeev(a)linux.vnet.ibm.com>
---
Changelog:
From v1 -> v2:
Addressed review comment from Shuah Khan to add
CPU_FREE in other exit paths where it is needed
tools/testing/selftests/mqueue/mq_perf_tests.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index b019e0b8221c..182434c7898d 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -180,6 +180,9 @@ void shutdown(int exit_val, char *err_cause, int line_no)
if (in_shutdown++)
return;
+ /* Free the cpu_set allocated using CPU_ALLOC in main function */
+ CPU_FREE(cpu_set);
+
for (i = 0; i < num_cpus_to_pin; i++)
if (cpu_threads[i]) {
pthread_kill(cpu_threads[i], SIGUSR1);
@@ -589,6 +592,7 @@ int main(int argc, char *argv[])
cpu_set)) {
fprintf(stderr, "Any given CPU may "
"only be given once.\n");
+ CPU_FREE(cpu_set);
exit(1);
} else
CPU_SET_S(cpus_to_pin[cpu],
@@ -607,6 +611,7 @@ int main(int argc, char *argv[])
queue_path = malloc(strlen(option) + 2);
if (!queue_path) {
perror("malloc()");
+ CPU_FREE(cpu_set);
exit(1);
}
queue_path[0] = '/';
@@ -619,6 +624,7 @@ int main(int argc, char *argv[])
}
if (continuous_mode && num_cpus_to_pin == 0) {
+ CPU_FREE(cpu_set);
fprintf(stderr, "Must pass at least one CPU to continuous "
"mode.\n");
poptPrintUsage(popt_context, stderr, 0);
@@ -628,10 +634,12 @@ int main(int argc, char *argv[])
cpus_to_pin[0] = cpus_online - 1;
}
- if (getuid() != 0)
+ if (getuid() != 0) {
+ CPU_FREE(cpu_set);
ksft_exit_skip("Not running as root, but almost all tests "
"require root in order to modify\nsystem settings. "
"Exiting.\n");
+ }
max_msgs = fopen(MAX_MSGS, "r+");
max_msgsize = fopen(MAX_MSGSIZE, "r+");
--
2.35.1
This patch series adds a memory.reclaim proactive reclaim interface.
The rationale behind the interface and how it works are in the first
patch.
---
Changes in V2:
- Add the interface to root as well.
- Added a selftest.
- Documented the interface as a nested-keyed interface, which makes
adding optional arguments in the future easier (see doc updates in the
first patch).
- Modified the commit message to reflect changes and add a timeout
argument as a suggested possible extension
- Return -EAGAIN if the kernel fails to reclaim the full requested
amount.
---
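A minimal sketch of how user space might drive the interface, assuming it is
exposed as a writable memory.reclaim file in the cgroup directory and that
the amount to reclaim can be written as a plain byte count. The exact value
format is documented in the first patch, not in this cover letter. The helper
name request_reclaim is hypothetical.

```python
import errno
import os


def request_reclaim(cgroup_dir: str, nr_bytes: int) -> bool:
    """Ask the kernel to reclaim nr_bytes from the cgroup at cgroup_dir.

    Returns True if the write succeeds and False if it fails with EAGAIN,
    which this series uses to signal that the full requested amount could
    not be reclaimed. Any other error is re-raised.
    """
    path = os.path.join(cgroup_dir, 'memory.reclaim')
    try:
        # The write error, if any, surfaces when the file is flushed on
        # leaving the with block, which is still inside this try.
        with open(path, 'w') as f:
            f.write(str(nr_bytes))
    except OSError as e:
        if e.errno == errno.EAGAIN:
            return False
        raise
    return True
```

A caller would pass the path of a cgroup v2 directory, for example a
hypothetical /sys/fs/cgroup/mygroup, and could retry with a smaller amount
when the function returns False.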
Shakeel Butt (1):
memcg: introduce per-memcg reclaim interface
Yosry Ahmed (3):
selftests: cgroup: return the errno of write() in cg_write() on
failure
selftests: cgroup: fix alloc_anon_noexit() instantly freeing memory
selftests: cgroup: add a selftest for memory.reclaim
Documentation/admin-guide/cgroup-v2.rst | 21 +++++
mm/memcontrol.c | 37 ++++++++
tools/testing/selftests/cgroup/cgroup_util.c | 11 ++-
.../selftests/cgroup/test_memcontrol.c | 94 ++++++++++++++++++-
4 files changed, 156 insertions(+), 7 deletions(-)
--
2.35.1.1178.g4f1659d476-goog
The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is later used in pthread_attr_setaffinity_np
and by pthread_create. However, the current code never frees the
allocated CPU set. Fix this by calling CPU_FREE once the CPU set is
no longer needed.
Signed-off-by: Athira Rajeev <atrajeev(a)linux.vnet.ibm.com>
---
tools/testing/selftests/mqueue/mq_perf_tests.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index b019e0b8221c..17c41f216bef 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -732,6 +732,7 @@ int main(int argc, char *argv[])
pthread_attr_destroy(&thread_attr);
}
+ CPU_FREE(cpu_set);
if (!continuous_mode) {
pthread_join(cpu_threads[0], &retval);
shutdown((long)retval, "perf_test_thread()", __LINE__);
--
2.35.1
bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.
On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:
1. Packets are not validated properly, allowing a BPF program to trick
bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
socket.
2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
up receiving a SYNACK with the cookie, but the following ACK gets
dropped.
This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. The
IP version from the header is now taken into account and validated
against the address family of the socket.
Fixes: 399040847084 ("bpf: add helper to check for a valid SYN cookie")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)nvidia.com>
Reviewed-by: Tariq Toukan <tariqt(a)nvidia.com>
Acked-by: Arthur Fabre <afabre(a)cloudflare.com>
---
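The new validation order can be modelled outside the kernel. The Python
sketch below is only an illustration of the decision logic in the diff that
follows. The function name select_cookie_check and its string return values
are hypothetical, the TCP flag checks and tcp_synq_no_recent_overflow() are
left out, and the handling of other IP versions is an assumption because the
default branch is not part of the diff.

```python
import socket

IPV4_HDR_MIN = 20  # sizeof(struct iphdr)
IPV6_HDR_LEN = 40  # sizeof(struct ipv6hdr)


def select_cookie_check(ip_version: int, sk_family: int,
                        ipv6_only: bool, iph_len: int) -> str:
    """Return 'v4' or 'v6' for the cookie check to run, or an error name."""
    # The buffer must hold at least the shorter header before the
    # version field is read.
    if iph_len < IPV4_HDR_MIN:
        return 'EINVAL'
    if ip_version == 4:
        # IPv4 is fine on AF_INET and on dual-stack AF_INET6 sockets,
        # but not on IPv6-only sockets.
        if sk_family == socket.AF_INET6 and ipv6_only:
            return 'EINVAL'
        return 'v4'
    if ip_version == 6:
        if iph_len < IPV6_HDR_LEN:
            return 'EINVAL'
        # An IPv6 packet is only valid on an AF_INET6 socket.
        if sk_family != socket.AF_INET6:
            return 'EINVAL'
        return 'v6'
    # Assumption: any other version is rejected; not shown in the diff.
    return 'UNSUPPORTED'
```

With this ordering, an IPv4 packet on a dual-stack AF_INET6 socket selects
the IPv4 check, and an IPv6 packet on an AF_INET socket is rejected, which
are the two cases described in the commit message.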
net/core/filter.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index a7044e98765e..64470a727ef7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7016,24 +7016,33 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
if (!th->ack || th->rst || th->syn)
return -ENOENT;
+ if (unlikely(iph_len < sizeof(struct iphdr)))
+ return -EINVAL;
+
if (tcp_synq_no_recent_overflow(sk))
return -ENOENT;
cookie = ntohl(th->ack_seq) - 1;
- switch (sk->sk_family) {
- case AF_INET:
- if (unlikely(iph_len < sizeof(struct iphdr)))
+ /* Both struct iphdr and struct ipv6hdr have the version field at the
+ * same offset so we can cast to the shorter header (struct iphdr).
+ */
+ switch (((struct iphdr *)iph)->version) {
+ case 4:
+ if (sk->sk_family == AF_INET6 && ipv6_only_sock(sk))
return -EINVAL;
ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
break;
#if IS_BUILTIN(CONFIG_IPV6)
- case AF_INET6:
+ case 6:
if (unlikely(iph_len < sizeof(struct ipv6hdr)))
return -EINVAL;
+ if (sk->sk_family != AF_INET6)
+ return -EINVAL;
+
ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
break;
#endif /* CONFIG_IPV6 */
--
2.30.2
We have switched to memcg-based memory accounting, so the rlimit is
no longer needed. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
This patchset cleans up the usage of RLIMIT_MEMLOCK in tools/bpf/,
tools/testing/selftests/bpf and samples/bpf. The file
tools/testing/selftests/bpf/bpf_rlimit.h is removed. The sys/resource.h
include is removed from the many files that no longer need it.
- v3: Get rid of bpf_rlimit.h and fix some typos (Andrii)
- v2: Use libbpf_set_strict_mode instead. (Andrii)
- v1: https://lore.kernel.org/bpf/20220320060815.7716-2-laoar.shao@gmail.com/
Yafang Shao (27):
bpf: selftests: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdping
bpf: selftests: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdpxceiver
bpf: selftests: No need to include bpf_rlimit.h in test_tcpnotify_user
bpf: selftests: No need to include bpf_rlimit.h in flow_dissector_load
bpf: selftests: Set libbpf 1.0 API mode explicitly in
get_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_cgroup_storage
bpf: selftests: Set libbpf 1.0 API mode explicitly in
get_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_lpm_map
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_lru_map
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_skb_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sock_addr
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sock
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sockmap
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sysctl
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_tag
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_tcp_check_syncookie_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_verifier_log
bpf: samples: Set libbpf 1.0 API mode explicitly in hbm
bpf: selftests: Get rid of bpf_rlimit.h
bpf: selftests: No need to include sys/resource.h in some files
bpf: samples: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdpsock_user
bpf: samples: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xsk_fwd
bpf: samples: No need to include sys/resource.h in many files
bpf: bpftool: Remove useless return value of libbpf_set_strict_mode
bpf: bpftool: Set LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK for legacy libbpf
bpf: bpftool: remove RLIMIT_MEMLOCK
bpf: runqslower: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK
samples/bpf/cpustat_user.c | 1 -
samples/bpf/hbm.c | 5 ++--
samples/bpf/ibumad_user.c | 1 -
samples/bpf/map_perf_test_user.c | 1 -
samples/bpf/offwaketime_user.c | 1 -
samples/bpf/sockex2_user.c | 1 -
samples/bpf/sockex3_user.c | 1 -
samples/bpf/spintest_user.c | 1 -
samples/bpf/syscall_tp_user.c | 1 -
samples/bpf/task_fd_query_user.c | 1 -
samples/bpf/test_lru_dist.c | 1 -
samples/bpf/test_map_in_map_user.c | 1 -
samples/bpf/test_overhead_user.c | 1 -
samples/bpf/tracex2_user.c | 1 -
samples/bpf/tracex3_user.c | 1 -
samples/bpf/tracex4_user.c | 1 -
samples/bpf/tracex5_user.c | 1 -
samples/bpf/tracex6_user.c | 1 -
samples/bpf/xdp1_user.c | 1 -
samples/bpf/xdp_adjust_tail_user.c | 1 -
samples/bpf/xdp_monitor_user.c | 1 -
samples/bpf/xdp_redirect_cpu_user.c | 1 -
samples/bpf/xdp_redirect_map_multi_user.c | 1 -
samples/bpf/xdp_redirect_user.c | 1 -
samples/bpf/xdp_router_ipv4_user.c | 1 -
samples/bpf/xdp_rxq_info_user.c | 1 -
samples/bpf/xdp_sample_pkts_user.c | 1 -
samples/bpf/xdp_sample_user.c | 1 -
samples/bpf/xdp_tx_iptunnel_user.c | 1 -
samples/bpf/xdpsock_user.c | 9 ++----
samples/bpf/xsk_fwd.c | 7 ++---
tools/bpf/bpftool/common.c | 8 ------
tools/bpf/bpftool/feature.c | 2 --
tools/bpf/bpftool/main.c | 6 ++--
tools/bpf/bpftool/main.h | 2 --
tools/bpf/bpftool/map.c | 2 --
tools/bpf/bpftool/pids.c | 1 -
tools/bpf/bpftool/prog.c | 3 --
tools/bpf/bpftool/struct_ops.c | 2 --
tools/bpf/runqslower/runqslower.c | 18 ++----------
tools/testing/selftests/bpf/bench.c | 1 -
tools/testing/selftests/bpf/bpf_rlimit.h | 28 -------------------
.../selftests/bpf/flow_dissector_load.c | 6 ++--
.../selftests/bpf/get_cgroup_id_user.c | 4 ++-
tools/testing/selftests/bpf/prog_tests/btf.c | 1 -
.../selftests/bpf/test_cgroup_storage.c | 4 ++-
tools/testing/selftests/bpf/test_dev_cgroup.c | 4 ++-
tools/testing/selftests/bpf/test_lpm_map.c | 4 ++-
tools/testing/selftests/bpf/test_lru_map.c | 4 ++-
.../selftests/bpf/test_skb_cgroup_id_user.c | 4 ++-
tools/testing/selftests/bpf/test_sock.c | 4 ++-
tools/testing/selftests/bpf/test_sock_addr.c | 4 ++-
tools/testing/selftests/bpf/test_sockmap.c | 5 ++--
tools/testing/selftests/bpf/test_sysctl.c | 4 ++-
tools/testing/selftests/bpf/test_tag.c | 4 ++-
.../bpf/test_tcp_check_syncookie_user.c | 4 ++-
.../selftests/bpf/test_tcpnotify_user.c | 1 -
.../testing/selftests/bpf/test_verifier_log.c | 5 ++--
.../selftests/bpf/xdp_redirect_multi.c | 1 -
tools/testing/selftests/bpf/xdping.c | 8 ++----
tools/testing/selftests/bpf/xdpxceiver.c | 6 ++--
61 files changed, 57 insertions(+), 142 deletions(-)
delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h
--
2.17.1
eBPF already allows programs to be preloaded and kept running without
intervention from user space. There is a dedicated kernel module called
bpf_preload, which contains the light skeleton of the iterators_bpf eBPF
program. If this module is enabled in the kernel configuration, its loading
will be triggered when the bpf filesystem is mounted (unless the module is
built-in), and the links of iterators_bpf are pinned in that filesystem
(they will appear as the progs.debug and maps.debug files).
However, the current mechanism, if used to preload an LSM, would not offer
the same security guarantees as LSMs integrated in the security subsystem.
Also, it is not generic enough to be used for preloading arbitrary eBPF
programs, unless the bpf_preload code is heavily modified.
More specifically, the security problems are:
- any program can be pinned to the bpf filesystem without limitations
(unless a MAC mechanism enforces some restrictions);
- programs being executed can be terminated at any time by deleting the
pinned objects or unmounting the bpf filesystem.
The usability problems are:
- only a fixed number of links can be pinned;
- only links can be pinned, other object types are not supported;
- code to pin objects has to be written manually;
- preloading multiple eBPF programs is not practical, bpf_preload has to be
modified to include additional light skeletons.
Solve the security problems by mounting the bpf filesystem from the kernel,
by preloading authenticated kernel modules (e.g. with module.sig_enforce)
and by pinning objects to that filesystem. This particular filesystem
instance guarantees that desired eBPF programs run until the very end of
the kernel lifecycle, since even root cannot interfere with it.
Solve the usability problems by generalizing the pinning function, to
handle not only links but also maps and progs. Also increment the object
reference count and call the pinning function directly from the preload
method (currently in the bpf_preload kernel module) rather than from the
bpf filesystem code itself, so that a generic eBPF program can do those
operations depending on its objects (this also avoids the limitation of the
fixed-size array for storing the objects to pin).
Then, simplify the process of pinning objects defined by a generic eBPF
program by automatically generating the required methods in the light
skeleton. Also, generate a separate kernel module for each eBPF program to
preload, so that existing ones don't have to be modified. Finally, support
preloading multiple eBPF programs by allowing users to specify a list from
the kernel configuration, at build time, or with the new kernel option
bpf_preload_list=, at run-time.
To summarize, this patch set makes it possible to plug in out-of-tree LSMs
matching the security guarantees of their counterpart in the security
subsystem, without having to modify the kernel itself. The same benefits
are extended to other eBPF program types.
The one remaining problem is how to support auto-attaching eBPF programs
of the LSM type. It will be solved in a separate patch set.
Patches 1-2 export some definitions, to build out-of-tree kernel modules
with eBPF programs to preload. Patches 3-4 allow eBPF programs to pin
objects by themselves. Patches 5-10 automatically generate the methods for
preloading in the light skeleton. Patches 11-14 make it possible to preload
multiple eBPF programs. Patch 15 automatically generates the kernel module
for preloading an eBPF program, patch 16 does a kernel mount of the bpf
filesystem, and finally patches 17-18 test the functionality introduced.
Roberto Sassu (18):
bpf: Export bpf_link_inc()
bpf-preload: Move bpf_preload.h to include/linux
bpf-preload: Generalize object pinning from the kernel
bpf-preload: Export and call bpf_obj_do_pin_kernel()
bpf-preload: Generate static variables
bpf-preload: Generate free_objs_and_skel()
bpf-preload: Generate preload()
bpf-preload: Generate load_skel()
bpf-preload: Generate code to pin non-internal maps
bpf-preload: Generate bpf_preload_ops
bpf-preload: Store multiple bpf_preload_ops structures in a linked
list
bpf-preload: Implement new registration method for preloading eBPF
programs
bpf-preload: Move pinned links and maps to a dedicated directory in
bpffs
bpf-preload: Switch to new preload registration method
bpf-preload: Generate code of kernel module to preload
bpf-preload: Do kernel mount to ensure that pinned objects don't
disappear
bpf-preload/selftests: Add test for automatic generation of preload
methods
bpf-preload/selftests: Preload a test eBPF program and check pinned
objects
.../admin-guide/kernel-parameters.txt | 8 +
fs/namespace.c | 1 +
include/linux/bpf.h | 5 +
include/linux/bpf_preload.h | 37 ++
init/main.c | 2 +
kernel/bpf/inode.c | 295 +++++++++--
kernel/bpf/preload/Kconfig | 25 +-
kernel/bpf/preload/bpf_preload.h | 16 -
kernel/bpf/preload/bpf_preload_kern.c | 85 +---
kernel/bpf/preload/iterators/Makefile | 9 +-
.../bpf/preload/iterators/iterators.lskel.h | 466 +++++++++++-------
kernel/bpf/syscall.c | 1 +
.../bpf/bpftool/Documentation/bpftool-gen.rst | 13 +
tools/bpf/bpftool/bash-completion/bpftool | 6 +-
tools/bpf/bpftool/gen.c | 331 +++++++++++++
tools/bpf/bpftool/main.c | 7 +-
tools/bpf/bpftool/main.h | 1 +
tools/testing/selftests/bpf/Makefile | 32 +-
.../bpf/bpf_testmod_preload/.gitignore | 7 +
.../bpf/bpf_testmod_preload/Makefile | 20 +
.../gen_preload_methods.expected.diff | 97 ++++
.../bpf/prog_tests/test_gen_preload_methods.c | 27 +
.../bpf/prog_tests/test_preload_methods.c | 69 +++
.../selftests/bpf/progs/gen_preload_methods.c | 23 +
24 files changed, 1246 insertions(+), 337 deletions(-)
create mode 100644 include/linux/bpf_preload.h
delete mode 100644 kernel/bpf/preload/bpf_preload.h
create mode 100644 tools/testing/selftests/bpf/bpf_testmod_preload/.gitignore
create mode 100644 tools/testing/selftests/bpf/bpf_testmod_preload/Makefile
create mode 100644 tools/testing/selftests/bpf/prog_tests/gen_preload_methods.expected.diff
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_gen_preload_methods.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_preload_methods.c
create mode 100644 tools/testing/selftests/bpf/progs/gen_preload_methods.c
--
2.32.0
There are some issues in parse_num_list():
1. The end variable is assigned twice when parsing_end is true.
2. The function does not check that parsing_end is false once parsing has finished.
Clean up parse_num_list() and fix these issues.
Signed-off-by: Yuntao Wang <ytcoode(a)gmail.com>
---
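To make the intended behaviour concrete, here is a simplified Python model
of the cleaned-up parsing loop. It is a sketch, not a translation of the C
code: it accepts only unsigned decimal numbers, raises ValueError where the
C function returns an error code, and returns the boolean set directly
instead of using output parameters.

```python
from typing import List


def parse_num_list(s: str) -> List[bool]:
    """Parse a list such as '1,3-5' into a list of selected indices."""
    selected: List[bool] = []
    parsing_end = False
    start = 0
    i = 0
    while i < len(s):
        # Read one unsigned decimal number starting at position i.
        j = i
        while j < len(s) and s[j].isdigit():
            j += 1
        if j == i:
            raise ValueError('expected a number')
        num = int(s[i:j])
        nxt = s[j] if j < len(s) else ''

        if not parsing_end:
            start = num
            if nxt == '-':
                # Start of a range: go on to read its end.
                parsing_end = True
                i = j + 1
                continue

        if nxt == ',':
            i = j + 1
        elif nxt == '':
            i = j
        else:
            raise ValueError('unexpected character')

        # A single number or the end of a range has been read.
        parsing_end = False
        end = num
        if start > end:
            raise ValueError('range start is greater than range end')
        if end + 1 > len(selected):
            selected.extend([False] * (end + 1 - len(selected)))
        for k in range(start, end + 1):
            selected[k] = True

    # An unterminated range such as '1,2-' leaves parsing_end set.
    if not selected or parsing_end:
        raise ValueError('empty list or unterminated range')
    return selected
```

In this model, end is assigned in one place only, and an input like '1,2-'
is rejected by the final parsing_end check rather than being accepted with
the trailing range silently dropped.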
tools/testing/selftests/bpf/testing_helpers.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c
index 795b6798ccee..82f0e2d99c23 100644
--- a/tools/testing/selftests/bpf/testing_helpers.c
+++ b/tools/testing/selftests/bpf/testing_helpers.c
@@ -20,16 +20,16 @@ int parse_num_list(const char *s, bool **num_set, int *num_set_len)
if (errno)
return -errno;
- if (parsing_end)
- end = num;
- else
+ if (!parsing_end) {
start = num;
+ if (*next == '-') {
+ s = next + 1;
+ parsing_end = true;
+ continue;
+ }
+ }
- if (!parsing_end && *next == '-') {
- s = next + 1;
- parsing_end = true;
- continue;
- } else if (*next == ',') {
+ if (*next == ',') {
parsing_end = false;
s = next + 1;
end = num;
@@ -60,7 +60,7 @@ int parse_num_list(const char *s, bool **num_set, int *num_set_len)
set[i] = true;
}
- if (!set)
+ if (!set || parsing_end)
return -EINVAL;
*num_set = set;
--
2.35.1
This patch series revisits the proposal for a GPU cgroup controller to
track and limit memory allocations by various device/allocator
subsystems. The patch series also contains a simple prototype to
illustrate how Android intends to implement DMA-BUF allocator
attribution using the GPU cgroup controller. The prototype does not
include resource limit enforcements.
Changelog:
v4:
Skip test if not run as root per Shuah Khan
Add better test logging for abnormal child termination per Shuah Khan
Adjust ordering of charge/uncharge during transfer to avoid potentially
hitting cgroup limit per Michal Koutný
Adjust gpucg_try_charge critical section for charge transfer functionality
Fix uninitialized return code error for dmabuf_try_charge error case
v3:
Remove Upstreaming Plan from gpu-cgroup.rst per John Stultz
Use more common dual author commit message format per John Stultz
Remove android from binder changes title per Todd Kjos
Add a kselftest for this new behavior per Greg Kroah-Hartman
Include details on behavior for all combinations of kernel/userspace
versions in changelog (thanks Suren Baghdasaryan) per Greg Kroah-Hartman.
Fix pid and uid types in binder UAPI header
v2:
See the previous revision of this change submitted by Hridya Valsaraju
at: https://lore.kernel.org/all/20220115010622.3185921-1-hridya@google.com/
Move dma-buf cgroup charge transfer from a dma_buf_op defined by every
heap to a single dma-buf function for all heaps per Daniel Vetter and
Christian König. Pointers to struct gpucg and struct gpucg_device
tracking the current associations were added to the dma_buf struct to
achieve this.
Fix incorrect Kconfig help section indentation per Randy Dunlap.
History of the GPU cgroup controller
====================================
The GPU/DRM cgroup controller came into being when a consensus[1]
was reached that the resources it tracked were unsuitable to be integrated
into memcg. Originally, the proposed controller was specific to the DRM
subsystem and was intended to track GEM buffers and GPU-specific
resources[2]. In order to help establish a unified memory accounting model
for all GPU and all related subsystems, Daniel Vetter put forth a
suggestion to move it out of the DRM subsystem so that it can be used by
other DMA-BUF exporters as well[3]. This RFC proposes an interface that
does the same.
[1]: https://patchwork.kernel.org/project/dri-devel/cover/20190501140438.9506-1-…
[2]: https://lore.kernel.org/amd-gfx/20210126214626.16260-1-brian.welty@intel.co…
[3]: https://lore.kernel.org/amd-gfx/YCVOl8%2F87bqRSQei@phenom.ffwll.local/
Hridya Valsaraju (5):
gpu: rfc: Proposal for a GPU cgroup controller
cgroup: gpu: Add a cgroup controller for allocator attribution of GPU
memory
dmabuf: heaps: export system_heap buffers with GPU cgroup charging
dmabuf: Add gpu cgroup charge transfer function
binder: Add a buffer flag to relinquish ownership of fds
T.J. Mercier (3):
dmabuf: Use the GPU cgroup charge/uncharge APIs
binder: use __kernel_pid_t and __kernel_uid_t for userspace
selftests: Add binder cgroup gpu memory transfer test
Documentation/gpu/rfc/gpu-cgroup.rst | 183 +++++++
Documentation/gpu/rfc/index.rst | 4 +
drivers/android/binder.c | 26 +
drivers/dma-buf/dma-buf.c | 107 ++++
drivers/dma-buf/dma-heap.c | 27 +
drivers/dma-buf/heaps/system_heap.c | 3 +
include/linux/cgroup_gpu.h | 139 +++++
include/linux/cgroup_subsys.h | 4 +
include/linux/dma-buf.h | 22 +-
include/linux/dma-heap.h | 11 +
include/uapi/linux/android/binder.h | 5 +-
init/Kconfig | 7 +
kernel/cgroup/Makefile | 1 +
kernel/cgroup/gpu.c | 362 +++++++++++++
.../selftests/drivers/android/binder/Makefile | 8 +
.../drivers/android/binder/binder_util.c | 254 +++++++++
.../drivers/android/binder/binder_util.h | 32 ++
.../selftests/drivers/android/binder/config | 4 +
.../binder/test_dmabuf_cgroup_transfer.c | 484 ++++++++++++++++++
19 files changed, 1679 insertions(+), 4 deletions(-)
create mode 100644 Documentation/gpu/rfc/gpu-cgroup.rst
create mode 100644 include/linux/cgroup_gpu.h
create mode 100644 kernel/cgroup/gpu.c
create mode 100644 tools/testing/selftests/drivers/android/binder/Makefile
create mode 100644 tools/testing/selftests/drivers/android/binder/binder_util.c
create mode 100644 tools/testing/selftests/drivers/android/binder/binder_util.h
create mode 100644 tools/testing/selftests/drivers/android/binder/config
create mode 100644 tools/testing/selftests/drivers/android/binder/test_dmabuf_cgroup_transfer.c
--
2.35.1.1021.g381101b075-goog
On Fri, 2022-03-11 at 17:24 +0100, Vincent Whitchurch wrote:
> Import the libvhost-user from QEMU for use in the implementation of the
> virtio devices in the roadtest backend.
>
So hm, I wonder if this is the sensible thing to do?
Not that I mind importing qemu code, but:
1) the implementation is rather complex in some places, and has support
for a LOT of virtio/vhost-user features that are really not needed
in these cases, for performance etc. It's also close to 4k LOC.
2) the implementation doesn't support time-travel mode which might come
in handy
We have another implementation that might be simpler:
https://github.com/linux-test-project/usfstl/blob/main/src/vhost.c
but it probably has dependencies on other things in this library, but
vhost.c itself is only ~1k LOC. (But I need to update it, I'm sure we
have some unpublished bugfixes etc. in this code)
johannes
Good morning!
May I present a solution that enables real-time monitoring of every vehicle, including its position, fuel consumption and mileage?
In addition, our tool minimizes vehicle maintenance costs, shortens travel times, and helps with planning routes and deliveries.
More than 49 thousand customers already rely on our knowledge and experience. We monitor 809,000 vehicles worldwide, which is our best showcase.
Please reply by e-mail if we could discuss together the potential of using such a solution in your company.
Best regards,
Marek Onufrowicz
Currently, when we run test_progs with just the executable file name, for
example 'PATH=. test_progs-no_alu32', cd_flavor_subdir() does not detect
that test_progs is running as a flavored test runner and does not switch
into the corresponding sub-directory.
As a result, test_progs-no_alu32 executed by the
'PATH=. test_progs-no_alu32' command runs in the wrong directory and
loads the wrong BPF objects.
Signed-off-by: Yuntao Wang <ytcoode(a)gmail.com>
---
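The flavor lookup after this change can be modelled as follows. This Python
sketch mirrors only the string handling visible in the diff below; the
function name flavor_of is hypothetical, and what the test runner does with
the result afterwards is not modelled.

```python
from typing import Optional


def flavor_of(exec_name: str) -> Optional[str]:
    """Return the flavor suffix of a test runner name, or None."""
    # Strip any directory part; a bare file name is used as it is.
    slash = exec_name.rfind('/')
    base = exec_name if slash < 0 else exec_name[slash + 1:]
    # The flavor is whatever follows the last '-' in the file name.
    dash = base.rfind('-')
    if dash < 0:
        return None
    return base[dash + 1:]


for name in ('./test_progs-no_alu32', 'test_progs-no_alu32', 'test_progs'):
    print(name, '->', flavor_of(name))
```

Before the change, the second case (no '/' in the name) returned early
without a flavor; the sketch treats it the same way as the first.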
tools/testing/selftests/bpf/test_progs.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 2ecb73a65206..0a4b45d7b515 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -761,8 +761,10 @@ int cd_flavor_subdir(const char *exec_name)
const char *flavor = strrchr(exec_name, '/');
if (!flavor)
- return 0;
- flavor++;
+ flavor = exec_name;
+ else
+ flavor++;
+
flavor = strrchr(flavor, '-');
if (!flavor)
return 0;
--
2.35.1
If a memop fails due to key checked protection, after already having
written to the guest, don't indicate suppression to the guest, as that
would imply that memory wasn't modified.
This could be considered a fix to the code introducing storage key
support; however, this is a bug in KVM only if we emulate an
instruction writing to an operand spanning multiple pages, which I
don't believe we do.
Janis Schoetterl-Glausch (2):
KVM: s390: Don't indicate suppression on dirtying, failing memop
KVM: s390: selftest: Test suppression indication on key prot exception
arch/s390/kvm/gaccess.c | 47 ++++++++++++++---------
tools/testing/selftests/kvm/s390x/memop.c | 43 ++++++++++++++++++++-
2 files changed, 70 insertions(+), 20 deletions(-)
base-commit: 1ebdbeb03efe89f01f15df038a589077df3d21f5
--
2.32.0
From: Ricardo Koller <ricarkol(a)google.com>
[ Upstream commit b53de63a89244c196d8a2ea76b6754e3fdb4b626 ]
vgic_poke_irq() checks that the attr argument passed to the vgic device
ioctl is sane. Make this check tighter by moving it to after the last
attr update.
Signed-off-by: Ricardo Koller <ricarkol(a)google.com>
Reported-by: Reiji Watanabe <reijiw(a)google.com>
Cc: Andrew Jones <drjones(a)redhat.com>
Reviewed-by: Andrew Jones <drjones(a)redhat.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20220127030858.3269036-6-ricarkol@google.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/aarch64/vgic.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/vgic.c b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
index 7c876ccf9294..5d45046c1b80 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/vgic.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
@@ -140,9 +140,6 @@ static void vgic_poke_irq(int gic_fd, uint32_t intid,
uint64_t val;
bool intid_is_private = INTID_IS_SGI(intid) || INTID_IS_PPI(intid);
- /* Check that the addr part of the attr is within 32 bits. */
- assert(attr <= KVM_DEV_ARM_VGIC_OFFSET_MASK);
-
uint32_t group = intid_is_private ? KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
: KVM_DEV_ARM_VGIC_GRP_DIST_REGS;
@@ -152,6 +149,9 @@ static void vgic_poke_irq(int gic_fd, uint32_t intid,
attr += SZ_64K;
}
+ /* Check that the addr part of the attr is within 32 bits. */
+ assert((attr & ~KVM_DEV_ARM_VGIC_OFFSET_MASK) == 0);
+
/*
* All calls will succeed, even with invalid intid's, as long as the
* addr part of the attr is within 32 bits (checked above). An invalid
--
2.34.1
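The effect of moving the check in vgic_poke_irq() can be illustrated with
a small standalone sketch. The constants below are stand-ins with assumed
values, and the condition guarding the "attr += SZ_64K" update is an
assumption, since the hunk does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for KVM_DEV_ARM_VGIC_OFFSET_MASK and SZ_64K. */
#define OFFSET_MASK 0xffffffffULL
#define SZ_64K 0x10000ULL

/* True if the address part of attr fits within the offset mask. */
static bool attr_offset_fits(uint64_t attr)
{
	return (attr & ~OFFSET_MASK) == 0;
}

/*
 * Apply the last attr update. The guarding condition is assumed here to
 * be "intid is private"; the hunk only shows the update itself.
 */
static uint64_t final_attr(uint64_t attr, bool intid_is_private)
{
	if (intid_is_private)
		attr += SZ_64K;
	return attr;
}
```

An attr of 0xffff0000 passes a check made before the update, but after
adding SZ_64K it no longer fits in 32 bits. Only a check placed after the
update catches that case.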
From: Ricardo Koller <ricarkol(a)google.com>
[ Upstream commit a5cd38fd9c47b23abc6df08d6ee6a71b39038185 ]
Fix the formatting of some comments and the wording of one of them (in
gicv3_access_reg).
Signed-off-by: Ricardo Koller <ricarkol(a)google.com>
Reported-by: Reiji Watanabe <reijiw(a)google.com>
Cc: Andrew Jones <drjones(a)redhat.com>
Reviewed-by: Andrew Jones <drjones(a)redhat.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20220127030858.3269036-5-ricarkol@google.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/aarch64/vgic_irq.c | 12 ++++++++----
tools/testing/selftests/kvm/lib/aarch64/gic_v3.c | 10 ++++++----
tools/testing/selftests/kvm/lib/aarch64/vgic.c | 3 ++-
3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vgic_irq.c b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
index 48e43e24d240..554ca649d470 100644
--- a/tools/testing/selftests/kvm/aarch64/vgic_irq.c
+++ b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
@@ -306,7 +306,8 @@ static void guest_restore_active(struct test_args *args,
uint32_t prio, intid, ap1r;
int i;
- /* Set the priorities of the first (KVM_NUM_PRIOS - 1) IRQs
+ /*
+ * Set the priorities of the first (KVM_NUM_PRIOS - 1) IRQs
* in descending order, so intid+1 can preempt intid.
*/
for (i = 0, prio = (num - 1) * 8; i < num; i++, prio -= 8) {
@@ -315,7 +316,8 @@ static void guest_restore_active(struct test_args *args,
gic_set_priority(intid, prio);
}
- /* In a real migration, KVM would restore all GIC state before running
+ /*
+ * In a real migration, KVM would restore all GIC state before running
* guest code.
*/
for (i = 0; i < num; i++) {
@@ -503,7 +505,8 @@ static void guest_code(struct test_args *args)
test_injection_failure(args, f);
}
- /* Restore the active state of IRQs. This would happen when live
+ /*
+ * Restore the active state of IRQs. This would happen when live
* migrating IRQs in the middle of being handled.
*/
for_each_supported_activate_fn(args, set_active_fns, f)
@@ -844,7 +847,8 @@ int main(int argc, char **argv)
}
}
- /* If the user just specified nr_irqs and/or gic_version, then run all
+ /*
+ * If the user just specified nr_irqs and/or gic_version, then run all
* combinations.
*/
if (default_args) {
diff --git a/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c b/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
index e4945fe66620..263bf3ed8fd5 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
@@ -19,7 +19,7 @@ struct gicv3_data {
unsigned int nr_spis;
};
-#define sgi_base_from_redist(redist_base) (redist_base + SZ_64K)
+#define sgi_base_from_redist(redist_base) (redist_base + SZ_64K)
#define DIST_BIT (1U << 31)
enum gicv3_intid_range {
@@ -105,7 +105,8 @@ static void gicv3_set_eoi_split(bool split)
{
uint32_t val;
- /* All other fields are read-only, so no need to read CTLR first. In
+ /*
+ * All other fields are read-only, so no need to read CTLR first. In
* fact, the kernel does the same.
*/
val = split ? (1U << 1) : 0;
@@ -160,8 +161,9 @@ static void gicv3_access_reg(uint32_t intid, uint64_t offset,
GUEST_ASSERT(bits_per_field <= reg_bits);
GUEST_ASSERT(!write || *val < (1U << bits_per_field));
- /* Some registers like IROUTER are 64 bit long. Those are currently not
- * supported by readl nor writel, so just asserting here until then.
+ /*
+ * This function does not support 64 bit accesses. Just asserting here
+ * until we implement readq/writeq.
*/
GUEST_ASSERT(reg_bits == 32);
diff --git a/tools/testing/selftests/kvm/lib/aarch64/vgic.c b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
index f5cd0c536d85..7c876ccf9294 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/vgic.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
@@ -152,7 +152,8 @@ static void vgic_poke_irq(int gic_fd, uint32_t intid,
attr += SZ_64K;
}
- /* All calls will succeed, even with invalid intid's, as long as the
+ /*
+ * All calls will succeed, even with invalid intid's, as long as the
* addr part of the attr is within 32 bits (checked above). An invalid
* intid will just make the read/writes point to above the intended
* register space (i.e., ICPENDR after ISPENDR).
--
2.34.1
From: Ricardo Koller <ricarkol(a)google.com>
[ Upstream commit 5b7898648f02083012900e48d063e51ccbdad165 ]
kvm_set_gsi_routing_irqchip_check(expect_failure=true) is used to check
the error code returned by the kernel when trying to set up an invalid
gsi routing table. The ioctl fails if "pin >= KVM_IRQCHIP_NUM_PINS", so
kvm_set_gsi_routing_irqchip_check() should test the error only when
"intid >= KVM_IRQCHIP_NUM_PINS+32". The issue is that the test check is
"intid >= KVM_IRQCHIP_NUM_PINS", so for a case like "intid =
KVM_IRQCHIP_NUM_PINS" the test wrongly assumes that the kernel will
return an error. Fix this by using the right check.
Signed-off-by: Ricardo Koller <ricarkol(a)google.com>
Reported-by: Reiji Watanabe <reijiw(a)google.com>
Cc: Andrew Jones <drjones(a)redhat.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20220127030858.3269036-4-ricarkol@google.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/aarch64/vgic_irq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vgic_irq.c b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
index 7f3afee5cc00..48e43e24d240 100644
--- a/tools/testing/selftests/kvm/aarch64/vgic_irq.c
+++ b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
@@ -573,8 +573,8 @@ static void kvm_set_gsi_routing_irqchip_check(struct kvm_vm *vm,
kvm_gsi_routing_write(vm, routing);
} else {
ret = _kvm_gsi_routing_write(vm, routing);
- /* The kernel only checks for KVM_IRQCHIP_NUM_PINS. */
- if (intid >= KVM_IRQCHIP_NUM_PINS)
+ /* The kernel only checks e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS */
+ if (((uint64_t)intid + num - 1 - MIN_SPI) >= KVM_IRQCHIP_NUM_PINS)
TEST_ASSERT(ret != 0 && errno == EINVAL,
"Bad intid %u did not cause KVM_SET_GSI_ROUTING "
"error: rc: %i errno: %i", intid, ret, errno);
--
2.34.1
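A standalone sketch of the corrected condition may help when reading the
hunk above. The constant values are placeholders chosen for illustration,
num is taken to be the number of consecutive intids being routed, and
intid is assumed to be at least MIN_SPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration; the real ones come from the headers. */
#define MIN_SPI 32u
#define NUM_PINS 988u /* stands in for KVM_IRQCHIP_NUM_PINS */

/*
 * True if routing num intids starting at intid is expected to fail,
 * i.e. if the last pin (intid + num - 1 - MIN_SPI) is out of range.
 * Assumes intid >= MIN_SPI so the subtraction does not wrap.
 */
static bool expect_routing_error(uint32_t intid, uint32_t num)
{
	return ((uint64_t)intid + num - 1 - MIN_SPI) >= NUM_PINS;
}
```

Under these assumptions, intid == NUM_PINS with num == 1 maps to a pin
below NUM_PINS, so no error is expected, which is the case the old check
got wrong.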
From: Ricardo Koller <ricarkol(a)google.com>
[ Upstream commit cc94d47ce16d4147d546e47c8248e8bd12ba5fe5 ]
The val argument in gicv3_access_reg can have any value when used for a
read, not necessarily 0. Fix the assert by checking val only for
writes.
Signed-off-by: Ricardo Koller <ricarkol(a)google.com>
Reported-by: Reiji Watanabe <reijiw(a)google.com>
Cc: Andrew Jones <drjones(a)redhat.com>
Reviewed-by: Andrew Jones <drjones(a)redhat.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20220127030858.3269036-2-ricarkol@google.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/aarch64/gic_v3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c b/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
index 00f613c0583c..e4945fe66620 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/gic_v3.c
@@ -159,7 +159,7 @@ static void gicv3_access_reg(uint32_t intid, uint64_t offset,
uint32_t cpu_or_dist;
GUEST_ASSERT(bits_per_field <= reg_bits);
- GUEST_ASSERT(*val < (1U << bits_per_field));
+ GUEST_ASSERT(!write || *val < (1U << bits_per_field));
/* Some registers like IROUTER are 64 bit long. Those are currently not
* supported by readl nor writel, so just asserting here until then.
*/
--
2.34.1
Hi,
This is a followup of my v1 at [0] and v2 at [1].
The short summary of the previous cover letter and discussions is that
HID could benefit from BPF for the following use cases:
- simple fixup of report descriptor:
benefits are faster development time and testing, with the produced
bpf program being shipped in the kernel directly (the shipping part
is *not* addressed here).
- Universal Stylus Interface:
allows a user-space program to define its own kernel interface
- Surface Dial:
somewhat similar to the previous one except that userspace can decide
to change the shape of the exported device
- firewall:
still partly missing there, there is not yet interception of hidraw
calls, but it's coming in a followup series, I promise
- tracing:
well, tracing.
I think I addressed the comments from the previous version, but there are
a few things I'd like to note here:
- I did not take the various rev-by and tested-by (thanks a lot for those)
because the uapi changed significantly in v3, so I am not very confident
in taking those rev-by blindly
- I mentioned in my discussion with Song that I'll put a summary of the uapi
in the cover letter, but I ended up adding a (long) file in the Documentation
directory. So please maybe start by reading 17/17 to have an overview of
what I want to achieve
- I added the new type BPF_HID_DRIVER_EVENT in libbpf and bpf, even though
I don't have a user of it right now in the kernel. I wanted to have them in
the docs, but we might not want to have them ready here.
In terms of code, it just means that we can attach such program types
but that they will never get triggered.
Anyway, I have been mulling this over for the past 2 weeks, and I think that
maybe sharing this now is better than me just staring at the code over and
over.
Short summary of changes:
v3:
===
- squashed back together most of the libbpf and bpf changes into bigger
commits that give a better overview of the whole interactions
- reworked the user API to not expose .data as a directly accessible field
from the context, but instead force everyone to use hid_bpf_get_data (or
get/set_bits)
- added BPF_HID_DRIVER_EVENT (see note above)
- addressed the various nitpicks from v2
- added a big Documentation file (and so adding now the doc maintainers to the
long list of recipients)
v2:
===
- split the series by subsystem (bpf, HID, libbpf, selftests and
samples)
- Added an extra patch at the beginning to not require CAP_NET_ADMIN for
BPF_PROG_TYPE_LIRC_MODE2 (please shout if this is wrong)
- made the bpf context attached to HID program of dynamic size:
* the first 1 kB will be able to be addressed directly
* the rest can be retrieved through bpf_hid_{set|get}_data
(note that I am definitely not happy with that API, because part of
it is in bits and the rest in bytes. ouch)
- added an extra patch to prevent non-GPL bpf programs of type
BPF_PROG_TYPE_HID from being loaded
* same here, not really happy but I don't know where to put that check
in verifier.c
- added a new flag BPF_F_INSERT_HEAD for the BPF_LINK_CREATE syscall when
used with HID program types.
* this flag is used for tracing, to be able to load a program before
any others that might already have been inserted and that might
change the data stream.
Cheers,
Benjamin
[0] https://lore.kernel.org/linux-input/20220224110828.2168231-1-benjamin.tisso…
[1] https://lore.kernel.org/linux-input/20220304172852.274126-1-benjamin.tissoi…
Benjamin Tissoires (17):
bpf: add new is_sys_admin_prog_type() helper
bpf: introduce hid program type
bpf/verifier: prevent non GPL programs to be loaded against HID
libbpf: add HID program type and API
HID: hook up with bpf
HID: allow to change the report descriptor from an eBPF program
selftests/bpf: add tests for the HID-bpf initial implementation
selftests/bpf: add report descriptor fixup tests
selftests/bpf: Add a test for BPF_F_INSERT_HEAD
selftests/bpf: add test for user call of HID bpf programs
samples/bpf: add new hid_mouse example
bpf/hid: add more HID helpers
HID: bpf: implement hid_bpf_get|set_bits
HID: add implementation of bpf_hid_raw_request
selftests/bpf: add tests for hid_{get|set}_bits helpers
selftests/bpf: add tests for bpf_hid_hw_request
Documentation: add HID-BPF docs
Documentation/hid/hid-bpf.rst | 444 +++++++++++
Documentation/hid/index.rst | 1 +
drivers/hid/Makefile | 1 +
drivers/hid/hid-bpf.c | 328 ++++++++
drivers/hid/hid-core.c | 34 +-
include/linux/bpf-hid.h | 127 +++
include/linux/bpf_types.h | 4 +
include/linux/hid.h | 36 +-
include/uapi/linux/bpf.h | 67 ++
include/uapi/linux/bpf_hid.h | 71 ++
include/uapi/linux/hid.h | 10 +
kernel/bpf/Makefile | 3 +
kernel/bpf/btf.c | 1 +
kernel/bpf/hid.c | 728 +++++++++++++++++
kernel/bpf/syscall.c | 27 +-
kernel/bpf/verifier.c | 7 +
samples/bpf/.gitignore | 1 +
samples/bpf/Makefile | 4 +
samples/bpf/hid_mouse_kern.c | 117 +++
samples/bpf/hid_mouse_user.c | 129 +++
tools/include/uapi/linux/bpf.h | 67 ++
tools/lib/bpf/libbpf.c | 23 +-
tools/lib/bpf/libbpf.h | 2 +
tools/lib/bpf/libbpf.map | 1 +
tools/testing/selftests/bpf/config | 3 +
tools/testing/selftests/bpf/prog_tests/hid.c | 788 +++++++++++++++++++
tools/testing/selftests/bpf/progs/hid.c | 205 +++++
27 files changed, 3204 insertions(+), 25 deletions(-)
create mode 100644 Documentation/hid/hid-bpf.rst
create mode 100644 drivers/hid/hid-bpf.c
create mode 100644 include/linux/bpf-hid.h
create mode 100644 include/uapi/linux/bpf_hid.h
create mode 100644 kernel/bpf/hid.c
create mode 100644 samples/bpf/hid_mouse_kern.c
create mode 100644 samples/bpf/hid_mouse_user.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/hid.c
create mode 100644 tools/testing/selftests/bpf/progs/hid.c
--
2.35.1
Hi Team,
Can anyone help me with this? I would appreciate a reply as soon as possible.
Regards
Sarath P T
From: P T, Sarath
Sent: 14 March 2022 13:19
To: 'linux-kselftest-mirror(a)lists.linaro.org' <linux-kselftest-mirror(a)lists.linaro.org>
Subject: clarification on tap_timeout function
Hi Team,
I need clarification on the function "tap_timeout", which is used in runner.sh, the script that reports results in the TAP 13 protocol format. The function is given below.
tap_timeout()
{
	# Make sure tests will time out if utility is available.
	if [ -x /usr/bin/timeout ] ; then
		/usr/bin/timeout --foreground "$kselftest_timeout" "$1"
	else
		"$1"
	fi
}
I would like to understand why the function "tap_timeout" is used and why the "kselftest_timeout" variable is set to 45 seconds by default. It would be very helpful if you could clarify these points for me.
Regards
Sarath PT
Before Linux 5.17, there was a problem with the CMOS RTC driver:
cmos_read_alarm() and cmos_set_alarm() did not check for the UIP (Update
in progress) bit, which could have caused them to sometimes fail silently,
reading bogus values or not setting the alarm correctly.
Luckily, this issue was masked by cmos_read_time() invocations in core
RTC code - see https://marc.info/?l=linux-rtc&m=164858416511425&w=4
To avoid such a problem in the future in some other driver, I wrote a
test unit that reads the alarm time many times in a row. As the alarm
time is usually read once and cached by the RTC core, this requires a
way for userspace to trigger a direct alarm time read from hardware. I
think that debugfs is the natural choice for this.
So, introduce /sys/kernel/debug/rtc/rtcX/wakealarm_raw. This interface
as implemented here does not seem to be that useful to userspace, so
there is little risk that it will become kernel ABI.
Is this approach correct and worth it?
TODO:
- should I add a new Kconfig option (like CONFIG_RTC_INTF_DEBUGFS), or
just use CONFIG_DEBUG_FS here? I wouldn't like to create unnecessary
config options in the kernel.
Signed-off-by: Mateusz Jończyk <mat.jonczyk(a)o2.pl>
Cc: Alessandro Zummo <a.zummo(a)towertech.it>
Cc: Alexandre Belloni <alexandre.belloni(a)bootlin.com>
Cc: Shuah Khan <shuah(a)kernel.org>
---
drivers/rtc/Makefile | 1 +
drivers/rtc/class.c | 3 ++
drivers/rtc/debugfs.c | 112 ++++++++++++++++++++++++++++++++++++++++
drivers/rtc/interface.c | 3 +-
include/linux/rtc.h | 16 ++++++
5 files changed, 133 insertions(+), 2 deletions(-)
create mode 100644 drivers/rtc/debugfs.c
diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile
index 678a8ef4abae..50e166a97f54 100644
--- a/drivers/rtc/Makefile
+++ b/drivers/rtc/Makefile
@@ -14,6 +14,7 @@ rtc-core-$(CONFIG_RTC_NVMEM) += nvmem.o
rtc-core-$(CONFIG_RTC_INTF_DEV) += dev.o
rtc-core-$(CONFIG_RTC_INTF_PROC) += proc.o
rtc-core-$(CONFIG_RTC_INTF_SYSFS) += sysfs.o
+rtc-core-$(CONFIG_DEBUG_FS) += debugfs.o
obj-$(CONFIG_RTC_LIB_KUNIT_TEST) += lib_test.o
diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index 4b460c61f1d8..5673b7b26c0d 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -334,6 +334,7 @@ static void devm_rtc_unregister_device(void *data)
* Remove innards of this RTC, then disable it, before
* letting any rtc_class_open() users access it again
*/
+ rtc_debugfs_del_device(rtc);
rtc_proc_del_device(rtc);
if (!test_bit(RTC_NO_CDEV, &rtc->flags))
cdev_device_del(&rtc->char_dev, &rtc->dev);
@@ -417,6 +418,7 @@ int __devm_rtc_register_device(struct module *owner, struct rtc_device *rtc)
}
rtc_proc_add_device(rtc);
+ rtc_debugfs_add_device(rtc);
dev_info(rtc->dev.parent, "registered as %s\n",
dev_name(&rtc->dev));
@@ -476,6 +478,7 @@ static int __init rtc_init(void)
}
rtc_class->pm = RTC_CLASS_DEV_PM_OPS;
rtc_dev_init();
+ rtc_debugfs_init();
return 0;
}
subsys_initcall(rtc_init);
diff --git a/drivers/rtc/debugfs.c b/drivers/rtc/debugfs.c
new file mode 100644
index 000000000000..5ceed5504033
--- /dev/null
+++ b/drivers/rtc/debugfs.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/*
+ * Debugfs interface for testing RTC alarms.
+ */
+#include <linux/debugfs.h>
+#include <linux/err.h>
+#include <linux/rtc.h>
+
+static struct dentry *rtc_main_debugfs_dir;
+
+void rtc_debugfs_init(void)
+{
+ struct dentry *ret = debugfs_create_dir("rtc", NULL);
+
+ // No error is critical here
+ if (!IS_ERR(ret))
+ rtc_main_debugfs_dir = ret;
+}
+
+/*
+ * Handler for /sys/kernel/debug/rtc/rtcX/wakealarm_raw .
+ * This function reads the RTC alarm time directly from hardware. If the RTC
+ * alarm is enabled, this function returns the alarm time modulo 24h in seconds
+ * since midnight.
+ *
+ * Should be only used for testing of the RTC alarm read functionality in
+ * drivers - to make sure that the driver returns consistent values.
+ *
+ * Used in tools/testing/selftests/rtc/rtctest.c .
+ */
+static int rtc_debugfs_alarm_read(void *p, u64 *out)
+{
+ int ret;
+ struct rtc_device *rtc = p;
+ struct rtc_wkalrm alm;
+
+ /* Using rtc_read_alarm_internal() instead of __rtc_read_alarm() will
+ * allow us to avoid any interaction with rtc_read_time() and possibly
+ * see more issues.
+ */
+ ret = rtc_read_alarm_internal(rtc, &alm);
+ if (ret != 0)
+ return ret;
+
+ if (!alm.enabled) {
+ *out = -1;
+ return 0;
+ }
+
+ /* It does not matter if the device does not support seconds resolution
+ * of the RTC alarm.
+ */
+ if (test_bit(RTC_FEATURE_ALARM_RES_MINUTE, rtc->features))
+ alm.time.tm_sec = 0;
+
+ /* The selftest code works with fully defined alarms only.
+ */
+ if (alm.time.tm_sec == -1 || alm.time.tm_min == -1 || alm.time.tm_hour == -1) {
+ *out = -2;
+ return 0;
+ }
+
+ /* Check if the alarm time is correct.
+ * rtc_valid_tm() does not allow fields containing "-1", so put in
+ * something to satisfy it.
+ */
+ if (alm.time.tm_year == -1)
+ alm.time.tm_year = 100;
+ if (alm.time.tm_mon == -1)
+ alm.time.tm_mon = 0;
+ if (alm.time.tm_mday == -1)
+ alm.time.tm_mday = 1;
+ if (rtc_valid_tm(&alm.time))
+ return -EINVAL;
+
+ /* We do not duplicate the logic in __rtc_read_alarm() and instead only
+ * return the alarm time modulo 24h, which all devices should support.
+ * This should be enough for testing purposes.
+ */
+ *out = alm.time.tm_hour * 3600 + alm.time.tm_min * 60 + alm.time.tm_sec;
+
+ return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(rtc_alarm_raw, rtc_debugfs_alarm_read, NULL, "%lld\n");
+
+void rtc_debugfs_add_device(struct rtc_device *rtc)
+{
+ struct dentry *dev_dir;
+
+ if (!rtc_main_debugfs_dir)
+ return;
+
+ dev_dir = debugfs_create_dir(dev_name(&rtc->dev), rtc_main_debugfs_dir);
+
+ if (IS_ERR(dev_dir)) {
+ rtc->debugfs_dir = NULL;
+ return;
+ }
+ rtc->debugfs_dir = dev_dir;
+
+ if (test_bit(RTC_FEATURE_ALARM, rtc->features) && rtc->ops->read_alarm) {
+ debugfs_create_file("wakealarm_raw", 0444, dev_dir,
+ rtc, &rtc_alarm_raw);
+ }
+}
+
+void rtc_debugfs_del_device(struct rtc_device *rtc)
+{
+ debugfs_remove_recursive(rtc->debugfs_dir);
+ rtc->debugfs_dir = NULL;
+}
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index d8e835798153..51c801c82472 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -175,8 +175,7 @@ int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm)
}
EXPORT_SYMBOL_GPL(rtc_set_time);
-static int rtc_read_alarm_internal(struct rtc_device *rtc,
- struct rtc_wkalrm *alarm)
+int rtc_read_alarm_internal(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
{
int err;
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 47fd1c2d3a57..4665bc238a94 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -41,6 +41,7 @@ static inline time64_t rtc_tm_sub(struct rtc_time *lhs, struct rtc_time *rhs)
#include <linux/mutex.h>
#include <linux/timerqueue.h>
#include <linux/workqueue.h>
+#include <linux/debugfs.h>
extern struct class *rtc_class;
@@ -152,6 +153,10 @@ struct rtc_device {
time64_t offset_secs;
bool set_start_time;
+#ifdef CONFIG_DEBUG_FS
+ struct dentry *debugfs_dir;
+#endif
+
#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
struct work_struct uie_task;
struct timer_list uie_timer;
@@ -190,6 +195,7 @@ extern int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm);
int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm);
extern int rtc_read_alarm(struct rtc_device *rtc,
struct rtc_wkalrm *alrm);
+int rtc_read_alarm_internal(struct rtc_device *rtc, struct rtc_wkalrm *alarm);
extern int rtc_set_alarm(struct rtc_device *rtc,
struct rtc_wkalrm *alrm);
extern int rtc_initialize_alarm(struct rtc_device *rtc,
@@ -262,4 +268,14 @@ int rtc_add_groups(struct rtc_device *rtc, const struct attribute_group **grps)
return 0;
}
#endif
+
+#ifdef CONFIG_DEBUG_FS
+void rtc_debugfs_init(void);
+void rtc_debugfs_add_device(struct rtc_device *rtc);
+void rtc_debugfs_del_device(struct rtc_device *rtc);
+#else /* CONFIG_DEBUG_FS */
+static inline void rtc_debugfs_init(void) {}
+static inline void rtc_debugfs_add_device(struct rtc_device *rtc) {}
+static inline void rtc_debugfs_del_device(struct rtc_device *rtc) {}
+#endif /* CONFIG_DEBUG_FS */
#endif /* _LINUX_RTC_H_ */
--
2.25.1
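To make the wakealarm_raw value format concrete, here is a minimal
userspace sketch that decodes one line read from the file. The struct and
function names are hypothetical; the sketch relies only on what the patch
above states: the value is printed with "%lld\n", -1 means the alarm is
disabled, -2 means it is not fully defined, and otherwise it is the alarm
time in seconds since midnight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for a decoded time of day. */
struct alarm_tod {
	int hour;
	int min;
	int sec;
};

/*
 * Decode the text read from wakealarm_raw. Returns true and fills *tod
 * only for a valid seconds-since-midnight value; the negative markers
 * (-1 disabled, -2 not fully defined) and malformed input return false.
 */
static bool decode_wakealarm_raw(const char *text, struct alarm_tod *tod)
{
	long long secs;

	assert(text != NULL && tod != NULL);

	if (sscanf(text, "%lld", &secs) != 1)
		return false;
	if (secs < 0 || secs >= 24 * 3600)
		return false;

	tod->hour = (int)(secs / 3600);
	tod->min = (int)(secs / 60 % 60);
	tod->sec = (int)(secs % 60);
	return true;
}
```

A selftest could read the file many times in a row and compare the
decoded values, expecting them to stay identical while the alarm is left
untouched.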
Print two possible reasons /sys/kernel/debug/gup_test
cannot be opened to help users of this test diagnose
failures.
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: stable(a)vger.kernel.org # 5.15+
---
tools/testing/selftests/vm/gup_test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index fe043f67798b0..c496bcefa7a0e 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -205,7 +205,9 @@ int main(int argc, char **argv)
gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
if (gup_fd == -1) {
- perror("open");
+ perror("failed to open /sys/kernel/debug/gup_test");
+ printf("check if CONFIG_GUP_TEST is enabled in kernel config\n");
+ printf("check if debugfs is mounted at /sys/kernel/debug\n");
exit(1);
}
--
2.24.1
On 3/30/2022 12:03 PM, Jarkko Sakkinen wrote:
> On Wed, 2022-03-30 at 10:40 -0700, Reinette Chatre wrote:
>> Could you please elaborate how the compiler will fix it up?
>
> Sure.
>
> Here's the disassembly of the RBX version:
>
> [0x000021a9]> pi 1
> lea rax, [rbx + loc.encl_stack]
>
> Here's the same with s/RBX/RIP/:
>
> [0x000021a9]> pi 5
> lea rax, loc.encl_stack
>
> Compiler will substitute correct offset relative to the RIP,
> well, because it can and it makes sense.
It does not make sense to me because, as proven with my test,
the two threads end up sharing the same stack memory.
Reinette