Em qua., 7 de abr. de 2021 às 15:31, Andrii Nakryiko andrii.nakryiko@gmail.com escreveu:
On Tue, Apr 6, 2021 at 11:55 AM Pedro Tammela pctammela@gmail.com wrote:
This macro was refactored out of the bpf selftests.
Since percpu values are rounded up to '8' in the kernel, a careless user in userspace might encounter unexpected values when parsing the output of the batched operations.
I wonder if a user has to be more careful, though? This BPF_PERCPU_TYPE, __bpf_percpu_align and bpf_percpu macros seem to create just another opaque layer. It actually seems detrimental to me.
I'd rather emphasize in the documentation (e.g., in bpf_map_lookup_elem) that all per-cpu maps are aligning values at 8 bytes, so user has to make sure that array of values provided to bpf_map_lookup_elem() has each element size rounded up to 8.
From my own experience, the documentation has been a very unreliable
source, to the point that I usually jump to the code first rather than to the documentation nowadays[1]. Tests, samples and projects have always been my source of truth and we are already lacking a bit on those as well. For instance, the samples directory contains programs that are very outdated (I didn't check if they are still functional). I think macros like these will be present in most of the project dealing with batched operations and as a daily user of libbpf I don't see how this could not be offered by libbpf as a standardized way to declare percpu types.
[1] So batched operations were introduced a little bit over a 1 year ago and yet the only reference I had for it was the selftests. The documentation is on my TODO list, but that's just because I have to deal with it daily.
In practice, I'd recommend users to always use __u64/__s64 when having primitive integers in a map (they are not saving anything by using int, it just creates an illusion of savings). Well, maybe on 32-bit arches they would save a bit of CPU, but not on typical 64-bit architectures. As for using structs as values, always mark them as __attribute__((aligned(8))).
Basically, instead of obscuring the real use some more, let's clarify and maybe even provide some examples in documentation?
Why not do both?
Provide a standardized way to declare a percpu value with examples and a good documentation with examples. Let the user decide what is best for his use case.
Now that both array and hash maps have support for batched ops in the percpu variant, let's provide a convenient macro to declare percpu map value types.
Updates the tests to a "reference" usage of the new macro.
Signed-off-by: Pedro Tammela pctammela@mojatatu.com
tools/lib/bpf/bpf.h | 10 ++++ tools/testing/selftests/bpf/bpf_util.h | 7 --- .../bpf/map_tests/htab_map_batch_ops.c | 48 ++++++++++--------- .../selftests/bpf/prog_tests/map_init.c | 5 +- tools/testing/selftests/bpf/test_maps.c | 16 ++++--- 5 files changed, 46 insertions(+), 40 deletions(-)
[...]
@@ -400,11 +402,11 @@ static void test_arraymap(unsigned int task, void *data) static void test_arraymap_percpu(unsigned int task, void *data) { unsigned int nr_cpus = bpf_num_possible_cpus();
BPF_DECLARE_PERCPU(long, values);
pcpu_map_value_t values[nr_cpus]; int key, next_key, fd, i; fd = bpf_create_map(BPF_MAP_TYPE_PERCPU_ARRAY, sizeof(key),
sizeof(bpf_percpu(values, 0)), 2, 0);
sizeof(long), 2, 0); if (fd < 0) { printf("Failed to create arraymap '%s'!\n", strerror(errno)); exit(1);
@@ -459,7 +461,7 @@ static void test_arraymap_percpu(unsigned int task, void *data) static void test_arraymap_percpu_many_keys(void) { unsigned int nr_cpus = bpf_num_possible_cpus();
This just sets a bad example for anyone using selftests as an aspiration for their own code. bpf_num_possible_cpus() does exit(1) internally if libbpf_num_possible_cpus() returns error. No one should write real production code like that. So maybe let's provide a better example instead with error handling and malloc (or perhaps alloca)?
OK. Makes sense.
BPF_DECLARE_PERCPU(long, values);
pcpu_map_value_t values[nr_cpus]; /* nr_keys is not too large otherwise the test stresses percpu * allocator more than anything else */
@@ -467,7 +469,7 @@ static void test_arraymap_percpu_many_keys(void) int key, fd, i;
fd = bpf_create_map(BPF_MAP_TYPE_PERCPU_ARRAY, sizeof(key),
sizeof(bpf_percpu(values, 0)), nr_keys, 0);
sizeof(long), nr_keys, 0); if (fd < 0) { printf("Failed to create per-cpu arraymap '%s'!\n", strerror(errno));
-- 2.25.1