On 2025/11/22 10:34, Alexei Starovoitov wrote:
On Mon, Nov 17, 2025 at 8:22 AM Leon Hwang <leon.hwang@linux.dev> wrote:
[...]
	/* lookup then check value on CPUs */
	for (j = 0; j < nr_cpus; j++) {
		flags = (u64)j << 32 | BPF_F_CPU;
		err = bpf_map__lookup_elem(map, keys + i * key_sz, key_sz, values,
					   value_sz, flags);
		if (!ASSERT_OK(err, "bpf_map__lookup_elem specified cpu"))
			goto out;
		if (!ASSERT_EQ(values[0], j != cpu ? 0 : value,
			       "bpf_map__lookup_elem value on specified cpu"))
			goto out;

I was about to apply it, but noticed that the test is unstable.
It fails 1 out of 10 for me in the above line.

test_percpu_map_op_cpu_flag:PASS:bpf_map_lookup_batch value on specified cpu 0 nsec
test_percpu_map_op_cpu_flag:FAIL:bpf_map_lookup_batch value on specified cpu unexpected bpf_map_lookup_batch value on specified cpu: actual 0 != expected 3735929054
#261/15 percpu_alloc/cpu_flag_lru_percpu_hash:FAIL
#261 percpu_alloc:FAIL

Please investigate what is going on.
I was able to reproduce the failure on a 16-core VM.
It appears to be caused by LRU eviction. When I increased max_entries of the lru_percpu_hash map to libbpf_num_possible_cpus(), the issue no longer reproduced.
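
For anyone who wants to try the same workaround locally, here is a minimal sketch of the sizing logic only; it is not the actual change to the selftest. The helper name lru_entries_for_cpus() is hypothetical. It assumes the CPU count comes from libbpf_num_possible_cpus(), which returns a negative error code on failure, and that the result is applied with bpf_map__set_max_entries() before the object is loaded.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the selftest): choose max_entries for
 * the lru_percpu_hash map so that it is at least the number of possible
 * CPUs.
 *
 * nr_cpus is expected to be the return value of
 * libbpf_num_possible_cpus(). If that call failed (negative or zero
 * value), the current size is returned unchanged.
 */
static uint32_t lru_entries_for_cpus(uint32_t cur_entries, int nr_cpus)
{
	if (nr_cpus <= 0)
		return cur_entries;

	return cur_entries < (uint32_t)nr_cpus ? (uint32_t)nr_cpus : cur_entries;
}
```

In the test this would be wired in before load, roughly as bpf_map__set_max_entries(map, lru_entries_for_cpus(bpf_map__max_entries(map), libbpf_num_possible_cpus())). Taking the larger of the two values is a choice made for this sketch; the experiment described above simply set max_entries to the CPU count.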
I'll need to spend more time investigating the exact eviction behavior and why it shows up intermittently in this test.
Thanks,
Leon