March 2025 - Linux-kselftest-mirror

[PATCH 0/4] mm: permit guard regions for file-backed/shmem mappings

by Lorenzo Stoakes

The guard regions feature was initially implemented to support anonymous mappings only, excluding shmem.

This was done so as to introduce the feature carefully and incrementally, and to be conservative when considering the various caveats and corner cases that are applicable to file-backed mappings but not to anonymous ones.

Now that this feature has landed in 6.13, it is time to revisit this and to extend this functionality to file-backed and shmem mappings.

In order to make this maximally useful, and since one may map file-backed mappings read-only (for instance ELF images), we also remove the restriction on read-only mappings and permit the establishment of guard regions in any non-hugetlb, non-mlock()'d mapping.

It is permissible to allow the establishment of guard regions in read-only mappings because the guard regions only reduce access to the mapping, and when removed simply reinstate the existing attributes of the underlying VMA, meaning no access violations can occur.

While the change in kernel code introduced in this series is small, the majority of the effort here is spent in extending the testing to assert that the feature works correctly across numerous file-backed mapping scenarios.

Every single guard region self-test performed against anonymous memory (which is relevant and not anon-only) has now been updated to also be performed against shmem and a mapping of a file in the working directory.

This confirms that all cases also function correctly for file-backed guard regions.

In addition a number of other tests are added for specific file-backed mapping scenarios.
There are a number of other concerns that one might have with regard to guard regions, addressed below:

Readahead
~~~~~~~~~

Readahead is a process through which the page cache is populated on the assumption that sequential reads will occur, thus amortising I/O and, through a clever use of the PG_readahead folio flag established during major fault and checked upon minor fault, provides for asynchronous I/O to occur as data is processed, reducing I/O stalls as data is faulted in.

Guard regions do not alter this mechanism, which operates at the folio and fault level, but do of course prevent the faulting of folios that would otherwise be mapped.

In the instance of a major fault prior to a guard region, synchronous readahead will occur including populating folios in the page cache which the guard regions will, in the case of the mapping in question, prevent access to.

In addition, if PG_readahead is placed in a folio that is now inaccessible, this will prevent asynchronous readahead from occurring as it would otherwise do.

However, there are mechanisms for heuristically resetting this within readahead regardless, which will 'recover' correct readahead behaviour.

Readahead presumes sequential data access; the presence of a guard region clearly indicates that, at least in the guard region, no such sequential access will occur, as it cannot occur there.

So this should have very little impact on any real workload. The far more important point is as to whether readahead causes incorrect or inappropriate mapping of ranges disallowed by the presence of guard regions - this is not the case, as readahead does not 'pre-fault' memory in this fashion.

At any rate, any mechanism which would attempt to do so would hit the usual page fault paths, which correctly handle PTE markers as with anonymous mappings.

Fault-Around
~~~~~~~~~~~~

The fault-around logic, in a similar vein to readahead, attempts to improve efficiency with regard to file-backed memory mappings, however it differs in that it does not try to fetch folios into the page cache that are about to be accessed, but rather pre-maps a range of folios around the faulting address.

Guard regions making use of PTE markers makes this relatively trivial, as this case is already handled - see filemap_map_folio_range() and filemap_map_order0_folio() - in both instances, the solution is to simply keep the established page table mappings and let the fault handler take care of PTE markers, as per the comment:

	/*
	 * NOTE: If there're PTE markers, we'll leave them to be
	 * handled in the specific fault path, and it'll prohibit
	 * the fault-around logic.
	 */

This works, as establishing guard regions results in page table mappings with PTE markers, and clearing them removes them.

Truncation
~~~~~~~~~~

File truncation will not eliminate existing guard regions, as the truncation operation will ultimately zap the range via unmap_mapping_range(), which specifically excludes PTE markers.

Zapping
~~~~~~~

Zapping is, as with anonymous mappings, handled by zap_nonpresent_ptes(), which specifically deals with guard entries, leaving them intact except in instances such as process teardown or munmap() where they need to be removed.

Reclaim
~~~~~~~

When reclaim is performed on file-backed folios, it ultimately invokes try_to_unmap_one() via the rmap. If the folio is non-large, then map_pte() will ultimately abort the operation for the guard region mapping. If large, then check_pte() will determine that this is a non-device private entry/device-exclusive entry 'swap' PTE and thus abort the operation in that instance.

Therefore, no odd things happen in the instance of reclaim being attempted upon a file-backed guard region.

Hole Punching
~~~~~~~~~~~~~

This updates the page cache and ultimately invokes unmap_mapping_range(), which explicitly leaves PTE markers in place.

Because the establishment of guard regions zapped any existing mappings to file-backed folios, once the guard regions are removed then the hole-punched region will be faulted in as usual and everything will behave as expected.

Lorenzo Stoakes (4):
  mm: allow guard regions in file-backed and read-only mappings
  selftests/mm: rename guard-pages to guard-regions
  tools/selftests: expand all guard region tests to file-backed
  tools/selftests: add file/shmem-backed mapping guard region tests

 mm/madvise.c                                  |   8 +-
 tools/testing/selftests/mm/.gitignore         |   2 +-
 tools/testing/selftests/mm/Makefile           |   2 +-
 .../mm/{guard-pages.c => guard-regions.c}     | 921 ++++++++++++++++--
 4 files changed, 821 insertions(+), 112 deletions(-)
 rename tools/testing/selftests/mm/{guard-pages.c => guard-regions.c} (58%)

--
2.48.1
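As a rough userspace illustration of what the cover letter above enables, the sketch below installs and then removes a guard region in the middle page of a read-only, file-backed mapping. The helper name guard_file_demo() is hypothetical and is not taken from the selftests. The fallback values 102 and 103 for MADV_GUARD_INSTALL and MADV_GUARD_REMOVE are assumed to match the generic uapi header. On a kernel without this series the madvise() call is expected to fail with EINVAL for such a mapping, which the helper reports as a negative errno value.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values assumed from the generic uapi header. */
#ifndef MADV_GUARD_INSTALL
#define MADV_GUARD_INSTALL 102
#endif
#ifndef MADV_GUARD_REMOVE
#define MADV_GUARD_REMOVE 103
#endif

/*
 * Hypothetical helper: map a three-page temporary file read-only, guard the
 * middle page, then remove the guard again.
 * Returns 0 on success or a negative errno value.
 */
static int guard_file_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len = 3 * (size_t)page;
	FILE *f = tmpfile();
	char *map;
	int rc = 0;

	if (!f)
		return -errno;

	if (ftruncate(fileno(f), (off_t)len)) {
		rc = -errno;
		goto out_close;
	}

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(f), 0);
	if (map == MAP_FAILED) {
		rc = -errno;
		goto out_close;
	}

	/* While the guard is installed, touching the middle page raises SIGSEGV. */
	if (madvise(map + page, (size_t)page, MADV_GUARD_INSTALL))
		rc = -errno;
	/* Removing the guard reinstates the VMA's own (read-only) access. */
	else if (madvise(map + page, (size_t)page, MADV_GUARD_REMOVE))
		rc = -errno;

	munmap(map, len);
out_close:
	fclose(f);
	return rc;
}
```

A caller would treat a return of 0 as "guard regions work on this file-backed mapping" and -EINVAL as "this kernel does not support them here".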

2 months, 3 weeks


[PATCH v2 1/2] time/timekeeping: Fix possible inconsistencies in _COARSE clockids

by John Stultz

Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing time inconsistencies.

Lei tracked down that this was being caused by the adjustment

    tk->tkr_mono.xtime_nsec -= offset;

which is made to compensate for the unaccumulated cycles in offset when the mult value is adjusted forward, so that the non-_COARSE clockids don't see inconsistencies.

However, the _COARSE clockids don't use the mult*offset value in their calculations, so this subtraction can cause the _COARSE clock ids to jump back a bit.

Now, by design, this negative adjustment should be fine, because the logic run from timekeeping_adjust() is done after we accumulate approx mult*interval_cycles into xtime_nsec. The accumulated (mult*interval_cycles) will be larger than the (mult_adj*offset) value subtracted from xtime_nsec, and both operations are done together under the tk_core.lock, so the net change to xtime_nsec should always be positive.

However, do_adjtimex() calls into timekeeping_advance() as well, since we want to apply the ntp freq adjustment immediately. In this case, we don't return early when the offset is smaller than interval_cycles, so we don't end up accumulating any time into xtime_nsec. But we do go on to call timekeeping_adjust(), which modifies the mult value, and subtracts from xtime_nsec to correct for the new mult value.

Here, because we did not accumulate anything, we have a window where the _COARSE clockids, which don't utilize the mult*offset value, can see an inconsistency.

So to fix this, rework the timekeeping_advance() logic a bit so that when we are called from do_adjtimex(), we call timekeeping_forward(), to first accumulate the sub-interval time into xtime_nsec. Then with no unaccumulated cycles in offset, we can do the mult adjustment without worry of the subtraction having an impact.
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Stephen Boyd <sboyd(a)kernel.org>
Cc: Anna-Maria Behnsen <anna-maria(a)linutronix.de>
Cc: Frederic Weisbecker <frederic(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Miroslav Lichvar <mlichvar(a)redhat.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: kernel-team(a)android.com
Cc: Lei Chen <lei.chen(a)smartx.com>
Fixes: da15cfdae033 ("time: Introduce CLOCK_REALTIME_COARSE")
Reported-by: Lei Chen <lei.chen(a)smartx.com>
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
Diagnosed-by: Thomas Gleixner <tglx(a)linutronix.de>
Additional-fixes-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: John Stultz <jstultz(a)google.com>
---
v2: Include fixes from Thomas, dropping the unnecessary clock_set
    setting, and instead clearing ntp_error, along with some other
    minor tweaks.
---
 kernel/time/timekeeping.c | 94 ++++++++++++++++++++++++++++-----------
 1 file changed, 69 insertions(+), 25 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1e67d076f1955..929846b8b45ab 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -682,20 +682,19 @@ static void timekeeping_update_from_shadow(struct tk_data *tkd, unsigned int act
 }
 
 /**
- * timekeeping_forward_now - update clock to the current time
+ * timekeeping_forward - update clock to given cycle now value
  * @tk:		Pointer to the timekeeper to update
+ * @cycle_now:  Current clocksource read value
  *
  * Forward the current clock to update its state since the last call to
  * update_wall_time(). This is useful before significant clock changes,
  * as it avoids having to deal with this time offset explicitly.
  */
-static void timekeeping_forward_now(struct timekeeper *tk)
+static void timekeeping_forward(struct timekeeper *tk, u64 cycle_now)
 {
-	u64 cycle_now, delta;
+	u64 delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
+				      tk->tkr_mono.clock->max_raw_delta);
 
-	cycle_now = tk_clock_read(&tk->tkr_mono);
-	delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
-				  tk->tkr_mono.clock->max_raw_delta);
 	tk->tkr_mono.cycle_last = cycle_now;
 	tk->tkr_raw.cycle_last  = cycle_now;
 
@@ -710,6 +709,21 @@ static void timekeeping_forward_now(struct timekeeper *tk)
 	}
 }
 
+/**
+ * timekeeping_forward_now - update clock to the current time
+ * @tk:		Pointer to the timekeeper to update
+ *
+ * Forward the current clock to update its state since the last call to
+ * update_wall_time(). This is useful before significant clock changes,
+ * as it avoids having to deal with this time offset explicitly.
+ */
+static void timekeeping_forward_now(struct timekeeper *tk)
+{
+	u64 cycle_now = tk_clock_read(&tk->tkr_mono);
+
+	timekeeping_forward(tk, cycle_now);
+}
+
 /**
  * ktime_get_real_ts64 - Returns the time of day in a timespec64.
  * @ts:		pointer to the timespec to be set
@@ -2151,6 +2165,54 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
 	return offset;
 }
 
+static u64 timekeeping_accumulate(struct timekeeper *tk, u64 offset,
+				  enum timekeeping_adv_mode mode,
+				  unsigned int *clock_set)
+{
+	int shift = 0, maxshift;
+
+	/*
+	 * TK_ADV_FREQ indicates that adjtimex(2) directly set the
+	 * frequency or the tick length.
+	 *
+	 * Accumulate the offset, so that the new multiplier starts from
+	 * now. This is required as otherwise for offsets, which are
+	 * smaller than tk::cycle_interval, timekeeping_adjust() could set
+	 * xtime_nsec backwards, which subsequently causes time going
+	 * backwards in the coarse time getters. But even for the case
+	 * where offset is greater than tk::cycle_interval the periodic
+	 * accumulation does not have much value.
+	 *
+	 * Also reset tk::ntp_error as it does not make sense to keep the
+	 * old accumulated error around in this case.
+	 */
+	if (mode == TK_ADV_FREQ) {
+		timekeeping_forward(tk, tk->tkr_mono.cycle_last + offset);
+		tk->ntp_error = 0;
+		return 0;
+	}
+
+	/*
+	 * With NO_HZ we may have to accumulate many cycle_intervals
+	 * (think "ticks") worth of time at once. To do this efficiently,
+	 * we calculate the largest doubling multiple of cycle_intervals
+	 * that is smaller than the offset. We then accumulate that
+	 * chunk in one go, and then try to consume the next smaller
+	 * doubled multiple.
+	 */
+	shift = ilog2(offset) - ilog2(tk->cycle_interval);
+	shift = max(0, shift);
+	/* Bound shift to one less than what overflows tick_length */
+	maxshift = (64 - (ilog2(ntp_tick_length()) + 1)) - 1;
+	shift = min(shift, maxshift);
+	while (offset >= tk->cycle_interval) {
+		offset = logarithmic_accumulation(tk, offset, shift, clock_set);
+		if (offset < tk->cycle_interval << shift)
+			shift--;
+	}
+	return offset;
+}
+
 /*
  * timekeeping_advance - Updates the timekeeper to the current time and
  * current NTP tick length
@@ -2160,7 +2222,6 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
 	struct timekeeper *tk = &tk_core.shadow_timekeeper;
 	struct timekeeper *real_tk = &tk_core.timekeeper;
 	unsigned int clock_set = 0;
-	int shift = 0, maxshift;
 	u64 offset;
 
 	guard(raw_spinlock_irqsave)(&tk_core.lock);
@@ -2177,24 +2238,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
 	if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
 		return false;
 
-	/*
-	 * With NO_HZ we may have to accumulate many cycle_intervals
-	 * (think "ticks") worth of time at once. To do this efficiently,
-	 * we calculate the largest doubling multiple of cycle_intervals
-	 * that is smaller than the offset. We then accumulate that
-	 * chunk in one go, and then try to consume the next smaller
-	 * doubled multiple.
-	 */
-	shift = ilog2(offset) - ilog2(tk->cycle_interval);
-	shift = max(0, shift);
-	/* Bound shift to one less than what overflows tick_length */
-	maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
-	shift = min(shift, maxshift);
-	while (offset >= tk->cycle_interval) {
-		offset = logarithmic_accumulation(tk, offset, shift, &clock_set);
-		if (offset < tk->cycle_interval<<shift)
-			shift--;
-	}
+	offset = timekeeping_accumulate(tk, offset, mode, &clock_set);
 
 	/* Adjust the multiplier to correct NTP error */
 	timekeeping_adjust(tk, offset);
--
2.49.0.395.g12beb8f557-goog
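The ordering problem described in the commit message can be shown with a toy userspace model. This is a sketch only, not kernel code: struct toy_tk and all toy_* helpers are hypothetical names, and the real timekeeper uses shifted nanoseconds and more state. The "coarse" read returns xtime_nsec alone, while the "fine" read adds mult times the unaccumulated cycles.

```c
#include <stdint.h>

/* Toy model: field names echo the patch, the arithmetic is simplified. */
struct toy_tk {
	uint64_t xtime_nsec;	/* accumulated time; all a _COARSE read sees */
	uint64_t cycle_last;	/* clocksource value at the last accumulation */
	uint32_t mult;		/* cycles -> time multiplier */
};

static uint64_t toy_read_coarse(const struct toy_tk *tk)
{
	return tk->xtime_nsec;
}

static uint64_t toy_read_fine(const struct toy_tk *tk, uint64_t cycle_now)
{
	return tk->xtime_nsec + (uint64_t)tk->mult * (cycle_now - tk->cycle_last);
}

/* Change mult while 'offset' cycles are still unaccumulated. */
static void toy_adjust(struct toy_tk *tk, int32_t mult_adj, uint64_t offset)
{
	tk->mult += mult_adj;
	/* Keeps the fine-grained clock continuous across the mult change. */
	tk->xtime_nsec -= (uint64_t)((int64_t)mult_adj * (int64_t)offset);
}

/* Fold the unaccumulated cycles into xtime_nsec first. */
static void toy_forward(struct toy_tk *tk, uint64_t cycle_now)
{
	tk->xtime_nsec += (uint64_t)tk->mult * (cycle_now - tk->cycle_last);
	tk->cycle_last = cycle_now;
}

/* Old ordering: adjust with the offset still pending. Returns the coarse delta. */
static int64_t toy_coarse_delta_old(struct toy_tk tk, uint64_t cycle_now, int32_t mult_adj)
{
	uint64_t before = toy_read_coarse(&tk);

	toy_adjust(&tk, mult_adj, cycle_now - tk.cycle_last);
	return (int64_t)(toy_read_coarse(&tk) - before);
}

/* New ordering: forward first, then adjust with nothing pending. */
static int64_t toy_coarse_delta_new(struct toy_tk tk, uint64_t cycle_now, int32_t mult_adj)
{
	uint64_t before = toy_read_coarse(&tk);

	toy_forward(&tk, cycle_now);
	toy_adjust(&tk, mult_adj, 0);
	return (int64_t)(toy_read_coarse(&tk) - before);
}
```

With xtime_nsec = 1000000, cycle_last = 100, mult = 10, a read at cycle 150 and mult_adj = +2, the old ordering moves the coarse value by -100 (backwards), while the new ordering moves it by +500. In both cases the fine-grained read at cycle 150 stays at 1000500.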

2 months, 3 weeks


selftests: cgroup: Failures – Timeouts & OOM Issues Analysis

by Naresh Kamboju

As part of LKFT’s re-validation of known issues, we have observed that the selftests: cgroup suite is consistently failing across almost all LKFT-supported devices due to:
- Test timeouts (45 seconds limit reached)
- OOM-killer invocation

## Key Questions for Discussion:
- Would it be beneficial to increase the test timeout to ~180 seconds to allow sufficient execution time?
- Should we enhance logging to explicitly print failure reasons when a test fails?
- Are there any missing dependencies that could be causing these failures?

Note: The required selftests/cgroup/config options were included in LKFT's build and test plans.

## Devices Affected:
The following DUTs consistently experience these failures:
- dragonboard-410c (arm64)
- dragonboard-845c (arm64)
- e850-96 (arm64)
- juno-r2 (arm64)
- qemu-arm64 (arm64)
- qemu-armv7
- qemu-x86_64
- rk3399-rock-pi-4b (arm64)
- x15 (arm)
- x86_64

Regression Analysis:
- New regression? No (these failures have been observed for months/years).
- Reproducibility? Yes, the failures occur consistently.
- Test suite affected? selftests: cgroup (timeouts and OOM-related failures).

Test regression: selftests cgroup fails timeout and oom-killer

Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>

## Test log:
# selftests: cgroup: test_cpu
# ok 1 test_cpucg_subtree_control
# ok 2 test_cpucg_stats
# ok 3 test_cpucg_nice
# not ok 4 test_cpucg_weight_overprovisioned
# ok 5 test_cpucg_weight_underprovisioned
# ok 6 test_cpucg_nested_weight_overprovisioned
# ok 7 test_cpucg_nested_weight_underprovisioned
# not ok 2 selftests: cgroup: test_cpu # TIMEOUT 45 seconds

<trim>

# selftests: cgroup: test_freezer
# ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# Cgroup /sys/fs/cgroup/cg_test_ptrace isn't frozen
# not ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork
not ok 4 selftests: cgroup: test_freezer # exit=1

<trim>

selftests: cgroup: test_kmem
# not ok 7 selftests: cgroup: test_kmem # TIMEOUT 45 seconds

<trim>

# selftests: cgroup: test_memcontrol
# ok 1 test_memcg_subtree_control
# not ok 2 test_memcg_current_peak
# not ok 3 test_memcg_min
# not ok 4 test_memcg_low
# not ok 5 test_memcg_high
# ok 6 test_memcg_high_sync
[ 270.699078] test_memcontrol invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[ 270.699921] CPU: 1 UID: 0 PID: 946 Comm: test_memcontrol Not tainted 6.14.0-rc5-next-20250303 #1
[ 270.699930] Hardware name: Radxa ROCK Pi 4B (DT)
<trim>
[ 270.729527] Memory cgroup out of memory: Killed process 946 (test_memcontrol) total-vm:104840kB, anon-rss:30596kB, file-rss:1056kB, shmem-rss:0kB, UID:0 pgtables:104kB oom_score_adj:0
# not ok 7 test_memcg_max
# not ok 8 test_memcg_reclaim
<trim>
not ok 8 selftests: cgroup: test_memcontrol # exit=1

## Source
* Kernel version: 6.14.0-rc5-next-20250303
* Git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: cd3215bbcb9d4321def93fea6cfad4d5b42b9d1d
* Git describe: 6.14.0-rc5-next-20250303
* Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/

## Test data
* Test log: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test logs rock pi: https://lkft.validation.linaro.org/scheduler/job/8148789#L1774
* Test logs x86: https://lkft.validation.linaro.org/scheduler/job/8148731#L1948

--
Linaro LKFT
https://lkft.linaro.org

2 months, 4 weeks


[PATCH] kunit: cs_dsp: Depend on FW_CS_DSP rather then enabling it

by Nico Pache

FW_CS_DSP gets enabled if KUNIT is enabled. The test should instead depend on whether the feature is enabled. Fix this by moving FW_CS_DSP to the "depends on" clause, and set CONFIG_FW_CS_DSP=y in the kunit tooling.

Fixes: dd0b6b1f29b9 ("firmware: cs_dsp: Add KUnit testing of bin file download")
Signed-off-by: Nico Pache <npache(a)redhat.com>
---
 drivers/firmware/cirrus/Kconfig              | 3 +--
 tools/testing/kunit/configs/all_tests.config | 2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/cirrus/Kconfig b/drivers/firmware/cirrus/Kconfig
index 0a883091259a..989568ab5712 100644
--- a/drivers/firmware/cirrus/Kconfig
+++ b/drivers/firmware/cirrus/Kconfig
@@ -11,9 +11,8 @@ config FW_CS_DSP_KUNIT_TEST_UTILS
 
 config FW_CS_DSP_KUNIT_TEST
 	tristate "KUnit tests for Cirrus Logic cs_dsp" if !KUNIT_ALL_TESTS
-	depends on KUNIT && REGMAP
+	depends on KUNIT && REGMAP && FW_CS_DSP
 	default KUNIT_ALL_TESTS
-	select FW_CS_DSP
 	select FW_CS_DSP_KUNIT_TEST_UTILS
 	help
 	  This builds KUnit tests for cs_dsp.
diff --git a/tools/testing/kunit/configs/all_tests.config b/tools/testing/kunit/configs/all_tests.config
index b0049be00c70..96c6b4aca87d 100644
--- a/tools/testing/kunit/configs/all_tests.config
+++ b/tools/testing/kunit/configs/all_tests.config
@@ -49,3 +49,5 @@ CONFIG_SOUND=y
 CONFIG_SND=y
 CONFIG_SND_SOC=y
 CONFIG_SND_SOC_TOPOLOGY_BUILD=y
+
+CONFIG_FW_CS_DSP=y
\ No newline at end of file
--
2.48.1

3 months


[PATCH 0/4] sysctl: Move the u8 range check test to lib/test_sysctl.c

by Joel Granados

Originally introduced to sysctl-test.c by commit b5ffbd139688 ("sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array"), it has been shown to lead to a panic under certain conditions related to a dangling registration. This series moves the u8 test to lib/test_sysctl.c where the registration calls are kept and correctly removed on module exit. An additional 0012 test is added to selftests/sysctl/sysctl.sh in order to visualize the registration calls done in test_sysctl.c.

Very much related to adding tests to sysctl, the last two patches of this series reduce the places that need to be changed when tests are added by managing the initialization and closing of sysctl tables with a for loop.

Comments are greatly appreciated.

Signed-off-by: Joel Granados <joel.granados(a)kernel.org>
---
Joel Granados (4):
      sysctl: move u8 register test to lib/test_sysctl.c
      sysctl: Add 0012 to test the u8 range check
      sysctl: call sysctl tests with a for loop
      sysctl: Close test ctl_headers with a for loop

 kernel/sysctl-test.c                     |  49 ------------
 lib/test_sysctl.c                        | 133 +++++++++++++++++++++----------
 tools/testing/selftests/sysctl/sysctl.sh |  30 +++++++
 3 files changed, 122 insertions(+), 90 deletions(-)
---
base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
change-id: 20250321-jag-test_extra_val-40954050a1f6

Best regards,
--
Joel Granados <joel.granados(a)kernel.org>

3 months


[PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes

by Michal Luczaj

This series takes care of a few bugs and missing features, with the aim of improving the test coverage of sockmap/sockhash.

The last patch is a create_pair() rewrite making use of __attribute__((cleanup)) to handle socket fd lifetime.

Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Changes in v2:
- Rebase on bpf-next (Jakub)
- Use cleanup helpers from kernel's cleanup.h (Jakub)
- Fix subject of patch 3, rephrase patch 4, use correct prefix
- Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712…

Changes in v1:
- No declarations in function body (Jakub)
- Don't touch output arguments until function succeeds (Jakub)
- Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/

---
Michal Luczaj (6):
      selftests/bpf: Support more socket types in create_pair()
      selftests/bpf: Socket pair creation, cleanups
      selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible()
      selftests/bpf: Honour the sotype of af_unix redir tests
      selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected()
      selftests/bpf: Introduce __attribute__((cleanup)) in create_pair()

 .../selftests/bpf/prog_tests/sockmap_basic.c       |  28 ++--
 .../selftests/bpf/prog_tests/sockmap_helpers.h     | 149 ++++++++++++++-------
 .../selftests/bpf/prog_tests/sockmap_listen.c      | 117 ++--------------
 3 files changed, 124 insertions(+), 170 deletions(-)
---
base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9
change-id: 20240729-selftest-sockmap-fixes-bcca996e143b

Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
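For readers unfamiliar with the technique named in the last patch, the following is a minimal userspace sketch of scope-based fd cleanup with __attribute__((cleanup)). It is not the series' create_pair(): make_pair(), close_fd(), AUTO_CLOSE and the simulate_failure parameter are hypothetical names introduced only for illustration. It also follows the v1 review note of not touching the output arguments until the function succeeds.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Cleanup handler: runs when the annotated variable goes out of scope. */
static void close_fd(int *fd)
{
	if (*fd >= 0)
		close(*fd);
}

#define AUTO_CLOSE __attribute__((cleanup(close_fd)))

/*
 * Hypothetical helper: build a connected AF_UNIX pair and hand both fds to
 * the caller only on success. 'simulate_failure' forces the error path so
 * that the automatic cleanup can be observed.
 */
static int make_pair(int sotype, int simulate_failure, int *out0, int *out1)
{
	AUTO_CLOSE int s0 = -1;
	AUTO_CLOSE int s1 = -1;
	int sv[2];

	if (socketpair(AF_UNIX, sotype, 0, sv))
		return -1;
	s0 = sv[0];
	s1 = sv[1];

	if (simulate_failure)
		return -1;	/* close_fd() closes s0 and s1 on the way out */

	/* Success: transfer ownership, then disarm the cleanup handlers. */
	*out0 = s0;
	*out1 = s1;
	s0 = s1 = -1;
	return 0;
}
```

The design point is that every early return after the sockets exist releases them without an explicit close() per error path; only the success path has to opt out by resetting the locals to -1.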

3 months


[PATCH bpf-next v2 0/2] bpf: fix ktls panic with sockmap and add tests

by Jiayuan Chen

We can reproduce the issue using the existing test program:
'./test_sockmap --ktls'

Or use the selftest I provided, which will cause a panic:

------------[ cut here ]------------
kernel BUG at lib/iov_iter.c:629!
PKRU: 55555554
Call Trace:
 <TASK>
 ? die+0x36/0x90
 ? do_trap+0xdd/0x100
 ? iov_iter_revert+0x178/0x180
 ? iov_iter_revert+0x178/0x180
 ? do_error_trap+0x7d/0x110
 ? iov_iter_revert+0x178/0x180
 ? exc_invalid_op+0x50/0x70
 ? iov_iter_revert+0x178/0x180
 ? asm_exc_invalid_op+0x1a/0x20
 ? iov_iter_revert+0x178/0x180
 ? iov_iter_revert+0x5c/0x180
 tls_sw_sendmsg_locked.isra.0+0x794/0x840
 tls_sw_sendmsg+0x52/0x80
 ? inet_sendmsg+0x1f/0x70
 __sys_sendto+0x1cd/0x200
 ? find_held_lock+0x2b/0x80
 ? syscall_trace_enter+0x140/0x270
 ? __lock_release.isra.0+0x5e/0x170
 ? find_held_lock+0x2b/0x80
 ? syscall_trace_enter+0x140/0x270
 ? lockdep_hardirqs_on_prepare+0xda/0x190
 ? ktime_get_coarse_real_ts64+0xc2/0xd0
 __x64_sys_sendto+0x24/0x30
 do_syscall_64+0x90/0x170

1. It looks like the issue started occurring after bpf was introduced to ktls, and the later addition of assertions to iov_iter then turned it into a panic. If my fix tag is incorrect, please assist me in correcting it.

2. I have made minimal changes for now; they are enough to make ktls work correctly.

---
v1->v2: Added more content to the commit message
https://lore.kernel.org/all/20250123171552.57345-1-mrpre@163.com/#r
---

Jiayuan Chen (2):
  bpf: fix ktls panic with sockmap
  selftests/bpf: add ktls selftest

 net/tls/tls_sw.c                              |   8 +-
 .../selftests/bpf/prog_tests/sockmap_ktls.c   | 174 +++++++++++++++++-
 .../selftests/bpf/progs/test_sockmap_ktls.c   |  26 +++
 3 files changed, 205 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_sockmap_ktls.c

--
2.47.1

3 months


[PATCH] selftests/bpf: close the file descriptor to avoid resource leaks

by Malaya Kumar Rout

Static analysis for bench_htab_mem.c with cppcheck:error
tools/testing/selftests/bpf/benchs/bench_htab_mem.c:284:3: error: Resource leak: fd [resourceLeak]
tools/testing/selftests/bpf/prog_tests/sk_assign.c:41:3: error: Resource leak: tc [resourceLeak]

Fix the issue by closing the file descriptors (fd and tc) when the read() and fgets() operations fail.

Signed-off-by: Malaya Kumar Rout <malayarout91(a)gmail.com>
---
 tools/testing/selftests/bpf/benchs/bench_htab_mem.c | 1 +
 tools/testing/selftests/bpf/prog_tests/sk_assign.c  | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/benchs/bench_htab_mem.c b/tools/testing/selftests/bpf/benchs/bench_htab_mem.c
index 926ee822143e..59746fd2c23a 100644
--- a/tools/testing/selftests/bpf/benchs/bench_htab_mem.c
+++ b/tools/testing/selftests/bpf/benchs/bench_htab_mem.c
@@ -281,6 +281,7 @@ static void htab_mem_read_mem_cgrp_file(const char *name, unsigned long *value)
 	got = read(fd, buf, sizeof(buf) - 1);
 	if (got <= 0) {
 		*value = 0;
+		close(fd);
 		return;
 	}
 	buf[got] = 0;
diff --git a/tools/testing/selftests/bpf/prog_tests/sk_assign.c b/tools/testing/selftests/bpf/prog_tests/sk_assign.c
index 0b9bd1d6f7cc..10a0ab954b8a 100644
--- a/tools/testing/selftests/bpf/prog_tests/sk_assign.c
+++ b/tools/testing/selftests/bpf/prog_tests/sk_assign.c
@@ -37,8 +37,10 @@ configure_stack(void)
 	tc = popen("tc -V", "r");
 	if (CHECK_FAIL(!tc))
 		return false;
-	if (CHECK_FAIL(!fgets(tc_version, sizeof(tc_version), tc)))
+	if (CHECK_FAIL(!fgets(tc_version, sizeof(tc_version), tc))) {
+		pclose(tc);
 		return false;
+	}
 	if (strstr(tc_version, ", libbpf "))
 		prog = "test_sk_assign_libbpf.bpf.o";
 	else
--
2.43.0

3 months


[PATCH] selftests/x86/lam: fix memory leak and resource leak in lam.c

by Malaya Kumar Rout

Static analysis for lam.c with cppcheck:error
tools/testing/selftests/x86/lam.c:585:3: error: Resource leak: file_fd [resourceLeak]
tools/testing/selftests/x86/lam.c:593:3: error: Resource leak: file_fd [resourceLeak]
tools/testing/selftests/x86/lam.c:600:3: error: Memory leak: fi [memleak]
tools/testing/selftests/x86/lam.c:1066:2: error: Resource leak: fd [resourceLeak]

Fix the issue by closing the file descriptors and releasing the allocated memory.

Signed-off-by: Malaya Kumar Rout <malayarout91(a)gmail.com>
---
 tools/testing/selftests/x86/lam.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c
index 4d4a76532dc9..0b43b83ad142 100644
--- a/tools/testing/selftests/x86/lam.c
+++ b/tools/testing/selftests/x86/lam.c
@@ -581,24 +581,28 @@ int do_uring(unsigned long lam)
 	if (file_fd < 0)
 		return 1;
 
-	if (fstat(file_fd, &st) < 0)
+	if (fstat(file_fd, &st) < 0) {
+		close(file_fd);
 		return 1;
-
+	}
 	off_t file_sz = st.st_size;
 
 	int blocks = (int)(file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ;
 
 	fi = malloc(sizeof(*fi) + sizeof(struct iovec) * blocks);
-	if (!fi)
+	if (!fi) {
+		close(file_fd);
 		return 1;
-
+	}
 	fi->file_sz = file_sz;
 	fi->file_fd = file_fd;
 
 	ring = malloc(sizeof(*ring));
-	if (!ring)
+	if (!ring) {
+		close(file_fd);
+		free(fi);
 		return 1;
-
+	}
 	memset(ring, 0, sizeof(struct io_ring));
 
 	if (setup_io_uring(ring))
@@ -1060,8 +1064,10 @@ void *allocate_dsa_pasid(void)
 
 	wq = mmap(NULL, 0x1000, PROT_WRITE,
 		  MAP_SHARED | MAP_POPULATE, fd, 0);
-	if (wq == MAP_FAILED)
+	if (wq == MAP_FAILED) {
+		close(fd);
 		perror("mmap");
+	}
 
 	return wq;
 }
--
2.43.0
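The patch above releases resources separately on each early return. One common alternative, shown here only as a sketch and not as what the patch does, is to funnel every failure through a single unwind label so that each resource is released in exactly one place. count_blocks() and BLOCK_SZ are hypothetical names, and the value 4096 is an assumption standing in for URING_BLOCK_SZ.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Illustrative stand-in for URING_BLOCK_SZ; the value is an assumption. */
#define BLOCK_SZ 4096

/*
 * Hypothetical helper (not from lam.c): compute how many blocks 'path'
 * needs, allocating a scratch iovec array along the way.
 * Returns 0 on success, 1 on failure.
 */
static int count_blocks(const char *path, int *blocks_out)
{
	struct iovec *iov;
	struct stat st;
	int ret = 1;
	int blocks;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 1;		/* nothing acquired yet */

	if (fstat(fd, &st) < 0)
		goto out_close;

	blocks = (int)((st.st_size + BLOCK_SZ - 1) / BLOCK_SZ);

	iov = malloc(sizeof(*iov) * (size_t)(blocks ? blocks : 1));
	if (!iov)
		goto out_close;

	*blocks_out = blocks;
	ret = 0;

	free(iov);
out_close:
	close(fd);			/* single place where the fd is released */
	return ret;
}
```

For a 5000-byte file this reports 2 blocks; for a path that cannot be opened it returns 1 without touching blocks_out.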

3 months


[PATCH 1/1] selftests/mincore: Allow read-ahead pages to reach the end of the file

by Qiuxu Zhuo

When running the mincore_selftest on a system with an XFS file system, it failed the "check_file_mmap" test case due to the read-ahead pages reaching the end of the file. The failure log is as below:

   RUN           global.check_file_mmap ...
  mincore_selftest.c:264:check_file_mmap:Expected i (1024) < vec_size (1024)
  mincore_selftest.c:265:check_file_mmap:Read-ahead pages reached the end of the file
  check_file_mmap: Test failed
           FAIL  global.check_file_mmap

This is because the read-ahead window size of the XFS file system on this machine is 4 MB, which is larger than the size from the #PF address to the end of the file. As a result, all the pages for this file are populated.

  blockdev --getra /dev/nvme0n1p5
    8192
  blockdev --getbsz /dev/nvme0n1p5
    512

This issue can be fixed by extending the current FILE_SIZE of 4 MB to a larger number, but it will still fail if the read-ahead window size of the file system is large enough. Additionally, in the real world, read-ahead pages reaching the end of the file can happen and is expected behavior. Therefore, allowing read-ahead pages to reach the end of the file is a better choice for the "check_file_mmap" test case.

Reported-by: Yi Lai <yi1.lai(a)intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo(a)intel.com>
---
 tools/testing/selftests/mincore/mincore_selftest.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c
index e949a43a6145..efabfcbe0b49 100644
--- a/tools/testing/selftests/mincore/mincore_selftest.c
+++ b/tools/testing/selftests/mincore/mincore_selftest.c
@@ -261,9 +261,6 @@ TEST(check_file_mmap)
 		TH_LOG("No read-ahead pages found in memory");
 	}
 
-	EXPECT_LT(i, vec_size) {
-		TH_LOG("Read-ahead pages reached the end of the file");
-	}
 	/*
 	 * End of the readahead window. The rest of the pages shouldn't
 	 * be in memory.
--
2.17.1
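The behaviour discussed above can be observed from userspace with mincore(). The sketch below is not the selftest itself: resident_after_first_touch() is a hypothetical helper that maps a file, faults in only the first page, and counts how many pages the kernel then reports as resident. How many pages beyond the first are resident depends on the file system and its read-ahead window, so the only safe expectation is a count between 1 and the number of mapped pages, possibly all of them.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map 'npages' pages of 'fd', touch the first page,
 * and return how many pages mincore() reports as resident, or -1 on error.
 */
static long resident_after_first_touch(int fd, size_t npages)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = npages * page_size;
	unsigned char *vec;
	volatile char sink;
	long resident = 0;
	char *map;
	size_t i;

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	vec = calloc(npages, 1);
	if (!vec) {
		munmap(map, len);
		return -1;
	}

	sink = map[0];	/* the page fault that may trigger read-ahead */
	(void)sink;

	if (mincore(map, len, vec) == 0) {
		/* Bit 0 of each vector entry marks a resident page. */
		for (i = 0; i < npages; i++)
			resident += vec[i] & 1;
	} else {
		resident = -1;
	}

	free(vec);
	munmap(map, len);
	return resident;
}
```

On the machine from the report, a read-ahead setting of 8192 sectors of 512 bytes gives 8192 * 512 = 4 MiB, which matches the 4 MB window that covered the whole test file.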

3 months

