From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit b8e188f023e07a733b47d5865311ade51878fe40 ]
The assumption of 'in privileged mode reads from uninitialized stack locations are permitted' is not quite correct since the verifier was probing for read access rather than write access. Both tests need to be annotated as __success for privileged and unprivileged.
Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20240913191754.13290-6-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/progs/verifier_int_ptr.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/verifier_int_ptr.c b/tools/testing/selftests/bpf/progs/verifier_int_ptr.c
index 589e8270de462..d873da71f1436 100644
--- a/tools/testing/selftests/bpf/progs/verifier_int_ptr.c
+++ b/tools/testing/selftests/bpf/progs/verifier_int_ptr.c
@@ -8,7 +8,6 @@
 SEC("socket")
 __description("ARG_PTR_TO_LONG uninitialized")
 __success
-__failure_unpriv __msg_unpriv("invalid indirect read from stack R4 off -16+0 size 8")
 __naked void arg_ptr_to_long_uninitialized(void)
 {
 	asm volatile ("					\
@@ -36,9 +35,7 @@ __naked void arg_ptr_to_long_uninitialized(void)

 SEC("socket")
 __description("ARG_PTR_TO_LONG half-uninitialized")
-/* in privileged mode reads from uninitialized stack locations are permitted */
-__success __failure_unpriv
-__msg_unpriv("invalid indirect read from stack R4 off -16+4 size 8")
+__success
 __retval(0)
 __naked void ptr_to_long_half_uninitialized(void)
 {
From: Tao Chen chen.dylane@gmail.com
[ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ]
Percpu maps are often used, but the map value size limit is often ignored, as in issue: https://github.com/iovisor/bcc/issues/2519. The percpu map value size is bounded by PCPU_MIN_UNIT_SIZE, so we can first check whether the value size exceeds PCPU_MIN_UNIT_SIZE, as the percpu map of local_storage already does. The resulting error should be clearer than "cannot allocate memory".
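As an illustration of the check described above, here is a minimal userspace sketch. The names prefixed with `sketch_` and the 32 KiB limit are assumptions made for this example; in the kernel the limit comes from PCPU_MIN_UNIT_SIZE and the rounding from round_up().

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed limit for illustration only; the kernel uses PCPU_MIN_UNIT_SIZE. */
#define SKETCH_PCPU_MIN_UNIT_SIZE (32u * 1024u)

/* Round x up to the next multiple of align (align must be a power of two). */
static size_t sketch_round_up(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Returns true when a percpu map with this value size would be rejected. */
static bool sketch_percpu_value_too_big(size_t value_size)
{
	return sketch_round_up(value_size, 8) > SKETCH_PCPU_MIN_UNIT_SIZE;
}
```

Under these assumptions a value size equal to the limit is accepted, while one byte more is rounded up to the next multiple of 8 and rejected up front instead of failing later in the allocator.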
Signed-off-by: Jinke Han jinkehan@didiglobal.com Signed-off-by: Tao Chen chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Jiri Olsa jolsa@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/arraymap.c | 3 +++ kernel/bpf/hashtab.c | 3 +++ 2 files changed, 6 insertions(+)
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index c9843dde69081..1811efcfbd6e3 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -73,6 +73,9 @@ int array_map_alloc_check(union bpf_attr *attr)
 	/* avoid overflow on round_up(map->value_size) */
 	if (attr->value_size > INT_MAX)
 		return -E2BIG;
+	/* percpu map value size is bound by PCPU_MIN_UNIT_SIZE */
+	if (percpu && round_up(attr->value_size, 8) > PCPU_MIN_UNIT_SIZE)
+		return -E2BIG;

 	return 0;
 }
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 85cd17ca38290..7c64ad4f3732b 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -458,6 +458,9 @@ static int htab_map_alloc_check(union bpf_attr *attr)
 		 * kmalloc-able later in htab_map_update_elem()
 		 */
 		return -E2BIG;
+	/* percpu map value size is bound by PCPU_MIN_UNIT_SIZE */
+	if (percpu && round_up(attr->value_size, 8) > PCPU_MIN_UNIT_SIZE)
+		return -E2BIG;

 	return 0;
 }
From: Kuan-Wei Chiu visitorckw@gmail.com
[ Upstream commit f04e2ad394e2755d0bb2d858ecb5598718bf00d5 ]
When netfilter has no entry to display, qsort is called with qsort(NULL, 0, ...). This results in undefined behavior, as UBSan reports:
net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
Although the C standard does not explicitly state whether calling qsort with a NULL pointer when the size is 0 constitutes undefined behavior, Section 7.1.4 of the C standard (Use of library functions) mentions:
"Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined."
To avoid this, add an early return when nf_link_info is NULL to prevent calling qsort with a NULL pointer.
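The early-return pattern can be sketched in plain C as follows. The helper names are hypothetical and the element type is simplified to int; only qsort() itself is a real library call.

```c
#include <stdlib.h>

/* Three-way comparison of two ints without risking overflow. */
static int sketch_cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Sort only when there is an array to sort; returns the number of items sorted. */
static size_t sketch_sort_ints(int *items, size_t count)
{
	if (!items)
		return 0;	/* never hand a NULL base pointer to qsort() */
	qsort(items, count, sizeof(*items), sketch_cmp_int);
	return count;
}
```

With the guard in place, an empty result set never reaches qsort(), so the library's non-null expectation on its first argument is respected.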
Signed-off-by: Kuan-Wei Chiu visitorckw@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Reviewed-by: Quentin Monnet qmo@kernel.org Link: https://lore.kernel.org/bpf/20240910150207.3179306-1-visitorckw@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/net.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 66a8ce8ae0127..bd4e66d514f14 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -819,6 +819,9 @@ static void show_link_netfilter(void)
 		nf_link_count++;
 	}

+	if (!nf_link_info)
+		return;
+
 	qsort(nf_link_info, nf_link_count, sizeof(*nf_link_info), netfilter_link_compar);

 	for (id = 0; id < nf_link_count; id++) {
From: Kuan-Wei Chiu visitorckw@gmail.com
[ Upstream commit 4cdc0e4ce5e893bc92255f5f734d983012f2bc2e ]
Replace shifts of '1' with '1U' in bitwise operations within __show_dev_tc_bpf() to prevent undefined behavior caused by shifting into the sign bit of a signed integer. By using '1U', the operations are explicitly performed on unsigned integers, avoiding potential integer overflow or sign-related issues.
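A small sketch of the same loop shape, assuming a 32-bit unsigned int. The function name and the output array are hypothetical; the point is only that `1U << j` stays well defined for j == 31, whereas `1 << 31` shifts into the sign bit of a signed int.

```c
#include <stdint.h>

/* Collect the set bits of flags as individual masks; returns how many were found. */
static unsigned int sketch_split_flags(uint32_t flags, uint32_t out[32])
{
	unsigned int n = 0;

	for (unsigned int j = 0; flags && j < 32; j++) {
		/* 1U keeps the shift in unsigned arithmetic for every j below 32. */
		if (!(flags & (1U << j)))
			continue;
		out[n++] = 1U << j;
	}
	return n;
}
```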
Signed-off-by: Kuan-Wei Chiu visitorckw@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Quentin Monnet qmo@kernel.org Link: https://lore.kernel.org/bpf/20240908140009.3149781-1-visitorckw@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/net.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index bd4e66d514f14..28e9417a5c2e3 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -480,9 +480,9 @@ static void __show_dev_tc_bpf(const struct ip_devname_ifindex *dev,
 		if (prog_flags[i] || json_output) {
 			NET_START_ARRAY("prog_flags", "%s ");
 			for (j = 0; prog_flags[i] && j < 32; j++) {
-				if (!(prog_flags[i] & (1 << j)))
+				if (!(prog_flags[i] & (1U << j)))
 					continue;
-				NET_DUMP_UINT_ONLY(1 << j);
+				NET_DUMP_UINT_ONLY(1U << j);
 			}
 			NET_END_ARRAY("");
 		}
@@ -491,9 +491,9 @@ static void __show_dev_tc_bpf(const struct ip_devname_ifindex *dev,
 		if (link_flags[i] || json_output) {
 			NET_START_ARRAY("link_flags", "%s ");
 			for (j = 0; link_flags[i] && j < 32; j++) {
-				if (!(link_flags[i] & (1 << j)))
+				if (!(link_flags[i] & (1U << j)))
 					continue;
-				NET_DUMP_UINT_ONLY(1 << j);
+				NET_DUMP_UINT_ONLY(1U << j);
 			}
 			NET_END_ARRAY("");
 		}
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit fccb175bc89a0d37e3ff513bb6bf1f73b3a48950 ]
Only a couple of files of the decompressor are compiled with the minimum architecture level. This is problematic for potential function calls between compile units, especially if a target function is within a compile unit compiled for a higher architecture level, since that may lead to an unexpected operation exception.
Therefore compile all files of the decompressor for the same (minimum) architecture level.
Reviewed-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/boot/Makefile | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-)
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile index c7c81e5f92189..e4def3a6c6cca 100644 --- a/arch/s390/boot/Makefile +++ b/arch/s390/boot/Makefile @@ -9,11 +9,8 @@ UBSAN_SANITIZE := n KASAN_SANITIZE := n KCSAN_SANITIZE := n
-KBUILD_AFLAGS := $(KBUILD_AFLAGS_DECOMPRESSOR) -KBUILD_CFLAGS := $(KBUILD_CFLAGS_DECOMPRESSOR) - # -# Use minimum architecture for als.c to be able to print an error +# Use minimum architecture level so it is possible to print an error # message if the kernel is started on a machine which is too old # ifndef CONFIG_CC_IS_CLANG @@ -22,16 +19,10 @@ else CC_FLAGS_MARCH_MINIMUM := -march=z10 endif
-ifneq ($(CC_FLAGS_MARCH),$(CC_FLAGS_MARCH_MINIMUM)) -AFLAGS_REMOVE_head.o += $(CC_FLAGS_MARCH) -AFLAGS_head.o += $(CC_FLAGS_MARCH_MINIMUM) -AFLAGS_REMOVE_mem.o += $(CC_FLAGS_MARCH) -AFLAGS_mem.o += $(CC_FLAGS_MARCH_MINIMUM) -CFLAGS_REMOVE_als.o += $(CC_FLAGS_MARCH) -CFLAGS_als.o += $(CC_FLAGS_MARCH_MINIMUM) -CFLAGS_REMOVE_sclp_early_core.o += $(CC_FLAGS_MARCH) -CFLAGS_sclp_early_core.o += $(CC_FLAGS_MARCH_MINIMUM) -endif +KBUILD_AFLAGS := $(filter-out $(CC_FLAGS_MARCH),$(KBUILD_AFLAGS_DECOMPRESSOR)) +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_MARCH),$(KBUILD_CFLAGS_DECOMPRESSOR)) +KBUILD_AFLAGS += $(CC_FLAGS_MARCH_MINIMUM) +KBUILD_CFLAGS += $(CC_FLAGS_MARCH_MINIMUM)
CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit 0147addc4fb72a39448b8873d8acdf3a0f29aa65 ]
Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions.
Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor.
Reviewed-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/facility.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/s390/include/asm/facility.h b/arch/s390/include/asm/facility.h
index 94b6919026dfb..953d42205ea83 100644
--- a/arch/s390/include/asm/facility.h
+++ b/arch/s390/include/asm/facility.h
@@ -60,8 +60,10 @@ static inline int test_facility(unsigned long nr)
 	unsigned long facilities_als[] = { FACILITIES_ALS };

 	if (__builtin_constant_p(nr) && nr < sizeof(facilities_als) * 8) {
-		if (__test_facility(nr, &facilities_als))
-			return 1;
+		if (__test_facility(nr, &facilities_als)) {
+			if (!__is_defined(__DECOMPRESSOR))
+				return 1;
+		}
 	}
 	return __test_facility(nr, &stfle_fac_list);
 }
From: Gerald Schaefer gerald.schaefer@linux.ibm.com
[ Upstream commit 131b8db78558120f58c5dc745ea9655f6b854162 ]
Adding/removing a large number of pages at once to/from the CMM balloon can result in rcu_sched stalls or workqueue lockups, because of busy looping without cond_resched().

Prevent this by adding a cond_resched(). cmm_free_pages() holds a spin_lock while looping, so it cannot be added directly to the existing loop. Instead, introduce a wrapper function that operates on at most 256 pages at once, and add it there.
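The wrapper idea can be sketched in userspace as follows. Everything here is a stand-in chosen for illustration: `struct sketch_pool` replaces the balloon's page list, `sketch_free_locked()` replaces the locked inner loop, and sched_yield() replaces cond_resched().

```c
#include <sched.h>

#define SKETCH_CHUNK 256L

/* Hypothetical pool standing in for the balloon's page list. */
struct sketch_pool {
	long available;
};

/* Stand-in for the locked inner loop: frees up to nr items and returns
 * how many it could NOT free. */
static long sketch_free_locked(struct sketch_pool *pool, long nr)
{
	long freed = nr < pool->available ? nr : pool->available;

	pool->available -= freed;
	return nr - freed;
}

/* Free nr items in chunks, yielding between chunks; returns the count left unfreed. */
static long sketch_free_chunked(struct sketch_pool *pool, long nr)
{
	long inc = 0;

	while (nr) {
		inc = nr < SKETCH_CHUNK ? nr : SKETCH_CHUNK;
		nr -= inc;
		inc = sketch_free_locked(pool, inc);
		if (inc)
			break;		/* pool ran dry, stop early */
		sched_yield();		/* userspace stand-in for cond_resched() */
	}
	return nr + inc;
}
```

The return value keeps the original contract: it is the number of items that were requested but not freed, combining the untouched remainder with whatever the last chunk left over.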
Signed-off-by: Gerald Schaefer gerald.schaefer@linux.ibm.com Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/mm/cmm.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/s390/mm/cmm.c b/arch/s390/mm/cmm.c
index f47515313226c..9af4d82964944 100644
--- a/arch/s390/mm/cmm.c
+++ b/arch/s390/mm/cmm.c
@@ -95,11 +95,12 @@ static long cmm_alloc_pages(long nr, long *counter,
 		(*counter)++;
 		spin_unlock(&cmm_lock);
 		nr--;
+		cond_resched();
 	}
 	return nr;
 }

-static long cmm_free_pages(long nr, long *counter, struct cmm_page_array **list)
+static long __cmm_free_pages(long nr, long *counter, struct cmm_page_array **list)
 {
 	struct cmm_page_array *pa;
 	unsigned long addr;
@@ -123,6 +124,21 @@ static long cmm_free_pages(long nr, long *counter, struct cmm_page_array **list)
 	return nr;
 }

+static long cmm_free_pages(long nr, long *counter, struct cmm_page_array **list)
+{
+	long inc = 0;
+
+	while (nr) {
+		inc = min(256L, nr);
+		nr -= inc;
+		inc = __cmm_free_pages(inc, counter, list);
+		if (inc)
+			break;
+		cond_resched();
+	}
+	return nr + inc;
+}
+
 static int cmm_oom_notify(struct notifier_block *self,
 			  unsigned long dummy, void *parm)
 {
From: Yonghong Song yonghong.song@linux.dev
[ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]
Daniel Hodges reported a jit error when playing with a sched-ext program. The error message is:
  unexpected jmp_cond padding: -4 bytes

But further investigation shows the error is actually due to failed convergence. The following is some analysis:
  ...
  pass4, final_proglen=4391:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    289:    48 85 ff                test   rdi,rdi
    28c:    74 17                   je     0x2a5
    28e:    e9 7f ff ff ff          jmp    0x212
    293:    bf 03 00 00 00          mov    edi,0x3
Note that the insn at 0x211 is a 2-byte cond jump insn with offset 0x7d (125) and the insn at 0x28e is a 5-byte jmp insn with offset -129.
  pass5, final_proglen=4392:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    48 85 ff                test   rdi,rdi
    290:    74 1a                   je     0x2ac
    292:    eb 84                   jmp    0x218
    294:    bf 03 00 00 00          mov    edi,0x3
Note that insn at 0x211 is 6-byte cond jump insn now since its offset becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same time, insn at 0x292 is a 2-byte insn since its offset is -124.
pass6 will repeat the same code as in pass4. pass7 will repeat the same code as in pass5, and so on. This will prevent eventual convergence.
Passes 1-14 are with padding = 0. At pass15, padding is 1 and related insn looks like:
    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    24d:    48 85 d2                test   rdx,rdx

The similar code in pass14:
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    249:    48 85 d2                test   rdx,rdx
    24c:    74 21                   je     0x26f
    24e:    48 01 f7                add    rdi,rsi
    ...
Before generating the following insn,
    250:    74 21                   je     0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
  #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
  nops = INSN_SZ_DIFF - 2

In this specific case,
  addrs[i] = 0x24e // from pass14
  addrs[i-1] = 0x24d // from pass15
  prog - temp = 3 // from 'test rdx,rdx' in pass15
so
  nops = -4
and this triggers the failure.
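The arithmetic above can be replayed with a small sketch. The function names are hypothetical; the formula is the INSN_SZ_DIFF expression quoted in the message, with the stale and fresh addresses passed in as plain integers.

```c
#include <stdbool.h>

/* Space reserved for the insn in the address table, minus bytes already
 * emitted for it, minus the 2 bytes subtracted in the quoted formula. */
static int sketch_nops(int addr_cur, int addr_prev, int emitted)
{
	int insn_sz_diff = (addr_cur - addr_prev) - emitted;

	return insn_sz_diff - 2;
}

/* The padding check quoted above accepts only 0 or 4 nop bytes. */
static bool sketch_nops_ok(int nops)
{
	return nops == 0 || nops == 4;
}
```

Plugging in 0x24e, 0x24d and 3 gives 1 - 3 - 2 = -4, which is neither 0 nor 4 and so matches the reported "-4 bytes" failure.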
To fix the issue, we need to break cycles of je <-> jmp. For example, in the above case, we have
    211:    74 7d                   je     0x290
the offset is 0x7d. If the 2-byte je insn is generated only when the offset is less than 0x7d (<= 0x7c), the cycle can be broken and convergence can be achieved.

I also studied other cases such as je <-> je, jmp <-> je and jmp <-> jmp, which may cause cycles. Those cases do not come from actual reproducers, since it is pretty hard to construct a test case for them. The results show that offset <= 0x7b (0x7b = 123) should be enough to cover all cases. This patch adds a new helper to generate 8-bit cond/uncond jmp insns only if the offset is in the range [-128, 123].
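The difference between the plain 8-bit immediate test and the narrowed jump-offset test can be seen side by side in this sketch. The `sketch_` names are used here only to avoid implying these are the kernel's own symbols; the ranges are the ones stated in the message.

```c
#include <stdbool.h>

/* Full signed 8-bit range. */
static bool sketch_is_imm8(int value)
{
	return value <= 127 && value >= -128;
}

/* Narrowed range for short jumps: positive offsets stop at 123. */
static bool sketch_is_imm8_jmp_offset(int value)
{
	return value <= 123 && value >= -128;
}
```

The offset 0x7d from the failing case still fits a signed byte, but it is no longer eligible for the 2-byte encoding, so that jump stays in its long form across passes instead of flipping between sizes.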
Reported-by: Daniel Hodges hodgesd@meta.com Signed-off-by: Yonghong Song yonghong.song@linux.dev Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 54 +++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 878a4c6dd7565..a50c99e9b5c01 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -58,6 +58,56 @@ static bool is_imm8(int value) return value <= 127 && value >= -128; }
+/* + * Let us limit the positive offset to be <= 123. + * This is to ensure eventual jit convergence For the following patterns: + * ... + * pass4, final_proglen=4391: + * ... + * 20e: 48 85 ff test rdi,rdi + * 211: 74 7d je 0x290 + * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] + * ... + * 289: 48 85 ff test rdi,rdi + * 28c: 74 17 je 0x2a5 + * 28e: e9 7f ff ff ff jmp 0x212 + * 293: bf 03 00 00 00 mov edi,0x3 + * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125) + * and insn at 0x28e is 5-byte jmp insn with offset -129. + * + * pass5, final_proglen=4392: + * ... + * 20e: 48 85 ff test rdi,rdi + * 211: 0f 84 80 00 00 00 je 0x297 + * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] + * ... + * 28d: 48 85 ff test rdi,rdi + * 290: 74 1a je 0x2ac + * 292: eb 84 jmp 0x218 + * 294: bf 03 00 00 00 mov edi,0x3 + * Note that insn at 0x211 is 6-byte cond jump insn now since its offset + * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). + * At the same time, insn at 0x292 is a 2-byte insn since its offset is + * -124. + * + * pass6 will repeat the same code as in pass4 and this will prevent + * eventual convergence. + * + * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes) + * cycle in the above. In the above example je offset <= 0x7c should work. + * + * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence + * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should + * avoid no convergence issue. + * + * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn + * to maximum 123 (0x7b). This way, the jit pass can eventually converge. 
+ */ +static bool is_imm8_jmp_offset(int value) +{ + return value <= 123 && value >= -128; +} + static bool is_simm32(s64 value) { return value == (s64)(s32)value; @@ -1774,7 +1824,7 @@ st: if (is_imm8(insn->off)) return -EFAULT; } jmp_offset = addrs[i + insn->off] - addrs[i]; - if (is_imm8(jmp_offset)) { + if (is_imm8_jmp_offset(jmp_offset)) { if (jmp_padding) { /* To keep the jmp_offset valid, the extra bytes are * padded before the jump insn, so we subtract the @@ -1856,7 +1906,7 @@ st: if (is_imm8(insn->off)) break; } emit_jmp: - if (is_imm8(jmp_offset)) { + if (is_imm8_jmp_offset(jmp_offset)) { if (jmp_padding) { /* To avoid breaking jmp_offset, the extra bytes * are padded before the actual jmp insn, so
From: Artem Sadovnikov ancowi69@gmail.com
[ Upstream commit cc749e61c011c255d81b192a822db650c68b313f ]
Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
This issue is triggered when an EXT4_IOC_MIGRATE ioctl is set to require synchronous updates because the file descriptor is opened with O_SYNC. This can lead to the jbd2_journal_stop() function calling jbd2_might_wait_for_commit(), potentially causing a deadlock if the EXT4_IOC_MIGRATE call races with a write(2) system call.
This problem only arises when CONFIG_PROVE_LOCKING is enabled. In this case, the jbd2_might_wait_for_commit macro locks jbd2_handle in the jbd2_journal_stop function while i_data_sem is locked. This triggers lockdep because the jbd2_journal_start function might also lock the same jbd2_handle simultaneously.
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Co-developed-by: Mikhail Ukhin mish.uxin2012@yandex.ru Signed-off-by: Mikhail Ukhin mish.uxin2012@yandex.ru Signed-off-by: Artem Sadovnikov ancowi69@gmail.com Rule: add Link: https://lore.kernel.org/stable/20240404095000.5872-1-mish.uxin2012%40yandex.... Link: https://patch.msgid.link/20240829152210.2754-1-ancowi69@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index d98ac2af8199f..a5e1492bbaaa5 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -663,8 +663,8 @@ int ext4_ind_migrate(struct inode *inode)
 	if (unlikely(ret2 && !ret))
 		ret = ret2;
 errout:
-	ext4_journal_stop(handle);
 	up_write(&EXT4_I(inode)->i_data_sem);
+	ext4_journal_stop(handle);
 out_unlock:
 	ext4_writepages_up_write(inode->i_sb, alloc_ctx);
 	return ret;
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 4e2524ba2ca5f54bdbb9e5153bea00421ef653f5 ]
In ext4_find_extent(), path may be freed on error or be reallocated, so a previously saved copy of *ppath may already have been freed, and using it may trigger a use-after-free, as follows:
ext4_split_extent
  path = *ppath;
  ext4_split_extent_at(ppath)
  path = ext4_find_extent(ppath)
  ext4_split_extent_at(ppath)
    // ext4_find_extent fails to free path
    // but zeroout succeeds
  ext4_ext_show_leaf(inode, path)
    eh = path[depth].p_hdr
    // path use-after-free !!!
Similar to ext4_split_extent_at(), we use *ppath directly as an input to ext4_ext_show_leaf(). Fix a spelling error by the way.
Same problem in ext4_ext_handle_unwritten_extents(). Since 'path' is only used in ext4_ext_show_leaf(), remove 'path' and use *ppath directly.
This issue is triggered only when EXT_DEBUG is defined and therefore does not affect functionality.
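The stale-copy hazard is not specific to ext4, so it can be sketched with ordinary heap memory. All names below are hypothetical: `sketch_find()` plays the role of a lookup that may free and replace the caller's path through the double pointer.

```c
#include <stdlib.h>

struct sketch_path {
	int depth;
};

/* Hypothetical lookup that may replace the caller's path: it frees the old
 * allocation and stores a fresh one through ppath. Returns 0 on success. */
static int sketch_find(struct sketch_path **ppath, int depth)
{
	struct sketch_path *fresh = malloc(sizeof(*fresh));

	if (!fresh)
		return -1;
	fresh->depth = depth;
	free(*ppath);		/* any copy of the old pointer is now stale */
	*ppath = fresh;
	return 0;
}

/* Reads through *ppath after the call, never through a copy taken before it. */
static int sketch_depth_after_find(struct sketch_path **ppath, int depth)
{
	if (sketch_find(ppath, depth))
		return -1;
	return (*ppath)->depth;
}
```

A caller that had saved `*ppath` in a local before calling `sketch_find()` would be left holding freed memory; dereferencing `*ppath` at the point of use avoids that, which is the approach the message describes.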
Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Ojaswin Mujoo ojaswin@linux.ibm.com Tested-by: Ojaswin Mujoo ojaswin@linux.ibm.com Link: https://patch.msgid.link/20240822023545.1994557-5-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/extents.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 448e0ea49b31d..7fead53255fcb 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3287,7 +3287,7 @@ static int ext4_split_extent_at(handle_t *handle, }
/* - * ext4_split_extents() splits an extent and mark extent which is covered + * ext4_split_extent() splits an extent and mark extent which is covered * by @map as split_flags indicates * * It may result in splitting the extent into multiple extents (up to three) @@ -3363,7 +3363,7 @@ static int ext4_split_extent(handle_t *handle, goto out; }
- ext4_ext_show_leaf(inode, path); + ext4_ext_show_leaf(inode, *ppath); out: return err ? err : allocated; } @@ -3828,14 +3828,13 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode, struct ext4_ext_path **ppath, int flags, unsigned int allocated, ext4_fsblk_t newblock) { - struct ext4_ext_path __maybe_unused *path = *ppath; int ret = 0; int err = 0;
ext_debug(inode, "logical block %llu, max_blocks %u, flags 0x%x, allocated %u\n", (unsigned long long)map->m_lblk, map->m_len, flags, allocated); - ext4_ext_show_leaf(inode, path); + ext4_ext_show_leaf(inode, *ppath);
/* * When writing into unwritten space, we should not fail to @@ -3932,7 +3931,7 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode, if (allocated > map->m_len) allocated = map->m_len; map->m_len = allocated; - ext4_ext_show_leaf(inode, path); + ext4_ext_show_leaf(inode, *ppath); out2: return err ? err : allocated; }
From: Thadeu Lima de Souza Cascardo cascardo@igalia.com
[ Upstream commit cd69f8f9de280e331c9e6ff689ced0a688a9ce8f ]
ext4_search_dir currently returns -1 in case of a failure, while it returns 0 when the name is not found. In such failure cases, it should return an error code instead.
This becomes even more important when ext4_find_inline_entry returns an error code as well in the next commit.
-EFSCORRUPTED seems appropriate as that error code, since these failures would be caused by unexpected record lengths, and it is in line with other instances of ext4_check_dir_entry failures.
In the case of ext4_dx_find_entry, the current use of ERR_BAD_DX_DIR was left as is to reduce the risk of regressions.
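To show how a tri-state return (1 found, 0 not found, negative on failure) can be carried through a pointer-returning caller, here is a userspace sketch of the error-pointer idiom. The helpers are simplified re-creations written for this example, not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() definitions, and the value 117 is an assumed stand-in for the corruption error code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO	4095
#define SKETCH_EFSCORRUPTED	117	/* assumed numeric value, illustration only */

/* Encode a negative error code in a pointer value. */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the error code from an error pointer. */
static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO values of the address space. */
static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-(intptr_t)SKETCH_MAX_ERRNO);
}

/* Map a tri-state search result onto entry, NULL, or an error pointer. */
static void *sketch_lookup(int search_result, void *entry)
{
	if (search_result < 0)
		return sketch_err_ptr(search_result);
	return search_result ? entry : NULL;
}
```

Returning a real error code instead of -1 is what makes this propagation meaningful: the caller can pass the value along unchanged rather than guessing what the failure was.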
Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@igalia.com Link: https://patch.msgid.link/20240821152324.3621860-2-cascardo@igalia.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/namei.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 3bd2301cb48e7..9913aa37e697c 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1526,7 +1526,7 @@ static bool ext4_match(struct inode *parent, }
/* - * Returns 0 if not found, -1 on failure, and 1 on success + * Returns 0 if not found, -EFSCORRUPTED on failure, and 1 on success */ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, struct inode *dir, struct ext4_filename *fname, @@ -1547,7 +1547,7 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, * a full check */ if (ext4_check_dir_entry(dir, NULL, de, bh, search_buf, buf_size, offset)) - return -1; + return -EFSCORRUPTED; *res_dir = de; return 1; } @@ -1555,7 +1555,7 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, de_len = ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize); if (de_len <= 0) - return -1; + return -EFSCORRUPTED; offset += de_len; de = (struct ext4_dir_entry_2 *) ((char *) de + de_len); } @@ -1707,8 +1707,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir, goto cleanup_and_exit; } else { brelse(bh); - if (i < 0) + if (i < 0) { + ret = ERR_PTR(i); goto cleanup_and_exit; + } } next: if (++block >= nblocks) @@ -1803,7 +1805,7 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir, if (retval == 1) goto success; brelse(bh); - if (retval == -1) { + if (retval < 0) { bh = ERR_PTR(ERR_BAD_DX_DIR); goto errout; }
From: Pankaj Raghav p.raghav@samsung.com
[ Upstream commit 10553a91652d995274da63fc317470f703765081 ]
iomap_dio_zero() will pad a fs block with zeroes if the direct IO size < fs block size. iomap_dio_zero() has an implicit assumption that fs block size < page_size. This is true for most filesystems at the moment.
If the block size > page size, this will send the contents of the page next to the zero page (as len > PAGE_SIZE) to the underlying block device, causing FS corruption.
iomap is a generic infrastructure and it should not make any assumptions about the fs block size and the page size of the system.
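A minimal sketch of the two pieces of arithmetic involved, assuming a power-of-two block size and a 64 KiB zero buffer as in the patch. The function names are hypothetical and nothing here touches real I/O.

```c
#include <errno.h>
#include <stddef.h>

/* Size of the dedicated zero buffer, matching the 64k limit in the patch. */
#define SKETCH_ZERO_BUF_SIZE (64u * 1024u)

/* Bytes between the start of the block and pos (block_size is a power of two). */
static size_t sketch_head_pad(unsigned long long pos, size_t block_size)
{
	return (size_t)(pos & (block_size - 1));
}

/* Returns 0 when len bytes of padding can be served from the zero buffer,
 * -EINVAL when the request would read past its end. */
static int sketch_check_zero_len(size_t len)
{
	if (!len)
		return 0;
	if (len > SKETCH_ZERO_BUF_SIZE)
		return -EINVAL;
	return 0;
}
```

The bound is checked against the size of the buffer actually backing the zeroes, so a block size larger than one page no longer lets the padding length run into neighbouring memory.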
Signed-off-by: Pankaj Raghav p.raghav@samsung.com Link: https://lore.kernel.org/r/20240822135018.1931258-7-kernel@pankajraghav.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Daniel Gomez da.gomez@samsung.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/iomap/buffered-io.c | 4 ++-- fs/iomap/direct-io.c | 45 ++++++++++++++++++++++++++++++++++++------ 2 files changed, 41 insertions(+), 8 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 975fd88c1f0f4..6b89b5589ba28 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -1998,10 +1998,10 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc, } EXPORT_SYMBOL_GPL(iomap_writepages);
-static int __init iomap_init(void) +static int __init iomap_buffered_init(void) { return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE), offsetof(struct iomap_ioend, io_inline_bio), BIOSET_NEED_BVECS); } -fs_initcall(iomap_init); +fs_initcall(iomap_buffered_init); diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index bcd3f8cf5ea42..409a21144a555 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -11,6 +11,7 @@ #include <linux/iomap.h> #include <linux/backing-dev.h> #include <linux/uio.h> +#include <linux/set_memory.h> #include <linux/task_io_accounting_ops.h> #include "trace.h"
@@ -27,6 +28,13 @@ #define IOMAP_DIO_WRITE (1U << 30) #define IOMAP_DIO_DIRTY (1U << 31)
+/* + * Used for sub block zeroing in iomap_dio_zero() + */ +#define IOMAP_ZERO_PAGE_SIZE (SZ_64K) +#define IOMAP_ZERO_PAGE_ORDER (get_order(IOMAP_ZERO_PAGE_SIZE)) +static struct page *zero_page; + struct iomap_dio { struct kiocb *iocb; const struct iomap_dio_ops *dops; @@ -232,13 +240,20 @@ void iomap_dio_bio_end_io(struct bio *bio) } EXPORT_SYMBOL_GPL(iomap_dio_bio_end_io);
-static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio, +static int iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio, loff_t pos, unsigned len) { struct inode *inode = file_inode(dio->iocb->ki_filp); - struct page *page = ZERO_PAGE(0); struct bio *bio;
+ if (!len) + return 0; + /* + * Max block size supported is 64k + */ + if (WARN_ON_ONCE(len > IOMAP_ZERO_PAGE_SIZE)) + return -EINVAL; + bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE); fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits, GFP_KERNEL); @@ -246,8 +261,9 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio, bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io;
- __bio_add_page(bio, page, len, 0); + __bio_add_page(bio, zero_page, len, 0); iomap_dio_submit_bio(iter, dio, bio, pos); + return 0; }
/* @@ -356,8 +372,10 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter, if (need_zeroout) { /* zero out from the start of the block to the write offset */ pad = pos & (fs_block_size - 1); - if (pad) - iomap_dio_zero(iter, dio, pos - pad, pad); + + ret = iomap_dio_zero(iter, dio, pos - pad, pad); + if (ret) + goto out; }
/* @@ -430,7 +448,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter, /* zero out from the end of the write to the end of the block */ pad = pos & (fs_block_size - 1); if (pad) - iomap_dio_zero(iter, dio, pos, fs_block_size - pad); + ret = iomap_dio_zero(iter, dio, pos, + fs_block_size - pad); } out: /* Undo iter limitation to current extent */ @@ -752,3 +771,17 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, return iomap_dio_complete(dio); } EXPORT_SYMBOL_GPL(iomap_dio_rw); + +static int __init iomap_dio_init(void) +{ + zero_page = alloc_pages(GFP_KERNEL | __GFP_ZERO, + IOMAP_ZERO_PAGE_ORDER); + + if (!zero_page) + return -ENOMEM; + + set_memory_ro((unsigned long)page_address(zero_page), + 1U << IOMAP_ZERO_PAGE_ORDER); + return 0; +} +fs_initcall(iomap_dio_init);
From: Jan Kara jack@suse.cz
[ Upstream commit d3476f3dad4ad68ae5f6b008ea6591d1520da5d8 ]
When the filesystem is mounted with errors=remount-ro, we were setting SB_RDONLY flag to stop all filesystem modifications. We knew this misses proper locking (sb->s_umount) and does not go through proper filesystem remount procedure but it has been the way this worked since early ext2 days and it was good enough for catastrophic situation damage mitigation. Recently, syzbot has found a way (see link) to trigger warnings in filesystem freezing because the code got confused by SB_RDONLY changing under its hands. Since these days we set EXT4_FLAGS_SHUTDOWN on the superblock which is enough to stop all filesystem modifications, modifying SB_RDONLY shouldn't be needed. So stop doing that.
Link: https://lore.kernel.org/all/000000000000b90a8e061e21d12f@google.com Reported-by: Christian Brauner brauner@kernel.org Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Christian Brauner brauner@kernel.org Link: https://patch.msgid.link/20240805201241.27286-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/super.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 5baacb3058abd..b7d8abef2beba 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -744,11 +744,12 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only"); /* - * Make sure updated value of ->s_mount_flags will be visible before - * ->s_flags update + * EXT4_FLAGS_SHUTDOWN was set which stops all filesystem + * modifications. We don't set SB_RDONLY because that requires + * sb->s_umount semaphore and setting it without proper remount + * procedure is confusing code such as freeze_super() leading to + * deadlocks and other problems. */ - smp_wmb(); - sb->s_flags |= SB_RDONLY; }
static void update_super_work(struct work_struct *work)
From: Wojciech Gładysz wojciech.gladysz@infogain.com
[ Upstream commit d1bc560e9a9c78d0b2314692847fc8661e0aeb99 ]
Add nested locking with I_MUTEX_XATTR subclass to avoid lockdep warning while handling xattr inode on file open syscall at ext4_xattr_inode_iget.
Backtrace EXT4-fs (loop0): Ignoring removed oldalloc option ====================================================== WARNING: possible circular locking dependency detected 5.10.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor543/2794 is trying to acquire lock: ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline] ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425
but task is already holding lock: ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ei->i_data_sem/3){++++}-{3:3}: lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 ext4_update_i_disksize fs/ext4/ext4.h:3267 [inline] ext4_xattr_inode_write fs/ext4/xattr.c:1390 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1538 [inline] ext4_xattr_set_entry+0x331a/0x3d80 fs/ext4/xattr.c:1662 ext4_xattr_ibody_set+0x124/0x390 fs/ext4/xattr.c:2228 ext4_xattr_set_handle+0xc27/0x14e0 fs/ext4/xattr.c:2385 ext4_xattr_set+0x219/0x390 fs/ext4/xattr.c:2498 ext4_xattr_user_set+0xc9/0xf0 fs/ext4/xattr_user.c:40 __vfs_setxattr+0x404/0x450 fs/xattr.c:177 __vfs_setxattr_noperm+0x11d/0x4f0 fs/xattr.c:208 __vfs_setxattr_locked+0x1f9/0x210 fs/xattr.c:266 vfs_setxattr+0x112/0x2c0 fs/xattr.c:283 setxattr+0x1db/0x3e0 fs/xattr.c:548 path_setxattr+0x15a/0x240 fs/xattr.c:567 __do_sys_setxattr fs/xattr.c:582 [inline] __se_sys_setxattr fs/xattr.c:578 [inline] __x64_sys_setxattr+0xc5/0xe0 fs/xattr.c:578 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb
-> #0 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:2988 [inline] check_prevs_add kernel/locking/lockdep.c:3113 [inline] validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729 __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955 lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 inode_lock include/linux/fs.h:782 [inline] ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425 ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485 ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline] ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline] ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774 __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline] __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018 ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562 notify_change+0xbb6/0xe60 fs/attr.c:435 do_truncate+0x1de/0x2c0 fs/open.c:64 handle_truncate fs/namei.c:2970 [inline] do_open fs/namei.c:3311 [inline] path_openat+0x29f3/0x3290 fs/namei.c:3425 do_filp_open+0x20b/0x450 fs/namei.c:3452 do_sys_openat2+0x124/0x460 fs/open.c:1207 do_sys_open fs/open.c:1223 [inline] __do_sys_open fs/open.c:1231 [inline] __se_sys_open fs/open.c:1227 [inline] __x64_sys_open+0x221/0x270 fs/open.c:1227 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&ei->i_data_sem/3); lock(&ea_inode->i_rwsem#7/1); lock(&ei->i_data_sem/3); lock(&ea_inode->i_rwsem#7/1);
*** DEADLOCK ***
5 locks held by syz-executor543/2794: #0: ffff888026fbc448 (sb_writers#4){.+.+}-{0:0}, at: mnt_want_write+0x4a/0x2a0 fs/namespace.c:365 #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline] #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: do_truncate+0x1cf/0x2c0 fs/open.c:62 #2: ffff8880215e3310 (&ei->i_mmap_sem){++++}-{3:3}, at: ext4_setattr+0xec4/0x19c0 fs/ext4/inode.c:5519 #3: ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559 #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_trylock_xattr fs/ext4/xattr.h:162 [inline] #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_try_to_expand_extra_isize fs/ext4/inode.c:5938 [inline] #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: __ext4_mark_inode_dirty+0x4fb/0x810 fs/ext4/inode.c:6018
stack backtrace: CPU: 1 PID: 2794 Comm: syz-executor543 Not tainted 5.10.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x177/0x211 lib/dump_stack.c:118 print_circular_bug+0x146/0x1b0 kernel/locking/lockdep.c:2002 check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2123 check_prev_add kernel/locking/lockdep.c:2988 [inline] check_prevs_add kernel/locking/lockdep.c:3113 [inline] validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729 __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955 lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 inode_lock include/linux/fs.h:782 [inline] ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425 ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485 ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline] ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline] ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774 __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline] __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018 ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562 notify_change+0xbb6/0xe60 fs/attr.c:435 do_truncate+0x1de/0x2c0 fs/open.c:64 handle_truncate fs/namei.c:2970 [inline] do_open fs/namei.c:3311 [inline] path_openat+0x29f3/0x3290 fs/namei.c:3425 do_filp_open+0x20b/0x450 fs/namei.c:3452 do_sys_openat2+0x124/0x460 fs/open.c:1207 do_sys_open fs/open.c:1223 [inline] __do_sys_open fs/open.c:1231 [inline] __se_sys_open fs/open.c:1227 [inline] __x64_sys_open+0x221/0x270 fs/open.c:1227 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7f0cde4ea229 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 21 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 
ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd81d1c978 EFLAGS: 00000246 ORIG_RAX: 0000000000000002 RAX: ffffffffffffffda RBX: 0030656c69662f30 RCX: 00007f0cde4ea229 RDX: 0000000000000089 RSI: 00000000000a0a00 RDI: 00000000200001c0 RBP: 2f30656c69662f2e R08: 0000000000208000 R09: 0000000000208000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd81d1c9c0 R13: 00007ffd81d1ca00 R14: 0000000000080000 R15: 0000000000000003 EXT4-fs error (device loop0): ext4_expand_extra_isize_ea:2730: inode #13: comm syz-executor543: corrupted in-inode xattr
Signed-off-by: Wojciech Gładysz wojciech.gladysz@infogain.com Link: https://patch.msgid.link/20240801143827.19135-1-wojciech.gladysz@infogain.co... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/xattr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index c368ff671d773..eaedb47bfe2d8 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -458,7 +458,7 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE); ext4_xattr_inode_set_ref(inode, 1); } else { - inode_lock(inode); + inode_lock_nested(inode, I_MUTEX_XATTR); inode->i_flags |= S_NOQUOTA; inode_unlock(inode); } @@ -1039,7 +1039,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, s64 ref_count; int ret;
- inode_lock(ea_inode); + inode_lock_nested(ea_inode, I_MUTEX_XATTR);
ret = ext4_reserve_inode_write(handle, ea_inode, &iloc); if (ret)
From: Thomas Richter tmricht@linux.ibm.com
[ Upstream commit b495e710157606889f2d8bdc62aebf2aa02f67a7 ]
Remove WARN_ON_ONCE statements. These have not triggered in the past.
Signed-off-by: Thomas Richter tmricht@linux.ibm.com Acked-by: Sumanth Korikkar sumanthk@linux.ibm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/perf_cpum_sf.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c index 06efad5b4f931..a3169193775f7 100644 --- a/arch/s390/kernel/perf_cpum_sf.c +++ b/arch/s390/kernel/perf_cpum_sf.c @@ -1463,7 +1463,7 @@ static int aux_output_begin(struct perf_output_handle *handle, unsigned long range, i, range_scan, idx, head, base, offset; struct hws_trailer_entry *te;
- if (WARN_ON_ONCE(handle->head & ~PAGE_MASK)) + if (handle->head & ~PAGE_MASK) return -EINVAL;
aux->head = handle->head >> PAGE_SHIFT; @@ -1642,7 +1642,7 @@ static void hw_collect_aux(struct cpu_hw_sf *cpuhw) unsigned long num_sdb;
aux = perf_get_aux(handle); - if (WARN_ON_ONCE(!aux)) + if (!aux) return;
/* Inform user space new data arrived */ @@ -1661,7 +1661,7 @@ static void hw_collect_aux(struct cpu_hw_sf *cpuhw) num_sdb); break; } - if (WARN_ON_ONCE(!aux)) + if (!aux) return;
/* Update head and alert_mark to new position */ @@ -1896,12 +1896,8 @@ static void cpumsf_pmu_start(struct perf_event *event, int flags) { struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
- if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED))) + if (!(event->hw.state & PERF_HES_STOPPED)) return; - - if (flags & PERF_EF_RELOAD) - WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE)); - perf_pmu_disable(event->pmu); event->hw.state = 0; cpuhw->lsctl.cs = 1;
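The first check kept by this patch, handle->head & ~PAGE_MASK, is a plain page-alignment test. A minimal user-space sketch follows; the 4 KiB page size and the SK_ names are assumptions made for the example, since the kernel takes the real values from the architecture headers.

```c
#include <stdbool.h>

/* Assumed 4 KiB pages; illustrative only. */
#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  (1UL << SK_PAGE_SHIFT)
#define SK_PAGE_MASK  (~(SK_PAGE_SIZE - 1))

/* True when head has bits set below the page boundary, i.e. is not aligned. */
static bool head_is_misaligned(unsigned long head)
{
	return (head & ~SK_PAGE_MASK) != 0;
}

/* Page index of an aligned head, mirroring aux->head = handle->head >> PAGE_SHIFT. */
static unsigned long head_to_page_index(unsigned long head)
{
	return head >> SK_PAGE_SHIFT;
}
```

With the WARN_ON_ONCE wrapper gone, a misaligned head still makes the caller return -EINVAL; only the one-time warning is dropped.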
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit 3c4d0ae0671827f4b536cc2d26f8b9c54584ccc5 ]
Add missing warning handling to the early program check handler. This way a warning is printed to the console as soon as the early console is set up, and the kernel continues to boot.
Before this change a disabled wait psw was loaded instead and the machine was silently stopped without giving an idea about what happened.
Reviewed-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/early.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c index 3a54733e4fc65..eb6307c066c8a 100644 --- a/arch/s390/kernel/early.c +++ b/arch/s390/kernel/early.c @@ -174,8 +174,21 @@ static __init void setup_topology(void)
void __do_early_pgm_check(struct pt_regs *regs) { - if (!fixup_exception(regs)) - disabled_wait(); + struct lowcore *lc = get_lowcore(); + unsigned long ip; + + regs->int_code = lc->pgm_int_code; + regs->int_parm_long = lc->trans_exc_code; + ip = __rewind_psw(regs->psw, regs->int_code >> 16); + + /* Monitor Event? Might be a warning */ + if ((regs->int_code & PGM_INT_CODE_MASK) == 0x40) { + if (report_bug(ip, regs) == BUG_TRAP_TYPE_WARN) + return; + } + if (fixup_exception(regs)) + return; + disabled_wait(); }
static noinline __init void setup_lowcore_early(void)
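The decision order in the new handler can be sketched as a pure function. This is a model under stated assumptions: the mask value 0x7f is assumed here and should be checked against the s390 headers, and the two boolean parameters stand in for the results of report_bug() and fixup_exception(), which are not called in the sketch.

```c
#include <stdbool.h>

/* Assumed constants for the sketch. */
#define SK_PGM_INT_CODE_MASK 0x7fU
#define SK_PGM_MONITOR_EVENT 0x40U

enum sk_outcome { SK_WARNED_AND_CONTINUE, SK_FIXED_UP, SK_DISABLED_WAIT };

/*
 * is_warning: stands in for report_bug() returning BUG_TRAP_TYPE_WARN.
 * has_fixup:  stands in for fixup_exception() finding a fixup entry.
 */
static enum sk_outcome early_pgm_check(unsigned int int_code,
				       bool is_warning, bool has_fixup)
{
	/* Monitor event: possibly a warning, which must not stop the boot. */
	if ((int_code & SK_PGM_INT_CODE_MASK) == SK_PGM_MONITOR_EVENT &&
	    is_warning)
		return SK_WARNED_AND_CONTINUE;
	if (has_fixup)
		return SK_FIXED_UP;
	/* Nothing handled the program check: stop the machine. */
	return SK_DISABLED_WAIT;
}
```

Before the patch only the last two outcomes existed, so an early warning ended in the disabled wait.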
From: Xu Kuohai xukuohai@huawei.com
[ Upstream commit 28ead3eaabc16ecc907cfb71876da028080f6356 ]
bpf progs can be attached to kernel functions, and the attached functions can take different parameters or return different return values. If prog attached to one kernel function tail calls prog attached to another kernel function, the ctx access or return value verification could be bypassed.
For example, suppose prog1 is attached to func1, which takes only one parameter, and prog2 is attached to func2, which takes two parameters. Because the verifier assumes the bpf ctx passed to prog2 is constructed from func2's prototype, it allows prog2 to access the second parameter from the bpf ctx passed to it. The problem is that the verifier does not prevent prog1 from passing its bpf ctx to prog2 via tail call. In that case the bpf ctx passed to prog2 is constructed from func1 instead of func2, so the assumption behind the ctx access verification no longer holds.
As another example, suppose BPF LSM prog1 is attached to hook file_alloc_security and BPF LSM prog2 is attached to hook bpf_lsm_audit_rule_known. The verifier knows the return value rules for these two hooks: it is legal for bpf_lsm_audit_rule_known to return the positive number 1, and it is illegal for file_alloc_security to return a positive number. So the verifier allows prog2 to return 1 but does not allow prog1 to return a positive number. The problem is that the verifier does not prevent prog1 from calling prog2 via tail call. In that case prog2's return value 1 is used as the return value for prog1's hook file_alloc_security, and the return value rule is bypassed.
This patch adds restriction for tail call to prevent such bypasses.
Signed-off-by: Xu Kuohai xukuohai@huawei.com Link: https://lore.kernel.org/r/20240719110059.797546-4-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 1 + kernel/bpf/core.c | 21 ++++++++++++++++++--- 2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index e4cd28c38b825..0f0e0265cbdf5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -288,6 +288,7 @@ struct bpf_map { * same prog type, JITed flag and xdp_has_frags flag. */ struct { + const struct btf_type *attach_func_proto; spinlock_t lock; enum bpf_prog_type type; bool jited; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 4124805ad7ba5..58ee17f429a33 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2259,6 +2259,7 @@ bool bpf_prog_map_compatible(struct bpf_map *map, { enum bpf_prog_type prog_type = resolve_prog_type(fp); bool ret; + struct bpf_prog_aux *aux = fp->aux;
if (fp->kprobe_override) return false; @@ -2268,7 +2269,7 @@ bool bpf_prog_map_compatible(struct bpf_map *map, * in the case of devmap and cpumap). Until device checks * are implemented, prohibit adding dev-bound programs to program maps. */ - if (bpf_prog_is_dev_bound(fp->aux)) + if (bpf_prog_is_dev_bound(aux)) return false;
spin_lock(&map->owner.lock); @@ -2278,12 +2279,26 @@ bool bpf_prog_map_compatible(struct bpf_map *map, */ map->owner.type = prog_type; map->owner.jited = fp->jited; - map->owner.xdp_has_frags = fp->aux->xdp_has_frags; + map->owner.xdp_has_frags = aux->xdp_has_frags; + map->owner.attach_func_proto = aux->attach_func_proto; ret = true; } else { ret = map->owner.type == prog_type && map->owner.jited == fp->jited && - map->owner.xdp_has_frags == fp->aux->xdp_has_frags; + map->owner.xdp_has_frags == aux->xdp_has_frags; + if (ret && + map->owner.attach_func_proto != aux->attach_func_proto) { + switch (prog_type) { + case BPF_PROG_TYPE_TRACING: + case BPF_PROG_TYPE_LSM: + case BPF_PROG_TYPE_EXT: + case BPF_PROG_TYPE_STRUCT_OPS: + ret = false; + break; + default: + break; + } + } } spin_unlock(&map->owner.lock);
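The compatibility rule added above can be sketched outside the kernel. The types and names below are simplified stand-ins invented for the example (no locking, no JIT or xdp_has_frags checks). The first program to join a prog map records its attach prototype, and later programs of the listed types must have the identical prototype pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel's program types. */
enum sk_prog_type { SK_SOCKET_FILTER, SK_TRACING, SK_LSM, SK_EXT, SK_STRUCT_OPS };

struct sk_prog {
	enum sk_prog_type type;
	const void *attach_func_proto; /* prototype of the attached function */
};

struct sk_map_owner {
	bool set;                      /* has a first program claimed the map? */
	enum sk_prog_type type;
	const void *attach_func_proto;
};

/* Program types whose ctx layout or return rules depend on the attach point. */
static bool sk_type_needs_same_proto(enum sk_prog_type t)
{
	switch (t) {
	case SK_TRACING:
	case SK_LSM:
	case SK_EXT:
	case SK_STRUCT_OPS:
		return true;
	default:
		return false;
	}
}

static bool sk_map_compatible(struct sk_map_owner *o, const struct sk_prog *p)
{
	if (!o->set) {
		/* First program defines what the map accepts from now on. */
		o->set = true;
		o->type = p->type;
		o->attach_func_proto = p->attach_func_proto;
		return true;
	}
	if (o->type != p->type)
		return false;
	if (o->attach_func_proto != p->attach_func_proto &&
	    sk_type_needs_same_proto(p->type))
		return false;
	return true;
}
```

Since a tail call can only reach programs stored in the same prog map, rejecting the second program at insertion time is what closes both bypasses described in the commit message.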
From: Daniel Jordan daniel.m.jordan@oracle.com
[ Upstream commit 2351e8c65404aabc433300b6bf90c7a37e8bbc4d ]
Some distros have grub2 config files with the lines
if [ x"${feature_menuentry_id}" = xy ]; then menuentry_id_option="--id" else menuentry_id_option="" fi
which match the skip regex defined for grub2 in get_grub_index():
$skip = '^\s*menuentry';
These false positives cause the grub number to be higher than it should be, and the wrong kernel can end up booting.
Grub documents the menuentry command with whitespace between it and the title, so make the skip regex reflect this.
Link: https://lore.kernel.org/20240904175530.84175-1-daniel.m.jordan@oracle.com Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Acked-by: John 'Warthog9' Hawley (Tenstorrent) warthog9@eaglescrag.net Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/ktest/ktest.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl index 24451f8f42910..045090085ac5b 100755 --- a/tools/testing/ktest/ktest.pl +++ b/tools/testing/ktest/ktest.pl @@ -2043,7 +2043,7 @@ sub get_grub_index { } elsif ($reboot_type eq "grub2") { $command = "cat $grub_file"; $target = '^\s*menuentry.*' . $grub_menu_qt; - $skip = '^\s*menuentry'; + $skip = '^\s*menuentry\s'; $submenu = '^\s*submenu\s'; } elsif ($reboot_type eq "grub2bls") { $command = $grub_bls_get;
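The false positive is easy to reproduce with POSIX regular expressions. The sketch below is not ktest code (ktest.pl is Perl); it restates the two skip patterns in POSIX extended syntax, using [[:space:]] because POSIX ERE has no \s escape, and checks them against the kind of config line quoted in the commit message. The helper and constant names are made up for the example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Old and new skip patterns, rewritten for POSIX extended regex syntax. */
static const char *const OLD_SKIP = "^[[:space:]]*menuentry";
static const char *const NEW_SKIP = "^[[:space:]]*menuentry[[:space:]]";

/* Returns true when line matches the POSIX extended regex in pattern. */
static bool line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

A line such as `menuentry_id_option="--id"` matches OLD_SKIP but not NEW_SKIP, while a real `menuentry 'title' {` line matches both, so only genuine entries are counted.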
From: Saravanan Vajravel saravanan.vajravel@broadcom.com
[ Upstream commit 2a777679b8ccd09a9a65ea0716ef10365179caac ]
The current timeout handler of the mad agent acquires and releases the mad_agent_priv lock for every timed-out WR. This causes heavy lock contention when a large number of WRs have to be handled inside the timeout handler.
This leads to a soft lockup with the trace below in some use cases where the rdma-cm path is used to establish connections between peer nodes.
Trace: ----- BUG: soft lockup - CPU#4 stuck for 26s! [kworker/u128:3:19767] CPU: 4 PID: 19767 Comm: kworker/u128:3 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.13.1.el9_4.x86_64 #1 Hardware name: Dell Inc. PowerEdge R740/01YM03, BIOS 2.4.8 11/26/2019 Workqueue: ib_mad1 timeout_sends [ib_core] RIP: 0010:__do_softirq+0x78/0x2ac RSP: 0018:ffffb253449e4f98 EFLAGS: 00000246 RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 000000000000001f RDX: 000000000000001d RSI: 000000003d1879ab RDI: fff363b66fd3a86b RBP: ffffb253604cbcd8 R08: 0000009065635f3b R09: 0000000000000000 R10: 0000000000000040 R11: ffffb253449e4ff8 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff8caa1fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd9ec9db900 CR3: 0000000891934006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __irq_exit_rcu+0xa1/0xc0 ? watchdog_timer_fn+0x1b2/0x210 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x127/0x2c0 ? hrtimer_interrupt+0xfc/0x210 ? __sysvec_apic_timer_interrupt+0x5c/0x110 ? sysvec_apic_timer_interrupt+0x37/0x90 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __do_softirq+0x78/0x2ac ? 
__do_softirq+0x60/0x2ac __irq_exit_rcu+0xa1/0xc0 sysvec_call_function_single+0x72/0x90 </IRQ> <TASK> asm_sysvec_call_function_single+0x16/0x20 RIP: 0010:_raw_spin_unlock_irq+0x14/0x30 RSP: 0018:ffffb253604cbd88 EFLAGS: 00000247 RAX: 000000000001960d RBX: 0000000000000002 RCX: ffff8cad2a064800 RDX: 000000008020001b RSI: 0000000000000001 RDI: ffff8cad5d39f66c RBP: ffff8cad5d39f600 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8caa443e0c00 R11: ffffb253604cbcd8 R12: ffff8cacb8682538 R13: 0000000000000005 R14: ffffb253604cbd90 R15: ffff8cad5d39f66c cm_process_send_error+0x122/0x1d0 [ib_cm] timeout_sends+0x1dd/0x270 [ib_core] process_one_work+0x1e2/0x3b0 ? __pfx_worker_thread+0x10/0x10 worker_thread+0x50/0x3a0 ? __pfx_worker_thread+0x10/0x10 kthread+0xdd/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 </TASK>
Simplify the timeout handler by building a local list of timed-out WRs and invoking the send handler only after the list is built. The new method acquires and releases the lock once to fetch the list, which reduces lock contention when processing a large number of WRs.
Signed-off-by: Saravanan Vajravel saravanan.vajravel@broadcom.com Link: https://lore.kernel.org/r/20240722110325.195085-1-saravanan.vajravel@broadco... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/mad.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index 674344eb8e2f4..58befbaaf0ad5 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -2616,14 +2616,16 @@ static int retry_send(struct ib_mad_send_wr_private *mad_send_wr)
static void timeout_sends(struct work_struct *work) { + struct ib_mad_send_wr_private *mad_send_wr, *n; struct ib_mad_agent_private *mad_agent_priv; - struct ib_mad_send_wr_private *mad_send_wr; struct ib_mad_send_wc mad_send_wc; + struct list_head local_list; unsigned long flags, delay;
mad_agent_priv = container_of(work, struct ib_mad_agent_private, timed_work.work); mad_send_wc.vendor_err = 0; + INIT_LIST_HEAD(&local_list);
spin_lock_irqsave(&mad_agent_priv->lock, flags); while (!list_empty(&mad_agent_priv->wait_list)) { @@ -2641,13 +2643,16 @@ static void timeout_sends(struct work_struct *work) break; }
- list_del(&mad_send_wr->agent_list); + list_del_init(&mad_send_wr->agent_list); if (mad_send_wr->status == IB_WC_SUCCESS && !retry_send(mad_send_wr)) continue;
- spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + list_add_tail(&mad_send_wr->agent_list, &local_list); + } + spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+ list_for_each_entry_safe(mad_send_wr, n, &local_list, agent_list) { if (mad_send_wr->status == IB_WC_SUCCESS) mad_send_wc.status = IB_WC_RESP_TIMEOUT_ERR; else @@ -2655,11 +2660,8 @@ static void timeout_sends(struct work_struct *work) mad_send_wc.send_buf = &mad_send_wr->send_buf; mad_agent_priv->agent.send_handler(&mad_agent_priv->agent, &mad_send_wc); - deref_mad_agent(mad_agent_priv); - spin_lock_irqsave(&mad_agent_priv->lock, flags); } - spin_unlock_irqrestore(&mad_agent_priv->lock, flags); }
/*
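The locking change can be illustrated with a toy model that counts lock acquisitions instead of using a real spinlock. Everything here is hypothetical and simplified: it assumes every queued entry has timed out and ignores the retry path, the delay handling and the kernel list API.

```c
#include <stddef.h>

struct sk_wr {                     /* stand-in for a timed-out work request */
	struct sk_wr *next;
	int handled;
};

struct sk_agent {
	struct sk_wr *wait_list;   /* protected by the simulated lock */
	unsigned int lock_acquisitions;
};

static void sk_lock(struct sk_agent *a)   { a->lock_acquisitions++; }
static void sk_unlock(struct sk_agent *a) { (void)a; }

/* Old shape: drop and retake the lock around every handler call. */
static void timeout_per_item(struct sk_agent *a)
{
	sk_lock(a);
	while (a->wait_list) {
		struct sk_wr *wr = a->wait_list;

		a->wait_list = wr->next;
		sk_unlock(a);
		wr->handled = 1;   /* send handler runs unlocked */
		sk_lock(a);
	}
	sk_unlock(a);
}

/* New shape: detach the entries under one lock hold, then run handlers. */
static void timeout_batched(struct sk_agent *a)
{
	struct sk_wr *local, *wr, *next;

	sk_lock(a);
	local = a->wait_list;
	a->wait_list = NULL;
	sk_unlock(a);

	/* Read next first, since a real handler may free the entry. */
	for (wr = local; wr; wr = next) {
		next = wr->next;
		wr->handled = 1;
	}
}
```

For n timed-out entries the first variant takes the lock n + 1 times and the second takes it once, which is the contention difference the commit message describes.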
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 5aedb8d8336b0a0421b58ca27d1b572aa6695b5b ]
The existing code enables the Cadence IP interrupts after the bus reset sequence. The problem with this sequence is that it might be pre-empted, giving SoundWire devices time to sync and report as ATTACHED before the interrupts are enabled. In that case, the Cadence IP will not detect a state change and will not throw an interrupt to proceed with the enumeration of a Device0.
In our overnight stress tests, we observed that a slight sub-millisecond delay in enabling interrupts after the reset was correlated with detection failures. This problem is more prevalent on the LunarLake silicon, likely due to SOC integration changes, but it was observed on earlier generations as well.
This patch reverses the sequence, enabling the interrupts before performing the bus reset. This removes the race condition and makes sure the Cadence IP is able to detect the presence of a Device0 in all cases.
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://lore.kernel.org/r/20240805115003.88035-1-yung-chuan.liao@linux.intel... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soundwire/intel_bus_common.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/soundwire/intel_bus_common.c b/drivers/soundwire/intel_bus_common.c index e5ac3cc7cb79b..179aa0d85951b 100644 --- a/drivers/soundwire/intel_bus_common.c +++ b/drivers/soundwire/intel_bus_common.c @@ -45,15 +45,15 @@ int intel_start_bus(struct sdw_intel *sdw) return ret; }
- ret = sdw_cdns_exit_reset(cdns); + ret = sdw_cdns_enable_interrupt(cdns, true); if (ret < 0) { - dev_err(dev, "%s: unable to exit bus reset sequence: %d\n", __func__, ret); + dev_err(dev, "%s: cannot enable interrupts: %d\n", __func__, ret); return ret; }
- ret = sdw_cdns_enable_interrupt(cdns, true); + ret = sdw_cdns_exit_reset(cdns); if (ret < 0) { - dev_err(dev, "%s: cannot enable interrupts: %d\n", __func__, ret); + dev_err(dev, "%s: unable to exit bus reset sequence: %d\n", __func__, ret); return ret; }
@@ -136,15 +136,15 @@ int intel_start_bus_after_reset(struct sdw_intel *sdw) return ret; }
- ret = sdw_cdns_exit_reset(cdns); + ret = sdw_cdns_enable_interrupt(cdns, true); if (ret < 0) { - dev_err(dev, "unable to exit bus reset sequence during resume\n"); + dev_err(dev, "cannot enable interrupts during resume\n"); return ret; }
- ret = sdw_cdns_enable_interrupt(cdns, true); + ret = sdw_cdns_exit_reset(cdns); if (ret < 0) { - dev_err(dev, "cannot enable interrupts during resume\n"); + dev_err(dev, "unable to exit bus reset sequence during resume\n"); return ret; }
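The race can be reduced to a toy state machine. This is a deliberately simplified sketch with invented names, not a model of the Cadence IP: it assumes a device attach raises an interrupt only if interrupts are enabled at that instant, and uses a boolean to represent the preemption window after reset exit.

```c
#include <stdbool.h>

struct sk_bus {
	bool irq_enabled;
	bool attach_irq_seen;   /* did the controller observe the state change? */
};

/* A device syncing is noticed only if interrupts are on at that moment. */
static void sk_device_attaches(struct sk_bus *b)
{
	if (b->irq_enabled)
		b->attach_irq_seen = true;
}

static void sk_enable_interrupt(struct sk_bus *b)
{
	b->irq_enabled = true;
}

/* Old order: exit reset first, then enable interrupts. */
static bool sk_start_old_order(bool preempted)
{
	struct sk_bus b = { false, false };

	/* reset exit happens here */
	if (preempted)
		sk_device_attaches(&b);   /* attach lands before IRQs are on */
	sk_enable_interrupt(&b);
	if (!preempted)
		sk_device_attaches(&b);
	return b.attach_irq_seen;
}

/* New order: enable interrupts first, so any later attach is seen. */
static bool sk_start_new_order(bool preempted)
{
	struct sk_bus b = { false, false };

	sk_enable_interrupt(&b);
	/* reset exit happens here; a delay afterwards no longer matters */
	(void)preempted;
	sk_device_attaches(&b);
	return b.attach_irq_seen;
}
```

Only sk_start_old_order(true) loses the attach event, which corresponds to the sub-millisecond delay observed in the stress tests.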
From: WangYuli wangyuli@uniontech.com
[ Upstream commit 9246b487ab3c3b5993aae7552b7a4c541cc14a49 ]
Add DMA support for audio function of Glenfly Arise chip, which uses Requester ID of function 0.
Link: https://lore.kernel.org/r/CA2BBD087345B6D1+20240823095708.3237375-1-wangyuli... Signed-off-by: SiyuLi siyuli@glenfly.com Signed-off-by: WangYuli wangyuli@uniontech.com [bhelgaas: lower-case hex to match local code, drop unused Device IDs] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 4 ++++ include/linux/pci_ids.h | 2 ++ sound/pci/hda/hda_intel.c | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index ec4277d7835b2..5af53d9cc8b38 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4239,6 +4239,10 @@ static void quirk_dma_func0_alias(struct pci_dev *dev) DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe832, quirk_dma_func0_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
+/* Some Glenfly chips use function 0 as the PCIe Requester ID for DMA */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_GLENFLY, 0x3d40, quirk_dma_func0_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_GLENFLY, 0x3d41, quirk_dma_func0_alias); + static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index abff4e3b6a58b..cebfd1bb9dfa1 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -2653,6 +2653,8 @@ #define PCI_DEVICE_ID_DCI_PCCOM8 0x0002 #define PCI_DEVICE_ID_DCI_PCCOM2 0x0004
+#define PCI_VENDOR_ID_GLENFLY 0x6766 + #define PCI_VENDOR_ID_INTEL 0x8086 #define PCI_DEVICE_ID_INTEL_EESSC 0x0008 #define PCI_DEVICE_ID_INTEL_HDA_CML_LP 0x02c8 diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index d5c9f113e477a..0c64f20664628 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2690,7 +2690,7 @@ static const struct pci_device_id azx_ids[] = { .driver_data = AZX_DRIVER_ATIHDMI_NS | AZX_DCAPS_PRESET_ATI_HDMI_NS | AZX_DCAPS_PM_RUNTIME }, /* GLENFLY */ - { PCI_DEVICE(0x6766, PCI_ANY_ID), + { PCI_DEVICE(PCI_VENDOR_ID_GLENFLY, PCI_ANY_ID), .class = PCI_CLASS_MULTIMEDIA_HD_AUDIO << 8, .class_mask = 0xffffff, .driver_data = AZX_DRIVER_GFHDMI | AZX_DCAPS_POSFIX_LPIB |
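For readers unfamiliar with the function 0 alias, the arithmetic behind it is small. The sketch below reuses the standard devfn encoding (5-bit slot, 3-bit function) under SK_ names and computes which devfn the audio function's DMA is assumed to appear as; func0_alias() is a name made up for this example and only mirrors the idea of the quirk, not its implementation.

```c
#include <stdint.h>

/* Same encoding as the kernel's PCI_DEVFN, PCI_SLOT and PCI_FUNC macros. */
#define SK_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SK_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define SK_PCI_FUNC(devfn)       ((devfn) & 0x07)

/*
 * Returns the devfn of function 0 in the same slot, i.e. the Requester ID
 * the device is assumed to use for DMA. Function 0 maps to itself.
 */
static uint8_t func0_alias(uint8_t devfn)
{
	return (uint8_t)SK_PCI_DEVFN(SK_PCI_SLOT(devfn), 0);
}
```

For a hypothetical audio function at slot 2, function 1 (devfn 0x11), the alias is devfn 0x10, so IOMMU mappings must cover both IDs.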
From: Md Haris Iqbal haris.iqbal@ionos.com
[ Upstream commit d0e62bf7b575fbfe591f6f570e7595dd60a2f5eb ]
For RTRS path establishment, RTRS client initiates and completes con_num of connections. After establishing all its connections, the information is exchanged between the client and server through the info_req message. During this exchange, it is essential that all connections have been established, and the state of the RTRS srv path is CONNECTED.
So add these sanity checks, to make sure we detect and abort process in error scenarios to avoid null pointer deref.
Signed-off-by: Md Haris Iqbal haris.iqbal@ionos.com Signed-off-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Grzegorz Prajsner grzegorz.prajsner@ionos.com Link: https://patch.msgid.link/20240821112217.41827-9-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/rtrs/rtrs-srv.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c index 1d33efb8fb03b..e9835a1666d3a 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c @@ -931,12 +931,11 @@ static void rtrs_srv_info_req_done(struct ib_cq *cq, struct ib_wc *wc) if (err) goto close;
-out: rtrs_iu_free(iu, srv_path->s.dev->ib_dev, 1); return; close: + rtrs_iu_free(iu, srv_path->s.dev->ib_dev, 1); close_path(srv_path); - goto out; }
static int post_recv_info_req(struct rtrs_srv_con *con) @@ -987,6 +986,16 @@ static int post_recv_path(struct rtrs_srv_path *srv_path) q_size = SERVICE_CON_QUEUE_DEPTH; else q_size = srv->queue_depth; + if (srv_path->state != RTRS_SRV_CONNECTING) { + rtrs_err(s, "Path state invalid. state %s\n", + rtrs_srv_state_str(srv_path->state)); + return -EIO; + } + + if (!srv_path->s.con[cid]) { + rtrs_err(s, "Conn not set for %d\n", cid); + return -EIO; + }
err = post_recv_io(to_srv_con(srv_path->s.con[cid]), q_size); if (err) {
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit f92d67e23b8caa81f6322a2bad1d633b00ca000e ]
Driver code is leaking OF node reference from of_get_parent() in bcm53573_ilp_init(). Usage of of_get_parent() is not needed in the first place, because the parent node will not be freed while we are processing given node (triggered by CLK_OF_DECLARE()). Thus fix the leak by accessing parent directly, instead of of_get_parent().
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20240826065801.17081-1-krzysztof.kozlowski@linaro.... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/bcm/clk-bcm53573-ilp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/bcm/clk-bcm53573-ilp.c b/drivers/clk/bcm/clk-bcm53573-ilp.c index 84f2af736ee8a..83ef41d618be3 100644 --- a/drivers/clk/bcm/clk-bcm53573-ilp.c +++ b/drivers/clk/bcm/clk-bcm53573-ilp.c @@ -112,7 +112,7 @@ static void bcm53573_ilp_init(struct device_node *np) goto err_free_ilp; }
- ilp->regmap = syscon_node_to_regmap(of_get_parent(np)); + ilp->regmap = syscon_node_to_regmap(np->parent); if (IS_ERR(ilp->regmap)) { err = PTR_ERR(ilp->regmap); goto err_free_ilp;
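The leak and the fix can be shown with a toy reference count. The struct and helpers below are hypothetical stand-ins for device tree nodes, of_get_parent() and of_node_put(); they only track the counter.

```c
#include <stddef.h>

struct sk_node {
	struct sk_node *parent;
	int refcount;
};

/* Like of_get_parent(): returns the parent with its refcount raised. */
static struct sk_node *sk_get_parent(struct sk_node *np)
{
	if (np->parent)
		np->parent->refcount++;
	return np->parent;
}

/* Like of_node_put(): drops one reference. */
static void sk_node_put(struct sk_node *np)
{
	if (np)
		np->refcount--;
}

/* Placeholder for the consumer, e.g. a regmap lookup. */
static void sk_use(struct sk_node *np)
{
	(void)np;
}

/* Leaky: the reference is taken inline and never dropped. */
static void init_leaky(struct sk_node *np)
{
	sk_use(sk_get_parent(np));
}

/* Balanced alternative: every get is paired with a put. */
static void init_balanced(struct sk_node *np)
{
	struct sk_node *p = sk_get_parent(np);

	sk_use(p);
	sk_node_put(p);
}

/* The patch's approach: borrow the pointer, no refcount change. */
static void init_borrowed(struct sk_node *np)
{
	sk_use(np->parent);
}
```

Borrowing is safe only under the condition stated in the commit message, namely that the parent cannot be freed while the child is being processed.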
From: Subramanian Ananthanarayanan quic_skananth@quicinc.com
[ Upstream commit 026f84d3fa62d215b11cbeb5a5d97df941e93b5c ]
The Qualcomm SA8775P root ports don't advertise an ACS capability, but they do provide ACS-like features to disable peer transactions and validate bus numbers in requests.
Thus, add an ACS quirk for the SA8775P.
Link: https://lore.kernel.org/linux-pci/20240906052228.1829485-1-quic_skananth@qui... Signed-off-by: Subramanian Ananthanarayanan quic_skananth@quicinc.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 5af53d9cc8b38..b70126953fc42 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5067,6 +5067,8 @@ static const struct pci_dev_acs_enabled { /* QCOM QDF2xxx root ports */ { PCI_VENDOR_ID_QCOM, 0x0400, pci_quirk_qcom_rp_acs }, { PCI_VENDOR_ID_QCOM, 0x0401, pci_quirk_qcom_rp_acs }, + /* QCOM SA8775P root port */ + { PCI_VENDOR_ID_QCOM, 0x0115, pci_quirk_qcom_rp_acs }, /* HXT SD4800 root ports. The ACS design is same as QCOM QDF2xxx */ { PCI_VENDOR_ID_HXT, 0x0401, pci_quirk_qcom_rp_acs }, /* Intel PCH root ports */
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 43457ada98c824f310adb7bd96bd5f2fcd9a3279 ]
On chipsets with a second 'Integrated Device Function' SMBus controller use a different adapter-name for the second IDF adapter.
This allows platform glue code, which looks for the primary i801 adapter on which to manually instantiate i2c_clients, to differentiate between the two.
This allows such code to find the primary i801 adapter by name, without needing to duplicate the PCI-ids to feature-flags mapping from i2c-i801.c.
Reviewed-by: Pali Rohár pali@kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Acked-by: Wolfram Sang wsa+renesas@sang-engineering.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-i801.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 3410add34aad2..2b8bcd121ffa5 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -1754,8 +1754,15 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
i801_add_tco(priv);
+ /* + * adapter.name is used by platform code to find the main I801 adapter + * to instantiante i2c_clients, do not change. + */ snprintf(priv->adapter.name, sizeof(priv->adapter.name), - "SMBus I801 adapter at %04lx", priv->smba); + "SMBus %s adapter at %04lx", + (priv->features & FEATURE_IDF) ? "I801 IDF" : "I801", + priv->smba); + err = i2c_add_adapter(&priv->adapter); if (err) { platform_device_unregister(priv->tco_pdev);
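The naming scheme in the i2c-i801 patch above can be sketched in user space as follows. This is only an illustration: DEMO_FEATURE_IDF, its bit value, and both helper functions are hypothetical names, not the driver's own, and the prefix match is just one way platform code might recognize the primary adapter.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the driver's FEATURE_IDF flag; the bit value is made up. */
#define DEMO_FEATURE_IDF (1u << 15)

/* Build the adapter name with the same format string as the patched snprintf() call. */
static void demo_i801_adapter_name(char *buf, size_t len,
				   unsigned int features, unsigned long smba)
{
	snprintf(buf, len, "SMBus %s adapter at %04lx",
		 (features & DEMO_FEATURE_IDF) ? "I801 IDF" : "I801", smba);
}

/* Return 1 if the name has the primary (non-IDF) adapter prefix, else 0. */
static int demo_is_primary_i801(const char *name)
{
	static const char prefix[] = "SMBus I801 adapter at ";

	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

With this scheme a prefix comparison is enough to tell the two adapters apart, because the IDF name diverges from the primary name right after "SMBus I801 ".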
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 2910306655a7072640021563ec9501bfa67f0cb1 ]
Per user reports, the Creative Labs EMU20k2 (Sound Blaster X-Fi Titanium Series) generates spurious interrupts when used with vfio-pci unless DisINTx masking support is disabled.
Thus, quirk the device to mark INTx masking as broken.
Closes: https://lore.kernel.org/all/VI1PR10MB8207C507DB5420AB4C7281E0DB9A2@VI1PR10MB... Link: https://lore.kernel.org/linux-pci/20240912215331.839220-1-alex.williamson@re... Reported-by: zdravko delineshev delineshev@outlook.com Signed-off-by: Alex Williamson alex.williamson@redhat.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index b70126953fc42..e740636e99796 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3601,6 +3601,8 @@ DECLARE_PCI_FIXUP_FINAL(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */ quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(0x1b7c, 0x0004, /* Ceton InfiniTV4 */ quirk_broken_intx_masking); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CREATIVE, PCI_DEVICE_ID_CREATIVE_20K2, + quirk_broken_intx_masking);
/* * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10)
From: Kaixin Wang kxwang23@m.fudan.edu.cn
[ Upstream commit 609366e7a06d035990df78f1562291c3bf0d4a12 ]
In the cdns_i3c_master_probe function, &master->hj_work is bound with cdns_i3c_master_hj, and cdns_i3c_master_interrupt can call the cdns_i3c_master_demux_ibis function to start the work.
Removing the module calls cdns_i3c_master_remove to clean up, which frees master->base through i3c_master_unregister while the work mentioned above may still be in use. The sequence of operations that may lead to a UAF bug is as follows:
CPU0                                      CPU1
                                        | cdns_i3c_master_hj
cdns_i3c_master_remove                  |
  i3c_master_unregister(&master->base)  |
    device_unregister(&master->dev)     |
      device_release                    |
        //free master->base             |
                                        | i3c_master_do_daa(&master->base)
                                        | //use master->base
Fix it by ensuring that the work is canceled before proceeding with the cleanup in cdns_i3c_master_remove.
Signed-off-by: Kaixin Wang kxwang23@m.fudan.edu.cn Link: https://lore.kernel.org/r/20240911153544.848398-1-kxwang23@m.fudan.edu.cn Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/i3c-master-cdns.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/i3c/master/i3c-master-cdns.c b/drivers/i3c/master/i3c-master-cdns.c index fa5aaaf446181..d8426847c2837 100644 --- a/drivers/i3c/master/i3c-master-cdns.c +++ b/drivers/i3c/master/i3c-master-cdns.c @@ -1666,6 +1666,7 @@ static void cdns_i3c_master_remove(struct platform_device *pdev) { struct cdns_i3c_master *master = platform_get_drvdata(pdev);
+ cancel_work_sync(&master->hj_work); i3c_master_unregister(&master->base);
clk_disable_unprepare(master->sysclk);
From: Palmer Dabbelt palmer@rivosinc.com
[ Upstream commit ad380f6a0a5e82e794b45bb2eaec24ed51a56846 ]
I recently ended up with a warning on some compilers along the lines of
  CC      kernel/resource.o
In file included from include/linux/ioport.h:16,
                 from kernel/resource.c:15:
kernel/resource.c: In function 'gfr_start':
include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
   49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
      |                                     ^
include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
   52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
      |         ^~~~~~~~~~~~~~~~~
include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
  161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
      |                           ^~~~~~~~~~
kernel/resource.c:1829:23: note: in expansion of macro 'min_t'
 1829 |                 end = min_t(resource_size_t, base->end,
      |                       ^~~~~
kernel/resource.c: In function 'gfr_continue':
include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
   49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
      |                                     ^
include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
   52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
      |         ^~~~~~~~~~~~~~~~~
include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
  161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
      |                           ^~~~~~~~~~
kernel/resource.c:1847:24: note: in expansion of macro 'min_t'
 1847 |                addr <= min_t(resource_size_t, base->end,
      |                        ^~~~~
cc1: all warnings being treated as errors
which looks like a real problem: our phys_addr_t is only 32 bits now, so having 34-bit masks is just going to result in overflows.
Reviewed-by: Charlie Jenkins charlie@rivosinc.com Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20240731162159.9235-2-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/sparsemem.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/sparsemem.h b/arch/riscv/include/asm/sparsemem.h index 63acaecc33747..2f901a410586d 100644 --- a/arch/riscv/include/asm/sparsemem.h +++ b/arch/riscv/include/asm/sparsemem.h @@ -7,7 +7,7 @@ #ifdef CONFIG_64BIT #define MAX_PHYSMEM_BITS 56 #else -#define MAX_PHYSMEM_BITS 34 +#define MAX_PHYSMEM_BITS 32 #endif /* CONFIG_64BIT */ #define SECTION_SIZE_BITS 27 #endif /* CONFIG_SPARSEMEM */
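The truncation reported in the compiler output of the MAX_PHYSMEM_BITS patch above can be reproduced with plain integer arithmetic. The sketch below is user-space C with hypothetical helper names; it assumes a 32-bit resource_size_t, as the warning text states.

```c
#include <stdint.h>

/* Highest physical address representable with `bits` address bits (bits < 64). */
static uint64_t demo_max_phys_addr(unsigned int bits)
{
	return (UINT64_C(1) << bits) - 1;
}

/* Model a 32-bit resource_size_t: the conversion keeps only the low 32 bits. */
static uint32_t demo_to_resource_size32(uint64_t v)
{
	return (uint32_t)v;
}
```

A 34-bit mask, 17179869183, converts to 4294967295, the two values quoted in the warning, while a 32-bit mask survives the conversion unchanged.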
From: Jens Axboe axboe@kernel.dk
[ Upstream commit eac2ca2d682f94f46b1973bdf5e77d85d77b8e53 ]
In terms of normal application usage, this list will always be empty. And if an application does overflow a bit, it'll have a few entries. However, nothing obviously prevents syzbot from running a test case that generates a ton of overflow entries, and then flushing them can take quite a while.
Check for needing to reschedule while flushing, and drop our locks and do so if necessary. There's no state to maintain here as overflows always prune from head-of-list, hence it's fine to drop and reacquire the locks at the end of the loop.
Link: https://lore.kernel.org/io-uring/66ed061d.050a0220.29194.0053.GAE@google.com... Reported-by: syzbot+5fca234bd7eb378ff78e@syzkaller.appspotmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/io_uring.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 68504709f75cb..7ecfd314cf3cb 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -701,6 +701,21 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx) memcpy(cqe, &ocqe->cqe, cqe_size); list_del(&ocqe->list); kfree(ocqe); + + /* + * For silly syzbot cases that deliberately overflow by huge + * amounts, check if we need to resched and drop and + * reacquire the locks if so. Nothing real would ever hit this. + * Ideally we'd have a non-posting unlock for this, but hard + * to care for a non-real case. + */ + if (need_resched()) { + io_cq_unlock_post(ctx); + mutex_unlock(&ctx->uring_lock); + cond_resched(); + mutex_lock(&ctx->uring_lock); + io_cq_lock(ctx); + } }
if (list_empty(&ctx->cq_overflow_list)) {
From: Kaixin Wang kxwang23@m.fudan.edu.cn
[ Upstream commit e51aded92d42784313ba16c12f4f88cc4f973bbb ]
The switchtec_ntb_add function can call switchtec_ntb_init_sndev, which binds &sndev->check_link_status_work with check_link_status_work. switchtec_ntb_link_notification may then be called to start the work.
Removing the module calls switchtec_ntb_remove to clean up, which frees sndev through kfree(sndev) while the work mentioned above may still be in use. The sequence of operations that may lead to a UAF bug is as follows:
CPU0                      CPU1
                        | check_link_status_work
switchtec_ntb_remove    |
  kfree(sndev);         |
                        | if (sndev->link_force_down)
                        | // use sndev
Fix it by ensuring that the work is canceled before proceeding with the cleanup in switchtec_ntb_remove.
Signed-off-by: Kaixin Wang kxwang23@m.fudan.edu.cn Reviewed-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Jon Mason jdmason@kudzu.us Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c index d6bbcc7b5b90d..0a94c634ddc27 100644 --- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c +++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c @@ -1554,6 +1554,7 @@ static void switchtec_ntb_remove(struct device *dev) switchtec_ntb_deinit_db_msg_irq(sndev); switchtec_ntb_deinit_shared_mw(sndev); switchtec_ntb_deinit_crosslink(sndev); + cancel_work_sync(&sndev->check_link_status_work); kfree(sndev); dev_info(dev, "ntb device unregistered\n"); }
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit ae7eee56cdcfcb6a886f76232778d6517fd58690 ]
There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it turns out that the 2G version has a DMI product name of "CHERRYVIEW D1 PLATFORM" whereas the 4G version has "CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version checks are unique enough that the product-name check is not necessary.
Drop the product-name check so that the existing DMI match for the 4G RAM version also matches the 2G RAM version.
Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Andy Shevchenko andy@kernel.org Link: https://lore.kernel.org/r/20240825132617.8809-1-hdegoede@redhat.com Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mfd/intel_soc_pmic_chtwc.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/mfd/intel_soc_pmic_chtwc.c b/drivers/mfd/intel_soc_pmic_chtwc.c index 7fce3ef7ab453..2a83f540d4c9d 100644 --- a/drivers/mfd/intel_soc_pmic_chtwc.c +++ b/drivers/mfd/intel_soc_pmic_chtwc.c @@ -178,7 +178,6 @@ static const struct dmi_system_id cht_wc_model_dmi_ids[] = { .driver_data = (void *)(long)INTEL_CHT_WC_LENOVO_YT3_X90, .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"), - DMI_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"), DMI_MATCH(DMI_PRODUCT_VERSION, "Blade3-10A-001"), }, },
From: Jisheng Zhang jszhang@kernel.org
[ Upstream commit 8f1534e7440382d118c3d655d3a6014128b2086d ]
Inspired by [1], stop modifying ra in ret_from_fork to avoid an imbalanced RAS (return address stack), which may lead to incorrect predictions on return.
Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tensto... [1] Signed-off-by: Jisheng Zhang jszhang@kernel.org Reviewed-by: Cyril Bur cyrilbur@tenstorrent.com Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/entry.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index ed7baf2cf7e87..1f90fee24a8ba 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -219,8 +219,8 @@ SYM_CODE_START(ret_from_fork) jalr s0 1: move a0, sp /* pt_regs */ - la ra, ret_from_exception - tail syscall_exit_to_user_mode + call syscall_exit_to_user_mode + j ret_from_exception SYM_CODE_END(ret_from_fork)
/*
From: Michael Guralnik michaelgur@nvidia.com
[ Upstream commit 8c6d097d830f779fc1725fbaa1314f20a7a07b4b ]
With the new memory scheme, page faults request that the driver fetch additional pages beyond the faulted memory access. This is done in order to prefetch pages before and after the area that got the page fault, on the assumption that it will reduce the total number of page faults.
The driver should ensure it handles only the pages that are within the umem range.
Signed-off-by: Michael Guralnik michaelgur@nvidia.com Link: https://patch.msgid.link/20240909100504.29797-5-michaelgur@nvidia.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/odp.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index a524181f34df9..3a4605fda6d57 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -733,24 +733,31 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt, * >0: Number of pages mapped */ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt, - u32 *bytes_mapped, u32 flags) + u32 *bytes_mapped, u32 flags, bool permissive_fault) { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);
- if (unlikely(io_virt < mr->ibmr.iova)) + if (unlikely(io_virt < mr->ibmr.iova) && !permissive_fault) return -EFAULT;
if (mr->umem->is_dmabuf) return pagefault_dmabuf_mr(mr, bcnt, bytes_mapped, flags);
if (!odp->is_implicit_odp) { + u64 offset = io_virt < mr->ibmr.iova ? 0 : io_virt - mr->ibmr.iova; u64 user_va;
- if (check_add_overflow(io_virt - mr->ibmr.iova, - (u64)odp->umem.address, &user_va)) + if (check_add_overflow(offset, (u64)odp->umem.address, + &user_va)) return -EFAULT; - if (unlikely(user_va >= ib_umem_end(odp) || - ib_umem_end(odp) - user_va < bcnt)) + + if (permissive_fault) { + if (user_va < ib_umem_start(odp)) + user_va = ib_umem_start(odp); + if ((user_va + bcnt) > ib_umem_end(odp)) + bcnt = ib_umem_end(odp) - user_va; + } else if (unlikely(user_va >= ib_umem_end(odp) || + ib_umem_end(odp) - user_va < bcnt)) return -EFAULT; return pagefault_real_mr(mr, odp, user_va, bcnt, bytes_mapped, flags); @@ -857,7 +864,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, case MLX5_MKEY_MR: mr = container_of(mmkey, struct mlx5_ib_mr, mmkey);
- ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0); + ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0, false); if (ret < 0) goto end;
@@ -1710,7 +1717,7 @@ static void mlx5_ib_prefetch_mr_work(struct work_struct *w) for (i = 0; i < work->num_sge; ++i) { ret = pagefault_mr(work->frags[i].mr, work->frags[i].io_virt, work->frags[i].length, &bytes_mapped, - work->pf_flags); + work->pf_flags, false); if (ret <= 0) continue; mlx5_update_odp_stats(work->frags[i].mr, prefetch, ret); @@ -1761,7 +1768,7 @@ static int mlx5_ib_prefetch_sg_list(struct ib_pd *pd, if (IS_ERR(mr)) return PTR_ERR(mr); ret = pagefault_mr(mr, sg_list[i].addr, sg_list[i].length, - &bytes_mapped, pf_flags); + &bytes_mapped, pf_flags, false); if (ret < 0) { mlx5r_deref_odp_mkey(&mr->mmkey); return ret;
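The range handling that the mlx5 ODP patch above adds to pagefault_mr() for the non-implicit case can be modeled in user space roughly as below. This is a sketch under assumptions: struct demo_umem and demo_fault_range() are hypothetical names, the umem address is taken to equal the range start, and the kernel's check_add_overflow() is replaced by a manual test. Note that every caller shown in this diff still passes false for permissive_fault.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the registered umem range [start, end). */
struct demo_umem {
	uint64_t start;
	uint64_t end;
};

/*
 * Translate a faulting io_virt into a user VA and byte count.
 * Strict mode rejects anything outside the range; permissive mode
 * clamps the request so only pages inside the range are handled.
 */
static int demo_fault_range(const struct demo_umem *umem, uint64_t iova,
			    uint64_t io_virt, bool permissive,
			    uint64_t *va, uint64_t *bcnt)
{
	uint64_t offset, user_va;

	if (io_virt < iova && !permissive)
		return -EFAULT;

	offset = io_virt < iova ? 0 : io_virt - iova;
	if (offset > UINT64_MAX - umem->start)	/* addition would overflow */
		return -EFAULT;
	user_va = offset + umem->start;

	if (permissive) {
		/* Kept for fidelity with the patch; unreachable in this model. */
		if (user_va < umem->start)
			user_va = umem->start;
		if (user_va + *bcnt > umem->end)
			*bcnt = umem->end - user_va;
	} else if (user_va >= umem->end || umem->end - user_va < *bcnt) {
		return -EFAULT;
	}

	*va = user_va;
	return 0;
}
```

In strict mode an oversized request fails with -EFAULT; in permissive mode the same request succeeds with a byte count trimmed to the end of the range.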
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit f8c35d61ba01afa76846905c67862cdace7f66b0 ]
The SoundWire peripheral enumeration is entirely based on interrupts, more specifically sticky bits tracking state changes.
This patch adds a defensive programming check on the actual status reported in PING frames. If for some reason an interrupt was lost or delayed, the delayed work would detect a peripheral change of status after the bus starts.
The 100ms delay is not completely arbitrary: if a Peripheral didn't join the bus within that delay, the hardware link is probably broken; conversely, if the detection didn't happen because of software issues, 100ms is still acceptable in terms of user experience.
The overhead of the one-shot workqueue is minimal, and the mutual exclusion ensures that the interrupt and delayed work cannot update the status concurrently.
Reviewed-by: Liam Girdwood liam.r.girdwood@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://lore.kernel.org/r/20240805114921.88007-1-yung-chuan.liao@linux.intel... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soundwire/cadence_master.c | 39 ++++++++++++++++++++++++++-- drivers/soundwire/cadence_master.h | 5 ++++ drivers/soundwire/intel.h | 2 ++ drivers/soundwire/intel_auxdevice.c | 1 + drivers/soundwire/intel_bus_common.c | 11 ++++++++ 5 files changed, 56 insertions(+), 2 deletions(-)
diff --git a/drivers/soundwire/cadence_master.c b/drivers/soundwire/cadence_master.c index 3e7cf04aaf2a6..e69982dbd449b 100644 --- a/drivers/soundwire/cadence_master.c +++ b/drivers/soundwire/cadence_master.c @@ -891,8 +891,14 @@ static int cdns_update_slave_status(struct sdw_cdns *cdns, } }
- if (is_slave) - return sdw_handle_slave_status(&cdns->bus, status); + if (is_slave) { + int ret; + + mutex_lock(&cdns->status_update_lock); + ret = sdw_handle_slave_status(&cdns->bus, status); + mutex_unlock(&cdns->status_update_lock); + return ret; + }
return 0; } @@ -989,6 +995,31 @@ irqreturn_t sdw_cdns_irq(int irq, void *dev_id) } EXPORT_SYMBOL(sdw_cdns_irq);
+static void cdns_check_attached_status_dwork(struct work_struct *work) +{ + struct sdw_cdns *cdns = + container_of(work, struct sdw_cdns, attach_dwork.work); + enum sdw_slave_status status[SDW_MAX_DEVICES + 1]; + u32 val; + int ret; + int i; + + val = cdns_readl(cdns, CDNS_MCP_SLAVE_STAT); + + for (i = 0; i <= SDW_MAX_DEVICES; i++) { + status[i] = val & 0x3; + if (status[i]) + dev_dbg(cdns->dev, "Peripheral %d status: %d\n", i, status[i]); + val >>= 2; + } + + mutex_lock(&cdns->status_update_lock); + ret = sdw_handle_slave_status(&cdns->bus, status); + mutex_unlock(&cdns->status_update_lock); + if (ret < 0) + dev_err(cdns->dev, "%s: sdw_handle_slave_status failed: %d\n", __func__, ret); +} + /** * cdns_update_slave_status_work - update slave status in a work since we will need to handle * other interrupts eg. CDNS_MCP_INT_RX_WL during the update slave @@ -1745,7 +1776,11 @@ int sdw_cdns_probe(struct sdw_cdns *cdns) init_completion(&cdns->tx_complete); cdns->bus.port_ops = &cdns_port_ops;
+ mutex_init(&cdns->status_update_lock); + INIT_WORK(&cdns->work, cdns_update_slave_status_work); + INIT_DELAYED_WORK(&cdns->attach_dwork, cdns_check_attached_status_dwork); + return 0; } EXPORT_SYMBOL(sdw_cdns_probe); diff --git a/drivers/soundwire/cadence_master.h b/drivers/soundwire/cadence_master.h index bc84435e420f5..e1d7969ba48ae 100644 --- a/drivers/soundwire/cadence_master.h +++ b/drivers/soundwire/cadence_master.h @@ -117,6 +117,8 @@ struct sdw_cdns_dai_runtime { * @link_up: Link status * @msg_count: Messages sent on bus * @dai_runtime_array: runtime context for each allocated DAI. + * @status_update_lock: protect concurrency between interrupt-based and delayed work + * status update */ struct sdw_cdns { struct device *dev; @@ -148,10 +150,13 @@ struct sdw_cdns { bool interrupt_enabled;
struct work_struct work; + struct delayed_work attach_dwork;
struct list_head list;
struct sdw_cdns_dai_runtime **dai_runtime_array; + + struct mutex status_update_lock; /* add mutual exclusion to sdw_handle_slave_status() */ };
#define bus_to_cdns(_bus) container_of(_bus, struct sdw_cdns, bus) diff --git a/drivers/soundwire/intel.h b/drivers/soundwire/intel.h index 511932c55216c..bb6b1df2d2c20 100644 --- a/drivers/soundwire/intel.h +++ b/drivers/soundwire/intel.h @@ -91,6 +91,8 @@ static inline void intel_writew(void __iomem *base, int offset, u16 value)
#define INTEL_MASTER_RESET_ITERATIONS 10
+#define SDW_INTEL_DELAYED_ENUMERATION_MS 100 + #define SDW_INTEL_CHECK_OPS(sdw, cb) ((sdw) && (sdw)->link_res && (sdw)->link_res->hw_ops && \ (sdw)->link_res->hw_ops->cb) #define SDW_INTEL_OPS(sdw, cb) ((sdw)->link_res->hw_ops->cb) diff --git a/drivers/soundwire/intel_auxdevice.c b/drivers/soundwire/intel_auxdevice.c index 93698532deac4..bdfff78ac2f81 100644 --- a/drivers/soundwire/intel_auxdevice.c +++ b/drivers/soundwire/intel_auxdevice.c @@ -398,6 +398,7 @@ static void intel_link_remove(struct auxiliary_device *auxdev) */ if (!bus->prop.hw_disabled) { sdw_intel_debugfs_exit(sdw); + cancel_delayed_work_sync(&cdns->attach_dwork); sdw_cdns_enable_interrupt(cdns, false); } sdw_bus_master_delete(bus); diff --git a/drivers/soundwire/intel_bus_common.c b/drivers/soundwire/intel_bus_common.c index 179aa0d85951b..db9cf211671a3 100644 --- a/drivers/soundwire/intel_bus_common.c +++ b/drivers/soundwire/intel_bus_common.c @@ -60,6 +60,9 @@ int intel_start_bus(struct sdw_intel *sdw) sdw_cdns_check_self_clearing_bits(cdns, __func__, true, INTEL_MASTER_RESET_ITERATIONS);
+ schedule_delayed_work(&cdns->attach_dwork, + msecs_to_jiffies(SDW_INTEL_DELAYED_ENUMERATION_MS)); + return 0; }
@@ -151,6 +154,9 @@ int intel_start_bus_after_reset(struct sdw_intel *sdw) } sdw_cdns_check_self_clearing_bits(cdns, __func__, true, INTEL_MASTER_RESET_ITERATIONS);
+ schedule_delayed_work(&cdns->attach_dwork, + msecs_to_jiffies(SDW_INTEL_DELAYED_ENUMERATION_MS)); + return 0; }
@@ -184,6 +190,9 @@ int intel_start_bus_after_clock_stop(struct sdw_intel *sdw)
sdw_cdns_check_self_clearing_bits(cdns, __func__, true, INTEL_MASTER_RESET_ITERATIONS);
+ schedule_delayed_work(&cdns->attach_dwork, + msecs_to_jiffies(SDW_INTEL_DELAYED_ENUMERATION_MS)); + return 0; }
@@ -194,6 +203,8 @@ int intel_stop_bus(struct sdw_intel *sdw, bool clock_stop) bool wake_enable = false; int ret;
+ cancel_delayed_work_sync(&cdns->attach_dwork); + if (clock_stop) { ret = sdw_cdns_clock_stop(cdns, true); if (ret < 0)
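The status decoding loop in cdns_check_attached_status_dwork() from the SoundWire patch above unpacks one 2-bit status per device from a single register value. A minimal user-space sketch of that loop follows; DEMO_SDW_MAX_DEVICES and demo_decode_status() are hypothetical names, and the value 11 is an assumption about the kernel's SDW_MAX_DEVICES.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's SDW_MAX_DEVICES; the value is an assumption. */
#define DEMO_SDW_MAX_DEVICES 11

/* Split a packed status word into one 2-bit status per device number 0..MAX. */
static void demo_decode_status(uint32_t val,
			       unsigned int status[DEMO_SDW_MAX_DEVICES + 1])
{
	int i;

	for (i = 0; i <= DEMO_SDW_MAX_DEVICES; i++) {
		status[i] = val & 0x3;	/* low two bits belong to device i */
		val >>= 2;		/* move the next device into place */
	}
}
```

In the patch, the decoded array is then handed to sdw_handle_slave_status() under status_update_lock, so the interrupt path and the delayed work cannot update the status concurrently.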
From: Ying Sun sunying@isrc.iscas.ac.cn
[ Upstream commit c6ebf2c528470a09be77d0d9df2c6617ea037ac5 ]
Running the following on a kernel with CONFIG_RISCV_ALTERNATIVE enabled:
  kexec -sl vmlinux
fails with:
  kexec_image: Unknown rela relocation: 34
  kexec_image: Error loading purgatory ret=-8
and
  kexec_image: Unknown rela relocation: 38
  kexec_image: Error loading purgatory ret=-8
The purgatory code uses the 16-bit addition and subtraction relocation types, which are not handled, so kexec_file_load fails. Add handling for them to arch_kexec_apply_relocations_add().
Tested on RISC-V64 Qemu-virt, issue fixed.
Co-developed-by: Petr Tesarik petr@tesarici.cz Signed-off-by: Petr Tesarik petr@tesarici.cz Signed-off-by: Ying Sun sunying@isrc.iscas.ac.cn Reviewed-by: Andrew Jones ajones@ventanamicro.com Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cn Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/elf_kexec.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/riscv/kernel/elf_kexec.c b/arch/riscv/kernel/elf_kexec.c index e60fbd8660c4a..8c32bf1eedda0 100644 --- a/arch/riscv/kernel/elf_kexec.c +++ b/arch/riscv/kernel/elf_kexec.c @@ -444,6 +444,12 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi, *(u32 *)loc = CLEAN_IMM(CJTYPE, *(u32 *)loc) | ENCODE_CJTYPE_IMM(val - addr); break; + case R_RISCV_ADD16: + *(u16 *)loc += val; + break; + case R_RISCV_SUB16: + *(u16 *)loc -= val; + break; case R_RISCV_ADD32: *(u32 *)loc += val; break;
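The two new cases in the kexec patch above add or subtract the relocation value in place on a 16-bit word. A user-space sketch of the same arithmetic follows; the DEMO_ names and demo_apply_reloc16() are hypothetical, the type numbers are the ones printed in the error messages quoted in the commit message, and memcpy() is used instead of a pointer cast only to sidestep alignment concerns outside the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Relocation type numbers, matching the "34" and "38" in the reported errors. */
#define DEMO_R_RISCV_ADD16 34
#define DEMO_R_RISCV_SUB16 38

/* Apply one 16-bit add/sub relocation at loc; returns 0, or -1 for an unhandled type. */
static int demo_apply_reloc16(unsigned char *loc, unsigned int type, uint64_t val)
{
	uint16_t cur;

	memcpy(&cur, loc, sizeof(cur));	/* read the current 16-bit word */
	switch (type) {
	case DEMO_R_RISCV_ADD16:
		cur = (uint16_t)(cur + val);	/* wraps modulo 2^16 */
		break;
	case DEMO_R_RISCV_SUB16:
		cur = (uint16_t)(cur - val);	/* wraps modulo 2^16 */
		break;
	default:
		return -1;
	}
	memcpy(loc, &cur, sizeof(cur));	/* write the result back in place */
	return 0;
}
```

Only the low 16 bits of the result are kept, which is the same effect as the `*(u16 *)loc += val` and `*(u16 *)loc -= val` statements in the diff.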
From: Yunke Cao yunkec@chromium.org
[ Upstream commit 6a9c97ab6b7e85697e0b74e86062192a5ffffd99 ]
Clear vb2_plane's memory related fields in __vb2_plane_dmabuf_put(), including bytesused, length, fd and data_offset.
Remove the duplicated code in __prepare_dmabuf().
Signed-off-by: Yunke Cao yunkec@chromium.org Acked-by: Tomasz Figa tfiga@chromium.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/common/videobuf2/videobuf2-core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c index 468191438849e..29bfc2bf796b6 100644 --- a/drivers/media/common/videobuf2/videobuf2-core.c +++ b/drivers/media/common/videobuf2/videobuf2-core.c @@ -302,6 +302,10 @@ static void __vb2_plane_dmabuf_put(struct vb2_buffer *vb, struct vb2_plane *p) p->mem_priv = NULL; p->dbuf = NULL; p->dbuf_mapped = 0; + p->bytesused = 0; + p->length = 0; + p->m.fd = 0; + p->data_offset = 0; }
/* @@ -1296,10 +1300,6 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
/* Release previously acquired memory if present */ __vb2_plane_dmabuf_put(vb, &vb->planes[plane]); - vb->planes[plane].bytesused = 0; - vb->planes[plane].length = 0; - vb->planes[plane].m.fd = 0; - vb->planes[plane].data_offset = 0;
/* Acquire each plane's memory */ mem_priv = call_ptr_memop(attach_dmabuf,
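The effect of the videobuf2 patch above is that a single helper now resets every memory-related field of a plane, so callers no longer repeat the clearing themselves. A reduced sketch is shown below; struct demo_plane is a hypothetical, trimmed-down stand-in for struct vb2_plane (the real structure keeps fd inside a union named m), and the dma-buf detach and release steps of the real helper are omitted.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct vb2_plane. */
struct demo_plane {
	void *mem_priv;
	void *dbuf;
	unsigned int dbuf_mapped;
	unsigned int bytesused;
	unsigned int length;
	int fd;
	unsigned int data_offset;
};

/* Reset all memory-related state in one place, as the patched helper does. */
static void demo_plane_dmabuf_put(struct demo_plane *p)
{
	p->mem_priv = NULL;
	p->dbuf = NULL;
	p->dbuf_mapped = 0;
	p->bytesused = 0;
	p->length = 0;
	p->fd = 0;
	p->data_offset = 0;
}
```

Keeping the reset inside the put helper means any path that releases a plane leaves it in the same cleared state.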
From: Peng Fan peng.fan@nxp.com
[ Upstream commit e954a1bd16102abc800629f9900715d8ec4c3130 ]
If there is a resource table device tree node, use its address as the resource table address; otherwise use the address inside the Cortex-M ELF file (where the .resource_table section is loaded).
An update to the NXP SDK enables Resource Domain Control (RDC) to protect TCM, so Linux can no longer write to the TCM space when updating the resource table status, which causes a kernel dump. Using the address from the device tree avoids the kernel dump.
Note: the NXP M4 SDK does not check for resource table updates, so it does not matter whether the resource table address comes from the ELF file or from the device tree. Still, specifying a resource table address in the device tree signals that the user is aware of it and intends to use it instead of the address in the ELF file, so the driver should reflect that.
Reviewed-by: Iuliana Prodan iuliana.prodan@nxp.com Signed-off-by: Peng Fan peng.fan@nxp.com Reviewed-by: Daniel Baluta daniel.baluta@nxp.com Link: https://lore.kernel.org/r/20240719-imx_rproc-v2-2-10d0268c7eb1@nxp.com Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/imx_rproc.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index cfee164dd645c..db281b7a38b3d 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -669,6 +669,17 @@ static struct resource_table *imx_rproc_get_loaded_rsc_table(struct rproc *rproc return (struct resource_table *)priv->rsc_table; }
+static struct resource_table * +imx_rproc_elf_find_loaded_rsc_table(struct rproc *rproc, const struct firmware *fw) +{ + struct imx_rproc *priv = rproc->priv; + + if (priv->rsc_table) + return (struct resource_table *)priv->rsc_table; + + return rproc_elf_find_loaded_rsc_table(rproc, fw); +} + static const struct rproc_ops imx_rproc_ops = { .prepare = imx_rproc_prepare, .attach = imx_rproc_attach, @@ -679,7 +690,7 @@ static const struct rproc_ops imx_rproc_ops = { .da_to_va = imx_rproc_da_to_va, .load = rproc_elf_load_segments, .parse_fw = imx_rproc_parse_fw, - .find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table, + .find_loaded_rsc_table = imx_rproc_elf_find_loaded_rsc_table, .get_loaded_rsc_table = imx_rproc_get_loaded_rsc_table, .sanity_check = rproc_elf_sanity_check, .get_boot_addr = rproc_elf_get_boot_addr,
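The selection rule that imx_rproc_elf_find_loaded_rsc_table() implements in the patch above is a simple preference order: a table address taken from the device tree wins, and the ELF lookup is only a fallback. The sketch below shows that order in isolation; all demo_ names are hypothetical, and the ELF lookup is reduced to a pointer passed in by the caller.

```c
#include <stddef.h>

/* Hypothetical placeholder for struct resource_table. */
struct demo_rsc_table {
	int ver;
};

/* Hypothetical private data: rsc_table is set only when the device tree provides one. */
struct demo_rproc_priv {
	struct demo_rsc_table *rsc_table;
};

/* Prefer the device-tree table; fall back to the one found in the ELF image. */
static struct demo_rsc_table *
demo_find_loaded_rsc_table(const struct demo_rproc_priv *priv,
			   struct demo_rsc_table *from_elf)
{
	if (priv->rsc_table)
		return priv->rsc_table;

	return from_elf;
}
```

When no device-tree table is present, the behavior is unchanged from before the patch, since the ELF result is returned as-is.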
From: Peng Fan peng.fan@nxp.com
[ Upstream commit a54c441b46a0745683c2eef5a359d22856d27323 ]
For the i.MX7D DRAM-related mux clocks, the clock source change should ONLY be done in low-level asm code without accessing DRAM, followed by a call to the clk API to sync the HW clock status with the clk tree. The real clock source switch should never go through the clk API, so the CLK_SET_PARENT_GATE flag should NOT be set; otherwise, DRAM's clock parent will be disabled while DRAM is active, and the system will hang.
Signed-off-by: Peng Fan peng.fan@nxp.com Reviewed-by: Abel Vesa abel.vesa@linaro.org Link: https://lore.kernel.org/r/20240607133347.3291040-8-peng.fan@oss.nxp.com Signed-off-by: Abel Vesa abel.vesa@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/imx/clk-imx7d.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/imx/clk-imx7d.c b/drivers/clk/imx/clk-imx7d.c index 2b77d1fc7bb94..1e1296e748357 100644 --- a/drivers/clk/imx/clk-imx7d.c +++ b/drivers/clk/imx/clk-imx7d.c @@ -498,9 +498,9 @@ static void __init imx7d_clocks_init(struct device_node *ccm_node) hws[IMX7D_ENET_AXI_ROOT_SRC] = imx_clk_hw_mux2_flags("enet_axi_src", base + 0x8900, 24, 3, enet_axi_sel, ARRAY_SIZE(enet_axi_sel), CLK_SET_PARENT_GATE); hws[IMX7D_NAND_USDHC_BUS_ROOT_SRC] = imx_clk_hw_mux2_flags("nand_usdhc_src", base + 0x8980, 24, 3, nand_usdhc_bus_sel, ARRAY_SIZE(nand_usdhc_bus_sel), CLK_SET_PARENT_GATE); hws[IMX7D_DRAM_PHYM_ROOT_SRC] = imx_clk_hw_mux2_flags("dram_phym_src", base + 0x9800, 24, 1, dram_phym_sel, ARRAY_SIZE(dram_phym_sel), CLK_SET_PARENT_GATE); - hws[IMX7D_DRAM_ROOT_SRC] = imx_clk_hw_mux2_flags("dram_src", base + 0x9880, 24, 1, dram_sel, ARRAY_SIZE(dram_sel), CLK_SET_PARENT_GATE); + hws[IMX7D_DRAM_ROOT_SRC] = imx_clk_hw_mux2("dram_src", base + 0x9880, 24, 1, dram_sel, ARRAY_SIZE(dram_sel)); hws[IMX7D_DRAM_PHYM_ALT_ROOT_SRC] = imx_clk_hw_mux2_flags("dram_phym_alt_src", base + 0xa000, 24, 3, dram_phym_alt_sel, ARRAY_SIZE(dram_phym_alt_sel), CLK_SET_PARENT_GATE); - hws[IMX7D_DRAM_ALT_ROOT_SRC] = imx_clk_hw_mux2_flags("dram_alt_src", base + 0xa080, 24, 3, dram_alt_sel, ARRAY_SIZE(dram_alt_sel), CLK_SET_PARENT_GATE); + hws[IMX7D_DRAM_ALT_ROOT_SRC] = imx_clk_hw_mux2("dram_alt_src", base + 0xa080, 24, 3, dram_alt_sel, ARRAY_SIZE(dram_alt_sel)); hws[IMX7D_USB_HSIC_ROOT_SRC] = imx_clk_hw_mux2_flags("usb_hsic_src", base + 0xa100, 24, 3, usb_hsic_sel, ARRAY_SIZE(usb_hsic_sel), CLK_SET_PARENT_GATE); hws[IMX7D_PCIE_CTRL_ROOT_SRC] = imx_clk_hw_mux2_flags("pcie_ctrl_src", base + 0xa180, 24, 3, pcie_ctrl_sel, ARRAY_SIZE(pcie_ctrl_sel), CLK_SET_PARENT_GATE); hws[IMX7D_PCIE_PHY_ROOT_SRC] = imx_clk_hw_mux2_flags("pcie_phy_src", base + 0xa200, 24, 3, pcie_phy_sel, ARRAY_SIZE(pcie_phy_sel), CLK_SET_PARENT_GATE);
From: Alexander Mikhalitsyn aleksandr.mikhalitsyn@canonical.com
[ Upstream commit 5b8ca5a54cb89ab07b0389f50e038e533cdfdd86 ]
This is needed to properly clear suid/sgid.
Signed-off-by: Alexander Mikhalitsyn aleksandr.mikhalitsyn@canonical.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index ceb9f7d230388..eb2a3ffb1e816 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1301,6 +1301,7 @@ static ssize_t fuse_perform_write(struct kiocb *iocb, struct iov_iter *ii) static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; + struct mnt_idmap *idmap = file_mnt_idmap(file); struct address_space *mapping = file->f_mapping; ssize_t written = 0; struct inode *inode = mapping->host; @@ -1315,7 +1316,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from) return err;
if (fc->handle_killpriv_v2 && - setattr_should_drop_suidgid(&nop_mnt_idmap, + setattr_should_drop_suidgid(idmap, file_inode(file))) { goto writethrough; }
From: "Jiri Slaby (SUSE)" jirislaby@kernel.org
[ Upstream commit 602babaa84d627923713acaf5f7e9a4369e77473 ]
Commit af224ca2df29 (serial: core: Prevent unsafe uart port access, part 3) added a few uport == NULL checks. It added one to uart_shutdown(), so the commit assumes uport can be NULL there. But right after that protection, there is an unprotected "uart_port_dtr_rts(uport, false);" call. That is invoked only if HUPCL is set, so I assume that is the reason why we do not see lots of these reports.
Or it cannot be NULL at this point at all for some reason :P.
Until the above is investigated, stay on the safe side and move this dereference to the if too.
I got this inconsistency from Coverity under CID 1585130. Thanks.
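The guarded dereference described above can be sketched in userspace as follows. This is only a model under stated assumptions: struct port_model, struct tty_model and the model_* functions are hypothetical stand-ins, not the serial core types, and the early return replaces the kernel's nested if block purely for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct uart_port; not the kernel type. */
struct port_model {
	bool is_console;          /* models uart_console(uport) */
	unsigned int saved_cflag; /* models uport->cons->cflag */
	int dtr_rts_calls;        /* how often DTR/RTS were dropped */
};

/* Hypothetical stand-in for the tty; hupcl models C_HUPCL(tty). */
struct tty_model {
	unsigned int cflag;
	bool hupcl;
};

/* Like uart_port_dtr_rts(), this helper dereferences its argument. */
static void model_dtr_rts_off(struct port_model *uport)
{
	assert(uport != NULL);
	uport->dtr_rts_calls++;
}

/* Returns false when there was no port to act on. */
bool model_shutdown(struct port_model *uport, const struct tty_model *tty)
{
	if (!uport)
		return false;

	/* Both uses of uport now sit behind the same NULL check. */
	if (uport->is_console && tty)
		uport->saved_cflag = tty->cflag;

	if (!tty || tty->hupcl)
		model_dtr_rts_off(uport);

	return true;
}
```

Calling model_shutdown(NULL, tty) is harmless in this model, which is the property the patch restores for the HUPCL path.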
Signed-off-by: Jiri Slaby (SUSE) jirislaby@kernel.org Cc: Peter Hurley peter@hurleysoftware.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20240805102046.307511-3-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/serial_core.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index ed8798fdf522a..d7c59d690a74d 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -374,14 +374,16 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state) /* * Turn off DTR and RTS early. */ - if (uport && uart_console(uport) && tty) { - uport->cons->cflag = tty->termios.c_cflag; - uport->cons->ispeed = tty->termios.c_ispeed; - uport->cons->ospeed = tty->termios.c_ospeed; - } + if (uport) { + if (uart_console(uport) && tty) { + uport->cons->cflag = tty->termios.c_cflag; + uport->cons->ispeed = tty->termios.c_ispeed; + uport->cons->ospeed = tty->termios.c_ospeed; + }
- if (!tty || C_HUPCL(tty)) - uart_port_dtr_rts(uport, false); + if (!tty || C_HUPCL(tty)) + uart_port_dtr_rts(uport, false); + }
uart_port_shutdown(port); }
From: Wadim Egorov w.egorov@phytec.de
[ Upstream commit db63d9868f7f310de44ba7bea584e2454f8b4ed0 ]
In polling mode, if no IRQ was requested there is no need to free it. Call devm_free_irq() only if client->irq is set. This fixes the warning caused by the tps6598x module removal:
WARNING: CPU: 2 PID: 333 at kernel/irq/devres.c:144 devm_free_irq+0x80/0x8c
...
...
Call trace:
  devm_free_irq+0x80/0x8c
  tps6598x_remove+0x28/0x88 [tps6598x]
  i2c_device_remove+0x2c/0x9c
  device_remove+0x4c/0x80
  device_release_driver_internal+0x1cc/0x228
  driver_detach+0x50/0x98
  bus_remove_driver+0x6c/0xbc
  driver_unregister+0x30/0x60
  i2c_del_driver+0x54/0x64
  tps6598x_i2c_driver_exit+0x18/0xc3c [tps6598x]
  __arm64_sys_delete_module+0x184/0x264
  invoke_syscall+0x48/0x110
  el0_svc_common.constprop.0+0xc8/0xe8
  do_el0_svc+0x20/0x2c
  el0_svc+0x28/0x98
  el0t_64_sync_handler+0x13c/0x158
  el0t_64_sync+0x190/0x194
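The rule "free the IRQ only if one was requested" can be illustrated with a small userspace model. Everything here is an assumption for illustration: struct tps_model and the model_* functions are hypothetical, and the return value of model_free_irq() merely stands in for the devres warning shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tps_model {
	int irq;            /* 0 selects polling mode, like client->irq */
	bool irq_requested;
	bool poll_active;
};

void model_probe(struct tps_model *t, int irq)
{
	assert(t != NULL);
	t->irq = irq;
	t->irq_requested = (irq != 0);
	t->poll_active = (irq == 0);
}

/* Returns 1 to model the warning for freeing an IRQ never requested. */
int model_free_irq(struct tps_model *t)
{
	if (!t->irq_requested)
		return 1;
	t->irq_requested = false;
	return 0;
}

/* Returns the number of warnings the teardown would have produced. */
int model_remove(struct tps_model *t)
{
	int warnings = 0;

	if (!t->irq)
		t->poll_active = false;        /* cancel the poll work */
	else
		warnings += model_free_irq(t); /* free only what was requested */

	return warnings;
}
```

In this model a polled device goes through model_remove() without ever reaching model_free_irq(), so no warning is counted.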
Signed-off-by: Wadim Egorov w.egorov@phytec.de Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240816124150.608125-1-w.egorov@phytec.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/tipd/core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c index 125269f39f83a..01db27cbf1d10 100644 --- a/drivers/usb/typec/tipd/core.c +++ b/drivers/usb/typec/tipd/core.c @@ -907,6 +907,8 @@ static void tps6598x_remove(struct i2c_client *client)
if (!client->irq) cancel_delayed_work_sync(&tps->wq_poll); + else + devm_free_irq(tps->dev, client->irq, tps);
tps6598x_disconnect(tps, 0); typec_unregister_port(tps->port);
From: Xu Yang xu.yang_2@nxp.com
[ Upstream commit e4fdcc10092fb244218013bfe8ff01c55d54e8e4 ]
Currently, the suspend interrupt is enabled before the pullup enable operation. This causes a suspend interrupt to assert right after pulling up DP. That suspend interrupt is meaningless, so avoid it by enabling the suspend interrupt only after the USB reset has completed.
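A rough userspace model of the ordering: the status bit latches regardless of the enable mask, so the stale suspend status is cleared before the suspend interrupt is unmasked. The MODEL_* bit positions, struct udc_model and the function names are assumptions made for this sketch, not the chipidea register definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions chosen for the model only. */
#define MODEL_UI  (1u << 0)
#define MODEL_UEI (1u << 1)
#define MODEL_PCI (1u << 2)
#define MODEL_URI (1u << 6)
#define MODEL_SLI (1u << 8)

struct udc_model {
	uint32_t usbintr; /* interrupt enable mask */
	uint32_t usbsts;  /* latched status bits */
};

/* Device-state enable: everything except the suspend interrupt. */
void model_enable(struct udc_model *u)
{
	assert(u != NULL);
	u->usbintr = MODEL_UI | MODEL_UEI | MODEL_PCI | MODEL_URI;
}

/* Hardware latches status whether or not the source is enabled. */
void model_hw_event(struct udc_model *u, uint32_t bits)
{
	u->usbsts |= bits;
}

/* Only enabled and latched sources reach the CPU. */
uint32_t model_pending(const struct udc_model *u)
{
	return u->usbsts & u->usbintr;
}

/* After bus reset: drop the stale suspend status, then unmask it. */
void model_after_reset(struct udc_model *u)
{
	u->usbsts &= ~MODEL_SLI; /* models the write-1-to-clear of SLI */
	u->usbintr |= MODEL_SLI;
}
```

A suspend event that arrives before model_after_reset() never becomes pending; one that arrives afterwards does.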
Signed-off-by: Xu Yang xu.yang_2@nxp.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20240823073832.1702135-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/chipidea/udc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c index 0b7bd3c643c3a..f70ceedfb468f 100644 --- a/drivers/usb/chipidea/udc.c +++ b/drivers/usb/chipidea/udc.c @@ -86,7 +86,7 @@ static int hw_device_state(struct ci_hdrc *ci, u32 dma) hw_write(ci, OP_ENDPTLISTADDR, ~0, dma); /* interrupt, error, port change, reset, sleep/suspend */ hw_write(ci, OP_USBINTR, ~0, - USBi_UI|USBi_UEI|USBi_PCI|USBi_URI|USBi_SLI); + USBi_UI|USBi_UEI|USBi_PCI|USBi_URI); } else { hw_write(ci, OP_USBINTR, ~0, 0); } @@ -876,6 +876,7 @@ __releases(ci->lock) __acquires(ci->lock) { int retval; + u32 intr;
spin_unlock(&ci->lock); if (ci->gadget.speed != USB_SPEED_UNKNOWN) @@ -889,6 +890,11 @@ __acquires(ci->lock) if (retval) goto done;
+ /* clear SLI */ + hw_write(ci, OP_USBSTS, USBi_SLI, USBi_SLI); + intr = hw_read(ci, OP_USBINTR, ~0); + hw_write(ci, OP_USBINTR, ~0, intr | USBi_SLI); + ci->status = usb_ep_alloc_request(&ci->ep0in->ep, GFP_ATOMIC); if (ci->status == NULL) retval = -ENOMEM;
From: Shawn Shao shawn.shao@jaguarmicro.com
[ Upstream commit 4058c39bd176daf11a826802d940d86292a6b02b ]
The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel, a GINTSTS_SOF interrupt was received while the DWC USB driver was loading, before the USB interrupt handler registration had completed. This triggered the misrouted_irq path within the GIC handling framework, so the interrupt was misrouted and handled by the wrong interrupt handler, resulting in the issue.
Summary: In a scenario where the kernel triggers a panic and enters the crash kernel, it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment, it will certainly not be handled correctly, especially in cases where this interrupt is reported frequently.
Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 
6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c
Signed-off-by: Shawn Shao shawn.shao@jaguarmicro.com Link: https://lore.kernel.org/r/20240830031709.134-1-shawn.shao@jaguarmicro.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc2/platform.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c index 7b84416dfc2b1..c1b7209b94836 100644 --- a/drivers/usb/dwc2/platform.c +++ b/drivers/usb/dwc2/platform.c @@ -469,18 +469,6 @@ static int dwc2_driver_probe(struct platform_device *dev)
spin_lock_init(&hsotg->lock);
- hsotg->irq = platform_get_irq(dev, 0); - if (hsotg->irq < 0) - return hsotg->irq; - - dev_dbg(hsotg->dev, "registering common handler for irq%d\n", - hsotg->irq); - retval = devm_request_irq(hsotg->dev, hsotg->irq, - dwc2_handle_common_intr, IRQF_SHARED, - dev_name(hsotg->dev), hsotg); - if (retval) - return retval; - hsotg->vbus_supply = devm_regulator_get_optional(hsotg->dev, "vbus"); if (IS_ERR(hsotg->vbus_supply)) { retval = PTR_ERR(hsotg->vbus_supply); @@ -524,6 +512,20 @@ static int dwc2_driver_probe(struct platform_device *dev) if (retval) goto error;
+ hsotg->irq = platform_get_irq(dev, 0); + if (hsotg->irq < 0) { + retval = hsotg->irq; + goto error; + } + + dev_dbg(hsotg->dev, "registering common handler for irq%d\n", + hsotg->irq); + retval = devm_request_irq(hsotg->dev, hsotg->irq, + dwc2_handle_common_intr, IRQF_SHARED, + dev_name(hsotg->dev), hsotg); + if (retval) + goto error; + /* * For OTG cores, set the force mode bits to reflect the value * of dr_mode. Force mode bits should not be touched at any
From: Ruffalo Lavoisier ruffalolavoisier@gmail.com
[ Upstream commit 5baeb157b341b1d26a5815aeaa4d3bb9e0444fda ]
After fopen(), check for NULL before using the file pointer.
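The same pattern as a standalone helper, for illustration. The function name write_route_values() and the placeholder output are assumptions; the tool itself does this work inline in main().

```c
#include <assert.h>
#include <stdio.h>

/* Returns 0 on success, -1 if the file could not be opened or closed. */
int write_route_values(const char *path)
{
	FILE *fp;

	assert(path != NULL);

	fp = fopen(path, "w");
	if (fp == NULL) {
		/* Bail out before any fprintf(fp, ...) can touch a NULL stream. */
		fprintf(stderr, "Could not open %s\n", path);
		return -1;
	}

	fprintf(fp, "ni_route_values = {\n}\n");

	if (fclose(fp) != 0)
		return -1;
	return 0;
}
```

Passing a path inside a directory that does not exist makes fopen() fail, and the helper returns -1 instead of writing through a NULL pointer.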
Signed-off-by: Ruffalo Lavoisier RuffaloLavoisier@gmail.com Link: https://lore.kernel.org/r/20240906203025.89588-1-RuffaloLavoisier@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/comedi/drivers/ni_routing/tools/convert_c_to_py.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/comedi/drivers/ni_routing/tools/convert_c_to_py.c b/drivers/comedi/drivers/ni_routing/tools/convert_c_to_py.c index d55521b5bdcb2..892a66b2cea66 100644 --- a/drivers/comedi/drivers/ni_routing/tools/convert_c_to_py.c +++ b/drivers/comedi/drivers/ni_routing/tools/convert_c_to_py.c @@ -140,6 +140,11 @@ int main(void) { FILE *fp = fopen("ni_values.py", "w");
+ if (fp == NULL) { + fprintf(stderr, "Could not open file!"); + return -1; + } + /* write route register values */ fprintf(fp, "ni_route_values = {\n"); for (int i = 0; ni_all_route_values[i]; ++i)
From: Wentao Guan guanwentao@uniontech.com
[ Upstream commit 5016c3a31a6d74eaf2fdfdec673eae8fcf90379e ]
Add kfree(root_ops) to avoid a memory leak: root_ops is leaked when pci_find_bus() returns an existing bus (pci_find_bus() != 0).
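A userspace sketch of the ownership rule, using a counting allocator to make the leak visible. The names (tracked_alloc, model_scan_root, struct scan_result) are hypothetical, and the assumption that the new-bus path hands both objects over to their new owner is part of the model rather than something this patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_allocs; /* allocations not yet freed */

static void *tracked_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

void tracked_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

int live_allocations(void)
{
	return live_allocs;
}

/* Objects whose ownership moved to the caller (new-bus path only). */
struct scan_result {
	void *info;
	void *root_ops;
};

/* Returns 0 on success, -1 on allocation failure. */
int model_scan_root(bool bus_exists, struct scan_result *out)
{
	void *info, *root_ops;

	assert(out != NULL);
	out->info = NULL;
	out->root_ops = NULL;

	info = tracked_alloc(64);
	root_ops = tracked_alloc(64);
	if (!info || !root_ops) {
		tracked_free(info);
		tracked_free(root_ops);
		return -1;
	}

	if (bus_exists) {
		/* Nothing keeps a reference here, so both must be freed. */
		tracked_free(info);
		tracked_free(root_ops);
	} else {
		out->info = info;
		out->root_ops = root_ops;
	}
	return 0;
}
```

Dropping the second tracked_free() in the bus_exists branch would leave live_allocations() at 1 after the call, which is the leak the patch closes.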
Signed-off-by: Yuli Wang wangyuli@uniontech.com Signed-off-by: Wentao Guan guanwentao@uniontech.com Signed-off-by: Huacai Chen chenhuacai@loongson.cn Signed-off-by: Sasha Levin sashal@kernel.org --- arch/loongarch/pci/acpi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/pci/acpi.c b/arch/loongarch/pci/acpi.c index 365f7de771cbb..1da4dc46df43e 100644 --- a/arch/loongarch/pci/acpi.c +++ b/arch/loongarch/pci/acpi.c @@ -225,6 +225,7 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root) if (bus) { memcpy(bus->sysdata, info->cfg, sizeof(struct pci_config_window)); kfree(info); + kfree(root_ops); } else { struct pci_bus *child;
From: Florian Westphal fw@strlen.de
[ Upstream commit d8f84a9bc7c4e07fdc4edc00f9e868b8db974ccb ]
A conntrack entry can be inserted to the connection tracking table if there is no existing entry with an identical tuple in either direction.
Example: INITIATOR -> NAT/PAT -> RESPONDER
Initiator passes through NAT/PAT ("us") and SNAT is done (saddr rewrite). Then, later, NAT/PAT machine itself also wants to connect to RESPONDER.
This will not work if the SNAT done earlier has same IP:PORT source pair.
Conntrack table has:
  ORIGINAL: $IP_INITIATOR:$SPORT -> $IP_RESPONDER:$DPORT
  REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT

and new locally originating connection wants:
  ORIGINAL: $IP_NAT:$SPORT -> $IP_RESPONDER:$DPORT
  REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT
This is handled by the NAT engine which will do a source port reallocation for the locally originating connection that is colliding with an existing tuple by attempting a source port rewrite.
This is done even if this new connection attempt did not go through a masquerade/snat rule.
There is a rare race condition with connection-less protocols like UDP, where we do the port reallocation even though it's not needed.
This happens when new packets from the same, pre-existing flow are received in both directions at the exact same time on different CPUs after the conntrack table was flushed (or conntrack becomes active for first time).
With strict ordering/single cpu, the first packet creates new ct entry and second packet is resolved as established reply packet.
With parallel processing, both packets are picked up as new and both get their own ct entry.
In this case, the 'reply' packet (picked up as ORIGINAL) can be mangled by NAT engine because a port collision is detected.
This change isn't enough to prevent a packet drop later during nf_conntrack_confirm(): the existing clash resolution strategy will not detect such a reverse clash case. This is resolved by a followup patch.
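The reverse-clash test at the heart of the change can be sketched in userspace. This is a simplified model: struct tuple_model and struct conn_model are hypothetical stand-ins for the conntrack tuple and entry, a single uses_nat flag replaces the status bit checks, and locking, zones and reference counting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tuple_model {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct conn_model {
	struct tuple_model orig;  /* models IP_CT_DIR_ORIGINAL */
	struct tuple_model reply; /* models IP_CT_DIR_REPLY */
	bool uses_nat;
};

static bool tuple_equal(const struct tuple_model *a,
			const struct tuple_model *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* Build an un-NATed entry from the first packet seen for a flow. */
struct conn_model conn_from_packet(uint32_t saddr, uint16_t sport,
				   uint32_t daddr, uint16_t dport)
{
	struct conn_model ct;

	ct.orig = (struct tuple_model){ saddr, daddr, sport, dport };
	ct.reply = (struct tuple_model){ daddr, saddr, dport, sport };
	ct.uses_nat = false;
	return ct;
}

/*
 * True when both entries describe the same flow seen from opposite
 * ends and neither is subject to NAT, i.e. the case where the tuple
 * can be treated as available and left to later clash resolution.
 */
bool is_reverse_clash(const struct conn_model *new_ct,
		      const struct conn_model *existing)
{
	assert(new_ct != NULL && existing != NULL);

	if (new_ct->uses_nat || existing->uses_nat)
		return false;

	return tuple_equal(&existing->orig, &new_ct->reply) &&
	       tuple_equal(&existing->reply, &new_ct->orig);
}
```

Two entries created from packets of one UDP flow travelling in opposite directions satisfy is_reverse_clash(); an unrelated flow that merely shares the destination does not.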
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_nat_core.c | 120 +++++++++++++++++++++++++++++++++++- 1 file changed, 118 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c index c4e0516a8dfab..ccca6e3848bcc 100644 --- a/net/netfilter/nf_nat_core.c +++ b/net/netfilter/nf_nat_core.c @@ -183,7 +183,35 @@ hash_by_src(const struct net *net, return reciprocal_scale(hash, nf_nat_htable_size); }
-/* Is this tuple already taken? (not by us) */ +/** + * nf_nat_used_tuple - check if proposed nat tuple clashes with existing entry + * @tuple: proposed NAT binding + * @ignored_conntrack: our (unconfirmed) conntrack entry + * + * A conntrack entry can be inserted to the connection tracking table + * if there is no existing entry with an identical tuple in either direction. + * + * Example: + * INITIATOR -> NAT/PAT -> RESPONDER + * + * INITIATOR passes through NAT/PAT ("us") and SNAT is done (saddr rewrite). + * Then, later, NAT/PAT itself also connects to RESPONDER. + * + * This will not work if the SNAT done earlier has same IP:PORT source pair. + * + * Conntrack table has: + * ORIGINAL: $IP_INITIATOR:$SPORT -> $IP_RESPONDER:$DPORT + * REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT + * + * and new locally originating connection wants: + * ORIGINAL: $IP_NAT:$SPORT -> $IP_RESPONDER:$DPORT + * REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT + * + * ... which would mean incoming packets cannot be distinguished between + * the existing and the newly added entry (identical IP_CT_DIR_REPLY tuple). + * + * @return: true if the proposed NAT mapping collides with an existing entry. + */ static int nf_nat_used_tuple(const struct nf_conntrack_tuple *tuple, const struct nf_conn *ignored_conntrack) @@ -200,6 +228,94 @@ nf_nat_used_tuple(const struct nf_conntrack_tuple *tuple, return nf_conntrack_tuple_taken(&reply, ignored_conntrack); }
+static bool nf_nat_allow_clash(const struct nf_conn *ct) +{ + return nf_ct_l4proto_find(nf_ct_protonum(ct))->allow_clash; +} + +/** + * nf_nat_used_tuple_new - check if to-be-inserted conntrack collides with existing entry + * @tuple: proposed NAT binding + * @ignored_ct: our (unconfirmed) conntrack entry + * + * Same as nf_nat_used_tuple, but also check for rare clash in reverse + * direction. Should be called only when @tuple has not been altered, i.e. + * @ignored_conntrack will not be subject to NAT. + * + * @return: true if the proposed NAT mapping collides with existing entry. + */ +static noinline bool +nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple, + const struct nf_conn *ignored_ct) +{ + static const unsigned long uses_nat = IPS_NAT_MASK | IPS_SEQ_ADJUST_BIT; + const struct nf_conntrack_tuple_hash *thash; + const struct nf_conntrack_zone *zone; + struct nf_conn *ct; + bool taken = true; + struct net *net; + + if (!nf_nat_used_tuple(tuple, ignored_ct)) + return false; + + if (!nf_nat_allow_clash(ignored_ct)) + return true; + + /* Initial choice clashes with existing conntrack. + * Check for (rare) reverse collision. + * + * This can happen when new packets are received in both directions + * at the exact same time on different CPUs. + * + * Without SMP, first packet creates new conntrack entry and second + * packet is resolved as established reply packet. + * + * With parallel processing, both packets could be picked up as + * new and both get their own ct entry allocated. + * + * If ignored_conntrack and colliding ct are not subject to NAT then + * pretend the tuple is available and let later clash resolution + * handle this at insertion time. + * + * Without it, the 'reply' packet has its source port rewritten + * by nat engine. 
+ */ + if (READ_ONCE(ignored_ct->status) & uses_nat) + return true; + + net = nf_ct_net(ignored_ct); + zone = nf_ct_zone(ignored_ct); + + thash = nf_conntrack_find_get(net, zone, tuple); + if (unlikely(!thash)) /* clashing entry went away */ + return false; + + ct = nf_ct_tuplehash_to_ctrack(thash); + + /* NB: IP_CT_DIR_ORIGINAL should be impossible because + * nf_nat_used_tuple() handles origin collisions. + * + * Handle remote chance other CPU confirmed its ct right after. + */ + if (thash->tuple.dst.dir != IP_CT_DIR_REPLY) + goto out; + + /* clashing connection subject to NAT? Retry with new tuple. */ + if (READ_ONCE(ct->status) & uses_nat) + goto out; + + if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, + &ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple) && + nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple, + &ignored_ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)) { + taken = false; + goto out; + } +out: + nf_ct_put(ct); + return taken; +} + static bool nf_nat_may_kill(struct nf_conn *ct, unsigned long flags) { static const unsigned long flags_refuse = IPS_FIXED_TIMEOUT | @@ -608,7 +724,7 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple, !(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL)) { /* try the original tuple first */ if (nf_in_range(orig_tuple, range)) { - if (!nf_nat_used_tuple(orig_tuple, ct)) { + if (!nf_nat_used_tuple_new(orig_tuple, ct)) { *tuple = *orig_tuple; return; }
From: Simon Horman horms@kernel.org
[ Upstream commit fc56878ca1c288e49b5cbb43860a5938e3463654 ]
If CONFIG_BRIDGE_NETFILTER is not enabled, which is the case for x86_64 defconfig, then building nf_reject_ipv4.c and nf_reject_ipv6.c with W=1 using gcc-14 results in the following warnings, which are treated as errors:
net/ipv4/netfilter/nf_reject_ipv4.c: In function 'nf_send_reset':
net/ipv4/netfilter/nf_reject_ipv4.c:243:23: error: variable 'niph' set but not used [-Werror=unused-but-set-variable]
  243 |         struct iphdr *niph;
      |                       ^~~~
cc1: all warnings being treated as errors
net/ipv6/netfilter/nf_reject_ipv6.c: In function 'nf_send_reset6':
net/ipv6/netfilter/nf_reject_ipv6.c:286:25: error: variable 'ip6h' set but not used [-Werror=unused-but-set-variable]
  286 |         struct ipv6hdr *ip6h;
      |                         ^~~~
cc1: all warnings being treated as errors
Address this by reducing the scope of these local variables to where they are used, which is code only compiled when CONFIG_BRIDGE_NETFILTER is enabled.
Compile tested and run through netfilter selftests.
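The scope reduction can be illustrated with a small userspace analogue. MODEL_BRIDGE_NETFILTER, struct pkt_model and model_send_reset() are hypothetical names for this sketch; the point is only that the header pointer is declared inside the block that uses it, so a build without the option has no set-but-unused variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hdr_model {
	uint16_t tot_len;
};

struct pkt_model {
	struct hdr_model hdr;
	uint16_t bridged_len; /* only written by the optional bridge path */
};

int model_send_reset(struct pkt_model *pkt, uint16_t len)
{
	assert(pkt != NULL);

	/* Fill the header; no pointer to it is kept at function scope. */
	pkt->hdr.tot_len = len;

#ifdef MODEL_BRIDGE_NETFILTER
	{
		/* Declared only where it is used. */
		const struct hdr_model *niph = &pkt->hdr;

		pkt->bridged_len = niph->tot_len;
	}
#endif
	return 0;
}
```

Built without -DMODEL_BRIDGE_NETFILTER, the function contains no niph at all, so -Wunused-but-set-variable has nothing to report.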
Reported-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Closes: https://lore.kernel.org/netfilter-devel/20240906145513.567781-1-andriy.shevc... Signed-off-by: Simon Horman horms@kernel.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/netfilter/nf_reject_ipv4.c | 10 ++++------ net/ipv6/netfilter/nf_reject_ipv6.c | 5 ++--- 2 files changed, 6 insertions(+), 9 deletions(-)
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c index fc761915c5f6f..675b5bbed638e 100644 --- a/net/ipv4/netfilter/nf_reject_ipv4.c +++ b/net/ipv4/netfilter/nf_reject_ipv4.c @@ -239,9 +239,8 @@ static int nf_reject_fill_skb_dst(struct sk_buff *skb_in) void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb, int hook) { - struct sk_buff *nskb; - struct iphdr *niph; const struct tcphdr *oth; + struct sk_buff *nskb; struct tcphdr _oth;
oth = nf_reject_ip_tcphdr_get(oldskb, &_oth, hook); @@ -266,14 +265,12 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb, nskb->mark = IP4_REPLY_MARK(net, oldskb->mark);
skb_reserve(nskb, LL_MAX_HEADER); - niph = nf_reject_iphdr_put(nskb, oldskb, IPPROTO_TCP, - ip4_dst_hoplimit(skb_dst(nskb))); + nf_reject_iphdr_put(nskb, oldskb, IPPROTO_TCP, + ip4_dst_hoplimit(skb_dst(nskb))); nf_reject_ip_tcphdr_put(nskb, oldskb, oth); if (ip_route_me_harder(net, sk, nskb, RTN_UNSPEC)) goto free_nskb;
- niph = ip_hdr(nskb); - /* "Never happens" */ if (nskb->len > dst_mtu(skb_dst(nskb))) goto free_nskb; @@ -290,6 +287,7 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb, */ if (nf_bridge_info_exists(oldskb)) { struct ethhdr *oeth = eth_hdr(oldskb); + struct iphdr *niph = ip_hdr(nskb); struct net_device *br_indev;
br_indev = nf_bridge_get_physindev(oldskb, net); diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c index 71d692728230e..c8f5196d752e6 100644 --- a/net/ipv6/netfilter/nf_reject_ipv6.c +++ b/net/ipv6/netfilter/nf_reject_ipv6.c @@ -283,7 +283,6 @@ void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb, const struct tcphdr *otcph; unsigned int otcplen, hh_len; const struct ipv6hdr *oip6h = ipv6_hdr(oldskb); - struct ipv6hdr *ip6h; struct dst_entry *dst = NULL; struct flowi6 fl6;
@@ -339,8 +338,7 @@ void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb, nskb->mark = fl6.flowi6_mark;
skb_reserve(nskb, hh_len + dst->header_len); - ip6h = nf_reject_ip6hdr_put(nskb, oldskb, IPPROTO_TCP, - ip6_dst_hoplimit(dst)); + nf_reject_ip6hdr_put(nskb, oldskb, IPPROTO_TCP, ip6_dst_hoplimit(dst)); nf_reject_ip6_tcphdr_put(nskb, oldskb, otcph, otcplen);
nf_ct_attach(nskb, oldskb); @@ -355,6 +353,7 @@ void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb, */ if (nf_bridge_info_exists(oldskb)) { struct ethhdr *oeth = eth_hdr(oldskb); + struct ipv6hdr *ip6h = ipv6_hdr(nskb); struct net_device *br_indev;
br_indev = nf_bridge_get_physindev(oldskb, net);
From: Philip Chen philipchen@chromium.org
[ Upstream commit e25fbcd97cf52c3c9824d44b5c56c19673c3dd50 ]
If a pmem device is in a bad status, the driver side could wait for host ack forever in virtio_pmem_flush(), causing the system to hang.
So add a status check in the beginning of virtio_pmem_flush() to return early if the device is not activated.
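A minimal userspace model of the early return. MODEL_NEEDS_RESET is defined locally with the value 0x40, assumed here to mirror VIRTIO_CONFIG_S_NEEDS_RESET; struct vdev_model and model_flush() are hypothetical and stand in for the device and virtio_pmem_flush().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror VIRTIO_CONFIG_S_NEEDS_RESET for this model. */
#define MODEL_NEEDS_RESET 0x40

struct vdev_model {
	uint8_t status; /* device status byte */
	int submitted;  /* flush requests handed to the device */
};

/* Returns 0 after queueing a flush, -EIO if the device needs a reset. */
int model_flush(struct vdev_model *vdev)
{
	assert(vdev != NULL);

	/* Check first: a broken device would never acknowledge the request. */
	if (vdev->status & MODEL_NEEDS_RESET)
		return -EIO;

	vdev->submitted++;
	return 0;
}
```

With the status bit set, nothing is submitted, so there is no completion for the caller to wait on.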
Signed-off-by: Philip Chen philipchen@chromium.org Message-Id: 20240826215313.2673566-1-philipchen@chromium.org Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Pankaj Gupta pankaj.gupta.linux@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvdimm/nd_virtio.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c index 1f8c667c6f1ee..839f10ca56eac 100644 --- a/drivers/nvdimm/nd_virtio.c +++ b/drivers/nvdimm/nd_virtio.c @@ -44,6 +44,15 @@ static int virtio_pmem_flush(struct nd_region *nd_region) unsigned long flags; int err, err1;
+ /* + * Don't bother to submit the request to the device if the device is + * not activated. + */ + if (vdev->config->get_status(vdev) & VIRTIO_CONFIG_S_NEEDS_RESET) { + dev_info(&vdev->dev, "virtio pmem device needs a reset\n"); + return -EIO; + } + might_sleep(); req_data = kmalloc(sizeof(*req_data), GFP_KERNEL); if (!req_data)
From: Zhu Jun zhujun2@cmss.chinamobile.com
[ Upstream commit 3c6b818b097dd6932859bcc3d6722a74ec5931c1 ]
Added a check to handle memory allocation failure for `trigger_name` and return `-ENOMEM`.
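The check-then-unwind pattern can be sketched with a failure-injecting allocator. All names here (budget_malloc, set_alloc_budget, model_setup_trigger, MODEL_NAME_LEN) are hypothetical, and the buffer length is an arbitrary model value rather than IIO_MAX_NAME_LENGTH.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MODEL_NAME_LEN 30 /* arbitrary size for the model */

/* Negative: never fail. Otherwise: number of allocations still allowed. */
static int alloc_budget = -1;

void set_alloc_budget(int n)
{
	alloc_budget = n;
}

static void *budget_malloc(size_t size)
{
	if (alloc_budget == 0)
		return NULL;
	if (alloc_budget > 0)
		alloc_budget--;
	return malloc(size);
}

/* Returns 0 with both buffers allocated, or -ENOMEM with neither. */
int model_setup_trigger(char **dev_name, char **trigger_name)
{
	int ret;

	assert(dev_name != NULL && trigger_name != NULL);
	*dev_name = NULL;
	*trigger_name = NULL;

	*dev_name = budget_malloc(MODEL_NAME_LEN);
	if (!*dev_name)
		return -ENOMEM;

	*trigger_name = budget_malloc(MODEL_NAME_LEN);
	if (!*trigger_name) {
		ret = -ENOMEM;
		goto error; /* unwind what was already allocated */
	}
	return 0;

error:
	free(*dev_name);
	*dev_name = NULL;
	return ret;
}
```

Setting the budget to 1 makes the second allocation fail, which exercises the same path the patch adds for trigger_name.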
Signed-off-by: Zhu Jun zhujun2@cmss.chinamobile.com Link: https://patch.msgid.link/20240828093129.3040-1-zhujun2@cmss.chinamobile.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/iio/iio_generic_buffer.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/iio/iio_generic_buffer.c b/tools/iio/iio_generic_buffer.c index 0d0a7a19d6f95..9ef5ee087eda3 100644 --- a/tools/iio/iio_generic_buffer.c +++ b/tools/iio/iio_generic_buffer.c @@ -498,6 +498,10 @@ int main(int argc, char **argv) return -ENOMEM; } trigger_name = malloc(IIO_MAX_NAME_LENGTH); + if (!trigger_name) { + ret = -ENOMEM; + goto error; + } ret = read_sysfs_string("name", trig_dev_name, trigger_name); free(trig_dev_name); if (ret < 0) {
From: Riyan Dhiman riyandhiman14@gmail.com
[ Upstream commit a8a8b54350229f59c8ba6496fb5689a1632a59be ]
The geoid is a module parameter that allows users to hardcode the slot number. Add a bounds check for geoid in the probe function, because only values from 0 up to (but not including) VME_MAX_SLOTS are valid.
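The bounds check itself is a half-open range test. In this sketch MODEL_MAX_SLOTS is a local constant assumed for the model, and model_check_geoid() is a hypothetical helper; the drivers perform the test inline.

```c
#include <assert.h>
#include <errno.h>

/* Model value; the drivers use VME_MAX_SLOTS from the VME headers. */
#define MODEL_MAX_SLOTS 32

/* Returns 0 for a valid slot number, -EINVAL otherwise. */
int model_check_geoid(int geoid)
{
	assert(MODEL_MAX_SLOTS > 0);

	/* Valid range is [0, MODEL_MAX_SLOTS): the upper bound is excluded. */
	if (geoid < 0 || geoid >= MODEL_MAX_SLOTS)
		return -EINVAL;
	return 0;
}
```

The boundary cases are the interesting ones: 0 and MODEL_MAX_SLOTS - 1 pass, while -1 and MODEL_MAX_SLOTS are rejected.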
Signed-off-by: Riyan Dhiman riyandhiman14@gmail.com Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/20240827125604.42771-2-riyandhiman14@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/vme_user/vme_fake.c | 6 ++++++ drivers/staging/vme_user/vme_tsi148.c | 6 ++++++ 2 files changed, 12 insertions(+)
diff --git a/drivers/staging/vme_user/vme_fake.c b/drivers/staging/vme_user/vme_fake.c index 7c53a8a7b79b8..95730d1270af8 100644 --- a/drivers/staging/vme_user/vme_fake.c +++ b/drivers/staging/vme_user/vme_fake.c @@ -1064,6 +1064,12 @@ static int __init fake_init(void) struct vme_slave_resource *slave_image; struct vme_lm_resource *lm;
+ if (geoid < 0 || geoid >= VME_MAX_SLOTS) { + pr_err("VME geographical address must be between 0 and %d (exclusive), but got %d\n", + VME_MAX_SLOTS, geoid); + return -EINVAL; + } + /* We need a fake parent device */ vme_root = root_device_register("vme"); if (IS_ERR(vme_root)) diff --git a/drivers/staging/vme_user/vme_tsi148.c b/drivers/staging/vme_user/vme_tsi148.c index 2f5eafd509340..4566e391d913f 100644 --- a/drivers/staging/vme_user/vme_tsi148.c +++ b/drivers/staging/vme_user/vme_tsi148.c @@ -2252,6 +2252,12 @@ static int tsi148_probe(struct pci_dev *pdev, const struct pci_device_id *id) struct vme_dma_resource *dma_ctrlr; struct vme_lm_resource *lm;
+ if (geoid < 0 || geoid >= VME_MAX_SLOTS) { + dev_err(&pdev->dev, "VME geographical address must be between 0 and %d (exclusive), but got %d\n", + VME_MAX_SLOTS, geoid); + return -EINVAL; + } + /* If we want to support more than one of each bridge, we need to * dynamically generate this so we get one per device */
From: Zijun Hu quic_zijuhu@quicinc.com
[ Upstream commit bfa54a793ba77ef696755b66f3ac4ed00c7d1248 ]
For bus_register(), any error that happens after kset_register() causes @priv to be freed twice. Fix this by setting @priv to NULL after the first free.
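A userspace sketch of why resetting the pointer is enough. The names are hypothetical, and model_kset_unregister() simply frees the object to stand in for the release that kset_unregister() triggers. Only the fixed path is exercised, since the unfixed one would be a real double free.

```c
#include <assert.h>
#include <stdlib.h>

struct priv_model {
	int placeholder;
};

static int frees_of_priv; /* counts non-NULL frees */

static void counted_free(struct priv_model *p)
{
	if (p)
		frees_of_priv++;
	free(p);
}

/* Models kset_unregister(): dropping the last reference frees the object. */
static void model_kset_unregister(struct priv_model *p)
{
	assert(p != NULL);
	counted_free(p);
}

/*
 * Runs the error path taken after a successful kset_register().
 * Returns how many times the object was freed, or -1 if it could
 * not be allocated in the first place.
 */
int model_register_failure(void)
{
	struct priv_model *priv = malloc(sizeof(*priv));
	int before = frees_of_priv;

	if (!priv)
		return -1;

	model_kset_unregister(priv);
	priv = NULL;        /* the fix: the object is already gone */

	counted_free(priv); /* shared exit label; free(NULL) is a no-op */
	return frees_of_priv - before;
}
```

Because free(NULL) does nothing, the shared exit label stays safe for both the early failures, where priv is still owned locally, and the late ones.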
Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20240727-bus_register_fix-v1-1-fed8dd0dba7a@quicin... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/bus.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/base/bus.c b/drivers/base/bus.c index d7c4330786cae..e7761e0ef5a55 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -920,6 +920,8 @@ int bus_register(const struct bus_type *bus) bus_remove_file(bus, &bus_attr_uevent); bus_uevent_fail: kset_unregister(&priv->subsys); + /* Above kset_unregister() will kfree @priv */ + priv = NULL; out: kfree(priv); return retval;
From: Zijun Hu quic_zijuhu@quicinc.com
[ Upstream commit c0fd973c108cdc22a384854bc4b3e288a9717bb2 ]
Return -EIO instead of 0 for the following erroneous bus attribute operations:
 - read a bus attribute without show().
 - write a bus attribute without store().
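The default-to-error dispatch can be modelled in a few lines of userspace C. struct attr_model, model_attr_show() and show_hello() are hypothetical, and long is used in place of ssize_t to keep the sketch portable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef long (*show_fn)(char *buf);

struct attr_model {
	show_fn show; /* may be NULL for a write-only attribute */
};

/* Returns the callback's result, or -EIO when there is no callback. */
long model_attr_show(const struct attr_model *attr, char *buf)
{
	long ret = -EIO; /* error by default instead of a silent 0 */

	assert(attr != NULL);

	if (attr->show)
		ret = attr->show(buf);
	return ret;
}

/* Example callback: writes a fixed string and reports its length. */
long show_hello(char *buf)
{
	strcpy(buf, "hello\n");
	return 6;
}
```

A reader of an attribute without a show() callback now sees an I/O error rather than an empty read that looks like success.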
Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/20240724-bus_fix-v2-1-5adbafc698fb@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/bus.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/base/bus.c b/drivers/base/bus.c index e7761e0ef5a55..d4361ad3b433f 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -152,7 +152,8 @@ static ssize_t bus_attr_show(struct kobject *kobj, struct attribute *attr, { struct bus_attribute *bus_attr = to_bus_attr(attr); struct subsys_private *subsys_priv = to_subsys_private(kobj); - ssize_t ret = 0; + /* return -EIO for reading a bus attribute without show() */ + ssize_t ret = -EIO;
if (bus_attr->show) ret = bus_attr->show(subsys_priv->bus, buf); @@ -164,7 +165,8 @@ static ssize_t bus_attr_store(struct kobject *kobj, struct attribute *attr, { struct bus_attribute *bus_attr = to_bus_attr(attr); struct subsys_private *subsys_priv = to_subsys_private(kobj); - ssize_t ret = 0; + /* return -EIO for writing a bus attribute without store() */ + ssize_t ret = -EIO;
if (bus_attr->store) ret = bus_attr->store(subsys_priv->bus, buf, count);
From: Justin Tee justin.tee@broadcom.com
[ Upstream commit 93bcc5f3984bf4f51da1529700aec351872dbfff ]
During HBA stress testing, a spam of received PLOGIs exposes a resource recovery bug causing leakage of lpfc_sglq entries from the global phba->sli4_hba.lpfc_els_sgl_list.
The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover outstanding ELS sgls when walking the txcmplq. Only CMD_ELS_REQUEST64_CRs and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists. A check for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of the phba->sli4_hba.lpfc_els_sgl_list too.
Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when adding WQEs to the abort and cancel list in lpfc_els_flush_cmd(). Also, update naming convention from CRs to WQEs.
Signed-off-by: Justin Tee justin.tee@broadcom.com Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_els.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index 44d3ada9fbbcb..739508e0a45f3 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -9644,11 +9644,12 @@ lpfc_els_flush_cmd(struct lpfc_vport *vport) if (piocb->cmd_flag & LPFC_DRIVER_ABORTED && !mbx_tmo_err) continue;
- /* On the ELS ring we can have ELS_REQUESTs or - * GEN_REQUESTs waiting for a response. + /* On the ELS ring we can have ELS_REQUESTs, ELS_RSPs, + * or GEN_REQUESTs waiting for a CQE response. */ ulp_command = get_job_cmnd(phba, piocb); - if (ulp_command == CMD_ELS_REQUEST64_CR) { + if (ulp_command == CMD_ELS_REQUEST64_WQE || + ulp_command == CMD_XMIT_ELS_RSP64_WQE) { list_add_tail(&piocb->dlist, &abort_list);
/* If the link is down when flushing ELS commands
From: Justin Tee justin.tee@broadcom.com
[ Upstream commit 0a3c84f71680684c1d41abb92db05f95c09111e8 ]
Deleting an NPIV instance requires all fabric ndlps to be released before an NPIV's resources can be torn down. Failure to release fabric ndlps beforehand opens kref imbalance race conditions. Fix by forcing the DA_ID to complete synchronously with usage of wait_queue.
Signed-off-by: Justin Tee justin.tee@broadcom.com
Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/scsi/lpfc/lpfc_ct.c    | 12 ++++++++++
 drivers/scsi/lpfc/lpfc_disc.h  |  7 ++++++
 drivers/scsi/lpfc/lpfc_vport.c | 43 ++++++++++++++++++++++++++++------
 3 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index baae1f8279e0c..1775115239860 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -1671,6 +1671,18 @@ lpfc_cmpl_ct(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	}

 out:
+	/* If the caller wanted a synchronous DA_ID completion, signal the
+	 * wait obj and clear flag to reset the vport.
+	 */
+	if (ndlp->save_flags & NLP_WAIT_FOR_DA_ID) {
+		if (ndlp->da_id_waitq)
+			wake_up(ndlp->da_id_waitq);
+	}
+
+	spin_lock_irq(&ndlp->lock);
+	ndlp->save_flags &= ~NLP_WAIT_FOR_DA_ID;
+	spin_unlock_irq(&ndlp->lock);
+
 	lpfc_ct_free_iocb(phba, cmdiocb);
 	lpfc_nlp_put(ndlp);
 	return;
diff --git a/drivers/scsi/lpfc/lpfc_disc.h b/drivers/scsi/lpfc/lpfc_disc.h
index f82615d87c4bb..f5ae8cc158205 100644
--- a/drivers/scsi/lpfc/lpfc_disc.h
+++ b/drivers/scsi/lpfc/lpfc_disc.h
@@ -90,6 +90,8 @@ enum lpfc_nlp_save_flags {
 	NLP_IN_RECOV_POST_DEV_LOSS = 0x1,
 	/* wait for outstanding LOGO to cmpl */
 	NLP_WAIT_FOR_LOGO = 0x2,
+	/* wait for outstanding DA_ID to finish */
+	NLP_WAIT_FOR_DA_ID = 0x4
 };

 struct lpfc_nodelist {
@@ -159,7 +161,12 @@ struct lpfc_nodelist {
 	uint32_t nvme_fb_size; /* NVME target's supported byte cnt */
 #define NVME_FB_BIT_SHIFT 9 /* PRLI Rsp first burst in 512B units. */
 	uint32_t nlp_defer_did;
+
+	/* These wait objects are NPIV specific. These IOs must complete
+	 * synchronously.
+	 */
 	wait_queue_head_t *logo_waitq;
+	wait_queue_head_t *da_id_waitq;
 };

 struct lpfc_node_rrq {
diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
index 9e0e9e02d2c47..256ee797adb30 100644
--- a/drivers/scsi/lpfc/lpfc_vport.c
+++ b/drivers/scsi/lpfc/lpfc_vport.c
@@ -633,6 +633,7 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
 	struct Scsi_Host *shost = lpfc_shost_from_vport(vport);
 	struct lpfc_hba *phba = vport->phba;
 	int rc;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(waitq);

 	if (vport->port_type == LPFC_PHYSICAL_PORT) {
 		lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT,
@@ -688,21 +689,49 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
 	if (!ndlp)
 		goto skip_logo;

+	/* Send the DA_ID and Fabric LOGO to cleanup the NPIV fabric entries. */
 	if (ndlp && ndlp->nlp_state == NLP_STE_UNMAPPED_NODE &&
 	    phba->link_state >= LPFC_LINK_UP &&
 	    phba->fc_topology != LPFC_TOPOLOGY_LOOP) {
 		if (vport->cfg_enable_da_id) {
-			/* Send DA_ID and wait for a completion. */
+			/* Send DA_ID and wait for a completion. This is best
+			 * effort. If the DA_ID fails, likely the fabric will
+			 * "leak" NportIDs but at least the driver issued the
+			 * command.
+			 */
+			ndlp = lpfc_findnode_did(vport, NameServer_DID);
+			if (!ndlp)
+				goto issue_logo;
+
+			spin_lock_irq(&ndlp->lock);
+			ndlp->da_id_waitq = &waitq;
+			ndlp->save_flags |= NLP_WAIT_FOR_DA_ID;
+			spin_unlock_irq(&ndlp->lock);
+
 			rc = lpfc_ns_cmd(vport, SLI_CTNS_DA_ID, 0, 0);
-			if (rc) {
-				lpfc_printf_log(vport->phba, KERN_WARNING,
-						LOG_VPORT,
-						"1829 CT command failed to "
-						"delete objects on fabric, "
-						"rc %d\n", rc);
+			if (!rc) {
+				wait_event_timeout(waitq,
+				   !(ndlp->save_flags & NLP_WAIT_FOR_DA_ID),
+				   msecs_to_jiffies(phba->fc_ratov * 2000));
 			}
+
+			lpfc_printf_vlog(vport, KERN_INFO, LOG_VPORT | LOG_ELS,
+					 "1829 DA_ID issue status %d. "
+					 "SFlag x%x NState x%x, NFlag x%x "
+					 "Rpi x%x\n",
+					 rc, ndlp->save_flags, ndlp->nlp_state,
+					 ndlp->nlp_flag, ndlp->nlp_rpi);
+
+			/* Remove the waitq and save_flags. It no
+			 * longer matters if the wake happened.
+			 */
+			spin_lock_irq(&ndlp->lock);
+			ndlp->da_id_waitq = NULL;
+			ndlp->save_flags &= ~NLP_WAIT_FOR_DA_ID;
+			spin_unlock_irq(&ndlp->lock);
 		}

+issue_logo:
 	/*
 	 * If the vpi is not registered, then a valid FDISC doesn't
 	 * exist and there is no need for a ELS LOGO. Just cleanup
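
The flag-and-timeout handshake that the patch sets up between lpfc_vport_delete() and lpfc_cmpl_ct() can be followed in a small, single-threaded simulation. This is a sketch under assumptions, not driver code: `struct node`, `completes_at_tick`, `complete_da_id()` and `wait_for_da_id()` are hypothetical, a loop counter stands in for jiffies, and there is no real wait queue or locking. The return convention imitates wait_event_timeout(): 0 if the condition was still false at timeout, otherwise the remaining time (at least 1).

```c
#include <assert.h>
#include <stddef.h>

#define WAIT_FOR_DA_ID 0x4u /* same bit value as NLP_WAIT_FOR_DA_ID */

/* Hypothetical reduction of the node to the fields the handshake needs. */
struct node {
	unsigned int save_flags;
	int completes_at_tick; /* tick at which the reply arrives, < 0 for never */
};

/* Completion side: clear the flag so the waiter's condition becomes true. */
void complete_da_id(struct node *n)
{
	assert(n != NULL);
	n->save_flags &= ~WAIT_FOR_DA_ID;
}

/* Waiter side: set the flag, then poll the condition once per tick.
 * Returns 0 on timeout, otherwise the number of ticks that were left.
 */
int wait_for_da_id(struct node *n, int timeout_ticks)
{
	assert(n != NULL);
	n->save_flags |= WAIT_FOR_DA_ID;

	for (int tick = 0; tick < timeout_ticks; tick++) {
		if (tick == n->completes_at_tick)
			complete_da_id(n);
		if (!(n->save_flags & WAIT_FOR_DA_ID))
			return timeout_ticks - tick;
	}

	/* Best effort: clear the flag even if the reply never came,
	 * as the patch does after the wait returns.
	 */
	n->save_flags &= ~WAIT_FOR_DA_ID;
	return 0;
}
```

Either way the flag ends up cleared, which matches the comment in the patch that it no longer matters whether the wake happened once the wait is over.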
From: Alex Hung alex.hung@amd.com
[ Upstream commit ff599ef6970ee000fa5bc38d02fa5ff5f3fc7575 ]
[WHAT & HOW] se is null checked previously in the same function, indicating it might be null; therefore, it must be checked when used again.
This fixes 1 FORWARD_NULL issue reported by Coverity.
Acked-by: Alex Hung alex.hung@amd.com
Reviewed-by: Rodrigo Siqueira rodrigo.siqueira@amd.com
Signed-off-by: Alex Hung alex.hung@amd.com
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 50e643bfdfbad..cefa1756e1223 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1691,7 +1691,7 @@ bool dc_validate_boot_timing(const struct dc *dc,
 		if (crtc_timing->pix_clk_100hz != pix_clk_100hz)
 			return false;

-		if (!se->funcs->dp_get_pixel_format)
+		if (!se || !se->funcs->dp_get_pixel_format)
 			return false;

 		if (!se->funcs->dp_get_pixel_format(
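
The patched condition relies on the left-to-right, short-circuit evaluation of `||`: when `se` is NULL the right-hand operand is never evaluated, so `se->funcs` is not dereferenced. A minimal sketch with simplified, hypothetical types (`struct encoder_funcs`, `can_read_pixel_format()` and the example callback are not the display-core definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical versions of the display-core types. */
struct encoder_funcs {
	bool (*dp_get_pixel_format)(int *format);
};

struct stream_encoder {
	const struct encoder_funcs *funcs;
};

/* Example callback used only for this sketch. */
bool example_get_pixel_format(int *format)
{
	assert(format != NULL);
	*format = 0;
	return true;
}

/* Mirrors the patched guard: se is tested first, so the right-hand
 * operand is not evaluated when se is NULL.
 */
bool can_read_pixel_format(const struct stream_encoder *se)
{
	if (!se || !se->funcs->dp_get_pixel_format)
		return false;
	return true;
}
```

Like the patch, the sketch assumes `se->funcs` itself is valid whenever `se` is non-NULL.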
From: Qianqiang Liu qianqiang.liu@163.com
[ Upstream commit 5b97eebcce1b4f3f07a71f635d6aa3af96c236e7 ]
syzbot has found a NULL pointer dereference bug in fbcon. Here is the simplified C reproducer:
    struct param {
            uint8_t type;
            struct tiocl_selection ts;
    };

    int main()
    {
            struct fb_con2fbmap con2fb;
            struct param param;

            int fd = open("/dev/fb1", 0, 0);

            con2fb.console = 0x19;
            con2fb.framebuffer = 0;
            ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb);

            param.type = 2;
            param.ts.xs = 0; param.ts.ys = 0;
            param.ts.xe = 0; param.ts.ye = 0;
            param.ts.sel_mode = 0;

            int fd1 = open("/dev/tty1", O_RDWR, 0);
            ioctl(fd1, TIOCLINUX, &param);

            con2fb.console = 1;
            con2fb.framebuffer = 0;
            ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb);

            return 0;
    }
After calling ioctl(fd1, TIOCLINUX, &param), the subsequent ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb) causes the kernel to follow a different execution path:
set_con2fb_map -> con2fb_init_display -> fbcon_set_disp -> redraw_screen -> hide_cursor -> clear_selection -> highlight -> invert_screen -> do_update_region -> fbcon_putcs -> ops->putcs
Since ops->putcs is a NULL pointer, this leads to a kernel panic. To prevent this, we need to call set_blitting_type() within set_con2fb_map() to properly initialize ops->putcs.
Reported-by: syzbot+3d613ae53c031502687a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3d613ae53c031502687a
Tested-by: syzbot+3d613ae53c031502687a@syzkaller.appspotmail.com
Signed-off-by: Qianqiang Liu qianqiang.liu@163.com
Signed-off-by: Helge Deller deller@gmx.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/video/fbdev/core/fbcon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 24035b4f2cd70..405d587450ef8 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -847,6 +847,8 @@ static int set_con2fb_map(int unit, int newidx, int user)
 			return err;

 		fbcon_add_cursor_work(info);
+	} else if (vc) {
+		set_blitting_type(vc, info);
 	}

 	con2fb_map[unit] = newidx;
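
The effect of the added branch can be modeled in a few lines. This is a hedged sketch, not fbcon code: `struct con_ops`, `init_blitting()`, `remap_console()` and `redraw_is_safe()` are hypothetical, the names `fb_already_mapped` and `have_vc` are assumed labels for the two conditions visible in the diff, and the sketch assumes the first branch already installs the callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of the fbcon ops table to one callback. */
struct con_ops {
	int (*putcs)(const char *s);
};

/* Example drawing callback used only for this sketch. */
int example_putcs(const char *s)
{
	assert(s != NULL);
	return 0;
}

/* Stand-in for set_blitting_type(): installs the drawing callbacks. */
void init_blitting(struct con_ops *ops)
{
	assert(ops != NULL);
	ops->putcs = example_putcs;
}

/* Stand-in for the remap step. with_fix == false models the code before
 * the patch, where the else-if branch did not exist.
 */
void remap_console(struct con_ops *ops, bool fb_already_mapped,
		   bool have_vc, bool with_fix)
{
	assert(ops != NULL);
	if (!fb_already_mapped)
		init_blitting(ops); /* assumed: the acquire path sets up ops */
	else if (have_vc && with_fix)
		init_blitting(ops); /* the branch the patch adds */
}

/* A redraw may call ops->putcs only if it has been installed. */
bool redraw_is_safe(const struct con_ops *ops)
{
	assert(ops != NULL);
	return ops->putcs != NULL;
}
```

In the model, remapping onto an already mapped framebuffer leaves `putcs` NULL without the fix and initialized with it, which is the condition the redraw path in the call chain depends on.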
From: Enzo Matsumiya ematsumiya@suse.de
[ Upstream commit b0abcd65ec545701b8793e12bc27dc98042b151a ]
Doing an async decryption (large read) crashes with a slab-use-after-free way down in the crypto API.
Reproducer:
    # mount.cifs -o ...,seal,esize=1 //srv/share /mnt
    # dd if=/mnt/largefile of=/dev/null
    ...
    [ 194.196391] ==================================================================
    [ 194.196844] BUG: KASAN: slab-use-after-free in gf128mul_4k_lle+0xc1/0x110
    [ 194.197269] Read of size 8 at addr ffff888112bd0448 by task kworker/u77:2/899
    [ 194.197707]
    [ 194.197818] CPU: 12 UID: 0 PID: 899 Comm: kworker/u77:2 Not tainted 6.11.0-lku-00028-gfca3ca14a17a-dirty #43
    [ 194.198400] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
    [ 194.199046] Workqueue: smb3decryptd smb2_decrypt_offload [cifs]
    [ 194.200032] Call Trace:
    [ 194.200191] <TASK>
    [ 194.200327] dump_stack_lvl+0x4e/0x70
    [ 194.200558] ? gf128mul_4k_lle+0xc1/0x110
    [ 194.200809] print_report+0x174/0x505
    [ 194.201040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
    [ 194.201352] ? srso_return_thunk+0x5/0x5f
    [ 194.201604] ? __virt_addr_valid+0xdf/0x1c0
    [ 194.201868] ? gf128mul_4k_lle+0xc1/0x110
    [ 194.202128] kasan_report+0xc8/0x150
    [ 194.202361] ? gf128mul_4k_lle+0xc1/0x110
    [ 194.202616] gf128mul_4k_lle+0xc1/0x110
    [ 194.202863] ghash_update+0x184/0x210
    [ 194.203103] shash_ahash_update+0x184/0x2a0
    [ 194.203377] ? __pfx_shash_ahash_update+0x10/0x10
    [ 194.203651] ? srso_return_thunk+0x5/0x5f
    [ 194.203877] ? crypto_gcm_init_common+0x1ba/0x340
    [ 194.204142] gcm_hash_assoc_remain_continue+0x10a/0x140
    [ 194.204434] crypt_message+0xec1/0x10a0 [cifs]
    [ 194.206489] ? __pfx_crypt_message+0x10/0x10 [cifs]
    [ 194.208507] ? srso_return_thunk+0x5/0x5f
    [ 194.209205] ? srso_return_thunk+0x5/0x5f
    [ 194.209925] ? srso_return_thunk+0x5/0x5f
    [ 194.210443] ? srso_return_thunk+0x5/0x5f
    [ 194.211037] decrypt_raw_data+0x15f/0x250 [cifs]
    [ 194.212906] ? __pfx_decrypt_raw_data+0x10/0x10 [cifs]
    [ 194.214670] ? srso_return_thunk+0x5/0x5f
    [ 194.215193] smb2_decrypt_offload+0x12a/0x6c0 [cifs]
This is because TFM is being used in parallel.
Fix this by allocating a new AEAD TFM for async decryption, but keep the existing one for synchronous READ cases (similar to what is done in smb3_calc_signature()).
Also remove the calls to aead_request_set_callback() and crypto_wait_req() since it's always going to be a synchronous operation.
Signed-off-by: Enzo Matsumiya ematsumiya@suse.de
Signed-off-by: Steve French stfrench@microsoft.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/smb/client/smb2ops.c | 47 ++++++++++++++++++++++++-----------------
 fs/smb/client/smb2pdu.c |  6 ++++++
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
index acd5d7d793525..a484bbe9e9de8 100644
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -4245,7 +4245,7 @@ smb2_get_enc_key(struct TCP_Server_Info *server, __u64 ses_id, int enc, u8 *key)
  */
 static int
 crypt_message(struct TCP_Server_Info *server, int num_rqst,
-	      struct smb_rqst *rqst, int enc)
+	      struct smb_rqst *rqst, int enc, struct crypto_aead *tfm)
 {
 	struct smb2_transform_hdr *tr_hdr =
 		(struct smb2_transform_hdr *)rqst[0].rq_iov[0].iov_base;
@@ -4256,8 +4256,6 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
 	u8 key[SMB3_ENC_DEC_KEY_SIZE];
 	struct aead_request *req;
 	u8 *iv;
-	DECLARE_CRYPTO_WAIT(wait);
-	struct crypto_aead *tfm;
 	unsigned int crypt_len = le32_to_cpu(tr_hdr->OriginalMessageSize);
 	void *creq;
 	size_t sensitive_size;
@@ -4269,14 +4267,6 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
 		return rc;
 	}

-	rc = smb3_crypto_aead_allocate(server);
-	if (rc) {
-		cifs_server_dbg(VFS, "%s: crypto alloc failed\n", __func__);
-		return rc;
-	}
-
-	tfm = enc ? server->secmech.enc : server->secmech.dec;
-
 	if ((server->cipher_type == SMB2_ENCRYPTION_AES256_CCM) ||
 		(server->cipher_type == SMB2_ENCRYPTION_AES256_GCM))
 		rc = crypto_aead_setkey(tfm, key, SMB3_GCM256_CRYPTKEY_SIZE);
@@ -4316,11 +4306,7 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst,
 	aead_request_set_crypt(req, sg, sg, crypt_len, iv);
 	aead_request_set_ad(req, assoc_data_len);

-	aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
-				  crypto_req_done, &wait);
-
-	rc = crypto_wait_req(enc ? crypto_aead_encrypt(req)
-				: crypto_aead_decrypt(req), &wait);
+	rc = enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req);

 	if (!rc && enc)
 		memcpy(&tr_hdr->Signature, sign, SMB2_SIGNATURE_SIZE);
@@ -4427,7 +4413,7 @@ smb3_init_transform_rq(struct TCP_Server_Info *server, int num_rqst,
 	/* fill the 1st iov with a transform header */
 	fill_transform_hdr(tr_hdr, orig_len, old_rq, server->cipher_type);

-	rc = crypt_message(server, num_rqst, new_rq, 1);
+	rc = crypt_message(server, num_rqst, new_rq, 1, server->secmech.enc);
 	cifs_dbg(FYI, "Encrypt message returned %d\n", rc);
 	if (rc)
 		goto err_free;
@@ -4452,8 +4438,9 @@ decrypt_raw_data(struct TCP_Server_Info *server, char *buf,
 		 unsigned int buf_data_size, struct iov_iter *iter,
 		 bool is_offloaded)
 {
-	struct kvec iov[2];
+	struct crypto_aead *tfm;
 	struct smb_rqst rqst = {NULL};
+	struct kvec iov[2];
 	size_t iter_size = 0;
 	int rc;

@@ -4470,9 +4457,31 @@ decrypt_raw_data(struct TCP_Server_Info *server, char *buf,
 		iter_size = iov_iter_count(iter);
 	}

-	rc = crypt_message(server, 1, &rqst, 0);
+	if (is_offloaded) {
+		if ((server->cipher_type == SMB2_ENCRYPTION_AES128_GCM) ||
+		    (server->cipher_type == SMB2_ENCRYPTION_AES256_GCM))
+			tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+		else
+			tfm = crypto_alloc_aead("ccm(aes)", 0, 0);
+		if (IS_ERR(tfm)) {
+			rc = PTR_ERR(tfm);
+			cifs_server_dbg(VFS, "%s: Failed alloc decrypt TFM, rc=%d\n", __func__, rc);
+
+			return rc;
+		}
+	} else {
+		if (unlikely(!server->secmech.dec))
+			return -EIO;
+
+		tfm = server->secmech.dec;
+	}
+
+	rc = crypt_message(server, 1, &rqst, 0, tfm);
 	cifs_dbg(FYI, "Decrypt message returned %d\n", rc);

+	if (is_offloaded)
+		crypto_free_aead(tfm);
+
 	if (rc)
 		return rc;

diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index bf45b8652e580..83a03201bb862 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -1263,6 +1263,12 @@ SMB2_negotiate(const unsigned int xid,
 		else
 			cifs_server_dbg(VFS, "Missing expected negotiate contexts\n");
 	}
+
+	if (server->cipher_type && !rc) {
+		rc = smb3_crypto_aead_allocate(server);
+		if (rc)
+			cifs_server_dbg(VFS, "%s: crypto alloc failed, rc=%d\n", __func__, rc);
+	}
 neg_exit:
 	free_rsp_buf(resp_buftype, rsp);
 	return rc;
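
The ownership rule the patch introduces in decrypt_raw_data() is that offloaded work gets a private transform and frees it afterwards, while inline work borrows the shared one. It can be sketched in userspace as follows. All names here are hypothetical (`struct cipher_ctx`, `struct server`, `acquire_dec_ctx()`, `release_dec_ctx()`); calloc() and free() stand in for crypto_alloc_aead() and crypto_free_aead(), and a NULL return stands in for the error codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cipher handle that carries per-use state. */
struct cipher_ctx {
	unsigned char scratch[16];
};

struct server {
	struct cipher_ctx *shared_dec; /* models server->secmech.dec */
};

/* Pick the context for one decryption. Offloaded (parallel) work gets a
 * private context; inline work reuses the shared one. Returns NULL on
 * allocation failure or when the shared context is missing.
 */
struct cipher_ctx *acquire_dec_ctx(const struct server *srv, bool is_offloaded)
{
	assert(srv != NULL);
	if (is_offloaded)
		return calloc(1, sizeof(struct cipher_ctx));
	return srv->shared_dec;
}

/* Release only what acquire_dec_ctx() allocated; the shared context
 * stays owned by the server.
 */
void release_dec_ctx(struct cipher_ctx *ctx, bool is_offloaded)
{
	if (is_offloaded)
		free(ctx);
}
```

Passing the same `is_offloaded` value to both helpers keeps allocation and release paired, which is the role the two `if (is_offloaded)` blocks play in the patch.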
From: Andrey Shumilin shum.sdl@nppct.ru
[ Upstream commit 9cf14f5a2746c19455ce9cb44341b5527b5e19c3 ]
The values of xres and yres are written into strbuf. These values are parsed from strbuf1, which holds the digits of the mode name with every non-digit character replaced by a space. If both values are large, sprintf(strbuf, "%ux%ux8", xres, yres); can write more than 16 bytes to strbuf. Increase the size of the strbuf array to 24.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Andrey Shumilin shum.sdl@nppct.ru
Signed-off-by: Helge Deller deller@gmx.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/video/fbdev/sis/sis_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/sis/sis_main.c b/drivers/video/fbdev/sis/sis_main.c
index 6d524a65af181..ad39571f91349 100644
--- a/drivers/video/fbdev/sis/sis_main.c
+++ b/drivers/video/fbdev/sis/sis_main.c
@@ -184,7 +184,7 @@ static void sisfb_search_mode(char *name, bool quiet)
 {
 	unsigned int j = 0, xres = 0, yres = 0, depth = 0, rate = 0;
 	int i = 0;
-	char strbuf[16], strbuf1[20];
+	char strbuf[24], strbuf1[20];
 	char *nameptr = name;

 	/* We don't know the hardware specs yet and there is no ivideo */
linux-stable-mirror@lists.linaro.org