This is the start of the stable review cycle for the 4.14.242 release. There are 38 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 04 Aug 2021 13:43:24 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.242-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.242-rc1
Arnaldo Carvalho de Melo acme@redhat.com Revert "perf map: Fix dso->nsinfo refcounting"
Dan Carpenter dan.carpenter@oracle.com can: hi311x: fix a signedness bug in hi3110_cmd()
Wang Hai wanghai38@huawei.com sis900: Fix missing pci_disable_device() in probe and remove
Wang Hai wanghai38@huawei.com tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
Marcelo Ricardo Leitner marcelo.leitner@gmail.com sctp: fix return value check in __sctp_rcv_asconf_lookup
Maor Gottlieb maorg@nvidia.com net/mlx5: Fix flow table chaining
Pavel Skripkin paskripkin@gmail.com net: llc: fix skb_over_panic
Jiapeng Chong jiapeng.chong@linux.alibaba.com mlx4: Fix missing error code in mlx4_load_one()
Hoang Le hoang.h.le@dektech.com.au tipc: fix sleeping in tipc accept routine
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_nat: allow to specify layer 4 protocol NAT only
Florian Westphal fw@strlen.de netfilter: conntrack: adjust stop timestamp to real expiry value
Nguyen Dinh Phi phind.uet@gmail.com cfg80211: Fix possible memory leak in function cfg80211_bss_update
Jan Kiszka jan.kiszka@siemens.com x86/asm: Ensure asm/proto.h can be included stand-alone
Krzysztof Kozlowski krzysztof.kozlowski@canonical.com nfc: nfcsim: fix use after free during module unload
Paul Jakma paul@jakma.org NIU: fix incorrect error return, missed in previous revert
Pavel Skripkin paskripkin@gmail.com can: esd_usb2: fix memory leak
Pavel Skripkin paskripkin@gmail.com can: ems_usb: fix memory leak
Pavel Skripkin paskripkin@gmail.com can: usb_8dev: fix memory leak
Pavel Skripkin paskripkin@gmail.com can: mcba_usb_start(): add missing urb->transfer_dma initialization
Ziyang Xuan william.xuanziyang@huawei.com can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
Junxiao Bi junxiao.bi@oracle.com ocfs2: issue zeroout to EOF blocks
Junxiao Bi junxiao.bi@oracle.com ocfs2: fix zero out valid data
Juergen Gross jgross@suse.com x86/kvm: fix vcpu-id indexed array sizes
Eric Dumazet edumazet@google.com gro: ensure frag0 meets IP header alignment
Eric Dumazet edumazet@google.com virtio_net: Do not pull payload in skb->head
Sudeep Holla sudeep.holla@arm.com ARM: dts: versatile: Fix up interrupt controller node names
Desmond Cheong Zhi Xi desmondcheongzx@gmail.com hfs: add lock nesting notation to hfs_find_init
Desmond Cheong Zhi Xi desmondcheongzx@gmail.com hfs: fix high memory mapping in hfs_bnode_read
Desmond Cheong Zhi Xi desmondcheongzx@gmail.com hfs: add missing clean-up in hfs_fill_super
Xin Long lucien.xin@gmail.com sctp: move 198 addresses from unusable to private scope
Eric Dumazet edumazet@google.com net: annotate data race around sk_ll_usec
Yang Yingliang yangyingliang@huawei.com net/802/garp: fix memleak in garp_request_join()
Yang Yingliang yangyingliang@huawei.com net/802/mrp: fix memleak in mrp_request_join()
Yang Yingliang yangyingliang@huawei.com workqueue: fix UAF in pwq_unbound_release_workfn()
Miklos Szeredi mszeredi@redhat.com af_unix: fix garbage collect vs MSG_PEEK
Jens Axboe axboe@kernel.dk net: split out functions related to registering inflight socket files
Maxim Levitsky mlevitsk@redhat.com KVM: x86: determine if an exception has an error code only when injecting it.
Greg Kroah-Hartman gregkh@linuxfoundation.org selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/versatile-ab.dts | 5 +- arch/arm/boot/dts/versatile-pb.dts | 2 +- arch/x86/include/asm/proto.h | 2 + arch/x86/kvm/ioapic.c | 2 +- arch/x86/kvm/ioapic.h | 4 +- arch/x86/kvm/x86.c | 13 +- drivers/net/can/spi/hi311x.c | 2 +- drivers/net/can/usb/ems_usb.c | 14 +- drivers/net/can/usb/esd_usb2.c | 16 ++- drivers/net/can/usb/mcba_usb.c | 2 + drivers/net/can/usb/usb_8dev.c | 15 ++- drivers/net/ethernet/dec/tulip/winbond-840.c | 7 +- drivers/net/ethernet/mellanox/mlx4/main.c | 1 + drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 10 +- drivers/net/ethernet/sis/sis900.c | 7 +- drivers/net/ethernet/sun/niu.c | 3 +- drivers/net/virtio_net.c | 10 +- drivers/nfc/nfcsim.c | 3 +- fs/hfs/bfind.c | 14 +- fs/hfs/bnode.c | 25 +++- fs/hfs/btree.h | 7 + fs/hfs/super.c | 10 +- fs/ocfs2/file.c | 103 +++++++++------ include/linux/skbuff.h | 9 ++ include/linux/virtio_net.h | 14 +- include/net/af_unix.h | 1 + include/net/busy_poll.h | 2 +- include/net/llc_pdu.h | 31 +++-- include/net/sctp/constants.h | 4 +- kernel/workqueue.c | 20 ++- net/802/garp.c | 14 ++ net/802/mrp.c | 14 ++ net/Makefile | 2 +- net/can/raw.c | 20 ++- net/core/dev.c | 3 +- net/core/sock.c | 2 +- net/llc/af_llc.c | 10 +- net/llc/llc_s_ac.c | 2 +- net/netfilter/nf_conntrack_core.c | 7 +- net/netfilter/nft_nat.c | 4 +- net/sctp/input.c | 2 +- net/sctp/protocol.c | 3 +- net/tipc/socket.c | 9 +- net/unix/Kconfig | 5 + net/unix/Makefile | 2 + net/unix/af_unix.c | 102 +++++++-------- net/unix/garbage.c | 68 +--------- net/unix/scm.c | 149 ++++++++++++++++++++++ net/unix/scm.h | 10 ++ net/wireless/scan.c | 6 +- tools/perf/util/map.c | 2 - tools/testing/selftests/vm/userfaultfd.c | 2 +- 53 files changed, 540 insertions(+), 260 deletions(-)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
When backporting 0db282ba2c12 ("selftest: use mmap instead of posix_memalign to allocate memory") to this stable branch, I forgot a { breaking the build.
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/vm/userfaultfd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -131,7 +131,7 @@ static void anon_allocate_area(void **al { *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); - if (*alloc_area == MAP_FAILED) + if (*alloc_area == MAP_FAILED) { fprintf(stderr, "mmap of anonymous memory failed"); *alloc_area = NULL; }
From: Maxim Levitsky mlevitsk@redhat.com
commit b97f074583736c42fb36f2da1164e28c73758912 upstream.
A page fault can be queued while vCPU is in real paged mode on AMD, and AMD manual asks the user to always intercept it (otherwise result is undefined). The resulting VM exit, does have an error code.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Message-Id: 20210225154135.405125-2-mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Zubin Mithra zsm@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/x86.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -400,8 +400,6 @@ static void kvm_multiple_exception(struc
if (!vcpu->arch.exception.pending && !vcpu->arch.exception.injected) { queue: - if (has_error && !is_protmode(vcpu)) - has_error = false; if (reinject) { /* * On vmentry, vcpu->arch.exception.pending is only @@ -6624,13 +6622,20 @@ static void update_cr8_intercept(struct kvm_x86_ops->update_cr8_intercept(vcpu, tpr, max_irr); }
+static void kvm_inject_exception(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.exception.error_code && !is_protmode(vcpu)) + vcpu->arch.exception.error_code = false; + kvm_x86_ops->queue_exception(vcpu); +} + static int inject_pending_event(struct kvm_vcpu *vcpu) { int r;
/* try to reinject previous events if any */ if (vcpu->arch.exception.injected) { - kvm_x86_ops->queue_exception(vcpu); + kvm_inject_exception(vcpu); return 0; }
@@ -6675,7 +6680,7 @@ static int inject_pending_event(struct k kvm_update_dr7(vcpu); }
- kvm_x86_ops->queue_exception(vcpu); + kvm_inject_exception(vcpu); } else if (vcpu->arch.smi_pending && !is_smm(vcpu)) { vcpu->arch.smi_pending = false; enter_smm(vcpu);
From: Jens Axboe axboe@kernel.dk
commit f4e65870e5cede5ca1ec0006b6c9803994e5f7b8 upstream.
We need this functionality for the io_uring file registration, but we cannot rely on it since CONFIG_UNIX can be modular. Move the helpers to a separate file, that's always builtin to the kernel if CONFIG_UNIX is m/y.
No functional changes in this patch, just moving code around.
Reviewed-by: Hannes Reinecke hare@suse.com Acked-by: David S. Miller davem@davemloft.net Signed-off-by: Jens Axboe axboe@kernel.dk [ backported to older kernels to get access to unix_gc_lock - gregkh ] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/af_unix.h | 1 net/Makefile | 2 net/unix/Kconfig | 5 + net/unix/Makefile | 2 net/unix/af_unix.c | 63 --------------------- net/unix/garbage.c | 68 ---------------------- net/unix/scm.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++ net/unix/scm.h | 10 +++ 8 files changed, 172 insertions(+), 128 deletions(-) create mode 100644 net/unix/scm.c create mode 100644 net/unix/scm.h
--- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -10,6 +10,7 @@
void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); +void unix_destruct_scm(struct sk_buff *skb); void unix_gc(void); void wait_for_unix_gc(void); struct sock *unix_get_socket(struct file *filp); --- a/net/Makefile +++ b/net/Makefile @@ -18,7 +18,7 @@ obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_INET) += ipv4/ obj-$(CONFIG_TLS) += tls/ obj-$(CONFIG_XFRM) += xfrm/ -obj-$(CONFIG_UNIX) += unix/ +obj-$(CONFIG_UNIX_SCM) += unix/ obj-$(CONFIG_NET) += ipv6/ obj-$(CONFIG_PACKET) += packet/ obj-$(CONFIG_NET_KEY) += key/ --- a/net/unix/Kconfig +++ b/net/unix/Kconfig @@ -19,6 +19,11 @@ config UNIX
Say Y unless you know what you are doing.
+config UNIX_SCM + bool + depends on UNIX + default y + config UNIX_DIAG tristate "UNIX: socket monitoring interface" depends on UNIX --- a/net/unix/Makefile +++ b/net/unix/Makefile @@ -10,3 +10,5 @@ unix-$(CONFIG_SYSCTL) += sysctl_net_unix
obj-$(CONFIG_UNIX_DIAG) += unix_diag.o unix_diag-y := diag.o + +obj-$(CONFIG_UNIX_SCM) += scm.o --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -119,6 +119,8 @@ #include <linux/freezer.h> #include <linux/file.h>
+#include "scm.h" + struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE]; EXPORT_SYMBOL_GPL(unix_socket_table); DEFINE_SPINLOCK(unix_table_lock); @@ -1519,67 +1521,6 @@ out: return err; }
-static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) -{ - int i; - - scm->fp = UNIXCB(skb).fp; - UNIXCB(skb).fp = NULL; - - for (i = scm->fp->count-1; i >= 0; i--) - unix_notinflight(scm->fp->user, scm->fp->fp[i]); -} - -static void unix_destruct_scm(struct sk_buff *skb) -{ - struct scm_cookie scm; - memset(&scm, 0, sizeof(scm)); - scm.pid = UNIXCB(skb).pid; - if (UNIXCB(skb).fp) - unix_detach_fds(&scm, skb); - - /* Alas, it calls VFS */ - /* So fscking what? fput() had been SMP-safe since the last Summer */ - scm_destroy(&scm); - sock_wfree(skb); -} - -/* - * The "user->unix_inflight" variable is protected by the garbage - * collection lock, and we just read it locklessly here. If you go - * over the limit, there might be a tiny race in actually noticing - * it across threads. Tough. - */ -static inline bool too_many_unix_fds(struct task_struct *p) -{ - struct user_struct *user = current_user(); - - if (unlikely(user->unix_inflight > task_rlimit(p, RLIMIT_NOFILE))) - return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN); - return false; -} - -static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) -{ - int i; - - if (too_many_unix_fds(current)) - return -ETOOMANYREFS; - - /* - * Need to duplicate file references for the sake of garbage - * collection. Otherwise a socket in the fps might become a - * candidate for GC while the skb is not yet queued. - */ - UNIXCB(skb).fp = scm_fp_dup(scm->fp); - if (!UNIXCB(skb).fp) - return -ENOMEM; - - for (i = scm->fp->count - 1; i >= 0; i--) - unix_inflight(scm->fp->user, scm->fp->fp[i]); - return 0; -} - static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds) { int err = 0; --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -86,77 +86,13 @@ #include <net/scm.h> #include <net/tcp_states.h>
+#include "scm.h" + /* Internal data structures and random procedures: */
-static LIST_HEAD(gc_inflight_list); static LIST_HEAD(gc_candidates); -static DEFINE_SPINLOCK(unix_gc_lock); static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
-unsigned int unix_tot_inflight; - -struct sock *unix_get_socket(struct file *filp) -{ - struct sock *u_sock = NULL; - struct inode *inode = file_inode(filp); - - /* Socket ? */ - if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) { - struct socket *sock = SOCKET_I(inode); - struct sock *s = sock->sk; - - /* PF_UNIX ? */ - if (s && sock->ops && sock->ops->family == PF_UNIX) - u_sock = s; - } - return u_sock; -} - -/* Keep the number of times in flight count for the file - * descriptor if it is for an AF_UNIX socket. - */ - -void unix_inflight(struct user_struct *user, struct file *fp) -{ - struct sock *s = unix_get_socket(fp); - - spin_lock(&unix_gc_lock); - - if (s) { - struct unix_sock *u = unix_sk(s); - - if (atomic_long_inc_return(&u->inflight) == 1) { - BUG_ON(!list_empty(&u->link)); - list_add_tail(&u->link, &gc_inflight_list); - } else { - BUG_ON(list_empty(&u->link)); - } - unix_tot_inflight++; - } - user->unix_inflight++; - spin_unlock(&unix_gc_lock); -} - -void unix_notinflight(struct user_struct *user, struct file *fp) -{ - struct sock *s = unix_get_socket(fp); - - spin_lock(&unix_gc_lock); - - if (s) { - struct unix_sock *u = unix_sk(s); - - BUG_ON(!atomic_long_read(&u->inflight)); - BUG_ON(list_empty(&u->link)); - - if (atomic_long_dec_and_test(&u->inflight)) - list_del_init(&u->link); - unix_tot_inflight--; - } - user->unix_inflight--; - spin_unlock(&unix_gc_lock); -} - static void scan_inflight(struct sock *x, void (*func)(struct unix_sock *), struct sk_buff_head *hitlist) { --- /dev/null +++ b/net/unix/scm.c @@ -0,0 +1,149 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/socket.h> +#include <linux/net.h> +#include <linux/fs.h> +#include <net/af_unix.h> +#include <net/scm.h> +#include <linux/init.h> +#include <linux/sched/signal.h> + +#include "scm.h" + +unsigned int unix_tot_inflight; +EXPORT_SYMBOL(unix_tot_inflight); + +LIST_HEAD(gc_inflight_list); +EXPORT_SYMBOL(gc_inflight_list); + +DEFINE_SPINLOCK(unix_gc_lock); +EXPORT_SYMBOL(unix_gc_lock); + +struct sock *unix_get_socket(struct file *filp) +{ + struct sock *u_sock = NULL; + struct inode *inode = file_inode(filp); + + /* Socket ? */ + if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) { + struct socket *sock = SOCKET_I(inode); + struct sock *s = sock->sk; + + /* PF_UNIX ? */ + if (s && sock->ops && sock->ops->family == PF_UNIX) + u_sock = s; + } + return u_sock; +} +EXPORT_SYMBOL(unix_get_socket); + +/* Keep the number of times in flight count for the file + * descriptor if it is for an AF_UNIX socket. + */ +void unix_inflight(struct user_struct *user, struct file *fp) +{ + struct sock *s = unix_get_socket(fp); + + spin_lock(&unix_gc_lock); + + if (s) { + struct unix_sock *u = unix_sk(s); + + if (atomic_long_inc_return(&u->inflight) == 1) { + BUG_ON(!list_empty(&u->link)); + list_add_tail(&u->link, &gc_inflight_list); + } else { + BUG_ON(list_empty(&u->link)); + } + unix_tot_inflight++; + } + user->unix_inflight++; + spin_unlock(&unix_gc_lock); +} + +void unix_notinflight(struct user_struct *user, struct file *fp) +{ + struct sock *s = unix_get_socket(fp); + + spin_lock(&unix_gc_lock); + + if (s) { + struct unix_sock *u = unix_sk(s); + + BUG_ON(!atomic_long_read(&u->inflight)); + BUG_ON(list_empty(&u->link)); + + if (atomic_long_dec_and_test(&u->inflight)) + list_del_init(&u->link); + unix_tot_inflight--; + } + user->unix_inflight--; + spin_unlock(&unix_gc_lock); +} + +/* + * The "user->unix_inflight" variable is protected by the garbage + * collection lock, and we just read it locklessly here. If you go + * over the limit, there might be a tiny race in actually noticing + * it across threads. Tough. + */ +static inline bool too_many_unix_fds(struct task_struct *p) +{ + struct user_struct *user = current_user(); + + if (unlikely(user->unix_inflight > task_rlimit(p, RLIMIT_NOFILE))) + return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN); + return false; +} + +int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) +{ + int i; + + if (too_many_unix_fds(current)) + return -ETOOMANYREFS; + + /* + * Need to duplicate file references for the sake of garbage + * collection. Otherwise a socket in the fps might become a + * candidate for GC while the skb is not yet queued. + */ + UNIXCB(skb).fp = scm_fp_dup(scm->fp); + if (!UNIXCB(skb).fp) + return -ENOMEM; + + for (i = scm->fp->count - 1; i >= 0; i--) + unix_inflight(scm->fp->user, scm->fp->fp[i]); + return 0; +} +EXPORT_SYMBOL(unix_attach_fds); + +void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) +{ + int i; + + scm->fp = UNIXCB(skb).fp; + UNIXCB(skb).fp = NULL; + + for (i = scm->fp->count-1; i >= 0; i--) + unix_notinflight(scm->fp->user, scm->fp->fp[i]); +} +EXPORT_SYMBOL(unix_detach_fds); + +void unix_destruct_scm(struct sk_buff *skb) +{ + struct scm_cookie scm; + + memset(&scm, 0, sizeof(scm)); + scm.pid = UNIXCB(skb).pid; + if (UNIXCB(skb).fp) + unix_detach_fds(&scm, skb); + + /* Alas, it calls VFS */ + /* So fscking what? fput() had been SMP-safe since the last Summer */ + scm_destroy(&scm); + sock_wfree(skb); +} +EXPORT_SYMBOL(unix_destruct_scm); --- /dev/null +++ b/net/unix/scm.h @@ -0,0 +1,10 @@ +#ifndef NET_UNIX_SCM_H +#define NET_UNIX_SCM_H + +extern struct list_head gc_inflight_list; +extern spinlock_t unix_gc_lock; + +int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb); +void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb); + +#endif
From: Miklos Szeredi mszeredi@redhat.com
commit cbcf01128d0a92e131bd09f1688fe032480b65ca upstream.
unix_gc() assumes that candidate sockets can never gain an external reference (i.e. be installed into an fd) while the unix_gc_lock is held. Except for MSG_PEEK this is guaranteed by modifying inflight count under the unix_gc_lock.
MSG_PEEK does not touch any variable protected by unix_gc_lock (file count is not), yet it needs to be serialized with garbage collection. Do this by locking/unlocking unix_gc_lock:
1) increment file count
2) lock/unlock barrier to make sure incremented file count is visible to garbage collection
3) install file into fd
This is a lock barrier (unlike smp_mb()) that ensures that garbage collection is run completely before or completely after the barrier.
Cc: stable@vger.kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/unix/af_unix.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 49 insertions(+), 2 deletions(-)
--- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1521,6 +1521,53 @@ out: return err; }
+static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb) +{ + scm->fp = scm_fp_dup(UNIXCB(skb).fp); + + /* + * Garbage collection of unix sockets starts by selecting a set of + * candidate sockets which have reference only from being in flight + * (total_refs == inflight_refs). This condition is checked once during + * the candidate collection phase, and candidates are marked as such, so + * that non-candidates can later be ignored. While inflight_refs is + * protected by unix_gc_lock, total_refs (file count) is not, hence this + * is an instantaneous decision. + * + * Once a candidate, however, the socket must not be reinstalled into a + * file descriptor while the garbage collection is in progress. + * + * If the above conditions are met, then the directed graph of + * candidates (*) does not change while unix_gc_lock is held. + * + * Any operations that changes the file count through file descriptors + * (dup, close, sendmsg) does not change the graph since candidates are + * not installed in fds. + * + * Dequeing a candidate via recvmsg would install it into an fd, but + * that takes unix_gc_lock to decrement the inflight count, so it's + * serialized with garbage collection. + * + * MSG_PEEK is special in that it does not change the inflight count, + * yet does install the socket into an fd. The following lock/unlock + * pair is to ensure serialization with garbage collection. It must be + * done between incrementing the file count and installing the file into + * an fd. + * + * If garbage collection starts after the barrier provided by the + * lock/unlock, then it will see the elevated refcount and not mark this + * as a candidate. If a garbage collection is already in progress + * before the file count was incremented, then the lock/unlock pair will + * ensure that garbage collection is finished before progressing to + * installing the fd. + * + * (*) A -> B where B is on the queue of A or B is on the queue of C + * which is on the queue of listening socket A. + */ + spin_lock(&unix_gc_lock); + spin_unlock(&unix_gc_lock); +} + static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds) { int err = 0; @@ -2146,7 +2193,7 @@ static int unix_dgram_recvmsg(struct soc sk_peek_offset_fwd(sk, size);
if (UNIXCB(skb).fp) - scm.fp = scm_fp_dup(UNIXCB(skb).fp); + unix_peek_fds(&scm, skb); } err = (flags & MSG_TRUNC) ? skb->len - skip : size;
@@ -2387,7 +2434,7 @@ unlock: /* It is questionable, see note in unix_dgram_recvmsg. */ if (UNIXCB(skb).fp) - scm.fp = scm_fp_dup(UNIXCB(skb).fp); + unix_peek_fds(&scm, skb);
sk_peek_offset_fwd(sk, chunk);
From: Yang Yingliang yangyingliang@huawei.com
commit b42b0bddcbc87b4c66f6497f66fc72d52b712aa7 upstream.
I got a UAF report when doing fuzz test:
[ 152.880091][ T8030] ================================================================== [ 152.881240][ T8030] BUG: KASAN: use-after-free in pwq_unbound_release_workfn+0x50/0x190 [ 152.882442][ T8030] Read of size 4 at addr ffff88810d31bd00 by task kworker/3:2/8030 [ 152.883578][ T8030] [ 152.883932][ T8030] CPU: 3 PID: 8030 Comm: kworker/3:2 Not tainted 5.13.0+ #249 [ 152.885014][ T8030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 152.886442][ T8030] Workqueue: events pwq_unbound_release_workfn [ 152.887358][ T8030] Call Trace: [ 152.887837][ T8030] dump_stack_lvl+0x75/0x9b [ 152.888525][ T8030] ? pwq_unbound_release_workfn+0x50/0x190 [ 152.889371][ T8030] print_address_description.constprop.10+0x48/0x70 [ 152.890326][ T8030] ? pwq_unbound_release_workfn+0x50/0x190 [ 152.891163][ T8030] ? pwq_unbound_release_workfn+0x50/0x190 [ 152.891999][ T8030] kasan_report.cold.15+0x82/0xdb [ 152.892740][ T8030] ? pwq_unbound_release_workfn+0x50/0x190 [ 152.893594][ T8030] __asan_load4+0x69/0x90 [ 152.894243][ T8030] pwq_unbound_release_workfn+0x50/0x190 [ 152.895057][ T8030] process_one_work+0x47b/0x890 [ 152.895778][ T8030] worker_thread+0x5c/0x790 [ 152.896439][ T8030] ? process_one_work+0x890/0x890 [ 152.897163][ T8030] kthread+0x223/0x250 [ 152.897747][ T8030] ? set_kthread_struct+0xb0/0xb0 [ 152.898471][ T8030] ret_from_fork+0x1f/0x30 [ 152.899114][ T8030] [ 152.899446][ T8030] Allocated by task 8884: [ 152.900084][ T8030] kasan_save_stack+0x21/0x50 [ 152.900769][ T8030] __kasan_kmalloc+0x88/0xb0 [ 152.901416][ T8030] __kmalloc+0x29c/0x460 [ 152.902014][ T8030] alloc_workqueue+0x111/0x8e0 [ 152.902690][ T8030] __btrfs_alloc_workqueue+0x11e/0x2a0 [ 152.903459][ T8030] btrfs_alloc_workqueue+0x6d/0x1d0 [ 152.904198][ T8030] scrub_workers_get+0x1e8/0x490 [ 152.904929][ T8030] btrfs_scrub_dev+0x1b9/0x9c0 [ 152.905599][ T8030] btrfs_ioctl+0x122c/0x4e50 [ 152.906247][ T8030] __x64_sys_ioctl+0x137/0x190 [ 152.906916][ T8030] do_syscall_64+0x34/0xb0 [ 152.907535][ T8030] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 152.908365][ T8030] [ 152.908688][ T8030] Freed by task 8884: [ 152.909243][ T8030] kasan_save_stack+0x21/0x50 [ 152.909893][ T8030] kasan_set_track+0x20/0x30 [ 152.910541][ T8030] kasan_set_free_info+0x24/0x40 [ 152.911265][ T8030] __kasan_slab_free+0xf7/0x140 [ 152.911964][ T8030] kfree+0x9e/0x3d0 [ 152.912501][ T8030] alloc_workqueue+0x7d7/0x8e0 [ 152.913182][ T8030] __btrfs_alloc_workqueue+0x11e/0x2a0 [ 152.913949][ T8030] btrfs_alloc_workqueue+0x6d/0x1d0 [ 152.914703][ T8030] scrub_workers_get+0x1e8/0x490 [ 152.915402][ T8030] btrfs_scrub_dev+0x1b9/0x9c0 [ 152.916077][ T8030] btrfs_ioctl+0x122c/0x4e50 [ 152.916729][ T8030] __x64_sys_ioctl+0x137/0x190 [ 152.917414][ T8030] do_syscall_64+0x34/0xb0 [ 152.918034][ T8030] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 152.918872][ T8030] [ 152.919203][ T8030] The buggy address belongs to the object at ffff88810d31bc00 [ 152.919203][ T8030] which belongs to the cache kmalloc-512 of size 512 [ 152.921155][ T8030] The buggy address is located 256 bytes inside of [ 152.921155][ T8030] 512-byte region [ffff88810d31bc00, ffff88810d31be00) [ 152.922993][ T8030] The buggy address belongs to the page: [ 152.923800][ T8030] page:ffffea000434c600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10d318 [ 152.925249][ T8030] head:ffffea000434c600 order:2 compound_mapcount:0 compound_pincount:0 [ 152.926399][ T8030] flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff) [ 152.927515][ T8030] raw: 057ff00000010200 dead000000000100 dead000000000122 ffff888009c42c80 [ 152.928716][ T8030] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 [ 152.929890][ T8030] page dumped because: kasan: bad access detected [ 152.930759][ T8030] [ 152.931076][ T8030] Memory state around the buggy address: [ 152.931851][ T8030] ffff88810d31bc00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 152.932967][ T8030] ffff88810d31bc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 152.934068][ T8030] >ffff88810d31bd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 152.935189][ T8030] ^ [ 152.935763][ T8030] ffff88810d31bd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 152.936847][ T8030] ffff88810d31be00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 152.937940][ T8030] ==================================================================
If apply_wqattrs_prepare() fails in alloc_workqueue(), it will call put_pwq() which invoke a work queue to call pwq_unbound_release_workfn() and use the 'wq'. The 'wq' allocated in alloc_workqueue() will be freed in error path when apply_wqattrs_prepare() fails. So it will lead a UAF.
CPU0 CPU1 alloc_workqueue() alloc_and_link_pwqs() apply_wqattrs_prepare() fails apply_wqattrs_cleanup() schedule_work(&pwq->unbound_release_work) kfree(wq) worker_thread() pwq_unbound_release_workfn() <- trigger uaf here
If apply_wqattrs_prepare() fails, the new pwq are not linked, it doesn't hold any reference to the 'wq', 'wq' is invalid to access in the worker, so add check pwq if linked to fix this.
Fixes: 2d5f0764b526 ("workqueue: split apply_workqueue_attrs() into 3 stages") Cc: stable@vger.kernel.org # v4.2+ Reported-by: Hulk Robot hulkci@huawei.com Suggested-by: Lai Jiangshan jiangshanlai@gmail.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Lai Jiangshan jiangshanlai@gmail.com Tested-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/workqueue.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
--- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -3441,15 +3441,21 @@ static void pwq_unbound_release_workfn(s unbound_release_work); struct workqueue_struct *wq = pwq->wq; struct worker_pool *pool = pwq->pool; - bool is_last; + bool is_last = false;
- if (WARN_ON_ONCE(!(wq->flags & WQ_UNBOUND))) - return; + /* + * when @pwq is not linked, it doesn't hold any reference to the + * @wq, and @wq is invalid to access. + */ + if (!list_empty(&pwq->pwqs_node)) { + if (WARN_ON_ONCE(!(wq->flags & WQ_UNBOUND))) + return;
- mutex_lock(&wq->mutex); - list_del_rcu(&pwq->pwqs_node); - is_last = list_empty(&wq->pwqs); - mutex_unlock(&wq->mutex); + mutex_lock(&wq->mutex); + list_del_rcu(&pwq->pwqs_node); + is_last = list_empty(&wq->pwqs); + mutex_unlock(&wq->mutex); + }
mutex_lock(&wq_pool_mutex); put_unbound_pool(pool);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 996af62167d0e0ec69b938a3561e96f84ffff1aa ]
I got kmemleak report when doing fuzz test:
BUG: memory leak unreferenced object 0xffff88810c239500 (size 64): comm "syz-executor940", pid 882, jiffies 4294712870 (age 14.631s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 01 00 00 00 01 02 00 04 ................ backtrace: [<00000000a323afa4>] slab_alloc_node mm/slub.c:2972 [inline] [<00000000a323afa4>] slab_alloc mm/slub.c:2980 [inline] [<00000000a323afa4>] __kmalloc+0x167/0x340 mm/slub.c:4130 [<000000005034ca11>] kmalloc include/linux/slab.h:595 [inline] [<000000005034ca11>] mrp_attr_create net/802/mrp.c:276 [inline] [<000000005034ca11>] mrp_request_join+0x265/0x550 net/802/mrp.c:530 [<00000000fcfd81f3>] vlan_mvrp_request_join+0x145/0x170 net/8021q/vlan_mvrp.c:40 [<000000009258546e>] vlan_dev_open+0x477/0x890 net/8021q/vlan_dev.c:292 [<0000000059acd82b>] __dev_open+0x281/0x410 net/core/dev.c:1609 [<000000004e6dc695>] __dev_change_flags+0x424/0x560 net/core/dev.c:8767 [<00000000471a09af>] rtnl_configure_link+0xd9/0x210 net/core/rtnetlink.c:3122 [<0000000037a4672b>] __rtnl_newlink+0xe08/0x13e0 net/core/rtnetlink.c:3448 [<000000008d5d0fda>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488 [<000000004882fe39>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5552 [<00000000907e6c54>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504 [<00000000e7d7a8c4>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<00000000e7d7a8c4>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340 [<00000000e0645d50>] netlink_sendmsg+0x78e/0xc90 net/netlink/af_netlink.c:1929 [<00000000c24559b7>] sock_sendmsg_nosec net/socket.c:654 [inline] [<00000000c24559b7>] sock_sendmsg+0x139/0x170 net/socket.c:674 [<00000000fc210bc2>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350 [<00000000be4577b5>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
Calling mrp_request_leave() after mrp_request_join(), the attr->state is set to MRP_APPLICANT_VO, mrp_attr_destroy() won't be called in last TX event in mrp_uninit_applicant(), the attr of applicant will be leaked. To fix this leak, iterate and free each attr of applicant before rerturning from mrp_uninit_applicant().
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/802/mrp.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/net/802/mrp.c b/net/802/mrp.c index be4dd3165347..7a893a03e795 100644 --- a/net/802/mrp.c +++ b/net/802/mrp.c @@ -295,6 +295,19 @@ static void mrp_attr_destroy(struct mrp_applicant *app, struct mrp_attr *attr) kfree(attr); }
+static void mrp_attr_destroy_all(struct mrp_applicant *app) +{ + struct rb_node *node, *next; + struct mrp_attr *attr; + + for (node = rb_first(&app->mad); + next = node ? rb_next(node) : NULL, node != NULL; + node = next) { + attr = rb_entry(node, struct mrp_attr, node); + mrp_attr_destroy(app, attr); + } +} + static int mrp_pdu_init(struct mrp_applicant *app) { struct sk_buff *skb; @@ -899,6 +912,7 @@ void mrp_uninit_applicant(struct net_device *dev, struct mrp_application *appl)
spin_lock_bh(&app->lock); mrp_mad_event(app, MRP_EVENT_TX); + mrp_attr_destroy_all(app); mrp_pdu_queue(app); spin_unlock_bh(&app->lock);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 42ca63f980842918560b25f0244307fd83b4777c ]
I got kmemleak report when doing fuzz test:
BUG: memory leak unreferenced object 0xffff88810c909b80 (size 64): comm "syz", pid 957, jiffies 4295220394 (age 399.090s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 08 00 00 00 01 02 00 04 ................ backtrace: [<00000000ca1f2e2e>] garp_request_join+0x285/0x3d0 [<00000000bf153351>] vlan_gvrp_request_join+0x15b/0x190 [<0000000024005e72>] vlan_dev_open+0x706/0x980 [<00000000dc20c4d4>] __dev_open+0x2bb/0x460 [<0000000066573004>] __dev_change_flags+0x501/0x650 [<0000000035b42f83>] rtnl_configure_link+0xee/0x280 [<00000000a5e69de0>] __rtnl_newlink+0xed5/0x1550 [<00000000a5258f4a>] rtnl_newlink+0x66/0x90 [<00000000506568ee>] rtnetlink_rcv_msg+0x439/0xbd0 [<00000000b7eaeae1>] netlink_rcv_skb+0x14d/0x420 [<00000000c373ce66>] netlink_unicast+0x550/0x750 [<00000000ec74ce74>] netlink_sendmsg+0x88b/0xda0 [<00000000381ff246>] sock_sendmsg+0xc9/0x120 [<000000008f6a2db3>] ____sys_sendmsg+0x6e8/0x820 [<000000008d9c1735>] ___sys_sendmsg+0x145/0x1c0 [<00000000aa39dd8b>] __sys_sendmsg+0xfe/0x1d0
Calling garp_request_leave() after garp_request_join(), the attr->state is set to GARP_APPLICANT_VO, garp_attr_destroy() won't be called in last transmit event in garp_uninit_applicant(), the attr of applicant will be leaked. To fix this leak, iterate and free each attr of applicant before rerturning from garp_uninit_applicant().
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/802/garp.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/net/802/garp.c b/net/802/garp.c index 2dac647ff420..237f6f076355 100644 --- a/net/802/garp.c +++ b/net/802/garp.c @@ -206,6 +206,19 @@ static void garp_attr_destroy(struct garp_applicant *app, struct garp_attr *attr kfree(attr); }
+static void garp_attr_destroy_all(struct garp_applicant *app) +{ + struct rb_node *node, *next; + struct garp_attr *attr; + + for (node = rb_first(&app->gid); + next = node ? rb_next(node) : NULL, node != NULL; + node = next) { + attr = rb_entry(node, struct garp_attr, node); + garp_attr_destroy(app, attr); + } +} + static int garp_pdu_init(struct garp_applicant *app) { struct sk_buff *skb; @@ -612,6 +625,7 @@ void garp_uninit_applicant(struct net_device *dev, struct garp_application *appl
spin_lock_bh(&app->lock); garp_gid_event(app, GARP_EVENT_TRANSMIT_PDU); + garp_attr_destroy_all(app); garp_pdu_queue(app); spin_unlock_bh(&app->lock);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0dbffbb5335a1e3aa6855e4ee317e25e669dd302 ]
sk_ll_usec is read locklessly from sk_can_busy_loop() while another thread can change its value in sock_setsockopt()
This is correct but needs annotations.
BUG: KCSAN: data-race in __skb_try_recv_datagram / sock_setsockopt
write to 0xffff88814eb5f904 of 4 bytes by task 14011 on cpu 0: sock_setsockopt+0x1287/0x2090 net/core/sock.c:1175 __sys_setsockopt+0x14f/0x200 net/socket.c:2100 __do_sys_setsockopt net/socket.c:2115 [inline] __se_sys_setsockopt net/socket.c:2112 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2112 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88814eb5f904 of 4 bytes by task 14001 on cpu 1: sk_can_busy_loop include/net/busy_poll.h:41 [inline] __skb_try_recv_datagram+0x14f/0x320 net/core/datagram.c:273 unix_dgram_recvmsg+0x14c/0x870 net/unix/af_unix.c:2101 unix_seqpacket_recvmsg+0x5a/0x70 net/unix/af_unix.c:2067 ____sys_recvmsg+0x15d/0x310 include/linux/uio.h:244 ___sys_recvmsg net/socket.c:2598 [inline] do_recvmmsg+0x35c/0x9f0 net/socket.c:2692 __sys_recvmmsg net/socket.c:2771 [inline] __do_sys_recvmmsg net/socket.c:2794 [inline] __se_sys_recvmmsg net/socket.c:2787 [inline] __x64_sys_recvmmsg+0xcf/0x150 net/socket.c:2787 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00000000 -> 0x00000101
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 14001 Comm: syz-executor.3 Not tainted 5.13.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/busy_poll.h | 2 +- net/core/sock.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index c86fcadccbd7..5dd22b740f9c 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -48,7 +48,7 @@ static inline bool net_busy_loop_on(void)
static inline bool sk_can_busy_loop(const struct sock *sk) { - return sk->sk_ll_usec && !signal_pending(current); + return READ_ONCE(sk->sk_ll_usec) && !signal_pending(current); }
bool sk_busy_loop_end(void *p, unsigned long start_time); diff --git a/net/core/sock.c b/net/core/sock.c index 3b65fedf77ca..699bd3052c61 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1023,7 +1023,7 @@ set_rcvbuf: if (val < 0) ret = -EINVAL; else - sk->sk_ll_usec = val; + WRITE_ONCE(sk->sk_ll_usec, val); } break; #endif
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 1d11fa231cabeae09a95cb3e4cf1d9dd34e00f08 ]
The doc draft-stewart-tsvwg-sctp-ipv4-00 that restricts 198 addresses was never published. These addresses as private addresses should be allowed to use in SCTP.
As Michael Tuexen suggested, this patch is to move 198 addresses from unusable to private scope.
Reported-by: Sérgio surkamp@gmail.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sctp/constants.h | 4 +--- net/sctp/protocol.c | 3 ++- 2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h index d4da07048aa3..cbf96458ce22 100644 --- a/include/net/sctp/constants.h +++ b/include/net/sctp/constants.h @@ -348,8 +348,7 @@ enum { #define SCTP_SCOPE_POLICY_MAX SCTP_SCOPE_POLICY_LINK
/* Based on IPv4 scoping <draft-stewart-tsvwg-sctp-ipv4-00.txt>, - * SCTP IPv4 unusable addresses: 0.0.0.0/8, 224.0.0.0/4, 198.18.0.0/24, - * 192.88.99.0/24. + * SCTP IPv4 unusable addresses: 0.0.0.0/8, 224.0.0.0/4, 192.88.99.0/24. * Also, RFC 8.4, non-unicast addresses are not considered valid SCTP * addresses. */ @@ -357,7 +356,6 @@ enum { ((htonl(INADDR_BROADCAST) == a) || \ ipv4_is_multicast(a) || \ ipv4_is_zeronet(a) || \ - ipv4_is_test_198(a) || \ ipv4_is_anycast_6to4(a))
/* Flags used for the bind address copy functions. */ diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index d5cf05efddfd..868b97607601 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -423,7 +423,8 @@ static enum sctp_scope sctp_v4_scope(union sctp_addr *addr) retval = SCTP_SCOPE_LINK; } else if (ipv4_is_private_10(addr->v4.sin_addr.s_addr) || ipv4_is_private_172(addr->v4.sin_addr.s_addr) || - ipv4_is_private_192(addr->v4.sin_addr.s_addr)) { + ipv4_is_private_192(addr->v4.sin_addr.s_addr) || + ipv4_is_test_198(addr->v4.sin_addr.s_addr)) { retval = SCTP_SCOPE_PRIVATE; } else { retval = SCTP_SCOPE_GLOBAL;
From: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com
[ Upstream commit 16ee572eaf0d09daa4c8a755fdb71e40dbf8562d ]
Patch series "hfs: fix various errors", v2.
This series ultimately aims to address a lockdep warning in hfs_find_init reported by Syzbot [1].
The work done for this led to the discovery of another bug, and the Syzkaller repro test also reveals an invalid memory access error after clearing the lockdep warning. Hence, this series is broken up into three patches:
1. Add a missing call to hfs_find_exit for an error path in hfs_fill_super
2. Fix memory mapping in hfs_bnode_read by fixing calls to kmap
3. Add lock nesting notation to tell lockdep that the observed locking hierarchy is safe
This patch (of 3):
Before exiting hfs_fill_super, the struct hfs_find_data used in hfs_find_init should be passed to hfs_find_exit to be cleaned up, and to release the lock held on the btree.
The call to hfs_find_exit is missing from an error path. We add it back in by consolidating calls to hfs_find_exit for error paths.
Link: https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585d... [1] Link: https://lkml.kernel.org/r/20210701030756.58760-1-desmondcheongzx@gmail.com Link: https://lkml.kernel.org/r/20210701030756.58760-2-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Cc: Gustavo A. R. Silva gustavoars@kernel.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: Shuah Khan skhan@linuxfoundation.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/super.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/hfs/super.c b/fs/hfs/super.c index 7e0d65e9586c..691810b0e6bc 100644 --- a/fs/hfs/super.c +++ b/fs/hfs/super.c @@ -427,14 +427,12 @@ static int hfs_fill_super(struct super_block *sb, void *data, int silent) if (!res) { if (fd.entrylength > sizeof(rec) || fd.entrylength < 0) { res = -EIO; - goto bail; + goto bail_hfs_find; } hfs_bnode_read(fd.bnode, &rec, fd.entryoffset, fd.entrylength); } - if (res) { - hfs_find_exit(&fd); - goto bail_no_root; - } + if (res) + goto bail_hfs_find; res = -EINVAL; root_inode = hfs_iget(sb, &fd.search_key->cat, &rec); hfs_find_exit(&fd); @@ -450,6 +448,8 @@ static int hfs_fill_super(struct super_block *sb, void *data, int silent) /* everything's okay */ return 0;
+bail_hfs_find: + hfs_find_exit(&fd); bail_no_root: pr_err("get root inode failed\n"); bail:
From: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com
[ Upstream commit 54a5ead6f5e2b47131a7385d0c0af18e7b89cb02 ]
Pages that we read in hfs_bnode_read need to be kmapped into kernel address space. However, currently only the 0th page is kmapped. If the given offset + length exceeds this 0th page, then we have an invalid memory access.
To fix this, we kmap relevant pages one by one and copy their relevant portions of data.
An example of invalid memory access occurring without this fix can be seen in the following crash report:
================================================================== BUG: KASAN: use-after-free in memcpy include/linux/fortify-string.h:191 [inline] BUG: KASAN: use-after-free in hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26 Read of size 2 at addr ffff888125fdcffe by task syz-executor5/4634
CPU: 0 PID: 4634 Comm: syz-executor5 Not tainted 5.13.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x195/0x1f8 lib/dump_stack.c:120 print_address_description.constprop.0+0x1d/0x110 mm/kasan/report.c:233 __kasan_report mm/kasan/report.c:419 [inline] kasan_report.cold+0x7b/0xd4 mm/kasan/report.c:436 check_region_inline mm/kasan/generic.c:180 [inline] kasan_check_range+0x154/0x1b0 mm/kasan/generic.c:186 memcpy+0x24/0x60 mm/kasan/shadow.c:65 memcpy include/linux/fortify-string.h:191 [inline] hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26 hfs_bnode_read_u16 fs/hfs/bnode.c:34 [inline] hfs_bnode_find+0x880/0xcc0 fs/hfs/bnode.c:365 hfs_brec_find+0x2d8/0x540 fs/hfs/bfind.c:126 hfs_brec_read+0x27/0x120 fs/hfs/bfind.c:165 hfs_cat_find_brec+0x19a/0x3b0 fs/hfs/catalog.c:194 hfs_fill_super+0xc13/0x1460 fs/hfs/super.c:419 mount_bdev+0x331/0x3f0 fs/super.c:1368 hfs_mount+0x35/0x40 fs/hfs/super.c:457 legacy_get_tree+0x10c/0x220 fs/fs_context.c:592 vfs_get_tree+0x93/0x300 fs/super.c:1498 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x13f5/0x20e0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline] __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount fs/namespace.c:3433 [inline] __x64_sys_mount+0x2b8/0x340 fs/namespace.c:3433 do_syscall_64+0x37/0xc0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x45e63a Code: 48 c7 c2 bc ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d2 e8 88 04 00 00 0f 1f 84 00 00 00 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9404d410d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000020000248 RCX: 000000000045e63a RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007f9404d41120 RBP: 00007f9404d41120 R08: 00000000200002c0 R09: 0000000020000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 R13: 0000000000000003 R14: 00000000004ad5d8 R15: 0000000000000000
The buggy address belongs to the page: page:00000000dadbcf3e refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x125fdc flags: 0x2fffc0000000000(node=0|zone=2|lastcpupid=0x3fff) raw: 02fffc0000000000 ffffea000497f748 ffffea000497f6c8 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888125fdce80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888125fdcf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888125fdcf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^ ffff888125fdd000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888125fdd080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ==================================================================
Link: https://lkml.kernel.org/r/20210701030756.58760-3-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Gustavo A. R. Silva gustavoars@kernel.org Cc: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/bnode.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c index 8aec5e732abf..bca3ea4137ee 100644 --- a/fs/hfs/bnode.c +++ b/fs/hfs/bnode.c @@ -15,16 +15,31 @@
#include "btree.h"
-void hfs_bnode_read(struct hfs_bnode *node, void *buf, - int off, int len) +void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len) { struct page *page; + int pagenum; + int bytes_read; + int bytes_to_read; + void *vaddr;
off += node->page_offset; - page = node->page[0]; + pagenum = off >> PAGE_SHIFT; + off &= ~PAGE_MASK; /* compute page offset for the first page */
- memcpy(buf, kmap(page) + off, len); - kunmap(page); + for (bytes_read = 0; bytes_read < len; bytes_read += bytes_to_read) { + if (pagenum >= node->tree->pages_per_bnode) + break; + page = node->page[pagenum]; + bytes_to_read = min_t(int, len - bytes_read, PAGE_SIZE - off); + + vaddr = kmap_atomic(page); + memcpy(buf + bytes_read, vaddr + off, bytes_to_read); + kunmap_atomic(vaddr); + + pagenum++; + off = 0; /* page offset only applies to the first page */ + } }
u16 hfs_bnode_read_u16(struct hfs_bnode *node, int off)
From: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com
[ Upstream commit b3b2177a2d795e35dc11597b2609eb1e7e57e570 ]
Syzbot reports a possible recursive lock in [1].
This happens due to missing lock nesting information. From the logs, we see that a call to hfs_fill_super is made to mount the hfs filesystem. While searching for the root inode, the lock on the catalog btree is grabbed. Then, when the parent of the root isn't found, a call to __hfs_bnode_create is made to create the parent of the root. This eventually leads to a call to hfs_ext_read_extent which grabs a lock on the extents btree.
Since the order of locking is catalog btree -> extents btree, this lock hierarchy does not lead to a deadlock.
To tell lockdep that this locking is safe, we add nesting notation to distinguish between catalog btrees, extents btrees, and attributes btrees (for HFS+). This has already been done in hfsplus.
Link: https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585d... [1] Link: https://lkml.kernel.org/r/20210701030756.58760-4-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi desmondcheongzx@gmail.com Reported-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com Tested-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Gustavo A. R. Silva gustavoars@kernel.org Cc: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/bfind.c | 14 +++++++++++++- fs/hfs/btree.h | 7 +++++++ 2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/fs/hfs/bfind.c b/fs/hfs/bfind.c index 4af318fbda77..ef9498a6e88a 100644 --- a/fs/hfs/bfind.c +++ b/fs/hfs/bfind.c @@ -25,7 +25,19 @@ int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd) fd->key = ptr + tree->max_key_len + 2; hfs_dbg(BNODE_REFS, "find_init: %d (%p)\n", tree->cnid, __builtin_return_address(0)); - mutex_lock(&tree->tree_lock); + switch (tree->cnid) { + case HFS_CAT_CNID: + mutex_lock_nested(&tree->tree_lock, CATALOG_BTREE_MUTEX); + break; + case HFS_EXT_CNID: + mutex_lock_nested(&tree->tree_lock, EXTENTS_BTREE_MUTEX); + break; + case HFS_ATTR_CNID: + mutex_lock_nested(&tree->tree_lock, ATTR_BTREE_MUTEX); + break; + default: + return -EINVAL; + } return 0; }
diff --git a/fs/hfs/btree.h b/fs/hfs/btree.h index dcc2aab1b2c4..25ac9a8bb57a 100644 --- a/fs/hfs/btree.h +++ b/fs/hfs/btree.h @@ -13,6 +13,13 @@ typedef int (*btree_keycmp)(const btree_key *, const btree_key *);
#define NODE_HASH_SIZE 256
+/* B-tree mutex nested subclasses */ +enum hfs_btree_mutex_classes { + CATALOG_BTREE_MUTEX, + EXTENTS_BTREE_MUTEX, + ATTR_BTREE_MUTEX, +}; + /* A HFS BTree held in memory */ struct hfs_btree { struct super_block *sb;
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit 82a1c67554dff610d6be4e1982c425717b3c6a23 ]
Once the new schema interrupt-controller/arm,vic.yaml is added, we get the below warnings:
arch/arm/boot/dts/versatile-ab.dt.yaml: intc@10140000: $nodename:0: 'intc@10140000' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
arch/arm/boot/dts/versatile-ab.dt.yaml: intc@10140000: 'clear-mask' does not match any of the regexes
Fix the node names for the interrupt controller to conform to the standard node name interrupt-controller@.. Also drop invalid clear-mask property.
Signed-off-by: Sudeep Holla sudeep.holla@arm.com Acked-by: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20210701132118.759454-1-sudeep.holla@arm.com' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/versatile-ab.dts | 5 ++--- arch/arm/boot/dts/versatile-pb.dts | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/arm/boot/dts/versatile-ab.dts b/arch/arm/boot/dts/versatile-ab.dts index a9000d22b2c0..873889ddecbe 100644 --- a/arch/arm/boot/dts/versatile-ab.dts +++ b/arch/arm/boot/dts/versatile-ab.dts @@ -155,16 +155,15 @@ #size-cells = <1>; ranges;
- vic: intc@10140000 { + vic: interrupt-controller@10140000 { compatible = "arm,versatile-vic"; interrupt-controller; #interrupt-cells = <1>; reg = <0x10140000 0x1000>; - clear-mask = <0xffffffff>; valid-mask = <0xffffffff>; };
- sic: intc@10003000 { + sic: interrupt-controller@10003000 { compatible = "arm,versatile-sic"; interrupt-controller; #interrupt-cells = <1>; diff --git a/arch/arm/boot/dts/versatile-pb.dts b/arch/arm/boot/dts/versatile-pb.dts index 06a0fdf24026..e7e751a858d8 100644 --- a/arch/arm/boot/dts/versatile-pb.dts +++ b/arch/arm/boot/dts/versatile-pb.dts @@ -7,7 +7,7 @@
amba { /* The Versatile PB is using more SIC IRQ lines than the AB */ - sic: intc@10003000 { + sic: interrupt-controller@10003000 { clear-mask = <0xffffffff>; /* * Valid interrupt lines mask according to
From: Eric Dumazet edumazet@google.com
commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced to chain sk_buff (using skb_shinfo(skb)->frag_list), which uses more memory but also cause packet consumers to go over a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy : It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") because GRO was using 2 frags per MSS, meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") Reported-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Link: https://www.spinics.net/lists/netdev/msg731397.html Co-Developed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: virtualization@lists.linux-foundation.org Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/virtio_net.c | 10 +++++++--- include/linux/virtio_net.h | 14 +++++++++----- 2 files changed, 16 insertions(+), 8 deletions(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -339,9 +339,13 @@ static struct sk_buff *page_to_skb(struc offset += hdr_padded_len; p += hdr_padded_len;
- copy = len; - if (copy > skb_tailroom(skb)) - copy = skb_tailroom(skb); + /* Copy all frame if it fits skb->head, otherwise + * we let virtio_net_hdr_to_skb() and GRO pull headers as needed. + */ + if (len <= skb_tailroom(skb)) + copy = len; + else + copy = ETH_HLEN; skb_put_data(skb, p, copy);
len -= copy; --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb( skb_reset_mac_header(skb);
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { - u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start); - u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); + u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start); + u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); + u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16)); + + if (!pskb_may_pull(skb, needed)) + return -EINVAL;
if (!skb_partial_csum_set(skb, start, off)) return -EINVAL;
p_off = skb_transport_offset(skb) + thlen; - if (p_off > skb_headlen(skb)) + if (!pskb_may_pull(skb, p_off)) return -EINVAL; } else { /* gso packets without NEEDS_CSUM do not set transport_offset. @@ -100,14 +104,14 @@ retry: }
p_off = keys.control.thoff + thlen; - if (p_off > skb_headlen(skb) || + if (!pskb_may_pull(skb, p_off) || keys.basic.ip_proto != ip_proto) return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff); } else if (gso_type) { p_off = thlen; - if (p_off > skb_headlen(skb)) + if (!pskb_may_pull(skb, p_off)) return -EINVAL; } }
From: Eric Dumazet edumazet@google.com
commit 38ec4944b593fd90c5ef42aaaa53e66ae5769d04 upstream.
After commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head") Guenter Roeck reported one failure in his tests using sh architecture.
After much debugging, we have been able to spot silent unaligned accesses in inet_gro_receive()
The issue at hand is that upper networking stacks assume their header is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN bytes before the Ethernet header to make that happen.
This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path if the fragment is not properly aligned.
Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN as 0, this extra check will be a NOP for them.
Note that if frag0 is not used, GRO will call pskb_may_pull() as many times as needed to pull network and transport headers.
Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head") Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Guenter Roeck linux@roeck-us.net Cc: Xuan Zhuo xuanzhuo@linux.alibaba.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Tested-by: Guenter Roeck linux@roeck-us.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/skbuff.h | 9 +++++++++ net/core/dev.c | 3 ++- 2 files changed, 11 insertions(+), 1 deletion(-)
--- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2785,6 +2785,15 @@ static inline void skb_propagate_pfmemal }
/** + * skb_frag_off() - Returns the offset of a skb fragment + * @frag: the paged fragment + */ +static inline unsigned int skb_frag_off(const skb_frag_t *frag) +{ + return frag->page_offset; +} + +/** * skb_frag_page - retrieve the page referred to by a paged fragment * @frag: the paged fragment * --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4763,7 +4763,8 @@ static void skb_gro_reset_offset(struct
if (skb_mac_header(skb) == skb_tail_pointer(skb) && pinfo->nr_frags && - !PageHighMem(skb_frag_page(frag0))) { + !PageHighMem(skb_frag_page(frag0)) && + (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) { NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0); NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int, skb_frag_size(frag0),
From: Juergen Gross jgross@suse.com
commit 76b4f357d0e7d8f6f0013c733e6cba1773c266d3 upstream.
KVM_MAX_VCPU_ID is the maximum vcpu-id of a guest, and not the number of vcpu-ids. Fix array indexed by vcpu-id to have KVM_MAX_VCPU_ID+1 elements.
Note that this is currently no real problem, as KVM_MAX_VCPU_ID is an odd number, resulting in always enough padding being available at the end of those arrays.
Nevertheless this should be fixed in order to avoid rare problems in case someone is using an even number for KVM_MAX_VCPU_ID.
Signed-off-by: Juergen Gross jgross@suse.com Message-Id: 20210701154105.23215-2-jgross@suse.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/ioapic.c | 2 +- arch/x86/kvm/ioapic.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -96,7 +96,7 @@ static unsigned long ioapic_read_indirec static void rtc_irq_eoi_tracking_reset(struct kvm_ioapic *ioapic) { ioapic->rtc_status.pending_eoi = 0; - bitmap_zero(ioapic->rtc_status.dest_map.map, KVM_MAX_VCPU_ID); + bitmap_zero(ioapic->rtc_status.dest_map.map, KVM_MAX_VCPU_ID + 1); }
static void kvm_rtc_eoi_tracking_restore_all(struct kvm_ioapic *ioapic); --- a/arch/x86/kvm/ioapic.h +++ b/arch/x86/kvm/ioapic.h @@ -43,13 +43,13 @@ struct kvm_vcpu;
struct dest_map { /* vcpu bitmap where IRQ has been sent */ - DECLARE_BITMAP(map, KVM_MAX_VCPU_ID); + DECLARE_BITMAP(map, KVM_MAX_VCPU_ID + 1);
/* * Vector sent to a given vcpu, only valid when * the vcpu's bit in map is set */ - u8 vectors[KVM_MAX_VCPU_ID]; + u8 vectors[KVM_MAX_VCPU_ID + 1]; };
From: Junxiao Bi junxiao.bi@oracle.com
commit f267aeb6dea5e468793e5b8eb6a9c72c0020d418 upstream.
If append-dio feature is enabled, direct-io write and fallocate could run in parallel to extend file size, fallocate used "orig_isize" to record i_size before taking "ip_alloc_sem", when ocfs2_zeroout_partial_cluster() zeroout EOF blocks, i_size maybe already extended by ocfs2_dio_end_io_write(), that will cause valid data zeroed out.
Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com Fixes: 6bba4471f0cc ("ocfs2: fix data corruption by fallocate") Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1941,7 +1941,6 @@ static int __ocfs2_change_file_space(str goto out_inode_unlock; }
- orig_isize = i_size_read(inode); switch (sr->l_whence) { case 0: /*SEEK_SET*/ break; @@ -1949,7 +1948,7 @@ static int __ocfs2_change_file_space(str sr->l_start += f_pos; break; case 2: /*SEEK_END*/ - sr->l_start += orig_isize; + sr->l_start += i_size_read(inode); break; default: ret = -EINVAL; @@ -2004,6 +2003,7 @@ static int __ocfs2_change_file_space(str ret = -EINVAL; }
+ orig_isize = i_size_read(inode); /* zeroout eof blocks in the cluster. */ if (!ret && change_size && orig_isize < size) { ret = ocfs2_zeroout_partial_cluster(inode, orig_isize,
From: Junxiao Bi junxiao.bi@oracle.com
commit 9449ad33be8480f538b11a593e2dda2fb33ca06d upstream.
For punch holes in EOF blocks, fallocate used buffer write to zero the EOF blocks in last cluster. But since ->writepage will ignore EOF pages, those zeros will not be flushed.
This "looks" ok as commit 6bba4471f0cc ("ocfs2: fix data corruption by fallocate") will zero the EOF blocks when extend the file size, but it isn't. The problem happened on those EOF pages, before writeback, those pages had DIRTY flag set and all buffer_head in them also had DIRTY flag set, when writeback run by write_cache_pages(), DIRTY flag on the page was cleared, but DIRTY flag on the buffer_head not.
When next write happened to those EOF pages, since buffer_head already had DIRTY flag set, it would not mark page DIRTY again. That made writeback ignore them forever. That will cause data corruption. Even directio write can't work because it will fail when trying to drop pages caches before direct io, as it found the buffer_head for those pages still had DIRTY flag set, then it will fall back to buffer io mode.
To make a summary of the issue, as writeback ingores EOF pages, once any EOF page is generated, any write to it will only go to the page cache, it will never be flushed to disk even file size extends and that page is not EOF page any more. The fix is to avoid zero EOF blocks with buffer write.
The following code snippet from qemu-img could trigger the corruption.
656 open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11 ... 660 fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...> 660 fallocate(11, 0, 2275868672, 327680) = 0 658 pwrite64(11, "
Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/file.c | 99 +++++++++++++++++++++++++++++++++----------------------- 1 file changed, 60 insertions(+), 39 deletions(-)
--- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1535,6 +1535,45 @@ static void ocfs2_truncate_cluster_pages } }
+/* + * zero out partial blocks of one cluster. + * + * start: file offset where zero starts, will be made upper block aligned. + * len: it will be trimmed to the end of current cluster if "start + len" + * is bigger than it. + */ +static int ocfs2_zeroout_partial_cluster(struct inode *inode, + u64 start, u64 len) +{ + int ret; + u64 start_block, end_block, nr_blocks; + u64 p_block, offset; + u32 cluster, p_cluster, nr_clusters; + struct super_block *sb = inode->i_sb; + u64 end = ocfs2_align_bytes_to_clusters(sb, start); + + if (start + len < end) + end = start + len; + + start_block = ocfs2_blocks_for_bytes(sb, start); + end_block = ocfs2_blocks_for_bytes(sb, end); + nr_blocks = end_block - start_block; + if (!nr_blocks) + return 0; + + cluster = ocfs2_bytes_to_clusters(sb, start); + ret = ocfs2_get_clusters(inode, cluster, &p_cluster, + &nr_clusters, NULL); + if (ret) + return ret; + if (!p_cluster) + return 0; + + offset = start_block - ocfs2_clusters_to_blocks(sb, cluster); + p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset; + return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS); +} + static int ocfs2_zero_partial_clusters(struct inode *inode, u64 start, u64 len) { @@ -1544,6 +1583,7 @@ static int ocfs2_zero_partial_clusters(s struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); unsigned int csize = osb->s_clustersize; handle_t *handle; + loff_t isize = i_size_read(inode);
/* * The "start" and "end" values are NOT necessarily part of @@ -1564,6 +1604,26 @@ static int ocfs2_zero_partial_clusters(s if ((start & (csize - 1)) == 0 && (end & (csize - 1)) == 0) goto out;
+ /* No page cache for EOF blocks, issue zero out to disk. */ + if (end > isize) { + /* + * zeroout eof blocks in last cluster starting from + * "isize" even "start" > "isize" because it is + * complicated to zeroout just at "start" as "start" + * may be not aligned with block size, buffer write + * would be required to do that, but out of eof buffer + * write is not supported. + */ + ret = ocfs2_zeroout_partial_cluster(inode, isize, + end - isize); + if (ret) { + mlog_errno(ret); + goto out; + } + if (start >= isize) + goto out; + end = isize; + } handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS); if (IS_ERR(handle)) { ret = PTR_ERR(handle); @@ -1862,45 +1922,6 @@ out: }
/* - * zero out partial blocks of one cluster. - * - * start: file offset where zero starts, will be made upper block aligned. - * len: it will be trimmed to the end of current cluster if "start + len" - * is bigger than it. - */ -static int ocfs2_zeroout_partial_cluster(struct inode *inode, - u64 start, u64 len) -{ - int ret; - u64 start_block, end_block, nr_blocks; - u64 p_block, offset; - u32 cluster, p_cluster, nr_clusters; - struct super_block *sb = inode->i_sb; - u64 end = ocfs2_align_bytes_to_clusters(sb, start); - - if (start + len < end) - end = start + len; - - start_block = ocfs2_blocks_for_bytes(sb, start); - end_block = ocfs2_blocks_for_bytes(sb, end); - nr_blocks = end_block - start_block; - if (!nr_blocks) - return 0; - - cluster = ocfs2_bytes_to_clusters(sb, start); - ret = ocfs2_get_clusters(inode, cluster, &p_cluster, - &nr_clusters, NULL); - if (ret) - return ret; - if (!p_cluster) - return 0; - - offset = start_block - ocfs2_clusters_to_blocks(sb, cluster); - p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset; - return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS); -} - -/* * Parts of this function taken from xfs_change_file_space() */ static int __ocfs2_change_file_space(struct file *file, struct inode *inode,
From: Ziyang Xuan william.xuanziyang@huawei.com
commit 54f93336d000229f72c26d8a3f69dd256b744528 upstream.
We get a bug during ltp can_filter test as following.
=========================================== [60919.264984] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [60919.265223] PGD 8000003dda726067 P4D 8000003dda726067 PUD 3dda727067 PMD 0 [60919.265443] Oops: 0000 [#1] SMP PTI [60919.265550] CPU: 30 PID: 3638365 Comm: can_filter Kdump: loaded Tainted: G W 4.19.90+ #1 [60919.266068] RIP: 0010:selinux_socket_sock_rcv_skb+0x3e/0x200 [60919.293289] RSP: 0018:ffff8d53bfc03cf8 EFLAGS: 00010246 [60919.307140] RAX: 0000000000000000 RBX: 000000000000001d RCX: 0000000000000007 [60919.320756] RDX: 0000000000000001 RSI: ffff8d5104a8ed00 RDI: ffff8d53bfc03d30 [60919.334319] RBP: ffff8d9338056800 R08: ffff8d53bfc29d80 R09: 0000000000000001 [60919.347969] R10: ffff8d53bfc03ec0 R11: ffffb8526ef47c98 R12: ffff8d53bfc03d30 [60919.350320] perf: interrupt took too long (3063 > 2500), lowering kernel.perf_event_max_sample_rate to 65000 [60919.361148] R13: 0000000000000001 R14: ffff8d53bcf90000 R15: 0000000000000000 [60919.361151] FS: 00007fb78b6b3600(0000) GS:ffff8d53bfc00000(0000) knlGS:0000000000000000 [60919.400812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [60919.413730] CR2: 0000000000000010 CR3: 0000003e3f784006 CR4: 00000000007606e0 [60919.426479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [60919.439339] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [60919.451608] PKRU: 55555554 [60919.463622] Call Trace: [60919.475617] <IRQ> [60919.487122] ? update_load_avg+0x89/0x5d0 [60919.498478] ? update_load_avg+0x89/0x5d0 [60919.509822] ? account_entity_enqueue+0xc5/0xf0 [60919.520709] security_sock_rcv_skb+0x2a/0x40 [60919.531413] sk_filter_trim_cap+0x47/0x1b0 [60919.542178] ? kmem_cache_alloc+0x38/0x1b0 [60919.552444] sock_queue_rcv_skb+0x17/0x30 [60919.562477] raw_rcv+0x110/0x190 [can_raw] [60919.572539] can_rcv_filter+0xbc/0x1b0 [can] [60919.582173] can_receive+0x6b/0xb0 [can] [60919.591595] can_rcv+0x31/0x70 [can] [60919.600783] __netif_receive_skb_one_core+0x5a/0x80 [60919.609864] process_backlog+0x9b/0x150 [60919.618691] net_rx_action+0x156/0x400 [60919.627310] ? sched_clock_cpu+0xc/0xa0 [60919.635714] __do_softirq+0xe8/0x2e9 [60919.644161] do_softirq_own_stack+0x2a/0x40 [60919.652154] </IRQ> [60919.659899] do_softirq.part.17+0x4f/0x60 [60919.667475] __local_bh_enable_ip+0x60/0x70 [60919.675089] __dev_queue_xmit+0x539/0x920 [60919.682267] ? finish_wait+0x80/0x80 [60919.689218] ? finish_wait+0x80/0x80 [60919.695886] ? sock_alloc_send_pskb+0x211/0x230 [60919.702395] ? can_send+0xe5/0x1f0 [can] [60919.708882] can_send+0xe5/0x1f0 [can] [60919.715037] raw_sendmsg+0x16d/0x268 [can_raw]
It's because raw_setsockopt() concurrently with unregister_netdevice_many(). Concurrent scenario as following.
cpu0 cpu1 raw_bind raw_setsockopt unregister_netdevice_many unlist_netdevice dev_get_by_index raw_notifier raw_enable_filters ...... can_rx_register can_rcv_list_find(..., net->can.rx_alldev_list)
......
sock_close raw_release(sock_a)
......
can_receive can_rcv_filter(net->can.rx_alldev_list, ...) raw_rcv(skb, sock_a) BUG
After unlist_netdevice(), dev_get_by_index() return NULL in raw_setsockopt(). Function raw_enable_filters() will add sock and can_filter to net->can.rx_alldev_list. Then the sock is closed. Followed by, we sock_sendmsg() to a new vcan device use the same can_filter. Protocol stack match the old receiver whose sock has been released on net->can.rx_alldev_list in can_rcv_filter(). Function raw_rcv() uses the freed sock. UAF BUG is triggered.
We can find that the key issue is that net_device has not been protected in raw_setsockopt(). Use rtnl_lock to protect net_device in raw_setsockopt().
Fixes: c18ce101f2e4 ("[CAN]: Add raw protocol") Link: https://lore.kernel.org/r/20210722070819.1048263-1-william.xuanziyang@huawei... Cc: linux-stable stable@vger.kernel.org Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/can/raw.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/net/can/raw.c +++ b/net/can/raw.c @@ -549,10 +549,18 @@ static int raw_setsockopt(struct socket return -EFAULT; }
+ rtnl_lock(); lock_sock(sk);
- if (ro->bound && ro->ifindex) + if (ro->bound && ro->ifindex) { dev = dev_get_by_index(sock_net(sk), ro->ifindex); + if (!dev) { + if (count > 1) + kfree(filter); + err = -ENODEV; + goto out_fil; + } + }
if (ro->bound) { /* (try to) register the new filters */ @@ -591,6 +599,7 @@ static int raw_setsockopt(struct socket dev_put(dev);
release_sock(sk); + rtnl_unlock();
break;
@@ -603,10 +612,16 @@ static int raw_setsockopt(struct socket
err_mask &= CAN_ERR_MASK;
+ rtnl_lock(); lock_sock(sk);
- if (ro->bound && ro->ifindex) + if (ro->bound && ro->ifindex) { dev = dev_get_by_index(sock_net(sk), ro->ifindex); + if (!dev) { + err = -ENODEV; + goto out_err; + } + }
/* remove current error mask */ if (ro->bound) { @@ -630,6 +645,7 @@ static int raw_setsockopt(struct socket dev_put(dev);
release_sock(sk); + rtnl_unlock();
break;
From: Pavel Skripkin paskripkin@gmail.com
commit fc43fb69a7af92839551f99c1a96a37b77b3ae7a upstream.
Yasushi reported, that his Microchip CAN Analyzer stopped working since commit 91c02557174b ("can: mcba_usb: fix memory leak in mcba_usb"). The problem was in missing urb->transfer_dma initialization.
In my previous patch to this driver I refactored mcba_usb_start() code to avoid leaking usb coherent buffers. To archive it, I passed local stack variable to usb_alloc_coherent() and then saved it to private array to correctly free all coherent buffers on ->close() call. But I forgot to initialize urb->transfer_dma with variable passed to usb_alloc_coherent().
All of this was causing device to not work, since dma addr 0 is not valid and following log can be found on bug report page, which points exactly to problem described above.
| DMAR: [DMA Write] Request device [00:14.0] PASID ffffffff fault addr 0 [fault reason 05] PTE Write access is not set
Fixes: 91c02557174b ("can: mcba_usb: fix memory leak in mcba_usb") Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990850 Link: https://lore.kernel.org/r/20210725103630.23864-1-paskripkin@gmail.com Cc: linux-stable stable@vger.kernel.org Reported-by: Yasushi SHOJI yasushi.shoji@gmail.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Tested-by: Yasushi SHOJI yashi@spacecubics.com [mkl: fixed typos in commit message - thanks Yasushi SHOJI] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/mcba_usb.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/can/usb/mcba_usb.c +++ b/drivers/net/can/usb/mcba_usb.c @@ -664,6 +664,8 @@ static int mcba_usb_start(struct mcba_pr break; }
+ urb->transfer_dma = buf_dma; + usb_fill_bulk_urb(urb, priv->udev, usb_rcvbulkpipe(priv->udev, MCBA_USB_EP_IN), buf, MCBA_USB_RX_BUFF_SIZE,
From: Pavel Skripkin paskripkin@gmail.com
commit 0e865f0c31928d6a313269ef624907eec55287c4 upstream.
In usb_8dev_start() MAX_RX_URBS coherent buffers are allocated and there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all 2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER is not set (see usb_8dev_start) and this flag cannot be used with coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent() explicitly.
Side note: This code looks like a copy-paste of other can drivers. The same patch was applied to mcba_usb driver and it works nice with real hardware. There is no change in functionality, only clean-up code for coherent buffers.
Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices") Link: https://lore.kernel.org/r/d39b458cd425a1cf7f512f340224e6e9563b07bd.162740447... Cc: linux-stable stable@vger.kernel.org Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/usb_8dev.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/net/can/usb/usb_8dev.c +++ b/drivers/net/can/usb/usb_8dev.c @@ -148,7 +148,8 @@ struct usb_8dev_priv { u8 *cmd_msg_buffer;
struct mutex usb_8dev_cmd_lock; - + void *rxbuf[MAX_RX_URBS]; + dma_addr_t rxbuf_dma[MAX_RX_URBS]; };
/* tx frame */ @@ -744,6 +745,7 @@ static int usb_8dev_start(struct usb_8de for (i = 0; i < MAX_RX_URBS; i++) { struct urb *urb = NULL; u8 *buf; + dma_addr_t buf_dma;
/* create a URB, and a buffer for it */ urb = usb_alloc_urb(0, GFP_KERNEL); @@ -753,7 +755,7 @@ static int usb_8dev_start(struct usb_8de }
buf = usb_alloc_coherent(priv->udev, RX_BUFFER_SIZE, GFP_KERNEL, - &urb->transfer_dma); + &buf_dma); if (!buf) { netdev_err(netdev, "No memory left for USB buffer\n"); usb_free_urb(urb); @@ -761,6 +763,8 @@ static int usb_8dev_start(struct usb_8de break; }
+ urb->transfer_dma = buf_dma; + usb_fill_bulk_urb(urb, priv->udev, usb_rcvbulkpipe(priv->udev, USB_8DEV_ENDP_DATA_RX), @@ -778,6 +782,9 @@ static int usb_8dev_start(struct usb_8de break; }
+ priv->rxbuf[i] = buf; + priv->rxbuf_dma[i] = buf_dma; + /* Drop reference, USB core will take care of freeing it */ usb_free_urb(urb); } @@ -847,6 +854,10 @@ static void unlink_all_urbs(struct usb_8
usb_kill_anchored_urbs(&priv->rx_submitted);
+ for (i = 0; i < MAX_RX_URBS; ++i) + usb_free_coherent(priv->udev, RX_BUFFER_SIZE, + priv->rxbuf[i], priv->rxbuf_dma[i]); + usb_kill_anchored_urbs(&priv->tx_submitted); atomic_set(&priv->active_tx_urbs, 0);
From: Pavel Skripkin paskripkin@gmail.com
commit 9969e3c5f40c166e3396acc36c34f9de502929f6 upstream.
In ems_usb_start() MAX_RX_URBS coherent buffers are allocated and there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all 2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER is not set (see ems_usb_start) and this flag cannot be used with coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent() explicitly.
Side note: This code looks like a copy-paste of other can drivers. The same patch was applied to mcba_usb driver and it works nice with real hardware. There is no change in functionality, only clean-up code for coherent buffers.
Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Link: https://lore.kernel.org/r/59aa9fbc9a8cbf9af2bbd2f61a659c480b415800.162740447... Cc: linux-stable stable@vger.kernel.org Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/ems_usb.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
--- a/drivers/net/can/usb/ems_usb.c +++ b/drivers/net/can/usb/ems_usb.c @@ -267,6 +267,8 @@ struct ems_usb { unsigned int free_slots; /* remember number of available slots */
struct ems_cpc_msg active_params; /* active controller parameters */ + void *rxbuf[MAX_RX_URBS]; + dma_addr_t rxbuf_dma[MAX_RX_URBS]; };
static void ems_usb_read_interrupt_callback(struct urb *urb) @@ -598,6 +600,7 @@ static int ems_usb_start(struct ems_usb for (i = 0; i < MAX_RX_URBS; i++) { struct urb *urb = NULL; u8 *buf = NULL; + dma_addr_t buf_dma;
/* create a URB, and a buffer for it */ urb = usb_alloc_urb(0, GFP_KERNEL); @@ -607,7 +610,7 @@ static int ems_usb_start(struct ems_usb }
buf = usb_alloc_coherent(dev->udev, RX_BUFFER_SIZE, GFP_KERNEL, - &urb->transfer_dma); + &buf_dma); if (!buf) { netdev_err(netdev, "No memory left for USB buffer\n"); usb_free_urb(urb); @@ -615,6 +618,8 @@ static int ems_usb_start(struct ems_usb break; }
+ urb->transfer_dma = buf_dma; + usb_fill_bulk_urb(urb, dev->udev, usb_rcvbulkpipe(dev->udev, 2), buf, RX_BUFFER_SIZE, ems_usb_read_bulk_callback, dev); @@ -630,6 +635,9 @@ static int ems_usb_start(struct ems_usb break; }
+ dev->rxbuf[i] = buf; + dev->rxbuf_dma[i] = buf_dma; + /* Drop reference, USB core will take care of freeing it */ usb_free_urb(urb); } @@ -695,6 +703,10 @@ static void unlink_all_urbs(struct ems_u
usb_kill_anchored_urbs(&dev->rx_submitted);
+ for (i = 0; i < MAX_RX_URBS; ++i) + usb_free_coherent(dev->udev, RX_BUFFER_SIZE, + dev->rxbuf[i], dev->rxbuf_dma[i]); + usb_kill_anchored_urbs(&dev->tx_submitted); atomic_set(&dev->active_tx_urbs, 0);
From: Pavel Skripkin paskripkin@gmail.com
commit 928150fad41ba16df7fcc9f7f945747d0f56cbb6 upstream.
In esd_usb2_setup_rx_urbs() MAX_RX_URBS coherent buffers are allocated and there is nothing, that frees them:
1) In callback function the urb is resubmitted and that's all 2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER is not set (see esd_usb2_setup_rx_urbs) and this flag cannot be used with coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent() explicitly.
Side note: This code looks like a copy-paste of other can drivers. The same patch was applied to mcba_usb driver and it works nice with real hardware. There is no change in functionality, only clean-up code for coherent buffers.
Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device") Link: https://lore.kernel.org/r/b31b096926dcb35998ad0271aac4b51770ca7cc8.162740447... Cc: linux-stable stable@vger.kernel.org Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/esd_usb2.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/drivers/net/can/usb/esd_usb2.c +++ b/drivers/net/can/usb/esd_usb2.c @@ -207,6 +207,8 @@ struct esd_usb2 { int net_count; u32 version; int rxinitdone; + void *rxbuf[MAX_RX_URBS]; + dma_addr_t rxbuf_dma[MAX_RX_URBS]; };
struct esd_usb2_net_priv { @@ -556,6 +558,7 @@ static int esd_usb2_setup_rx_urbs(struct for (i = 0; i < MAX_RX_URBS; i++) { struct urb *urb = NULL; u8 *buf = NULL; + dma_addr_t buf_dma;
/* create a URB, and a buffer for it */ urb = usb_alloc_urb(0, GFP_KERNEL); @@ -565,7 +568,7 @@ static int esd_usb2_setup_rx_urbs(struct }
buf = usb_alloc_coherent(dev->udev, RX_BUFFER_SIZE, GFP_KERNEL, - &urb->transfer_dma); + &buf_dma); if (!buf) { dev_warn(dev->udev->dev.parent, "No memory left for USB buffer\n"); @@ -573,6 +576,8 @@ static int esd_usb2_setup_rx_urbs(struct goto freeurb; }
+ urb->transfer_dma = buf_dma; + usb_fill_bulk_urb(urb, dev->udev, usb_rcvbulkpipe(dev->udev, 1), buf, RX_BUFFER_SIZE, @@ -585,8 +590,12 @@ static int esd_usb2_setup_rx_urbs(struct usb_unanchor_urb(urb); usb_free_coherent(dev->udev, RX_BUFFER_SIZE, buf, urb->transfer_dma); + goto freeurb; }
+ dev->rxbuf[i] = buf; + dev->rxbuf_dma[i] = buf_dma; + freeurb: /* Drop reference, USB core will take care of freeing it */ usb_free_urb(urb); @@ -674,6 +683,11 @@ static void unlink_all_urbs(struct esd_u int i, j;
usb_kill_anchored_urbs(&dev->rx_submitted); + + for (i = 0; i < MAX_RX_URBS; ++i) + usb_free_coherent(dev->udev, RX_BUFFER_SIZE, + dev->rxbuf[i], dev->rxbuf_dma[i]); + for (i = 0; i < dev->net_count; i++) { priv = dev->nets[i]; if (priv) {
From: Paul Jakma paul@jakma.org
commit 15bbf8bb4d4ab87108ecf5f4155ec8ffa3c141d6 upstream.
Commit 7930742d6, reverting 26fd962, missed out on reverting an incorrect change to a return value. The niu_pci_vpd_scan_props(..) == 1 case appears to be a normal path - treating it as an error and return -EINVAL was breaking VPD_SCAN and causing the driver to fail to load.
Fix, so my Neptune card works again.
Cc: Kangjie Lu kjlu@umn.edu Cc: Shannon Nelson shannon.lee.nelson@gmail.com Cc: David S. Miller davem@davemloft.net Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable stable@vger.kernel.org Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"') Signed-off-by: Paul Jakma paul@jakma.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/sun/niu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/sun/niu.c +++ b/drivers/net/ethernet/sun/niu.c @@ -8211,8 +8211,9 @@ static int niu_pci_vpd_fetch(struct niu err = niu_pci_vpd_scan_props(np, here, end); if (err < 0) return err; + /* ret == 1 is not an error */ if (err == 1) - return -EINVAL; + return 0; } return 0; }
From: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com
commit 5e7b30d24a5b8cb691c173b45b50e3ca0191be19 upstream.
There is a use after free memory corruption during module exit: - nfcsim_exit() - nfcsim_device_free(dev0) - nfc_digital_unregister_device() This iterates over command queue and frees all commands, - dev->up = false - nfcsim_link_shutdown() - nfcsim_link_recv_wake() This wakes the sleeping thread nfcsim_link_recv_skb().
- nfcsim_link_recv_skb() Wake from wait_event_interruptible_timeout(), call directly the deb->cb callback even though (dev->up == false), - digital_send_cmd_complete() Dereference of "struct digital_cmd" cmd which was freed earlier by nfc_digital_unregister_device().
This causes memory corruption shortly after (with unrelated stack trace):
nfc nfc0: NFC: nfcsim_recv_wq: Device is down llcp: nfc_llcp_recv: err -19 nfc nfc1: NFC: nfcsim_recv_wq: Device is down BUG: unable to handle page fault for address: ffffffffffffffed Call Trace: fsnotify+0x54b/0x5c0 __fsnotify_parent+0x1fe/0x300 ? vfs_write+0x27c/0x390 vfs_write+0x27c/0x390 ksys_write+0x63/0xe0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
KASAN report:
BUG: KASAN: use-after-free in digital_send_cmd_complete+0x16/0x50 Write of size 8 at addr ffff88800a05f720 by task kworker/0:2/71 Workqueue: events nfcsim_recv_wq [nfcsim] Call Trace: dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x21/0x140 ? digital_send_cmd_complete+0x16/0x50 ? digital_send_cmd_complete+0x16/0x50 kasan_report.cold+0x7f/0x11b ? digital_send_cmd_complete+0x16/0x50 ? digital_dep_link_down+0x60/0x60 digital_send_cmd_complete+0x16/0x50 nfcsim_recv_wq+0x38f/0x3d5 [nfcsim] ? nfcsim_in_send_cmd+0x4a/0x4a [nfcsim] ? lock_is_held_type+0x98/0x110 ? finish_wait+0x110/0x110 ? rcu_read_lock_sched_held+0x9c/0xd0 ? rcu_read_lock_bh_held+0xb0/0xb0 ? lockdep_hardirqs_on_prepare+0x12e/0x1f0
This flow of calling digital_send_cmd_complete() callback on driver exit is specific to nfcsim which implements reading and sending work queues. Since the NFC digital device was unregistered, the callback should not be called.
Fixes: 204bddcb508f ("NFC: nfcsim: Make use of the Digital layer") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nfc/nfcsim.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/nfc/nfcsim.c +++ b/drivers/nfc/nfcsim.c @@ -201,8 +201,7 @@ static void nfcsim_recv_wq(struct work_s
if (!IS_ERR(skb)) dev_kfree_skb(skb); - - skb = ERR_PTR(-ENODEV); + return; }
dev->cb(dev->nfc_digital_dev, dev->arg, skb);
From: Jan Kiszka jan.kiszka@siemens.com
[ Upstream commit f7b21a0e41171d22296b897dac6e4c41d2a3643c ]
Fix:
../arch/x86/include/asm/proto.h:14:30: warning: ‘struct task_struct’ declared \ inside parameter list will not be visible outside of this definition or declaration long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2); ^~~~~~~~~~~
.../arch/x86/include/asm/proto.h:40:34: warning: ‘struct task_struct’ declared \ inside parameter list will not be visible outside of this definition or declaration long do_arch_prctl_common(struct task_struct *task, int option, ^~~~~~~~~~~
if linux/sched.h hasn't be included previously. This fixes a build error when this header is used outside of the kernel tree.
[ bp: Massage commit message. ]
Signed-off-by: Jan Kiszka jan.kiszka@siemens.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/b76b4be3-cf66-f6b2-9a6c-3e7ef54f9845@web.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/proto.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h index 6e81788a30c1..0eaca7a130c9 100644 --- a/arch/x86/include/asm/proto.h +++ b/arch/x86/include/asm/proto.h @@ -4,6 +4,8 @@
#include <asm/ldt.h>
+struct task_struct; + /* misc architecture specific prototypes */
void syscall_init(void);
From: Nguyen Dinh Phi phind.uet@gmail.com
commit f9a5c358c8d26fed0cc45f2afc64633d4ba21dff upstream.
When we exceed the limit of BSS entries, this function will free the new entry, however, at this time, it is the last door to access the inputed ies, so these ies will be unreferenced objects and cause memory leak. Therefore we should free its ies before deallocating the new entry, beside of dropping it from hidden_list.
Signed-off-by: Nguyen Dinh Phi phind.uet@gmail.com Link: https://lore.kernel.org/r/20210628132334.851095-1-phind.uet@gmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/scan.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -1026,16 +1026,14 @@ cfg80211_bss_update(struct cfg80211_regi * be grouped with this beacon for updates ... */ if (!cfg80211_combine_bsses(rdev, new)) { - kfree(new); + bss_ref_put(rdev, new); goto drop; } }
if (rdev->bss_entries >= bss_entries_limit && !cfg80211_bss_expire_oldest(rdev)) { - if (!list_empty(&new->hidden_list)) - list_del(&new->hidden_list); - kfree(new); + bss_ref_put(rdev, new); goto drop; }
From: Florian Westphal fw@strlen.de
[ Upstream commit 30a56a2b881821625f79837d4d968c679852444e ]
In case the entry is evicted via garbage collection there is delay between the timeout value and the eviction event.
This adjusts the stop value based on how much time has passed.
Fixes: b87a2f9199ea82 ("netfilter: conntrack: add gc worker to remove timed-out entries") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index ede0ab5dc400..f13b476378aa 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -506,8 +506,13 @@ bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report) return false;
tstamp = nf_conn_tstamp_find(ct); - if (tstamp && tstamp->stop == 0) + if (tstamp) { + s32 timeout = ct->timeout - nfct_time_stamp; + tstamp->stop = ktime_get_real_ns(); + if (timeout < 0) + tstamp->stop -= jiffies_to_nsecs(-timeout); + }
if (nf_conntrack_event_report(IPCT_DESTROY, ct, portid, report) < 0) {
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit a33f387ecd5aafae514095c2c4a8c24f7aea7e8b ]
nft_nat reports a bogus EAFNOSUPPORT if no layer 3 information is specified.
Fixes: d07db9884a5f ("netfilter: nf_tables: introduce nft_validate_register_load()") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_nat.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c index a18cceecef88..04dd813ed775 100644 --- a/net/netfilter/nft_nat.c +++ b/net/netfilter/nft_nat.c @@ -153,7 +153,9 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr, alen = FIELD_SIZEOF(struct nf_nat_range, min_addr.ip6); break; default: - return -EAFNOSUPPORT; + if (tb[NFTA_NAT_REG_ADDR_MIN]) + return -EAFNOSUPPORT; + break; } priv->family = family;
From: Hoang Le hoang.h.le@dektech.com.au
[ Upstream commit d237a7f11719ff9320721be5818352e48071aab6 ]
The release_sock() is blocking function, it would change the state after sleeping. In order to evaluate the stated condition outside the socket lock context, switch to use wait_woken() instead.
Fixes: 6398e23cdb1d8 ("tipc: standardize accept routine") Acked-by: Jon Maloy jmaloy@redhat.com Signed-off-by: Hoang Le hoang.h.le@dektech.com.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/socket.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 14e6cb814e4c..2e4d892768f9 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2001,7 +2001,7 @@ static int tipc_listen(struct socket *sock, int len) static int tipc_wait_for_accept(struct socket *sock, long timeo) { struct sock *sk = sock->sk; - DEFINE_WAIT(wait); + DEFINE_WAIT_FUNC(wait, woken_wake_function); int err;
/* True wake-one mechanism for incoming connections: only @@ -2010,12 +2010,12 @@ static int tipc_wait_for_accept(struct socket *sock, long timeo) * anymore, the common case will execute the loop only once. */ for (;;) { - prepare_to_wait_exclusive(sk_sleep(sk), &wait, - TASK_INTERRUPTIBLE); if (timeo && skb_queue_empty(&sk->sk_receive_queue)) { + add_wait_queue(sk_sleep(sk), &wait); release_sock(sk); - timeo = schedule_timeout(timeo); + timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, timeo); lock_sock(sk); + remove_wait_queue(sk_sleep(sk), &wait); } err = 0; if (!skb_queue_empty(&sk->sk_receive_queue)) @@ -2027,7 +2027,6 @@ static int tipc_wait_for_accept(struct socket *sock, long timeo) if (signal_pending(current)) break; } - finish_wait(sk_sleep(sk), &wait); return err; }
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit 7e4960b3d66d7248b23de3251118147812b42da2 ]
The error code is missing in this code scenario, add the error code '-EINVAL' to the return value 'err'.
Eliminate the follow smatch warning:
drivers/net/ethernet/mellanox/mlx4/main.c:3538 mlx4_load_one() warn: missing error code 'err'.
Reported-by: Abaci Robot abaci@linux.alibaba.com Fixes: 7ae0e400cd93 ("net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs") Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx4/main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c index c6660b61e836..69692f7a523c 100644 --- a/drivers/net/ethernet/mellanox/mlx4/main.c +++ b/drivers/net/ethernet/mellanox/mlx4/main.c @@ -3469,6 +3469,7 @@ slave_start:
if (!SRIOV_VALID_STATE(dev->flags)) { mlx4_err(dev, "Invalid SRIOV state\n"); + err = -EINVAL; goto err_close; } }
From: Pavel Skripkin paskripkin@gmail.com
[ Upstream commit c7c9d2102c9c098916ab9e0ab248006107d00d6c ]
Syzbot reported skb_over_panic() in llc_pdu_init_as_xid_cmd(). The problem was in wrong LCC header manipulations.
Syzbot's reproducer tries to send XID packet. llc_ui_sendmsg() is doing following steps:
1. skb allocation with size = len + header size len is passed from userpace and header size is 3 since addr->sllc_xid is set.
2. skb_reserve() for header_len = 3 3. filling all other space with memcpy_from_msg()
Ok, at this moment we have fully loaded skb, only headers needs to be filled.
Then code comes to llc_sap_action_send_xid_c(). This function pushes 3 bytes for LLC PDU header and initializes it. Then comes llc_pdu_init_as_xid_cmd(). It initalizes next 3 bytes *AFTER* LLC PDU header and call skb_push(skb, 3). This looks wrong for 2 reasons:
1. Bytes rigth after LLC header are user data, so this function was overwriting payload.
2. skb_push(skb, 3) call can cause skb_over_panic() since all free space was filled in llc_ui_sendmsg(). (This can happen is user passed 686 len: 686 + 14 (eth header) + 3 (LLC header) = 703. SKB_DATA_ALIGN(703) = 704)
So, in this patch I added 2 new private constansts: LLC_PDU_TYPE_U_XID and LLC_PDU_LEN_U_XID. LLC_PDU_LEN_U_XID is used to correctly reserve header size to handle LLC + XID case. LLC_PDU_TYPE_U_XID is used by llc_pdu_header_init() function to push 6 bytes instead of 3. And finally I removed skb_push() call from llc_pdu_init_as_xid_cmd().
This changes should not affect other parts of LLC, since after all steps we just transmit buffer.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+5e5a981ad7cc54c4b2b4@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/llc_pdu.h | 31 +++++++++++++++++++++++-------- net/llc/af_llc.c | 10 +++++++++- net/llc/llc_s_ac.c | 2 +- 3 files changed, 33 insertions(+), 10 deletions(-)
diff --git a/include/net/llc_pdu.h b/include/net/llc_pdu.h index c0f0a13ed818..49aa79c7b278 100644 --- a/include/net/llc_pdu.h +++ b/include/net/llc_pdu.h @@ -15,9 +15,11 @@ #include <linux/if_ether.h>
/* Lengths of frame formats */ -#define LLC_PDU_LEN_I 4 /* header and 2 control bytes */ -#define LLC_PDU_LEN_S 4 -#define LLC_PDU_LEN_U 3 /* header and 1 control byte */ +#define LLC_PDU_LEN_I 4 /* header and 2 control bytes */ +#define LLC_PDU_LEN_S 4 +#define LLC_PDU_LEN_U 3 /* header and 1 control byte */ +/* header and 1 control byte and XID info */ +#define LLC_PDU_LEN_U_XID (LLC_PDU_LEN_U + sizeof(struct llc_xid_info)) /* Known SAP addresses */ #define LLC_GLOBAL_SAP 0xFF #define LLC_NULL_SAP 0x00 /* not network-layer visible */ @@ -50,9 +52,10 @@ #define LLC_PDU_TYPE_U_MASK 0x03 /* 8-bit control field */ #define LLC_PDU_TYPE_MASK 0x03
-#define LLC_PDU_TYPE_I 0 /* first bit */ -#define LLC_PDU_TYPE_S 1 /* first two bits */ -#define LLC_PDU_TYPE_U 3 /* first two bits */ +#define LLC_PDU_TYPE_I 0 /* first bit */ +#define LLC_PDU_TYPE_S 1 /* first two bits */ +#define LLC_PDU_TYPE_U 3 /* first two bits */ +#define LLC_PDU_TYPE_U_XID 4 /* private type for detecting XID commands */
#define LLC_PDU_TYPE_IS_I(pdu) \ ((!(pdu->ctrl_1 & LLC_PDU_TYPE_I_MASK)) ? 1 : 0) @@ -230,9 +233,18 @@ static inline struct llc_pdu_un *llc_pdu_un_hdr(struct sk_buff *skb) static inline void llc_pdu_header_init(struct sk_buff *skb, u8 type, u8 ssap, u8 dsap, u8 cr) { - const int hlen = type == LLC_PDU_TYPE_U ? 3 : 4; + int hlen = 4; /* default value for I and S types */ struct llc_pdu_un *pdu;
+ switch (type) { + case LLC_PDU_TYPE_U: + hlen = 3; + break; + case LLC_PDU_TYPE_U_XID: + hlen = 6; + break; + } + skb_push(skb, hlen); skb_reset_network_header(skb); pdu = llc_pdu_un_hdr(skb); @@ -374,7 +386,10 @@ static inline void llc_pdu_init_as_xid_cmd(struct sk_buff *skb, xid_info->fmt_id = LLC_XID_FMT_ID; /* 0x81 */ xid_info->type = svcs_supported; xid_info->rw = rx_window << 1; /* size of receive window */ - skb_put(skb, sizeof(struct llc_xid_info)); + + /* no need to push/put since llc_pdu_header_init() has already + * pushed 3 + 3 bytes + */ }
/** diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c index d301ac51bbe1..ec48fb3fd30e 100644 --- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -98,8 +98,16 @@ static inline u8 llc_ui_header_len(struct sock *sk, struct sockaddr_llc *addr) { u8 rc = LLC_PDU_LEN_U;
- if (addr->sllc_test || addr->sllc_xid) + if (addr->sllc_test) rc = LLC_PDU_LEN_U; + else if (addr->sllc_xid) + /* We need to expand header to sizeof(struct llc_xid_info) + * since llc_pdu_init_as_xid_cmd() sets 4,5,6 bytes of LLC header + * as XID PDU. In llc_ui_sendmsg() we reserved header size and then + * filled all other space with user data. If we won't reserve this + * bytes, llc_pdu_init_as_xid_cmd() will overwrite user data + */ + rc = LLC_PDU_LEN_U_XID; else if (sk->sk_type == SOCK_STREAM) rc = LLC_PDU_LEN_I; return rc; diff --git a/net/llc/llc_s_ac.c b/net/llc/llc_s_ac.c index 7ae4cc684d3a..9fa3342c7a82 100644 --- a/net/llc/llc_s_ac.c +++ b/net/llc/llc_s_ac.c @@ -79,7 +79,7 @@ int llc_sap_action_send_xid_c(struct llc_sap *sap, struct sk_buff *skb) struct llc_sap_state_ev *ev = llc_sap_ev(skb); int rc;
- llc_pdu_header_init(skb, LLC_PDU_TYPE_U, ev->saddr.lsap, + llc_pdu_header_init(skb, LLC_PDU_TYPE_U_XID, ev->saddr.lsap, ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_xid_cmd(skb, LLC_XID_NULL_CLASS_2, 0); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac);
From: Maor Gottlieb maorg@nvidia.com
[ Upstream commit 8b54874ef1617185048029a3083d510569e93751 ]
Fix a bug when flow table is created in priority that already has other flow tables as shown in the below diagram. If the new flow table (FT-B) has the lowest level in the priority, we need to connect the flow tables from the previous priority (p0) to this new table. In addition when this flow table is destroyed (FT-B), we need to connect the flow tables from the previous priority (p0) to the next level flow table (FT-C) in the same priority of the destroyed table (if exists).
--------- |root_ns| --------- | -------------------------------- | | | ---------- ---------- --------- |p(prio)-x| | p-y | | p-n | ---------- ---------- --------- | | ---------------- ------------------ |ns(e.g bypass)| |ns(e.g. kernel) | ---------------- ------------------ | | | ------- ------ ---- | p0 | | p1 | |p2| ------- ------ ---- | | \ -------- ------- ------ | FT-A | |FT-B | |FT-C| -------- ------- ------
Fixes: f90edfd279f3 ("net/mlx5_core: Connect flow tables") Signed-off-by: Maor Gottlieb maorg@nvidia.com Reviewed-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index 24d1b0be5a68..24f70c337d8f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -795,17 +795,19 @@ static int connect_fwd_rules(struct mlx5_core_dev *dev, static int connect_flow_table(struct mlx5_core_dev *dev, struct mlx5_flow_table *ft, struct fs_prio *prio) { - struct mlx5_flow_table *next_ft; + struct mlx5_flow_table *next_ft, *first_ft; int err = 0;
/* Connect_prev_fts and update_root_ft_create are mutually exclusive */
- if (list_empty(&prio->node.children)) { + first_ft = list_first_entry_or_null(&prio->node.children, + struct mlx5_flow_table, node.list); + if (!first_ft || first_ft->level > ft->level) { err = connect_prev_fts(dev, ft, prio); if (err) return err;
- next_ft = find_next_chained_ft(prio); + next_ft = first_ft ? first_ft : find_next_chained_ft(prio); err = connect_fwd_rules(dev, ft, next_ft); if (err) return err; @@ -1703,7 +1705,7 @@ static int disconnect_flow_table(struct mlx5_flow_table *ft) node.list) == ft)) return 0;
- next_ft = find_next_chained_ft(prio); + next_ft = find_next_ft(ft); err = connect_fwd_rules(dev, next_ft, ft); if (err) return err;
From: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
[ Upstream commit 557fb5862c9272ad9b21407afe1da8acfd9b53eb ]
As Ben Hutchings noticed, this check should have been inverted: the call returns true in case of success.
Reported-by: Ben Hutchings ben@decadent.org.uk Fixes: 0c5dc070ff3d ("sctp: validate from_addr_param return") Signed-off-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Reviewed-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sctp/input.c b/net/sctp/input.c index 1af35b69e99e..90428c59cfaf 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -1125,7 +1125,7 @@ static struct sctp_association *__sctp_rcv_asconf_lookup( if (unlikely(!af)) return NULL;
- if (af->from_addr_param(&paddr, param, peer_port, 0)) + if (!af->from_addr_param(&paddr, param, peer_port, 0)) return NULL;
return __sctp_lookup_association(net, laddr, &paddr, transportp);
From: Wang Hai wanghai38@huawei.com
[ Upstream commit 76a16be07b209a3f507c72abe823bd3af1c8661a ]
Replace pci_enable_device() with pcim_enable_device(), pci_disable_device() and pci_release_regions() will be called in release automatically.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang Hai wanghai38@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dec/tulip/winbond-840.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/dec/tulip/winbond-840.c b/drivers/net/ethernet/dec/tulip/winbond-840.c index 32d7229544fa..e3b4345b2cc8 100644 --- a/drivers/net/ethernet/dec/tulip/winbond-840.c +++ b/drivers/net/ethernet/dec/tulip/winbond-840.c @@ -367,7 +367,7 @@ static int w840_probe1(struct pci_dev *pdev, const struct pci_device_id *ent) int i, option = find_cnt < MAX_UNITS ? options[find_cnt] : 0; void __iomem *ioaddr;
- i = pci_enable_device(pdev); + i = pcim_enable_device(pdev); if (i) return i;
pci_set_master(pdev); @@ -389,7 +389,7 @@ static int w840_probe1(struct pci_dev *pdev, const struct pci_device_id *ent)
ioaddr = pci_iomap(pdev, TULIP_BAR, netdev_res_size); if (!ioaddr) - goto err_out_free_res; + goto err_out_netdev;
for (i = 0; i < 3; i++) ((__le16 *)dev->dev_addr)[i] = cpu_to_le16(eeprom_read(ioaddr, i)); @@ -468,8 +468,6 @@ static int w840_probe1(struct pci_dev *pdev, const struct pci_device_id *ent)
err_out_cleardev: pci_iounmap(pdev, ioaddr); -err_out_free_res: - pci_release_regions(pdev); err_out_netdev: free_netdev (dev); return -ENODEV; @@ -1537,7 +1535,6 @@ static void w840_remove1(struct pci_dev *pdev) if (dev) { struct netdev_private *np = netdev_priv(dev); unregister_netdev(dev); - pci_release_regions(pdev); pci_iounmap(pdev, np->base_addr); free_netdev(dev); }
From: Wang Hai wanghai38@huawei.com
[ Upstream commit 89fb62fde3b226f99b7015280cf132e2a7438edf ]
Replace pci_enable_device() with pcim_enable_device(), pci_disable_device() and pci_release_regions() will be called in release automatically.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang Hai wanghai38@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sis/sis900.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/sis/sis900.c b/drivers/net/ethernet/sis/sis900.c index 43b090f61cdc..aebc85a5e08a 100644 --- a/drivers/net/ethernet/sis/sis900.c +++ b/drivers/net/ethernet/sis/sis900.c @@ -441,7 +441,7 @@ static int sis900_probe(struct pci_dev *pci_dev, #endif
/* setup various bits in PCI command register */ - ret = pci_enable_device(pci_dev); + ret = pcim_enable_device(pci_dev); if(ret) return ret;
i = pci_set_dma_mask(pci_dev, DMA_BIT_MASK(32)); @@ -467,7 +467,7 @@ static int sis900_probe(struct pci_dev *pci_dev, ioaddr = pci_iomap(pci_dev, 0, 0); if (!ioaddr) { ret = -ENOMEM; - goto err_out_cleardev; + goto err_out; }
sis_priv = netdev_priv(net_dev); @@ -575,8 +575,6 @@ err_unmap_tx: sis_priv->tx_ring_dma); err_out_unmap: pci_iounmap(pci_dev, ioaddr); -err_out_cleardev: - pci_release_regions(pci_dev); err_out: free_netdev(net_dev); return ret; @@ -2423,7 +2421,6 @@ static void sis900_remove(struct pci_dev *pci_dev) sis_priv->tx_ring_dma); pci_iounmap(pci_dev, sis_priv->ioaddr); free_netdev(net_dev); - pci_release_regions(pci_dev); }
#ifdef CONFIG_PM
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit f6b3c7848e66e9046c8a79a5b88fd03461cc252b ]
The hi3110_cmd() is supposed to return zero on success and negative error codes on failure, but it was accidentally declared as a u8 when it needs to be an int type.
Fixes: 57e83fb9b746 ("can: hi311x: Add Holt HI-311x CAN driver") Link: https://lore.kernel.org/r/20210729141246.GA1267@kili Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/spi/hi311x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c index ddaf46239e39..472175e37055 100644 --- a/drivers/net/can/spi/hi311x.c +++ b/drivers/net/can/spi/hi311x.c @@ -236,7 +236,7 @@ static int hi3110_spi_trans(struct spi_device *spi, int len) return ret; }
-static u8 hi3110_cmd(struct spi_device *spi, u8 command) +static int hi3110_cmd(struct spi_device *spi, u8 command) { struct hi3110_priv *priv = spi_get_drvdata(spi);
From: Arnaldo Carvalho de Melo acme@redhat.com
commit 9bac1bd6e6d36459087a728a968e79e37ebcea1a upstream.
This makes 'perf top' abort in some cases, and the right fix will involve surgery that is too much to do at this stage, so revert for now and fix it in the next merge window.
This reverts commit 2d6b74baa7147251c30a46c4996e8cc224aa2dc5.
Cc: Riccardo Mancini rickyman7@gmail.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Krister Johansen kjlx@templeofstupid.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/map.c | 2 -- 1 file changed, 2 deletions(-)
--- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -216,8 +216,6 @@ struct map *map__new(struct machine *mac if (type != MAP__FUNCTION) dso__set_loaded(dso, map->type); } - - nsinfo__put(dso->nsinfo); dso->nsinfo = nsi; dso__put(dso); }
On Mon, 2 Aug 2021 at 19:19, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.14.242 release. There are 38 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 04 Aug 2021 13:43:24 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.242-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 4.14.242-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git branch: linux-4.14.y * git commit: ec038bb8339f8cbc9d78324a4e62c5cb3992e69f * git describe: v4.14.241-39-gec038bb8339f * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.14.y/build/v4.14....
## No regressions (compared to v4.14.241-14-g8cb34df08062)
## No fixes (compared to v4.14.241-14-g8cb34df08062)
## Test result summary total: 66149, pass: 52221, fail: 673, skip: 11274, xfail: 1981,
## Build Summary * arm: 97 total, 97 passed, 0 failed * arm64: 24 total, 24 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 14 total, 14 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 36 total, 36 passed, 0 failed * sparc: 9 total, 9 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 14 total, 14 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * install-android-platform-tools-r2600 * kselftest-android * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-lkdtm * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * rcutorture * ssuite * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On Mon, 02 Aug 2021 15:44:22 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.242 release. There are 38 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 04 Aug 2021 13:43:24 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.242-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.14: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 32 tests: 32 pass, 0 fail
Linux version: 4.14.242-rc1-gec038bb8339f Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, Aug 02, 2021 at 03:44:22PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.242 release. There are 38 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 04 Aug 2021 13:43:24 +0000. Anything received after that time might be too late.
Build results: total: 168 pass: 168 fail: 0 Qemu test results: total: 417 pass: 417 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 2021/8/2 21:44, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.242 release. There are 38 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 04 Aug 2021 13:43:24 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.242-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Tested on x86 for 4.14.242-rc1,
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Branch: linux-4.14.y Version: 4.14.242-rc1 Commit: ec038bb8339f8cbc9d78324a4e62c5cb3992e69f Compiler: gcc version 7.3.0 (GCC)
x86: -------------------------------------------------------------------- Testcase Result Summary: total: 8836 passed: 8836 failed: 0 timeout: 0 --------------------------------------------------------------------
Tested-by: Hulk Robot hulkrobot@huawei.com
linux-stable-mirror@lists.linaro.org