This is the start of the stable review cycle for the 4.19.46 release. There are 114 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 25 May 2019 06:15:02 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.46-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.46-rc1
Daniel Borkmann daniel@iogearbox.net bpf, lru: avoid messing with eviction heuristics upon syscall lookup
Daniel Borkmann daniel@iogearbox.net bpf: add map_lookup_elem_sys_only for lookups from syscall side
Chenbo Feng fengc@google.com bpf: relax inode permission check for retrieving bpf program
Greg Kroah-Hartman gregkh@linuxfoundation.org Revert "selftests/bpf: skip verifier tests for unsupported program types"
John Garry john.garry@huawei.com driver core: Postpone DMA tear-down until after devres release for probe failure
Nigel Croxon ncroxon@redhat.com md/raid: raid5 preserve the writeback action after the parity check
Song Liu songliubraving@fb.com Revert "Don't jump to compute_result state from check_result state"
Jiri Olsa jolsa@kernel.org perf/x86/intel: Fix race in intel_pmu_disable_event()
Arnaldo Carvalho de Melo acme@redhat.com perf bench numa: Add define for RUSAGE_THREAD if not present
Al Viro viro@zeniv.linux.org.uk ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
Gary Hook Gary.Hook@amd.com x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
Tobin C. Harding tobin@kernel.org sched/cpufreq: Fix kobject memleak
Luca Coelho luciano.coelho@intel.com iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb()
Bjørn Mork bjorn@mork.no qmi_wwan: new Wistron, ZTE and D-Link devices
Peter Zijlstra peterz@infradead.org bpf: Fix preempt_enable_no_resched() abuse
Andrey Smirnov andrew.smirnov@gmail.com power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
Andrew Jones drjones@redhat.com KVM: arm/arm64: Ensure vcpu target is unset on reset failure
Kangjie Lu kjlu@umn.edu net: ieee802154: fix missing checks for regmap_update_bits
Bhagavathi Perumal S bperumal@codeaurora.org mac80211: Fix kernel panic due to use of txq after free
Vitaly Kuznetsov vkuznets@redhat.com x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
Logan Gunthorpe logang@deltatee.com PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
Al Viro viro@zeniv.linux.org.uk apparmorfs: fix use-after-free on symlink traversal
Al Viro viro@zeniv.linux.org.uk securityfs: fix use-after-free on symlink traversal
Tony Lindgren tony@atomide.com power: supply: cpcap-battery: Fix division by zero
Jernej Skrabec jernej.skrabec@siol.net clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
Steffen Klassert steffen.klassert@secunet.com xfrm4: Fix uninitialized memory read in _decode_session4
Martin Willi martin@strongswan.org xfrm: Honor original L3 slave device in xfrmi policy lookup
Sabrina Dubroca sd@queasysnail.net esp4: add length check for UDP encapsulation
Cong Wang xiyou.wangcong@gmail.com xfrm: clean up xfrm protocol checks
Jeremy Sowden jeremy@azazel.net vti4: ipip tunnel deregistration fixes.
Su Yanjun suyj.fnst@cn.fujitsu.com xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
YueHaibing yuehaibing@huawei.com xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
Kirill Smelkov kirr@nexedi.com fuse: Add FOPEN_STREAM to use stream_open()
Martin Wilck mwilck@suse.com dm mpath: always free attached_handler_name in parse_path()
Mikulas Patocka mpatocka@redhat.com dm integrity: correctly calculate the size of metadata area
Mikulas Patocka mpatocka@redhat.com dm delay: fix a crash when invalid device is specified
Damien Le Moal damien.lemoal@wdc.com dm zoned: Fix zone report handling
Nikos Tsironis ntsironis@arrikto.com dm cache metadata: Fix loading discard bitset
Stefan Mätje stefan.maetje@esd.eu PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum
Stefan Mätje stefan.maetje@esd.eu PCI: Factor out pcie_retrain_link() function
Kazufumi Ikeda kaz-ikeda@xc.jp.nec.com PCI: rcar: Add the initialization of PCIe link in resume_noirq()
Jisheng Zhang Jisheng.Zhang@synaptics.com PCI/AER: Change pci_aer_init() stub to return void
Jean-Philippe Brucker jean-philippe.brucker@arm.com PCI: Init PCIe feature bits for managed host bridge alloc
James Prestwood james.prestwood@linux.intel.com PCI: Mark Atheros AR9462 to avoid bus reset
Nikolai Kostrigin nickel@altlinux.org PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
Yifeng Li tomli@tomli.me fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix support for 1024x768-16 mode
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAM
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
Yifeng Li tomli@tomli.me fbdev: sm712fb: fix brightness control on reboot, don't set SR30
Ard Biesheuvel ard.biesheuvel@linaro.org fbdev/efifb: Ignore framebuffer memmap entries that lack any memory types
Nathan Chancellor natechancellor@gmail.com objtool: Allow AR to be overridden with HOSTAR
Florian Fainelli f.fainelli@gmail.com MIPS: perf: Fix build with CONFIG_CPU_BMIPS5000 enabled
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix sample timestamp wrt non-taken branches
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix improved sample timestamp
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix instructions sampling rate
Dmitry Osipenko digetx@gmail.com memory: tegra: Fix integer overflow on tick value calculation
Elazar Leibovich elazar@lightbitslabs.com tracing: Fix partial reading of trace event's id file
Peter Zijlstra peterz@infradead.org ftrace/x86_64: Emulate call function while updating in breakpoint handler
Peter Zijlstra peterz@infradead.org x86_64: Allow breakpoints to emulate call instructions
Josh Poimboeuf jpoimboe@redhat.com x86_64: Add gap to int3 to allow for call emulation
Jeff Layton jlayton@kernel.org ceph: flush dirty inodes before proceeding with remount
Dmitry Osipenko digetx@gmail.com iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
Amir Goldstein amir73il@gmail.com ovl: fix missing upper fs freeze protection on copy up for ioctl
Liu Bo bo.liu@linux.alibaba.com fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
Miklos Szeredi mszeredi@redhat.com fuse: fix writepages on 32bit
Mikulas Patocka mpatocka@redhat.com udlfb: introduce a rendering mutex
Mikulas Patocka mpatocka@redhat.com udlfb: fix sleeping inside spinlock
Mikulas Patocka mpatocka@redhat.com udlfb: delete the unused parameter for dlfb_handle_damage
Jonas Karlman jonas@kwiboo.se clk: rockchip: fix wrong clock definitions for rk3328
Owen Chen owen.chen@mediatek.com clk: mediatek: Disable tuner_en before change PLL rate
Dmitry Osipenko digetx@gmail.com clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
Leo Yan leo.yan@linaro.org clk: hi3660: Mark clk_gate_ufs_subsys as critical
Olga Kornievskaia kolga@netapp.com PNFS fallback to MDS if no deviceid found
ZhangXiaoxu zhangxiaoxu5@huawei.com NFS4: Fix v4.0 client state corruption when mount
Steve Longerbeam slongerbeam@gmail.com media: imx: Clear fwnode link struct for each endpoint iteration
Steve Longerbeam slongerbeam@gmail.com media: imx: csi: Allow unknown nearest upstream entities
Janusz Krzysztofik jmkrzyszt@gmail.com media: ov6650: Fix sensor possibly not detected on probe
Colin Ian King colin.king@canonical.com phy: ti-pipe3: fix missing bit-wise or operator when assigning val
Christoph Probst kernel@probst.it cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level()
Phong Tran tranmanphong@gmail.com of: fix clang -Wunsequenced for be32_to_cpu()
Pan Bian bianpan2016@163.com p54: drop device reference count if fails to enable device
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: msu: Fix single mode with IOMMU
Al Viro viro@zeniv.linux.org.uk dcache: sort the freeing-without-RCU-delay mess for good.
Yufen Yu yuyufen@huawei.com md: add mddev->pers to avoid potential NULL pointer dereference
NeilBrown neilb@suse.com md: batch flush requests.
NeilBrown neilb@suse.com Revert "MD: fix lock contention for flush bios"
Paul Moore paul@paul-moore.com proc: prevent changes to overridden credentials
Hou Tao houtao1@huawei.com brd: re-enable __GFP_HIGHMEM in brd_insert_page()
Alexander Shishkin alexander.shishkin@linux.intel.com stm class: Fix channel bitmap on 32-bit systems
Tingwei Zhang tingwei@codeaurora.org stm class: Fix channel free in stm output free path
Helge Deller deller@gmx.de parisc: Rename LEVEL to PA_ASM_LEVEL to avoid name clash with DRBD code
Helge Deller deller@gmx.de parisc: Use PA_ASM_LEVEL in boot code
Helge Deller deller@gmx.de parisc: Skip registering LED when running in QEMU
Helge Deller deller@gmx.de parisc: Export running_on_qemu symbol for modules
Saeed Mahameed saeedm@mellanox.com net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled
Saeed Mahameed saeedm@mellanox.com net/mlx5: Imply MLXFW in mlx5_core
Jorge E. Moreira jemoreira@google.com vsock/virtio: Initialize core virtio vsock before registering the driver
Junwei Hu hujunwei4@huawei.com tipc: fix modprobe tipc failed after switch order of device registration
Stefano Garzarella sgarzare@redhat.com vsock/virtio: free packets during the socket release
Junwei Hu hujunwei4@huawei.com tipc: switch order of device registration to fix a crash
Sabrina Dubroca sd@queasysnail.net rtnetlink: always put IFLA_LINK for links with a link-netnsid
YueHaibing yuehaibing@huawei.com ppp: deflate: Fix possible crash in deflate_init
Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com nfp: flower: add rcu locks when accessing netdev for tunnels
Daniele Palmas dnlplm@gmail.com net: usb: qmi_wwan: add Telit 0x1260 and 0x1261 compositions
Willem de Bruijn willemb@google.com net: test nouarg before dereferencing zerocopy pointers
Yunjian Wang wangyunjian@huawei.com net/mlx4_core: Change the error print to info print
Eric Dumazet edumazet@google.com net: avoid weird emergency message
Florian Fainelli f.fainelli@gmail.com net: Always descend into dsa/
Eric Dumazet edumazet@google.com ipv6: prevent possible fib6 leaks
Wei Wang weiwan@google.com ipv6: fix src addr routing with the exception table
-------------
Diffstat:
Documentation/filesystems/porting | 5 + Makefile | 4 +- arch/mips/kernel/perf_event_mipsxx.c | 21 +- arch/parisc/boot/compressed/head.S | 6 +- arch/parisc/include/asm/assembly.h | 6 +- arch/parisc/kernel/head.S | 4 +- arch/parisc/kernel/process.c | 1 + arch/parisc/kernel/syscall.S | 2 +- arch/x86/entry/entry_64.S | 18 +- arch/x86/events/intel/core.c | 10 +- arch/x86/include/asm/text-patching.h | 28 +++ arch/x86/kernel/ftrace.c | 32 ++- arch/x86/kvm/hyperv.c | 11 +- arch/x86/lib/Makefile | 12 + drivers/base/dd.c | 5 +- drivers/block/brd.c | 7 +- drivers/clk/hisilicon/clk-hi3660.c | 6 +- drivers/clk/mediatek/clk-pll.c | 48 ++-- drivers/clk/rockchip/clk-rk3328.c | 18 +- drivers/clk/sunxi-ng/ccu_nkmp.c | 18 +- drivers/clk/tegra/clk-pll.c | 4 +- drivers/hwtracing/intel_th/msu.c | 35 ++- drivers/hwtracing/stm/core.c | 9 +- drivers/iommu/tegra-smmu.c | 25 ++- drivers/md/dm-cache-metadata.c | 9 +- drivers/md/dm-delay.c | 3 +- drivers/md/dm-integrity.c | 4 +- drivers/md/dm-mpath.c | 2 +- drivers/md/dm-zoned-metadata.c | 5 + drivers/md/md.c | 182 +++++++--------- drivers/md/md.h | 25 +-- drivers/md/raid5.c | 29 ++- drivers/media/i2c/ov6650.c | 2 + drivers/memory/tegra/mc.c | 2 +- drivers/net/Makefile | 2 +- drivers/net/ethernet/mellanox/mlx4/mcg.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 1 + .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 18 +- .../ethernet/netronome/nfp/flower/tunnel_conf.c | 17 +- drivers/net/ieee802154/mcr20a.c | 6 + drivers/net/ppp/ppp_deflate.c | 20 +- drivers/net/usb/qmi_wwan.c | 12 + drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 28 ++- drivers/net/wireless/intersil/p54/p54pci.c | 3 +- drivers/parisc/led.c | 3 + drivers/pci/controller/pcie-rcar.c | 21 ++ drivers/pci/pci.c | 19 +- drivers/pci/pci.h | 2 +- drivers/pci/pcie/aspm.c | 49 +++-- drivers/pci/probe.c | 23 +- drivers/pci/quirks.c | 19 ++ drivers/phy/ti/phy-ti-pipe3.c | 2 +- drivers/power/supply/cpcap-battery.c | 3 + drivers/power/supply/power_supply_sysfs.c | 6 - drivers/staging/media/imx/imx-media-csi.c | 18 +- drivers/staging/media/imx/imx-media-of.c | 15 +- drivers/video/fbdev/efifb.c | 8 +- drivers/video/fbdev/sm712.h | 12 +- drivers/video/fbdev/sm712fb.c | 242 +++++++++++++++++---- drivers/video/fbdev/udlfb.c | 114 ++++++++-- fs/ceph/super.c | 7 + fs/cifs/smb2ops.c | 14 +- fs/dcache.c | 24 +- fs/fuse/file.c | 13 +- fs/nfs/filelayout/filelayout.c | 2 +- fs/nfs/nfs4state.c | 4 + fs/nsfs.c | 3 +- fs/overlayfs/copy_up.c | 6 +- fs/overlayfs/file.c | 5 +- fs/overlayfs/overlayfs.h | 2 +- fs/proc/base.c | 5 + fs/ufs/util.h | 2 +- include/linux/bpf.h | 3 +- include/linux/dcache.h | 2 +- include/linux/of.h | 4 +- include/linux/pci.h | 2 + include/linux/skbuff.h | 9 +- include/net/ip6_fib.h | 3 +- include/net/xfrm.h | 20 +- include/uapi/linux/fuse.h | 2 + include/video/udlfb.h | 7 + kernel/bpf/hashtab.c | 23 +- kernel/bpf/inode.c | 2 +- kernel/bpf/syscall.c | 5 +- kernel/sched/cpufreq_schedutil.c | 1 + kernel/trace/trace_events.c | 3 - lib/Makefile | 11 + net/core/dev.c | 2 +- net/core/rtnetlink.c | 16 +- net/ipv4/esp4.c | 20 +- net/ipv4/ip_vti.c | 5 +- net/ipv4/xfrm4_policy.c | 24 +- net/ipv6/ip6_fib.c | 12 +- net/ipv6/route.c | 58 +++-- net/ipv6/xfrm6_tunnel.c | 6 +- net/key/af_key.c | 4 +- net/mac80211/iface.c | 3 + net/tipc/core.c | 14 +- net/vmw_vsock/virtio_transport.c | 13 +- net/vmw_vsock/virtio_transport_common.c | 7 + net/xfrm/xfrm_interface.c | 17 +- net/xfrm/xfrm_policy.c | 2 +- net/xfrm/xfrm_state.c | 2 +- net/xfrm/xfrm_user.c | 16 +- security/apparmor/apparmorfs.c | 13 +- security/inode.c | 13 +- tools/objtool/Makefile | 3 +- tools/perf/bench/numa.c | 4 + .../perf/util/intel-pt-decoder/intel-pt-decoder.c | 31 ++- tools/testing/selftests/bpf/test_verifier.c | 9 +- virt/kvm/arm/arm.c | 11 +- 111 files changed, 1215 insertions(+), 537 deletions(-)
From: Wei Wang weiwan@google.com
[ Upstream commit 510e2ceda031eed97a7a0f9aad65d271a58b460d ]
When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1. When the route is a gateway route or does not have next hop. (rt6_is_gw_or_nonexthop() == false) 2. When calling ip6_rt_cache_alloc(), saddr is passed in as NULL. This means, when looking for a route cache in the exception table, we have to do the lookup twice: first time with the passed in /128 host address, second time with the src_addr stored in fib6_info.
This solves the pmtu discovery issue reported by Mikael Magnusson where a route cache with a lower mtu info is created for a gateway route with src addr. However, the lookup code is not able to find this route cache.
Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache") Reported-by: Mikael Magnusson mikael.kernel@lists.m7n.se Bisected-by: David Ahern dsahern@gmail.com Signed-off-by: Wei Wang weiwan@google.com Cc: Martin Lau kafai@fb.com Cc: Eric Dumazet edumazet@google.com Acked-by: Martin KaFai Lau kafai@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 51 +++++++++++++++++++++++++++------------------------ 1 file changed, 27 insertions(+), 24 deletions(-)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -110,8 +110,8 @@ static int rt6_fill_node(struct net *net int iif, int type, u32 portid, u32 seq, unsigned int flags); static struct rt6_info *rt6_find_cached_rt(struct fib6_info *rt, - struct in6_addr *daddr, - struct in6_addr *saddr); + const struct in6_addr *daddr, + const struct in6_addr *saddr);
#ifdef CONFIG_IPV6_ROUTE_INFO static struct fib6_info *rt6_add_route_info(struct net *net, @@ -1542,31 +1542,44 @@ out: * Caller has to hold rcu_read_lock() */ static struct rt6_info *rt6_find_cached_rt(struct fib6_info *rt, - struct in6_addr *daddr, - struct in6_addr *saddr) + const struct in6_addr *daddr, + const struct in6_addr *saddr) { + const struct in6_addr *src_key = NULL; struct rt6_exception_bucket *bucket; - struct in6_addr *src_key = NULL; struct rt6_exception *rt6_ex; struct rt6_info *res = NULL;
- bucket = rcu_dereference(rt->rt6i_exception_bucket); - #ifdef CONFIG_IPV6_SUBTREES /* rt6i_src.plen != 0 indicates rt is in subtree * and exception table is indexed by a hash of * both rt6i_dst and rt6i_src. - * Otherwise, the exception table is indexed by - * a hash of only rt6i_dst. + * However, the src addr used to create the hash + * might not be exactly the passed in saddr which + * is a /128 addr from the flow. + * So we need to use f6i->fib6_src to redo lookup + * if the passed in saddr does not find anything. + * (See the logic in ip6_rt_cache_alloc() on how + * rt->rt6i_src is updated.) */ if (rt->fib6_src.plen) src_key = saddr; +find_ex: #endif + bucket = rcu_dereference(rt->rt6i_exception_bucket); rt6_ex = __rt6_find_exception_rcu(&bucket, daddr, src_key);
if (rt6_ex && !rt6_check_expired(rt6_ex->rt6i)) res = rt6_ex->rt6i;
+#ifdef CONFIG_IPV6_SUBTREES + /* Use fib6_src as src_key and redo lookup */ + if (!res && src_key && src_key != &rt->fib6_src.addr) { + src_key = &rt->fib6_src.addr; + goto find_ex; + } +#endif + return res; }
@@ -2650,10 +2663,8 @@ out: u32 ip6_mtu_from_fib6(struct fib6_info *f6i, struct in6_addr *daddr, struct in6_addr *saddr) { - struct rt6_exception_bucket *bucket; - struct rt6_exception *rt6_ex; - struct in6_addr *src_key; struct inet6_dev *idev; + struct rt6_info *rt; u32 mtu = 0;
if (unlikely(fib6_metric_locked(f6i, RTAX_MTU))) { @@ -2662,18 +2673,10 @@ u32 ip6_mtu_from_fib6(struct fib6_info * goto out; }
- src_key = NULL; -#ifdef CONFIG_IPV6_SUBTREES - if (f6i->fib6_src.plen) - src_key = saddr; -#endif - - bucket = rcu_dereference(f6i->rt6i_exception_bucket); - rt6_ex = __rt6_find_exception_rcu(&bucket, daddr, src_key); - if (rt6_ex && !rt6_check_expired(rt6_ex->rt6i)) - mtu = dst_metric_raw(&rt6_ex->rt6i->dst, RTAX_MTU); - - if (likely(!mtu)) { + rt = rt6_find_cached_rt(f6i, daddr, saddr); + if (unlikely(rt)) { + mtu = dst_metric_raw(&rt->dst, RTAX_MTU); + } else { struct net_device *dev = fib6_info_nh_dev(f6i);
mtu = IPV6_MIN_MTU;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 61fb0d01680771f72cc9d39783fb2c122aaad51e ]
At ipv6 route dismantle, fib6_drop_pcpu_from() is responsible for finding all percpu routes and set their ->from pointer to NULL, so that fib6_ref can reach its expected value (1).
The problem right now is that other cpus can still catch the route being deleted, since there is no rcu grace period between the route deletion and call to fib6_drop_pcpu_from()
This can leak the fib6 and associated resources, since no notifier will take care of removing the last reference(s).
I decided to add another boolean (fib6_destroying) instead of reusing/renaming exception_bucket_flushed to ease stable backports, and properly document the memory barriers used to implement this fix.
This patch has been co-developped with Wei Wang.
Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Wei Wang weiwan@google.com Cc: David Ahern dsahern@gmail.com Cc: Martin Lau kafai@fb.com Acked-by: Wei Wang weiwan@google.com Acked-by: Martin KaFai Lau kafai@fb.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ip6_fib.h | 3 ++- net/ipv6/ip6_fib.c | 12 +++++++++--- net/ipv6/route.c | 7 +++++++ 3 files changed, 18 insertions(+), 4 deletions(-)
--- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -171,7 +171,8 @@ struct fib6_info { dst_nocount:1, dst_nopolicy:1, dst_host:1, - unused:3; + fib6_destroying:1, + unused:2;
struct fib6_nh fib6_nh; struct rcu_head rcu; --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -877,6 +877,12 @@ static void fib6_drop_pcpu_from(struct f { int cpu;
+ /* Make sure rt6_make_pcpu_route() wont add other percpu routes + * while we are cleaning them here. + */ + f6i->fib6_destroying = 1; + mb(); /* paired with the cmpxchg() in rt6_make_pcpu_route() */ + /* release the reference to this fib entry from * all of its cached pcpu routes */ @@ -900,6 +906,9 @@ static void fib6_purge_rt(struct fib6_in { struct fib6_table *table = rt->fib6_table;
+ if (rt->rt6i_pcpu) + fib6_drop_pcpu_from(rt, table); + if (atomic_read(&rt->fib6_ref) != 1) { /* This route is used as dummy address holder in some split * nodes. It is not leaked, but it still holds other resources, @@ -921,9 +930,6 @@ static void fib6_purge_rt(struct fib6_in fn = rcu_dereference_protected(fn->parent, lockdep_is_held(&table->tb6_lock)); } - - if (rt->rt6i_pcpu) - fib6_drop_pcpu_from(rt, table); } }
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1268,6 +1268,13 @@ static struct rt6_info *rt6_make_pcpu_ro prev = cmpxchg(p, NULL, pcpu_rt); BUG_ON(prev);
+ if (rt->fib6_destroying) { + struct fib6_info *from; + + from = xchg((__force struct fib6_info **)&pcpu_rt->from, NULL); + fib6_info_release(from); + } + return pcpu_rt; }
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 0fe9f173d6cda95874edeb413b1fa9907b5ae830 ]
Jiri reported that with a kernel built with CONFIG_FIXED_PHY=y, CONFIG_NET_DSA=m and CONFIG_NET_DSA_LOOP=m, we would not get to a functional state where the mock-up driver is registered. Turns out that we are not descending into drivers/net/dsa/ unconditionally, and we won't be able to link-in dsa_loop_bdinfo.o which does the actual mock-up mdio device registration.
Reported-by: Jiri Pirko jiri@resnulli.us Fixes: 40013ff20b1b ("net: dsa: Fix functional dsa-loop dependency on FIXED_PHY") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Vivien Didelot vivien.didelot@gmail.com Tested-by: Jiri Pirko jiri@resnulli.us Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -40,7 +40,7 @@ obj-$(CONFIG_ARCNET) += arcnet/ obj-$(CONFIG_DEV_APPLETALK) += appletalk/ obj-$(CONFIG_CAIF) += caif/ obj-$(CONFIG_CAN) += can/ -obj-$(CONFIG_NET_DSA) += dsa/ +obj-y += dsa/ obj-$(CONFIG_ETHERNET) += ethernet/ obj-$(CONFIG_FDDI) += fddi/ obj-$(CONFIG_HIPPI) += hippi/
From: Eric Dumazet edumazet@google.com
[ Upstream commit d7c04b05c9ca14c55309eb139430283a45c4c25f ]
When host is under high stress, it is very possible thread running netdev_wait_allrefs() returns from msleep(250) 10 seconds late.
This leads to these messages in the syslog :
[...] unregister_netdevice: waiting for syz_tun to become free. Usage count = 0
If the device refcount is zero, the wait is over.
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -8716,7 +8716,7 @@ static void netdev_wait_allrefs(struct n
refcnt = netdev_refcnt_read(dev);
- if (time_after(jiffies, warning_time + 10 * HZ)) { + if (refcnt && time_after(jiffies, warning_time + 10 * HZ)) { pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n", dev->name, refcnt); warning_time = jiffies;
From: Yunjian Wang wangyunjian@huawei.com
[ Upstream commit 00f9fec48157f3734e52130a119846e67a12314b ]
The error print within mlx4_flow_steer_promisc_add() should be a info print.
Fixes: 592e49dda812 ('net/mlx4: Implement promiscuous mode with device managed flow-steering') Signed-off-by: Yunjian Wang wangyunjian@huawei.com Reviewed-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/mcg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx4/mcg.c +++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c @@ -1492,7 +1492,7 @@ int mlx4_flow_steer_promisc_add(struct m rule.port = port; rule.qpn = qpn; INIT_LIST_HEAD(&rule.list); - mlx4_err(dev, "going promisc on %x\n", port); + mlx4_info(dev, "going promisc on %x\n", port);
return mlx4_flow_attach(dev, &rule, regid_p); }
From: Willem de Bruijn willemb@google.com
[ Upstream commit 185ce5c38ea76f29b6bd9c7c8c7a5e5408834920 ]
Zerocopy skbs without completion notification were added for packet sockets with PACKET_TX_RING user buffers. Those signal completion through the TP_STATUS_USER bit in the ring. Zerocopy annotation was added only to avoid premature notification after clone or orphan, by triggering a copy on these paths for these packets.
The mechanism had to define a special "no-uarg" mode because packet sockets already use skb_uarg(skb) == skb_shinfo(skb)->destructor_arg for a different pointer.
Before deferencing skb_uarg(skb), verify that it is a real pointer.
Fixes: 5cd8d46ea1562 ("packet: copy user buffers before orphan or clone") Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/skbuff.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1333,10 +1333,12 @@ static inline void skb_zcopy_clear(struc struct ubuf_info *uarg = skb_zcopy(skb);
if (uarg) { - if (uarg->callback == sock_zerocopy_callback) { + if (skb_zcopy_is_nouarg(skb)) { + /* no notification callback */ + } else if (uarg->callback == sock_zerocopy_callback) { uarg->zerocopy = uarg->zerocopy && zerocopy; sock_zerocopy_put(uarg); - } else if (!skb_zcopy_is_nouarg(skb)) { + } else { uarg->callback(uarg, zerocopy); }
@@ -2587,7 +2589,8 @@ static inline int skb_orphan_frags(struc { if (likely(!skb_zcopy(skb))) return 0; - if (skb_uarg(skb)->callback == sock_zerocopy_callback) + if (!skb_zcopy_is_nouarg(skb) && + skb_uarg(skb)->callback == sock_zerocopy_callback) return 0; return skb_copy_ubufs(skb, gfp_mask); }
From: Daniele Palmas dnlplm@gmail.com
[ Upstream commit b4e467c82f8c12af78b6f6fa5730cb7dea7af1b4 ]
Added support for Telit LE910Cx 0x1260 and 0x1261 compositions.
Signed-off-by: Daniele Palmas dnlplm@gmail.com Acked-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/qmi_wwan.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1240,6 +1240,8 @@ static const struct usb_device_id produc {QMI_FIXED_INTF(0x1bc7, 0x1101, 3)}, /* Telit ME910 dual modem */ {QMI_FIXED_INTF(0x1bc7, 0x1200, 5)}, /* Telit LE920 */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1201, 2)}, /* Telit LE920, LE920A4 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1260, 2)}, /* Telit LE910Cx */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1261, 2)}, /* Telit LE910Cx */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1900, 1)}, /* Telit LN940 series */ {QMI_FIXED_INTF(0x1c9e, 0x9801, 3)}, /* Telewell TW-3G HSPA+ */ {QMI_FIXED_INTF(0x1c9e, 0x9803, 4)}, /* Telewell TW-3G HSPA+ */
From: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com
[ Upstream commit cb07d915bf278a7a3938b983bbcb4921366b5eff ]
Add rcu locks when accessing netdev when processing route request and tunnel keep alive messages received from hardware.
Fixes: 8e6a9046b66a ("nfp: flower vxlan neighbour offload") Fixes: 856f5b135758 ("nfp: flower vxlan neighbour keep-alive") Signed-off-by: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com Reviewed-by: Jakub Kicinski jakub.kicinski@netronome.com Reviewed-by: John Hurley john.hurley@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c | 17 ++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c +++ b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c @@ -194,6 +194,7 @@ void nfp_tunnel_keep_alive(struct nfp_ap return; }
+ rcu_read_lock(); for (i = 0; i < count; i++) { ipv4_addr = payload->tun_info[i].ipv4; port = be32_to_cpu(payload->tun_info[i].egress_port); @@ -209,6 +210,7 @@ void nfp_tunnel_keep_alive(struct nfp_ap neigh_event_send(n, NULL); neigh_release(n); } + rcu_read_unlock(); }
static bool nfp_tun_is_netdev_to_offload(struct net_device *netdev) @@ -404,9 +406,10 @@ void nfp_tunnel_request_route(struct nfp
payload = nfp_flower_cmsg_get_data(skb);
+ rcu_read_lock(); netdev = nfp_app_repr_get(app, be32_to_cpu(payload->ingress_port)); if (!netdev) - goto route_fail_warning; + goto fail_rcu_unlock;
flow.daddr = payload->ipv4_addr; flow.flowi4_proto = IPPROTO_UDP; @@ -416,21 +419,23 @@ void nfp_tunnel_request_route(struct nfp rt = ip_route_output_key(dev_net(netdev), &flow); err = PTR_ERR_OR_ZERO(rt); if (err) - goto route_fail_warning; + goto fail_rcu_unlock; #else - goto route_fail_warning; + goto fail_rcu_unlock; #endif
/* Get the neighbour entry for the lookup */ n = dst_neigh_lookup(&rt->dst, &flow.daddr); ip_rt_put(rt); if (!n) - goto route_fail_warning; - nfp_tun_write_neigh(n->dev, app, &flow, n, GFP_KERNEL); + goto fail_rcu_unlock; + nfp_tun_write_neigh(n->dev, app, &flow, n, GFP_ATOMIC); neigh_release(n); + rcu_read_unlock(); return;
-route_fail_warning: +fail_rcu_unlock: + rcu_read_unlock(); nfp_flower_cmsg_warn(app, "Requested route not found.\n"); }
Hi!
From: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com
[ Upstream commit cb07d915bf278a7a3938b983bbcb4921366b5eff ]
Add rcu locks when accessing netdev when processing route request and tunnel keep alive messages received from hardware.
/* Get the neighbour entry for the lookup */ n = dst_neigh_lookup(&rt->dst, &flow.daddr); ip_rt_put(rt); if (!n)
goto route_fail_warning;
- nfp_tun_write_neigh(n->dev, app, &flow, n, GFP_KERNEL);
goto fail_rcu_unlock;
- nfp_tun_write_neigh(n->dev, app, &flow, n, GFP_ATOMIC); neigh_release(n);
- rcu_read_unlock(); return;
This will make allocations more likely to fail. I thought it is okay to sleep in rcu lock critical sections...?
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit 3ebe1bca58c85325c97a22d4fc3f5b5420752e6f ]
BUG: unable to handle kernel paging request at ffffffffa018f000 PGD 3270067 P4D 3270067 PUD 3271063 PMD 2307eb067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 4138 Comm: modprobe Not tainted 5.1.0-rc7+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ppp_register_compressor+0x3e/0xd0 [ppp_generic] Code: 98 4a 3f e2 48 8b 15 c1 67 00 00 41 8b 0c 24 48 81 fa 40 f0 19 a0 75 0e eb 35 48 8b 12 48 81 fa 40 f0 19 a0 74 RSP: 0018:ffffc90000d93c68 EFLAGS: 00010287 RAX: ffffffffa018f000 RBX: ffffffffa01a3000 RCX: 000000000000001a RDX: ffff888230c750a0 RSI: 0000000000000000 RDI: ffffffffa019f000 RBP: ffffc90000d93c80 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0194080 R13: ffff88822ee1a700 R14: 0000000000000000 R15: ffffc90000d93e78 FS: 00007f2339557540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa018f000 CR3: 000000022bde4000 CR4: 00000000000006f0 Call Trace: ? 0xffffffffa01a3000 deflate_init+0x11/0x1000 [ppp_deflate] ? 0xffffffffa01a3000 do_one_initcall+0x6c/0x3cc ? kmem_cache_alloc_trace+0x248/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe
If ppp_deflate fails to register in deflate_init, module initialization failed out, however ppp_deflate_draft may has been regiestred and not unregistered before return. Then the seconed modprobe will trigger crash like this.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Guillaume Nault gnault@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ppp/ppp_deflate.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
--- a/drivers/net/ppp/ppp_deflate.c +++ b/drivers/net/ppp/ppp_deflate.c @@ -610,12 +610,20 @@ static struct compressor ppp_deflate_dra
static int __init deflate_init(void) { - int answer = ppp_register_compressor(&ppp_deflate); - if (answer == 0) - printk(KERN_INFO - "PPP Deflate Compression module registered\n"); - ppp_register_compressor(&ppp_deflate_draft); - return answer; + int rc; + + rc = ppp_register_compressor(&ppp_deflate); + if (rc) + return rc; + + rc = ppp_register_compressor(&ppp_deflate_draft); + if (rc) { + ppp_unregister_compressor(&ppp_deflate); + return rc; + } + + pr_info("PPP Deflate Compression module registered\n"); + return 0; }
static void __exit deflate_cleanup(void)
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit feadc4b6cf42a53a8a93c918a569a0b7e62bd350 ]
Currently, nla_put_iflink() doesn't put the IFLA_LINK attribute when iflink == ifindex.
In some cases, a device can be created in a different netns with the same ifindex as its parent. That device will not dump its IFLA_LINK attribute, which can confuse some userspace software that expects it. For example, if the last ifindex created in init_net and foo are both 8, these commands will trigger the issue:
ip link add parent type dummy # ifindex 9 ip link add link parent netns foo type macvlan # ifindex 9 in ns foo
So, in case a device puts the IFLA_LINK_NETNSID attribute in a dump, always put the IFLA_LINK attribute as well.
Thanks to Dan Winship for analyzing the original OpenShift bug down to the missing netlink attribute.
v2: change Fixes tag, it's been here forever, as Nicolas Dichtel said add Nicolas' ack v3: change Fixes tag fix subject typo, spotted by Edward Cree
Analyzed-by: Dan Winship danw@redhat.com Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.") Signed-off-by: Sabrina Dubroca sd@queasysnail.net Acked-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/rtnetlink.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1496,14 +1496,15 @@ static int put_master_ifindex(struct sk_ return ret; }
-static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev) +static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev, + bool force) { int ifindex = dev_get_iflink(dev);
- if (dev->ifindex == ifindex) - return 0; + if (force || dev->ifindex != ifindex) + return nla_put_u32(skb, IFLA_LINK, ifindex);
- return nla_put_u32(skb, IFLA_LINK, ifindex); + return 0; }
static noinline_for_stack int nla_put_ifalias(struct sk_buff *skb, @@ -1520,6 +1521,8 @@ static int rtnl_fill_link_netnsid(struct const struct net_device *dev, struct net *src_net) { + bool put_iflink = false; + if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) { struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
@@ -1528,10 +1531,12 @@ static int rtnl_fill_link_netnsid(struct
if (nla_put_s32(skb, IFLA_LINK_NETNSID, id)) return -EMSGSIZE; + + put_iflink = true; } }
- return 0; + return nla_put_iflink(skb, dev, put_iflink); }
static int rtnl_fill_link_af(struct sk_buff *skb, @@ -1617,7 +1622,6 @@ static int rtnl_fill_ifinfo(struct sk_bu #ifdef CONFIG_RPS nla_put_u32(skb, IFLA_NUM_RX_QUEUES, dev->num_rx_queues) || #endif - nla_put_iflink(skb, dev) || put_master_ifindex(skb, dev) || nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || (dev->qdisc &&
From: Junwei Hu hujunwei4@huawei.com
[ Upstream commit 7e27e8d6130c5e88fac9ddec4249f7f2337fe7f8 ]
When tipc is loaded while many processes try to create a TIPC socket, a crash occurs: PANIC: Unable to handle kernel paging request at virtual address "dfff20000000021d" pc : tipc_sk_create+0x374/0x1180 [tipc] lr : tipc_sk_create+0x374/0x1180 [tipc] Exception class = DABT (current EL), IL = 32 bits Call trace: tipc_sk_create+0x374/0x1180 [tipc] __sock_create+0x1cc/0x408 __sys_socket+0xec/0x1f0 __arm64_sys_socket+0x74/0xa8 ...
This is due to race between sock_create and unfinished register_pernet_device. tipc_sk_insert tries to do "net_generic(net, tipc_net_id)". but tipc_net_id is not initialized yet.
So switch the order of the two to close the race.
This can be reproduced with multiple processes doing socket(AF_TIPC, ...) and one process doing module removal.
Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Junwei Hu hujunwei4@huawei.com Reported-by: Wang Wang wangwang2@huawei.com Reviewed-by: Xiaogang Wang wangxiaogang3@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/core.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -129,10 +129,6 @@ static int __init tipc_init(void) if (err) goto out_netlink_compat;
- err = tipc_socket_init(); - if (err) - goto out_socket; - err = tipc_register_sysctl(); if (err) goto out_sysctl; @@ -141,6 +137,10 @@ static int __init tipc_init(void) if (err) goto out_pernet;
+ err = tipc_socket_init(); + if (err) + goto out_socket; + err = tipc_bearer_setup(); if (err) goto out_bearer; @@ -148,12 +148,12 @@ static int __init tipc_init(void) pr_info("Started in single node mode\n"); return 0; out_bearer: + tipc_socket_stop(); +out_socket: unregister_pernet_subsys(&tipc_net_ops); out_pernet: tipc_unregister_sysctl(); out_sysctl: - tipc_socket_stop(); -out_socket: tipc_netlink_compat_stop(); out_netlink_compat: tipc_netlink_stop(); @@ -165,10 +165,10 @@ out_netlink: static void __exit tipc_exit(void) { tipc_bearer_cleanup(); + tipc_socket_stop(); unregister_pernet_subsys(&tipc_net_ops); tipc_netlink_stop(); tipc_netlink_compat_stop(); - tipc_socket_stop(); tipc_unregister_sysctl();
pr_info("Deactivated\n");
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit ac03046ece2b158ebd204dfc4896fd9f39f0e6c8 ]
When the socket is released, we should free all packets queued in the per-socket list in order to avoid a memory leak.
Signed-off-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/virtio_transport_common.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -786,12 +786,19 @@ static bool virtio_transport_close(struc
void virtio_transport_release(struct vsock_sock *vsk) { + struct virtio_vsock_sock *vvs = vsk->trans; + struct virtio_vsock_pkt *pkt, *tmp; struct sock *sk = &vsk->sk; bool remove_sock = true;
lock_sock(sk); if (sk->sk_type == SOCK_STREAM) remove_sock = virtio_transport_close(vsk); + + list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) { + list_del(&pkt->list); + virtio_transport_free_pkt(pkt); + } release_sock(sk);
if (remove_sock)
From: Junwei Hu hujunwei4@huawei.com
[ Upstream commit 532b0f7ece4cb2ffd24dc723ddf55242d1188e5e ]
Error message printed: modprobe: ERROR: could not insert 'tipc': Address family not supported by protocol. when modprobe tipc after the following patch: switch order of device registration, commit 7e27e8d6130c ("tipc: switch order of device registration to fix a crash")
Because sock_create_kern(net, AF_TIPC, ...) is called by tipc_topsrv_create_listener() in the initialization process of tipc_net_ops, tipc_socket_init() must be execute before that.
I move tipc_socket_init() into function tipc_init_net().
Fixes: 7e27e8d6130c ("tipc: switch order of device registration to fix a crash") Signed-off-by: Junwei Hu hujunwei4@huawei.com Reported-by: Wang Wang wangwang2@huawei.com Reviewed-by: Kang Zhou zhoukang7@huawei.com Reviewed-by: Suanming Mou mousuanming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/core.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -66,6 +66,10 @@ static int __net_init tipc_init_net(stru INIT_LIST_HEAD(&tn->node_list); spin_lock_init(&tn->node_list_lock);
+ err = tipc_socket_init(); + if (err) + goto out_socket; + err = tipc_sk_rht_init(net); if (err) goto out_sk_rht; @@ -92,6 +96,8 @@ out_subscr: out_nametbl: tipc_sk_rht_destroy(net); out_sk_rht: + tipc_socket_stop(); +out_socket: return err; }
@@ -102,6 +108,7 @@ static void __net_exit tipc_exit_net(str tipc_bcast_stop(net); tipc_nametbl_stop(net); tipc_sk_rht_destroy(net); + tipc_socket_stop(); }
static struct pernet_operations tipc_net_ops = { @@ -137,10 +144,6 @@ static int __init tipc_init(void) if (err) goto out_pernet;
- err = tipc_socket_init(); - if (err) - goto out_socket; - err = tipc_bearer_setup(); if (err) goto out_bearer; @@ -148,8 +151,6 @@ static int __init tipc_init(void) pr_info("Started in single node mode\n"); return 0; out_bearer: - tipc_socket_stop(); -out_socket: unregister_pernet_subsys(&tipc_net_ops); out_pernet: tipc_unregister_sysctl(); @@ -165,7 +166,6 @@ out_netlink: static void __exit tipc_exit(void) { tipc_bearer_cleanup(); - tipc_socket_stop(); unregister_pernet_subsys(&tipc_net_ops); tipc_netlink_stop(); tipc_netlink_compat_stop();
From: "Jorge E. Moreira" jemoreira@google.com
[ Upstream commit ba95e5dfd36647622d8897a2a0470dde60e59ffd ]
Avoid a race in which static variables in net/vmw_vsock/af_vsock.c are accessed (while handling interrupts) before they are initialized.
[ 4.201410] BUG: unable to handle kernel paging request at ffffffffffffffe8 [ 4.207829] IP: vsock_addr_equals_addr+0x3/0x20 [ 4.211379] PGD 28210067 P4D 28210067 PUD 28212067 PMD 0 [ 4.211379] Oops: 0000 [#1] PREEMPT SMP PTI [ 4.211379] Modules linked in: [ 4.211379] CPU: 1 PID: 30 Comm: kworker/1:1 Not tainted 4.14.106-419297-gd7e28cc1f241 #1 [ 4.211379] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 4.211379] Workqueue: virtio_vsock virtio_transport_rx_work [ 4.211379] task: ffffa3273d175280 task.stack: ffffaea1800e8000 [ 4.211379] RIP: 0010:vsock_addr_equals_addr+0x3/0x20 [ 4.211379] RSP: 0000:ffffaea1800ebd28 EFLAGS: 00010286 [ 4.211379] RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffffffffb94e42f0 [ 4.211379] RDX: 0000000000000400 RSI: ffffffffffffffe0 RDI: ffffaea1800ebdd0 [ 4.211379] RBP: ffffaea1800ebd58 R08: 0000000000000001 R09: 0000000000000001 [ 4.211379] R10: 0000000000000000 R11: ffffffffb89d5d60 R12: ffffaea1800ebdd0 [ 4.211379] R13: 00000000828cbfbf R14: 0000000000000000 R15: ffffaea1800ebdc0 [ 4.211379] FS: 0000000000000000(0000) GS:ffffa3273fd00000(0000) knlGS:0000000000000000 [ 4.211379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.211379] CR2: ffffffffffffffe8 CR3: 000000002820e001 CR4: 00000000001606e0 [ 4.211379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.211379] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.211379] Call Trace: [ 4.211379] ? vsock_find_connected_socket+0x6c/0xe0 [ 4.211379] virtio_transport_recv_pkt+0x15f/0x740 [ 4.211379] ? detach_buf+0x1b5/0x210 [ 4.211379] virtio_transport_rx_work+0xb7/0x140 [ 4.211379] process_one_work+0x1ef/0x480 [ 4.211379] worker_thread+0x312/0x460 [ 4.211379] kthread+0x132/0x140 [ 4.211379] ? process_one_work+0x480/0x480 [ 4.211379] ? kthread_destroy_worker+0xd0/0xd0 [ 4.211379] ret_from_fork+0x35/0x40 [ 4.211379] Code: c7 47 08 00 00 00 00 66 c7 07 28 00 c7 47 08 ff ff ff ff c7 47 04 ff ff ff ff c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 8b 47 08 <3b> 46 08 75 0a 8b 47 04 3b 46 04 0f 94 c0 c3 31 c0 c3 90 66 2e [ 4.211379] RIP: vsock_addr_equals_addr+0x3/0x20 RSP: ffffaea1800ebd28 [ 4.211379] CR2: ffffffffffffffe8 [ 4.211379] ---[ end trace f31cc4a2e6df3689 ]--- [ 4.211379] Kernel panic - not syncing: Fatal exception in interrupt [ 4.211379] Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 4.211379] Rebooting in 5 seconds..
Fixes: 22b5c0b63f32 ("vsock/virtio: fix kernel panic after device hot-unplug") Cc: Stefan Hajnoczi stefanha@redhat.com Cc: Stefano Garzarella sgarzare@redhat.com Cc: "David S. Miller" davem@davemloft.net Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Cc: kernel-team@android.com Cc: stable@vger.kernel.org [4.9+] Signed-off-by: Jorge E. Moreira jemoreira@google.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Acked-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/virtio_transport.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -702,28 +702,27 @@ static int __init virtio_vsock_init(void if (!virtio_vsock_workqueue) return -ENOMEM;
- ret = register_virtio_driver(&virtio_vsock_driver); + ret = vsock_core_init(&virtio_transport.transport); if (ret) goto out_wq;
- ret = vsock_core_init(&virtio_transport.transport); + ret = register_virtio_driver(&virtio_vsock_driver); if (ret) - goto out_vdr; + goto out_vci;
return 0;
-out_vdr: - unregister_virtio_driver(&virtio_vsock_driver); +out_vci: + vsock_core_exit(); out_wq: destroy_workqueue(virtio_vsock_workqueue); return ret; - }
static void __exit virtio_vsock_exit(void) { - vsock_core_exit(); unregister_virtio_driver(&virtio_vsock_driver); + vsock_core_exit(); destroy_workqueue(virtio_vsock_workqueue); }
From: Saeed Mahameed saeedm@mellanox.com
[ Upstream commit bad861f31bb15a99becef31aab59640eaeb247e2 ]
mlxfw can be compiled as external module while mlx5_core can be builtin, in such case mlx5 will act like mlxfw is disabled.
Since mlxfw is just a service library for mlx* drivers, imply it in mlx5_core to make it always reachable if it was enabled.
Fixes: 3ffaabecd1a1 ("net/mlx5e: Support the flash device ethtool callback") Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -8,6 +8,7 @@ config MLX5_CORE depends on PCI imply PTP_1588_CLOCK imply VXLAN + imply MLXFW default n ---help--- Core driver for low level functionality of the ConnectX-4 and
From: Saeed Mahameed saeedm@mellanox.com
[ Upstream commit 8f0916c6dc5cd5e3bc52416fa2a9ff4075080180 ]
ethtool user spaces needs to know ring count via ETHTOOL_GRXRINGS when executing (ethtool -x) which is retrieved via ethtool get_rxnfc callback, in mlx5 this callback is disabled when CONFIG_MLX5_EN_RXNFC=n.
This patch allows only ETHTOOL_GRXRINGS command on mlx5e_get_rxnfc() when CONFIG_MLX5_EN_RXNFC is disabled, so ethtool -x will continue working.
Fixes: fe6d86b3c316 ("net/mlx5e: Add CONFIG_MLX5_EN_RXNFC for ethtool rx nfc") Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1609,6 +1609,22 @@ static int mlx5e_flash_device(struct net return mlx5e_ethtool_flash_device(priv, flash); }
+#ifndef CONFIG_MLX5_EN_RXNFC +/* When CONFIG_MLX5_EN_RXNFC=n we only support ETHTOOL_GRXRINGS + * otherwise this function will be defined from en_fs_ethtool.c + */ +static int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs) +{ + struct mlx5e_priv *priv = netdev_priv(dev); + + if (info->cmd != ETHTOOL_GRXRINGS) + return -EOPNOTSUPP; + /* ring_count is needed by ethtool -x */ + info->data = priv->channels.params.num_channels; + return 0; +} +#endif + const struct ethtool_ops mlx5e_ethtool_ops = { .get_drvinfo = mlx5e_get_drvinfo, .get_link = ethtool_op_get_link, @@ -1627,8 +1643,8 @@ const struct ethtool_ops mlx5e_ethtool_o .get_rxfh_indir_size = mlx5e_get_rxfh_indir_size, .get_rxfh = mlx5e_get_rxfh, .set_rxfh = mlx5e_set_rxfh, -#ifdef CONFIG_MLX5_EN_RXNFC .get_rxnfc = mlx5e_get_rxnfc, +#ifdef CONFIG_MLX5_EN_RXNFC .set_rxnfc = mlx5e_set_rxnfc, #endif .flash_device = mlx5e_flash_device,
From: Helge Deller deller@gmx.de
commit 3e1120f4b57bc12437048494ab56648edaa5b57d upstream.
Signed-off-by: Helge Deller deller@gmx.de CC: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/kernel/process.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/parisc/kernel/process.c +++ b/arch/parisc/kernel/process.c @@ -193,6 +193,7 @@ int dump_task_fpu (struct task_struct *t */
int running_on_qemu __read_mostly; +EXPORT_SYMBOL(running_on_qemu);
void __cpuidle arch_cpu_idle_dead(void) {
From: Helge Deller deller@gmx.de
commit b438749044356dd1329c45e9b5a9377b6ea13eb2 upstream.
No need to spend CPU cycles when we run on QEMU.
Signed-off-by: Helge Deller deller@gmx.de CC: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/parisc/led.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/parisc/led.c +++ b/drivers/parisc/led.c @@ -568,6 +568,9 @@ int __init register_led_driver(int model break;
case DISPLAY_MODEL_LASI: + /* Skip to register LED in QEMU */ + if (running_on_qemu) + return 1; LED_DATA_REG = data_reg; led_func_ptr = led_LASI_driver; printk(KERN_INFO "LED display at %lx registered\n", LED_DATA_REG);
From: Helge Deller deller@gmx.de
commit bdca5d64ee92abeacd6dada0bc6f6f8e6350dd67 upstream.
The LEVEL define clashed with the DRBD code.
Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/boot/compressed/head.S | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/parisc/boot/compressed/head.S +++ b/arch/parisc/boot/compressed/head.S @@ -22,7 +22,7 @@ __HEAD
ENTRY(startup) - .level LEVEL + .level PA_ASM_LEVEL
#define PSW_W_SM 0x200 #define PSW_W_BIT 36 @@ -63,7 +63,7 @@ $bss_loop: load32 BOOTADDR(decompress_kernel),%r3
#ifdef CONFIG_64BIT - .level LEVEL + .level PA_ASM_LEVEL ssm PSW_W_SM, %r0 /* set W-bit */ depdi 0, 31, 32, %r3 #endif @@ -72,7 +72,7 @@ $bss_loop:
startup_continue: #ifdef CONFIG_64BIT - .level LEVEL + .level PA_ASM_LEVEL rsm PSW_W_SM, %r0 /* clear W-bit */ #endif
From: Helge Deller deller@gmx.de
commit 1829dda0e87f4462782ca81be474c7890efe31ce upstream.
LEVEL is a very common word, and now after many years it suddenly clashed with another LEVEL define in the DRBD code. Rename it to PA_ASM_LEVEL instead.
Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/include/asm/assembly.h | 6 +++--- arch/parisc/kernel/head.S | 4 ++-- arch/parisc/kernel/syscall.S | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-)
--- a/arch/parisc/include/asm/assembly.h +++ b/arch/parisc/include/asm/assembly.h @@ -61,14 +61,14 @@ #define LDCW ldcw,co #define BL b,l # ifdef CONFIG_64BIT -# define LEVEL 2.0w +# define PA_ASM_LEVEL 2.0w # else -# define LEVEL 2.0 +# define PA_ASM_LEVEL 2.0 # endif #else #define LDCW ldcw #define BL bl -#define LEVEL 1.1 +#define PA_ASM_LEVEL 1.1 #endif
#ifdef __ASSEMBLY__ --- a/arch/parisc/kernel/head.S +++ b/arch/parisc/kernel/head.S @@ -22,7 +22,7 @@ #include <linux/linkage.h> #include <linux/init.h>
- .level LEVEL + .level PA_ASM_LEVEL
__INITDATA ENTRY(boot_args) @@ -258,7 +258,7 @@ stext_pdc_ret: ldo R%PA(fault_vector_11)(%r10),%r10
$is_pa20: - .level LEVEL /* restore 1.1 || 2.0w */ + .level PA_ASM_LEVEL /* restore 1.1 || 2.0w */ #endif /*!CONFIG_64BIT*/ load32 PA(fault_vector_20),%r10
--- a/arch/parisc/kernel/syscall.S +++ b/arch/parisc/kernel/syscall.S @@ -48,7 +48,7 @@ registers). */ #define KILL_INSN break 0,0
- .level LEVEL + .level PA_ASM_LEVEL
.text
From: Tingwei Zhang tingwei@codeaurora.org
commit ee496da4c3915de3232b5f5cd20e21ae3e46fe8d upstream.
Number of free masters is not set correctly in stm free path. Fix this by properly adding the number of output channels before setting them to 0 in stm_output_disclaim().
Currently it is equivalent to doing nothing since master->nr_free is incremented by 0.
Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") Signed-off-by: Tingwei Zhang tingwei@codeaurora.org Signed-off-by: Sai Prakash Ranjan saiprakash.ranjan@codeaurora.org Cc: stable@vger.kernel.org # v4.4 Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/stm/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwtracing/stm/core.c +++ b/drivers/hwtracing/stm/core.c @@ -218,8 +218,8 @@ stm_output_disclaim(struct stm_device *s bitmap_release_region(&master->chan_map[0], output->channel, ilog2(output->nr_chans));
- output->nr_chans = 0; master->nr_free += output->nr_chans; + output->nr_chans = 0; }
/*
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 51e0f227812ed81a368de54157ebe14396b4be03 upstream.
Commit 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") naively calculates the channel bitmap size in 64-bit chunks regardless of the size of underlying unsigned long, making the bitmap half as big on a 32-bit system. This leads to an out of bounds access with the upper half of the bitmap.
Fix this by using BITS_TO_LONGS. While at it, convert to using struct_size() for the total size calculation of the master struct.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") Reported-by: Mulu He muluhe@codeaurora.org Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/stm/core.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/hwtracing/stm/core.c +++ b/drivers/hwtracing/stm/core.c @@ -166,11 +166,10 @@ stm_master(struct stm_device *stm, unsig static int stp_master_alloc(struct stm_device *stm, unsigned int idx) { struct stp_master *master; - size_t size;
- size = ALIGN(stm->data->sw_nchannels, 8) / 8; - size += sizeof(struct stp_master); - master = kzalloc(size, GFP_ATOMIC); + master = kzalloc(struct_size(master, chan_map, + BITS_TO_LONGS(stm->data->sw_nchannels)), + GFP_ATOMIC); if (!master) return -ENOMEM;
From: Hou Tao houtao1@huawei.com
commit f6b50160a06d4a0d6a3999ab0c5aec4f52dba248 upstream.
__GFP_HIGHMEM is disabled if dax is enabled on brd, however dax support for brd has been removed since commit (7a862fbbdec6 "brd: remove dax support"), so restore __GFP_HIGHMEM in brd_insert_page().
Also remove the no longer applicable comments about DAX and highmem.
Cc: stable@vger.kernel.org Fixes: 7a862fbbdec6 ("brd: remove dax support") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/brd.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
--- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -96,13 +96,8 @@ static struct page *brd_insert_page(stru /* * Must use NOIO because we don't want to recurse back into the * block or filesystem layers from page reclaim. - * - * Cannot support DAX and highmem, because our ->direct_access - * routine for DAX must return memory that is always addressable. - * If DAX was reworked to use pfns and kmap throughout, this - * restriction might be able to be lifted. */ - gfp_flags = GFP_NOIO | __GFP_ZERO; + gfp_flags = GFP_NOIO | __GFP_ZERO | __GFP_HIGHMEM; page = alloc_page(gfp_flags); if (!page) return NULL;
From: Paul Moore paul@paul-moore.com
commit 35a196bef449b5824033865b963ed9a43fb8c730 upstream.
Prevent userspace from changing the the /proc/PID/attr values if the task's credentials are currently overriden. This not only makes sense conceptually, it also prevents some really bizarre error cases caused when trying to commit credentials to a task with overridden credentials.
Cc: stable@vger.kernel.org Reported-by: "chengjian (D)" cj.chengjian@huawei.com Signed-off-by: Paul Moore paul@paul-moore.com Acked-by: John Johansen john.johansen@canonical.com Acked-by: James Morris james.morris@microsoft.com Acked-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/base.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -2542,6 +2542,11 @@ static ssize_t proc_pid_attr_write(struc rcu_read_unlock(); return -EACCES; } + /* Prevent changes to overridden credentials. */ + if (current_cred() != current_real_cred()) { + rcu_read_unlock(); + return -EBUSY; + } rcu_read_unlock();
if (count > PAGE_SIZE)
From: NeilBrown neilb@suse.com
commit 4bc034d35377196c854236133b07730a777c4aba upstream.
This reverts commit 5a409b4f56d50b212334f338cb8465d65550cd85.
This patch has two problems.
1/ it make multiple calls to submit_bio() from inside a make_request_fn. The bios thus submitted will be queued on current->bio_list and not submitted immediately. As the bios are allocated from a mempool, this can theoretically result in a deadlock - all the pool of requests could be in various ->bio_list queues and a subsequent mempool_alloc could block waiting for one of them to be released.
2/ It aims to handle a case when there are many concurrent flush requests. It handles this by submitting many requests in parallel - all of which are identical and so most of which do nothing useful. It would be more efficient to just send one lower-level request, but allow that to satisfy multiple upper-level requests.
Fixes: 5a409b4f56d5 ("MD: fix lock contention for flush bios") Cc: stable@vger.kernel.org # v4.19+ Tested-by: Xiao Ni xni@redhat.com Signed-off-by: NeilBrown neilb@suse.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/md.c | 163 ++++++++++++++++++-------------------------------------- drivers/md/md.h | 22 ++----- 2 files changed, 62 insertions(+), 123 deletions(-)
--- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -132,24 +132,6 @@ static inline int speed_max(struct mddev mddev->sync_speed_max : sysctl_speed_limit_max; }
-static void * flush_info_alloc(gfp_t gfp_flags, void *data) -{ - return kzalloc(sizeof(struct flush_info), gfp_flags); -} -static void flush_info_free(void *flush_info, void *data) -{ - kfree(flush_info); -} - -static void * flush_bio_alloc(gfp_t gfp_flags, void *data) -{ - return kzalloc(sizeof(struct flush_bio), gfp_flags); -} -static void flush_bio_free(void *flush_bio, void *data) -{ - kfree(flush_bio); -} - static struct ctl_table_header *raid_table_header;
static struct ctl_table raid_table[] = { @@ -429,54 +411,30 @@ static int md_congested(void *data, int /* * Generic flush handling for md */ -static void submit_flushes(struct work_struct *ws) -{ - struct flush_info *fi = container_of(ws, struct flush_info, flush_work); - struct mddev *mddev = fi->mddev; - struct bio *bio = fi->bio; - - bio->bi_opf &= ~REQ_PREFLUSH; - md_handle_request(mddev, bio); - - mempool_free(fi, mddev->flush_pool); -}
-static void md_end_flush(struct bio *fbio) +static void md_end_flush(struct bio *bio) { - struct flush_bio *fb = fbio->bi_private; - struct md_rdev *rdev = fb->rdev; - struct flush_info *fi = fb->fi; - struct bio *bio = fi->bio; - struct mddev *mddev = fi->mddev; + struct md_rdev *rdev = bio->bi_private; + struct mddev *mddev = rdev->mddev;
rdev_dec_pending(rdev, mddev);
- if (atomic_dec_and_test(&fi->flush_pending)) { - if (bio->bi_iter.bi_size == 0) { - /* an empty barrier - all done */ - bio_endio(bio); - mempool_free(fi, mddev->flush_pool); - } else { - INIT_WORK(&fi->flush_work, submit_flushes); - queue_work(md_wq, &fi->flush_work); - } + if (atomic_dec_and_test(&mddev->flush_pending)) { + /* The pre-request flush has finished */ + queue_work(md_wq, &mddev->flush_work); } - - mempool_free(fb, mddev->flush_bio_pool); - bio_put(fbio); + bio_put(bio); }
-void md_flush_request(struct mddev *mddev, struct bio *bio) +static void md_submit_flush_data(struct work_struct *ws); + +static void submit_flushes(struct work_struct *ws) { + struct mddev *mddev = container_of(ws, struct mddev, flush_work); struct md_rdev *rdev; - struct flush_info *fi; - - fi = mempool_alloc(mddev->flush_pool, GFP_NOIO); - - fi->bio = bio; - fi->mddev = mddev; - atomic_set(&fi->flush_pending, 1);
+ INIT_WORK(&mddev->flush_work, md_submit_flush_data); + atomic_set(&mddev->flush_pending, 1); rcu_read_lock(); rdev_for_each_rcu(rdev, mddev) if (rdev->raid_disk >= 0 && @@ -486,40 +444,59 @@ void md_flush_request(struct mddev *mdde * we reclaim rcu_read_lock */ struct bio *bi; - struct flush_bio *fb; atomic_inc(&rdev->nr_pending); atomic_inc(&rdev->nr_pending); rcu_read_unlock(); - - fb = mempool_alloc(mddev->flush_bio_pool, GFP_NOIO); - fb->fi = fi; - fb->rdev = rdev; - bi = bio_alloc_mddev(GFP_NOIO, 0, mddev); - bio_set_dev(bi, rdev->bdev); bi->bi_end_io = md_end_flush; - bi->bi_private = fb; + bi->bi_private = rdev; + bio_set_dev(bi, rdev->bdev); bi->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH; - - atomic_inc(&fi->flush_pending); + atomic_inc(&mddev->flush_pending); submit_bio(bi); - rcu_read_lock(); rdev_dec_pending(rdev, mddev); } rcu_read_unlock(); + if (atomic_dec_and_test(&mddev->flush_pending)) + queue_work(md_wq, &mddev->flush_work); +}
- if (atomic_dec_and_test(&fi->flush_pending)) { - if (bio->bi_iter.bi_size == 0) { - /* an empty barrier - all done */ - bio_endio(bio); - mempool_free(fi, mddev->flush_pool); - } else { - INIT_WORK(&fi->flush_work, submit_flushes); - queue_work(md_wq, &fi->flush_work); - } +static void md_submit_flush_data(struct work_struct *ws) +{ + struct mddev *mddev = container_of(ws, struct mddev, flush_work); + struct bio *bio = mddev->flush_bio; + + /* + * must reset flush_bio before calling into md_handle_request to avoid a + * deadlock, because other bios passed md_handle_request suspend check + * could wait for this and below md_handle_request could wait for those + * bios because of suspend check + */ + mddev->flush_bio = NULL; + wake_up(&mddev->sb_wait); + + if (bio->bi_iter.bi_size == 0) { + /* an empty barrier - all done */ + bio_endio(bio); + } else { + bio->bi_opf &= ~REQ_PREFLUSH; + md_handle_request(mddev, bio); } } + +void md_flush_request(struct mddev *mddev, struct bio *bio) +{ + spin_lock_irq(&mddev->lock); + wait_event_lock_irq(mddev->sb_wait, + !mddev->flush_bio, + mddev->lock); + mddev->flush_bio = bio; + spin_unlock_irq(&mddev->lock); + + INIT_WORK(&mddev->flush_work, submit_flushes); + queue_work(md_wq, &mddev->flush_work); +} EXPORT_SYMBOL(md_flush_request);
static inline struct mddev *mddev_get(struct mddev *mddev) @@ -566,6 +543,7 @@ void mddev_init(struct mddev *mddev) atomic_set(&mddev->openers, 0); atomic_set(&mddev->active_io, 0); spin_lock_init(&mddev->lock); + atomic_set(&mddev->flush_pending, 0); init_waitqueue_head(&mddev->sb_wait); init_waitqueue_head(&mddev->recovery_wait); mddev->reshape_position = MaxSector; @@ -5519,22 +5497,6 @@ int md_run(struct mddev *mddev) if (err) return err; } - if (mddev->flush_pool == NULL) { - mddev->flush_pool = mempool_create(NR_FLUSH_INFOS, flush_info_alloc, - flush_info_free, mddev); - if (!mddev->flush_pool) { - err = -ENOMEM; - goto abort; - } - } - if (mddev->flush_bio_pool == NULL) { - mddev->flush_bio_pool = mempool_create(NR_FLUSH_BIOS, flush_bio_alloc, - flush_bio_free, mddev); - if (!mddev->flush_bio_pool) { - err = -ENOMEM; - goto abort; - } - }
spin_lock(&pers_lock); pers = find_pers(mddev->level, mddev->clevel); @@ -5694,15 +5656,8 @@ int md_run(struct mddev *mddev) return 0;
abort: - if (mddev->flush_bio_pool) { - mempool_destroy(mddev->flush_bio_pool); - mddev->flush_bio_pool = NULL; - } - if (mddev->flush_pool){ - mempool_destroy(mddev->flush_pool); - mddev->flush_pool = NULL; - } - + bioset_exit(&mddev->bio_set); + bioset_exit(&mddev->sync_set); return err; } EXPORT_SYMBOL_GPL(md_run); @@ -5906,14 +5861,6 @@ static void __md_stop(struct mddev *mdde mddev->to_remove = &md_redundancy_group; module_put(pers->owner); clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); - if (mddev->flush_bio_pool) { - mempool_destroy(mddev->flush_bio_pool); - mddev->flush_bio_pool = NULL; - } - if (mddev->flush_pool) { - mempool_destroy(mddev->flush_pool); - mddev->flush_pool = NULL; - } }
void md_stop(struct mddev *mddev) --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -252,19 +252,6 @@ enum mddev_sb_flags { MD_SB_NEED_REWRITE, /* metadata write needs to be repeated */ };
-#define NR_FLUSH_INFOS 8 -#define NR_FLUSH_BIOS 64 -struct flush_info { - struct bio *bio; - struct mddev *mddev; - struct work_struct flush_work; - atomic_t flush_pending; -}; -struct flush_bio { - struct flush_info *fi; - struct md_rdev *rdev; -}; - struct mddev { void *private; struct md_personality *pers; @@ -470,8 +457,13 @@ struct mddev { * metadata and bitmap writes */
- mempool_t *flush_pool; - mempool_t *flush_bio_pool; + /* Generic flush handling. + * The last to finish preflush schedules a worker to submit + * the rest of the request (without the REQ_PREFLUSH flag). + */ + struct bio *flush_bio; + atomic_t flush_pending; + struct work_struct flush_work; struct work_struct event_work; /* used by dm to report failure event */ void (*sync_super)(struct mddev *mddev, struct md_rdev *rdev); struct md_cluster_info *cluster_info;
From: NeilBrown neilb@suse.com
commit 2bc13b83e6298486371761de503faeffd15b7534 upstream.
Currently if many flush requests are submitted to an md device is quick succession, they are serialized and can take a long to process them all. We don't really need to call flush all those times - a single flush call can satisfy all requests submitted before it started. So keep track of when the current flush started and when it finished, allow any pending flush that was requested before the flush started to complete without waiting any more.
Test results from Xiao:
Test is done on a raid10 device which is created by 4 SSDs. The tool is dbench.
1. The latest linux stable kernel Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 768 10.509 78.305 Flush 2078376 0.013 10.094 Close 21787697 0.019 18.821 LockX 96580 0.007 3.184 Mkdir 384 0.008 0.062 Rename 1255883 0.191 23.534 ReadX 46495589 0.020 14.230 WriteX 14790591 7.123 60.706 Unlink 5989118 0.440 54.551 UnlockX 96580 0.005 2.736 FIND_FIRST 10393845 0.042 12.079 SET_FILE_INFORMATION 2415558 0.129 10.088 QUERY_FILE_INFORMATION 4711725 0.005 8.462 QUERY_PATH_INFORMATION 26883327 0.032 21.715 QUERY_FS_INFORMATION 4929409 0.010 8.238 NTCreateX 29660080 0.100 53.268
Throughput 1034.88 MB/sec (sync open) 128 clients 128 procs max_latency=60.712 ms
2. With patch1 "Revert "MD: fix lock contention for flush bios"" Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 256 8.326 36.761 Flush 693291 3.974 180.269 Close 7266404 0.009 36.929 LockX 32160 0.006 0.840 Mkdir 128 0.008 0.021 Rename 418755 0.063 29.945 ReadX 15498708 0.007 7.216 WriteX 4932310 22.482 267.928 Unlink 1997557 0.109 47.553 UnlockX 32160 0.004 1.110 FIND_FIRST 3465791 0.036 7.320 SET_FILE_INFORMATION 805825 0.015 1.561 QUERY_FILE_INFORMATION 1570950 0.005 2.403 QUERY_PATH_INFORMATION 8965483 0.013 14.277 QUERY_FS_INFORMATION 1643626 0.009 3.314 NTCreateX 9892174 0.061 41.278
Throughput 345.009 MB/sec (sync open) 128 clients 128 procs max_latency=267.939 m
3. With patch1 and patch2 Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 768 9.570 54.588 Flush 2061354 0.666 15.102 Close 21604811 0.012 25.697 LockX 95770 0.007 1.424 Mkdir 384 0.008 0.053 Rename 1245411 0.096 12.263 ReadX 46103198 0.011 12.116 WriteX 14667988 7.375 60.069 Unlink 5938936 0.173 30.905 UnlockX 95770 0.005 4.147 FIND_FIRST 10306407 0.041 11.715 SET_FILE_INFORMATION 2395987 0.048 7.640 QUERY_FILE_INFORMATION 4672371 0.005 9.291 QUERY_PATH_INFORMATION 26656735 0.018 19.719 QUERY_FS_INFORMATION 4887940 0.010 7.654 NTCreateX 29410811 0.059 28.551
Throughput 1026.21 MB/sec (sync open) 128 clients 128 procs max_latency=60.075 ms
Cc: stable@vger.kernel.org # v4.19+ Tested-by: Xiao Ni xni@redhat.com Signed-off-by: NeilBrown neilb@suse.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/md.c | 27 +++++++++++++++++++++++---- drivers/md/md.h | 3 +++ 2 files changed, 26 insertions(+), 4 deletions(-)
--- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -433,6 +433,7 @@ static void submit_flushes(struct work_s struct mddev *mddev = container_of(ws, struct mddev, flush_work); struct md_rdev *rdev;
+ mddev->start_flush = ktime_get_boottime(); INIT_WORK(&mddev->flush_work, md_submit_flush_data); atomic_set(&mddev->flush_pending, 1); rcu_read_lock(); @@ -473,6 +474,7 @@ static void md_submit_flush_data(struct * could wait for this and below md_handle_request could wait for those * bios because of suspend check */ + mddev->last_flush = mddev->start_flush; mddev->flush_bio = NULL; wake_up(&mddev->sb_wait);
@@ -487,15 +489,32 @@ static void md_submit_flush_data(struct
void md_flush_request(struct mddev *mddev, struct bio *bio) { + ktime_t start = ktime_get_boottime(); spin_lock_irq(&mddev->lock); wait_event_lock_irq(mddev->sb_wait, - !mddev->flush_bio, + !mddev->flush_bio || + ktime_after(mddev->last_flush, start), mddev->lock); - mddev->flush_bio = bio; + if (!ktime_after(mddev->last_flush, start)) { + WARN_ON(mddev->flush_bio); + mddev->flush_bio = bio; + bio = NULL; + } spin_unlock_irq(&mddev->lock);
- INIT_WORK(&mddev->flush_work, submit_flushes); - queue_work(md_wq, &mddev->flush_work); + if (!bio) { + INIT_WORK(&mddev->flush_work, submit_flushes); + queue_work(md_wq, &mddev->flush_work); + } else { + /* flush was performed for some other bio while we waited. */ + if (bio->bi_iter.bi_size == 0) + /* an empty barrier - all done */ + bio_endio(bio); + else { + bio->bi_opf &= ~REQ_PREFLUSH; + mddev->pers->make_request(mddev, bio); + } + } } EXPORT_SYMBOL(md_flush_request);
--- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -463,6 +463,9 @@ struct mddev { */ struct bio *flush_bio; atomic_t flush_pending; + ktime_t start_flush, last_flush; /* last_flush is when the last completed + * flush was started. + */ struct work_struct flush_work; struct work_struct event_work; /* used by dm to report failure event */ void (*sync_super)(struct mddev *mddev, struct md_rdev *rdev);
From: Yufen Yu yuyufen@huawei.com
commit ee37e62191a59d253fc916b9fc763deb777211e2 upstream.
When doing re-add, we need to ensure rdev->mddev->pers is not NULL, which can avoid potential NULL pointer derefence in fallowing add_bound_rdev().
Fixes: a6da4ef85cef ("md: re-add a failed disk") Cc: Xiao Ni xni@redhat.com Cc: NeilBrown neilb@suse.com Cc: stable@vger.kernel.org # 4.4+ Reviewed-by: NeilBrown neilb@suse.com Signed-off-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/md.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2860,8 +2860,10 @@ state_store(struct md_rdev *rdev, const err = 0; } } else if (cmd_match(buf, "re-add")) { - if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1) && - rdev->saved_raid_disk >= 0) { + if (!rdev->mddev->pers) + err = -EINVAL; + else if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1) && + rdev->saved_raid_disk >= 0) { /* clear_bit is performed _after_ all the devices * have their local Faulty bit cleared. If any writes * happen in the meantime in the local node, they
From: Al Viro viro@zeniv.linux.org.uk
commit 5467a68cbf6884c9a9d91e2a89140afb1839c835 upstream.
For lockless accesses to dentries we don't have pinned we rely (among other things) upon having an RCU delay between dropping the last reference and actually freeing the memory.
On the other hand, for things like pipes and sockets we neither do that kind of lockless access, nor want to deal with the overhead of an RCU delay every time a socket gets closed.
So delay was made optional - setting DCACHE_RCUACCESS in ->d_flags made sure it would happen. We tried to avoid setting it unless we knew we need it. Unfortunately, that had led to recurring class of bugs, in which we missed the need to set it.
We only really need it for dentries that are created by d_alloc_pseudo(), so let's not bother with trying to be smart - just make having an RCU delay the default. The ones that do *not* get it set the replacement flag (DCACHE_NORCU) and we'd better use that sparingly. d_alloc_pseudo() is the only such user right now.
FWIW, the race that finally prompted that switch had been between __lock_parent() of immediate subdirectory of what's currently the root of a disconnected tree (e.g. from open-by-handle in progress) racing with d_splice_alias() elsewhere picking another alias for the same inode, either on outright corrupted fs image, or (in case of open-by-handle on NFS) that subdirectory having been just moved on server. It's not easy to hit, so the sky is not falling, but that's not the first race on similar missed cases and the logics for settinf DCACHE_RCUACCESS has gotten ridiculously convoluted.
Cc: stable@vger.kernel.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/filesystems/porting | 5 +++++ fs/dcache.c | 24 +++++++++++++----------- fs/nsfs.c | 3 +-- include/linux/dcache.h | 2 +- 4 files changed, 20 insertions(+), 14 deletions(-)
--- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -622,3 +622,8 @@ in your dentry operations instead. alloc_file_clone(file, flags, ops) does not affect any caller's references. On success you get a new struct file sharing the mount/dentry with the original, on failure - ERR_PTR(). +-- +[mandatory] + DCACHE_RCUACCESS is gone; having an RCU delay on dentry freeing is the + default. DCACHE_NORCU opts out, and only d_alloc_pseudo() has any + business doing so. --- a/fs/dcache.c +++ b/fs/dcache.c @@ -344,7 +344,7 @@ static void dentry_free(struct dentry *d } } /* if dentry was never visible to RCU, immediate free is OK */ - if (!(dentry->d_flags & DCACHE_RCUACCESS)) + if (dentry->d_flags & DCACHE_NORCU) __d_free(&dentry->d_u.d_rcu); else call_rcu(&dentry->d_u.d_rcu, __d_free); @@ -1694,7 +1694,6 @@ struct dentry *d_alloc(struct dentry * p struct dentry *dentry = __d_alloc(parent->d_sb, name); if (!dentry) return NULL; - dentry->d_flags |= DCACHE_RCUACCESS; spin_lock(&parent->d_lock); /* * don't need child lock because it is not subject @@ -1719,7 +1718,7 @@ struct dentry *d_alloc_cursor(struct den { struct dentry *dentry = d_alloc_anon(parent->d_sb); if (dentry) { - dentry->d_flags |= DCACHE_RCUACCESS | DCACHE_DENTRY_CURSOR; + dentry->d_flags |= DCACHE_DENTRY_CURSOR; dentry->d_parent = dget(parent); } return dentry; @@ -1732,10 +1731,17 @@ struct dentry *d_alloc_cursor(struct den * * For a filesystem that just pins its dentries in memory and never * performs lookups at all, return an unhashed IS_ROOT dentry. + * This is used for pipes, sockets et.al. - the stuff that should + * never be anyone's children or parents. Unlike all other + * dentries, these will not have RCU delay between dropping the + * last reference and freeing them. */ struct dentry *d_alloc_pseudo(struct super_block *sb, const struct qstr *name) { - return __d_alloc(sb, name); + struct dentry *dentry = __d_alloc(sb, name); + if (likely(dentry)) + dentry->d_flags |= DCACHE_NORCU; + return dentry; } EXPORT_SYMBOL(d_alloc_pseudo);
@@ -1899,12 +1905,10 @@ struct dentry *d_make_root(struct inode
if (root_inode) { res = d_alloc_anon(root_inode->i_sb); - if (res) { - res->d_flags |= DCACHE_RCUACCESS; + if (res) d_instantiate(res, root_inode); - } else { + else iput(root_inode); - } } return res; } @@ -2769,9 +2773,7 @@ static void __d_move(struct dentry *dent copy_name(dentry, target); target->d_hash.pprev = NULL; dentry->d_parent->d_lockref.count++; - if (dentry == old_parent) - dentry->d_flags |= DCACHE_RCUACCESS; - else + if (dentry != old_parent) /* wasn't IS_ROOT */ WARN_ON(!--old_parent->d_lockref.count); } else { target->d_parent = old_parent; --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -85,13 +85,12 @@ slow: inode->i_fop = &ns_file_operations; inode->i_private = ns;
- dentry = d_alloc_pseudo(mnt->mnt_sb, &empty_name); + dentry = d_alloc_anon(mnt->mnt_sb); if (!dentry) { iput(inode); return ERR_PTR(-ENOMEM); } d_instantiate(dentry, inode); - dentry->d_flags |= DCACHE_RCUACCESS; dentry->d_fsdata = (void *)ns->ops; d = atomic_long_cmpxchg(&ns->stashed, 0, (unsigned long)dentry); if (d) { --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -175,7 +175,6 @@ struct dentry_operations { * typically using d_splice_alias. */
#define DCACHE_REFERENCED 0x00000040 /* Recently used, don't discard. */ -#define DCACHE_RCUACCESS 0x00000080 /* Entry has ever been RCU-visible */
#define DCACHE_CANT_MOUNT 0x00000100 #define DCACHE_GENOCIDE 0x00000200 @@ -216,6 +215,7 @@ struct dentry_operations {
#define DCACHE_PAR_LOOKUP 0x10000000 /* being looked up (with parent locked shared) */ #define DCACHE_DENTRY_CURSOR 0x20000000 +#define DCACHE_NORCU 0x40000000 /* No RCU delay for freeing */
extern seqlock_t rename_lock;
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 4e0eaf239fb33ebc671303e2b736fa043462e2f4 upstream.
Currently, the pages that are allocated for the single mode of MSC are not mapped into the device's dma space and the code is incorrectly using *_to_phys() in place of a dma address. This fails with IOMMU enabled and is otherwise bad practice.
Fix the single mode buffer allocation to map the pages into the device's DMA space.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: ba82664c134e ("intel_th: Add Memory Storage Unit driver") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/msu.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-)
--- a/drivers/hwtracing/intel_th/msu.c +++ b/drivers/hwtracing/intel_th/msu.c @@ -84,6 +84,7 @@ struct msc_iter { * @reg_base: register window base address * @thdev: intel_th_device pointer * @win_list: list of windows in multiblock mode + * @single_sgt: single mode buffer * @nr_pages: total number of pages allocated for this buffer * @single_sz: amount of data in single mode * @single_wrap: single mode wrap occurred @@ -104,6 +105,7 @@ struct msc { struct intel_th_device *thdev;
struct list_head win_list; + struct sg_table single_sgt; unsigned long nr_pages; unsigned long single_sz; unsigned int single_wrap : 1; @@ -617,22 +619,45 @@ static void intel_th_msc_deactivate(stru */ static int msc_buffer_contig_alloc(struct msc *msc, unsigned long size) { + unsigned long nr_pages = size >> PAGE_SHIFT; unsigned int order = get_order(size); struct page *page; + int ret;
if (!size) return 0;
+ ret = sg_alloc_table(&msc->single_sgt, 1, GFP_KERNEL); + if (ret) + goto err_out; + + ret = -ENOMEM; page = alloc_pages(GFP_KERNEL | __GFP_ZERO, order); if (!page) - return -ENOMEM; + goto err_free_sgt;
split_page(page, order); - msc->nr_pages = size >> PAGE_SHIFT; + sg_set_buf(msc->single_sgt.sgl, page_address(page), size); + + ret = dma_map_sg(msc_dev(msc)->parent->parent, msc->single_sgt.sgl, 1, + DMA_FROM_DEVICE); + if (ret < 0) + goto err_free_pages; + + msc->nr_pages = nr_pages; msc->base = page_address(page); - msc->base_addr = page_to_phys(page); + msc->base_addr = sg_dma_address(msc->single_sgt.sgl);
return 0; + +err_free_pages: + __free_pages(page, order); + +err_free_sgt: + sg_free_table(&msc->single_sgt); + +err_out: + return ret; }
/** @@ -643,6 +668,10 @@ static void msc_buffer_contig_free(struc { unsigned long off;
+ dma_unmap_sg(msc_dev(msc)->parent->parent, msc->single_sgt.sgl, + 1, DMA_FROM_DEVICE); + sg_free_table(&msc->single_sgt); + for (off = 0; off < msc->nr_pages << PAGE_SHIFT; off += PAGE_SIZE) { struct page *page = virt_to_page(msc->base + off);
From: Pan Bian bianpan2016@163.com
commit 8149069db81853570a665f5e5648c0e526dc0e43 upstream.
The function p54p_probe takes an extra reference count of the PCI device. However, the extra reference count is not dropped when it fails to enable the PCI device. This patch fixes the bug.
Cc: stable@vger.kernel.org Signed-off-by: Pan Bian bianpan2016@163.com Acked-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intersil/p54/p54pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/intersil/p54/p54pci.c +++ b/drivers/net/wireless/intersil/p54/p54pci.c @@ -554,7 +554,7 @@ static int p54p_probe(struct pci_dev *pd err = pci_enable_device(pdev); if (err) { dev_err(&pdev->dev, "Cannot enable new PCI device\n"); - return err; + goto err_put; }
mem_addr = pci_resource_start(pdev, 0); @@ -639,6 +639,7 @@ static int p54p_probe(struct pci_dev *pd pci_release_regions(pdev); err_disable_dev: pci_disable_device(pdev); +err_put: pci_dev_put(pdev); return err; }
From: Phong Tran tranmanphong@gmail.com
commit 440868661f36071886ed360d91de83bd67c73b4f upstream.
Now, make the loop explicit to avoid clang warning.
./include/linux/of.h:238:37: warning: multiple unsequenced modifications to 'cell' [-Wunsequenced] r = (r << 32) | be32_to_cpu(*(cell++)); ^~ ./include/linux/byteorder/generic.h:95:21: note: expanded from macro 'be32_to_cpu' ^ ./include/uapi/linux/byteorder/little_endian.h:40:59: note: expanded from macro '__be32_to_cpu' ^ ./include/uapi/linux/swab.h:118:21: note: expanded from macro '__swab32' ___constant_swab32(x) : \ ^ ./include/uapi/linux/swab.h:18:12: note: expanded from macro '___constant_swab32' (((__u32)(x) & (__u32)0x000000ffUL) << 24) | \ ^
Signed-off-by: Phong Tran tranmanphong@gmail.com Reported-by: Nick Desaulniers ndesaulniers@google.com Link: https://github.com/ClangBuiltLinux/linux/issues/460 Suggested-by: David Laight David.Laight@ACULAB.COM Reviewed-by: Nick Desaulniers ndesaulniers@google.com Cc: stable@vger.kernel.org [robh: fix up whitespace] Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/of.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/of.h +++ b/include/linux/of.h @@ -236,8 +236,8 @@ extern struct device_node *of_find_all_n static inline u64 of_read_number(const __be32 *cell, int size) { u64 r = 0; - while (size--) - r = (r << 32) | be32_to_cpu(*(cell++)); + for (; size--; cell++) + r = (r << 32) | be32_to_cpu(*cell); return r; }
From: Christoph Probst kernel@probst.it
commit 6a54b2e002c9d00b398d35724c79f9fe0d9b38fb upstream.
Change strcat to strncpy in the "None" case to fix a buffer overflow when cinode->oplock is reset to 0 by another thread accessing the same cinode. It is never valid to append "None" to any other message.
Consolidate multiple writes to cinode->oplock to reduce raciness.
Signed-off-by: Christoph Probst kernel@probst.it Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2ops.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -2348,26 +2348,28 @@ smb21_set_oplock_level(struct cifsInodeI unsigned int epoch, bool *purge_cache) { char message[5] = {0}; + unsigned int new_oplock = 0;
oplock &= 0xFF; if (oplock == SMB2_OPLOCK_LEVEL_NOCHANGE) return;
- cinode->oplock = 0; if (oplock & SMB2_LEASE_READ_CACHING_HE) { - cinode->oplock |= CIFS_CACHE_READ_FLG; + new_oplock |= CIFS_CACHE_READ_FLG; strcat(message, "R"); } if (oplock & SMB2_LEASE_HANDLE_CACHING_HE) { - cinode->oplock |= CIFS_CACHE_HANDLE_FLG; + new_oplock |= CIFS_CACHE_HANDLE_FLG; strcat(message, "H"); } if (oplock & SMB2_LEASE_WRITE_CACHING_HE) { - cinode->oplock |= CIFS_CACHE_WRITE_FLG; + new_oplock |= CIFS_CACHE_WRITE_FLG; strcat(message, "W"); } - if (!cinode->oplock) - strcat(message, "None"); + if (!new_oplock) + strncpy(message, "None", sizeof(message)); + + cinode->oplock = new_oplock; cifs_dbg(FYI, "%s Lease granted on inode %p\n", message, &cinode->vfs_inode); }
From: Colin Ian King colin.king@canonical.com
commit e6577cb5103b7ca7c0204c0c86ef4af8aa6288f6 upstream.
There seems to be a missing bit-wise or operator when setting val, fix this by adding it in.
Fixes: 2796ceb0c18a ("phy: ti-pipe3: Update pcie phy settings") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Kishon Vijay Abraham I kishon@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/phy/ti/phy-ti-pipe3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/phy/ti/phy-ti-pipe3.c +++ b/drivers/phy/ti/phy-ti-pipe3.c @@ -303,7 +303,7 @@ static void ti_pipe3_calibrate(struct ti
val = ti_pipe3_readl(phy->phy_rx, PCIEPHYRX_ANA_PROGRAMMABILITY); val &= ~(INTERFACE_MASK | LOSD_MASK | MEM_PLLDIV); - val = (0x1 << INTERFACE_SHIFT | 0xA << LOSD_SHIFT); + val |= (0x1 << INTERFACE_SHIFT | 0xA << LOSD_SHIFT); ti_pipe3_writel(phy->phy_rx, PCIEPHYRX_ANA_PROGRAMMABILITY, val);
val = ti_pipe3_readl(phy->phy_rx, PCIEPHYRX_DIGITAL_MODES);
From: Janusz Krzysztofik jmkrzyszt@gmail.com
commit 933c1320847f5ed6b61a7d10f0a948aa98ccd7b0 upstream.
After removal of clock_start() from before soc_camera_init_i2c() in soc_camera_probe() by commit 9aea470b399d ("[media] soc-camera: switch I2C subdevice drivers to use v4l2-clk") introduced in v3.11, the ov6650 driver could no longer probe the sensor successfully because its clock was no longer turned on in advance. The issue was initially worked around by adding that missing clock_start() equivalent to OMAP1 camera interface driver - the only user of this sensor - but a propoer fix should be rather implemented in the sensor driver code itself.
Fix the issue by inserting a delay between the clock is turned on and the sensor I2C registers are read for the first time.
Tested on Amstrad Delta with now out of tree but still locally maintained omap1_camera host driver.
Fixes: 9aea470b399d ("[media] soc-camera: switch I2C subdevice drivers to use v4l2-clk")
Signed-off-by: Janusz Krzysztofik jmkrzyszt@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/i2c/ov6650.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/media/i2c/ov6650.c +++ b/drivers/media/i2c/ov6650.c @@ -815,6 +815,8 @@ static int ov6650_video_probe(struct i2c if (ret < 0) return ret;
+ msleep(20); + /* * check and show product ID and manufacturer ID */
From: Steve Longerbeam slongerbeam@gmail.com
commit 904371f90b2c0c749a5ab75478c129a4682ac3d8 upstream.
On i.MX6, the nearest upstream entity to the CSI can only be the CSI video muxes or the Synopsys DW MIPI CSI-2 receiver.
However the i.MX53 has no CSI video muxes or a MIPI CSI-2 receiver. So allow for the nearest upstream entity to the CSI to be something other than those.
Fixes: bf3cfaa712e5c ("media: staging/imx: get CSI bus type from nearest upstream entity")
Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-media-csi.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
--- a/drivers/staging/media/imx/imx-media-csi.c +++ b/drivers/staging/media/imx/imx-media-csi.c @@ -153,9 +153,10 @@ static inline bool requires_passthrough( /* * Parses the fwnode endpoint from the source pad of the entity * connected to this CSI. This will either be the entity directly - * upstream from the CSI-2 receiver, or directly upstream from the - * video mux. The endpoint is needed to determine the bus type and - * bus config coming into the CSI. + * upstream from the CSI-2 receiver, directly upstream from the + * video mux, or directly upstream from the CSI itself. The endpoint + * is needed to determine the bus type and bus config coming into + * the CSI. */ static int csi_get_upstream_endpoint(struct csi_priv *priv, struct v4l2_fwnode_endpoint *ep) @@ -168,7 +169,8 @@ static int csi_get_upstream_endpoint(str if (!priv->src_sd) return -EPIPE;
- src = &priv->src_sd->entity; + sd = priv->src_sd; + src = &sd->entity;
if (src->function == MEDIA_ENT_F_VID_MUX) { /* @@ -182,6 +184,14 @@ static int csi_get_upstream_endpoint(str src = &sd->entity; }
+ /* + * If the source is neither the video mux nor the CSI-2 receiver, + * get the source pad directly upstream from CSI itself. + */ + if (src->function != MEDIA_ENT_F_VID_MUX && + sd->grp_id != IMX_MEDIA_GRP_ID_CSI2) + src = &priv->sd.entity; + /* get source pad of entity directly upstream from src */ pad = imx_media_find_upstream_pad(priv->md, src, 0); if (IS_ERR(pad))
From: Steve Longerbeam slongerbeam@gmail.com
commit 107927fa597c99eaeee4f51865ca0956ec71b6a2 upstream.
In imx_media_create_csi_of_links(), the 'struct v4l2_fwnode_link' must be cleared for each endpoint iteration, otherwise if the remote port has no "reg" property, link.remote_port will not be reset to zero. This was discovered on the i.MX53 SMD board, since the OV5642 connects directly to ipu1_csi0 and has a single source port with no "reg" property.
Fixes: 621b08eabcddb ("media: staging/imx: remove static media link arrays")
Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-media-of.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/staging/media/imx/imx-media-of.c +++ b/drivers/staging/media/imx/imx-media-of.c @@ -233,15 +233,18 @@ int imx_media_create_csi_of_links(struct struct v4l2_subdev *csi) { struct device_node *csi_np = csi->dev->of_node; - struct fwnode_handle *fwnode, *csi_ep; - struct v4l2_fwnode_link link; struct device_node *ep; - int ret; - - link.local_node = of_fwnode_handle(csi_np); - link.local_port = CSI_SINK_PAD;
for_each_child_of_node(csi_np, ep) { + struct fwnode_handle *fwnode, *csi_ep; + struct v4l2_fwnode_link link; + int ret; + + memset(&link, 0, sizeof(link)); + + link.local_node = of_fwnode_handle(csi_np); + link.local_port = CSI_SINK_PAD; + csi_ep = of_fwnode_handle(ep);
fwnode = fwnode_graph_get_remote_endpoint(csi_ep);
From: ZhangXiaoxu zhangxiaoxu5@huawei.com
commit f02f3755dbd14fb935d24b14650fff9ba92243b8 upstream.
stat command with soft mount never return after server is stopped.
When alloc a new client, the state of the client will be set to NFS4CLNT_LEASE_EXPIRED.
When the server is stopped, the state manager will work, and accord the state to recover. But the state is NFS4CLNT_LEASE_EXPIRED, it will drain the slot table and lead other task to wait queue, until the client recovered. Then the stat command is hung.
When discover server trunking, the client will renew the lease, but check the client state, it lead the client state corruption.
So, we need to call state manager to recover it when detect server ip trunking.
Signed-off-by: ZhangXiaoxu zhangxiaoxu5@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/nfs4state.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -159,6 +159,10 @@ int nfs40_discover_server_trunking(struc /* Sustain the lease, even if it's empty. If the clientid4 * goes stale it's of no use for trunking discovery. */ nfs4_schedule_state_renewal(*result); + + /* If the client state need to recover, do it. */ + if (clp->cl_state) + nfs4_schedule_state_manager(clp); } out: return status;
From: Olga Kornievskaia kolga@netapp.com
commit b1029c9bc078a6f1515f55dd993b507dcc7e3440 upstream.
If we fail to find a good deviceid while trying to pnfs instead of propogating an error back fallback to doing IO to the MDS. Currently, code with fals the IO with EINVAL.
Signed-off-by: Olga Kornievskaia kolga@netapp.com Fixes: 8d40b0f14846f ("NFS filelayout:call GETDEVICEINFO after pnfs_layout_process completes" Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/filelayout/filelayout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/filelayout/filelayout.c +++ b/fs/nfs/filelayout/filelayout.c @@ -904,7 +904,7 @@ fl_pnfs_update_layout(struct inode *ino, status = filelayout_check_deviceid(lo, fl, gfp_flags); if (status) { pnfs_put_lseg(lseg); - lseg = ERR_PTR(status); + lseg = NULL; } out: return lseg;
From: Leo Yan leo.yan@linaro.org
commit 9f77a60669d13ed4ddfa6cd7374c9d88da378ffa upstream.
clk_gate_ufs_subsys is a system bus clock, turning off it will introduce lockup issue during system suspend flow. Let's mark clk_gate_ufs_subsys as critical clock, thus keeps it on during system suspend and resume.
Fixes: d374e6fd5088 ("clk: hisilicon: Add clock driver for hi3660 SoC") Cc: stable@vger.kernel.org Cc: Zhong Kaihua zhongkaihua@huawei.com Cc: John Stultz john.stultz@linaro.org Cc: Zhangfei Gao zhangfei.gao@linaro.org Suggested-by: Dong Zhang zhangdong46@hisilicon.com Signed-off-by: Leo Yan leo.yan@linaro.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/hisilicon/clk-hi3660.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/clk/hisilicon/clk-hi3660.c +++ b/drivers/clk/hisilicon/clk-hi3660.c @@ -163,8 +163,12 @@ static const struct hisi_gate_clock hi36 "clk_isp_snclk_mux", CLK_SET_RATE_PARENT, 0x50, 17, 0, }, { HI3660_CLK_GATE_ISP_SNCLK2, "clk_gate_isp_snclk2", "clk_isp_snclk_mux", CLK_SET_RATE_PARENT, 0x50, 18, 0, }, + /* + * clk_gate_ufs_subsys is a system bus clock, mark it as critical + * clock and keep it on for system suspend and resume. + */ { HI3660_CLK_GATE_UFS_SUBSYS, "clk_gate_ufs_subsys", "clk_div_sysbus", - CLK_SET_RATE_PARENT, 0x50, 21, 0, }, + CLK_SET_RATE_PARENT | CLK_IS_CRITICAL, 0x50, 21, 0, }, { HI3660_PCLK_GATE_DSI0, "pclk_gate_dsi0", "clk_div_cfgbus", CLK_SET_RATE_PARENT, 0x50, 28, 0, }, { HI3660_PCLK_GATE_DSI1, "pclk_gate_dsi1", "clk_div_cfgbus",
From: Dmitry Osipenko digetx@gmail.com
commit 40db569d6769ffa3864fd1b89616b1a7323568a8 upstream.
There are wrongly set parenthesis in the code that are resulting in a wrong configuration being programmed for PLLM. The original fix was made by Danny Huang in the downstream kernel. The patch was tested on Nyan Big Tegra124 chromebook, PLLM rate changing works correctly now and system doesn't lock up after changing the PLLM rate due to EMC scaling.
Cc: stable@vger.kernel.org Tested-by: Steev Klimaszewski steev@kali.org Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-By: Peter De Schrijver pdeschrijver@nvidia.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/tegra/clk-pll.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/clk/tegra/clk-pll.c +++ b/drivers/clk/tegra/clk-pll.c @@ -662,8 +662,8 @@ static void _update_pll_mnp(struct tegra pll_override_writel(val, params->pmc_divp_reg, pll);
val = pll_override_readl(params->pmc_divnm_reg, pll); - val &= ~(divm_mask(pll) << div_nmp->override_divm_shift) | - ~(divn_mask(pll) << div_nmp->override_divn_shift); + val &= ~((divm_mask(pll) << div_nmp->override_divm_shift) | + (divn_mask(pll) << div_nmp->override_divn_shift)); val |= (cfg->m << div_nmp->override_divm_shift) | (cfg->n << div_nmp->override_divn_shift); pll_override_writel(val, params->pmc_divnm_reg, pll);
From: Owen Chen owen.chen@mediatek.com
commit be17ca6ac76a5cfd07cc3a0397dd05d6929fcbbb upstream.
PLLs with tuner_en bit, such as APLL1, need to disable tuner_en before apply new frequency settings, or the new frequency settings (pcw) will not be applied. The tuner_en bit will be disabled during changing PLL rate and be restored after new settings applied.
Fixes: e2f744a82d725 (clk: mediatek: Add MT2712 clock support) Cc: stable@vger.kernel.org Signed-off-by: Owen Chen owen.chen@mediatek.com Signed-off-by: Weiyi Lu weiyi.lu@mediatek.com Reviewed-by: James Liao jamesjj.liao@mediatek.com Reviewed-by: Matthias Brugger matthias.bgg@gmail.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/mediatek/clk-pll.c | 48 +++++++++++++++++++++++++++++------------ 1 file changed, 34 insertions(+), 14 deletions(-)
--- a/drivers/clk/mediatek/clk-pll.c +++ b/drivers/clk/mediatek/clk-pll.c @@ -88,6 +88,32 @@ static unsigned long __mtk_pll_recalc_ra return ((unsigned long)vco + postdiv - 1) / postdiv; }
+static void __mtk_pll_tuner_enable(struct mtk_clk_pll *pll) +{ + u32 r; + + if (pll->tuner_en_addr) { + r = readl(pll->tuner_en_addr) | BIT(pll->data->tuner_en_bit); + writel(r, pll->tuner_en_addr); + } else if (pll->tuner_addr) { + r = readl(pll->tuner_addr) | AUDPLL_TUNER_EN; + writel(r, pll->tuner_addr); + } +} + +static void __mtk_pll_tuner_disable(struct mtk_clk_pll *pll) +{ + u32 r; + + if (pll->tuner_en_addr) { + r = readl(pll->tuner_en_addr) & ~BIT(pll->data->tuner_en_bit); + writel(r, pll->tuner_en_addr); + } else if (pll->tuner_addr) { + r = readl(pll->tuner_addr) & ~AUDPLL_TUNER_EN; + writel(r, pll->tuner_addr); + } +} + static void mtk_pll_set_rate_regs(struct mtk_clk_pll *pll, u32 pcw, int postdiv) { @@ -96,6 +122,9 @@ static void mtk_pll_set_rate_regs(struct
pll_en = readl(pll->base_addr + REG_CON0) & CON0_BASE_EN;
+ /* disable tuner */ + __mtk_pll_tuner_disable(pll); + /* set postdiv */ val = readl(pll->pd_addr); val &= ~(POSTDIV_MASK << pll->data->pd_shift); @@ -122,6 +151,9 @@ static void mtk_pll_set_rate_regs(struct if (pll->tuner_addr) writel(con1 + 1, pll->tuner_addr);
+ /* restore tuner_en */ + __mtk_pll_tuner_enable(pll); + if (pll_en) udelay(20); } @@ -228,13 +260,7 @@ static int mtk_pll_prepare(struct clk_hw r |= pll->data->en_mask; writel(r, pll->base_addr + REG_CON0);
- if (pll->tuner_en_addr) { - r = readl(pll->tuner_en_addr) | BIT(pll->data->tuner_en_bit); - writel(r, pll->tuner_en_addr); - } else if (pll->tuner_addr) { - r = readl(pll->tuner_addr) | AUDPLL_TUNER_EN; - writel(r, pll->tuner_addr); - } + __mtk_pll_tuner_enable(pll);
udelay(20);
@@ -258,13 +284,7 @@ static void mtk_pll_unprepare(struct clk writel(r, pll->base_addr + REG_CON0); }
- if (pll->tuner_en_addr) { - r = readl(pll->tuner_en_addr) & ~BIT(pll->data->tuner_en_bit); - writel(r, pll->tuner_en_addr); - } else if (pll->tuner_addr) { - r = readl(pll->tuner_addr) & ~AUDPLL_TUNER_EN; - writel(r, pll->tuner_addr); - } + __mtk_pll_tuner_disable(pll);
r = readl(pll->base_addr + REG_CON0); r &= ~CON0_BASE_EN;
From: Jonas Karlman jonas@kwiboo.se
commit fb903392131a324a243c7731389277db1cd9f8df upstream.
This patch fixes definition of several clock gate and select register that is wrong for rk3328 referring to the TRM and vendor kernel. Also use correct number of softrst registers.
Fix clock definition for: - clk_crypto - aclk_h265 - pclk_h265 - aclk_h264 - hclk_h264 - aclk_axisram - aclk_gmac - aclk_usb3otg
Fixes: fe3511ad8a1c ("clk: rockchip: add clock controller for rk3328") Cc: stable@vger.kernel.org Signed-off-by: Jonas Karlman jonas@kwiboo.se Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/rockchip/clk-rk3328.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/clk/rockchip/clk-rk3328.c +++ b/drivers/clk/rockchip/clk-rk3328.c @@ -458,7 +458,7 @@ static struct rockchip_clk_branch rk3328 RK3328_CLKSEL_CON(35), 15, 1, MFLAGS, 8, 7, DFLAGS, RK3328_CLKGATE_CON(2), 12, GFLAGS), COMPOSITE(SCLK_CRYPTO, "clk_crypto", mux_2plls_p, 0, - RK3328_CLKSEL_CON(20), 7, 1, MFLAGS, 0, 7, DFLAGS, + RK3328_CLKSEL_CON(20), 7, 1, MFLAGS, 0, 5, DFLAGS, RK3328_CLKGATE_CON(2), 4, GFLAGS), COMPOSITE_NOMUX(SCLK_TSADC, "clk_tsadc", "clk_24m", 0, RK3328_CLKSEL_CON(22), 0, 10, DFLAGS, @@ -550,15 +550,15 @@ static struct rockchip_clk_branch rk3328 GATE(0, "hclk_rkvenc_niu", "hclk_rkvenc", 0, RK3328_CLKGATE_CON(25), 1, GFLAGS), GATE(ACLK_H265, "aclk_h265", "aclk_rkvenc", 0, - RK3328_CLKGATE_CON(25), 0, GFLAGS), + RK3328_CLKGATE_CON(25), 2, GFLAGS), GATE(PCLK_H265, "pclk_h265", "hclk_rkvenc", 0, - RK3328_CLKGATE_CON(25), 1, GFLAGS), + RK3328_CLKGATE_CON(25), 3, GFLAGS), GATE(ACLK_H264, "aclk_h264", "aclk_rkvenc", 0, - RK3328_CLKGATE_CON(25), 0, GFLAGS), + RK3328_CLKGATE_CON(25), 4, GFLAGS), GATE(HCLK_H264, "hclk_h264", "hclk_rkvenc", 0, - RK3328_CLKGATE_CON(25), 1, GFLAGS), + RK3328_CLKGATE_CON(25), 5, GFLAGS), GATE(ACLK_AXISRAM, "aclk_axisram", "aclk_rkvenc", CLK_IGNORE_UNUSED, - RK3328_CLKGATE_CON(25), 0, GFLAGS), + RK3328_CLKGATE_CON(25), 6, GFLAGS),
COMPOSITE(SCLK_VENC_CORE, "sclk_venc_core", mux_4plls_p, 0, RK3328_CLKSEL_CON(51), 14, 2, MFLAGS, 8, 5, DFLAGS, @@ -663,7 +663,7 @@ static struct rockchip_clk_branch rk3328
/* PD_GMAC */ COMPOSITE(ACLK_GMAC, "aclk_gmac", mux_2plls_hdmiphy_p, 0, - RK3328_CLKSEL_CON(35), 6, 2, MFLAGS, 0, 5, DFLAGS, + RK3328_CLKSEL_CON(25), 6, 2, MFLAGS, 0, 5, DFLAGS, RK3328_CLKGATE_CON(3), 2, GFLAGS), COMPOSITE_NOMUX(PCLK_GMAC, "pclk_gmac", "aclk_gmac", 0, RK3328_CLKSEL_CON(25), 8, 3, DFLAGS, @@ -733,7 +733,7 @@ static struct rockchip_clk_branch rk3328
/* PD_PERI */ GATE(0, "aclk_peri_noc", "aclk_peri", CLK_IGNORE_UNUSED, RK3328_CLKGATE_CON(19), 11, GFLAGS), - GATE(ACLK_USB3OTG, "aclk_usb3otg", "aclk_peri", 0, RK3328_CLKGATE_CON(19), 4, GFLAGS), + GATE(ACLK_USB3OTG, "aclk_usb3otg", "aclk_peri", 0, RK3328_CLKGATE_CON(19), 14, GFLAGS),
GATE(HCLK_SDMMC, "hclk_sdmmc", "hclk_peri", 0, RK3328_CLKGATE_CON(19), 0, GFLAGS), GATE(HCLK_SDIO, "hclk_sdio", "hclk_peri", 0, RK3328_CLKGATE_CON(19), 1, GFLAGS), @@ -913,7 +913,7 @@ static void __init rk3328_clk_init(struc &rk3328_cpuclk_data, rk3328_cpuclk_rates, ARRAY_SIZE(rk3328_cpuclk_rates));
- rockchip_register_softrst(np, 11, reg_base + RK3328_SOFTRST_CON(0), + rockchip_register_softrst(np, 12, reg_base + RK3328_SOFTRST_CON(0), ROCKCHIP_SOFTRST_HIWORD_MASK);
rockchip_register_restart_notifier(ctx, RK3328_GLB_SRST_FST, NULL);
From: Mikulas Patocka mpatocka@redhat.com
commit bd86b6c5c60711dbd4fa21bdb497a188ecb6cf63 upstream.
Remove the unused parameter "data" and unused variable "ret".
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: Bernie Thompson bernie@plugable.com Cc: Ladislav Michl ladis@linux-mips.org Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/udlfb.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-)
--- a/drivers/video/fbdev/udlfb.c +++ b/drivers/video/fbdev/udlfb.c @@ -594,10 +594,9 @@ static int dlfb_render_hline(struct dlfb return 0; }
-static int dlfb_handle_damage(struct dlfb_data *dlfb, int x, int y, - int width, int height, char *data) +static int dlfb_handle_damage(struct dlfb_data *dlfb, int x, int y, int width, int height) { - int i, ret; + int i; char *cmd; cycles_t start_cycles, end_cycles; int bytes_sent = 0; @@ -641,7 +640,7 @@ static int dlfb_handle_damage(struct dlf *cmd++ = 0xAF; /* Send partial buffer remaining before exiting */ len = cmd - (char *) urb->transfer_buffer; - ret = dlfb_submit_urb(dlfb, urb, len); + dlfb_submit_urb(dlfb, urb, len); bytes_sent += len; } else dlfb_urb_completion(urb); @@ -679,7 +678,7 @@ static ssize_t dlfb_ops_write(struct fb_ (u32)info->var.yres);
dlfb_handle_damage(dlfb, 0, start, info->var.xres, - lines, info->screen_base); + lines); }
return result; @@ -695,7 +694,7 @@ static void dlfb_ops_copyarea(struct fb_ sys_copyarea(info, area);
dlfb_handle_damage(dlfb, area->dx, area->dy, - area->width, area->height, info->screen_base); + area->width, area->height); }
static void dlfb_ops_imageblit(struct fb_info *info, @@ -706,7 +705,7 @@ static void dlfb_ops_imageblit(struct fb sys_imageblit(info, image);
dlfb_handle_damage(dlfb, image->dx, image->dy, - image->width, image->height, info->screen_base); + image->width, image->height); }
static void dlfb_ops_fillrect(struct fb_info *info, @@ -717,7 +716,7 @@ static void dlfb_ops_fillrect(struct fb_ sys_fillrect(info, rect);
dlfb_handle_damage(dlfb, rect->dx, rect->dy, rect->width, - rect->height, info->screen_base); + rect->height); }
/* @@ -859,8 +858,7 @@ static int dlfb_ops_ioctl(struct fb_info if (area.y > info->var.yres) area.y = info->var.yres;
- dlfb_handle_damage(dlfb, area.x, area.y, area.w, area.h, - info->screen_base); + dlfb_handle_damage(dlfb, area.x, area.y, area.w, area.h); }
return 0; @@ -1065,8 +1063,7 @@ static int dlfb_ops_set_par(struct fb_in pix_framebuffer[i] = 0x37e6; }
- dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres, - info->screen_base); + dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres);
return 0; }
From: Mikulas Patocka mpatocka@redhat.com
commit 6b11f9d8433b471fdd3ebed232b43a4b723be6ff upstream.
If a framebuffer device is used as a console, the rendering calls (copyarea, fillrect, imageblit) may be done with the console spinlock held. On udlfb, these function call dlfb_handle_damage that takes a blocking semaphore before acquiring an URB.
In order to fix the bug, this patch changes the calls copyarea, fillrect and imageblit to offload USB work to a workqueue.
A side effect of this patch is 3x improvement in console scrolling speed because the device doesn't have to be updated after each copyarea call.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: Bernie Thompson bernie@plugable.com Cc: Ladislav Michl ladis@linux-mips.org Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/udlfb.c | 56 +++++++++++++++++++++++++++++++++++++++++--- include/video/udlfb.h | 6 ++++ 2 files changed, 59 insertions(+), 3 deletions(-)
--- a/drivers/video/fbdev/udlfb.c +++ b/drivers/video/fbdev/udlfb.c @@ -657,6 +657,50 @@ error: return 0; }
+static void dlfb_init_damage(struct dlfb_data *dlfb) +{ + dlfb->damage_x = INT_MAX; + dlfb->damage_x2 = 0; + dlfb->damage_y = INT_MAX; + dlfb->damage_y2 = 0; +} + +static void dlfb_damage_work(struct work_struct *w) +{ + struct dlfb_data *dlfb = container_of(w, struct dlfb_data, damage_work); + int x, x2, y, y2; + + spin_lock_irq(&dlfb->damage_lock); + x = dlfb->damage_x; + x2 = dlfb->damage_x2; + y = dlfb->damage_y; + y2 = dlfb->damage_y2; + dlfb_init_damage(dlfb); + spin_unlock_irq(&dlfb->damage_lock); + + if (x < x2 && y < y2) + dlfb_handle_damage(dlfb, x, y, x2 - x, y2 - y); +} + +static void dlfb_offload_damage(struct dlfb_data *dlfb, int x, int y, int width, int height) +{ + unsigned long flags; + int x2 = x + width; + int y2 = y + height; + + if (x >= x2 || y >= y2) + return; + + spin_lock_irqsave(&dlfb->damage_lock, flags); + dlfb->damage_x = min(x, dlfb->damage_x); + dlfb->damage_x2 = max(x2, dlfb->damage_x2); + dlfb->damage_y = min(y, dlfb->damage_y); + dlfb->damage_y2 = max(y2, dlfb->damage_y2); + spin_unlock_irqrestore(&dlfb->damage_lock, flags); + + schedule_work(&dlfb->damage_work); +} + /* * Path triggered by usermode clients who write to filesystem * e.g. cat filename > /dev/fb1 @@ -693,7 +737,7 @@ static void dlfb_ops_copyarea(struct fb_
sys_copyarea(info, area);
- dlfb_handle_damage(dlfb, area->dx, area->dy, + dlfb_offload_damage(dlfb, area->dx, area->dy, area->width, area->height); }
@@ -704,7 +748,7 @@ static void dlfb_ops_imageblit(struct fb
sys_imageblit(info, image);
- dlfb_handle_damage(dlfb, image->dx, image->dy, + dlfb_offload_damage(dlfb, image->dx, image->dy, image->width, image->height); }
@@ -715,7 +759,7 @@ static void dlfb_ops_fillrect(struct fb_
sys_fillrect(info, rect);
- dlfb_handle_damage(dlfb, rect->dx, rect->dy, rect->width, + dlfb_offload_damage(dlfb, rect->dx, rect->dy, rect->width, rect->height); }
@@ -940,6 +984,8 @@ static void dlfb_ops_destroy(struct fb_i { struct dlfb_data *dlfb = info->par;
+ cancel_work_sync(&dlfb->damage_work); + if (info->cmap.len != 0) fb_dealloc_cmap(&info->cmap); if (info->monspecs.modedb) @@ -1636,6 +1682,10 @@ static int dlfb_usb_probe(struct usb_int dlfb->ops = dlfb_ops; info->fbops = &dlfb->ops;
+ dlfb_init_damage(dlfb); + spin_lock_init(&dlfb->damage_lock); + INIT_WORK(&dlfb->damage_work, dlfb_damage_work); + INIT_LIST_HEAD(&info->modelist);
if (!dlfb_alloc_urb_list(dlfb, WRITES_IN_FLIGHT, MAX_TRANSFER)) { --- a/include/video/udlfb.h +++ b/include/video/udlfb.h @@ -48,6 +48,12 @@ struct dlfb_data { int base8; u32 pseudo_palette[256]; int blank_mode; /*one of FB_BLANK_ */ + int damage_x; + int damage_y; + int damage_x2; + int damage_y2; + spinlock_t damage_lock; + struct work_struct damage_work; struct fb_ops ops; /* blit-only rendering path metrics, exposed through sysfs */ atomic_t bytes_rendered; /* raw pixel-bytes driver asked to render */
From: Mikulas Patocka mpatocka@redhat.com
commit babc250e278eac7b0e671bdaedf833759b43bb78 upstream.
Rendering calls may be done simultaneously from the workqueue, dlfb_ops_write, dlfb_ops_ioctl, dlfb_ops_set_par and dlfb_dpy_deferred_io. The code is robust enough so that it won't crash on concurrent rendering.
However, concurrent rendering may cause display corruption if the same pixel is simultaneously being rendered. In order to avoid this corruption, this patch adds a mutex around the rendering calls.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: Bernie Thompson bernie@plugable.com Cc: Ladislav Michl ladis@linux-mips.org Cc: stable@vger.kernel.org [b.zolnierkie: replace "dlfb:" with "uldfb:" in the patch summary] Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/udlfb.c | 41 ++++++++++++++++++++++++++++++----------- include/video/udlfb.h | 1 + 2 files changed, 31 insertions(+), 11 deletions(-)
--- a/drivers/video/fbdev/udlfb.c +++ b/drivers/video/fbdev/udlfb.c @@ -596,7 +596,7 @@ static int dlfb_render_hline(struct dlfb
static int dlfb_handle_damage(struct dlfb_data *dlfb, int x, int y, int width, int height) { - int i; + int i, ret; char *cmd; cycles_t start_cycles, end_cycles; int bytes_sent = 0; @@ -606,21 +606,29 @@ static int dlfb_handle_damage(struct dlf
start_cycles = get_cycles();
+ mutex_lock(&dlfb->render_mutex); + aligned_x = DL_ALIGN_DOWN(x, sizeof(unsigned long)); width = DL_ALIGN_UP(width + (x-aligned_x), sizeof(unsigned long)); x = aligned_x;
if ((width <= 0) || (x + width > dlfb->info->var.xres) || - (y + height > dlfb->info->var.yres)) - return -EINVAL; + (y + height > dlfb->info->var.yres)) { + ret = -EINVAL; + goto unlock_ret; + }
- if (!atomic_read(&dlfb->usb_active)) - return 0; + if (!atomic_read(&dlfb->usb_active)) { + ret = 0; + goto unlock_ret; + }
urb = dlfb_get_urb(dlfb); - if (!urb) - return 0; + if (!urb) { + ret = 0; + goto unlock_ret; + } cmd = urb->transfer_buffer;
for (i = y; i < y + height ; i++) { @@ -654,7 +662,11 @@ error: >> 10)), /* Kcycles */ &dlfb->cpu_kcycles_used);
- return 0; + ret = 0; + +unlock_ret: + mutex_unlock(&dlfb->render_mutex); + return ret; }
static void dlfb_init_damage(struct dlfb_data *dlfb) @@ -782,17 +794,19 @@ static void dlfb_dpy_deferred_io(struct int bytes_identical = 0; int bytes_rendered = 0;
+ mutex_lock(&dlfb->render_mutex); + if (!fb_defio) - return; + goto unlock_ret;
if (!atomic_read(&dlfb->usb_active)) - return; + goto unlock_ret;
start_cycles = get_cycles();
urb = dlfb_get_urb(dlfb); if (!urb) - return; + goto unlock_ret;
cmd = urb->transfer_buffer;
@@ -825,6 +839,8 @@ error: atomic_add(((unsigned int) ((end_cycles - start_cycles) >> 10)), /* Kcycles */ &dlfb->cpu_kcycles_used); +unlock_ret: + mutex_unlock(&dlfb->render_mutex); }
static int dlfb_get_edid(struct dlfb_data *dlfb, char *edid, int len) @@ -986,6 +1002,8 @@ static void dlfb_ops_destroy(struct fb_i
cancel_work_sync(&dlfb->damage_work);
+ mutex_destroy(&dlfb->render_mutex); + if (info->cmap.len != 0) fb_dealloc_cmap(&info->cmap); if (info->monspecs.modedb) @@ -1682,6 +1700,7 @@ static int dlfb_usb_probe(struct usb_int dlfb->ops = dlfb_ops; info->fbops = &dlfb->ops;
+ mutex_init(&dlfb->render_mutex); dlfb_init_damage(dlfb); spin_lock_init(&dlfb->damage_lock); INIT_WORK(&dlfb->damage_work, dlfb_damage_work); --- a/include/video/udlfb.h +++ b/include/video/udlfb.h @@ -48,6 +48,7 @@ struct dlfb_data { int base8; u32 pseudo_palette[256]; int blank_mode; /*one of FB_BLANK_ */ + struct mutex render_mutex; int damage_x; int damage_y; int damage_x2;
From: Miklos Szeredi mszeredi@redhat.com
commit 9de5be06d0a89ca97b5ab902694d42dfd2bb77d2 upstream.
Writepage requests were cropped to i_size & 0xffffffff, which meant that mmaped writes to any file larger than 4G might be silently discarded.
Fix by storing the file size in a properly sized variable (loff_t instead of size_t).
Reported-by: Antonio SJ Musumeci trapexit@spawn.link Fixes: 6eaf4782eb09 ("fuse: writepages: crop secondary requests") Cc: stable@vger.kernel.org # v3.13 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1526,7 +1526,7 @@ __acquires(fc->lock) { struct fuse_conn *fc = get_fuse_conn(inode); struct fuse_inode *fi = get_fuse_inode(inode); - size_t crop = i_size_read(inode); + loff_t crop = i_size_read(inode); struct fuse_req *req;
while (fi->writectr >= 0 && !list_empty(&fi->queued_writes)) {
From: Liu Bo bo.liu@linux.alibaba.com
commit 0cbade024ba501313da3b7e5dd2a188a6bc491b5 upstream.
fstests generic/228 reported this failure that fuse fallocate does not honor what 'ulimit -f' has set.
This adds the necessary inode_newsize_ok() check.
Signed-off-by: Liu Bo bo.liu@linux.alibaba.com Fixes: 05ba1f082300 ("fuse: add FALLOCATE operation") Cc: stable@vger.kernel.org # v3.5 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2975,6 +2975,13 @@ static long fuse_file_fallocate(struct f } }
+ if (!(mode & FALLOC_FL_KEEP_SIZE) && + offset + length > i_size_read(inode)) { + err = inode_newsize_ok(inode, offset + length); + if (err) + return err; + } + if (!(mode & FALLOC_FL_KEEP_SIZE)) set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
From: Amir Goldstein amir73il@gmail.com
commit 3428030da004a1128cbdcf93dc03e16f184d845b upstream.
Generalize the helper ovl_open_maybe_copy_up() and use it to copy up file with data before FS_IOC_SETFLAGS ioctl.
The FS_IOC_SETFLAGS ioctl is a bit of an odd ball in vfs, which probably caused the confusion. File may be open O_RDONLY, but ioctl modifies the file. VFS does not call mnt_want_write_file() nor lock inode mutex, but fs-specific code for FS_IOC_SETFLAGS does. So ovl_ioctl() calls mnt_want_write_file() for the overlay file, and fs-specific code calls mnt_want_write_file() for upper fs file, but there was no call for ovl_want_write() for copy up duration which prevents overlayfs from copying up on a frozen upper fs.
Fixes: dab5ca8fd9dd ("ovl: add lsattr/chattr support") Cc: stable@vger.kernel.org # v4.19 Signed-off-by: Amir Goldstein amir73il@gmail.com Acked-by: Vivek Goyal vgoyal@redhat.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/copy_up.c | 6 +++--- fs/overlayfs/file.c | 5 ++--- fs/overlayfs/overlayfs.h | 2 +- 3 files changed, 6 insertions(+), 7 deletions(-)
--- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -878,14 +878,14 @@ static bool ovl_open_need_copy_up(struct return true; }
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags) +int ovl_maybe_copy_up(struct dentry *dentry, int flags) { int err = 0;
- if (ovl_open_need_copy_up(dentry, file_flags)) { + if (ovl_open_need_copy_up(dentry, flags)) { err = ovl_want_write(dentry); if (!err) { - err = ovl_copy_up_flags(dentry, file_flags); + err = ovl_copy_up_flags(dentry, flags); ovl_drop_write(dentry); } } --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -116,11 +116,10 @@ static int ovl_real_fdget(const struct f
static int ovl_open(struct inode *inode, struct file *file) { - struct dentry *dentry = file_dentry(file); struct file *realfile; int err;
- err = ovl_open_maybe_copy_up(dentry, file->f_flags); + err = ovl_maybe_copy_up(file_dentry(file), file->f_flags); if (err) return err;
@@ -390,7 +389,7 @@ static long ovl_ioctl(struct file *file, if (ret) return ret;
- ret = ovl_copy_up_with_data(file_dentry(file)); + ret = ovl_maybe_copy_up(file_dentry(file), O_WRONLY); if (!ret) { ret = ovl_real_ioctl(file, cmd, arg);
--- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -411,7 +411,7 @@ extern const struct file_operations ovl_ int ovl_copy_up(struct dentry *dentry); int ovl_copy_up_with_data(struct dentry *dentry); int ovl_copy_up_flags(struct dentry *dentry, int flags); -int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags); +int ovl_maybe_copy_up(struct dentry *dentry, int flags); int ovl_copy_xattr(struct dentry *old, struct dentry *new); int ovl_set_attr(struct dentry *upper, struct kstat *stat); struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper);
From: Dmitry Osipenko digetx@gmail.com
commit 43a0541e312f7136e081e6bf58f6c8a2e9672688 upstream.
Both Tegra30 and Tegra114 have 4 ASID's and the corresponding bitfield of the TLB_FLUSH register differs from later Tegra generations that have 128 ASID's.
In a result the PTE's are now flushed correctly from TLB and this fixes problems with graphics (randomly failing tests) on Tegra30.
Cc: stable stable@vger.kernel.org Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/tegra-smmu.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
--- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -102,7 +102,6 @@ static inline u32 smmu_readl(struct tegr #define SMMU_TLB_FLUSH_VA_MATCH_ALL (0 << 0) #define SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0) #define SMMU_TLB_FLUSH_VA_MATCH_GROUP (3 << 0) -#define SMMU_TLB_FLUSH_ASID(x) (((x) & 0x7f) << 24) #define SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \ SMMU_TLB_FLUSH_VA_MATCH_SECTION) #define SMMU_TLB_FLUSH_VA_GROUP(addr) ((((addr) & 0xffffc000) >> 12) | \ @@ -205,8 +204,12 @@ static inline void smmu_flush_tlb_asid(s { u32 value;
- value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) | - SMMU_TLB_FLUSH_VA_MATCH_ALL; + if (smmu->soc->num_asids == 4) + value = (asid & 0x3) << 29; + else + value = (asid & 0x7f) << 24; + + value |= SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_VA_MATCH_ALL; smmu_writel(smmu, value, SMMU_TLB_FLUSH); }
@@ -216,8 +219,12 @@ static inline void smmu_flush_tlb_sectio { u32 value;
- value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) | - SMMU_TLB_FLUSH_VA_SECTION(iova); + if (smmu->soc->num_asids == 4) + value = (asid & 0x3) << 29; + else + value = (asid & 0x7f) << 24; + + value |= SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_VA_SECTION(iova); smmu_writel(smmu, value, SMMU_TLB_FLUSH); }
@@ -227,8 +234,12 @@ static inline void smmu_flush_tlb_group( { u32 value;
- value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) | - SMMU_TLB_FLUSH_VA_GROUP(iova); + if (smmu->soc->num_asids == 4) + value = (asid & 0x3) << 29; + else + value = (asid & 0x7f) << 24; + + value |= SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_VA_GROUP(iova); smmu_writel(smmu, value, SMMU_TLB_FLUSH); }
From: Jeff Layton jlayton@kernel.org
commit 00abf69dd24f4444d185982379c5cc3bb7b6d1fc upstream.
xfstest generic/452 was triggering a "Busy inodes after umount" warning. ceph was allowing the mount to go read-only without first flushing out dirty inodes in the cache. Ensure we sync out the filesystem before allowing a remount to proceed.
Cc: stable@vger.kernel.org Link: http://tracker.ceph.com/issues/39571 Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: "Yan, Zheng" zyan@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ceph/super.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -819,6 +819,12 @@ static void ceph_umount_begin(struct sup return; }
+static int ceph_remount(struct super_block *sb, int *flags, char *data) +{ + sync_filesystem(sb); + return 0; +} + static const struct super_operations ceph_super_ops = { .alloc_inode = ceph_alloc_inode, .destroy_inode = ceph_destroy_inode, @@ -826,6 +832,7 @@ static const struct super_operations cep .drop_inode = ceph_drop_inode, .sync_fs = ceph_sync_fs, .put_super = ceph_put_super, + .remount_fs = ceph_remount, .show_options = ceph_show_options, .statfs = ceph_statfs, .umount_begin = ceph_umount_begin,
From: Josh Poimboeuf jpoimboe@redhat.com
commit 2700fefdb2d9751c416ad56897e27d41e409324a upstream.
To allow an int3 handler to emulate a call instruction, it must be able to push a return address onto the stack. Add a gap to the stack to allow the int3 handler to push the return address and change the return from int3 to jump straight to the emulated called function target.
Link: http://lkml.kernel.org/r/20181130183917.hxmti5josgq4clti@treble Link: http://lkml.kernel.org/r/20190502162133.GX2623@hirez.programming.kicks-ass.n...
[ Note, this is needed to allow Live Kernel Patching to not miss calling a patched function when tracing is enabled. -- Steven Rostedt ]
Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Tested-by: Nicolai Stange nstange@suse.de Reviewed-by: Nicolai Stange nstange@suse.de Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_64.S | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -905,7 +905,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work */ #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
-.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 +.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 create_gap=0 ENTRY(\sym) UNWIND_HINT_IRET_REGS offset=\has_error_code*8
@@ -925,6 +925,20 @@ ENTRY(\sym) jnz .Lfrom_usermode_switch_stack_@ .endif
+ .if \create_gap == 1 + /* + * If coming from kernel space, create a 6-word gap to allow the + * int3 handler to emulate a call instruction. + */ + testb $3, CS-ORIG_RAX(%rsp) + jnz .Lfrom_usermode_no_gap_@ + .rept 6 + pushq 5*8(%rsp) + .endr + UNWIND_HINT_IRET_REGS offset=8 +.Lfrom_usermode_no_gap_@: + .endif + .if \paranoid call paranoid_entry .else @@ -1154,7 +1168,7 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \ #endif /* CONFIG_HYPERV */
idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK -idtentry int3 do_int3 has_error_code=0 +idtentry int3 do_int3 has_error_code=0 create_gap=1 idtentry stack_segment do_stack_segment has_error_code=1
#ifdef CONFIG_XEN
From: Peter Zijlstra peterz@infradead.org
commit 4b33dadf37666c0860b88f9e52a16d07bf6d0b03 upstream.
In order to allow breakpoints to emulate call instructions, they need to push the return address onto the stack. The x86_64 int3 handler adds a small gap to allow the stack to grow some. Use this gap to add the return address to be able to emulate a call instruction at the breakpoint location.
These helper functions are added:
int3_emulate_jmp(): changes the location of the regs->ip to return there.
(The next two are only for x86_64) int3_emulate_push(): to push the address onto the gap in the stack int3_emulate_call(): push the return address and change regs->ip
Cc: Andy Lutomirski luto@kernel.org Cc: Nicolai Stange nstange@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: the arch/x86 maintainers x86@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jiri Kosina jikos@kernel.org Cc: Miroslav Benes mbenes@suse.cz Cc: Petr Mladek pmladek@suse.com Cc: Joe Lawrence joe.lawrence@redhat.com Cc: Shuah Khan shuah@kernel.org Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Mimi Zohar zohar@linux.ibm.com Cc: Juergen Gross jgross@suse.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Nayna Jain nayna@linux.ibm.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Joerg Roedel jroedel@suse.de Cc: "open list:KERNEL SELFTEST FRAMEWORK" linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Tested-by: Nicolai Stange nstange@suse.de Reviewed-by: Nicolai Stange nstange@suse.de Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org [ Modified to only work for x86_64 and added comment to int3_emulate_push() ] Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/text-patching.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
--- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h @@ -39,4 +39,32 @@ extern int poke_int3_handler(struct pt_r extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler); extern int after_bootmem;
+static inline void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip) +{ + regs->ip = ip; +} + +#define INT3_INSN_SIZE 1 +#define CALL_INSN_SIZE 5 + +#ifdef CONFIG_X86_64 +static inline void int3_emulate_push(struct pt_regs *regs, unsigned long val) +{ + /* + * The int3 handler in entry_64.S adds a gap between the + * stack where the break point happened, and the saving of + * pt_regs. We can extend the original stack because of + * this gap. See the idtentry macro's create_gap option. + */ + regs->sp -= sizeof(unsigned long); + *(unsigned long *)regs->sp = val; +} + +static inline void int3_emulate_call(struct pt_regs *regs, unsigned long func) +{ + int3_emulate_push(regs, regs->ip - INT3_INSN_SIZE + CALL_INSN_SIZE); + int3_emulate_jmp(regs, func); +} +#endif + #endif /* _ASM_X86_TEXT_PATCHING_H */
From: Peter Zijlstra peterz@infradead.org
commit 9e298e8604088a600d8100a111a532a9d342af09 upstream.
Nicolai Stange discovered[1] that if live kernel patching is enabled, and the function tracer started tracing the same function that was patched, the conversion of the fentry call site during the translation of going from calling the live kernel patch trampoline to the iterator trampoline, would have as slight window where it didn't call anything.
As live kernel patching depends on ftrace to always call its code (to prevent the function being traced from being called, as it will redirect it). This small window would allow the old buggy function to be called, and this can cause undesirable results.
Nicolai submitted new patches[2] but these were controversial. As this is similar to the static call emulation issues that came up a while ago[3]. But after some debate[4][5] adding a gap in the stack when entering the breakpoint handler allows for pushing the return address onto the stack to easily emulate a call.
[1] http://lkml.kernel.org/r/20180726104029.7736-1-nstange@suse.de [2] http://lkml.kernel.org/r/20190427100639.15074-1-nstange@suse.de [3] http://lkml.kernel.org/r/3cf04e113d71c9f8e4be95fb84a510f085aa4afa.1541711457... [4] http://lkml.kernel.org/r/CAHk-=wh5OpheSU8Em_Q3Hg8qw_JtoijxOdPtHru6d+5K8TWM=A... [5] http://lkml.kernel.org/r/CAHk-=wjvQxY4DvPrJ6haPgAa6b906h=MwZXO6G8OtiTGe=N7_w...
[ Live kernel patching is not implemented on x86_32, thus the emulate calls are only for x86_64. ]
Cc: Andy Lutomirski luto@kernel.org Cc: Nicolai Stange nstange@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: the arch/x86 maintainers x86@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jiri Kosina jikos@kernel.org Cc: Miroslav Benes mbenes@suse.cz Cc: Petr Mladek pmladek@suse.com Cc: Joe Lawrence joe.lawrence@redhat.com Cc: Shuah Khan shuah@kernel.org Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Mimi Zohar zohar@linux.ibm.com Cc: Juergen Gross jgross@suse.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Nayna Jain nayna@linux.ibm.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Joerg Roedel jroedel@suse.de Cc: "open list:KERNEL SELFTEST FRAMEWORK" linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Tested-by: Nicolai Stange nstange@suse.de Reviewed-by: Nicolai Stange nstange@suse.de Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org [ Changed to only implement emulated calls for x86_64 ] Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/ftrace.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-)
--- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -29,6 +29,7 @@ #include <asm/kprobes.h> #include <asm/ftrace.h> #include <asm/nops.h> +#include <asm/text-patching.h>
#ifdef CONFIG_DYNAMIC_FTRACE
@@ -228,6 +229,7 @@ int ftrace_modify_call(struct dyn_ftrace }
static unsigned long ftrace_update_func; +static unsigned long ftrace_update_func_call;
static int update_ftrace_func(unsigned long ip, void *new) { @@ -256,6 +258,8 @@ int ftrace_update_ftrace_func(ftrace_fun unsigned char *new; int ret;
+ ftrace_update_func_call = (unsigned long)func; + new = ftrace_call_replace(ip, (unsigned long)func); ret = update_ftrace_func(ip, new);
@@ -291,13 +295,28 @@ int ftrace_int3_handler(struct pt_regs * if (WARN_ON_ONCE(!regs)) return 0;
- ip = regs->ip - 1; - if (!ftrace_location(ip) && !is_ftrace_caller(ip)) - return 0; + ip = regs->ip - INT3_INSN_SIZE;
- regs->ip += MCOUNT_INSN_SIZE - 1; +#ifdef CONFIG_X86_64 + if (ftrace_location(ip)) { + int3_emulate_call(regs, (unsigned long)ftrace_regs_caller); + return 1; + } else if (is_ftrace_caller(ip)) { + if (!ftrace_update_func_call) { + int3_emulate_jmp(regs, ip + CALL_INSN_SIZE); + return 1; + } + int3_emulate_call(regs, ftrace_update_func_call); + return 1; + } +#else + if (ftrace_location(ip) || is_ftrace_caller(ip)) { + int3_emulate_jmp(regs, ip + CALL_INSN_SIZE); + return 1; + } +#endif
- return 1; + return 0; }
static int ftrace_write(unsigned long ip, const char *val, int size) @@ -868,6 +887,8 @@ void arch_ftrace_update_trampoline(struc
func = ftrace_ops_get_func(ops);
+ ftrace_update_func_call = (unsigned long)func; + /* Do a safe modify in case the trampoline is executing */ new = ftrace_call_replace(ip, (unsigned long)func); ret = update_ftrace_func(ip, new); @@ -964,6 +985,7 @@ static int ftrace_mod_jmp(unsigned long { unsigned char *new;
+ ftrace_update_func_call = 0UL; new = ftrace_jmp_replace(ip, (unsigned long)func);
return update_ftrace_func(ip, new);
From: Elazar Leibovich elazar@lightbitslabs.com
commit cbe08bcbbe787315c425dde284dcb715cfbf3f39 upstream.
When reading only part of the id file, the ppos isn't tracked correctly. This is taken care by simple_read_from_buffer.
Reading a single byte, and then the next byte would result EOF.
While this seems like not a big deal, this breaks abstractions that reads information from files unbuffered. See for example https://github.com/golang/go/issues/29399
This code was mentioned as problematic in commit cd458ba9d5a5 ("tracing: Do not (ab)use trace_seq in event_id_read()")
An example C code that show this bug is:
#include <stdio.h> #include <stdint.h>
#include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h>
int main(int argc, char **argv) { if (argc < 2) return 1; int fd = open(argv[1], O_RDONLY); char c; read(fd, &c, 1); printf("First %c\n", c); read(fd, &c, 1); printf("Second %c\n", c); }
Then run with, e.g.
sudo ./a.out /sys/kernel/debug/tracing/events/tcp/tcp_set_state/id
You'll notice you're getting the first character twice, instead of the first two characters in the id file.
Link: http://lkml.kernel.org/r/20181231115837.4932-1-elazar@lightbitslabs.com
Cc: Orit Wasserman orit.was@gmail.com Cc: Oleg Nesterov oleg@redhat.com Cc: Ingo Molnar mingo@redhat.com Cc: stable@vger.kernel.org Fixes: 23725aeeab10b ("ftrace: provide an id file for each event") Signed-off-by: Elazar Leibovich elazar@lightbitslabs.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_events.c | 3 --- 1 file changed, 3 deletions(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -1318,9 +1318,6 @@ event_id_read(struct file *filp, char __ char buf[32]; int len;
- if (*ppos) - return 0; - if (unlikely(!id)) return -ENODEV;
From: Dmitry Osipenko digetx@gmail.com
commit b906c056b6023c390f18347169071193fda57dde upstream.
Multiplying the Memory Controller clock rate by the tick count results in an integer overflow and in result the truncated tick value is being programmed into hardware, such that the GR3D memory client performance is reduced by two times.
Cc: stable stable@vger.kernel.org Signed-off-by: Dmitry Osipenko digetx@gmail.com Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/memory/tegra/mc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/memory/tegra/mc.c +++ b/drivers/memory/tegra/mc.c @@ -280,7 +280,7 @@ static int tegra_mc_setup_latency_allowa u32 value;
/* compute the number of MC clock cycles per tick */ - tick = mc->tick * clk_get_rate(mc->clk); + tick = (unsigned long long)mc->tick * clk_get_rate(mc->clk); do_div(tick, NSEC_PER_SEC);
value = readl(mc->regs + MC_EMEM_ARB_CFG);
From: Adrian Hunter adrian.hunter@intel.com
commit 7ba8fa20e26eb3c0c04d747f7fd2223694eac4d5 upstream.
The timestamp used to determine if an instruction sample is made, is an estimate based on the number of instructions since the last known timestamp. A consequence is that it might go backwards, which results in extra samples. Change it so that a sample is only made when the timestamp goes forwards.
Note this does not affect a sampling period of 0 or sampling periods specified as a count of instructions.
Example:
Before:
$ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 10 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 8 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 6 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 4 instructions:u: 7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16423 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222734: 12731 instructions:u: 7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so) ...
After: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16479 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ...
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Fixes: f4aa081949e7b ("perf tools: Add Intel PT decoder") Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -888,16 +888,20 @@ static uint64_t intel_pt_next_period(str timestamp = decoder->timestamp + decoder->timestamp_insn_cnt; masked_timestamp = timestamp & decoder->period_mask; if (decoder->continuous_period) { - if (masked_timestamp != decoder->last_masked_timestamp) + if (masked_timestamp > decoder->last_masked_timestamp) return 1; } else { timestamp += 1; masked_timestamp = timestamp & decoder->period_mask; - if (masked_timestamp != decoder->last_masked_timestamp) { + if (masked_timestamp > decoder->last_masked_timestamp) { decoder->last_masked_timestamp = masked_timestamp; decoder->continuous_period = true; } } + + if (masked_timestamp < decoder->last_masked_timestamp) + return decoder->period_ticks; + return decoder->period_ticks - (timestamp - masked_timestamp); }
@@ -926,7 +930,10 @@ static void intel_pt_sample_insn(struct case INTEL_PT_PERIOD_TICKS: timestamp = decoder->timestamp + decoder->timestamp_insn_cnt; masked_timestamp = timestamp & decoder->period_mask; - decoder->last_masked_timestamp = masked_timestamp; + if (masked_timestamp > decoder->last_masked_timestamp) + decoder->last_masked_timestamp = masked_timestamp; + else + decoder->last_masked_timestamp += decoder->period_ticks; break; case INTEL_PT_PERIOD_NONE: case INTEL_PT_PERIOD_MTC:
From: Adrian Hunter adrian.hunter@intel.com
commit 61b6e08dc8e3ea80b7485c9b3f875ddd45c8466b upstream.
The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached.
The intel_pt_sample_time() function decides which is which, but was not handling TNT packets exactly correctly.
In the case of TNT, the timestamp applies to the first branch, so the decoder must first walk to that branch.
That means intel_pt_sample_time() should return true for TNT, and this patch makes that change. However, if the first branch is a non-taken branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false for subsequent taken branches in the same TNT packet.
To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to distinguish the cases.
Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp").
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -58,6 +58,7 @@ enum intel_pt_pkt_state { INTEL_PT_STATE_NO_IP, INTEL_PT_STATE_ERR_RESYNC, INTEL_PT_STATE_IN_SYNC, + INTEL_PT_STATE_TNT_CONT, INTEL_PT_STATE_TNT, INTEL_PT_STATE_TIP, INTEL_PT_STATE_TIP_PGD, @@ -72,8 +73,9 @@ static inline bool intel_pt_sample_time( case INTEL_PT_STATE_NO_IP: case INTEL_PT_STATE_ERR_RESYNC: case INTEL_PT_STATE_IN_SYNC: - case INTEL_PT_STATE_TNT: + case INTEL_PT_STATE_TNT_CONT: return true; + case INTEL_PT_STATE_TNT: case INTEL_PT_STATE_TIP: case INTEL_PT_STATE_TIP_PGD: case INTEL_PT_STATE_FUP: @@ -1256,7 +1258,9 @@ static int intel_pt_walk_tnt(struct inte return -ENOENT; } decoder->tnt.count -= 1; - if (!decoder->tnt.count) + if (decoder->tnt.count) + decoder->pkt_state = INTEL_PT_STATE_TNT_CONT; + else decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; decoder->tnt.payload <<= 1; decoder->state.from_ip = decoder->ip; @@ -1287,7 +1291,9 @@ static int intel_pt_walk_tnt(struct inte
if (intel_pt_insn.branch == INTEL_PT_BR_CONDITIONAL) { decoder->tnt.count -= 1; - if (!decoder->tnt.count) + if (decoder->tnt.count) + decoder->pkt_state = INTEL_PT_STATE_TNT_CONT; + else decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; if (decoder->tnt.payload & BIT63) { decoder->tnt.payload <<= 1; @@ -2356,6 +2362,7 @@ const struct intel_pt_state *intel_pt_de err = intel_pt_walk_trace(decoder); break; case INTEL_PT_STATE_TNT: + case INTEL_PT_STATE_TNT_CONT: err = intel_pt_walk_tnt(decoder); if (err == -EAGAIN) err = intel_pt_walk_trace(decoder);
From: Adrian Hunter adrian.hunter@intel.com
commit 1b6599a9d8e6c9f7e9b0476012383b1777f7fc93 upstream.
The sample timestamp is updated to ensure that the timestamp represents the time of the sample and not a branch that the decoder is still walking towards. The sample timestamp is updated when the decoder returns, but the decoder does not return for non-taken branches. Update the sample timestamp then also.
Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp").
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -1313,8 +1313,11 @@ static int intel_pt_walk_tnt(struct inte return 0; } decoder->ip += intel_pt_insn.length; - if (!decoder->tnt.count) + if (!decoder->tnt.count) { + decoder->sample_timestamp = decoder->timestamp; + decoder->sample_insn_cnt = decoder->timestamp_insn_cnt; return -EAGAIN; + } decoder->tnt.payload <<= 1; continue; }
From: Florian Fainelli f.fainelli@gmail.com
commit 1b1f01b653b408ebe58fec78c566d1075d285c64 upstream.
arch/mips/kernel/perf_event_mipsxx.c: In function 'mipsxx_pmu_enable_event': arch/mips/kernel/perf_event_mipsxx.c:326:21: error: unused variable 'event' [-Werror=unused-variable] struct perf_event *event = container_of(evt, struct perf_event, hw); ^~~~~
Fix this by making use of IS_ENABLED() to simplify the code and avoid unnecessary ifdefery.
Fixes: 84002c88599d ("MIPS: perf: Fix perf with MT counting other threads") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Paul Burton paul.burton@mips.com Cc: linux-mips@linux-mips.org Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@redhat.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/kernel/perf_event_mipsxx.c | 21 +++------------------ 1 file changed, 3 insertions(+), 18 deletions(-)
--- a/arch/mips/kernel/perf_event_mipsxx.c +++ b/arch/mips/kernel/perf_event_mipsxx.c @@ -64,17 +64,11 @@ struct mips_perf_event { #define CNTR_EVEN 0x55555555 #define CNTR_ODD 0xaaaaaaaa #define CNTR_ALL 0xffffffff -#ifdef CONFIG_MIPS_MT_SMP enum { T = 0, V = 1, P = 2, } range; -#else - #define T - #define V - #define P -#endif };
static struct mips_perf_event raw_event; @@ -325,9 +319,7 @@ static void mipsxx_pmu_enable_event(stru { struct perf_event *event = container_of(evt, struct perf_event, hw); struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); -#ifdef CONFIG_MIPS_MT_SMP unsigned int range = evt->event_base >> 24; -#endif /* CONFIG_MIPS_MT_SMP */
WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
@@ -336,21 +328,15 @@ static void mipsxx_pmu_enable_event(stru /* Make sure interrupt enabled. */ MIPS_PERFCTRL_IE;
-#ifdef CONFIG_CPU_BMIPS5000 - { + if (IS_ENABLED(CONFIG_CPU_BMIPS5000)) { /* enable the counter for the calling thread */ cpuc->saved_ctrl[idx] |= (1 << (12 + vpe_id())) | BRCM_PERFCTRL_TC; - } -#else -#ifdef CONFIG_MIPS_MT_SMP - if (range > V) { + } else if (IS_ENABLED(CONFIG_MIPS_MT_SMP) && range > V) { /* The counter is processor wide. Set it up to count all TCs. */ pr_debug("Enabling perf counter for all TCs\n"); cpuc->saved_ctrl[idx] |= M_TC_EN_ALL; - } else -#endif /* CONFIG_MIPS_MT_SMP */ - { + } else { unsigned int cpu, ctrl;
/* @@ -365,7 +351,6 @@ static void mipsxx_pmu_enable_event(stru cpuc->saved_ctrl[idx] |= ctrl; pr_debug("Enabling perf counter for CPU%d\n", cpu); } -#endif /* CONFIG_CPU_BMIPS5000 */ /* * We do not actually let the counter run. Leave it until start(). */
From: Nathan Chancellor natechancellor@gmail.com
commit 8ea58f1e8b11cca3087b294779bf5959bf89cc10 upstream.
Currently, this Makefile hardcodes GNU ar, meaning that if it is not available, there is no way to supply a different one and the build will fail.
$ make AR=llvm-ar CC=clang LD=ld.lld HOSTAR=llvm-ar HOSTCC=clang \ HOSTLD=ld.lld HOSTLDFLAGS=-fuse-ld=lld defconfig modules_prepare ... AR /out/tools/objtool/libsubcmd.a /bin/sh: 1: ar: not found ...
Follow the logic of HOST{CC,LD} and allow the user to specify a different ar tool via HOSTAR (which is used elsewhere in other tools/ Makefiles).
Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Mukesh Ojha mojha@codeaurora.org Cc: stable@vger.kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/80822a9353926c38fd7a152991c6292491a9d0e8.1558028966... Link: https://github.com/ClangBuiltLinux/linux/issues/481 Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/objtool/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/objtool/Makefile +++ b/tools/objtool/Makefile @@ -7,11 +7,12 @@ ARCH := x86 endif
# always use the host compiler +HOSTAR ?= ar HOSTCC ?= gcc HOSTLD ?= ld +AR = $(HOSTAR) CC = $(HOSTCC) LD = $(HOSTLD) -AR = ar
ifeq ($(srctree),) srctree := $(patsubst %/,%,$(dir $(CURDIR)))
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit f8585539df0a1527c78b5d760665c89fe1c105a9 upstream.
The following commit:
38ac0287b7f4 ("fbdev/efifb: Honour UEFI memory map attributes when mapping the FB")
updated the EFI framebuffer code to use memory mappings for the linear framebuffer that are permitted by the memory attributes described by the EFI memory map for the particular region, if the framebuffer happens to be covered by the EFI memory map (which is typically only the case for framebuffers in shared memory). This is required since non-x86 systems may require cacheable attributes for memory mappings that are shared with other masters (such as GPUs), and this information cannot be described by the Graphics Output Protocol (GOP) EFI protocol itself, and so we rely on the EFI memory map for this.
As reported by James, this breaks some x86 systems:
[ 1.173368] efifb: probing for efifb [ 1.173386] efifb: abort, cannot remap video memory 0x1d5000 @ 0xcf800000 [ 1.173395] Trying to free nonexistent resource <00000000cf800000-00000000cf9d4bff> [ 1.173413] efi-framebuffer: probe of efi-framebuffer.0 failed with error -5
The problem turns out to be that the memory map entry that describes the framebuffer has no memory attributes listed at all, and so we end up with a mem_flags value of 0x0.
So work around this by ensuring that the memory map entry's attribute field has a sane value before using it to mask the set of usable attributes.
Reported-by: James Hilliard james.hilliard1@gmail.com Tested-by: James Hilliard james.hilliard1@gmail.com Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Cc: stable@vger.kernel.org # v4.19+ Cc: Borislav Petkov bp@alien8.de Cc: James Morse james.morse@arm.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Matt Fleming matt@codeblueprint.co.uk Cc: Peter Jones pjones@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-efi@vger.kernel.org Fixes: 38ac0287b7f4 ("fbdev/efifb: Honour UEFI memory map attributes when ...") Link: http://lkml.kernel.org/r/20190516213159.3530-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/efifb.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/video/fbdev/efifb.c +++ b/drivers/video/fbdev/efifb.c @@ -476,8 +476,12 @@ static int efifb_probe(struct platform_d * If the UEFI memory map covers the efifb region, we may only * remap it using the attributes the memory map prescribes. */ - mem_flags |= EFI_MEMORY_WT | EFI_MEMORY_WB; - mem_flags &= md.attribute; + md.attribute &= EFI_MEMORY_UC | EFI_MEMORY_WC | + EFI_MEMORY_WT | EFI_MEMORY_WB; + if (md.attribute) { + mem_flags |= EFI_MEMORY_WT | EFI_MEMORY_WB; + mem_flags &= md.attribute; + } } if (mem_flags & EFI_MEMORY_WC) info->screen_base = ioremap_wc(efifb_fix.smem_start,
From: Yifeng Li tomli@tomli.me
commit 5481115e25e42b9215f2619452aa99c95f08492f upstream.
On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), rebooting with sm712fb framebuffer driver would cause the role of brightness up/down button to swap.
Experiments showed the FPR30 register caused this behavior. Moreover, even if this register don't have side-effect on other systems, over- writing it is also highly questionable, since it was originally configurated by the motherboard manufacturer by hardwiring pull-down resistors to indicate the type of LCD panel. We should not mess with it.
Stop writing to the SR30 (a.k.a FPR30) register.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1145,8 +1145,8 @@ static void sm7xx_set_timing(struct smtc
/* init SEQ register SR30 - SR75 */ for (i = 0; i < SIZE_SR30_SR75; i++) - if ((i + 0x30) != 0x62 && (i + 0x30) != 0x6a && - (i + 0x30) != 0x6b) + if ((i + 0x30) != 0x30 && (i + 0x30) != 0x62 && + (i + 0x30) != 0x6a && (i + 0x30) != 0x6b) smtc_seqw(i + 0x30, vgamode[j].init_sr30_sr75[i]);
From: Yifeng Li tomli@tomli.me
commit dcf9070595e100942c539e229dde4770aaeaa4e9 upstream.
On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), the amount of Video RAM is not detected correctly by the xf86-video-siliconmotion driver. This is because sm712fb overwrites the GPR71 Scratch Pad Register, which is set by BIOS on x86 and used to indicate amount of VRAM.
Other Scratch Pad Registers, including GPR70/74/75, don't have the same side-effect, but overwriting to them is still questionable, as they are not related to modesetting.
Stop writing to SR70/71/74/75 (a.k.a GPR70/71/74/75).
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1146,7 +1146,9 @@ static void sm7xx_set_timing(struct smtc /* init SEQ register SR30 - SR75 */ for (i = 0; i < SIZE_SR30_SR75; i++) if ((i + 0x30) != 0x30 && (i + 0x30) != 0x62 && - (i + 0x30) != 0x6a && (i + 0x30) != 0x6b) + (i + 0x30) != 0x6a && (i + 0x30) != 0x6b && + (i + 0x30) != 0x70 && (i + 0x30) != 0x71 && + (i + 0x30) != 0x74 && (i + 0x30) != 0x75) smtc_seqw(i + 0x30, vgamode[j].init_sr30_sr75[i]);
From: Yifeng Li tomli@tomli.me
commit 8069053880e0ee3a75fd6d7e0a30293265fe3de4 upstream.
On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), rebooting with sm712fb framebuffer driver would cause a white screen of death on the next POST, presumably the proper timings for the LCD panel was not reprogrammed properly by the BIOS.
Experiments showed a few CRTC Scratch Registers, including CRT3D, CRT3E and CRT3F may be used internally by BIOS as some flags. CRT3B is a hardware testing register, we shouldn't mess with it. CRT3C has blanking signal and line compare control, which is not needed for this driver.
Stop writing to CR3B-CR3F (a.k.a CRT3B-CRT3F) registers. Even if these registers don't have side-effect on other systems, writing to them is also highly questionable.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1173,8 +1173,12 @@ static void sm7xx_set_timing(struct smtc smtc_crtcw(i, vgamode[j].init_cr00_cr18[i]);
/* init CRTC register CR30 - CR4D */ - for (i = 0; i < SIZE_CR30_CR4D; i++) + for (i = 0; i < SIZE_CR30_CR4D; i++) { + if ((i + 0x30) >= 0x3B && (i + 0x30) <= 0x3F) + /* side-effect, don't write to CR3B-CR3F */ + continue; smtc_crtcw(i + 0x30, vgamode[j].init_cr30_cr4d[i]); + }
/* init CRTC register CR90 - CRA7 */ for (i = 0; i < SIZE_CR90_CRA7; i++)
From: Yifeng Li tomli@tomli.me
commit ec1587d5073f29820e358f3a383850d61601d981 upstream.
When the machine is booted in VGA mode, loading sm712fb would cause a glitch of random pixels shown on the screen. To prevent it from happening, we first clear the entire framebuffer, and we also need to stop calling smtcfb_setmode() during initialization, the fbdev layer will call it for us later when it's ready.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1493,7 +1493,11 @@ static int smtcfb_pci_probe(struct pci_d if (err) goto failed;
- smtcfb_setmode(sfb); + /* + * The screen would be temporarily garbled when sm712fb takes over + * vesafb or VGA text mode. Zero the framebuffer. + */ + memset_io(sfb->lfb, 0, sfb->fb->fix.smem_len);
err = register_framebuffer(info); if (err < 0)
From: Yifeng Li tomli@tomli.me
commit 9e0e59993df0601cddb95c4f6c61aa3d5e753c00 upstream.
On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), running fbtest or X will crash the machine instantly, because the VRAM/framebuffer is not mapped correctly.
On SM712, the framebuffer starts at the beginning of address space, but SM720's framebuffer starts at the 1 MiB offset from the beginning. However, sm712fb fails to take this into account, as a result, writing to the framebuffer will destroy all the registers and kill the system immediately. Another problem is the driver assumes 8 MiB of VRAM for SM720, but some SM720 system, such as this IBM Thinkpad, only has 4 MiB of VRAM.
Fix this problem by removing the hardcoded VRAM size, adding a function to query the amount of VRAM from register MCR76 on SM720, and adding proper framebuffer offset.
Please note that the memory map may have additional problems on Big-Endian system, which is not available for testing by myself. But I highly suspect that the original code is also broken on Big-Endian machines for SM720, so at least we are not making the problem worse. More, the driver also assumed SM710/SM712 has 4 MiB of VRAM, but it has a 2 MiB version as well, and used in earlier laptops, such as IBM Thinkpad 240X, the driver would probably crash on them. I've never seen one of those machines and cannot fix it, but I have documented these problems in the comments.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712.h | 5 ---- drivers/video/fbdev/sm712fb.c | 48 ++++++++++++++++++++++++++++++++++++++---- 2 files changed, 44 insertions(+), 9 deletions(-)
--- a/drivers/video/fbdev/sm712.h +++ b/drivers/video/fbdev/sm712.h @@ -19,11 +19,6 @@ #define SCREEN_Y_RES 600 #define SCREEN_BPP 16
-/*Assume SM712 graphics chip has 4MB VRAM */ -#define SM712_VIDEOMEMORYSIZE 0x00400000 -/*Assume SM722 graphics chip has 8MB VRAM */ -#define SM722_VIDEOMEMORYSIZE 0x00800000 - #define dac_reg (0x3c8) #define dac_val (0x3c9)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1329,6 +1329,11 @@ static int smtc_map_smem(struct smtcfb_i { sfb->fb->fix.smem_start = pci_resource_start(pdev, 0);
+ if (sfb->chip_id == 0x720) + /* on SM720, the framebuffer starts at the 1 MB offset */ + sfb->fb->fix.smem_start += 0x00200000; + + /* XXX: is it safe for SM720 on Big-Endian? */ if (sfb->fb->var.bits_per_pixel == 32) sfb->fb->fix.smem_start += big_addr;
@@ -1366,12 +1371,45 @@ static inline void sm7xx_init_hw(void) outb_p(0x11, 0x3c5); }
+static u_long sm7xx_vram_probe(struct smtcfb_info *sfb) +{ + u8 vram; + + switch (sfb->chip_id) { + case 0x710: + case 0x712: + /* + * Assume SM712 graphics chip has 4MB VRAM. + * + * FIXME: SM712 can have 2MB VRAM, which is used on earlier + * laptops, such as IBM Thinkpad 240X. This driver would + * probably crash on those machines. If anyone gets one of + * those and is willing to help, run "git blame" and send me + * an E-mail. + */ + return 0x00400000; + case 0x720: + outb_p(0x76, 0x3c4); + vram = inb_p(0x3c5) >> 6; + + if (vram == 0x00) + return 0x00800000; /* 8 MB */ + else if (vram == 0x01) + return 0x01000000; /* 16 MB */ + else if (vram == 0x02) + return 0x00400000; /* illegal, fallback to 4 MB */ + else if (vram == 0x03) + return 0x00400000; /* 4 MB */ + } + return 0; /* unknown hardware */ +} + static int smtcfb_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { struct smtcfb_info *sfb; struct fb_info *info; - u_long smem_size = 0x00800000; /* default 8MB */ + u_long smem_size; int err; unsigned long mmio_base;
@@ -1428,12 +1466,15 @@ static int smtcfb_pci_probe(struct pci_d mmio_base = pci_resource_start(pdev, 0); pci_read_config_byte(pdev, PCI_REVISION_ID, &sfb->chip_rev_id);
+ smem_size = sm7xx_vram_probe(sfb); + dev_info(&pdev->dev, "%lu MiB of VRAM detected.\n", + smem_size / 1048576); + switch (sfb->chip_id) { case 0x710: case 0x712: sfb->fb->fix.mmio_start = mmio_base + 0x00400000; sfb->fb->fix.mmio_len = 0x00400000; - smem_size = SM712_VIDEOMEMORYSIZE; sfb->lfb = ioremap(mmio_base, mmio_addr); if (!sfb->lfb) { dev_err(&pdev->dev, @@ -1465,8 +1506,7 @@ static int smtcfb_pci_probe(struct pci_d case 0x720: sfb->fb->fix.mmio_start = mmio_base; sfb->fb->fix.mmio_len = 0x00200000; - smem_size = SM722_VIDEOMEMORYSIZE; - sfb->dp_regs = ioremap(mmio_base, 0x00a00000); + sfb->dp_regs = ioremap(mmio_base, 0x00200000 + smem_size); sfb->lfb = sfb->dp_regs + 0x00200000; sfb->mmio = (smtc_regbaseaddress = sfb->dp_regs + 0x000c0000);
From: Yifeng Li tomli@tomli.me
commit 6053d3a4793e5bde6299ac5388e76a3bf679ff65 upstream.
In order to support the 1024x600 panel on Yeeloong Loongson MIPS laptop, the original 1024x768-16 table was modified to 1024x600-16, without leaving the original. It causes problem on x86 laptop as the 1024x768-16 support was still claimed but not working.
Fix it by introducing the 1024x768-16 mode.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 59 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 59 insertions(+)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -530,6 +530,65 @@ static const struct modeinit vgamode[] = 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x15, 0x03, }, }, + { /* 1024 x 768 16Bpp 60Hz */ + 1024, 768, 16, 60, + /* Init_MISC */ + 0xEB, + { /* Init_SR0_SR4 */ + 0x03, 0x01, 0x0F, 0x03, 0x0E, + }, + { /* Init_SR10_SR24 */ + 0xF3, 0xB6, 0xC0, 0xDD, 0x00, 0x0E, 0x17, 0x2C, + 0x99, 0x02, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, + 0xC4, 0x30, 0x02, 0x01, 0x01, + }, + { /* Init_SR30_SR75 */ + 0x38, 0x03, 0x20, 0x09, 0xC0, 0x3A, 0x3A, 0x3A, + 0x3A, 0x3A, 0x3A, 0x3A, 0x00, 0x00, 0x03, 0xFF, + 0x00, 0xFC, 0x00, 0x00, 0x20, 0x18, 0x00, 0xFC, + 0x20, 0x0C, 0x44, 0x20, 0x00, 0x00, 0x00, 0x3A, + 0x06, 0x68, 0xA7, 0x7F, 0x83, 0x24, 0xFF, 0x03, + 0x0F, 0x60, 0x59, 0x3A, 0x3A, 0x00, 0x00, 0x3A, + 0x01, 0x80, 0x7E, 0x1A, 0x1A, 0x00, 0x00, 0x00, + 0x50, 0x03, 0x74, 0x14, 0x3B, 0x0D, 0x09, 0x02, + 0x04, 0x45, 0x30, 0x30, 0x40, 0x20, + }, + { /* Init_SR80_SR93 */ + 0xFF, 0x07, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0x3A, + 0xF7, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x3A, 0x3A, + 0x00, 0x00, 0x00, 0x00, + }, + { /* Init_SRA0_SRAF */ + 0x00, 0xFB, 0x9F, 0x01, 0x00, 0xED, 0xED, 0xED, + 0x7B, 0xFB, 0xFF, 0xFF, 0x97, 0xEF, 0xBF, 0xDF, + }, + { /* Init_GR00_GR08 */ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x05, 0x0F, + 0xFF, + }, + { /* Init_AR00_AR14 */ + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, + 0x41, 0x00, 0x0F, 0x00, 0x00, + }, + { /* Init_CR00_CR18 */ + 0xA3, 0x7F, 0x7F, 0x00, 0x85, 0x16, 0x24, 0xF5, + 0x00, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x03, 0x09, 0xFF, 0x80, 0x40, 0xFF, 0x00, 0xE3, + 0xFF, + }, + { /* Init_CR30_CR4D */ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x02, 0x20, + 0x00, 0x00, 0x00, 0x40, 0x00, 0xFF, 0xBF, 0xFF, + 0xA3, 0x7F, 0x00, 0x86, 0x15, 0x24, 0xFF, 0x00, + 0x01, 0x07, 0xE5, 0x20, 0x7F, 0xFF, + }, + { /* Init_CR90_CRA7 */ + 0x55, 0xD9, 0x5D, 0xE1, 0x86, 0x1B, 0x8E, 0x26, + 0xDA, 0x8D, 0xDE, 0x94, 0x00, 0x00, 0x18, 0x00, + 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x15, 0x03, + }, + }, { /* mode#5: 1024 x 768 24Bpp 60Hz */ 1024, 768, 24, 60, /* Init_MISC */
From: Yifeng Li tomli@tomli.me
commit 4ed7d2ccb7684510ec5f7a8f7ef534bc6a3d55b2 upstream.
Loongson MIPS netbooks use 1024x600 LCD panels, which is the original target platform of this driver, but nearly all old x86 laptops have 1024x768. Lighting 768 panels using 600's timings would partially garble the display. Since it's not possible to distinguish them reliably, we change the default to 768, but keep 600 as-is on MIPS.
Further, earlier laptops, such as IBM Thinkpad 240X, has a 800x600 LCD panel, this driver would probably garbled those display. As we don't have one for testing, the original behavior of the driver is kept as-is, but the problem has been documented is the comments.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712.h | 7 +++-- drivers/video/fbdev/sm712fb.c | 53 +++++++++++++++++++++++++++++++----------- 2 files changed, 44 insertions(+), 16 deletions(-)
--- a/drivers/video/fbdev/sm712.h +++ b/drivers/video/fbdev/sm712.h @@ -15,9 +15,10 @@
#define FB_ACCEL_SMI_LYNX 88
-#define SCREEN_X_RES 1024 -#define SCREEN_Y_RES 600 -#define SCREEN_BPP 16 +#define SCREEN_X_RES 1024 +#define SCREEN_Y_RES_PC 768 +#define SCREEN_Y_RES_NETBOOK 600 +#define SCREEN_BPP 16
#define dac_reg (0x3c8) #define dac_val (0x3c9) --- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -1463,6 +1463,43 @@ static u_long sm7xx_vram_probe(struct sm return 0; /* unknown hardware */ }
+static void sm7xx_resolution_probe(struct smtcfb_info *sfb) +{ + /* get mode parameter from smtc_scr_info */ + if (smtc_scr_info.lfb_width != 0) { + sfb->fb->var.xres = smtc_scr_info.lfb_width; + sfb->fb->var.yres = smtc_scr_info.lfb_height; + sfb->fb->var.bits_per_pixel = smtc_scr_info.lfb_depth; + goto final; + } + + /* + * No parameter, default resolution is 1024x768-16. + * + * FIXME: earlier laptops, such as IBM Thinkpad 240X, has a 800x600 + * panel, also see the comments about Thinkpad 240X above. + */ + sfb->fb->var.xres = SCREEN_X_RES; + sfb->fb->var.yres = SCREEN_Y_RES_PC; + sfb->fb->var.bits_per_pixel = SCREEN_BPP; + +#ifdef CONFIG_MIPS + /* + * Loongson MIPS netbooks use 1024x600 LCD panels, which is the original + * target platform of this driver, but nearly all old x86 laptops have + * 1024x768. Lighting 768 panels using 600's timings would partially + * garble the display, so we don't want that. But it's not possible to + * distinguish them reliably. + * + * So we change the default to 768, but keep 600 as-is on MIPS. + */ + sfb->fb->var.yres = SCREEN_Y_RES_NETBOOK; +#endif + +final: + big_pixel_depth(sfb->fb->var.bits_per_pixel, smtc_scr_info.lfb_depth); +} + static int smtcfb_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -1508,19 +1545,6 @@ static int smtcfb_pci_probe(struct pci_d
sm7xx_init_hw();
- /* get mode parameter from smtc_scr_info */ - if (smtc_scr_info.lfb_width != 0) { - sfb->fb->var.xres = smtc_scr_info.lfb_width; - sfb->fb->var.yres = smtc_scr_info.lfb_height; - sfb->fb->var.bits_per_pixel = smtc_scr_info.lfb_depth; - } else { - /* default resolution 1024x600 16bit mode */ - sfb->fb->var.xres = SCREEN_X_RES; - sfb->fb->var.yres = SCREEN_Y_RES; - sfb->fb->var.bits_per_pixel = SCREEN_BPP; - } - - big_pixel_depth(sfb->fb->var.bits_per_pixel, smtc_scr_info.lfb_depth); /* Map address and memory detection */ mmio_base = pci_resource_start(pdev, 0); pci_read_config_byte(pdev, PCI_REVISION_ID, &sfb->chip_rev_id); @@ -1582,6 +1606,9 @@ static int smtcfb_pci_probe(struct pci_d goto failed_fb; }
+ /* probe and decide resolution */ + sm7xx_resolution_probe(sfb); + /* can support 32 bpp */ if (sfb->fb->var.bits_per_pixel == 15) sfb->fb->var.bits_per_pixel = 16;
From: Yifeng Li tomli@tomli.me
commit f627caf55b8e735dcec8fa6538e9668632b55276 upstream.
On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), blanking the display or starting the X server will crash and freeze the system, or garble the display.
Experiments showed this problem can mostly be solved by adjusting the order of register writes. Also, sm712fb failed to consider the difference of clock frequency when unblanking the display, and programs the clock for SM712 to SM720.
Fix them by adjusting the order of register writes, and adding an additional check for SM720 for programming the clock frequency.
Signed-off-by: Yifeng Li tomli@tomli.me Tested-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Cc: Teddy Wang teddy.wang@siliconmotion.com Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/sm712fb.c | 64 ++++++++++++++++++++++++------------------ 1 file changed, 38 insertions(+), 26 deletions(-)
--- a/drivers/video/fbdev/sm712fb.c +++ b/drivers/video/fbdev/sm712fb.c @@ -886,67 +886,79 @@ static inline unsigned int chan_to_field
static int smtc_blank(int blank_mode, struct fb_info *info) { + struct smtcfb_info *sfb = info->par; + /* clear DPMS setting */ switch (blank_mode) { case FB_BLANK_UNBLANK: /* Screen On: HSync: On, VSync : On */ + + switch (sfb->chip_id) { + case 0x710: + case 0x712: + smtc_seqw(0x6a, 0x16); + smtc_seqw(0x6b, 0x02); + case 0x720: + smtc_seqw(0x6a, 0x0d); + smtc_seqw(0x6b, 0x02); + break; + } + + smtc_seqw(0x23, (smtc_seqr(0x23) & (~0xc0))); smtc_seqw(0x01, (smtc_seqr(0x01) & (~0x20))); - smtc_seqw(0x6a, 0x16); - smtc_seqw(0x6b, 0x02); smtc_seqw(0x21, (smtc_seqr(0x21) & 0x77)); smtc_seqw(0x22, (smtc_seqr(0x22) & (~0x30))); - smtc_seqw(0x23, (smtc_seqr(0x23) & (~0xc0))); - smtc_seqw(0x24, (smtc_seqr(0x24) | 0x01)); smtc_seqw(0x31, (smtc_seqr(0x31) | 0x03)); + smtc_seqw(0x24, (smtc_seqr(0x24) | 0x01)); break; case FB_BLANK_NORMAL: /* Screen Off: HSync: On, VSync : On Soft blank */ + smtc_seqw(0x24, (smtc_seqr(0x24) | 0x01)); + smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); + smtc_seqw(0x23, (smtc_seqr(0x23) & (~0xc0))); smtc_seqw(0x01, (smtc_seqr(0x01) & (~0x20))); + smtc_seqw(0x22, (smtc_seqr(0x22) & (~0x30))); smtc_seqw(0x6a, 0x16); smtc_seqw(0x6b, 0x02); - smtc_seqw(0x22, (smtc_seqr(0x22) & (~0x30))); - smtc_seqw(0x23, (smtc_seqr(0x23) & (~0xc0))); - smtc_seqw(0x24, (smtc_seqr(0x24) | 0x01)); - smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); break; case FB_BLANK_VSYNC_SUSPEND: /* Screen On: HSync: On, VSync : Off */ + smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); + smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); + smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0x20)); smtc_seqw(0x01, (smtc_seqr(0x01) | 0x20)); - smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); - smtc_seqw(0x6a, 0x0c); - smtc_seqw(0x6b, 0x02); smtc_seqw(0x21, (smtc_seqr(0x21) | 0x88)); + smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); smtc_seqw(0x22, ((smtc_seqr(0x22) & (~0x30)) | 0x20)); - smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0x20)); - smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); - smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); smtc_seqw(0x34, (smtc_seqr(0x34) | 0x80)); + smtc_seqw(0x6a, 0x0c); + smtc_seqw(0x6b, 0x02); break; case FB_BLANK_HSYNC_SUSPEND: /* Screen On: HSync: Off, VSync : On */ + smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); + smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); + smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0xD8)); smtc_seqw(0x01, (smtc_seqr(0x01) | 0x20)); - smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); - smtc_seqw(0x6a, 0x0c); - smtc_seqw(0x6b, 0x02); smtc_seqw(0x21, (smtc_seqr(0x21) | 0x88)); + smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); smtc_seqw(0x22, ((smtc_seqr(0x22) & (~0x30)) | 0x10)); - smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0xD8)); - smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); - smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); smtc_seqw(0x34, (smtc_seqr(0x34) | 0x80)); + smtc_seqw(0x6a, 0x0c); + smtc_seqw(0x6b, 0x02); break; case FB_BLANK_POWERDOWN: /* Screen On: HSync: Off, VSync : Off */ + smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); + smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); + smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0xD8)); smtc_seqw(0x01, (smtc_seqr(0x01) | 0x20)); - smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); - smtc_seqw(0x6a, 0x0c); - smtc_seqw(0x6b, 0x02); smtc_seqw(0x21, (smtc_seqr(0x21) | 0x88)); + smtc_seqw(0x20, (smtc_seqr(0x20) & (~0xB0))); smtc_seqw(0x22, ((smtc_seqr(0x22) & (~0x30)) | 0x30)); - smtc_seqw(0x23, ((smtc_seqr(0x23) & (~0xc0)) | 0xD8)); - smtc_seqw(0x24, (smtc_seqr(0x24) & (~0x01))); - smtc_seqw(0x31, ((smtc_seqr(0x31) & (~0x07)) | 0x00)); smtc_seqw(0x34, (smtc_seqr(0x34) | 0x80)); + smtc_seqw(0x6a, 0x0c); + smtc_seqw(0x6b, 0x02); break; default: return -EINVAL;
From: Nikolai Kostrigin nickel@altlinux.org
commit d28ca864c493637f3c957f4ed9348a94fca6de60 upstream.
ATS is broken on the Radeon R7 GPU (at least for Stoney Ridge based laptop) and causes IOMMU stalls and system failure. Disable ATS on these devices to make them usable again with IOMMU enabled.
Thanks to Joerg Roedel jroedel@suse.de for help.
[bhelgaas: In the email thread mentioned below, Alex suspects the real problem is in sbios or iommu, so it may affect only certain systems, and it may affect other devices in those systems as well. However, per Joerg we lack the ability to debug further, so this quirk is the best we can do for now.]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=194521 Link: https://lore.kernel.org/lkml/20190408103725.30426-1-nickel@altlinux.org Fixes: 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken") Signed-off-by: Nikolai Kostrigin nickel@altlinux.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Joerg Roedel jroedel@suse.de CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4878,6 +4878,7 @@ static void quirk_no_ats(struct pci_dev
/* AMD Stoney platform GPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_no_ats); #endif /* CONFIG_PCI_ATS */
/* Freescale PCIe doesn't support MSI in RC mode */
From: James Prestwood james.prestwood@linux.intel.com
commit 6afb7e26978da5e86e57e540fdce65c8b04f398a upstream.
When using PCI passthrough with this device, the host machine locks up completely when starting the VM, requiring a hard reboot. Add a quirk to avoid bus resets on this device.
Fixes: c3e59ee4e766 ("PCI: Mark Atheros AR93xx to avoid bus reset") Link: https://lore.kernel.org/linux-pci/20190107213248.3034-1-james.prestwood@linu... Signed-off-by: James Prestwood james.prestwood@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com CC: stable@vger.kernel.org # v3.14+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3383,6 +3383,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_A DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset);
/* * Root port on some Cavium CN8xxx chips do not successfully complete a bus
From: Jean-Philippe Brucker jean-philippe.brucker@arm.com
commit 6302bf3ef78dd210b5ff4a922afcb7d8eff8a211 upstream.
Two functions allocate a host bridge: devm_pci_alloc_host_bridge() and pci_alloc_host_bridge(). At the moment, only the unmanaged one initializes the PCIe feature bits, which prevents from using features such as hotplug or AER on some systems, when booting with device tree. Make the initialization code common.
Fixes: 02bfeb484230 ("PCI/portdrv: Simplify PCIe feature permission checking") Signed-off-by: Jean-Philippe Brucker jean-philippe.brucker@arm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com CC: stable@vger.kernel.org # v4.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/probe.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-)
--- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -535,16 +535,9 @@ static void pci_release_host_bridge_dev( kfree(to_pci_host_bridge(dev)); }
-struct pci_host_bridge *pci_alloc_host_bridge(size_t priv) +static void pci_init_host_bridge(struct pci_host_bridge *bridge) { - struct pci_host_bridge *bridge; - - bridge = kzalloc(sizeof(*bridge) + priv, GFP_KERNEL); - if (!bridge) - return NULL; - INIT_LIST_HEAD(&bridge->windows); - bridge->dev.release = pci_release_host_bridge_dev;
/* * We assume we can manage these PCIe features. Some systems may @@ -557,6 +550,18 @@ struct pci_host_bridge *pci_alloc_host_b bridge->native_shpc_hotplug = 1; bridge->native_pme = 1; bridge->native_ltr = 1; +} + +struct pci_host_bridge *pci_alloc_host_bridge(size_t priv) +{ + struct pci_host_bridge *bridge; + + bridge = kzalloc(sizeof(*bridge) + priv, GFP_KERNEL); + if (!bridge) + return NULL; + + pci_init_host_bridge(bridge); + bridge->dev.release = pci_release_host_bridge_dev;
return bridge; } @@ -571,7 +576,7 @@ struct pci_host_bridge *devm_pci_alloc_h if (!bridge) return NULL;
- INIT_LIST_HEAD(&bridge->windows); + pci_init_host_bridge(bridge); bridge->dev.release = devm_pci_release_host_bridge_dev;
return bridge;
From: Jisheng Zhang Jisheng.Zhang@synaptics.com
commit 31f996efbd5a7825f4d30150469e9d110aea00e8 upstream.
Commit 60ed982a4e78 ("PCI/AER: Move internal declarations to drivers/pci/pci.h") changed pci_aer_init() to return "void", but didn't change the stub for when CONFIG_PCIEAER isn't enabled. Change the stub to match.
Fixes: 60ed982a4e78 ("PCI/AER: Move internal declarations to drivers/pci/pci.h") Signed-off-by: Jisheng Zhang Jisheng.Zhang@synaptics.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com CC: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pci.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -530,7 +530,7 @@ void pci_aer_clear_fatal_status(struct p void pci_aer_clear_device_status(struct pci_dev *dev); #else static inline void pci_no_aer(void) { } -static inline int pci_aer_init(struct pci_dev *d) { return -ENODEV; } +static inline void pci_aer_init(struct pci_dev *d) { } static inline void pci_aer_exit(struct pci_dev *d) { } static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { } static inline void pci_aer_clear_device_status(struct pci_dev *dev) { }
From: Kazufumi Ikeda kaz-ikeda@xc.jp.nec.com
commit be20bbcb0a8cb5597cc62b3e28d275919f3431df upstream.
Reestablish the PCIe link very early in the resume process in case it went down to prevent PCI accesses from hanging the bus. Such accesses can happen early in the PCI resume process, as early as the SUSPEND_RESUME_NOIRQ step, thus the link must be reestablished in the driver resume_noirq() callback.
Fixes: e015f88c368d ("PCI: rcar: Add support for R-Car H3 to pcie-rcar") Signed-off-by: Kazufumi Ikeda kaz-ikeda@xc.jp.nec.com Signed-off-by: Gaku Inami gaku.inami.xw@bp.renesas.com Signed-off-by: Marek Vasut marek.vasut+renesas@gmail.com [lorenzo.pieralisi@arm.com: reformatted commit log] Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Simon Horman horms+renesas@verge.net.au Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Acked-by: Wolfram Sang wsa+renesas@sang-engineering.com Cc: stable@vger.kernel.org Cc: Geert Uytterhoeven geert+renesas@glider.be Cc: Phil Edworthy phil.edworthy@renesas.com Cc: Simon Horman horms+renesas@verge.net.au Cc: Wolfram Sang wsa@the-dreams.de Cc: linux-renesas-soc@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/pcie-rcar.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
--- a/drivers/pci/controller/pcie-rcar.c +++ b/drivers/pci/controller/pcie-rcar.c @@ -46,6 +46,7 @@
/* Transfer control */ #define PCIETCTLR 0x02000 +#define DL_DOWN BIT(3) #define CFINIT 1 #define PCIETSTR 0x02004 #define DATA_LINK_ACTIVE 1 @@ -94,6 +95,7 @@ #define MACCTLR 0x011058 #define SPEED_CHANGE BIT(24) #define SCRAMBLE_DISABLE BIT(27) +#define PMSR 0x01105c #define MACS2R 0x011078 #define MACCGSPSETR 0x011084 #define SPCNGRSN BIT(31) @@ -1130,6 +1132,7 @@ static int rcar_pcie_probe(struct platfo pcie = pci_host_bridge_priv(bridge);
pcie->dev = dev; + platform_set_drvdata(pdev, pcie);
err = pci_parse_request_of_pci_ranges(dev, &pcie->resources, NULL); if (err) @@ -1221,10 +1224,28 @@ err_free_bridge: return err; }
+static int rcar_pcie_resume_noirq(struct device *dev) +{ + struct rcar_pcie *pcie = dev_get_drvdata(dev); + + if (rcar_pci_read_reg(pcie, PMSR) && + !(rcar_pci_read_reg(pcie, PCIETCTLR) & DL_DOWN)) + return 0; + + /* Re-establish the PCIe link */ + rcar_pci_write_reg(pcie, CFINIT, PCIETCTLR); + return rcar_pcie_wait_for_dl(pcie); +} + +static const struct dev_pm_ops rcar_pcie_pm_ops = { + .resume_noirq = rcar_pcie_resume_noirq, +}; + static struct platform_driver rcar_pcie_driver = { .driver = { .name = "rcar-pcie", .of_match_table = rcar_pcie_of_match, + .pm = &rcar_pcie_pm_ops, .suppress_bind_attrs = true, }, .probe = rcar_pcie_probe,
From: Stefan Mätje stefan.maetje@esd.eu
commit 86fa6a344209d9414ea962b1f1ac6ade9dd7563a upstream.
Factor out pcie_retrain_link() to use for Pericom Retrain Link quirk. No functional change intended.
Signed-off-by: Stefan Mätje stefan.maetje@esd.eu Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pcie/aspm.c | 40 ++++++++++++++++++++++++---------------- 1 file changed, 24 insertions(+), 16 deletions(-)
--- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -198,6 +198,29 @@ static void pcie_clkpm_cap_init(struct p link->clkpm_capable = (blacklist) ? 0 : capable; }
+static bool pcie_retrain_link(struct pcie_link_state *link) +{ + struct pci_dev *parent = link->pdev; + unsigned long start_jiffies; + u16 reg16; + + pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); + reg16 |= PCI_EXP_LNKCTL_RL; + pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); + + /* Wait for link training end. Break out after waiting for timeout */ + start_jiffies = jiffies; + for (;;) { + pcie_capability_read_word(parent, PCI_EXP_LNKSTA, ®16); + if (!(reg16 & PCI_EXP_LNKSTA_LT)) + break; + if (time_after(jiffies, start_jiffies + LINK_RETRAIN_TIMEOUT)) + break; + msleep(1); + } + return !(reg16 & PCI_EXP_LNKSTA_LT); +} + /* * pcie_aspm_configure_common_clock: check if the 2 ends of a link * could use common clock. If they are, configure them to use the @@ -207,7 +230,6 @@ static void pcie_aspm_configure_common_c { int same_clock = 1; u16 reg16, parent_reg, child_reg[8]; - unsigned long start_jiffies; struct pci_dev *child, *parent = link->pdev; struct pci_bus *linkbus = parent->subordinate; /* @@ -265,21 +287,7 @@ static void pcie_aspm_configure_common_c reg16 &= ~PCI_EXP_LNKCTL_CCC; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
- /* Retrain link */ - reg16 |= PCI_EXP_LNKCTL_RL; - pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); - - /* Wait for link training end. Break out after waiting for timeout */ - start_jiffies = jiffies; - for (;;) { - pcie_capability_read_word(parent, PCI_EXP_LNKSTA, ®16); - if (!(reg16 & PCI_EXP_LNKSTA_LT)) - break; - if (time_after(jiffies, start_jiffies + LINK_RETRAIN_TIMEOUT)) - break; - msleep(1); - } - if (!(reg16 & PCI_EXP_LNKSTA_LT)) + if (pcie_retrain_link(link)) return;
/* Training failed. Restore common clock configurations */
From: Stefan Mätje stefan.maetje@esd.eu
commit 4ec73791a64bab25cabf16a6067ee478692e506d upstream.
Due to an erratum in some Pericom PCIe-to-PCI bridges in reverse mode (conventional PCI on primary side, PCIe on downstream side), the Retrain Link bit needs to be cleared manually to allow the link training to complete successfully.
If it is not cleared manually, the link training is continuously restarted and no devices below the PCI-to-PCIe bridge can be accessed. That means drivers for devices below the bridge will be loaded but won't work and may even crash because the driver is only reading 0xffff.
See the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf for details. Devices known as affected so far are: PI7C9X110, PI7C9X111SL, PI7C9X130.
Add a new flag, clear_retrain_link, in struct pci_dev. Quirks for affected devices set this bit.
Note that pcie_retrain_link() lives in aspm.c because that's currently the only place we use it, but this erratum is not specific to ASPM, and we may retrain links for other reasons in the future.
Signed-off-by: Stefan Mätje stefan.maetje@esd.eu [bhelgaas: apply regardless of CONFIG_PCIEASPM] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pcie/aspm.c | 9 +++++++++ drivers/pci/quirks.c | 17 +++++++++++++++++ include/linux/pci.h | 2 ++ 3 files changed, 28 insertions(+)
--- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -207,6 +207,15 @@ static bool pcie_retrain_link(struct pci pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); reg16 |= PCI_EXP_LNKCTL_RL; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); + if (parent->clear_retrain_link) { + /* + * Due to an erratum in some devices the Retrain Link bit + * needs to be cleared again manually to allow the link + * training to succeed. + */ + reg16 &= ~PCI_EXP_LNKCTL_RL; + pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); + }
/* Wait for link training end. Break out after waiting for timeout */ start_jiffies = jiffies; --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -2220,6 +2220,23 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_IN DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
+/* + * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain + * Link bit cleared after starting the link retrain process to allow this + * process to finish. + * + * Affected devices: PI7C9X110, PI7C9X111SL, PI7C9X130. See also the + * Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf. + */ +static void quirk_enable_clear_retrain_link(struct pci_dev *dev) +{ + dev->clear_retrain_link = 1; + pci_info(dev, "Enable PCIe Retrain Link quirk\n"); +} +DECLARE_PCI_FIXUP_HEADER(0x12d8, 0xe110, quirk_enable_clear_retrain_link); +DECLARE_PCI_FIXUP_HEADER(0x12d8, 0xe111, quirk_enable_clear_retrain_link); +DECLARE_PCI_FIXUP_HEADER(0x12d8, 0xe130, quirk_enable_clear_retrain_link); + static void fixup_rev1_53c810(struct pci_dev *dev) { u32 class = dev->class; --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -346,6 +346,8 @@ struct pci_dev { unsigned int hotplug_user_indicators:1; /* SlotCtl indicators controlled exclusively by user sysfs */ + unsigned int clear_retrain_link:1; /* Need to clear Retrain Link + bit manually */ unsigned int d3_delay; /* D3->D0 transition time in ms */ unsigned int d3cold_delay; /* D3cold->D0 transition time in ms */
From: Nikos Tsironis ntsironis@arrikto.com
commit e28adc3bf34e434b30e8d063df4823ba0f3e0529 upstream.
Add missing dm_bitset_cursor_next() to properly advance the bitset cursor.
Otherwise, the discarded state of all blocks is set according to the discarded state of the first block.
Fixes: ae4a46a1f6 ("dm cache metadata: use bitset cursor api to load discard bitset") Cc: stable@vger.kernel.org Signed-off-by: Nikos Tsironis ntsironis@arrikto.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-cache-metadata.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -1167,11 +1167,18 @@ static int __load_discards(struct dm_cac if (r) return r;
- for (b = 0; b < from_dblock(cmd->discard_nr_blocks); b++) { + for (b = 0; ; b++) { r = fn(context, cmd->discard_block_size, to_dblock(b), dm_bitset_cursor_get_value(&c)); if (r) break; + + if (b >= (from_dblock(cmd->discard_nr_blocks) - 1)) + break; + + r = dm_bitset_cursor_next(&c); + if (r) + break; }
dm_bitset_cursor_end(&c);
From: Damien Le Moal damien.lemoal@wdc.com
commit 7aedf75ff740a98f3683439449cd91c8662d03b2 upstream.
The function blkdev_report_zones() returns success even if no zone information is reported (empty report). Empty zone reports can only happen if the report start sector passed exceeds the device capacity. The conditions for this to happen are either a bug in the caller code, or, a change in the device that forced the low level driver to change the device capacity to a value that is lower than the report start sector. This situation includes a failed disk revalidation resulting in the disk capacity being changed to 0.
If this change happens while dm-zoned is in its initialization phase executing dmz_init_zones(), this function may enter an infinite loop and hang the system. To avoid this, add a check to disallow empty zone reports and bail out early. Also fix the function dmz_update_zone() to make sure that the report for the requested zone was correctly obtained.
Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Shaun Tancheff shaun@tancheff.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-zoned-metadata.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/md/dm-zoned-metadata.c +++ b/drivers/md/dm-zoned-metadata.c @@ -1169,6 +1169,9 @@ static int dmz_init_zones(struct dmz_met goto out; }
+ if (!nr_blkz) + break; + /* Process report */ for (i = 0; i < nr_blkz; i++) { ret = dmz_init_zone(zmd, zone, &blkz[i]); @@ -1204,6 +1207,8 @@ static int dmz_update_zone(struct dmz_me /* Get zone information from disk */ ret = blkdev_report_zones(zmd->dev->bdev, dmz_start_sect(zmd, zone), &blkz, &nr_blkz, GFP_NOIO); + if (!nr_blkz) + ret = -EIO; if (ret) { dmz_dev_err(zmd->dev, "Get zone %u report failed", dmz_id(zmd, zone));
From: Mikulas Patocka mpatocka@redhat.com
commit 81bc6d150ace6250503b825d9d0c10f7bbd24095 upstream.
When the target line contains an invalid device, delay_ctr() will call delay_dtr() with NULL workqueue. Attempting to destroy the NULL workqueue causes a crash.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-delay.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-delay.c +++ b/drivers/md/dm-delay.c @@ -121,7 +121,8 @@ static void delay_dtr(struct dm_target * { struct delay_c *dc = ti->private;
- destroy_workqueue(dc->kdelayd_wq); + if (dc->kdelayd_wq) + destroy_workqueue(dc->kdelayd_wq);
if (dc->read.dev) dm_put_device(ti, dc->read.dev);
From: Mikulas Patocka mpatocka@redhat.com
commit 30bba430ddf737978e40561198693ba91386dac1 upstream.
When we use separate devices for data and metadata, dm-integrity would incorrectly calculate the size of the metadata device as if it had 512-byte block size - and it would refuse activation with larger block size and smaller metadata device.
Fix this so that it takes actual block size into account, which fixes the following reported issue: https://gitlab.com/cryptsetup/cryptsetup/issues/450
Fixes: 356d9d52e122 ("dm integrity: allow separate metadata device") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-integrity.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -2557,7 +2557,7 @@ static int calculate_device_limits(struc if (last_sector < ic->start || last_sector >= ic->meta_device_sectors) return -EINVAL; } else { - __u64 meta_size = ic->provided_data_sectors * ic->tag_size; + __u64 meta_size = (ic->provided_data_sectors >> ic->sb->log2_sectors_per_block) * ic->tag_size; meta_size = (meta_size + ((1U << (ic->log2_buffer_sectors + SECTOR_SHIFT)) - 1)) >> (ic->log2_buffer_sectors + SECTOR_SHIFT); meta_size <<= ic->log2_buffer_sectors; @@ -3428,7 +3428,7 @@ try_smaller_buffer: DEBUG_print(" journal_sections %u\n", (unsigned)le32_to_cpu(ic->sb->journal_sections)); DEBUG_print(" journal_entries %u\n", ic->journal_entries); DEBUG_print(" log2_interleave_sectors %d\n", ic->sb->log2_interleave_sectors); - DEBUG_print(" device_sectors 0x%llx\n", (unsigned long long)ic->device_sectors); + DEBUG_print(" data_device_sectors 0x%llx\n", (unsigned long long)ic->data_device_sectors); DEBUG_print(" initial_sectors 0x%x\n", ic->initial_sectors); DEBUG_print(" metadata_run 0x%x\n", ic->metadata_run); DEBUG_print(" log2_metadata_run %d\n", ic->log2_metadata_run);
From: Martin Wilck mwilck@suse.com
commit 940bc471780b004a5277c1931f52af363c2fc9da upstream.
Commit b592211c33f7 ("dm mpath: fix attached_handler_name leak and dangling hw_handler_name pointer") fixed a memory leak for the case where setup_scsi_dh() returns failure. But setup_scsi_dh may return success and not "use" attached_handler_name if the retain_attached_hwhandler flag is not set on the map. As setup_scsi_sh properly "steals" the pointer by nullifying it, freeing it unconditionally in parse_path() is safe.
Fixes: b592211c33f7 ("dm mpath: fix attached_handler_name leak and dangling hw_handler_name pointer") Cc: stable@vger.kernel.org Reported-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Martin Wilck mwilck@suse.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-mpath.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -892,6 +892,7 @@ static struct pgpath *parse_path(struct if (attached_handler_name || m->hw_handler_name) { INIT_DELAYED_WORK(&p->activate_path, activate_path_work); r = setup_scsi_dh(p->path.dev->bdev, m, &attached_handler_name, &ti->error); + kfree(attached_handler_name); if (r) { dm_put_device(ti, p->path.dev); goto bad; @@ -906,7 +907,6 @@ static struct pgpath *parse_path(struct
return p; bad: - kfree(attached_handler_name); free_pgpath(p); return ERR_PTR(r); }
From: Kirill Smelkov kirr@nexedi.com
commit bbd84f33652f852ce5992d65db4d020aba21f882 upstream.
Starting from commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") files opened even via nonseekable_open gate read and write via lock and do not allow them to be run simultaneously. This can create read vs write deadlock if a filesystem is trying to implement a socket-like file which is intended to be simultaneously used for both read and write from filesystem client. See commit 10dce8af3422 ("fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock") for details and e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") for a similar deadlock example on /proc/xen/xenbus.
To avoid such deadlock it was tempting to adjust fuse_finish_open to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers
https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused... https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfused...
so if we would do such a change it will break a real user.
Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the opened handler is having stream-like semantics; does not use file position and thus the kernel is free to issue simultaneous read and write request on opened file handle.
This patch together with stream_open() should be added to stable kernels starting from v3.14+. This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in open handler and this way avoid the deadlock on all kernel versions. This should work because fuse_finish_open ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock.
Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Kirill Smelkov kirr@nexedi.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 4 +++- include/uapi/linux/fuse.h | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -179,7 +179,9 @@ void fuse_finish_open(struct inode *inod file->f_op = &fuse_direct_io_file_operations; if (!(ff->open_flags & FOPEN_KEEP_CACHE)) invalidate_inode_pages2(inode->i_mapping); - if (ff->open_flags & FOPEN_NONSEEKABLE) + if (ff->open_flags & FOPEN_STREAM) + stream_open(inode, file); + else if (ff->open_flags & FOPEN_NONSEEKABLE) nonseekable_open(inode, file); if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) { struct fuse_inode *fi = get_fuse_inode(inode); --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -219,10 +219,12 @@ struct fuse_file_lock { * FOPEN_DIRECT_IO: bypass page cache for this open file * FOPEN_KEEP_CACHE: don't invalidate the data cache on open * FOPEN_NONSEEKABLE: the file is not seekable + * FOPEN_STREAM: the file is stream-like (no file position at all) */ #define FOPEN_DIRECT_IO (1 << 0) #define FOPEN_KEEP_CACHE (1 << 1) #define FOPEN_NONSEEKABLE (1 << 2) +#define FOPEN_STREAM (1 << 4)
/** * INIT request/reply flags
[ Upstream commit b805d78d300bcf2c83d6df7da0c818b0fee41427 ]
UBSAN report this:
UBSAN: Undefined behaviour in net/xfrm/xfrm_policy.c:1289:24 index 6 is out of range for type 'unsigned int [6]' CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.162-514.55.6.9.x86_64+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 0000000000000000 1466cf39b41b23c9 ffff8801f6b07a58 ffffffff81cb35f4 0000000041b58ab3 ffffffff83230f9c ffffffff81cb34e0 ffff8801f6b07a80 ffff8801f6b07a20 1466cf39b41b23c9 ffffffff851706e0 ffff8801f6b07ae8 Call Trace: <IRQ> [<ffffffff81cb35f4>] __dump_stack lib/dump_stack.c:15 [inline] <IRQ> [<ffffffff81cb35f4>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51 [<ffffffff81d94225>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164 [<ffffffff81d954db>] __ubsan_handle_out_of_bounds+0x16e/0x1b2 lib/ubsan.c:382 [<ffffffff82a25acd>] __xfrm_policy_unlink+0x3dd/0x5b0 net/xfrm/xfrm_policy.c:1289 [<ffffffff82a2e572>] xfrm_policy_delete+0x52/0xb0 net/xfrm/xfrm_policy.c:1309 [<ffffffff82a3319b>] xfrm_policy_timer+0x30b/0x590 net/xfrm/xfrm_policy.c:243 [<ffffffff813d3927>] call_timer_fn+0x237/0x990 kernel/time/timer.c:1144 [<ffffffff813d8e7e>] __run_timers kernel/time/timer.c:1218 [inline] [<ffffffff813d8e7e>] run_timer_softirq+0x6ce/0xb80 kernel/time/timer.c:1401 [<ffffffff8120d6f9>] __do_softirq+0x299/0xe10 kernel/softirq.c:273 [<ffffffff8120e676>] invoke_softirq kernel/softirq.c:350 [inline] [<ffffffff8120e676>] irq_exit+0x216/0x2c0 kernel/softirq.c:391 [<ffffffff82c5edab>] exiting_irq arch/x86/include/asm/apic.h:652 [inline] [<ffffffff82c5edab>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926 [<ffffffff82c5c985>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:735 <EOI> [<ffffffff81188096>] ? native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:52 [<ffffffff810834d7>] arch_safe_halt arch/x86/include/asm/paravirt.h:111 [inline] [<ffffffff810834d7>] default_idle+0x27/0x430 arch/x86/kernel/process.c:446 [<ffffffff81085f05>] arch_cpu_idle+0x15/0x20 arch/x86/kernel/process.c:437 [<ffffffff8132abc3>] default_idle_call+0x53/0x90 kernel/sched/idle.c:92 [<ffffffff8132b32d>] cpuidle_idle_call kernel/sched/idle.c:156 [inline] [<ffffffff8132b32d>] cpu_idle_loop kernel/sched/idle.c:251 [inline] [<ffffffff8132b32d>] cpu_startup_entry+0x60d/0x9a0 kernel/sched/idle.c:299 [<ffffffff8113e119>] start_secondary+0x3c9/0x560 arch/x86/kernel/smpboot.c:245
The issue is triggered as this:
xfrm_add_policy -->verify_newpolicy_info //check the index provided by user with XFRM_POLICY_MAX //In my case, the index is 0x6E6BB6, so it pass the check. -->xfrm_policy_construct //copy the user's policy and set xfrm_policy_timer -->xfrm_policy_insert --> __xfrm_policy_link //use the orgin dir, in my case is 2 --> xfrm_gen_index //generate policy index, there is 0x6E6BB6
then xfrm_policy_timer be fired
xfrm_policy_timer --> xfrm_policy_id2dir //get dir from (policy index & 7), in my case is 6 --> xfrm_policy_delete --> __xfrm_policy_unlink //access policy_count[dir], trigger out of range access
Add xfrm_policy_id2dir check in verify_newpolicy_info, make sure the computed dir is valid, to fix the issue.
Reported-by: Hulk Robot hulkci@huawei.com Fixes: e682adf021be ("xfrm: Try to honor policy index if it's supplied by user") Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 7e4904b930041..060afc4ffd958 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1424,7 +1424,7 @@ static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) ret = verify_policy_dir(p->dir); if (ret) return ret; - if (p->index && ((p->index & XFRM_POLICY_MAX) != p->dir)) + if (p->index && (xfrm_policy_id2dir(p->index) != p->dir)) return -EINVAL;
return 0;
[ Upstream commit 6ee02a54ef990a71bf542b6f0a4e3321de9d9c66 ]
When unloading xfrm6_tunnel module, xfrm6_tunnel_fini directly frees the xfrm6_tunnel_spi_kmem. Maybe someone has gotten the xfrm6_tunnel_spi, so need to wait it.
Fixes: 91cc3bb0b04ff("xfrm6_tunnel: RCU conversion") Signed-off-by: Su Yanjun suyj.fnst@cn.fujitsu.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/xfrm6_tunnel.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index bc65db782bfb1..12cb3aa990af4 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -402,6 +402,10 @@ static void __exit xfrm6_tunnel_fini(void) xfrm6_tunnel_deregister(&xfrm6_tunnel_handler, AF_INET6); xfrm_unregister_type(&xfrm6_tunnel_type, AF_INET6); unregister_pernet_subsys(&xfrm6_tunnel_net_ops); + /* Someone maybe has gotten the xfrm6_tunnel_spi. + * So need to wait it. + */ + rcu_barrier(); kmem_cache_destroy(xfrm6_tunnel_spi_kmem); }
[ Upstream commit 5483844c3fc18474de29f5d6733003526e0a9f78 ]
If tunnel registration failed during module initialization, the module would fail to deregister the IPPROTO_COMP protocol and would attempt to deregister the tunnel.
The tunnel was not deregistered during module-exit.
Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") Signed-off-by: Jeremy Sowden jeremy@azazel.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_vti.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 40a7cd56e0087..808f8d15c5197 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -659,9 +659,9 @@ static int __init vti_init(void) return err;
rtnl_link_failed: - xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); -xfrm_tunnel_failed: xfrm4_tunnel_deregister(&ipip_handler, AF_INET); +xfrm_tunnel_failed: + xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); xfrm_proto_comp_failed: xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH); xfrm_proto_ah_failed: @@ -676,6 +676,7 @@ pernet_dev_failed: static void __exit vti_fini(void) { rtnl_link_unregister(&vti_link_ops); + xfrm4_tunnel_deregister(&ipip_handler, AF_INET); xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH); xfrm4_protocol_deregister(&vti_esp4_protocol, IPPROTO_ESP);
[ Upstream commit dbb2483b2a46fbaf833cfb5deb5ed9cace9c7399 ]
In commit 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") I introduced a check for xfrm protocol, but according to Herbert IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so it should be removed from validate_tmpl().
And, IPSEC_PROTO_ANY is expected to only match 3 IPSec-specific protocols, this is why xfrm_state_flush() could still miss IPPROTO_ROUTING, which leads that those entries are left in net->xfrm.state_all before exit net. Fix this by replacing IPSEC_PROTO_ANY with zero.
This patch also extracts the check from validate_tmpl() to xfrm_id_proto_valid() and uses it in parse_ipsecrequest(). With this, no other protocols should be added into xfrm.
Fixes: 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com Cc: Steffen Klassert steffen.klassert@secunet.com Cc: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 17 +++++++++++++++++ net/ipv6/xfrm6_tunnel.c | 2 +- net/key/af_key.c | 4 +++- net/xfrm/xfrm_state.c | 2 +- net/xfrm/xfrm_user.c | 14 +------------- 5 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 5e3daf53b3d1e..3e966c632f3b2 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1430,6 +1430,23 @@ static inline int xfrm_state_kern(const struct xfrm_state *x) return atomic_read(&x->tunnel_users); }
+static inline bool xfrm_id_proto_valid(u8 proto) +{ + switch (proto) { + case IPPROTO_AH: + case IPPROTO_ESP: + case IPPROTO_COMP: +#if IS_ENABLED(CONFIG_IPV6) + case IPPROTO_ROUTING: + case IPPROTO_DSTOPTS: +#endif + return true; + default: + return false; + } +} + +/* IPSEC_PROTO_ANY only matches 3 IPsec protocols, 0 could match all. */ static inline int xfrm_id_proto_match(u8 proto, u8 userproto) { return (!userproto || proto == userproto || diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index 12cb3aa990af4..d9e5f6808811a 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -345,7 +345,7 @@ static void __net_exit xfrm6_tunnel_net_exit(struct net *net) unsigned int i;
xfrm_flush_gc(); - xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true); + xfrm_state_flush(net, 0, false, true);
for (i = 0; i < XFRM6_TUNNEL_SPI_BYADDR_HSIZE; i++) WARN_ON_ONCE(!hlist_empty(&xfrm6_tn->spi_byaddr[i])); diff --git a/net/key/af_key.c b/net/key/af_key.c index 7d4bed9550605..0b79c9aa8eb1f 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1951,8 +1951,10 @@ parse_ipsecrequest(struct xfrm_policy *xp, struct sadb_x_ipsecrequest *rq)
if (rq->sadb_x_ipsecrequest_mode == 0) return -EINVAL; + if (!xfrm_id_proto_valid(rq->sadb_x_ipsecrequest_proto)) + return -EINVAL;
- t->id.proto = rq->sadb_x_ipsecrequest_proto; /* XXX check proto */ + t->id.proto = rq->sadb_x_ipsecrequest_proto; if ((mode = pfkey_mode_to_xfrm(rq->sadb_x_ipsecrequest_mode)) < 0) return -EINVAL; t->mode = mode; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 3f729cd512aff..11e09eb138d60 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2386,7 +2386,7 @@ void xfrm_state_fini(struct net *net)
flush_work(&net->xfrm.state_hash_work); flush_work(&xfrm_state_gc_work); - xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true); + xfrm_state_flush(net, 0, false, true);
WARN_ON(!list_empty(&net->xfrm.state_all));
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 060afc4ffd958..2122f89f61555 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1513,20 +1513,8 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) return -EINVAL; }
- switch (ut[i].id.proto) { - case IPPROTO_AH: - case IPPROTO_ESP: - case IPPROTO_COMP: -#if IS_ENABLED(CONFIG_IPV6) - case IPPROTO_ROUTING: - case IPPROTO_DSTOPTS: -#endif - case IPSEC_PROTO_ANY: - break; - default: + if (!xfrm_id_proto_valid(ut[i].id.proto)) return -EINVAL; - } - }
return 0;
[ Upstream commit 8dfb4eba4100e7cdd161a8baef2d8d61b7a7e62e ]
esp_output_udp_encap can produce a length that doesn't fit in the 16 bits of a UDP header's length field. In that case, we'll send a fragmented packet whose length is larger than IP_MAX_MTU (resulting in "Oversized IP packet" warnings on receive) and with a bogus UDP length.
To prevent this, add a length check to esp_output_udp_encap and return -EMSGSIZE on failure.
This seems to be older than git history.
Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 12a43a5369a54..114f9def1ec54 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -223,7 +223,7 @@ static void esp_output_fill_trailer(u8 *tail, int tfclen, int plen, __u8 proto) tail[plen - 1] = proto; }
-static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) +static int esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) { int encap_type; struct udphdr *uh; @@ -231,6 +231,7 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru __be16 sport, dport; struct xfrm_encap_tmpl *encap = x->encap; struct ip_esp_hdr *esph = esp->esph; + unsigned int len;
spin_lock_bh(&x->lock); sport = encap->encap_sport; @@ -238,11 +239,14 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru encap_type = encap->encap_type; spin_unlock_bh(&x->lock);
+ len = skb->len + esp->tailen - skb_transport_offset(skb); + if (len + sizeof(struct iphdr) >= IP_MAX_MTU) + return -EMSGSIZE; + uh = (struct udphdr *)esph; uh->source = sport; uh->dest = dport; - uh->len = htons(skb->len + esp->tailen - - skb_transport_offset(skb)); + uh->len = htons(len); uh->check = 0;
switch (encap_type) { @@ -259,6 +263,8 @@ static void esp_output_udp_encap(struct xfrm_state *x, struct sk_buff *skb, stru
*skb_mac_header(skb) = IPPROTO_UDP; esp->esph = esph; + + return 0; }
int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp) @@ -272,8 +278,12 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info * int tailen = esp->tailen;
/* this is non-NULL only with UDP Encapsulation */ - if (x->encap) - esp_output_udp_encap(x, skb, esp); + if (x->encap) { + int err = esp_output_udp_encap(x, skb, esp); + + if (err < 0) + return err; + }
if (!skb_cloned(skb)) { if (tailen <= skb_tailroom(skb)) {
[ Upstream commit 025c65e119bf58b610549ca359c9ecc5dee6a8d2 ]
If an xfrmi is associated to a vrf layer 3 master device, xfrm_policy_check() fails after traffic decapsulation. The input interface is replaced by the layer 3 master device, and hence xfrmi_decode_session() can't match the xfrmi anymore to satisfy policy checking.
Extend ingress xfrmi lookup to honor the original layer 3 slave device, allowing xfrm interfaces to operate within a vrf domain.
Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 3 ++- net/xfrm/xfrm_interface.c | 17 ++++++++++++++--- net/xfrm/xfrm_policy.c | 2 +- 3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 3e966c632f3b2..4ddd2b13ac8d6 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -295,7 +295,8 @@ struct xfrm_replay { };
struct xfrm_if_cb { - struct xfrm_if *(*decode_session)(struct sk_buff *skb); + struct xfrm_if *(*decode_session)(struct sk_buff *skb, + unsigned short family); };
void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb); diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index 82723ef44db3e..555ee2aca6c01 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -70,17 +70,28 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x) return NULL; }
-static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb) +static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb, + unsigned short family) { struct xfrmi_net *xfrmn; - int ifindex; struct xfrm_if *xi; + int ifindex = 0;
if (!secpath_exists(skb) || !skb->dev) return NULL;
+ switch (family) { + case AF_INET6: + ifindex = inet6_sdif(skb); + break; + case AF_INET: + ifindex = inet_sdif(skb); + break; + } + if (!ifindex) + ifindex = skb->dev->ifindex; + xfrmn = net_generic(xs_net(xfrm_input_state(skb)), xfrmi_net_id); - ifindex = skb->dev->ifindex;
for_each_xfrmi_rcu(xfrmn->xfrmi[0], xi) { if (ifindex == xi->dev->ifindex && diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index bf5d59270f79d..ce1b262ce9646 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -2339,7 +2339,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, ifcb = xfrm_if_get_cb();
if (ifcb) { - xi = ifcb->decode_session(skb); + xi = ifcb->decode_session(skb, family); if (xi) { if_id = xi->p.if_id; net = xi->net;
[ Upstream commit 8742dc86d0c7a9628117a989c11f04a9b6b898f3 ]
We currently don't reload pointers pointing into skb header after doing pskb_may_pull() in _decode_session4(). So in case pskb_may_pull() changed the pointers, we read from random memory. Fix this by putting all the needed infos on the stack, so that we don't need to access the header pointers after doing pskb_may_pull().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/xfrm4_policy.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index d73a6d6652f60..2b144b92ae46a 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -111,7 +111,8 @@ static void _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) { const struct iphdr *iph = ip_hdr(skb); - u8 *xprth = skb_network_header(skb) + iph->ihl * 4; + int ihl = iph->ihl; + u8 *xprth = skb_network_header(skb) + ihl * 4; struct flowi4 *fl4 = &fl->u.ip4; int oif = 0;
@@ -122,6 +123,11 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) fl4->flowi4_mark = skb->mark; fl4->flowi4_oif = reverse ? skb->skb_iif : oif;
+ fl4->flowi4_proto = iph->protocol; + fl4->daddr = reverse ? iph->saddr : iph->daddr; + fl4->saddr = reverse ? iph->daddr : iph->saddr; + fl4->flowi4_tos = iph->tos; + if (!ip_is_fragment(iph)) { switch (iph->protocol) { case IPPROTO_UDP: @@ -133,7 +139,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be16 *ports;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ports = (__be16 *)xprth;
fl4->fl4_sport = ports[!!reverse]; @@ -146,7 +152,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 2 - skb->data)) { u8 *icmp;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; icmp = xprth;
fl4->fl4_icmp_type = icmp[0]; @@ -159,7 +165,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be32 *ehdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ehdr = (__be32 *)xprth;
fl4->fl4_ipsec_spi = ehdr[0]; @@ -171,7 +177,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 8 - skb->data)) { __be32 *ah_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ah_hdr = (__be32 *)xprth;
fl4->fl4_ipsec_spi = ah_hdr[1]; @@ -183,7 +189,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) pskb_may_pull(skb, xprth + 4 - skb->data)) { __be16 *ipcomp_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; ipcomp_hdr = (__be16 *)xprth;
fl4->fl4_ipsec_spi = htonl(ntohs(ipcomp_hdr[1])); @@ -196,7 +202,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) __be16 *greflags; __be32 *gre_hdr;
- xprth = skb_network_header(skb) + iph->ihl * 4; + xprth = skb_network_header(skb) + ihl * 4; greflags = (__be16 *)xprth; gre_hdr = (__be32 *)xprth;
@@ -213,10 +219,6 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse) break; } } - fl4->flowi4_proto = iph->protocol; - fl4->daddr = reverse ? iph->saddr : iph->daddr; - fl4->saddr = reverse ? iph->daddr : iph->saddr; - fl4->flowi4_tos = iph->tos; }
static void xfrm4_update_pmtu(struct dst_entry *dst, struct sock *sk,
[ Upstream commit 2abc330c514fe56c570bb1a6318b054b06a4f72e ]
Sometimes one of the nkmp factors is unused. This means that one of the factors shift and width values are set to 0. Current nkmp clock code generates a mask for each factor with GENMASK(width + shift - 1, shift). For unused factor this translates to GENMASK(-1, 0). This code is further expanded by C preprocessor to final version: (((~0UL) - (1UL << (0)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (-1)))) or a bit simplified: (~0UL & (~0UL >> BITS_PER_LONG))
It turns out that result of the second part (~0UL >> BITS_PER_LONG) is actually undefined by C standard, which clearly specifies:
"If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined."
Additionally, compiling kernel with aarch64-linux-gnu-gcc 8.3.0 gave different results whether literals or variables with same values as literals were used. GENMASK with literals -1 and 0 gives zero and with variables gives 0xFFFFFFFFFFFFFFF (~0UL). Because nkmp driver uses GENMASK with variables as parameter, expression calculates mask as ~0UL instead of 0. This has further consequences that LSB in register is always set to 1 (1 is neutral value for a factor and shift is 0).
For example, H6 pll-de clock is set to 600 MHz by sun4i-drm driver, but due to this bug ends up being 300 MHz. Additionally, 300 MHz seems to be too low because following warning can be found in dmesg:
[ 1.752763] WARNING: CPU: 2 PID: 41 at drivers/clk/sunxi-ng/ccu_common.c:41 ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.763378] Modules linked in: [ 1.766441] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted 5.1.0-rc2-next-20190401 #138 [ 1.774269] Hardware name: Pine H64 (DT) [ 1.778200] Workqueue: events deferred_probe_work_func [ 1.783341] pstate: 40000005 (nZcv daif -PAN -UAO) [ 1.788135] pc : ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.793623] lr : ccu_helper_wait_for_lock.part.0+0x48/0x90 [ 1.799107] sp : ffff000010f93840 [ 1.802422] x29: ffff000010f93840 x28: 0000000000000000 [ 1.807735] x27: ffff800073ce9d80 x26: ffff000010afd1b8 [ 1.813049] x25: ffffffffffffffff x24: 00000000ffffffff [ 1.818362] x23: 0000000000000001 x22: ffff000010abd5c8 [ 1.823675] x21: 0000000010000000 x20: 00000000685f367e [ 1.828987] x19: 0000000000001801 x18: 0000000000000001 [ 1.834300] x17: 0000000000000001 x16: 0000000000000000 [ 1.839613] x15: 0000000000000000 x14: ffff000010789858 [ 1.844926] x13: 0000000000000000 x12: 0000000000000001 [ 1.850239] x11: 0000000000000000 x10: 0000000000000970 [ 1.855551] x9 : ffff000010f936c0 x8 : ffff800074cec0d0 [ 1.860864] x7 : 0000800067117000 x6 : 0000000115c30b41 [ 1.866177] x5 : 00ffffffffffffff x4 : 002c959300bfe500 [ 1.871490] x3 : 0000000000000018 x2 : 0000000029aaaaab [ 1.876802] x1 : 00000000000002e6 x0 : 00000000686072bc [ 1.882114] Call trace: [ 1.884565] ccu_helper_wait_for_lock.part.0+0x6c/0x90 [ 1.889705] ccu_helper_wait_for_lock+0x10/0x20 [ 1.894236] ccu_nkmp_set_rate+0x244/0x2a8 [ 1.898334] clk_change_rate+0x144/0x290 [ 1.902258] clk_core_set_rate_nolock+0x180/0x1b8 [ 1.906963] clk_set_rate+0x34/0xa0 [ 1.910455] sun8i_mixer_bind+0x484/0x558 [ 1.914466] component_bind_all+0x10c/0x230 [ 1.918651] sun4i_drv_bind+0xc4/0x1a0 [ 1.922401] try_to_bring_up_master+0x164/0x1c0 [ 1.926932] __component_add+0xa0/0x168 [ 1.930769] component_add+0x10/0x18 [ 1.934346] sun8i_dw_hdmi_probe+0x18/0x20 [ 1.938443] platform_drv_probe+0x50/0xa0 [ 1.942455] really_probe+0xcc/0x280 [ 1.946032] driver_probe_device+0x54/0xe8 [ 1.950130] __device_attach_driver+0x80/0xb8 [ 1.954488] bus_for_each_drv+0x78/0xc8 [ 1.958326] __device_attach+0xd4/0x130 [ 1.962163] device_initial_probe+0x10/0x18 [ 1.966348] bus_probe_device+0x90/0x98 [ 1.970185] deferred_probe_work_func+0x6c/0xa0 [ 1.974720] process_one_work+0x1e0/0x320 [ 1.978732] worker_thread+0x228/0x428 [ 1.982484] kthread+0x120/0x128 [ 1.985714] ret_from_fork+0x10/0x18 [ 1.989290] ---[ end trace 9babd42e1ca4b84f ]---
This commit solves the issue by first checking value of the factor width. If it is equal to 0 (unused factor), mask is set to 0, otherwise GENMASK() macro is used as before.
Fixes: d897ef56faf9 ("clk: sunxi-ng: Mask nkmp factors when setting register") Signed-off-by: Jernej Skrabec jernej.skrabec@siol.net Signed-off-by: Maxime Ripard maxime.ripard@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/sunxi-ng/ccu_nkmp.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.c b/drivers/clk/sunxi-ng/ccu_nkmp.c index ebd9436d2c7cd..1ad53d1016a3e 100644 --- a/drivers/clk/sunxi-ng/ccu_nkmp.c +++ b/drivers/clk/sunxi-ng/ccu_nkmp.c @@ -160,7 +160,7 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long parent_rate) { struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw); - u32 n_mask, k_mask, m_mask, p_mask; + u32 n_mask = 0, k_mask = 0, m_mask = 0, p_mask = 0; struct _ccu_nkmp _nkmp; unsigned long flags; u32 reg; @@ -179,10 +179,18 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
ccu_nkmp_find_best(parent_rate, rate, &_nkmp);
- n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1, nkmp->n.shift); - k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1, nkmp->k.shift); - m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1, nkmp->m.shift); - p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1, nkmp->p.shift); + if (nkmp->n.width) + n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1, + nkmp->n.shift); + if (nkmp->k.width) + k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1, + nkmp->k.shift); + if (nkmp->m.width) + m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1, + nkmp->m.shift); + if (nkmp->p.width) + p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1, + nkmp->p.shift);
spin_lock_irqsave(nkmp->common.lock, flags);
[ Upstream commit dbe7208c6c4aec083571f2ec742870a0d0edbea3 ]
If called fast enough so samples do not increment, we can get division by zero in kernel:
__div0 cpcap_battery_cc_raw_div cpcap_battery_get_property power_supply_get_property.part.1 power_supply_get_property power_supply_show_property power_supply_uevent
Fixes: 874b2adbed12 ("power: supply: cpcap-battery: Add a battery driver") Signed-off-by: Tony Lindgren tony@atomide.com Acked-by: Pavel Machek pavel@ucw.cz Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/cpcap-battery.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/power/supply/cpcap-battery.c b/drivers/power/supply/cpcap-battery.c index 98ba07869c3b0..3bae02380bb22 100644 --- a/drivers/power/supply/cpcap-battery.c +++ b/drivers/power/supply/cpcap-battery.c @@ -221,6 +221,9 @@ static int cpcap_battery_cc_raw_div(struct cpcap_battery_ddata *ddata, int avg_current; u32 cc_lsb;
+ if (!divider) + return 0; + sample &= 0xffffff; /* 24-bits, unsigned */ offset &= 0x7ff; /* 10-bits, signed */
[ Upstream commit 46c874419652bbefdfed17420fd6e88d8a31d9ec ]
symlink body shouldn't be freed without an RCU delay. Switch securityfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- security/inode.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/security/inode.c b/security/inode.c index 8dd9ca8848e43..829f15672e01f 100644 --- a/security/inode.c +++ b/security/inode.c @@ -26,17 +26,22 @@ static struct vfsmount *mount; static int mount_count;
-static void securityfs_evict_inode(struct inode *inode) +static void securityfs_i_callback(struct rcu_head *head) { - truncate_inode_pages_final(&inode->i_data); - clear_inode(inode); + struct inode *inode = container_of(head, struct inode, i_rcu); if (S_ISLNK(inode->i_mode)) kfree(inode->i_link); + free_inode_nonrcu(inode); +} + +static void securityfs_destroy_inode(struct inode *inode) +{ + call_rcu(&inode->i_rcu, securityfs_i_callback); }
static const struct super_operations securityfs_super_operations = { .statfs = simple_statfs, - .evict_inode = securityfs_evict_inode, + .destroy_inode = securityfs_destroy_inode, };
static int fill_super(struct super_block *sb, void *data, int silent)
[ Upstream commit f51dcd0f621caac5380ce90fbbeafc32ce4517ae ]
symlink body shouldn't be freed without an RCU delay. Switch apparmorfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- security/apparmor/apparmorfs.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index e09fe4d7307cd..40e3a098f6fb5 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -123,17 +123,22 @@ static int aafs_show_path(struct seq_file *seq, struct dentry *dentry) return 0; }
-static void aafs_evict_inode(struct inode *inode) +static void aafs_i_callback(struct rcu_head *head) { - truncate_inode_pages_final(&inode->i_data); - clear_inode(inode); + struct inode *inode = container_of(head, struct inode, i_rcu); if (S_ISLNK(inode->i_mode)) kfree(inode->i_link); + free_inode_nonrcu(inode); +} + +static void aafs_destroy_inode(struct inode *inode) +{ + call_rcu(&inode->i_rcu, aafs_i_callback); }
static const struct super_operations aafs_super_ops = { .statfs = simple_statfs, - .evict_inode = aafs_evict_inode, + .destroy_inode = aafs_destroy_inode, .show_path = aafs_show_path, };
[ Upstream commit d5bc73f34cc97c4b4b9202cc93182c2515076edf ]
In most cases, kmalloc() will not be available early in boot when pci_setup() is called. Thus, the kstrdup() call that was added to fix the __initdata bug with the disable_acs_redir parameter usually returns NULL, so the parameter is discarded and has no effect.
To fix this, store the string that's in initdata until an initcall function can allocate the memory appropriately. This way we don't need any additional static memory.
Fixes: d2fd6e81912a ("PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter") Signed-off-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 30649addc6252..61f2ef28ea1c7 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6135,8 +6135,7 @@ static int __init pci_setup(char *str) } else if (!strncmp(str, "pcie_scan_all", 13)) { pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS); } else if (!strncmp(str, "disable_acs_redir=", 18)) { - disable_acs_redir_param = - kstrdup(str + 18, GFP_KERNEL); + disable_acs_redir_param = str + 18; } else { printk(KERN_ERR "PCI: Unknown option `%s'\n", str); @@ -6147,3 +6146,19 @@ static int __init pci_setup(char *str) return 0; } early_param("pci", pci_setup); + +/* + * 'disable_acs_redir_param' is initialized in pci_setup(), above, to point + * to data in the __initdata section which will be freed after the init + * sequence is complete. We can't allocate memory in pci_setup() because some + * architectures do not have any memory allocation service available during + * an early_param() call. So we allocate memory and copy the variable here + * before the init section is freed. + */ +static int __init pci_realloc_setup_params(void) +{ + disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL); + + return 0; +} +pure_initcall(pci_realloc_setup_params);
[ Upstream commit da66761c2d93a46270d69001abb5692717495a68 ]
It was reported that with some special Multi Processor Group configuration, e.g: bcdedit.exe /set groupsize 1 bcdedit.exe /set maxgroup on bcdedit.exe /set groupaware on for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism is in use.
Tracing kvm_hv_flush_tlb immediately reveals the issue:
kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified. We don't flush anything and apparently it's not what Windows expects.
TLFS doesn't say anything about such requests and newer Windows versions seem to be unaffected. This all feels like a WS2012 bug, which is, however, easy to workaround in KVM: let's flush everything when we see an empty flush request, over-flushing doesn't hurt.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/hyperv.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 01d209ab5481b..229d996051653 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1291,7 +1291,16 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa, flush.address_space, flush.flags);
sparse_banks[0] = flush.processor_mask; - all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS; + + /* + * Work around possible WS2012 bug: it sends hypercalls + * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear, + * while also expecting us to flush something and crashing if + * we don't. Let's treat processor_mask == 0 same as + * HV_FLUSH_ALL_PROCESSORS. + */ + all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) || + flush.processor_mask == 0; } else { if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex, sizeof(flush_ex))))
[ Upstream commit f1267cf3c01b12e0f843fb6a7450a7f0b2efab8a ]
The txq of vif is added to active_txqs list for ATF TXQ scheduling in the function ieee80211_queue_skb(), but it was not properly removed before freeing the txq object. It was causing use after free of the txq objects from the active_txqs list, result was kernel panic due to invalid memory access.
Fix kernel invalid memory access by properly removing txq object from active_txqs list before free the object.
Signed-off-by: Bhagavathi Perumal S bperumal@codeaurora.org Acked-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/iface.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 3a0171a65db32..152d4365f9616 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -1910,6 +1910,9 @@ void ieee80211_if_remove(struct ieee80211_sub_if_data *sdata) list_del_rcu(&sdata->list); mutex_unlock(&sdata->local->iflist_mtx);
+ if (sdata->vif.txq) + ieee80211_txq_purge(sdata->local, to_txq_info(sdata->vif.txq)); + synchronize_rcu();
if (sdata->dev) {
[ Upstream commit 22e8860cf8f777fbf6a83f2fb7127f682a8e9de4 ]
regmap_update_bits could fail and deserves a check.
The patch adds the checks and if it fails, returns its error code upstream.
Signed-off-by: Kangjie Lu kjlu@umn.edu Reviewed-by: Mukesh Ojha mojha@codeaurora.org Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/mcr20a.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ieee802154/mcr20a.c b/drivers/net/ieee802154/mcr20a.c index 04891429a5542..fe4057fca83d8 100644 --- a/drivers/net/ieee802154/mcr20a.c +++ b/drivers/net/ieee802154/mcr20a.c @@ -539,6 +539,8 @@ mcr20a_start(struct ieee802154_hw *hw) dev_dbg(printdev(lp), "no slotted operation\n"); ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1, DAR_PHY_CTRL1_SLOTTED, 0x0); + if (ret < 0) + return ret;
/* enable irq */ enable_irq(lp->spi->irq); @@ -546,11 +548,15 @@ mcr20a_start(struct ieee802154_hw *hw) /* Unmask SEQ interrupt */ ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL2, DAR_PHY_CTRL2_SEQMSK, 0x0); + if (ret < 0) + return ret;
/* Start the RX sequence */ dev_dbg(printdev(lp), "start the RX sequence\n"); ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1, DAR_PHY_CTRL1_XCVSEQ_MASK, MCR20A_XCVSEQ_RX); + if (ret < 0) + return ret;
return 0; }
[ Upstream commit 811328fc3222f7b55846de0cd0404339e2e1e6d7 ]
A failed KVM_ARM_VCPU_INIT should not set the vcpu target, as the vcpu target is used by kvm_vcpu_initialized() to determine if other vcpu ioctls may proceed. We need to set the target before calling kvm_reset_vcpu(), but if that call fails, we should then unset it and clear the feature bitmap while we're at it.
Signed-off-by: Andrew Jones drjones@redhat.com [maz: Simplified patch, completed commit message] Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/arm/arm.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index 1415e36fed3db..fef3527af3bd7 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -949,7 +949,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level, static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, const struct kvm_vcpu_init *init) { - unsigned int i; + unsigned int i, ret; int phys_target = kvm_target_cpu();
if (init->target != phys_target) @@ -984,9 +984,14 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, vcpu->arch.target = phys_target;
/* Now we know what it is, we can reset it. */ - return kvm_reset_vcpu(vcpu); -} + ret = kvm_reset_vcpu(vcpu); + if (ret) { + vcpu->arch.target = -1; + bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); + }
+ return ret; +}
static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
[ Upstream commit 349ced9984ff540ce74ca8a0b2e9b03dc434b9dd ]
Fix a similar endless event loop as was done in commit 8dcf32175b4e ("i2c: prevent endless uevent loop with CONFIG_I2C_DEBUG_CORE"):
The culprit is the dev_dbg printk in the i2c uevent handler. If this is activated (for instance by CONFIG_I2C_DEBUG_CORE) it results in an endless loop with systemd-journald.
This happens if user-space scans the system log and reads the uevent file to get information about a newly created device, which seems fair use to me. Unfortunately reading the "uevent" file uses the same function that runs for creating the uevent for a new device, generating the next syslog entry
Both CONFIG_I2C_DEBUG_CORE and CONFIG_POWER_SUPPLY_DEBUG were reported in https://bugs.freedesktop.org/show_bug.cgi?id=76886 but only former seems to have been fixed. Drop debug prints as it was done in I2C subsystem to resolve the issue.
Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Cc: Chris Healy cphealy@gmail.com Cc: linux-pm@vger.kernel.org Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/power_supply_sysfs.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/drivers/power/supply/power_supply_sysfs.c b/drivers/power/supply/power_supply_sysfs.c index 6170ed8b6854b..5a2757a7f4088 100644 --- a/drivers/power/supply/power_supply_sysfs.c +++ b/drivers/power/supply/power_supply_sysfs.c @@ -382,15 +382,11 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env) char *prop_buf; char *attrname;
- dev_dbg(dev, "uevent\n"); - if (!psy || !psy->desc) { dev_dbg(dev, "No power supply yet\n"); return ret; }
- dev_dbg(dev, "POWER_SUPPLY_NAME=%s\n", psy->desc->name); - ret = add_uevent_var(env, "POWER_SUPPLY_NAME=%s", psy->desc->name); if (ret) return ret; @@ -426,8 +422,6 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env) goto out; }
- dev_dbg(dev, "prop %s=%s\n", attrname, prop_buf); - ret = add_uevent_var(env, "POWER_SUPPLY_%s=%s", attrname, prop_buf); kfree(attrname); if (ret)
[ Upstream commit 0edd6b64d1939e9e9168ff27947995bb7751db5d ]
Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant.
Cc: Roman Gushchin guro@fb.com Cc: Alexei Starovoitov ast@kernel.org Cc: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 523481a3471b5..fd95f2efe5f32 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -400,7 +400,7 @@ int bpf_prog_array_copy(struct bpf_prog_array __rcu *old_array, } \ _out: \ rcu_read_unlock(); \ - preempt_enable_no_resched(); \ + preempt_enable(); \ _ret; \ })
[ Upstream commit 88ef66a28391ea7b624bfb7508a5b015c13b28f3 ]
Adding device entries found in vendor modified versions of this driver. Function maps for some of the devices follow:
WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)
MI_00 Qualcomm HS-USB Diagnostics MI_01 Android Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm Wireless HS-USB Ethernet Adapter MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1435 ProdID=0918 Rev= 2.32 S: Manufacturer=Android S: Product=Android S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
WNC D18 LTE CAT3 module (1435:d182)
MI_00 Qualcomm HS-USB Diagnostics MI_01 Androd Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm HS-USB NMEA MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device
ZM8510/ZM8620/ME3960 (19d2:0396)
MI_00 ZTE Mobile Broadband Diagnostics Port MI_01 ZTE Mobile Broadband AT Port MI_02 ZTE Mobile Broadband Modem MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan) MI_04 ZTE Mobile Broadband ADB Port
ME3620_X (19d2:1432)
MI_00 ZTE Diagnostics Device MI_01 ZTE UI AT Interface MI_02 ZTE Modem Device MI_03 ZTE Mobile Broadband Network Adapter MI_04 ZTE Composite ADB Interface
Reported-by: Lars Melin larsm17@gmail.com Signed-off-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/qmi_wwan.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 94389c84ede65..366217263d704 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1122,9 +1122,16 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x0846, 0x68d3, 8)}, /* Netgear Aircard 779S */ {QMI_FIXED_INTF(0x12d1, 0x140c, 1)}, /* Huawei E173 */ {QMI_FIXED_INTF(0x12d1, 0x14ac, 1)}, /* Huawei E1820 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 3)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 4)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x0918, 5)}, /* Wistron NeWeb D16Q1 */ + {QMI_FIXED_INTF(0x1435, 0x3185, 4)}, /* Wistron NeWeb M18Q5 */ + {QMI_FIXED_INTF(0x1435, 0xd111, 4)}, /* M9615A DM11-1 D51QC */ {QMI_FIXED_INTF(0x1435, 0xd181, 3)}, /* Wistron NeWeb D18Q1 */ {QMI_FIXED_INTF(0x1435, 0xd181, 4)}, /* Wistron NeWeb D18Q1 */ {QMI_FIXED_INTF(0x1435, 0xd181, 5)}, /* Wistron NeWeb D18Q1 */ + {QMI_FIXED_INTF(0x1435, 0xd182, 4)}, /* Wistron NeWeb D18 */ + {QMI_FIXED_INTF(0x1435, 0xd182, 5)}, /* Wistron NeWeb D18 */ {QMI_FIXED_INTF(0x1435, 0xd191, 4)}, /* Wistron NeWeb D19Q1 */ {QMI_QUIRK_SET_DTR(0x1508, 0x1001, 4)}, /* Fibocom NL668 series */ {QMI_FIXED_INTF(0x16d8, 0x6003, 0)}, /* CMOTech 6003 */ @@ -1180,6 +1187,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x19d2, 0x0265, 4)}, /* ONDA MT8205 4G LTE */ {QMI_FIXED_INTF(0x19d2, 0x0284, 4)}, /* ZTE MF880 */ {QMI_FIXED_INTF(0x19d2, 0x0326, 4)}, /* ZTE MF821D */ + {QMI_FIXED_INTF(0x19d2, 0x0396, 3)}, /* ZTE ZM8620 */ {QMI_FIXED_INTF(0x19d2, 0x0412, 4)}, /* Telewell TW-LTE 4G */ {QMI_FIXED_INTF(0x19d2, 0x1008, 4)}, /* ZTE (Vodafone) K3570-Z */ {QMI_FIXED_INTF(0x19d2, 0x1010, 4)}, /* ZTE (Vodafone) K3571-Z */ @@ -1200,7 +1208,9 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x19d2, 0x1425, 2)}, {QMI_FIXED_INTF(0x19d2, 0x1426, 2)}, /* ZTE MF91 */ {QMI_FIXED_INTF(0x19d2, 0x1428, 2)}, /* Telewell TW-LTE 4G v2 */ + {QMI_FIXED_INTF(0x19d2, 0x1432, 3)}, /* ZTE ME3620 */ {QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */ + {QMI_FIXED_INTF(0x2001, 0x7e16, 3)}, /* D-Link DWM-221 */ {QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */ {QMI_FIXED_INTF(0x2001, 0x7e35, 4)}, /* D-Link DWM-222 */ {QMI_FIXED_INTF(0x2020, 0x2031, 4)}, /* Olicard 600 */
[ Upstream commit de1887c064b9996ac03120d90d0a909a3f678f98 ]
We don't check for the validity of the lengths in the packet received from the firmware. If the MPDU length received in the rx descriptor is too short to contain the header length and the crypt length together, we may end up trying to copy a negative number of bytes (headlen - hdrlen < 0) which will underflow and cause us to try to copy a huge amount of data. This causes oopses such as this one:
BUG: unable to handle kernel paging request at ffff896be2970000 PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1 Hardware name: [...] RIP: 0010:memcpy_erms+0x6/0x10 Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287 RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610 RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000 RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000 R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050 R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038 FS: 0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm] iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi] iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi] irq_thread_fn+0x24/0x49 irq_thread+0xb0/0x122 kthread+0x138/0x140 ret_from_fork+0x1f/0x40
Fix that by checking the lengths for correctness and trigger a warning to show that we have received wrong data.
Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 28 ++++++++++++++++--- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c index b53148f972a4a..036d1d82d93e7 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c @@ -143,9 +143,9 @@ static inline int iwl_mvm_check_pn(struct iwl_mvm *mvm, struct sk_buff *skb, }
/* iwl_mvm_create_skb Adds the rxb to a new skb */ -static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, - u16 len, u8 crypt_len, - struct iwl_rx_cmd_buffer *rxb) +static int iwl_mvm_create_skb(struct iwl_mvm *mvm, struct sk_buff *skb, + struct ieee80211_hdr *hdr, u16 len, u8 crypt_len, + struct iwl_rx_cmd_buffer *rxb) { struct iwl_rx_packet *pkt = rxb_addr(rxb); struct iwl_rx_mpdu_desc *desc = (void *)pkt->data; @@ -178,6 +178,20 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, * present before copying packet data. */ hdrlen += crypt_len; + + if (WARN_ONCE(headlen < hdrlen, + "invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n", + hdrlen, len, crypt_len)) { + /* + * We warn and trace because we want to be able to see + * it in trace-cmd as well. + */ + IWL_DEBUG_RX(mvm, + "invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n", + hdrlen, len, crypt_len); + return -EINVAL; + } + skb_put_data(skb, hdr, hdrlen); skb_put_data(skb, (u8 *)hdr + hdrlen + pad_len, headlen - hdrlen);
@@ -190,6 +204,8 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr, skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset, fraglen, rxb->truesize); } + + return 0; }
/* iwl_mvm_pass_packet_to_mac80211 - passes the packet for mac80211 */ @@ -1425,7 +1441,11 @@ void iwl_mvm_rx_mpdu_mq(struct iwl_mvm *mvm, struct napi_struct *napi, rx_status->boottime_ns = ktime_get_boot_ns(); }
- iwl_mvm_create_skb(skb, hdr, len, crypt_len, rxb); + if (iwl_mvm_create_skb(mvm, skb, hdr, len, crypt_len, rxb)) { + kfree_skb(skb); + goto out; + } + if (!iwl_mvm_reorder(mvm, napi, queue, sta, skb, desc)) iwl_mvm_pass_packet_to_mac80211(mvm, napi, skb, queue, sta); out:
[ Upstream commit 9a4f26cc98d81b67ecc23b890c28e2df324e29f3 ]
Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject.
Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add().
Signed-off-by: Tobin C. Harding tobin@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Tobin C. Harding tobin@kernel.org Cc: Vincent Guittot vincent.guittot@linaro.org Cc: Viresh Kumar viresh.kumar@linaro.org Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cpufreq_schedutil.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 217f81ecae176..4e3625109b28d 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -751,6 +751,7 @@ out: return 0;
fail: + kobject_put(&tunables->attr_set.kobj); policy->governor_data = NULL; sugov_tunables_free(tunables);
[ Upstream commit b51ce3744f115850166f3d6c292b9c8cb849ad4f ]
Enablement of AMD's Secure Memory Encryption feature is determined very early after start_kernel() is entered. Part of this procedure involves scanning the command line for the parameter 'mem_encrypt'.
To determine intended state, the function sme_enable() uses library functions cmdline_find_option() and strncmp(). Their use occurs early enough such that it cannot be assumed that any instrumentation subsystem is initialized.
For example, making calls to a KASAN-instrumented function before KASAN is set up will result in the use of uninitialized memory and a boot failure.
When AMD's SME support is enabled, conditionally disable instrumentation of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.
[ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption") Reported-by: Li RongQing lirongqing@baidu.com Signed-off-by: Gary R Hook gary.hook@amd.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Boris Brezillon bbrezillon@kernel.org Cc: Coly Li colyli@suse.de Cc: "dave.hansen@linux.intel.com" dave.hansen@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Kees Cook keescook@chromium.org Cc: Kent Overstreet kent.overstreet@gmail.com Cc: "luto@kernel.org" luto@kernel.org Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Matthew Wilcox willy@infradead.org Cc: "mingo@redhat.com" mingo@redhat.com Cc: "peterz@infradead.org" peterz@infradead.org Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Thomas Gleixner tglx@linutronix.de Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/lib/Makefile | 12 ++++++++++++ lib/Makefile | 11 +++++++++++ 2 files changed, 23 insertions(+)
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile index 25a972c61b0ae..3c19d60316a88 100644 --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -6,6 +6,18 @@ # Produces uninteresting flaky coverage. KCOV_INSTRUMENT_delay.o := n
+# Early boot use of cmdline; don't instrument it +ifdef CONFIG_AMD_MEM_ENCRYPT +KCOV_INSTRUMENT_cmdline.o := n +KASAN_SANITIZE_cmdline.o := n + +ifdef CONFIG_FUNCTION_TRACER +CFLAGS_REMOVE_cmdline.o = -pg +endif + +CFLAGS_cmdline.o := $(call cc-option, -fno-stack-protector) +endif + inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt quiet_cmd_inat_tables = GEN $@ diff --git a/lib/Makefile b/lib/Makefile index 4238764468109..0ab808318202c 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -17,6 +17,17 @@ KCOV_INSTRUMENT_list_debug.o := n KCOV_INSTRUMENT_debugobjects.o := n KCOV_INSTRUMENT_dynamic_debug.o := n
+# Early boot use of cmdline, don't instrument it +ifdef CONFIG_AMD_MEM_ENCRYPT +KASAN_SANITIZE_string.o := n + +ifdef CONFIG_FUNCTION_TRACER +CFLAGS_REMOVE_string.o = -pg +endif + +CFLAGS_string.o := $(call cc-option, -fno-stack-protector) +endif + lib-y := ctype.o string.o vsprintf.o cmdline.o \ rbtree.o radix-tree.o timerqueue.o\ idr.o int_sqrt.o extable.o \
[ Upstream commit 4e9036042fedaffcd868d7f7aa948756c48c637d ]
To choose whether to pick the GID from the old (16bit) or new (32bit) field, we should check if the old gid field is set to 0xffff. Mainline checks the old *UID* field instead - cut'n'paste from the corresponding code in ufs_get_inode_uid().
Fixes: 252e211e90ce Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ufs/util.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ufs/util.h b/fs/ufs/util.h index 1fd3011ea6236..7fd4802222b8c 100644 --- a/fs/ufs/util.h +++ b/fs/ufs/util.h @@ -229,7 +229,7 @@ ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode) case UFS_UID_44BSD: return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid); case UFS_UID_EFT: - if (inode->ui_u1.oldids.ui_suid == 0xFFFF) + if (inode->ui_u1.oldids.ui_sgid == 0xFFFF) return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid); /* Fall through */ default:
[ Upstream commit bf561d3c13423fc54daa19b5d49dc15fafdb7acc ]
While cross building perf to the ARC architecture on a fedora 30 host, we were failing with:
CC /tmp/build/perf/bench/numa.o bench/numa.c: In function ‘worker_thread’: bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’? getrusage(RUSAGE_THREAD, &rusage); ^~~~~~~~~~~~~ SIGEV_THREAD bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in
[perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1 arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 [perfbuilder@60d5802468f6 perf]$
Trying to reproduce a report by Vineet, I noticed that, with just cross-built zlib and numactl libraries, I ended up with the above failure.
So, since RUSAGE_THREAD is available as a define, check for that and numactl libraries, I ended up with the above failure.
So, since RUSAGE_THREAD is available as a define in the system headers, check if it is defined in the 'perf bench numa' sources and define it if not.
Now it builds and I have to figure out if the problem reported by Vineet only takes place if we have libelf or some other library available.
Cc: Arnd Bergmann arnd@arndb.de Cc: Jiri Olsa jolsa@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: Namhyung Kim namhyung@kernel.org Cc: Vineet Gupta Vineet.Gupta1@synopsys.com Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/bench/numa.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c index 44195514b19e6..fa56fde6e8d80 100644 --- a/tools/perf/bench/numa.c +++ b/tools/perf/bench/numa.c @@ -38,6 +38,10 @@ #include <numa.h> #include <numaif.h>
+#ifndef RUSAGE_THREAD +# define RUSAGE_THREAD 1 +#endif + /* * Regular printout to the terminal, supressed if -q is specified: */
[ Upstream commit 6f55967ad9d9752813e36de6d5fdbd19741adfc7 ]
New race in x86_pmu_stop() was introduced by replacing the atomic __test_and_clear_bit() of cpuc->active_mask by separate test_bit() and __clear_bit() calls in the following commit:
3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
The race causes panic for PEBS events with enabled callchains:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... RIP: 0010:perf_prepare_sample+0x8c/0x530 Call Trace: <NMI> perf_event_output_forward+0x2a/0x80 __perf_event_overflow+0x51/0xe0 handle_pmi_common+0x19e/0x240 intel_pmu_handle_irq+0xad/0x170 perf_event_nmi_handler+0x2e/0x50 nmi_handle+0x69/0x110 default_do_nmi+0x3e/0x100 do_nmi+0x11a/0x180 end_repeat_nmi+0x16/0x1a RIP: 0010:native_write_msr+0x6/0x20 ... </NMI> intel_pmu_disable_event+0x98/0xf0 x86_pmu_stop+0x6e/0xb0 x86_pmu_del+0x46/0x140 event_sched_out.isra.97+0x7e/0x160 ...
The event is configured to make samples from PEBS drain code, but when it's disabled, we'll go through NMI path instead, where data->callchain will not get allocated and we'll crash:
x86_pmu_stop test_bit(hwc->idx, cpuc->active_mask) intel_pmu_disable_event(event) { ... intel_pmu_pebs_disable(event); ...
EVENT OVERFLOW -> <NMI> intel_pmu_handle_irq handle_pmi_common TEST PASSES -> test_bit(bit, cpuc->active_mask)) perf_event_overflow perf_prepare_sample { ... if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY)) data->callchain = perf_callchain(event, regs);
CRASH -> size += data->callchain->nr; } </NMI> ... x86_pmu_disable_event(event) }
__clear_bit(hwc->idx, cpuc->active_mask);
Fixing this by disabling the event itself before setting off the PEBS bit.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: David Arcari darcari@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Lendacky Thomas Thomas.Lendacky@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: 3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler") Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index a759e59990fbd..09c53bcbd497d 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2074,15 +2074,19 @@ static void intel_pmu_disable_event(struct perf_event *event) cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); cpuc->intel_cp_status &= ~(1ull << hwc->idx);
- if (unlikely(event->attr.precise_ip)) - intel_pmu_pebs_disable(event); - if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { intel_pmu_disable_fixed(hwc); return; }
x86_pmu_disable_event(event); + + /* + * Needs to be called after x86_pmu_disable_event, + * so we don't trigger the event without PEBS bit set. + */ + if (unlikely(event->attr.precise_ip)) + intel_pmu_pebs_disable(event); }
static void intel_pmu_del_event(struct perf_event *event)
From: Song Liu songliubraving@fb.com
commit a25d8c327bb41742dbd59f8c545f59f3b9c39983 upstream.
This reverts commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef.
Cc: Dan Williams dan.j.williams@intel.com Cc: Nigel Croxon ncroxon@redhat.com Cc: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/raid5.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -4221,15 +4221,26 @@ static void handle_parity_checks6(struct case check_state_check_result: sh->check_state = check_state_idle;
- if (s->failed > 1) - break; /* handle a successful check operation, if parity is correct * we are done. Otherwise update the mismatch count and repair * parity if !MD_RECOVERY_CHECK */ if (sh->ops.zero_sum_result == 0) { - /* Any parity checked was correct */ - set_bit(STRIPE_INSYNC, &sh->state); + /* both parities are correct */ + if (!s->failed) + set_bit(STRIPE_INSYNC, &sh->state); + else { + /* in contrast to the raid5 case we can validate + * parity, but still have a failure to write + * back + */ + sh->check_state = check_state_compute_result; + /* Returning at this point means that we may go + * off and bring p and/or q uptodate again so + * we make sure to check zero_sum_result again + * to verify if p or q need writeback + */ + } } else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) {
On Thu 2019-05-23 21:06:47, Greg Kroah-Hartman wrote:
From: Song Liu songliubraving@fb.com
commit a25d8c327bb41742dbd59f8c545f59f3b9c39983 upstream.
This reverts commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef.
Cc: Dan Williams dan.j.williams@intel.com Cc: Nigel Croxon ncroxon@redhat.com Cc: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
We normally reject patches without changelog, and this has none. Why make exception here?
Pavel
drivers/md/raid5.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -4221,15 +4221,26 @@ static void handle_parity_checks6(struct case check_state_check_result: sh->check_state = check_state_idle;
if (s->failed > 1)
/* handle a successful check operation, if parity is correctbreak;
*/ if (sh->ops.zero_sum_result == 0) {
- we are done. Otherwise update the mismatch count and repair
- parity if !MD_RECOVERY_CHECK
/* Any parity checked was correct */
set_bit(STRIPE_INSYNC, &sh->state);
/* both parities are correct */
if (!s->failed)
set_bit(STRIPE_INSYNC, &sh->state);
else {
/* in contrast to the raid5 case we can validate
* parity, but still have a failure to write
* back
*/
sh->check_state = check_state_compute_result;
/* Returning at this point means that we may go
* off and bring p and/or q uptodate again so
* we make sure to check zero_sum_result again
* to verify if p or q need writeback
*/
} else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) {}
On Sat, May 25, 2019 at 02:08:35PM +0200, Pavel Machek wrote:
On Thu 2019-05-23 21:06:47, Greg Kroah-Hartman wrote:
From: Song Liu songliubraving@fb.com
commit a25d8c327bb41742dbd59f8c545f59f3b9c39983 upstream.
This reverts commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef.
Cc: Dan Williams dan.j.williams@intel.com Cc: Nigel Croxon ncroxon@redhat.com Cc: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
We normally reject patches without changelog, and this has none. Why make exception here?
Why are you reviewing stable kernel patches as if there is anything you can do about them at this point in time to change things like this? These are all mirrors of what is in Linus's tree. If you object to the original patch, that's fine, but please respond to the original patch, not the stable kernel patches as there's nothing I can do about changelog text at this late in the process.
thanks,
greg k-h
On May 25, 2019, at 9:07 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Sat, May 25, 2019 at 02:08:35PM +0200, Pavel Machek wrote:
On Thu 2019-05-23 21:06:47, Greg Kroah-Hartman wrote:
From: Song Liu songliubraving@fb.com
commit a25d8c327bb41742dbd59f8c545f59f3b9c39983 upstream.
This reverts commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef.
Cc: Dan Williams dan.j.williams@intel.com Cc: Nigel Croxon ncroxon@redhat.com Cc: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
We normally reject patches without changelog, and this has none. Why make exception here?
Hi Pavel,
It was all my fault not to include more information about the revert. Thanks for highlighting this. I will be more careful with this in the future.
Thanks, Song
From: Nigel Croxon ncroxon@redhat.com
commit b2176a1dfb518d870ee073445d27055fea64dfb8 upstream.
The problem is that any 'uptodate' vs 'disks' check is not precise in this path. Put a "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags)" on the device that might try to kick off writes and then skip the action. Better to prevent the raid driver from taking unexpected action *and* keep the system alive vs killing the machine with BUG_ON.
Note: fixed warning reported by kbuild test robot lkp@intel.com
Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Nigel Croxon ncroxon@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/raid5.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -4185,7 +4185,7 @@ static void handle_parity_checks6(struct /* now write out any block on a failed drive, * or P or Q if they were recomputed */ - BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */ + dev = NULL; if (s->failed == 2) { dev = &sh->dev[s->failed_num[1]]; s->locked++; @@ -4210,6 +4210,14 @@ static void handle_parity_checks6(struct set_bit(R5_LOCKED, &dev->flags); set_bit(R5_Wantwrite, &dev->flags); } + if (WARN_ONCE(dev && !test_bit(R5_UPTODATE, &dev->flags), + "%s: disk%td not up to date\n", + mdname(conf->mddev), + dev - (struct r5dev *) &sh->dev)) { + clear_bit(R5_LOCKED, &dev->flags); + clear_bit(R5_Wantwrite, &dev->flags); + s->locked--; + } clear_bit(STRIPE_DEGRADED, &sh->state);
set_bit(STRIPE_INSYNC, &sh->state);
From: John Garry john.garry@huawei.com
commit 0b777eee88d712256ba8232a9429edb17c4f9ceb upstream.
In commit 376991db4b64 ("driver core: Postpone DMA tear-down until after devres release"), we changed the ordering of tearing down the device DMA ops and releasing all the device's resources; this was because the DMA ops should be maintained until we release the device's managed DMA memories.
However, we have seen another crash on an arm64 system when a device driver probe fails:
hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2 scsi host1: hisi_sas_v3_hw BUG: Bad page state in process swapper/0 pfn:313f5 page:ffff7e0000c4fd40 count:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x1000(reserved) Modules linked in: CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433 Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019 Call trace: dump_backtrace+0x0/0x118 show_stack+0x14/0x1c dump_stack+0xa4/0xc8 bad_page+0xe4/0x13c free_pages_check_bad+0x4c/0xc0 __free_pages_ok+0x30c/0x340 __free_pages+0x30/0x44 __dma_direct_free_pages+0x30/0x38 dma_direct_free+0x24/0x38 dma_free_attrs+0x9c/0xd8 dmam_release+0x20/0x28 release_nodes+0x17c/0x220 devres_release_all+0x34/0x54 really_probe+0xc4/0x2c8 driver_probe_device+0x58/0xfc device_driver_attach+0x68/0x70 __driver_attach+0x94/0xdc bus_for_each_dev+0x5c/0xb4 driver_attach+0x20/0x28 bus_add_driver+0x14c/0x200 driver_register+0x6c/0x124 __pci_register_driver+0x48/0x50 sas_v3_pci_driver_init+0x20/0x28 do_one_initcall+0x40/0x25c kernel_init_freeable+0x2b8/0x3c0 kernel_init+0x10/0x100 ret_from_fork+0x10/0x18 Disabling lock debugging due to kernel taint BUG: Bad page state in process swapper/0 pfn:313f6 page:ffff7e0000c4fd80 count:1 mapcount:0 mapping:0000000000000000 index:0x0 [ 89.322983] flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
The crash occurs for the same reason.
In this case, on the really_probe() failure path, we are still clearing the DMA ops prior to releasing the device's managed memories.
This patch fixes this issue by reordering the DMA ops teardown and the call to devres_release_all() on the failure path.
Reported-by: Xiang Chen chenxiang66@hisilicon.com Tested-by: Xiang Chen chenxiang66@hisilicon.com Signed-off-by: John Garry john.garry@huawei.com Reviewed-by: Robin Murphy robin.murphy@arm.com [jpg: backport to 4.19.x and earlier] Signed-off-by: John Garry john.garry@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/base/dd.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -482,7 +482,7 @@ re_probe:
ret = dma_configure(dev); if (ret) - goto dma_failed; + goto probe_failed;
if (driver_sysfs_add(dev)) { printk(KERN_ERR "%s: driver_sysfs_add(%s) failed\n", @@ -537,14 +537,13 @@ re_probe: goto done;
probe_failed: - dma_deconfigure(dev); -dma_failed: if (dev->bus) blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_DRIVER_NOT_BOUND, dev); pinctrl_bind_failed: device_links_no_driver(dev); devres_release_all(dev); + dma_deconfigure(dev); driver_sysfs_remove(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This reverts commit 118d38a3577f7728278f6afda8436af05a6bec7f which is commit 8184d44c9a577a2f1842ed6cc844bfd4a9981d8e upstream.
Tommi reports that this patch breaks the build, it's not really needed so let's revert it.
Reported-by: Tommi Rantala tommi.t.rantala@nokia.com Cc: Stanislav Fomichev sdf@google.com Cc: Sasha Levin sashal@kernel.org Acked-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/test_verifier.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -32,7 +32,6 @@ #include <linux/if_ether.h>
#include <bpf/bpf.h> -#include <bpf/libbpf.h>
#ifdef HAVE_GENHDR # include "autoconf.h" @@ -57,7 +56,6 @@
#define UNPRIV_SYSCTL "kernel/unprivileged_bpf_disabled" static bool unpriv_disabled = false; -static int skips;
struct bpf_test { const char *descr; @@ -12772,11 +12770,6 @@ static void do_test_single(struct bpf_te fd_prog = bpf_verify_program(prog_type ? : BPF_PROG_TYPE_SOCKET_FILTER, prog, prog_len, test->flags & F_LOAD_WITH_STRICT_ALIGNMENT, "GPL", 0, bpf_vlog, sizeof(bpf_vlog), 1); - if (fd_prog < 0 && !bpf_probe_prog_type(prog_type, 0)) { - printf("SKIP (unsupported program type %d)\n", prog_type); - skips++; - goto close_fds; - }
expected_ret = unpriv && test->result_unpriv != UNDEF ? test->result_unpriv : test->result; @@ -12912,7 +12905,7 @@ static void get_unpriv_disabled()
static int do_test(bool unpriv, unsigned int from, unsigned int to) { - int i, passes = 0, errors = 0; + int i, passes = 0, errors = 0, skips = 0;
for (i = from; i < to; i++) { struct bpf_test *test = &tests[i];
From: Chenbo Feng fengc@google.com
commit e547ff3f803e779a3898f1f48447b29f43c54085 upstream.
For iptable module to load a bpf program from a pinned location, it only retrieve a loaded program and cannot change the program content so requiring a write permission for it might not be necessary. Also when adding or removing an unrelated iptable rule, it might need to flush and reload the xt_bpf related rules as well and triggers the inode permission check. It might be better to remove the write premission check for the inode so we won't need to grant write access to all the processes that flush and restore iptables rules.
Signed-off-by: Chenbo Feng fengc@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/bpf/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -518,7 +518,7 @@ out: static struct bpf_prog *__get_prog_inode(struct inode *inode, enum bpf_prog_type type) { struct bpf_prog *prog; - int ret = inode_permission(inode, MAY_READ | MAY_WRITE); + int ret = inode_permission(inode, MAY_READ); if (ret) return ERR_PTR(ret);
From: Daniel Borkmann daniel@iogearbox.net
commit c6110222c6f49ea68169f353565eb865488a8619 upstream.
Add a callback map_lookup_elem_sys_only() that map implementations could use over map_lookup_elem() from system call side in case the map implementation needs to handle the latter differently than from the BPF data path. If map_lookup_elem_sys_only() is set, this will be preferred pick for map lookups out of user space. This hook is used in a follow-up fix for LRU map, but once development window opens, we can convert other map types from map_lookup_elem() (here, the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use the callback to simplify and clean up the latter.
Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/bpf.h | 1 + kernel/bpf/syscall.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -34,6 +34,7 @@ struct bpf_map_ops { void (*map_free)(struct bpf_map *map); int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key); void (*map_release_uref)(struct bpf_map *map); + void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
/* funcs callable from userspace and from eBPF programs */ void *(*map_lookup_elem)(struct bpf_map *map, void *key); --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -721,7 +721,10 @@ static int map_lookup_elem(union bpf_att err = bpf_fd_reuseport_array_lookup_elem(map, key, value); } else { rcu_read_lock(); - ptr = map->ops->map_lookup_elem(map, key); + if (map->ops->map_lookup_elem_sys_only) + ptr = map->ops->map_lookup_elem_sys_only(map, key); + else + ptr = map->ops->map_lookup_elem(map, key); if (ptr) memcpy(value, ptr, value_size); rcu_read_unlock();
From: Daniel Borkmann daniel@iogearbox.net
commit 50b045a8c0ccf44f76640ac3eea8d80ca53979a3 upstream.
One of the biggest issues we face right now with picking LRU map over regular hash table is that a map walk out of user space, for example, to just dump the existing entries or to remove certain ones, will completely mess up LRU eviction heuristics and wrong entries such as just created ones will get evicted instead. The reason for this is that we mark an entry as "in use" via bpf_lru_node_set_ref() from system call lookup side as well. Thus upon walk, all entries are being marked, so information of actual least recently used ones are "lost".
In case of Cilium where it can be used (besides others) as a BPF based connection tracker, this current behavior causes disruption upon control plane changes that need to walk the map from user space to evict certain entries. Discussion result from bpfconf [0] was that we should simply just remove marking from system call side as no good use case could be found where it's actually needed there. Therefore this patch removes marking for regular LRU and per-CPU flavor. If there ever should be a need in future, the behavior could be selected via map creation flag, but due to mentioned reason we avoid this here.
[0] http://vger.kernel.org/bpfconf.html
Fixes: 29ba732acbee ("bpf: Add BPF_MAP_TYPE_LRU_HASH") Fixes: 8f8449384ec3 ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/bpf/hashtab.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
--- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -518,18 +518,30 @@ static u32 htab_map_gen_lookup(struct bp return insn - insn_buf; }
-static void *htab_lru_map_lookup_elem(struct bpf_map *map, void *key) +static __always_inline void *__htab_lru_map_lookup_elem(struct bpf_map *map, + void *key, const bool mark) { struct htab_elem *l = __htab_map_lookup_elem(map, key);
if (l) { - bpf_lru_node_set_ref(&l->lru_node); + if (mark) + bpf_lru_node_set_ref(&l->lru_node); return l->key + round_up(map->key_size, 8); }
return NULL; }
+static void *htab_lru_map_lookup_elem(struct bpf_map *map, void *key) +{ + return __htab_lru_map_lookup_elem(map, key, true); +} + +static void *htab_lru_map_lookup_elem_sys(struct bpf_map *map, void *key) +{ + return __htab_lru_map_lookup_elem(map, key, false); +} + static u32 htab_lru_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf) { @@ -1206,6 +1218,7 @@ const struct bpf_map_ops htab_lru_map_op .map_free = htab_map_free, .map_get_next_key = htab_map_get_next_key, .map_lookup_elem = htab_lru_map_lookup_elem, + .map_lookup_elem_sys_only = htab_lru_map_lookup_elem_sys, .map_update_elem = htab_lru_map_update_elem, .map_delete_elem = htab_lru_map_delete_elem, .map_gen_lookup = htab_lru_map_gen_lookup, @@ -1237,7 +1250,6 @@ static void *htab_lru_percpu_map_lookup_
int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value) { - struct bpf_htab *htab = container_of(map, struct bpf_htab, map); struct htab_elem *l; void __percpu *pptr; int ret = -ENOENT; @@ -1253,8 +1265,9 @@ int bpf_percpu_hash_copy(struct bpf_map l = __htab_map_lookup_elem(map, key); if (!l) goto out; - if (htab_is_lru(htab)) - bpf_lru_node_set_ref(&l->lru_node); + /* We do not mark LRU map element here in order to not mess up + * eviction heuristics when user space does a map walk. + */ pptr = htab_elem_get_ptr(l, map->key_size); for_each_possible_cpu(cpu) { bpf_long_memcpy(value + off,
stable-rc/linux-4.19.y boot: 125 boots: 0 failed, 112 passed with 13 offline (v4.19.45-115-g071ff9cc9849)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.19.y/kernel/v4.19... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.19.y/kernel/v4.19.45-115...
Tree: stable-rc Branch: linux-4.19.y Git Describe: v4.19.45-115-g071ff9cc9849 Git Commit: 071ff9cc98498f489abc097471549b19933ba3e2 Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 68 unique boards, 24 SoC families, 14 builds out of 206
Offline Platforms:
arm:
sama5_defconfig: gcc-8 at91-sama5d4_xplained: 1 offline lab
multi_v7_defconfig: gcc-8 alpine-db: 1 offline lab at91-sama5d4_xplained: 1 offline lab socfpga_cyclone5_de0_sockit: 1 offline lab stih410-b2120: 1 offline lab sun5i-r8-chip: 1 offline lab tegra124-jetson-tk1: 1 offline lab
tegra_defconfig: gcc-8 tegra124-jetson-tk1: 1 offline lab
sunxi_defconfig: gcc-8 sun5i-r8-chip: 1 offline lab
bcm2835_defconfig: gcc-8 bcm2835-rpi-b: 1 offline lab
arm64:
defconfig: gcc-8 apq8016-sbc: 1 offline lab juno-r2: 1 offline lab mt7622-rfb1: 1 offline lab
--- For more info write to info@kernelci.org
On Fri, 24 May 2019 at 00:44, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.19.46 release. There are 114 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 25 May 2019 06:15:02 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.46-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.19.46-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.19.y git commit: 071ff9cc98498f489abc097471549b19933ba3e2 git describe: v4.19.45-115-g071ff9cc9849 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.45-11...
No regressions (compared to build v4.19.45)
No fixes (compared to build v4.19.45)
Ran 23173 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libgpiod * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-open-posix-tests * kvm-unit-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On 23/05/2019 20:04, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.46 release. There are 114 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 25 May 2019 06:15:02 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.46-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.19: 12 builds: 12 pass, 0 fail 22 boots: 22 pass, 0 fail 32 tests: 32 pass, 0 fail
Linux version: 4.19.46-rc1-g071ff9c Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On 5/23/19 1:04 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.46 release. There are 114 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat 25 May 2019 06:15:02 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.46-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org