May 2019 - Linux-stable-mirror

FAILED: patch "[PATCH] ipv6: prevent possible fib6 leaks" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 61fb0d01680771f72cc9d39783fb2c122aaad51e Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Wed, 15 May 2019 19:39:52 -0700
Subject: [PATCH] ipv6: prevent possible fib6 leaks

At ipv6 route dismantle, fib6_drop_pcpu_from() is responsible
for finding all percpu routes and set their ->from pointer
to NULL, so that fib6_ref can reach its expected value (1).

The problem right now is that other cpus can still catch the
route being deleted, since there is no rcu grace period
between the route deletion and call to fib6_drop_pcpu_from()

This can leak the fib6 and associated resources, since no
notifier will take care of removing the last reference(s).

I decided to add another boolean (fib6_destroying) instead
of reusing/renaming exception_bucket_flushed to ease stable backports,
and properly document the memory barriers used to implement this fix.

This patch has been co-developed with Wei Wang.

Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Cc: Wei Wang <weiwan(a)google.com>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Martin Lau <kafai(a)fb.com>
Acked-by: Wei Wang <weiwan(a)google.com>
Acked-by: Martin KaFai Lau <kafai(a)fb.com>
Reviewed-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 40105738e2f6..525f701653ca 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -167,7 +167,8 @@ struct fib6_info {
 					dst_nocount:1,
 					dst_nopolicy:1,
 					dst_host:1,
-					unused:3;
+					fib6_destroying:1,
+					unused:2;
 
 	struct fib6_nh			fib6_nh;
 	struct rcu_head			rcu;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 08e0390e001c..008421b550c6 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -904,6 +904,12 @@ static void fib6_drop_pcpu_from(struct fib6_info *f6i,
 {
 	int cpu;
 
+	/* Make sure rt6_make_pcpu_route() wont add other percpu routes
+	 * while we are cleaning them here.
+	 */
+	f6i->fib6_destroying = 1;
+	mb(); /* paired with the cmpxchg() in rt6_make_pcpu_route() */
+
 	/* release the reference to this fib entry from
 	 * all of its cached pcpu routes
 	 */
@@ -927,6 +933,9 @@ static void fib6_purge_rt(struct fib6_info *rt, struct fib6_node *fn,
 {
 	struct fib6_table *table = rt->fib6_table;
 
+	if (rt->rt6i_pcpu)
+		fib6_drop_pcpu_from(rt, table);
+
 	if (refcount_read(&rt->fib6_ref) != 1) {
 		/* This route is used as dummy address holder in some split
 		 * nodes. It is not leaked, but it still holds other resources,
@@ -948,9 +957,6 @@ static void fib6_purge_rt(struct fib6_info *rt, struct fib6_node *fn,
 			fn = rcu_dereference_protected(fn->parent,
 				    lockdep_is_held(&table->tb6_lock));
 		}
-
-	if (rt->rt6i_pcpu)
-		fib6_drop_pcpu_from(rt, table);
 	}
 }
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 23a20d62daac..27c0cc5d9d30 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1295,6 +1295,13 @@ static struct rt6_info *rt6_make_pcpu_route(struct net *net,
 	prev = cmpxchg(p, NULL, pcpu_rt);
 	BUG_ON(prev);
 
+	if (res->f6i->fib6_destroying) {
+		struct fib6_info *from;
+
+		from = xchg((__force struct fib6_info **)&pcpu_rt->from, NULL);
+		fib6_info_release(from);
+	}
+
 	return pcpu_rt;
 }
 
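
The flag-plus-barrier handshake described in the commit message can be pictured with a small userspace sketch. This is not the kernel code: it uses C11 atomics in place of mb(), cmpxchg() and xchg(), and every name in it (entry, cached, drop_cached_from, make_cached, final_ref) is hypothetical. It only illustrates why the reference count returns to 1 whichever side runs first: the teardown side raises the flag before scanning, the lookup side publishes before re-checking the flag, and both clear the back-pointer with an exchange so the reference is dropped exactly once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for fib6_info. */
struct entry {
	atomic_int  ref;         /* plays the role of fib6_ref */
	atomic_bool destroying;  /* plays the role of fib6_destroying */
};

/* Hypothetical stand-in for a cached per-cpu route. */
struct cached {
	_Atomic(struct entry *) from;  /* back-pointer holding one reference */
};

static void entry_put(struct entry *e)
{
	if (e)
		atomic_fetch_sub(&e->ref, 1);
}

/* Teardown side: raise the flag first, then drop what is already cached. */
static void drop_cached_from(struct entry *e, _Atomic(struct cached *) *slot)
{
	atomic_store(&e->destroying, true);
	atomic_thread_fence(memory_order_seq_cst);  /* stands in for mb() */

	struct cached *c = atomic_load(slot);
	if (c)
		entry_put(atomic_exchange(&c->from, NULL));
}

/* Lookup side: publish the cached object, then re-check the flag. */
static void make_cached(struct entry *e, _Atomic(struct cached *) *slot,
			struct cached *c)
{
	struct cached *expected = NULL;
	bool published;

	atomic_fetch_add(&e->ref, 1);  /* reference held via c->from */
	atomic_store(&c->from, e);
	published = atomic_compare_exchange_strong(slot, &expected, c);
	assert(published);             /* the slot is expected to be empty */

	if (atomic_load(&e->destroying))
		entry_put(atomic_exchange(&c->from, NULL));
}

/* Run both sides in one of the two possible orders; return the final count. */
int final_ref(bool teardown_first)
{
	struct entry e;
	struct cached c;
	_Atomic(struct cached *) slot;

	atomic_init(&e.ref, 1);
	atomic_init(&e.destroying, false);
	atomic_init(&c.from, NULL);
	atomic_init(&slot, NULL);

	if (teardown_first) {
		drop_cached_from(&e, &slot);
		make_cached(&e, &slot, &c);
	} else {
		make_cached(&e, &slot, &c);
		drop_cached_from(&e, &slot);
	}
	return atomic_load(&e.ref);
}
```

The sketch runs the two sides sequentially, so it shows the bookkeeping rather than the race itself; in the patch the concurrent case relies on mb() being paired with the cmpxchg().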


[PATCH 5.0 000/123] 5.0.18-stable review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 5.0.18 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Wed 22 May 2019 11:50:46 AM UTC. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.18-rc1… or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below. thanks, greg k-h ------------- Pseudo-Shortlog of commits: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 5.0.18-rc1 Andreas Dilger <adilger(a)dilger.ca> ext4: don't update s_rev_level if not required zhangyi (F) <yi.zhang(a)huawei.com> ext4: fix compile error when using BUFFER_TRACE Theodore Ts'o <tytso(a)mit.edu> ext4: fix block validity checks for journal inodes using indirect blocks Colin Ian King <colin.king(a)canonical.com> ext4: unsigned int compared against zero Martin Schwidefsky <schwidefsky(a)de.ibm.com> s390/mm: convert to the generic get_user_pages_fast code Martin Schwidefsky <schwidefsky(a)de.ibm.com> s390/mm: make the pxd_offset functions more robust Eric Dumazet <edumazet(a)google.com> iov_iter: optimize page_copy_sane() Dan Williams <dan.j.williams(a)intel.com> libnvdimm/namespace: Fix label tracking error Roger Pau Monne <roger.pau(a)citrix.com> xen/pvh: correctly setup the PV EFI interface for dom0 Roger Pau Monne <roger.pau(a)citrix.com> xen/pvh: set xen_domain_type to HVM in xen_pvh_init Masahiro Yamada <yamada.masahiro(a)socionext.com> kbuild: turn auto.conf.cmd into a mandatory include file Sean Christopherson <sean.j.christopherson(a)intel.com> KVM: lapic: Busy wait for timer to expire when using hv_timer Sean Christopherson <sean.j.christopherson(a)intel.com> KVM: x86: Skip EFER vs. 
guest CPUID checks for host-initiated writes Peter Xu <peterx(a)redhat.com> KVM: Fix the bitmap range to copy during clear dirty Chengguang Xu <cgxu519(a)gmail.com> jbd2: fix potential double free Michał Wadowski <wadosm(a)gmail.com> ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug Kailang Yang <kailang(a)realtek.com> ALSA: hda/realtek - Fixup headphone noise via runtime suspend Jeremy Soller <jeremy(a)system76.com> ALSA: hda/realtek - Corrected fixup for System76 Gazelle (gaze14) Jan Kara <jack(a)suse.cz> ext4: avoid panic during forced reboot due to aborted journal Sahitya Tummala <stummala(a)codeaurora.org> ext4: fix use-after-free in dx_release() Lukas Czerner <lczerner(a)redhat.com> ext4: fix data corruption caused by overlapping unaligned and aligned IO Sriram Rajagopalan <sriramr(a)arista.com> ext4: zero out the unused memory region in the extent tree block Anup Patel <Anup.Patel(a)wdc.com> tty: Don't force RISCV SBI console as preferred console Jiufei Xue <jiufei.xue(a)linux.alibaba.com> fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount Eric Biggers <ebiggers(a)google.com> crypto: ccm - fix incompatibility between "ccm" and "ccm_base" Kamlakant Patel <kamlakantp(a)marvell.com> ipmi:ssif: compare block number correctly for multi-part return messages Coly Li <colyli(a)suse.de> bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim() Liang Chen <liangchen.linux(a)gmail.com> bcache: fix a race between cache register and cacheset unregister Filipe Manana <fdmanana(a)suse.com> Btrfs: fix race between send and deduplication that lead to failures and crashes Filipe Manana <fdmanana(a)suse.com> Btrfs: do not start a transaction at iterate_extent_inodes() Filipe Manana <fdmanana(a)suse.com> Btrfs: do not start a transaction during fiemap Filipe Manana <fdmanana(a)suse.com> Btrfs: send, flush dellaloc in order to avoid data loss Nikolay Borisov <nborisov(a)suse.com> btrfs: Honour 
FITRIM range constraints during free space trim Nikolay Borisov <nborisov(a)suse.com> btrfs: Correctly free extent buffer in case btree_read_extent_buffer_pages fails Qu Wenruo <wqu(a)suse.com> btrfs: Check the first key and level for cached extent buffer Debabrata Banerjee <dbanerje(a)akamai.com> ext4: fix ext4_show_options for file systems w/o journal Kirill Tkhai <ktkhai(a)virtuozzo.com> ext4: actually request zeroing of inode table after grow Barret Rhoden <brho(a)google.com> ext4: fix use-after-free race with debug_want_extra_isize Pan Bian <bianpan2016(a)163.com> ext4: avoid drop reference to iloc.bh twice Theodore Ts'o <tytso(a)mit.edu> ext4: ignore e_value_offs for xattrs with value-in-ea-inode Theodore Ts'o <tytso(a)mit.edu> ext4: protect journal inode's blocks using block_validity Jan Kara <jack(a)suse.cz> ext4: make sanity check in mballoc more strict Jiufei Xue <jiufei.xue(a)linux.alibaba.com> jbd2: check superblock mapped prior to committing Sergei Trofimovich <slyfox(a)gentoo.org> tty/vt: fix write/write race in ioctl(KDSKBSENT) handler Yifeng Li <tomli(a)tomli.me> tty: vt.c: Fix TIOCL_BLANKSCREEN console blanking if blankinterval == 0 Chris Packham <chris.packham(a)alliedtelesis.co.nz> mtd: maps: Allow MTD_PHYSMAP with MTD_RAM Chris Packham <chris.packham(a)alliedtelesis.co.nz> mtd: maps: physmap: Store gpio_values correctly Alexander Sverdlin <alexander.sverdlin(a)nokia.com> mtd: spi-nor: intel-spi: Avoid crossing 4K address boundary on read/write Dmitry Osipenko <digetx(a)gmail.com> mfd: max77620: Fix swapped FPS_PERIOD_MAX_US values Steve Twiss <stwiss.opensource(a)diasemi.com> mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L Rajat Jain <rajatja(a)google.com> ACPI: PM: Set enable_for_wake for wakeup GPEs during suspend-to-idle Andrea Arcangeli <aarcange(a)redhat.com> userfaultfd: use RCU to free the task struct when fork fails Shuning Zhang <sunny.s.zhang(a)oracle.com> ocfs2: fix ocfs2 read inode data panic in 
ocfs2_iget Mike Kravetz <mike.kravetz(a)oracle.com> hugetlb: use same fault hash key for shared and private mappings Kai Shen <shenkai8(a)huawei.com> mm/hugetlb.c: don't put_page in lock of hugetlb_lock Dan Williams <dan.j.williams(a)intel.com> mm/huge_memory: fix vmf_insert_pfn_{pmd, pud}() crash, handle unaligned addresses Jiri Kosina <jkosina(a)suse.cz> mm/mincore.c: make mincore() more conservative Ofir Drang <ofir.drang(a)arm.com> crypto: ccree - handle tee fips error during power management resume Ofir Drang <ofir.drang(a)arm.com> crypto: ccree - add function to handle cryptocell tee fips error Ofir Drang <ofir.drang(a)arm.com> crypto: ccree - HOST_POWER_DOWN_EN should be the last CC access during suspend Ofir Drang <ofir.drang(a)arm.com> crypto: ccree - pm resume first enable the source clk Gilad Ben-Yossef <gilad(a)benyossef.com> crypto: ccree - don't map AEAD key and IV on stack Gilad Ben-Yossef <gilad(a)benyossef.com> crypto: ccree - use correct internal state sizes for export Gilad Ben-Yossef <gilad(a)benyossef.com> crypto: ccree - don't map MAC key on stack Gilad Ben-Yossef <gilad(a)benyossef.com> crypto: ccree - fix mem leak on error path Gilad Ben-Yossef <gilad(a)benyossef.com> crypto: ccree - remove special handling of chained sg Daniel Borkmann <daniel(a)iogearbox.net> bpf, arm64: remove prefetch insn in xadd mapping Libin Yang <libin.yang(a)intel.com> ASoC: codec: hdac_hdmi add device_link to card device S.j. 
Wang <shengjiu.wang(a)nxp.com> ASoC: fsl_esai: Fix missing break in switch statement Curtis Malainey <cujomalainey(a)chromium.org> ASoC: RT5677-SPI: Disable 16Bit SPI Transfers Jon Hunter <jonathanh(a)nvidia.com> ASoC: max98090: Fix restore of DAPM Muxes Jeremy Soller <jeremy(a)system76.com> ALSA: hdea/realtek - Headset fixup for System76 Gazelle (gaze14) Kailang Yang <kailang(a)realtek.com> ALSA: hda/realtek - EAPD turn on later Hui Wang <hui.wang(a)canonical.com> ALSA: hda/hdmi - Consider eld_valid when reporting jack event Hui Wang <hui.wang(a)canonical.com> ALSA: hda/hdmi - Read the pin sense from register when repolling Wenwen Wang <wang6495(a)umn.edu> ALSA: usb-audio: Fix a memory leak bug Takashi Iwai <tiwai(a)suse.de> ALSA: line6: toneport: Fix broken usage of timer for delayed execution Adrian Hunter <adrian.hunter(a)intel.com> mmc: sdhci-pci: Fix BYT OCP setting Raul E Rangel <rrangel(a)chromium.org> mmc: core: Fix tag set memory leak Sowjanya Komatineni <skomatineni(a)nvidia.com> mmc: tegra: fix ddr signaling for non-ddr modes Eric Biggers <ebiggers(a)google.com> crypto: arm64/aes-neonbs - don't access already-freed walk.iv Eric Biggers <ebiggers(a)google.com> crypto: arm/aes-neonbs - don't access already-freed walk.iv Horia Geantă <horia.geanta(a)nxp.com> crypto: caam/qi2 - generate hash keys in-place Horia Geantă <horia.geanta(a)nxp.com> crypto: caam/qi2 - fix DMA mapping of stack memory Horia Geantă <horia.geanta(a)nxp.com> crypto: caam/qi2 - fix zero-length buffer DMA mapping Zhang Zhijie <zhangzj(a)rock-chips.com> crypto: rockchip - update IV buffer to contain the next IV Eric Biggers <ebiggers(a)google.com> crypto: gcm - fix incompatibility between "gcm" and "gcm_base" Eric Biggers <ebiggers(a)google.com> crypto: arm64/gcm-aes-ce - fix no-NEON fallback code Eric Biggers <ebiggers(a)google.com> crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest() Eric Biggers <ebiggers(a)google.com> crypto: crct10dif-generic - fix use via 
crypto_shash_digest() Eric Biggers <ebiggers(a)google.com> crypto: skcipher - don't WARN on unprocessed data after slow walk step Daniel Axtens <dja(a)axtens.net> crypto: vmx - fix copy-paste error in CTR mode Singh, Brijesh <brijesh.singh(a)amd.com> crypto: ccp - Do not free psp_master when PLATFORM_INIT fails Eric Biggers <ebiggers(a)google.com> crypto: chacha20poly1305 - set cra_name correctly Eric Biggers <ebiggers(a)google.com> crypto: chacha-generic - fix use as arm64 no-NEON fallback Eric Biggers <ebiggers(a)google.com> crypto: lrw - don't access already-freed walk.iv Eric Biggers <ebiggers(a)google.com> crypto: salsa20 - don't access already-freed walk.iv Christian Lamparter <chunkeey(a)gmail.com> crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues Christian Lamparter <chunkeey(a)gmail.com> crypto: crypto4xx - fix ctr-aes missing output IV Yazen Ghannam <yazen.ghannam(a)amd.com> x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models Yazen Ghannam <yazen.ghannam(a)amd.com> x86/MCE: Group AMD function prototypes in <asm/mce.h> Shirish S <Shirish.S(a)amd.com> x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk Shirish S <Shirish.S(a)amd.com> x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models Yazen Ghannam <yazen.ghannam(a)amd.com> x86/MCE: Add an MCE-record filtering function Peter Zijlstra <peterz(a)infradead.org> sched/x86: Save [ER]FLAGS on context switch Jean-Philippe Brucker <jean-philippe.brucker(a)arm.com> arm64: Save and restore OSDLR_EL1 across suspend/resume Jean-Philippe Brucker <jean-philippe.brucker(a)arm.com> arm64: Clear OSDLR_EL1 on CPU boot Vincenzo Frascino <vincenzo.frascino(a)arm.com> arm64: compat: Reduce address limit Will Deacon <will.deacon(a)arm.com> arm64: arch_timer: Ensure counter register reads occur with seqlock held Boyang Zhou <zhouby_cn(a)126.com> arm64: mmap: Ensure file offset is treated as unsigned Hans de Goede <hdegoede(a)redhat.com> power: supply: axp288_fuel_gauge: Add 
ACEPC T8 and T11 mini PCs to the blacklist Gustavo A. R. Silva <gustavo(a)embeddedor.com> power: supply: axp288_charger: Fix unchecked return value Wen Yang <wen.yang99(a)zte.com.cn> ARM: exynos: Fix a leaked reference by adding missing of_node_put Christoph Muellner <christoph.muellner(a)theobroma-systems.com> mmc: sdhci-of-arasan: Add DTS property to disable DCMDs. Sylwester Nawrocki <s.nawrocki(a)samsung.com> ARM: dts: exynos: Fix audio (microphone) routing on Odroid XU3 Stuart Menefy <stuart.menefy(a)mathembedded.com> ARM: dts: exynos: Fix interrupt for shared EINTs on Exynos5260 Christian Lamparter <chunkeey(a)gmail.com> ARM: dts: qcom: ipq4019: enlarge PCIe BAR range Christoph Muellner <christoph.muellner(a)theobroma-systems.com> arm64: dts: rockchip: Disable DCMDs on RK3399's eMMC controller. Katsuhiro Suzuki <katsuhiro(a)katsuster.net> arm64: dts: rockchip: fix IO domain voltage setting of APIO5 on rockpro64 Josh Poimboeuf <jpoimboe(a)redhat.com> objtool: Fix function fallthrough detection Andy Lutomirski <luto(a)kernel.org> x86/speculation/mds: Improve CPU buffer clear documentation Andy Lutomirski <luto(a)kernel.org> x86/speculation/mds: Revert CPU buffer clear on double fault exit Waiman Long <longman(a)redhat.com> locking/rwsem: Prevent decrement of reader count before increment ------------- Diffstat: Documentation/x86/mds.rst | 44 +-- Makefile | 6 +- arch/arm/boot/dts/exynos5260.dtsi | 2 +- arch/arm/boot/dts/exynos5422-odroidxu3-audio.dtsi | 2 +- arch/arm/boot/dts/qcom-ipq4019.dtsi | 4 +- arch/arm/crypto/aes-neonbs-glue.c | 2 + arch/arm/mach-exynos/firmware.c | 1 + arch/arm/mach-exynos/suspend.c | 2 + arch/arm64/boot/dts/rockchip/rk3399-rockpro64.dts | 2 +- arch/arm64/boot/dts/rockchip/rk3399.dtsi | 1 + arch/arm64/crypto/aes-neonbs-glue.c | 2 + arch/arm64/crypto/ghash-ce-glue.c | 10 +- arch/arm64/include/asm/arch_timer.h | 33 ++- arch/arm64/include/asm/processor.h | 8 + arch/arm64/kernel/debug-monitors.c | 1 + arch/arm64/kernel/sys.c | 2 +- 
arch/arm64/kernel/vdso/gettimeofday.S | 15 +- arch/arm64/mm/proc.S | 34 +-- arch/arm64/net/bpf_jit.h | 6 - arch/arm64/net/bpf_jit_comp.c | 1 - arch/s390/Kconfig | 1 + arch/s390/include/asm/pgtable.h | 79 ++++-- arch/s390/mm/Makefile | 2 +- arch/s390/mm/gup.c | 300 --------------------- arch/x86/crypto/crct10dif-pclmul_glue.c | 13 +- arch/x86/entry/entry_32.S | 2 + arch/x86/entry/entry_64.S | 2 + arch/x86/include/asm/mce.h | 25 +- arch/x86/include/asm/switch_to.h | 1 + arch/x86/kernel/cpu/mce/amd.c | 62 +++++ arch/x86/kernel/cpu/mce/core.c | 38 +-- arch/x86/kernel/cpu/mce/genpool.c | 3 + arch/x86/kernel/cpu/mce/internal.h | 9 + arch/x86/kernel/process_32.c | 7 + arch/x86/kernel/process_64.c | 8 + arch/x86/kernel/traps.c | 8 - arch/x86/kvm/lapic.c | 2 +- arch/x86/kvm/x86.c | 37 ++- arch/x86/platform/pvh/enlighten.c | 8 +- arch/x86/xen/efi.c | 12 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/enlighten_pvh.c | 7 +- arch/x86/xen/xen-ops.h | 4 +- crypto/ccm.c | 44 ++- crypto/chacha20poly1305.c | 4 +- crypto/chacha_generic.c | 2 +- crypto/crct10dif_generic.c | 11 +- crypto/gcm.c | 34 +-- crypto/lrw.c | 4 +- crypto/salsa20_generic.c | 2 +- crypto/skcipher.c | 9 +- drivers/acpi/sleep.c | 4 + drivers/char/ipmi/ipmi_ssif.c | 6 +- drivers/crypto/amcc/crypto4xx_alg.c | 12 +- drivers/crypto/amcc/crypto4xx_core.c | 31 ++- drivers/crypto/caam/caamalg_qi2.c | 177 ++++++------ drivers/crypto/caam/caamalg_qi2.h | 2 - drivers/crypto/ccp/psp-dev.c | 2 +- drivers/crypto/ccree/cc_aead.c | 11 +- drivers/crypto/ccree/cc_buffer_mgr.c | 113 +++----- drivers/crypto/ccree/cc_driver.h | 1 + drivers/crypto/ccree/cc_fips.c | 23 +- drivers/crypto/ccree/cc_fips.h | 2 + drivers/crypto/ccree/cc_hash.c | 28 +- drivers/crypto/ccree/cc_ivgen.c | 9 +- drivers/crypto/ccree/cc_pm.c | 9 +- drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 25 +- drivers/crypto/vmx/aesp8-ppc.pl | 4 +- drivers/dax/device.c | 6 +- drivers/edac/mce_amd.c | 4 +- drivers/md/bcache/journal.c | 11 +- 
drivers/md/bcache/super.c | 2 +- drivers/mmc/core/queue.c | 1 + drivers/mmc/host/Kconfig | 1 + drivers/mmc/host/sdhci-of-arasan.c | 5 +- drivers/mmc/host/sdhci-pci-core.c | 96 +++++++ drivers/mmc/host/sdhci-tegra.c | 1 + drivers/mtd/maps/Kconfig | 2 +- drivers/mtd/maps/physmap-core.c | 2 + drivers/mtd/spi-nor/intel-spi.c | 8 + drivers/nvdimm/label.c | 29 +- drivers/nvdimm/namespace_devs.c | 15 ++ drivers/nvdimm/nd.h | 4 + drivers/power/supply/axp288_charger.c | 4 + drivers/power/supply/axp288_fuel_gauge.c | 20 ++ drivers/tty/hvc/hvc_riscv_sbi.c | 1 - drivers/tty/vt/keyboard.c | 33 ++- drivers/tty/vt/vt.c | 2 - fs/btrfs/backref.c | 34 ++- fs/btrfs/ctree.c | 10 + fs/btrfs/ctree.h | 6 + fs/btrfs/disk-io.c | 27 +- fs/btrfs/disk-io.h | 3 + fs/btrfs/extent-tree.c | 25 +- fs/btrfs/ioctl.c | 19 +- fs/btrfs/send.c | 62 +++++ fs/dax.c | 6 +- fs/ext4/block_validity.c | 54 ++++ fs/ext4/ext4.h | 6 +- fs/ext4/extents.c | 17 +- fs/ext4/file.c | 7 + fs/ext4/inode.c | 7 +- fs/ext4/ioctl.c | 2 +- fs/ext4/mballoc.c | 2 +- fs/ext4/namei.c | 5 +- fs/ext4/resize.c | 1 + fs/ext4/super.c | 63 +++-- fs/ext4/xattr.c | 2 +- fs/fs-writeback.c | 11 +- fs/hugetlbfs/inode.c | 7 +- fs/jbd2/journal.c | 53 ++-- fs/jbd2/revoke.c | 32 ++- fs/jbd2/transaction.c | 8 +- fs/ocfs2/export.c | 30 ++- include/linux/huge_mm.h | 6 +- include/linux/hugetlb.h | 4 +- include/linux/jbd2.h | 8 +- include/linux/mfd/da9063/registers.h | 6 +- include/linux/mfd/max77620.h | 4 +- kernel/fork.c | 31 ++- kernel/locking/rwsem-xadd.c | 44 ++- lib/iov_iter.c | 17 +- mm/huge_memory.c | 16 +- mm/hugetlb.c | 25 +- mm/mincore.c | 23 +- mm/userfaultfd.c | 3 +- sound/pci/hda/patch_hdmi.c | 11 +- sound/pci/hda/patch_realtek.c | 68 +++-- sound/soc/codecs/hdac_hdmi.c | 11 + sound/soc/codecs/max98090.c | 12 +- sound/soc/codecs/rt5677-spi.c | 35 ++- sound/soc/fsl/fsl_esai.c | 2 +- sound/usb/line6/toneport.c | 16 +- sound/usb/mixer.c | 2 + tools/objtool/check.c | 3 +- virt/kvm/kvm_main.c | 2 +- 136 files changed, 1421 insertions(+), 
1053 deletions(-)


[PATCH 8/8] vmbus: fix subchannel removal

by Stephen Hemminger

From: Dexuan Cui <decui(a)microsoft.com>

commit b5679cebf780c6f1c2451a73bf1842a4409840e7 upstream

The changes to split ring allocation from open/close, broke
the cleanup of subchannels. This resulted in problems using
uio on network devices because the subchannel was left behind
when the network device was unbound.

The cause was in the disconnect logic which used list splice
to move the subchannel list into a local variable. This won't
work because the subchannel list is needed later during the
process of the rescind messages (relid2channel).

The fix is to just leave the subchannel list in place
which is what the original code did. The list is cleaned
up later when the host rescind is processed.

Without the fix, we have a lot of "hang" issues in netvsc when we
try to change the NIC's MTU, set the number of channels, etc.

Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
Cc: stable(a)vger.kernel.org
Signed-off-by: Stephen Hemminger <sthemmin(a)microsoft.com>
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/hv/channel.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 170770339720..8e23ed6ea9ab 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -701,20 +701,12 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
 int vmbus_disconnect_ring(struct vmbus_channel *channel)
 {
 	struct vmbus_channel *cur_channel, *tmp;
-	unsigned long flags;
-	LIST_HEAD(list);
 	int ret;
 
 	if (channel->primary_channel != NULL)
 		return -EINVAL;
 
-	/* Snapshot the list of subchannels */
-	spin_lock_irqsave(&channel->lock, flags);
-	list_splice_init(&channel->sc_list, &list);
-	channel->num_sc = 0;
-	spin_unlock_irqrestore(&channel->lock, flags);
-
-	list_for_each_entry_safe(cur_channel, tmp, &list, sc_list) {
+	list_for_each_entry_safe(cur_channel, tmp, &channel->sc_list, sc_list) {
 		if (cur_channel->rescind)
 			wait_for_completion(&cur_channel->rescind_event);
 
-- 
2.20.1
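
Why the snapshot breaks later lookups can be shown with a tiny userspace list. The sketch below is an illustration under assumptions, not the vmbus code: the list helpers are hand-written stand-ins for the kernel's list API, and chan, subs, link, find_sub and sub_visible_during_disconnect are hypothetical names. Once the subchannels are spliced onto a local head, anything that walks the primary channel's own list, as a relid lookup would, no longer finds them.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node { struct node *next, *prev; };

static void list_init(struct node *h) { h->next = h->prev = h; }

static void list_add_tail(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every element of 'from' onto 'to' and leave 'from' empty. */
static void splice_init(struct node *from, struct node *to)
{
	if (from->next == from)
		return;

	struct node *first = from->next, *last = from->prev;

	first->prev = to;
	last->next = to->next;
	to->next->prev = last;
	to->next = first;
	list_init(from);
}

/* Hypothetical channel: 'subs' is the head on a primary, 'link' the hook on a sub. */
struct chan {
	int relid;
	struct node subs;
	struct node link;
};

/* Lookup by id that walks the primary's own list. */
static struct chan *find_sub(struct chan *primary, int relid)
{
	assert(primary != NULL);
	for (struct node *n = primary->subs.next; n != &primary->subs; n = n->next) {
		struct chan *c = (struct chan *)((char *)n - offsetof(struct chan, link));

		if (c->relid == relid)
			return c;
	}
	return NULL;
}

/* Returns 1 if the subchannel can still be found while disconnect is running. */
int sub_visible_during_disconnect(int snapshot)
{
	struct chan primary = { .relid = 1 }, sub = { .relid = 2 };
	struct node local;

	list_init(&primary.subs);
	list_init(&local);
	list_add_tail(&sub.link, &primary.subs);

	if (snapshot)  /* the removed code path: splice onto a local head */
		splice_init(&primary.subs, &local);

	return find_sub(&primary, 2) != NULL;
}
```

With snapshot set, the lookup comes back empty even though the subchannel still exists, which mirrors the situation the commit message describes for rescind processing.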


[PATCH v2] drm/i915/gvt: Initialize intel_gvt_gtt_entry in stack

by Tina Zhang

Stack struct intel_gvt_gtt_entry value needs to be initialized before
being used, as the fields may contain garbage values.

W/o this patch, set_ggtt_entry prints:
-------------------------------------
274.046840: set_ggtt_entry: vgpu1:set ggtt entry 0x9bed8000ffffe900
274.046846: set_ggtt_entry: vgpu1:set ggtt entry 0xe55df001
274.046852: set_ggtt_entry: vgpu1:set ggtt entry 0x9bed8000ffffe900

0x9bed8000 is the stack garbage.

W/ this patch, set_ggtt_entry prints:
------------------------------------
274.046840: set_ggtt_entry: vgpu1:set ggtt entry 0xffffe900
274.046846: set_ggtt_entry: vgpu1:set ggtt entry 0xe55df001
274.046852: set_ggtt_entry: vgpu1:set ggtt entry 0xffffe900

v2:
- Initialize during declaration. (Zhenyu)

Fixes: 7598e8700e9a(drm/i915/gvt: Missed to cancel dma map for ggtt entries)
Cc: stable(a)vger.kernel.org # v4.20+
Cc: Zhenyu Wang <zhenyuw(a)linux.intel.com>
Signed-off-by: Tina Zhang <tina.zhang(a)intel.com>
---
 drivers/gpu/drm/i915/gvt/gtt.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 15216c5b40aa..ebc1e5228bf5 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -2179,7 +2179,8 @@ static int emulate_ggtt_mmio_write(struct intel_vgpu *vgpu, unsigned int off,
 	struct intel_gvt_gtt_pte_ops *ops = gvt->gtt.pte_ops;
 	unsigned long g_gtt_index = off >> info->gtt_entry_size_shift;
 	unsigned long gma, gfn;
-	struct intel_gvt_gtt_entry e, m;
+	struct intel_gvt_gtt_entry e = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
+	struct intel_gvt_gtt_entry m = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
 	dma_addr_t dma_addr;
 	int ret;
 	struct intel_gvt_partial_pte *partial_pte, *pos, *n;
@@ -2246,7 +2247,8 @@ static int emulate_ggtt_mmio_write(struct intel_vgpu *vgpu, unsigned int off,
 
 	if (!partial_update && (ops->test_present(&e))) {
 		gfn = ops->get_pfn(&e);
-		m = e;
+		m.val64 = e.val64;
+		m.type = e.type;
 
 		/* one PTE update may be issued in multiple writes and the
 		 * first write may not construct a valid gfn
-- 
2.17.1


[PATCH AUTOSEL 4.14 001/167] gfs2: Fix lru_count going negative

by Sasha Levin

From: Ross Lagerwall <ross.lagerwall(a)citrix.com>

[ Upstream commit 7881ef3f33bb80f459ea6020d1e021fc524a6348 ]

Under certain conditions, lru_count may drop below zero resulting in
a large amount of log spam like this:

vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \
    negative objects to delete nr=-1

This happens as follows:
1) A glock is moved from lru_list to the dispose list and lru_count is
   decremented.
2) The dispose function calls cond_resched() and drops the lru lock.
3) Another thread takes the lru lock and tries to add the same glock to
   lru_list, checking if the glock is on an lru list.
4) It is on a list (actually the dispose list) and so it avoids
   incrementing lru_count.
5) The glock is moved to lru_list.
6) The original thread doesn't dispose it because it has been re-added
   to the lru list but the lru_count has still decreased by one.

Fix by checking if the LRU flag is set on the glock rather than checking
if the glock is on some list and rearrange the code so that the LRU flag
is added/removed precisely when the glock is added/removed from lru_list.

Signed-off-by: Ross Lagerwall <ross.lagerwall(a)citrix.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 fs/gfs2/glock.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index d5284d0dbdb59..cd6a64478a026 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -183,15 +183,19 @@ static int demote_ok(const struct gfs2_glock *gl)
 
 void gfs2_glock_add_to_lru(struct gfs2_glock *gl)
 {
+	if (!(gl->gl_ops->go_flags & GLOF_LRU))
+		return;
+
 	spin_lock(&lru_lock);
 
-	if (!list_empty(&gl->gl_lru))
-		list_del_init(&gl->gl_lru);
-	else
+	list_del(&gl->gl_lru);
+	list_add_tail(&gl->gl_lru, &lru_list);
+
+	if (!test_bit(GLF_LRU, &gl->gl_flags)) {
+		set_bit(GLF_LRU, &gl->gl_flags);
 		atomic_inc(&lru_count);
+	}
 
-	list_add_tail(&gl->gl_lru, &lru_list);
-	set_bit(GLF_LRU, &gl->gl_flags);
 	spin_unlock(&lru_lock);
 }
 
@@ -201,7 +205,7 @@ static void gfs2_glock_remove_from_lru(struct gfs2_glock *gl)
 		return;
 
 	spin_lock(&lru_lock);
-	if (!list_empty(&gl->gl_lru)) {
+	if (test_bit(GLF_LRU, &gl->gl_flags)) {
 		list_del_init(&gl->gl_lru);
 		atomic_dec(&lru_count);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1158,8 +1162,7 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 		    !test_bit(GLF_DEMOTE, &gl->gl_flags))
 			fast_path = 1;
 	}
-	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl) &&
-	    (glops->go_flags & GLOF_LRU))
+	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl))
 		gfs2_glock_add_to_lru(gl);
 
 	trace_gfs2_glock_queue(gh, 0);
@@ -1454,6 +1457,7 @@ __acquires(&lru_lock)
 		if (!spin_trylock(&gl->gl_lockref.lock)) {
 add_back_to_lru:
 			list_add(&gl->gl_lru, &lru_list);
+			set_bit(GLF_LRU, &gl->gl_flags);
 			atomic_inc(&lru_count);
 			continue;
 		}
@@ -1461,7 +1465,6 @@ __acquires(&lru_lock)
 			spin_unlock(&gl->gl_lockref.lock);
 			goto add_back_to_lru;
 		}
-		clear_bit(GLF_LRU, &gl->gl_flags);
 		gl->gl_lockref.count++;
 		if (demote_ok(gl))
 			handle_callback(gl, LM_ST_UNLOCKED, 0, false);
@@ -1496,6 +1499,7 @@ static long gfs2_scan_glock_lru(int nr)
 		if (!test_bit(GLF_LOCK, &gl->gl_flags)) {
 			list_move(&gl->gl_lru, &dispose);
 			atomic_dec(&lru_count);
+			clear_bit(GLF_LRU, &gl->gl_flags);
 			freed++;
 			continue;
 		}
-- 
2.20.1
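
The sequence of events in the commit message can be replayed in a single thread to see where the counter drifts. The sketch below is a simplified model, not gfs2 code: glock_sketch, the where enum and the helper names are hypothetical, the lists are reduced to a tag saying which list the object is on, and locking is left out. It contrasts the old test ("is it on any list?") with the new one ("is the LRU flag set?").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum where { NOWHERE, ON_LRU, ON_DISPOSE };

/* Hypothetical, simplified glock: which list it is on, plus the LRU flag. */
struct glock_sketch {
	enum where list;
	bool lru_flag;
};

/* Old rule: count only if the object was on no list at all. */
static void add_to_lru_old(struct glock_sketch *gl, int *lru_count)
{
	if (gl->list == NOWHERE)
		(*lru_count)++;
	gl->list = ON_LRU;
	gl->lru_flag = true;
}

/* New rule: count exactly when the flag goes from clear to set. */
static void add_to_lru_new(struct glock_sketch *gl, int *lru_count)
{
	gl->list = ON_LRU;
	if (!gl->lru_flag) {
		gl->lru_flag = true;
		(*lru_count)++;
	}
}

/* The scan moves the object to the dispose list and uncounts it. */
static void scan_to_dispose(struct glock_sketch *gl, int *lru_count, bool fixed)
{
	gl->list = ON_DISPOSE;
	(*lru_count)--;
	if (fixed)
		gl->lru_flag = false;  /* flag follows lru_list membership */
}

/* Replay: add, scan moves it to dispose, another thread re-adds it. */
int count_after_readd(bool fixed)
{
	struct glock_sketch gl = { NOWHERE, false };
	int lru_count = 0;

	if (fixed)
		add_to_lru_new(&gl, &lru_count);
	else
		add_to_lru_old(&gl, &lru_count);

	scan_to_dispose(&gl, &lru_count, fixed);

	if (fixed)
		add_to_lru_new(&gl, &lru_count);
	else
		add_to_lru_old(&gl, &lru_count);

	assert(gl.list == ON_LRU);  /* the object ends up back on the LRU */
	return lru_count;
}
```

With the old rule the object is back on the LRU while the counter reads 0, so the next removal would take it to -1; with the flag-based rule the counter reads 1.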


[git:media_tree/fixes] media: dvb: warning about dvb frequency limits produces too much noise

by Mauro Carvalho Chehab

This is an automatically generated email to let you know that the following patch was queued:

Subject: media: dvb: warning about dvb frequency limits produces too much noise
Author:  Sean Young <sean(a)mess.org>
Date:    Mon May 20 15:43:49 2019 -0400

This can be a debug message. Favour dev_dbg() over dprintk() as this is
already used much more than dprintk().

dvb_frontend: dvb_frontend_get_frequency_limits: frequency interval: tuner: 45000000...860000000, frontend: 44250000...867250000

Fixes: 00ecd6bc7128 ("media: dvb_frontend: add debug message for frequency intervals")
Cc: <stable(a)vger.kernel.org> # 5.0
Signed-off-by: Sean Young <sean(a)mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>

 drivers/media/dvb-core/dvb_frontend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

---

diff --git a/drivers/media/dvb-core/dvb_frontend.c b/drivers/media/dvb-core/dvb_frontend.c
index fbdb4ecc7c50..7402c9834189 100644
--- a/drivers/media/dvb-core/dvb_frontend.c
+++ b/drivers/media/dvb-core/dvb_frontend.c
@@ -917,7 +917,7 @@ static void dvb_frontend_get_frequency_limits(struct dvb_frontend *fe,
 			 "DVB: adapter %i frontend %u frequency limits undefined - fix the driver\n",
 			 fe->dvb->num, fe->id);
 
-	dprintk("frequency interval: tuner: %u...%u, frontend: %u...%u",
+	dev_dbg(fe->dvb->device, "frequency interval: tuner: %u...%u, frontend: %u...%u",
 		tuner_min, tuner_max, frontend_min, frontend_max);
 
 	/* If the standard is for satellite, convert frequencies to kHz */


[PATCH AUTOSEL 4.4 01/92] gfs2: Fix lru_count going negative

by Sasha Levin

From: Ross Lagerwall <ross.lagerwall(a)citrix.com>

[ Upstream commit 7881ef3f33bb80f459ea6020d1e021fc524a6348 ]

Under certain conditions, lru_count may drop below zero resulting in
a large amount of log spam like this:

vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \
    negative objects to delete nr=-1

This happens as follows:
1) A glock is moved from lru_list to the dispose list and lru_count is
   decremented.
2) The dispose function calls cond_resched() and drops the lru lock.
3) Another thread takes the lru lock and tries to add the same glock to
   lru_list, checking if the glock is on an lru list.
4) It is on a list (actually the dispose list) and so it avoids
   incrementing lru_count.
5) The glock is moved to lru_list.
6) The original thread doesn't dispose it because it has been re-added
   to the lru list but the lru_count has still decreased by one.

Fix by checking if the LRU flag is set on the glock rather than checking
if the glock is on some list and rearrange the code so that the LRU flag
is added/removed precisely when the glock is added/removed from lru_list.

Signed-off-by: Ross Lagerwall <ross.lagerwall(a)citrix.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 fs/gfs2/glock.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 09a0cf5f3dd86..1eb737c466ddc 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -136,22 +136,26 @@ static int demote_ok(const struct gfs2_glock *gl)
 
 void gfs2_glock_add_to_lru(struct gfs2_glock *gl)
 {
+	if (!(gl->gl_ops->go_flags & GLOF_LRU))
+		return;
+
 	spin_lock(&lru_lock);
 
-	if (!list_empty(&gl->gl_lru))
-		list_del_init(&gl->gl_lru);
-	else
+	list_del(&gl->gl_lru);
+	list_add_tail(&gl->gl_lru, &lru_list);
+
+	if (!test_bit(GLF_LRU, &gl->gl_flags)) {
+		set_bit(GLF_LRU, &gl->gl_flags);
 		atomic_inc(&lru_count);
+	}
 
-	list_add_tail(&gl->gl_lru, &lru_list);
-	set_bit(GLF_LRU, &gl->gl_flags);
 	spin_unlock(&lru_lock);
 }
 
 static void gfs2_glock_remove_from_lru(struct gfs2_glock *gl)
 {
 	spin_lock(&lru_lock);
-	if (!list_empty(&gl->gl_lru)) {
+	if (test_bit(GLF_LRU, &gl->gl_flags)) {
 		list_del_init(&gl->gl_lru);
 		atomic_dec(&lru_count);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1040,8 +1044,7 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 		    !test_bit(GLF_DEMOTE, &gl->gl_flags))
 			fast_path = 1;
 	}
-	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl) &&
-	    (glops->go_flags & GLOF_LRU))
+	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl))
 		gfs2_glock_add_to_lru(gl);
 
 	trace_gfs2_glock_queue(gh, 0);
@@ -1341,6 +1344,7 @@ __acquires(&lru_lock)
 		if (!spin_trylock(&gl->gl_lockref.lock)) {
 add_back_to_lru:
 			list_add(&gl->gl_lru, &lru_list);
+			set_bit(GLF_LRU, &gl->gl_flags);
 			atomic_inc(&lru_count);
 			continue;
 		}
@@ -1348,7 +1352,6 @@ __acquires(&lru_lock)
 			spin_unlock(&gl->gl_lockref.lock);
 			goto add_back_to_lru;
 		}
-		clear_bit(GLF_LRU, &gl->gl_flags);
 		gl->gl_lockref.count++;
 		if (demote_ok(gl))
 			handle_callback(gl, LM_ST_UNLOCKED, 0, false);
@@ -1384,6 +1387,7 @@ static long gfs2_scan_glock_lru(int nr)
 		if (!test_bit(GLF_LOCK, &gl->gl_flags)) {
 			list_move(&gl->gl_lru, &dispose);
 			atomic_dec(&lru_count);
+			clear_bit(GLF_LRU, &gl->gl_flags);
 			freed++;
 			continue;
 		}
-- 
2.20.1


[PATCH AUTOSEL 4.19 001/244] gfs2: Fix lru_count going negative

by Sasha Levin

From: Ross Lagerwall <ross.lagerwall(a)citrix.com>

[ Upstream commit 7881ef3f33bb80f459ea6020d1e021fc524a6348 ]

Under certain conditions, lru_count may drop below zero, resulting in
a large amount of log spam like this:

  vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \
      negative objects to delete nr=-1

This happens as follows:
1) A glock is moved from lru_list to the dispose list and lru_count is
   decremented.
2) The dispose function calls cond_resched() and drops the lru lock.
3) Another thread takes the lru lock and tries to add the same glock to
   lru_list, checking if the glock is on an lru list.
4) It is on a list (actually the dispose list) and so it avoids
   incrementing lru_count.
5) The glock is moved to lru_list.
6) The original thread doesn't dispose it because it has been re-added
   to the lru list, but the lru_count has still decreased by one.

Fix by checking if the LRU flag is set on the glock rather than checking
if the glock is on some list, and rearrange the code so that the LRU flag
is added/removed precisely when the glock is added/removed from lru_list.

Signed-off-by: Ross Lagerwall <ross.lagerwall(a)citrix.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 fs/gfs2/glock.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 9d566e62684c2..775256141e9fb 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -183,15 +183,19 @@ static int demote_ok(const struct gfs2_glock *gl)
 
 void gfs2_glock_add_to_lru(struct gfs2_glock *gl)
 {
+	if (!(gl->gl_ops->go_flags & GLOF_LRU))
+		return;
+
 	spin_lock(&lru_lock);
 
-	if (!list_empty(&gl->gl_lru))
-		list_del_init(&gl->gl_lru);
-	else
+	list_del(&gl->gl_lru);
+	list_add_tail(&gl->gl_lru, &lru_list);
+
+	if (!test_bit(GLF_LRU, &gl->gl_flags)) {
+		set_bit(GLF_LRU, &gl->gl_flags);
 		atomic_inc(&lru_count);
+	}
 
-	list_add_tail(&gl->gl_lru, &lru_list);
-	set_bit(GLF_LRU, &gl->gl_flags);
 	spin_unlock(&lru_lock);
 }
 
@@ -201,7 +205,7 @@ static void gfs2_glock_remove_from_lru(struct gfs2_glock *gl)
 		return;
 
 	spin_lock(&lru_lock);
-	if (!list_empty(&gl->gl_lru)) {
+	if (test_bit(GLF_LRU, &gl->gl_flags)) {
 		list_del_init(&gl->gl_lru);
 		atomic_dec(&lru_count);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1158,8 +1162,7 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 		    !test_bit(GLF_DEMOTE, &gl->gl_flags))
 			fast_path = 1;
 	}
-	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl) &&
-	    (glops->go_flags & GLOF_LRU))
+	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl))
 		gfs2_glock_add_to_lru(gl);
 
 	trace_gfs2_glock_queue(gh, 0);
@@ -1455,6 +1458,7 @@ __acquires(&lru_lock)
 		if (!spin_trylock(&gl->gl_lockref.lock)) {
 add_back_to_lru:
 			list_add(&gl->gl_lru, &lru_list);
+			set_bit(GLF_LRU, &gl->gl_flags);
 			atomic_inc(&lru_count);
 			continue;
 		}
@@ -1462,7 +1466,6 @@ __acquires(&lru_lock)
 			spin_unlock(&gl->gl_lockref.lock);
 			goto add_back_to_lru;
 		}
-		clear_bit(GLF_LRU, &gl->gl_flags);
 		gl->gl_lockref.count++;
 		if (demote_ok(gl))
 			handle_callback(gl, LM_ST_UNLOCKED, 0, false);
@@ -1497,6 +1500,7 @@ static long gfs2_scan_glock_lru(int nr)
 		if (!test_bit(GLF_LOCK, &gl->gl_flags)) {
 			list_move(&gl->gl_lru, &dispose);
 			atomic_dec(&lru_count);
+			clear_bit(GLF_LRU, &gl->gl_flags);
 			freed++;
 			continue;
 		}
--
2.20.1


[PATCH AUTOSEL 4.9 001/114] gfs2: Fix lru_count going negative

by Sasha Levin

From: Ross Lagerwall <ross.lagerwall(a)citrix.com>

[ Upstream commit 7881ef3f33bb80f459ea6020d1e021fc524a6348 ]

Under certain conditions, lru_count may drop below zero, resulting in
a large amount of log spam like this:

  vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \
      negative objects to delete nr=-1

This happens as follows:
1) A glock is moved from lru_list to the dispose list and lru_count is
   decremented.
2) The dispose function calls cond_resched() and drops the lru lock.
3) Another thread takes the lru lock and tries to add the same glock to
   lru_list, checking if the glock is on an lru list.
4) It is on a list (actually the dispose list) and so it avoids
   incrementing lru_count.
5) The glock is moved to lru_list.
6) The original thread doesn't dispose it because it has been re-added
   to the lru list, but the lru_count has still decreased by one.

Fix by checking if the LRU flag is set on the glock rather than checking
if the glock is on some list, and rearrange the code so that the LRU flag
is added/removed precisely when the glock is added/removed from lru_list.

Signed-off-by: Ross Lagerwall <ross.lagerwall(a)citrix.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 fs/gfs2/glock.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 7a8b1d72e3d91..efd44d5645d83 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -136,22 +136,26 @@ static int demote_ok(const struct gfs2_glock *gl)
 
 void gfs2_glock_add_to_lru(struct gfs2_glock *gl)
 {
+	if (!(gl->gl_ops->go_flags & GLOF_LRU))
+		return;
+
 	spin_lock(&lru_lock);
 
-	if (!list_empty(&gl->gl_lru))
-		list_del_init(&gl->gl_lru);
-	else
+	list_del(&gl->gl_lru);
+	list_add_tail(&gl->gl_lru, &lru_list);
+
+	if (!test_bit(GLF_LRU, &gl->gl_flags)) {
+		set_bit(GLF_LRU, &gl->gl_flags);
 		atomic_inc(&lru_count);
+	}
 
-	list_add_tail(&gl->gl_lru, &lru_list);
-	set_bit(GLF_LRU, &gl->gl_flags);
 	spin_unlock(&lru_lock);
 }
 
 static void gfs2_glock_remove_from_lru(struct gfs2_glock *gl)
 {
 	spin_lock(&lru_lock);
-	if (!list_empty(&gl->gl_lru)) {
+	if (test_bit(GLF_LRU, &gl->gl_flags)) {
 		list_del_init(&gl->gl_lru);
 		atomic_dec(&lru_count);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1048,8 +1052,7 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 		    !test_bit(GLF_DEMOTE, &gl->gl_flags))
 			fast_path = 1;
 	}
-	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl) &&
-	    (glops->go_flags & GLOF_LRU))
+	if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl))
 		gfs2_glock_add_to_lru(gl);
 
 	trace_gfs2_glock_queue(gh, 0);
@@ -1349,6 +1352,7 @@ __acquires(&lru_lock)
 		if (!spin_trylock(&gl->gl_lockref.lock)) {
 add_back_to_lru:
 			list_add(&gl->gl_lru, &lru_list);
+			set_bit(GLF_LRU, &gl->gl_flags);
 			atomic_inc(&lru_count);
 			continue;
 		}
@@ -1356,7 +1360,6 @@ __acquires(&lru_lock)
 			spin_unlock(&gl->gl_lockref.lock);
 			goto add_back_to_lru;
 		}
-		clear_bit(GLF_LRU, &gl->gl_flags);
 		gl->gl_lockref.count++;
 		if (demote_ok(gl))
 			handle_callback(gl, LM_ST_UNLOCKED, 0, false);
@@ -1392,6 +1395,7 @@ static long gfs2_scan_glock_lru(int nr)
 		if (!test_bit(GLF_LOCK, &gl->gl_flags)) {
 			list_move(&gl->gl_lru, &dispose);
 			atomic_dec(&lru_count);
+			clear_bit(GLF_LRU, &gl->gl_flags);
 			freed++;
 			continue;
 		}
--
2.20.1


+ memcg-make-it-work-on-sparse-non-0-node-systems.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: memcg: make it work on sparse non-0-node systems
has been added to the -mm tree.  Its filename is
     memcg-make-it-work-on-sparse-non-0-node-systems.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcg-make-it-work-on-sparse-non-0…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcg-make-it-work-on-sparse-non-0…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jiri Slaby <jslaby(a)suse.cz>
Subject: memcg: make it work on sparse non-0-node systems

We have a single node system with node 0 disabled:
  Scanning NUMA topology in Northbridge 24
  Number of physical nodes 2
  Skipping disabled node 0
  Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
  NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]

This causes crashes in memcg when the system boots:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  #PF error: [normal kernel read fault]
  ...
  RIP: 0010:list_lru_add+0x94/0x170
  ...
  Call Trace:
   d_lru_add+0x44/0x50
   dput.part.34+0xfc/0x110
   __fput+0x108/0x230
   task_work_run+0x9f/0xc0
   exit_to_usermode_loop+0xf5/0x100

It is reproducible as far back as 4.12.  I did not try older kernels.  You
have to have a new enough systemd, e.g. 241 (the reason is unknown; it was
not investigated).  It cannot be reproduced with systemd 234.

The system crashes because the size of the lru array is never updated in
memcg_update_all_list_lrus, so reads go past the zero-sized array and
dereference random memory.

The root cause is the list_lru_memcg_aware checks in the list_lru code.
The test in list_lru_memcg_aware is broken: it assumes node 0 is always
present, but that is not true on some systems, as can be seen above.

So fix this by avoiding checks on node 0.  Remember the memcg-awareness in
a bool flag in struct list_lru.

Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists")
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Suggested-by: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Raghavendra K T <raghavendra.kt(a)linux.vnet.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 include/linux/list_lru.h |    1 +
 mm/list_lru.c            |    8 +++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

--- a/include/linux/list_lru.h~memcg-make-it-work-on-sparse-non-0-node-systems
+++ a/include/linux/list_lru.h
@@ -54,6 +54,7 @@ struct list_lru {
 #ifdef CONFIG_MEMCG_KMEM
 	struct list_head	list;
 	int			shrinker_id;
+	bool			memcg_aware;
 #endif
 };
 
--- a/mm/list_lru.c~memcg-make-it-work-on-sparse-non-0-node-systems
+++ a/mm/list_lru.c
@@ -37,11 +37,7 @@ static int lru_shrinker_id(struct list_l
 
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
-	/*
-	 * This needs node 0 to be always present, even
-	 * in the systems supporting sparse numa ids.
-	 */
-	return !!lru->node[0].memcg_lrus;
+	return lru->memcg_aware;
 }
 
 static inline struct list_lru_one *
@@ -451,6 +447,8 @@ static int memcg_init_list_lru(struct li
 {
 	int i;
 
+	lru->memcg_aware = memcg_aware;
+
 	if (!memcg_aware)
 		return 0;
 
_

Patches currently in -mm which might be from jslaby(a)suse.cz are

memcg-make-it-work-on-sparse-non-0-node-systems.patch

