On 2024/2/20 13:32, Kairui Song wrote:
> On Tue, Feb 20, 2024 at 12:49 PM Chengming Zhou <zhouchengming(a)bytedance.com> wrote:
>>
>> On 2024/2/20 06:10, Barry Song wrote:
>>> On Mon, Feb 19, 2024 at 9:21 PM Kairui Song <ryncsn(a)gmail.com> wrote:
>>>>
>>>> From: Kairui Song <kasong(a)tencent.com>
>>>>
>>>> When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
>>>> swapin the same entry at the same time, they get different pages (A, B).
>>>> Before one thread (T0) finishes the swapin and installs page (A)
>>>> to the PTE, another thread (T1) could finish swapin of page (B),
>>>> swap_free the entry, then swap out the possibly modified page
>>>> reusing the same entry. It breaks the pte_same check in (T0) because
>>>> the PTE value is unchanged, causing an ABA problem. Thread (T0) will
>>>> install a stale page (A) into the PTE and cause data corruption.
>>>>
>>>> One possible callstack is like this:
>>>>
>>>> CPU0                                 CPU1
>>>> ----                                 ----
>>>> do_swap_page()                       do_swap_page() with same entry
>>>> <direct swapin path>                 <direct swapin path>
>>>> <alloc page A>                       <alloc page B>
>>>> swap_read_folio() <- read to page A  swap_read_folio() <- read to page B
>>>> <slow on later locks or interrupt>   <finished swapin first>
>>>> ..                                   set_pte_at()
>>>>                                      swap_free() <- entry is free
>>>>                                      <write to page B, now page A stale>
>>>>                                      <swap out page B to same swap entry>
>>>> pte_same() <- Check pass, PTE seems
>>>>               unchanged, but page A
>>>>               is stale!
>>>> swap_free() <- page B content lost!
>>>> set_pte_at() <- stale page A installed!
>>>>
>>>> And besides, for ZRAM, swap_free() allows the swap device to discard
>>>> the entry content, so even if page (B) is not modified, if
>>>> swap_read_folio() on CPU0 happens later than swap_free() on CPU1,
>>>> it may also cause data loss.
>>>>
>>>> To fix this, reuse swapcache_prepare(), which will pin the swap entry using
>>>> the cache flag and allow only one thread to swap it in, also preventing
>>>> any parallel code from putting the entry in the cache. Release the pin
>>>> after the page table lock is released.
>>>>
>>>> Racers just loop and wait since it's a rare and very short event.
>>>> A schedule_timeout_uninterruptible(1) call is added to avoid repeated
>>>> page faults wasting too much CPU, causing livelock or adding too much
>>>> noise to perf statistics. A similar livelock issue was described in
>>>> commit 029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
>>>>
>>>> Reproducer:
>>>>
>>>> This race issue can be triggered easily using a well constructed
>>>> reproducer and patched brd (with a delay in read path) [1]:
>>>>
>>>> With the latest 6.8 mainline, race-caused data loss can be observed easily:
>>>> $ gcc -g -lpthread test-thread-swap-race.c && ./a.out
>>>> Polulating 32MB of memory region...
>>>> Keep swapping out...
>>>> Starting round 0...
>>>> Spawning 65536 workers...
>>>> 32746 workers spawned, wait for done...
>>>> Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
>>>> Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
>>>> Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
>>>> Round 0 Failed, 15 data loss!
>>>>
>>>> This reproducer spawns multiple threads sharing the same memory region
>>>> using a small swap device. Every two threads update mapped pages one by
>>>> one in opposite directions trying to create a race, with one dedicated
>>>> thread that keeps swapping the data out using madvise.
>>>>
>>>> The reproducer triggers the race about once every 5 minutes, so the race
>>>> is entirely possible in production.
>>>>
>>>> After this patch, I ran the reproducer for a few hundred rounds and
>>>> observed no data loss.
>>>>
>>>> Performance overhead is minimal, microbenchmark swapin 10G from 32G
>>>> zram:
>>>>
>>>> Before: 10934698 us
>>>> After: 11157121 us
>>>> Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)
>>>>
>>>> Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
>>>> Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1]
>>>> Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
>>>> Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel…
>>>> Signed-off-by: Kairui Song <kasong(a)tencent.com>
>>>> Cc: stable(a)vger.kernel.org
>>>>
>>>> ---
>>>> V3: https://lore.kernel.org/all/20240216095105.14502-1-ryncsn@gmail.com/
>>>> Update from V3:
>>>> - Use schedule_timeout_uninterruptible(1) for now instead of schedule() to
>>>> prevent the busy faulting task from holding the CPU and livelocking [Huang, Ying]
>>>>
>>>> V2: https://lore.kernel.org/all/20240206182559.32264-1-ryncsn@gmail.com/
>>>> Update from V2:
>>>> - Add a schedule() if raced to prevent repeated page faults wasting CPU
>>>> and adding noise to perf statistics.
>>>> - Use a bool to state the special case instead of reusing existing
>>>> variables fixing error handling [Minchan Kim].
>>>>
>>>> V1: https://lore.kernel.org/all/20240205110959.4021-1-ryncsn@gmail.com/
>>>> Update from V1:
>>>> - Add some words on the ZRAM case; it will discard swap content on swap_free,
>>>> so the race window is a bit different but the cure is the same. [Barry Song]
>>>> - Update comments make it cleaner [Huang, Ying]
>>>> - Add a function placeholder to fix the CONFIG_SWAP=n build [SeongJae Park]
>>>> - Update the commit message and summary, refer to SWP_SYNCHRONOUS_IO
>>>> instead of "direct swapin path" [Yu Zhao]
>>>> - Update commit message.
>>>> - Collect Review and Acks.
>>>>
>>>> include/linux/swap.h | 5 +++++
>>>> mm/memory.c | 20 ++++++++++++++++++++
>>>> mm/swap.h | 5 +++++
>>>> mm/swapfile.c | 13 +++++++++++++
>>>> 4 files changed, 43 insertions(+)
>>>>
>>>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>>>> index 4db00ddad261..8d28f6091a32 100644
>>>> --- a/include/linux/swap.h
>>>> +++ b/include/linux/swap.h
>>>> @@ -549,6 +549,11 @@ static inline int swap_duplicate(swp_entry_t swp)
>>>> return 0;
>>>> }
>>>>
>>>> +static inline int swapcache_prepare(swp_entry_t swp)
>>>> +{
>>>> + return 0;
>>>> +}
>>>> +
>>>> static inline void swap_free(swp_entry_t swp)
>>>> {
>>>> }
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index 7e1f4849463a..a99f5e7be9a5 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -3799,6 +3799,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>> struct page *page;
>>>> struct swap_info_struct *si = NULL;
>>>> rmap_t rmap_flags = RMAP_NONE;
>>>> + bool need_clear_cache = false;
>>>> bool exclusive = false;
>>>> swp_entry_t entry;
>>>> pte_t pte;
>>>> @@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>> if (!folio) {
>>>> if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
>>>> __swap_count(entry) == 1) {
>>>> + /*
>>>> + * Prevent parallel swapin from proceeding with
>>>> + * the cache flag. Otherwise, another thread may
>>>> + * finish swapin first, free the entry, and swapout
>>>> + * reusing the same entry. It's undetectable as
>>>> + * pte_same() returns true due to entry reuse.
>>>> + */
>>>> + if (swapcache_prepare(entry)) {
>>>> + /* Relax a bit to prevent rapid repeated page faults */
>>>> + schedule_timeout_uninterruptible(1);
>>>
>>> Not an ideal model; imagine two tasks:
>>>
>>> task A - low priority running on a LITTLE core
>>> task B - high priority and have real-time requirements such as audio,
>>> graphics running on a big core.
>>>
>>> The original code will make B win even if it is a bit later than A, as its CPU is
>>> much faster to finish swap_read_folio(), for example from zRAM. Task B's
>>> swap-in can finish very soon.
>>>
>>> With the patch, B will wait a tick and its real-time performance will be
>>> negatively affected from time to time once low-priority and high-priority
>>> tasks fault in the same PTE and the high-priority task is a bit later than
>>> the low-priority one. This is a kind of priority inversion.
>>>
>>> When we support large folio swap-in, things can get worse. For example,
>>> when swapping in 16 or even more pages in one do_swap_page, the chance that
>>> task A and task B fall within the same range of 16 PTEs will increase,
>>> even though they are not located in the same PTE.
>>>
>>> Please note this is not a blocker for this patch. But I will put the problem
>>> on my list and run some real tests on Android phones later.
>>
>> Good point. Late to the discussion, but I'm wondering why not get an extra reference
>> on the swap entry, instead of swapcache_prepare()? Then the faster thread will
>> succeed, but can't free the swap entry. Later, the slower thread will find the
>> changed pte value and fail, and free the swap entry. Maybe I missed something?
>
> Hi, Chengming
>
> That was my initial approach, but I found a lot of problems with it. Increasing
> the swap count here may race with another swap free and end up increasing
> the swap count of a freed entry.
>
> That can be fixed with audits and new helpers, but there are many other
> potential issues too. One major problem is that after the count bump, racing
> swapin threads will fall back to the cached swapin path. Pages in the swapcache
> can be swapped out without allocating an entry, making the problem we were
> trying to resolve more serious.
Thanks for your clarification! Right, there are many issues I just ignored...
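For reference, the resulting pin/release pattern in do_swap_page() looks
roughly like the sketch below (simplified, error handling omitted; the pin is
dropped through the swapcache_clear() helper the patch adds in mm/swapfile.c,
whose hunk is not part of the quote above):

    if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
        __swap_count(entry) == 1) {
            /* Pin the entry with the cache flag; only one thread proceeds. */
            if (swapcache_prepare(entry)) {
                    /* Lost the race: relax a bit, then retry the fault. */
                    schedule_timeout_uninterruptible(1);
                    goto out;
            }
            need_clear_cache = true;
            /* ... allocate a folio, swap_read_folio(), install the PTE ... */
    }
    ...
    pte_unmap_unlock(vmf->pte, vmf->ptl);
    /* Release the pin only after the page table lock is dropped. */
    if (need_clear_cache)
            swapcache_clear(si, entry);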
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 14db5f64a971fce3d8ea35de4dfc7f443a3efb92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021944-kettle-upturned-4371@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
14db5f64a971 ("zonefs: Improve error handling")
77af13ba3c7f ("zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
db58653ce0c7 ("zonefs: Fix active zone accounting")
7dd12d65ac64 ("zonefs: fix zone report size in __zonefs_io_error()")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 14db5f64a971fce3d8ea35de4dfc7f443a3efb92 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Thu, 8 Feb 2024 17:26:59 +0900
Subject: [PATCH] zonefs: Improve error handling
Write error handling is racy and can sometimes lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value which results in garbage data being readable at the end
of a file. There are 2 problems:
1) zonefs_file_dio_write() updates a zone file write pointer offset
after issuing a direct IO with iomap_dio_rw(). This update is done
only if the IO succeeds for synchronous direct writes. However, for
asynchronous direct writes, the update is done without waiting for
the IO completion so that the next asynchronous IO can be
immediately issued. However, if an asynchronous IO completes with a
failure right before the i_truncate_mutex lock protecting the update,
the update may change the value of the inode write pointer offset
that was corrected by the error path (zonefs_io_error() function).
2) zonefs_io_error() is called when a read or write error occurs. This
function executes a report zone operation using the callback function
zonefs_io_error_cb(), which does all the error recovery handling
based on the current zone condition, write pointer position and
according to the mount options being used. However, depending on the
zoned device being used, a report zone callback may be executed in a
context that is different from the context of __zonefs_io_error(). As
a result, zonefs_io_error_cb() may be executed without the inode
truncate mutex lock held, which can lead to invalid error processing.
Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
direct write is issued with iomap_dio_rw(). This is safe to do as
partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
set) and any failed IO will trigger the execution of zonefs_io_error()
which will correct the inode write pointer offset to reflect the
current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
and call this function directly from __zonefs_io_error() after
obtaining the zone information using blkdev_report_zones() with a
simple callback function that copies to a local stack variable the
struct blk_zone obtained from the device. This ensures that error
handling is performed holding the inode truncate mutex.
This change also simplifies error handling for conventional zone files
by bypassing the execution of report zones entirely. This is safe to
do because the condition of conventional zones cannot be read-only or
offline and conventional zone files are always fully mapped with a
constant file size.
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani(a)oracle.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 6ab2318a9c8e..dba5dcb62bef 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -348,7 +348,12 @@ static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size,
struct zonefs_inode_info *zi = ZONEFS_I(inode);
if (error) {
- zonefs_io_error(inode, true);
+ /*
+ * For Sync IOs, error recovery is called from
+ * zonefs_file_dio_write().
+ */
+ if (!is_sync_kiocb(iocb))
+ zonefs_io_error(inode, true);
return error;
}
@@ -491,6 +496,14 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
ret = -EINVAL;
goto inode_unlock;
}
+ /*
+ * Advance the zone write pointer offset. This assumes that the
+ * IO will succeed, which is OK to do because we do not allow
+ * partial writes (IOMAP_DIO_PARTIAL is not set) and if the IO
+ * fails, the error path will correct the write pointer offset.
+ */
+ z->z_wpoffset += count;
+ zonefs_inode_account_active(inode);
mutex_unlock(&zi->i_truncate_mutex);
}
@@ -504,20 +517,19 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
if (ret == -ENOTBLK)
ret = -EBUSY;
- if (zonefs_zone_is_seq(z) &&
- (ret > 0 || ret == -EIOCBQUEUED)) {
- if (ret > 0)
- count = ret;
-
- /*
- * Update the zone write pointer offset assuming the write
- * operation succeeded. If it did not, the error recovery path
- * will correct it. Also do active seq file accounting.
- */
- mutex_lock(&zi->i_truncate_mutex);
- z->z_wpoffset += count;
- zonefs_inode_account_active(inode);
- mutex_unlock(&zi->i_truncate_mutex);
+ /*
+ * For a failed IO or partial completion, trigger error recovery
+ * to update the zone write pointer offset to a correct value.
+ * For asynchronous IOs, zonefs_file_write_dio_end_io() may already
+ * have executed error recovery if the IO already completed when we
+ * reach here. However, we cannot know that and execute error recovery
+ * again (that will not change anything).
+ */
+ if (zonefs_zone_is_seq(z)) {
+ if (ret > 0 && ret != count)
+ ret = -EIO;
+ if (ret < 0 && ret != -EIOCBQUEUED)
+ zonefs_io_error(inode, true);
}
inode_unlock:
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 93971742613a..b6e8e7c96251 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -246,16 +246,18 @@ static void zonefs_inode_update_mode(struct inode *inode)
z->z_mode = inode->i_mode;
}
-struct zonefs_ioerr_data {
- struct inode *inode;
- bool write;
-};
-
static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
void *data)
{
- struct zonefs_ioerr_data *err = data;
- struct inode *inode = err->inode;
+ struct blk_zone *z = data;
+
+ *z = *zone;
+ return 0;
+}
+
+static void zonefs_handle_io_error(struct inode *inode, struct blk_zone *zone,
+ bool write)
+{
struct zonefs_zone *z = zonefs_inode_zone(inode);
struct super_block *sb = inode->i_sb;
struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
@@ -270,8 +272,8 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
data_size = zonefs_check_zone_condition(sb, z, zone);
isize = i_size_read(inode);
if (!(z->z_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE)) &&
- !err->write && isize == data_size)
- return 0;
+ !write && isize == data_size)
+ return;
/*
* At this point, we detected either a bad zone or an inconsistency
@@ -292,7 +294,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
* In all cases, warn about inode size inconsistency and handle the
* IO error according to the zone condition and to the mount options.
*/
- if (zonefs_zone_is_seq(z) && isize != data_size)
+ if (isize != data_size)
zonefs_warn(sb,
"inode %lu: invalid size %lld (should be %lld)\n",
inode->i_ino, isize, data_size);
@@ -352,8 +354,6 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
zonefs_i_size_write(inode, data_size);
z->z_wpoffset = data_size;
zonefs_inode_account_active(inode);
-
- return 0;
}
/*
@@ -367,23 +367,25 @@ void __zonefs_io_error(struct inode *inode, bool write)
{
struct zonefs_zone *z = zonefs_inode_zone(inode);
struct super_block *sb = inode->i_sb;
- struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
unsigned int noio_flag;
- unsigned int nr_zones = 1;
- struct zonefs_ioerr_data err = {
- .inode = inode,
- .write = write,
- };
+ struct blk_zone zone;
int ret;
/*
- * The only files that have more than one zone are conventional zone
- * files with aggregated conventional zones, for which the inode zone
- * size is always larger than the device zone size.
+ * Conventional zone have no write pointer and cannot become read-only
+ * or offline. So simply fake a report for a single or aggregated zone
+ * and let zonefs_handle_io_error() correct the zone inode information
+ * according to the mount options.
*/
- if (z->z_size > bdev_zone_sectors(sb->s_bdev))
- nr_zones = z->z_size >>
- (sbi->s_zone_sectors_shift + SECTOR_SHIFT);
+ if (!zonefs_zone_is_seq(z)) {
+ zone.start = z->z_sector;
+ zone.len = z->z_size >> SECTOR_SHIFT;
+ zone.wp = zone.start + zone.len;
+ zone.type = BLK_ZONE_TYPE_CONVENTIONAL;
+ zone.cond = BLK_ZONE_COND_NOT_WP;
+ zone.capacity = zone.len;
+ goto handle_io_error;
+ }
/*
* Memory allocations in blkdev_report_zones() can trigger a memory
@@ -394,12 +396,20 @@ void __zonefs_io_error(struct inode *inode, bool write)
* the GFP_NOIO context avoids both problems.
*/
noio_flag = memalloc_noio_save();
- ret = blkdev_report_zones(sb->s_bdev, z->z_sector, nr_zones,
- zonefs_io_error_cb, &err);
- if (ret != nr_zones)
+ ret = blkdev_report_zones(sb->s_bdev, z->z_sector, 1,
+ zonefs_io_error_cb, &zone);
+ memalloc_noio_restore(noio_flag);
+
+ if (ret != 1) {
zonefs_err(sb, "Get inode %lu zone information failed %d\n",
inode->i_ino, ret);
- memalloc_noio_restore(noio_flag);
+ zonefs_warn(sb, "remounting filesystem read-only\n");
+ sb->s_flags |= SB_RDONLY;
+ return;
+ }
+
+handle_io_error:
+ zonefs_handle_io_error(inode, &zone, write);
}
static struct kmem_cache *zonefs_inode_cachep;
Hi Jaegeuk Kim, Chao Yu,
In Debian, the following regression was reported by Dhya after an update
to 6.1.76:
On Wed, Feb 07, 2024 at 10:43:47PM -0500, Dhya wrote:
> Package: src:linux
> Version: 6.1.76-1
> Severity: critical
> Justification: breaks the whole system
>
> Dear Maintainer,
>
> After upgrade to linux-image-6.1.0-18-amd64 6.1.76-1 F2FS filesystem
> fails to mount rw. Message in the boot journal:
>
> kernel: F2FS-fs (nvme0n1p6): invalid zstd compress level: 6
>
> There was recently an f2fs patch to the 6.1 kernel tree which might be
> related: https://www.spinics.net/lists/stable-commits/msg329957.html
>
> Was able to recover the system by doing:
>
> sudo mount -o remount,rw,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,atgc,discard_unit=block,memory=normal /dev/nvme0n1p6 /
>
> under the running bad 6.1.0-18-amd64 kernel, then editing
> /etc/default/grub:
>
> GRUB_DEFAULT="Advanced options for Debian GNU/Linux>Debian GNU/Linux, with Linux 6.1.0-17-amd64"
>
> and running 'update-grub' and rebooting to boot the 6.1.0-17-amd64
> kernel.
The issue is easily reproducible by:
# dd if=/dev/zero of=test.img count=100 bs=1M
# mkfs.f2fs -f -O compression,extra_attr ./test.img
# mount -t f2fs -o compress_algorithm=zstd:6,compress_chksum,atgc,gc_merge,lazytime ./test.img /mnt
resulting in
[ 60.789982] F2FS-fs (loop0): invalid zstd compress level: 6
A bugzilla report has been submitted in
https://bugzilla.kernel.org/show_bug.cgi?id=218471
#regzbot introduced: v6.1.69..v6.1.76
#regzbot link: https://bugs.debian.org/1063422
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218471
Regards,
Salvatore
The patch titled
Subject: sched/numa, mm: do not try to migrate memory to memoryless nodes
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
sched-numa-mm-do-not-try-to-migrate-memory-to-memoryless-nodes.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Byungchul Park <byungchul(a)sk.com>
Subject: sched/numa, mm: do not try to migrate memory to memoryless nodes
Date: Mon, 19 Feb 2024 13:10:47 +0900
With NUMA balancing on, when a NUMA system is running where a NUMA node
doesn't have its local memory, and so has no managed zones, the following
oops has been observed. It happens because wakeup_kswapd() is called with
a wrong zone index, -1, when a migration to the memoryless node is
attempted.
> BUG: unable to handle page fault for address: 00000000000033f3
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 2 PID: 895 Comm: masim Not tainted 6.6.0-dirty #255
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> RIP: 0010:wakeup_kswapd (./linux/mm/vmscan.c:7812)
> Code: (omitted)
> RSP: 0000:ffffc90004257d58 EFLAGS: 00010286
> RAX: ffffffffffffffff RBX: ffff88883fff0480 RCX: 0000000000000003
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88883fff0480
> RBP: ffffffffffffffff R08: ff0003ffffffffff R09: ffffffffffffffff
> R10: ffff888106c95540 R11: 0000000055555554 R12: 0000000000000003
> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88883fff0940
> FS: 00007fc4b8124740(0000) GS:ffff888827c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000033f3 CR3: 000000026cc08004 CR4: 0000000000770ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? __die
> ? page_fault_oops
> ? __pte_offset_map_lock
> ? exc_page_fault
> ? asm_exc_page_fault
> ? wakeup_kswapd
> migrate_misplaced_page
> __handle_mm_fault
> handle_mm_fault
> do_user_addr_fault
> exc_page_fault
> asm_exc_page_fault
> RIP: 0033:0x55b897ba0808
> Code: (omitted)
> RSP: 002b:00007ffeefa821a0 EFLAGS: 00010287
> RAX: 000055b89983acd0 RBX: 00007ffeefa823f8 RCX: 000055b89983acd0
> RDX: 00007fc2f8122010 RSI: 0000000000020000 RDI: 000055b89983acd0
> RBP: 00007ffeefa821a0 R08: 0000000000000037 R09: 0000000000000075
> R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
> R13: 00007ffeefa82410 R14: 000055b897ba5dd8 R15: 00007fc4b8340000
> </TASK>
Fix this by avoiding any attempt to migrate memory to memoryless nodes.
Link: https://lkml.kernel.org/r/20240219041920.1183-1-byungchul@sk.com
Link: https://lkml.kernel.org/r/20240216111502.79759-1-byungchul@sk.com
Fixes: c574bbe917036 ("NUMA balancing: optimize page placement for memory tiering system")
Signed-off-by: Byungchul Park <byungchul(a)sk.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Phil Auld <pauld(a)redhat.com>
Cc: Benjamin Segall <bsegall(a)google.com>
Cc: Daniel Bristot de Oliveira <bristot(a)redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sched/fair.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/sched/fair.c~sched-numa-mm-do-not-try-to-migrate-memory-to-memoryless-nodes
+++ a/kernel/sched/fair.c
@@ -1831,6 +1831,12 @@ bool should_numa_migrate_memory(struct t
int last_cpupid, this_cpupid;
/*
+ * Cannot migrate to memoryless nodes.
+ */
+ if (!node_state(dst_nid, N_MEMORY))
+ return false;
+
+ /*
* The pages in slow memory node should be migrated according
* to hot/cold instead of private/shared.
*/
_
Patches currently in -mm which might be from byungchul(a)sk.com are
sched-numa-mm-do-not-try-to-migrate-memory-to-memoryless-nodes.patch
mm-vmscan-dont-turn-on-cache_trim_mode-at-the-highest-scan-priority.patch
While mq_perf_tests runs with the default kselftest timeout limit, which
is 45 seconds, the test takes about 60 seconds to complete on i3.metal
AWS instances. Hence, the test always times out. Increase the timeout
to 180 seconds.
Link: https://lore.kernel.org/r/20240208212925.68286-1-sj@kernel.org
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
---
Changes from v1
(https://lore.kernel.org/r/20240208212925.68286-1-sj@kernel.org)
- Use 180 seconds timeout instead of 100 seconds
tools/testing/selftests/mqueue/setting | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting
new file mode 100644
index 000000000000..a953c96aa16e
--- /dev/null
+++ b/tools/testing/selftests/mqueue/setting
@@ -0,0 +1 @@
+timeout=180
--
2.39.2
The patch titled
Subject: mm/damon/lru_sort: fix quota status loss due to online tunings
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-lru_sort-fix-quota-status-loss-due-to-online-tunings.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/lru_sort: fix quota status loss due to online tunings
Date: Fri, 16 Feb 2024 11:40:25 -0800
For online parameter changes, DAMON_LRU_SORT creates new schemes based on
the latest values of the parameters and replaces the old schemes with the
new ones. When creating them, the internal status of the quotas of the old
schemes is not preserved. As a result, charging of the quota starts from
zero after the online tuning. The data collected to estimate the
throughput of the scheme's action is also reset, and therefore the
estimation has to start from scratch again. Because the throughput
estimation is used to convert the time quota to the effective size
quota, this could result in temporary time quota inaccuracy. It would be
recovered over time, though. In short, the quota accuracy could be
temporarily degraded after an online parameter update.
Fix the problem by checking for this case and copying the internal fields
of the quota status.
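For context, the effective size quota that depends on these charged fields is
computed roughly as below (a simplified sketch of DAMON's core logic, not the
verbatim code; quota->ms and esz are the time quota and effective size quota):

    /* rough sketch: converting the time quota into an effective size quota */
    if (quota->total_charged_ns)
            throughput = quota->total_charged_sz * 1000000 /
                    quota->total_charged_ns;        /* bytes per millisecond */
    else
            throughput = PAGE_SIZE * 1024;          /* initial guess, no data yet */
    esz = throughput * quota->ms;                   /* effective size quota in bytes */

Resetting total_charged_sz and total_charged_ns therefore discards the measured
throughput and restarts the estimation from the initial guess.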
Link: https://lkml.kernel.org/r/20240216194025.9207-3-sj@kernel.org
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 43 +++++++++++++++++++++++++++++++++++-------
1 file changed, 36 insertions(+), 7 deletions(-)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-fix-quota-status-loss-due-to-online-tunings
+++ a/mm/damon/lru_sort.c
@@ -185,9 +185,21 @@ static struct damos *damon_lru_sort_new_
return damon_lru_sort_new_scheme(&pattern, DAMOS_LRU_DEPRIO);
}
+static void damon_lru_sort_copy_quota_status(struct damos_quota *dst,
+ struct damos_quota *src)
+{
+ dst->total_charged_sz = src->total_charged_sz;
+ dst->total_charged_ns = src->total_charged_ns;
+ dst->charged_sz = src->charged_sz;
+ dst->charged_from = src->charged_from;
+ dst->charge_target_from = src->charge_target_from;
+ dst->charge_addr_from = src->charge_addr_from;
+}
+
static int damon_lru_sort_apply_parameters(void)
{
- struct damos *scheme;
+ struct damos *scheme, *hot_scheme, *cold_scheme;
+ struct damos *old_hot_scheme = NULL, *old_cold_scheme = NULL;
unsigned int hot_thres, cold_thres;
int err = 0;
@@ -195,18 +207,35 @@ static int damon_lru_sort_apply_paramete
if (err)
return err;
+ damon_for_each_scheme(scheme, ctx) {
+ if (!old_hot_scheme) {
+ old_hot_scheme = scheme;
+ continue;
+ }
+ old_cold_scheme = scheme;
+ }
+
hot_thres = damon_max_nr_accesses(&damon_lru_sort_mon_attrs) *
hot_thres_access_freq / 1000;
- scheme = damon_lru_sort_new_hot_scheme(hot_thres);
- if (!scheme)
+ hot_scheme = damon_lru_sort_new_hot_scheme(hot_thres);
+ if (!hot_scheme)
return -ENOMEM;
- damon_set_schemes(ctx, &scheme, 1);
+ if (old_hot_scheme)
+ damon_lru_sort_copy_quota_status(&hot_scheme->quota,
+ &old_hot_scheme->quota);
cold_thres = cold_min_age / damon_lru_sort_mon_attrs.aggr_interval;
- scheme = damon_lru_sort_new_cold_scheme(cold_thres);
- if (!scheme)
+ cold_scheme = damon_lru_sort_new_cold_scheme(cold_thres);
+ if (!cold_scheme) {
+ damon_destroy_scheme(hot_scheme);
return -ENOMEM;
- damon_add_scheme(ctx, scheme);
+ }
+ if (old_cold_scheme)
+ damon_lru_sort_copy_quota_status(&cold_scheme->quota,
+ &old_cold_scheme->quota);
+
+ damon_set_schemes(ctx, &hot_scheme, 1);
+ damon_add_scheme(ctx, cold_scheme);
return damon_set_region_biggest_system_ram_default(target,
&monitor_region_start,
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-check-apply-interval-in-damon_do_apply_schemes.patch
mm-damon-sysfs-schemes-handle-schemes-sysfs-dir-removal-before-commit_schemes_quota_goals.patch
mm-damon-reclaim-fix-quota-stauts-loss-due-to-online-tunings.patch
mm-damon-lru_sort-fix-quota-status-loss-due-to-online-tunings.patch
docs-admin-guide-mm-damon-usage-use-sysfs-interface-for-tracepoints-example.patch
mm-damon-rename-config_damon_dbgfs-to-damon_dbgfs_deprecated.patch
mm-damon-dbgfs-implement-deprecation-notice-file.patch
mm-damon-dbgfs-make-debugfs-interface-deprecation-message-a-macro.patch
docs-admin-guide-mm-damon-usage-document-deprecated-file-of-damon-debugfs-interface.patch
selftets-damon-prepare-for-monitor_on-file-renaming.patch
mm-damon-dbgfs-rename-monitor_on-file-to-monitor_on_deprecated.patch
docs-admin-guide-mm-damon-usage-update-for-monitor_on-renaming.patch
docs-translations-damon-usage-update-for-monitor_on-renaming.patch
mm-damon-sysfs-handle-state-file-inputs-for-every-sampling-interval-if-possible.patch
selftests-damon-_damon_sysfs-support-damos-quota.patch
selftests-damon-_damon_sysfs-support-damos-stats.patch
selftests-damon-_damon_sysfs-support-damos-apply-interval.patch
selftests-damon-add-a-test-for-damos-quota.patch
selftests-damon-add-a-test-for-damos-apply-intervals.patch
selftests-damon-add-a-test-for-a-race-between-target_ids_read-and-dbgfs_before_terminate.patch
selftests-damon-add-a-test-for-the-pid-leak-of-dbgfs_target_ids_write.patch
selftests-damon-_chk_dependency-get-debugfs-mount-point-from-proc-mounts.patch
docs-mm-damon-maintainer-profile-fix-reference-links-for-mm-stable-tree.patch
docs-mm-damon-move-the-list-of-damos-actions-to-design-doc.patch
docs-mm-damon-move-damon-operation-sets-list-from-the-usage-to-the-design-document.patch
docs-mm-damon-move-monitoring-target-regions-setup-detail-from-the-usage-to-the-design-document.patch
docs-admin-guide-mm-damon-usage-fix-wrong-quotas-diabling-condition.patch
mm-damon-core-set-damos_quota-esz-as-public-field-and-document.patch
mm-damon-sysfs-schemes-implement-quota-effective_bytes-file.patch
mm-damon-sysfs-implement-a-kdamond-command-for-updating-schemes-effective-quotas.patch
docs-abi-damon-document-effective_bytes-sysfs-file.patch
docs-admin-guide-mm-damon-usage-document-effective_bytes-file.patch
mm-damon-move-comments-and-fields-for-damos-quota-prioritization-to-the-end.patch
mm-damon-core-split-out-quota-goal-related-fields-to-a-struct.patch
mm-damon-core-add-multiple-goals-per-damos_quota-and-helpers-for-those.patch
mm-damon-sysfs-use-only-quota-goals.patch
mm-damon-core-remove-goal-field-of-damos_quota.patch
mm-damon-core-let-goal-specified-with-only-target-and-current-values.patch
mm-damon-core-support-multiple-metrics-for-quota-goal.patch
mm-damon-core-implement-psi-metric-damos-quota-goal.patch
mm-damon-sysfs-schemes-support-psi-based-quota-auto-tune.patch
docs-mm-damon-design-document-quota-goal-self-tuning.patch
docs-abi-damon-document-quota-goal-metric-file.patch
docs-admin-guide-mm-damon-usage-document-quota-goal-metric-file.patch
mm-damon-reclaim-implement-user-feedback-driven-quota-auto-tuning.patch
mm-damon-reclaim-implement-memory-psi-driven-quota-self-tuning.patch
docs-admin-guide-mm-damon-reclaim-document-auto-tuning-parameters.patch
The patch titled
Subject: mm/damon/reclaim: fix quota stauts loss due to online tunings
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-reclaim-fix-quota-stauts-loss-due-to-online-tunings.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/reclaim: fix quota stauts loss due to online tunings
Date: Fri, 16 Feb 2024 11:40:24 -0800
Patch series "mm/damon: fix quota status loss due to online tunings".
DAMON_RECLAIM and DAMON_LRU_SORT are not preserving the internal quota status
when applying new user parameters, and hence could cause temporary quota
accuracy degradation. Fix it by preserving the status.
This patch (of 2):
For online parameter changes, DAMON_RECLAIM creates a new scheme based on
the latest values of the parameters and replaces the old scheme with the
new one. When creating it, the internal status of the quota of the old
scheme is not preserved. As a result, charging of the quota starts from
zero after the online tuning. The data collected to estimate the
throughput of the scheme's action is also reset, and therefore the
estimation has to start from scratch again. Because the throughput
estimation is used to convert the time quota to the effective size
quota, this could result in temporary time quota inaccuracy. It would be
recovered over time, though. In short, the quota accuracy could be
temporarily degraded after an online parameter update.
Fix the problem by checking for this case and copying the internal fields
of the quota status.
Link: https://lkml.kernel.org/r/20240216194025.9207-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240216194025.9207-2-sj@kernel.org
Fixes: e035c280f6df ("mm/damon/reclaim: support online inputs update")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/reclaim.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
--- a/mm/damon/reclaim.c~mm-damon-reclaim-fix-quota-stauts-loss-due-to-online-tunings
+++ a/mm/damon/reclaim.c
@@ -150,9 +150,20 @@ static struct damos *damon_reclaim_new_s
&damon_reclaim_wmarks);
}
+static void damon_reclaim_copy_quota_status(struct damos_quota *dst,
+ struct damos_quota *src)
+{
+ dst->total_charged_sz = src->total_charged_sz;
+ dst->total_charged_ns = src->total_charged_ns;
+ dst->charged_sz = src->charged_sz;
+ dst->charged_from = src->charged_from;
+ dst->charge_target_from = src->charge_target_from;
+ dst->charge_addr_from = src->charge_addr_from;
+}
+
static int damon_reclaim_apply_parameters(void)
{
- struct damos *scheme;
+ struct damos *scheme, *old_scheme;
struct damos_filter *filter;
int err = 0;
@@ -164,6 +175,11 @@ static int damon_reclaim_apply_parameter
scheme = damon_reclaim_new_scheme();
if (!scheme)
return -ENOMEM;
+ if (!list_empty(&ctx->schemes)) {
+ damon_for_each_scheme(old_scheme, ctx)
+ damon_reclaim_copy_quota_status(&scheme->quota,
+ &old_scheme->quota);
+ }
if (skip_anon) {
filter = damos_new_filter(DAMOS_FILTER_TYPE_ANON, true);
if (!filter) {
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-check-apply-interval-in-damon_do_apply_schemes.patch
mm-damon-sysfs-schemes-handle-schemes-sysfs-dir-removal-before-commit_schemes_quota_goals.patch
mm-damon-reclaim-fix-quota-stauts-loss-due-to-online-tunings.patch
mm-damon-lru_sort-fix-quota-status-loss-due-to-online-tunings.patch
docs-admin-guide-mm-damon-usage-use-sysfs-interface-for-tracepoints-example.patch
mm-damon-rename-config_damon_dbgfs-to-damon_dbgfs_deprecated.patch
mm-damon-dbgfs-implement-deprecation-notice-file.patch
mm-damon-dbgfs-make-debugfs-interface-deprecation-message-a-macro.patch
docs-admin-guide-mm-damon-usage-document-deprecated-file-of-damon-debugfs-interface.patch
selftets-damon-prepare-for-monitor_on-file-renaming.patch
mm-damon-dbgfs-rename-monitor_on-file-to-monitor_on_deprecated.patch
docs-admin-guide-mm-damon-usage-update-for-monitor_on-renaming.patch
docs-translations-damon-usage-update-for-monitor_on-renaming.patch
mm-damon-sysfs-handle-state-file-inputs-for-every-sampling-interval-if-possible.patch
selftests-damon-_damon_sysfs-support-damos-quota.patch
selftests-damon-_damon_sysfs-support-damos-stats.patch
selftests-damon-_damon_sysfs-support-damos-apply-interval.patch
selftests-damon-add-a-test-for-damos-quota.patch
selftests-damon-add-a-test-for-damos-apply-intervals.patch
selftests-damon-add-a-test-for-a-race-between-target_ids_read-and-dbgfs_before_terminate.patch
selftests-damon-add-a-test-for-the-pid-leak-of-dbgfs_target_ids_write.patch
selftests-damon-_chk_dependency-get-debugfs-mount-point-from-proc-mounts.patch
docs-mm-damon-maintainer-profile-fix-reference-links-for-mm-stable-tree.patch
docs-mm-damon-move-the-list-of-damos-actions-to-design-doc.patch
docs-mm-damon-move-damon-operation-sets-list-from-the-usage-to-the-design-document.patch
docs-mm-damon-move-monitoring-target-regions-setup-detail-from-the-usage-to-the-design-document.patch
docs-admin-guide-mm-damon-usage-fix-wrong-quotas-diabling-condition.patch
mm-damon-core-set-damos_quota-esz-as-public-field-and-document.patch
mm-damon-sysfs-schemes-implement-quota-effective_bytes-file.patch
mm-damon-sysfs-implement-a-kdamond-command-for-updating-schemes-effective-quotas.patch
docs-abi-damon-document-effective_bytes-sysfs-file.patch
docs-admin-guide-mm-damon-usage-document-effective_bytes-file.patch
mm-damon-move-comments-and-fields-for-damos-quota-prioritization-to-the-end.patch
mm-damon-core-split-out-quota-goal-related-fields-to-a-struct.patch
mm-damon-core-add-multiple-goals-per-damos_quota-and-helpers-for-those.patch
mm-damon-sysfs-use-only-quota-goals.patch
mm-damon-core-remove-goal-field-of-damos_quota.patch
mm-damon-core-let-goal-specified-with-only-target-and-current-values.patch
mm-damon-core-support-multiple-metrics-for-quota-goal.patch
mm-damon-core-implement-psi-metric-damos-quota-goal.patch
mm-damon-sysfs-schemes-support-psi-based-quota-auto-tune.patch
docs-mm-damon-design-document-quota-goal-self-tuning.patch
docs-abi-damon-document-quota-goal-metric-file.patch
docs-admin-guide-mm-damon-usage-document-quota-goal-metric-file.patch
mm-damon-reclaim-implement-user-feedback-driven-quota-auto-tuning.patch
mm-damon-reclaim-implement-memory-psi-driven-quota-self-tuning.patch
docs-admin-guide-mm-damon-reclaim-document-auto-tuning-parameters.patch
While mq_perf_tests runs with the default kselftest timeout limit, which
is 45 seconds, the test takes about 60 seconds to complete on i3.metal
AWS instances. Hence, the test always times out. Increase the timeout
to 100 seconds.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
tools/testing/selftests/mqueue/setting | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting
new file mode 100644
index 000000000000..54dc12287839
--- /dev/null
+++ b/tools/testing/selftests/mqueue/setting
@@ -0,0 +1 @@
+timeout=100
--
2.39.2