The quilt patch titled
Subject: squashfs: fix buffer release race condition in readahead code
has been removed from the -mm tree. Its filename was
squashfs-fix-buffer-release-race-condition-in-readahead-code.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: fix buffer release race condition in readahead code
Date: Thu, 20 Oct 2022 23:36:16 +0100
Fix a buffer release race condition, where the error value was used after
release.
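A minimal userspace sketch of the pattern being fixed, assuming a reference-counted buffer whose contents may be reused as soon as the last reference is dropped. The names demo_buffer, demo_put() and demo_read_fragment() are hypothetical and are not squashfs APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not the squashfs structure. */
struct demo_buffer {
	int error;	/* 0 on success, negative errno-style value on failure */
	int refcount;
};

/* Drop one reference; once it reaches zero the entry may be reused. */
static void demo_put(struct demo_buffer *buf)
{
	assert(buf != NULL && buf->refcount > 0);
	if (--buf->refcount == 0)
		buf->error = 0;	/* simulate reuse: old state is overwritten */
}

/* Safe pattern: copy the error out before releasing the buffer. */
static int demo_read_fragment(struct demo_buffer *buf)
{
	int error = buf->error;	/* snapshot while the reference is held */

	demo_put(buf);
	return error;		/* buf is not touched after demo_put() */
}
```

Returning buf->error after demo_put() would, in this model, report success for a read that actually failed.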
Link: https://lkml.kernel.org/r/20221020223616.7571-4-phillip@squashfs.org.uk
Fixes: b09a7a036d20 ("squashfs: support reading fragments in readahead call")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Tested-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger(a)gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov(a)canonical.com>
Cc: Hsin-Yi Wang <hsinyi(a)chromium.org>
Cc: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Cc: Slade Watkins <srw(a)sladewatkins.net>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/squashfs/file.c~squashfs-fix-buffer-release-race-condition-in-readahead-code
+++ a/fs/squashfs/file.c
@@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(s
squashfs_i(inode)->fragment_size);
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ int error = buffer->error;
- if (buffer->error)
+ if (error)
goto out;
expected += squashfs_i(inode)->fragment_offset;
@@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(s
out:
squashfs_cache_put(buffer);
- return buffer->error;
+ return error;
}
static void squashfs_readahead(struct readahead_control *ractl)
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
From: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Several drivers that support setting the RS485 configuration via userspace
implement one or more of the following tasks:
- in case of an invalid RTS configuration (both RTS after send and RTS on
send set or both unset) fall back to enable RTS on send and disable RTS
after send
- nullify the padding field of the returned serial_rs485 struct
- copy the configuration into the uart port struct
- limit RTS delays to 100 ms
Move these tasks into the serial core to make them generic and to provide
a consistent behaviour among all drivers.
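The tasks listed above can be sketched in userspace roughly as follows. This is only an illustration: the DEMO_* flag values and struct demo_rs485 are hypothetical stand-ins, not the kernel's definitions, and the copy into the port struct is omitted:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits and limit, for illustration only. */
#define DEMO_RTS_ON_SEND	(1u << 1)
#define DEMO_RTS_AFTER_SEND	(1u << 2)
#define DEMO_MAX_RTS_DELAY	100u	/* msecs */

/* Hypothetical stand-in for the RS485 configuration struct. */
struct demo_rs485 {
	unsigned int flags;
	unsigned int delay_rts_before_send;
	unsigned int delay_rts_after_send;
	unsigned int padding[5];
};

static void demo_sanitize(struct demo_rs485 *cfg)
{
	assert(cfg != NULL);

	/* Both flags set or both unset is invalid: fall back to RTS on send. */
	if (!(cfg->flags & DEMO_RTS_ON_SEND) ==
	    !(cfg->flags & DEMO_RTS_AFTER_SEND)) {
		cfg->flags |= DEMO_RTS_ON_SEND;
		cfg->flags &= ~DEMO_RTS_AFTER_SEND;
	}

	/* Clamp both RTS delays to the maximum. */
	if (cfg->delay_rts_before_send > DEMO_MAX_RTS_DELAY)
		cfg->delay_rts_before_send = DEMO_MAX_RTS_DELAY;
	if (cfg->delay_rts_after_send > DEMO_MAX_RTS_DELAY)
		cfg->delay_rts_after_send = DEMO_MAX_RTS_DELAY;

	/* Hand a clean padding area back to the caller. */
	memset(cfg->padding, 0, sizeof(cfg->padding));
}
```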
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Link: https://lore.kernel.org/r/20220410104642.32195-2-LinoSanfilippo@gmx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[ Upstream commit 0ed12afa5655512ee418047fb3546d229df20aa1 ]
Signed-off-by: Daisuke Mizobuchi <mizo(a)atmark-techno.com>
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
---
Follow-up of https://lkml.kernel.org/r/20221017013807.34614-1-dominique.martinet@atmark-…
v2: identical to v1
drivers/tty/serial/serial_core.c | 33 ++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index b578f7090b63..6cc909d44a81 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -42,6 +42,11 @@ static struct lock_class_key port_lock_key;
#define HIGH_BITS_OFFSET ((sizeof(long)-sizeof(int))*8)
+/*
+ * Max time with active RTS before/after data is sent.
+ */
+#define RS485_MAX_RTS_DELAY 100 /* msecs */
+
static void uart_change_speed(struct tty_struct *tty, struct uart_state *state,
struct ktermios *old_termios);
static void uart_wait_until_sent(struct tty_struct *tty, int timeout);
@@ -1326,8 +1331,36 @@ static int uart_set_rs485_config(struct uart_port *port,
if (copy_from_user(&rs485, rs485_user, sizeof(*rs485_user)))
return -EFAULT;
+ /* pick sane settings if the user hasn't */
+ if (!(rs485.flags & SER_RS485_RTS_ON_SEND) ==
+ !(rs485.flags & SER_RS485_RTS_AFTER_SEND)) {
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n",
+ port->name, port->line);
+ rs485.flags |= SER_RS485_RTS_ON_SEND;
+ rs485.flags &= ~SER_RS485_RTS_AFTER_SEND;
+ }
+
+ if (rs485.delay_rts_before_send > RS485_MAX_RTS_DELAY) {
+ rs485.delay_rts_before_send = RS485_MAX_RTS_DELAY;
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): RTS delay before sending clamped to %u ms\n",
+ port->name, port->line, rs485.delay_rts_before_send);
+ }
+
+ if (rs485.delay_rts_after_send > RS485_MAX_RTS_DELAY) {
+ rs485.delay_rts_after_send = RS485_MAX_RTS_DELAY;
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): RTS delay after sending clamped to %u ms\n",
+ port->name, port->line, rs485.delay_rts_after_send);
+ }
+ /* Return clean padding area to userspace */
+ memset(rs485.padding, 0, sizeof(rs485.padding));
+
spin_lock_irqsave(&port->lock, flags);
ret = port->rs485_config(port, &rs485);
+ if (!ret)
+ port->rs485 = rs485;
spin_unlock_irqrestore(&port->lock, flags);
if (ret)
return ret;
--
2.35.1
This is the start of the stable review cycle for the 5.4.221 release.
There are 53 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Oct 2022 16:50:35 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.221-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.221-rc1
Seth Jenkins <sethjenkins(a)google.com>
mm: /proc/pid/smaps_rollup: fix no vma's null-deref
Gaurav Kohli <gauravkohli(a)linux.microsoft.com>
hv_netvsc: Fix race between VF offering and VF association message from host
Nick Desaulniers <ndesaulniers(a)google.com>
Makefile.debug: re-enable debug info for .S files
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: video: Force backlight native for more TongFang devices
Conor Dooley <conor.dooley(a)microchip.com>
riscv: topology: fix default topology reporting
Conor Dooley <conor.dooley(a)microchip.com>
arm64: topology: move store_cpu_topology() to shared code
Jerry Snitselaar <jsnitsel(a)redhat.com>
iommu/vt-d: Clean up si_domain in the init_dmars() error path
Yang Yingliang <yangyingliang(a)huawei.com>
net: hns: fix possible memory leak in hnae_ae_register()
Zhengchao Shao <shaozhengchao(a)huawei.com>
net: sched: cake: fix null pointer access issue when cake_init() fails
Harini Katakam <harini.katakam(a)amd.com>
net: phy: dp83867: Extend RX strap quirk for SGMII mode
Xiaobo Liu <cppcoffee(a)gmail.com>
net/atm: fix proc_mpc_write incorrect return value
José Expósito <jose.exposito89(a)gmail.com>
HID: magicmouse: Do not set BTN_MOUSE on double report
Alexander Potapenko <glider(a)google.com>
tipc: fix an information leak in tipc_topsrv_kern_subscr
Mark Tomlinson <mark.tomlinson(a)alliedtelesis.co.nz>
tipc: Fix recognition of trial period
Tony Luck <tony.luck(a)intel.com>
ACPI: extlog: Handle multiple records
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix processing of delayed tree block refs during backref walking
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix processing of delayed data refs during backref walking
Jean-Francois Le Fillatre <jflf_kernel(a)gmx.com>
r8152: add PID for the Lenovo OneLink+ Dock
James Morse <james.morse(a)arm.com>
arm64: errata: Remove AES hwcap for COMPAT tasks
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
media: venus: dec: Handle the case where find_format fails
Eric Ren <renzhengeek(a)gmail.com>
KVM: arm64: vgic: Fix exit condition in scan_its_table()
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS
Alexander Stein <alexander.stein(a)ew.tq-group.com>
ata: ahci-imx: Fix MODULE_ALIAS
Zhang Rui <rui.zhang(a)intel.com>
hwmon/coretemp: Handle large core ID value
Borislav Petkov <bp(a)suse.de>
x86/microcode/AMD: Apply the patch early on every logical thread
Joseph Qi <joseph.qi(a)linux.alibaba.com>
ocfs2: fix BUG when iput after ocfs2_mknod fails
Joseph Qi <joseph.qi(a)linux.alibaba.com>
ocfs2: clear dinode links count in case of error
Dave Chinner <dchinner(a)redhat.com>
xfs: fix use-after-free on CIL context on shutdown
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: move inode flush to the sync workqueue
Christoph Hellwig <hch(a)lst.de>
xfs: reflink should force the log out if mounted with wsync
Christoph Hellwig <hch(a)lst.de>
xfs: factor out a new xfs_log_force_inode helper
Brian Foster <bfoster(a)redhat.com>
xfs: trylock underlying buffer on dquot flush
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: don't write a corrupt unmount record to force summary counter recalc
Dave Chinner <dchinner(a)redhat.com>
xfs: tail updates only need to occur when LSN changes
Dave Chinner <dchinner(a)redhat.com>
xfs: factor common AIL item deletion code
Dave Chinner <dchinner(a)redhat.com>
xfs: Throttle commits on delayed background CIL push
Dave Chinner <dchinner(a)redhat.com>
xfs: Lower CIL flush limit for large logs
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: preserve default grace interval during quotacheck
Brian Foster <bfoster(a)redhat.com>
xfs: fix unmount hang and memory leak on shutdown during quotaoff
Brian Foster <bfoster(a)redhat.com>
xfs: factor out quotaoff intent AIL removal and memory free
Pavel Reichl <preichl(a)redhat.com>
xfs: Replace function declaration by actual definition
Pavel Reichl <preichl(a)redhat.com>
xfs: remove the xfs_qoff_logitem_t typedef
Pavel Reichl <preichl(a)redhat.com>
xfs: remove the xfs_dq_logitem_t typedef
Pavel Reichl <preichl(a)redhat.com>
xfs: remove the xfs_disk_dquot_t and xfs_dquot_t
Takashi Iwai <tiwai(a)suse.de>
xfs: Use scnprintf() for avoiding potential buffer overflow
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: check owner of dir3 blocks
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: check owner of dir3 data blocks
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: fix buffer corruption reporting when xfs_dir3_free_header_check fails
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: xfs_buf_corruption_error should take __this_address
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: add a function to deal with corrupt buffers post-verifiers
Brian Foster <bfoster(a)redhat.com>
xfs: rework collapse range into an atomic operation
Brian Foster <bfoster(a)redhat.com>
xfs: rework insert range into an atomic operation
Brian Foster <bfoster(a)redhat.com>
xfs: open code insert range extent split helper
-------------
Diffstat:
Documentation/arm64/silicon-errata.rst | 4 +
Makefile | 8 +-
arch/arm64/Kconfig | 16 ++++
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/kernel/cpu_errata.c | 16 ++++
arch/arm64/kernel/cpufeature.c | 13 ++-
arch/arm64/kernel/topology.c | 40 ---------
arch/riscv/Kconfig | 2 +-
arch/riscv/kernel/smpboot.c | 4 +-
arch/x86/kernel/cpu/microcode/amd.c | 16 +++-
drivers/acpi/acpi_extlog.c | 33 ++++---
drivers/acpi/video_detect.c | 64 ++++++++++++++
drivers/ata/ahci.h | 2 +-
drivers/ata/ahci_imx.c | 2 +-
drivers/base/arch_topology.c | 19 ++++
drivers/hid/hid-magicmouse.c | 2 +-
drivers/hwmon/coretemp.c | 56 ++++++++----
drivers/iommu/intel-iommu.c | 5 ++
drivers/media/platform/qcom/venus/vdec.c | 2 +
drivers/net/ethernet/hisilicon/hns/hnae.c | 4 +-
drivers/net/hyperv/hyperv_net.h | 3 +
drivers/net/hyperv/netvsc.c | 4 +
drivers/net/hyperv/netvsc_drv.c | 20 +++++
drivers/net/phy/dp83867.c | 8 ++
drivers/net/usb/cdc_ether.c | 7 ++
drivers/net/usb/r8152.c | 1 +
fs/btrfs/backref.c | 46 ++++++----
fs/ocfs2/namei.c | 23 +++--
fs/proc/task_mmu.c | 2 +-
fs/xfs/libxfs/xfs_alloc.c | 2 +-
fs/xfs/libxfs/xfs_attr_leaf.c | 6 +-
fs/xfs/libxfs/xfs_bmap.c | 32 +------
fs/xfs/libxfs/xfs_bmap.h | 3 +-
fs/xfs/libxfs/xfs_btree.c | 2 +-
fs/xfs/libxfs/xfs_da_btree.c | 10 +--
fs/xfs/libxfs/xfs_dir2_block.c | 33 ++++++-
fs/xfs/libxfs/xfs_dir2_data.c | 32 ++++++-
fs/xfs/libxfs/xfs_dir2_leaf.c | 2 +-
fs/xfs/libxfs/xfs_dir2_node.c | 8 +-
fs/xfs/libxfs/xfs_dquot_buf.c | 8 +-
fs/xfs/libxfs/xfs_format.h | 10 +--
fs/xfs/libxfs/xfs_trans_resv.c | 6 +-
fs/xfs/xfs_attr_inactive.c | 6 +-
fs/xfs/xfs_attr_list.c | 2 +-
fs/xfs/xfs_bmap_util.c | 57 ++++++------
fs/xfs/xfs_buf.c | 22 +++++
fs/xfs/xfs_buf.h | 2 +
fs/xfs/xfs_dquot.c | 26 +++---
fs/xfs/xfs_dquot.h | 98 +++++++++++----------
fs/xfs/xfs_dquot_item.c | 47 +++++++---
fs/xfs/xfs_dquot_item.h | 35 ++++----
fs/xfs/xfs_error.c | 7 +-
fs/xfs/xfs_error.h | 2 +-
fs/xfs/xfs_export.c | 14 +--
fs/xfs/xfs_file.c | 16 ++--
fs/xfs/xfs_inode.c | 23 ++++-
fs/xfs/xfs_inode.h | 1 +
fs/xfs/xfs_inode_item.c | 28 +++---
fs/xfs/xfs_log.c | 26 +++---
fs/xfs/xfs_log_cil.c | 39 ++++++--
fs/xfs/xfs_log_priv.h | 53 +++++++++--
fs/xfs/xfs_log_recover.c | 5 +-
fs/xfs/xfs_mount.h | 5 ++
fs/xfs/xfs_qm.c | 64 ++++++++------
fs/xfs/xfs_qm_bhv.c | 6 +-
fs/xfs/xfs_qm_syscalls.c | 142 +++++++++++++++---------------
fs/xfs/xfs_stats.c | 10 +--
fs/xfs/xfs_super.c | 28 ++++--
fs/xfs/xfs_trace.h | 1 +
fs/xfs/xfs_trans_ail.c | 88 +++++++++++-------
fs/xfs/xfs_trans_dquot.c | 54 ++++++------
fs/xfs/xfs_trans_priv.h | 6 +-
net/atm/mpoa_proc.c | 3 +-
net/sched/sch_cake.c | 4 +
net/tipc/discover.c | 2 +-
net/tipc/topsrv.c | 2 +-
virt/kvm/arm/vgic/vgic-its.c | 5 +-
77 files changed, 973 insertions(+), 535 deletions(-)
The quilt patch titled
Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
has been removed from the -mm tree. Its filename was
hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
Date: Sat, 22 Oct 2022 19:50:47 -0700
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear the page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing. Subsequent faults on this vma were confused as
VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was
not set in new pages added to the page table. This resulted in pages that
appeared anonymous in a VM_SHARED vma and triggered the BUG.
Create a new routine clear_hugetlb_page_range() that can be called from
madvise(MADV_DONTNEED) for hugetlb vmas. It has the same setup as
zap_page_range, but does not delete the vma_lock.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6J…
Link: https://lkml.kernel.org/r/20221023025047.470646-1-mike.kravetz@oracle.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/hugetlb.h | 7 ++++
mm/hugetlb.c | 62 ++++++++++++++++++++++++++++++--------
mm/madvise.c | 5 ++-
3 files changed, 61 insertions(+), 13 deletions(-)
--- a/include/linux/hugetlb.h~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/include/linux/hugetlb.h
@@ -156,6 +156,8 @@ long follow_hugetlb_page(struct mm_struc
void unmap_hugepage_range(struct vm_area_struct *,
unsigned long, unsigned long, struct page *,
zap_flags_t);
+void clear_hugetlb_page_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end);
void __unmap_hugepage_range_final(struct mmu_gather *tlb,
struct vm_area_struct *vma,
unsigned long start, unsigned long end,
@@ -460,6 +462,11 @@ static inline void __unmap_hugepage_rang
BUG();
}
+static void __maybe_unused clear_hugetlb_page_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+}
+
static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long address,
unsigned int flags)
--- a/mm/hugetlb.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/mm/hugetlb.c
@@ -5194,28 +5194,66 @@ static void __unmap_hugepage_range(struc
tlb_flush_mmu_tlbonly(tlb);
}
-void __unmap_hugepage_range_final(struct mmu_gather *tlb,
+static void __unmap_hugepage_range_locking(struct mmu_gather *tlb,
struct vm_area_struct *vma, unsigned long start,
unsigned long end, struct page *ref_page,
- zap_flags_t zap_flags)
+ zap_flags_t zap_flags, bool final)
{
hugetlb_vma_lock_write(vma);
i_mmap_lock_write(vma->vm_file->f_mapping);
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
- /*
- * Unlock and free the vma lock before releasing i_mmap_rwsem. When
- * the vma_lock is freed, this makes the vma ineligible for pmd
- * sharing. And, i_mmap_rwsem is required to set up pmd sharing.
- * This is important as page tables for this unmapped range will
- * be asynchrously deleted. If the page tables are shared, there
- * will be issues when accessed by someone else.
- */
- __hugetlb_vma_unlock_write_free(vma);
+ if (final) {
+ /*
+ * Unlock and free the vma lock before releasing i_mmap_rwsem.
+ * When the vma_lock is freed, this makes the vma ineligible
+ * for pmd sharing. And, i_mmap_rwsem is required to set up
+ * pmd sharing. This is important as page tables for this
+ * unmapped range will be asynchrously deleted. If the page
+ * tables are shared, there will be issues when accessed by
+ * someone else.
+ */
+ __hugetlb_vma_unlock_write_free(vma);
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ } else {
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ hugetlb_vma_unlock_write(vma);
+ }
+}
+
+void __unmap_hugepage_range_final(struct mmu_gather *tlb,
+ struct vm_area_struct *vma, unsigned long start,
+ unsigned long end, struct page *ref_page,
+ zap_flags_t zap_flags)
+{
+ __unmap_hugepage_range_locking(tlb, vma, start, end, ref_page,
+ zap_flags, true);
+}
+
+#ifdef CONFIG_ADVISE_SYSCALLS
+/*
+ * Similar setup as in zap_page_range(). madvise(MADV_DONTNEED) can not call
+ * zap_page_range for hugetlb vmas as __unmap_hugepage_range_final will delete
+ * the associated vma_lock.
+ */
+void clear_hugetlb_page_range(struct vm_area_struct *vma, unsigned long start,
+ unsigned long end)
+{
+ struct mmu_notifier_range range;
+ struct mmu_gather tlb;
+
+ mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
+ start, end);
+ tlb_gather_mmu(&tlb, vma->vm_mm);
+ update_hiwater_rss(vma->vm_mm);
+
+ __unmap_hugepage_range_locking(&tlb, vma, start, end, NULL, 0, false);
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ mmu_notifier_invalidate_range_end(&range);
+ tlb_finish_mmu(&tlb);
}
+#endif
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end, struct page *ref_page,
--- a/mm/madvise.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/mm/madvise.c
@@ -790,7 +790,10 @@ static int madvise_free_single_vma(struc
static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
- zap_page_range(vma, start, end - start);
+ if (!is_vm_hugetlb_page(vma))
+ zap_page_range(vma, start, end - start);
+ else
+ clear_hugetlb_page_range(vma, start, end);
return 0;
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-simplify-hugetlb-handling-in-follow_page_mask.patch
hugetlb-simplify-hugetlb-handling-in-follow_page_mask-v4.patch
The quilt patch titled
Subject: mm: migrate: fix return value if all subpages of THPs are migrated successfully
has been removed from the -mm tree. Its filename was
mm-migrate-fix-return-value-if-all-subpages-of-thps-are-migrated-successfully.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Subject: mm: migrate: fix return value if all subpages of THPs are migrated successfully
Date: Mon, 24 Oct 2022 16:34:21 +0800
During THP migration, if THPs are not migrated but they are split and all
subpages are migrated successfully, migrate_pages() will still return the
number of THP pages that were not migrated. This will confuse the callers
of migrate_pages(). For example, long-term pinning will fail even though
all pages were migrated successfully.
Thus we should return 0 to indicate that all pages are migrated in this
case.
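A minimal sketch of the intended return-value rule, assuming the caller's list is empty once every page (including the subpages of split THPs) has been migrated. demo_migrate_result() is a hypothetical helper and models only the non-negative "pages not migrated" count, not error codes:

```c
#include <assert.h>
#include <stddef.h>

/*
 * rc_so_far: number of pages recorded as not migrated during the pass
 * nr_left:   pages still on the caller's list after the final splice
 */
static int demo_migrate_result(int rc_so_far, size_t nr_left)
{
	assert(rc_so_far >= 0);	/* this sketch ignores negative errors */

	/* Nothing left behind means everything was migrated after all. */
	if (nr_left == 0)
		return 0;
	return rc_so_far;
}
```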
Link: https://lkml.kernel.org/r/de386aa864be9158d2f3b344091419ea7c38b2f7.16665998…
Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reviewed-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/migrate.c~mm-migrate-fix-return-value-if-all-subpages-of-thps-are-migrated-successfully
+++ a/mm/migrate.c
@@ -1582,6 +1582,13 @@ out:
*/
list_splice(&ret_pages, from);
+ /*
+ * Return 0 in case all subpages of fail-to-migrate THPs are
+ * migrated successfully.
+ */
+ if (list_empty(from))
+ rc = 0;
+
count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
_
Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are
mm-migrate-try-again-if-thp-split-is-failed-due-to-page-refcnt.patch
The quilt patch titled
Subject: mm/uffd: fix vma check on userfault for wp
has been removed from the -mm tree. Its filename was
mm-uffd-fix-vma-check-on-userfault-for-wp.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/uffd: fix vma check on userfault for wp
Date: Mon, 24 Oct 2022 15:33:35 -0400
We used to have a report that pte-marker code can be reached even when
uffd-wp is not compiled in for file memories, here:
https://lore.kernel.org/all/YzeR+R6b4bwBlBHh@x1n/T/#u
I just got time to revisit this and found that the root cause is that we
simply messed up the vma check, so that on !PTE_MARKER_UFFD_WP systems we
allow UFFDIO_REGISTER of MINOR & WP on shmem because the check was
wrong:
if (vm_flags & VM_UFFD_MINOR)
return is_vm_hugetlb_page(vma) || vma_is_shmem(vma);
Where we'll allow anything to pass on shmem as long as minor mode is
requested.
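A small userspace model of the two shapes of the check, for illustration. The DEMO_* flags, struct demo_vma and the wp_markers parameter are hypothetical, and demo_later_checks() is an assumption about the remaining checks based on the description above, not a copy of the kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits and vma model; not the kernel's definitions. */
#define DEMO_UFFD_WP	(1u << 0)
#define DEMO_UFFD_MINOR	(1u << 1)

struct demo_vma {
	bool anon;
	bool hugetlb;
	bool shmem;
};

/* Assumed remaining checks that run after the MINOR test. */
static bool demo_later_checks(const struct demo_vma *vma, unsigned int flags,
			      bool wp_markers)
{
	assert(vma != NULL);
	/* Without pte markers, uffd-wp is only allowed on anonymous memory. */
	if (!wp_markers && (flags & DEMO_UFFD_WP) && !vma->anon)
		return false;
	return vma->anon || vma->hugetlb || vma->shmem;
}

/* Old shape: MINOR returns early, so the WP check is never reached. */
static bool demo_can_userfault_old(const struct demo_vma *vma,
				   unsigned int flags, bool wp_markers)
{
	if (flags & DEMO_UFFD_MINOR)
		return vma->hugetlb || vma->shmem;
	return demo_later_checks(vma, flags, wp_markers);
}

/* Fixed shape: MINOR can only reject; everything else falls through. */
static bool demo_can_userfault_new(const struct demo_vma *vma,
				   unsigned int flags, bool wp_markers)
{
	if ((flags & DEMO_UFFD_MINOR) && !vma->hugetlb && !vma->shmem)
		return false;
	return demo_later_checks(vma, flags, wp_markers);
}
```

In this model, a shmem vma registered with MINOR and WP while pte markers are disabled passes the old check but is rejected by the fixed one.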
Axel did it right when introducing minor mode but I messed it up in
b1f9e876862d when moving code around. Fix it.
Link: https://lkml.kernel.org/r/20221024193336.1233616-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20221024193336.1233616-2-peterx@redhat.com
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/include/linux/userfaultfd_k.h~mm-uffd-fix-vma-check-on-userfault-for-wp
+++ a/include/linux/userfaultfd_k.h
@@ -146,9 +146,9 @@ static inline bool userfaultfd_armed(str
static inline bool vma_can_userfault(struct vm_area_struct *vma,
unsigned long vm_flags)
{
- if (vm_flags & VM_UFFD_MINOR)
- return is_vm_hugetlb_page(vma) || vma_is_shmem(vma);
-
+ if ((vm_flags & VM_UFFD_MINOR) &&
+ (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
+ return false;
#ifndef CONFIG_PTE_MARKER_UFFD_WP
/*
* If user requested uffd-wp but not enabled pte markers for
_
Patches currently in -mm which might be from peterx(a)redhat.com are
selftests-vm-use-memfd-for-uffd-hugetlb-tests.patch
selftests-vm-use-memfd-for-hugetlb-madvise-test.patch
selftests-vm-use-memfd-for-hugepage-mremap-test.patch
selftests-vm-drop-mnt-point-for-hugetlb-in-run_vmtestssh.patch
mm-hugetlb-unify-clearing-of-restorereserve-for-private-pages.patch
revert-mm-uffd-fix-warning-without-pte_marker_uffd_wp-compiled-in.patch
The quilt patch titled
Subject: mm: prep_compound_tail() clear page->private
has been removed from the -mm tree. Its filename was
mm-prep_compound_tail-clear-page-private.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: prep_compound_tail() clear page->private
Date: Sat, 22 Oct 2022 00:51:06 -0700 (PDT)
Although page allocation always clears page->private in the first page or
head page of an allocation, it has never made a point of clearing
page->private in the tails (though 0 is often what is already there).
But now commit 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t
during THP split") issues a warning when page_tail->private is found to be
non-0 (unless it's swapcache).
Change that warning to dump page_tail (which also dumps head), instead of
just the head: so far we have seen dead000000000122, dead000000000003,
dead000000000001 or 0000000000000002 in the raw output for tail private.
We could just delete the warning, but today's consensus appears to want
page->private to be 0, unless there's a good reason for it to be set: so
now clear it in prep_compound_tail() (more general than just for THP; but
not for high order allocation, which makes no pass down the tails).
Link: https://lkml.kernel.org/r/1c4233bb-4e4d-5969-fbd4-96604268a285@google.com
Fixes: 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t during THP split")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/huge_memory.c~mm-prep_compound_tail-clear-page-private
+++ a/mm/huge_memory.c
@@ -2462,7 +2462,7 @@ static void __split_huge_page_tail(struc
* Fix up and warn once if private is unexpectedly set.
*/
if (!folio_test_swapcache(page_folio(head))) {
- VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, head);
+ VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, page_tail);
page_tail->private = 0;
}
--- a/mm/page_alloc.c~mm-prep_compound_tail-clear-page-private
+++ a/mm/page_alloc.c
@@ -807,6 +807,7 @@ static void prep_compound_tail(struct pa
p->mapping = TAIL_MAPPING;
set_compound_head(p, head);
+ set_page_private(p, 0);
}
void prep_compound_page(struct page *page, unsigned int order)
_
Patches currently in -mm which might be from hughd(a)google.com are
The quilt patch titled
Subject: mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
has been removed from the -mm tree. Its filename was
mmmadvisehugetlb-fix-unexpected-data-loss-with-madv_dontneed-on-hugetlbfs.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Rik van Riel <riel(a)surriel.com>
Subject: mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
Date: Fri, 21 Oct 2022 19:28:05 -0400
A common use case for hugetlbfs is for the application to create
memory pools backed by huge pages, which then get handed over to
some malloc library (e.g. jemalloc) for further management.
That malloc library may be doing MADV_DONTNEED calls on memory
that is no longer needed, expecting those calls to happen on
PAGE_SIZE boundaries.
However, currently the MADV_DONTNEED code rounds up any such
requests to HPAGE_PMD_SIZE boundaries. This leads to undesired
outcomes when jemalloc expects a 4kB MADV_DONTNEED, but 2MB of
memory gets zeroed out instead.
Use of pre-built shared libraries means that user code does not
always know the page size of every memory arena in use.
Avoid unexpected data loss with MADV_DONTNEED by rounding up
only to PAGE_SIZE (in do_madvise), and rounding down to huge
page granularity.
That way programs will only get as much memory zeroed out as
they requested.
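The arithmetic can be checked with a small userspace sketch. demo_align_up() and demo_align_down() are hypothetical helpers for power-of-two alignments; the 4kB and 2MB sizes are the ones from the example above:

```c
#include <assert.h>

/* Round x up to the next multiple of a (a must be a power of two). */
static unsigned long demo_align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to the previous multiple of a (a must be a power of two). */
static unsigned long demo_align_down(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}
```

For a huge-page-aligned start and end = start + 4kB, rounding end up to 2MB covers the whole huge page, while rounding it down gives end == start, so the range is empty and nothing is zeroed.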
Link: https://lkml.kernel.org/r/20221021192805.366ad573@imladris.surriel.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/madvise.c~mmmadvisehugetlb-fix-unexpected-data-loss-with-madv_dontneed-on-hugetlbfs
+++ a/mm/madvise.c
@@ -813,7 +813,14 @@ static bool madvise_dontneed_free_valid_
if (start & ~huge_page_mask(hstate_vma(vma)))
return false;
- *end = ALIGN(*end, huge_page_size(hstate_vma(vma)));
+ /*
+ * Madvise callers expect the length to be rounded up to PAGE_SIZE
+ * boundaries, and may be unaware that this VMA uses huge pages.
+ * Avoid unexpected data loss by rounding down the number of
+ * huge pages freed.
+ */
+ *end = ALIGN_DOWN(*end, huge_page_size(hstate_vma(vma)));
+
return true;
}
@@ -828,6 +835,9 @@ static long madvise_dontneed_free(struct
if (!madvise_dontneed_free_valid_vma(vma, start, &end, behavior))
return -EINVAL;
+ if (start == end)
+ return 0;
+
if (!userfaultfd_remove(vma, start, end)) {
*prev = NULL; /* mmap_lock has been dropped, prev is stale */
_
Patches currently in -mm which might be from riel(a)surriel.com are
The quilt patch titled
Subject: mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops
has been removed from the -mm tree. Its filename was
mm-kmemleak-prevent-soft-lockup-in-kmemleak_scans-object-iteration-loops.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops
Date: Thu, 20 Oct 2022 13:56:19 -0400
Commit 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object
iteration loop of kmemleak_scan()") adds cond_resched() in the first
object iteration loop of kmemleak_scan(). However, it turns out that the
2nd object iteration loop can still cause a soft lockup in some cases. So
add a cond_resched() call in the 2nd and 3rd loops as well, both to prevent
that and for completeness.
Link: https://lkml.kernel.org/r/20221020175619.366317-1-longman@redhat.com
Fixes: 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/kmemleak.c~mm-kmemleak-prevent-soft-lockup-in-kmemleak_scans-object-iteration-loops
+++ a/mm/kmemleak.c
@@ -1461,6 +1461,27 @@ static void scan_gray_list(void)
}
/*
+ * Conditionally call resched() in a object iteration loop while making sure
+ * that the given object won't go away without RCU read lock by performing a
+ * get_object() if !pinned.
+ *
+ * Return: false if can't do a cond_resched() due to get_object() failure
+ * true otherwise
+ */
+static bool kmemleak_cond_resched(struct kmemleak_object *object, bool pinned)
+{
+ if (!pinned && !get_object(object))
+ return false;
+
+ rcu_read_unlock();
+ cond_resched();
+ rcu_read_lock();
+ if (!pinned)
+ put_object(object);
+ return true;
+}
+
+/*
* Scan data sections and all the referenced memory blocks allocated via the
* kernel's standard allocators. This function must be called with the
* scan_mutex held.
@@ -1471,7 +1492,7 @@ static void kmemleak_scan(void)
struct zone *zone;
int __maybe_unused i;
int new_leaks = 0;
- int loop1_cnt = 0;
+ int loop_cnt = 0;
jiffies_last_scan = jiffies;
@@ -1480,7 +1501,6 @@ static void kmemleak_scan(void)
list_for_each_entry_rcu(object, &object_list, object_list) {
bool obj_pinned = false;
- loop1_cnt++;
raw_spin_lock_irq(&object->lock);
#ifdef DEBUG
/*
@@ -1514,24 +1534,11 @@ static void kmemleak_scan(void)
raw_spin_unlock_irq(&object->lock);
/*
- * Do a cond_resched() to avoid soft lockup every 64k objects.
- * Make sure a reference has been taken so that the object
- * won't go away without RCU read lock.
+ * Do a cond_resched() every 64k objects to avoid soft lockup.
*/
- if (!(loop1_cnt & 0xffff)) {
- if (!obj_pinned && !get_object(object)) {
- /* Try the next object instead */
- loop1_cnt--;
- continue;
- }
-
- rcu_read_unlock();
- cond_resched();
- rcu_read_lock();
-
- if (!obj_pinned)
- put_object(object);
- }
+ if (!(++loop_cnt & 0xffff) &&
+ !kmemleak_cond_resched(object, obj_pinned))
+ loop_cnt--; /* Try again on next object */
}
rcu_read_unlock();
@@ -1598,8 +1605,16 @@ static void kmemleak_scan(void)
* scan and color them gray until the next scan.
*/
rcu_read_lock();
+ loop_cnt = 0;
list_for_each_entry_rcu(object, &object_list, object_list) {
/*
+ * Do a cond_resched() every 64k objects to avoid soft lockup.
+ */
+ if (!(++loop_cnt & 0xffff) &&
+ !kmemleak_cond_resched(object, false))
+ loop_cnt--; /* Try again on next object */
+
+ /*
* This is racy but we can save the overhead of lock/unlock
* calls. The missed objects, if any, should be caught in
* the next scan.
@@ -1632,8 +1647,16 @@ static void kmemleak_scan(void)
* Scanning result reporting.
*/
rcu_read_lock();
+ loop_cnt = 0;
list_for_each_entry_rcu(object, &object_list, object_list) {
/*
+ * Do a cond_resched() every 64k objects to avoid soft lockup.
+ */
+ if (!(++loop_cnt & 0xffff) &&
+ !kmemleak_cond_resched(object, false))
+ loop_cnt--; /* Try again on next object */
+
+ /*
* This is racy but we can save the overhead of lock/unlock
* calls. The missed objects, if any, should be caught in
* the next scan.
_
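The "every 64k objects, retry on the next object if pinning fails" counter
pattern used in all three loops can be exercised in userspace. In this rough
sketch try_resched() is a hypothetical stand-in for kmemleak_cond_resched(),
and the 'unpinnable' index models a get_object() failure; neither exists in
the kernel under these names.

```c
#include <assert.h>
#include <stdbool.h>

static int resched_calls;

/* Hypothetical stand-in: fails when the object cannot be pinned. */
static bool try_resched(bool can_pin)
{
	if (!can_pin)
		return false;
	resched_calls++;	/* this is where cond_resched() would run */
	return true;
}

/*
 * Walk 'n' objects. 'unpinnable' is the index of one object for which
 * pinning fails, or -1 for none. Returns the number of reschedule points.
 */
static int walk(int n, int unpinnable)
{
	int loop_cnt = 0;

	assert(n >= 0);
	resched_calls = 0;
	for (int i = 0; i < n; i++) {
		if (!(++loop_cnt & 0xffff) && !try_resched(i != unpinnable))
			loop_cnt--;	/* try again on the next object */
	}
	return resched_calls;
}
```

Decrementing the counter on failure means the very next object hits the 64k
boundary again, so a failed pin delays the reschedule by one object rather
than by another 64k objects.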
Patches currently in -mm which might be from longman(a)redhat.com are
The quilt patch titled
Subject: squashfs: fix extending readahead beyond end of file
has been removed from the -mm tree. Its filename was
squashfs-fix-extending-readahead-beyond-end-of-file.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: fix extending readahead beyond end of file
Date: Thu, 20 Oct 2022 23:36:15 +0100
The readahead code will try to extend readahead to the entire size of the
Squashfs data block.
But, it didn't take into account that the last block at the end of the
file may not be a whole block. In this case, the code would extend
readahead to beyond the end of the file, leaving trailing pages.
Fix this by only requesting the expected number of pages.
Link: https://lkml.kernel.org/r/20221020223616.7571-3-phillip@squashfs.org.uk
Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Tested-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger(a)gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov(a)canonical.com>
Cc: Hsin-Yi Wang <hsinyi(a)chromium.org>
Cc: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Cc: Slade Watkins <srw(a)sladewatkins.net>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/squashfs/file.c~squashfs-fix-extending-readahead-beyond-end-of-file
+++ a/fs/squashfs/file.c
@@ -559,6 +559,12 @@ static void squashfs_readahead(struct re
unsigned int expected;
struct page *last_page;
+ expected = start >> msblk->block_log == file_end ?
+ (i_size_read(inode) & (msblk->block_size - 1)) :
+ msblk->block_size;
+
+ max_pages = (expected + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
break;
@@ -567,13 +573,10 @@ static void squashfs_readahead(struct re
goto skip_pages;
index = pages[0]->index >> shift;
+
if ((pages[nr_pages - 1]->index >> shift) != index)
goto skip_pages;
- expected = index == file_end ?
- (i_size_read(inode) & (msblk->block_size - 1)) :
- msblk->block_size;
-
if (index == file_end && squashfs_i(inode)->fragment_block !=
SQUASHFS_INVALID_BLK) {
res = squashfs_readahead_fragment(pages, nr_pages,
_
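As a worked example of the page-count calculation in this patch, the sketch
below recomputes "expected" and "max_pages" in userspace. The 4 KiB page size,
the EX_ prefixed names and the function name are assumptions for illustration
only.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12			/* assumed 4 KiB pages */
#define EX_PAGE_SIZE  (1u << EX_PAGE_SHIFT)

/*
 * Number of pages to request for the data block containing byte offset
 * 'start', given the file size and log2 of the Squashfs block size.
 */
static unsigned int readahead_max_pages(uint64_t i_size, unsigned int block_log,
					uint64_t start)
{
	uint32_t block_size = 1u << block_log;
	uint64_t file_end = i_size >> block_log;	/* index of the tail block */
	uint32_t expected;

	assert(block_log >= EX_PAGE_SHIFT);
	/* The tail block only holds the remainder of the file. */
	expected = (start >> block_log) == file_end ?
			(uint32_t)(i_size & (block_size - 1)) : block_size;

	return (expected + EX_PAGE_SIZE - 1) >> EX_PAGE_SHIFT;
}
```

For a 300000 byte file with 128 KiB blocks, a full block asks for 32 pages,
while the tail block holds 37856 bytes and asks for 10 pages instead of 32.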
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The quilt patch titled
Subject: squashfs: fix read regression introduced in readahead code
has been removed from the -mm tree. Its filename was
squashfs-fix-read-regression-introduced-in-readahead-code.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: fix read regression introduced in readahead code
Date: Thu, 20 Oct 2022 23:36:14 +0100
Patch series "squashfs: fix some regressions introduced in the readahead
code".
This patchset fixes 3 regressions introduced by the recent readahead code
changes. The first regression is causing "snaps" to randomly fail after a
couple of hours or days, which is how the regression came to light.
This patch (of 3):
If a file isn't a whole multiple of the page size, the last page will have
trailing bytes unfilled.
There was a mistake in the readahead code which zero filled these bytes. In particular
it incorrectly assumed that the last page in the readahead page array
(page[nr_pages - 1]) will always contain the last page in the block, which
if we're at file end, will be the page that needs to be zero filled.
But the readahead code may not return the last page in the block, which
means it is unmapped and will be skipped by the decompressors (a temporary
buffer is used instead).
In this case the zero filling code will zero out the wrong page, leading
to data corruption.
Fix this by extending the "page actor" to return the last page if
present, or NULL if a temporary buffer was used.
Link: https://lkml.kernel.org/r/20221020223616.7571-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20221020223616.7571-2-phillip@squashfs.org.uk
Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
Link: https://lore.kernel.org/lkml/b0c258c3-6dcf-aade-efc4-d62a8b3a1ce2@alu.unizg…
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Tested-by: Slade Watkins <srw(a)sladewatkins.net>
Tested-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger(a)gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov(a)canonical.com>
Cc: Hsin-Yi Wang <hsinyi(a)chromium.org>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/squashfs/file.c~squashfs-fix-read-regression-introduced-in-readahead-code
+++ a/fs/squashfs/file.c
@@ -557,6 +557,7 @@ static void squashfs_readahead(struct re
int res, bsize;
u64 block = 0;
unsigned int expected;
+ struct page *last_page;
nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
@@ -593,15 +594,15 @@ static void squashfs_readahead(struct re
res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
- squashfs_page_actor_free(actor);
+ last_page = squashfs_page_actor_free(actor);
if (res == expected) {
int bytes;
/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
- memzero_page(pages[nr_pages - 1], bytes,
+ if (index == file_end && bytes && last_page)
+ memzero_page(last_page, bytes,
PAGE_SIZE - bytes);
for (i = 0; i < nr_pages; i++) {
--- a/fs/squashfs/page_actor.c~squashfs-fix-read-regression-introduced-in-readahead-code
+++ a/fs/squashfs/page_actor.c
@@ -71,11 +71,13 @@ static void *handle_next_page(struct squ
(actor->next_index != actor->page[actor->next_page]->index)) {
actor->next_index++;
actor->returned_pages++;
+ actor->last_page = NULL;
return actor->alloc_buffer ? actor->tmp_buffer : ERR_PTR(-ENOMEM);
}
actor->next_index++;
actor->returned_pages++;
+ actor->last_page = actor->page[actor->next_page];
return actor->pageaddr = kmap_local_page(actor->page[actor->next_page++]);
}
@@ -125,6 +127,7 @@ struct squashfs_page_actor *squashfs_pag
actor->returned_pages = 0;
actor->next_index = page[0]->index & ~((1 << (msblk->block_log - PAGE_SHIFT)) - 1);
actor->pageaddr = NULL;
+ actor->last_page = NULL;
actor->alloc_buffer = msblk->decompressor->alloc_buffer;
actor->squashfs_first_page = direct_first_page;
actor->squashfs_next_page = direct_next_page;
--- a/fs/squashfs/page_actor.h~squashfs-fix-read-regression-introduced-in-readahead-code
+++ a/fs/squashfs/page_actor.h
@@ -16,6 +16,7 @@ struct squashfs_page_actor {
void *(*squashfs_first_page)(struct squashfs_page_actor *);
void *(*squashfs_next_page)(struct squashfs_page_actor *);
void (*squashfs_finish_page)(struct squashfs_page_actor *);
+ struct page *last_page;
int pages;
int length;
int next_page;
@@ -29,10 +30,13 @@ extern struct squashfs_page_actor *squas
extern struct squashfs_page_actor *squashfs_page_actor_init_special(
struct squashfs_sb_info *msblk,
struct page **page, int pages, int length);
-static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
+static inline struct page *squashfs_page_actor_free(struct squashfs_page_actor *actor)
{
+ struct page *last_page = actor->last_page;
+
kfree(actor->tmp_buffer);
kfree(actor);
+ return last_page;
}
static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
{
_
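The corrected zero-fill rule (only touch the page the actor reports as last,
and do nothing when a temporary buffer was used) can be sketched in userspace
as follows. The function name, the plain byte buffer standing in for a
struct page, and the 4 KiB page size are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_PAGE_SIZE 4096	/* assumed page size */

/*
 * 'last_page' is the buffer the actor reported as the last page written,
 * or NULL if the tail of the block went into a temporary buffer.
 * 'res' is the number of bytes decompressed. Returns the bytes zeroed.
 */
static int zero_fill_tail(unsigned char *last_page, int res, int at_file_end)
{
	int bytes;

	assert(res >= 0);
	bytes = res % EX_PAGE_SIZE;	/* valid bytes in the last page */
	if (!at_file_end || !bytes || !last_page)
		return 0;

	memset(last_page + bytes, 0, EX_PAGE_SIZE - bytes);
	return EX_PAGE_SIZE - bytes;
}
```

The NULL case is the one the original code got wrong: it zeroed
pages[nr_pages - 1] even when that was not the last page of the block.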
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
Updating the machine's kernel from v5.19.x to v6.0.x causes the machine to not
successfully boot. The machine boots successfully (and exhibits stable operation)
with version v5.19.17 and multiple earlier releases in the 5.19 line. Multiple releases
from the 6.0 line (including 6.0.0, 6.0.3, and 6.0.5), with no other changes to the
software environment, do not boot. Instead, the machine hangs after loading services
but before presenting a display manager; the machine instead shows repetitive hard
drive activity at this point and then no apparent activity.
''uname'' output for the machine successfully running v5.19.17 is:
Linux [MACHINE_NAME] 5.19.17 #1 SMP PREEMPT_DYNAMIC Mon Oct 24 13:32:29 2022 i686 Intel(R) Atom(TM) CPU N270 @ 1.60GHz GenuineIntel GNU/Linux
The machine is an OCZ Neutrino netbook, running a custom OS build largely similar to
LFS development. The kernel update uses ''make olddefconfig''.
#regzbot introduced: v5.19..v6.0
---
Dominic Jones
jonesd(a)xmission.com
Reinstating Cc stable, which I removed just before the discussion settled.
On Thu, 27 Oct 2022, Peter Xu wrote:
> ...
>
> After a re-read and 2nd thought, I think David has a valid point in that we
> shouldn't have special handling of !anon pages on CoW during fork(),
> because that seems to be against the fundamental concept of fork().
>
> So now I think I agree the !Anon original check does look a bit cleaner,
> and also make fork() behavior matching with the old/new kernels, irrelevant
> of the pin mess.
Thanks Peter. So Yuanzheng's patch for 5.10 is exactly right.
Sorry for leading everyone astray: my mistake was to suppose that
its !PageAnon check was simply to avoid the later BUG_ON(!anon_vma):
whereas David and Peter now agree that it actually corrects the
semantics for fork() on file pages.
I lift my hold on Yuanzheng's patch: nobody actually said "Acked-by",
but I think the discussion and resolution have given better than that.
(No 3rd thoughts please!)
Hugh
On Tue, Oct 11, 2022 at 03:46:15PM +0800, Yaxiong Tian wrote:
> >Note that there could be also the case in which there are many other
> >threads of executions on different cores trying to send a bunch of SCMI
> >commands (there are indeed many SCMI drivers currently in the stack) so
> >all these requests HAVE TO be queued while waiting for the shared
> >memory area to be free again, so I don't think you can assume in general
> >that shmem_tx_prepare() is called only after a successful command completion
> >or timeout: it is probably true when an SCMI Mailbox transport is
> >used but it is not necessarily the case when other shmem based transports
> >like optee/smc are used. You could have N threads trying to send an SCMI
> >command, queued, spinning on that loop, while waiting for the channel to
> >be free: in such a case note that your TX has not even completed, you are
> >still waiting for the channel, and the SCMI upper layer TX timeout has NOT
> >even been started since the transmission itself is pending (i.e.
> >send_message has still not returned...)
>
> I read the code in optee/smc: scmi_optee_send_message()/smc_send_message()
> take the channel lock before shmem_tx_prepare(). The lock is released when
> the transport completes, times out, or errors. So although N threads may be
> trying to send an SCMI command and are queued, only one can take up the channel.
> Also, only one thread calls shmem_tx_prepare() on a given channel and spins in it.
>
> This is also true for mailboxes and other shmem based transports, because the SCMI
> protocol says: "agent must exclusive the channel." It is very reasonable to do this
> through the channel lock rather than shared memory.
>
> So selecting the timeout period is simple. Because only one thread
> spins on a given channel, the timeout period depends only on the firmware.
> In other words, this time can be a constant.
>
Yes, my bad, I forgot that any shared-mem based transport has some kind
of mechanism to access the channel exclusively for the whole transaction,
so that no thread can issue a tx_prepare before the previous
transaction has fully completed (i.e. the result in the reply, if any,
was fetched before being overwritten by the next).
> >Not sure that all of this kind of work would be worth to address some,
> >maybe transient, error conditions due to a broken SCMI server, BUT in any
> >case, any kind of timeout you want to introduce in the spin loop MUST
> >result in a failed transmission until the FREE bitflag is cleared by the
> >SCMI server; i.e. if that flag won't be cleared EVER by the server, you
> >have to end up with a sequence of timed-out spinloops and transmission
> >failures, you definitely cannot recover forcibly like this.
>
> I totally agree with the above. With such a broken SCMI server, users cannot get any
> hints. So I think it should at least pr_warn(). We can set a prepare_tx_timout parameter
> in scmi_desc, or just set options for users to check the error.
>
Problem is anyway, as you said, you'll have to pick this timeout from the
related transport scmi_desc (even if as of now the max_rx_timeout for
all existing shared mem transports is the same..) and this means anyway
adding more complexity to the chain of calls just to print a warning of
some kind in a rare error situation from which you cannot recover anyway.
Due to other unrelated discussions, I was starting to think about
exposing some debug-only (Kconfig dependent) SCMI stats like timeouts, errors,
unexpected/OoO/late_replies in order to ease the debug and monitoring
of the health of a running SCMI stack: maybe this could be a place where
to flag these FW issues without changing the spinloop above (or
to add the kind of timeout you mentioned but only when some sort of
CONFIG_SCMI_DEBUG is enabled...)...still to fully think it through, though.
Any thoughts ?
Thanks,
Cristian
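For reference, the behaviour argued for in this thread (a timeout in the spin
loop must surface as a failed transmission and must never force the channel
free) could look roughly like the userspace sketch below. This is not the
SCMI driver code: the flag value, the function names and the use of
clock_gettime() are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define EX_CHAN_FREE 0x1u	/* hypothetical "channel free" bit */

static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Spin until the (simulated) server marks the channel free. On timeout,
 * report failure and leave the status word untouched: the caller must
 * fail the transmission, not take the channel over.
 */
static int wait_channel_free(atomic_uint *status, int64_t timeout_ms)
{
	int64_t deadline;

	assert(timeout_ms >= 0);
	deadline = now_ms() + timeout_ms;
	while (!(atomic_load(status) & EX_CHAN_FREE)) {
		if (now_ms() > deadline)
			return -ETIMEDOUT;
	}
	return 0;
}
```

Since only one thread can spin on a given channel at a time, a single
constant timeout per transport would be enough in such a scheme.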
Commit fc64623637da ("phy: qcom-qmp-combo,usb: add support for separate
PCS_USB region") started treating the PCS_USB registers as potentially
separate from the PCS registers but used the wrong base when no PCS_USB
offset has been provided.
Fix the PCS_USB base used at runtime resume to prevent dereferencing a
NULL pointer on platforms that do not provide a PCS_USB offset (e.g.
SC7180).
Fixes: fc64623637da ("phy: qcom-qmp-combo,usb: add support for separate PCS_USB region")
Cc: stable(a)vger.kernel.org # 5.20
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
index f6328434c61e..ad6a0fd7ba8e 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
@@ -2144,7 +2144,7 @@ static void qmp_combo_enable_autonomous_mode(struct qmp_phy *qphy)
static void qmp_combo_disable_autonomous_mode(struct qmp_phy *qphy)
{
const struct qmp_phy_cfg *cfg = qphy->cfg;
- void __iomem *pcs_usb = qphy->pcs_usb ?: qphy->pcs_usb;
+ void __iomem *pcs_usb = qphy->pcs_usb ?: qphy->pcs;
void __iomem *pcs_misc = qphy->pcs_misc;
/* Disable i/o clamp_n on resume for normal mode */
--
2.37.3
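The one-line fix relies on the GNU C "a ?: b" extension, which evaluates to a
when a is non-NULL and to b otherwise. A reduced userspace sketch of the bug
and the fix follows; struct ex_phy and both helper names are hypothetical
stand-ins, not the driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct qmp_phy. */
struct ex_phy {
	char *pcs;
	char *pcs_usb;	/* NULL when no separate PCS_USB offset is provided */
};

/* Buggy form: the fallback is the same (possibly NULL) pointer. */
static char *pcs_usb_base_buggy(const struct ex_phy *p)
{
	return p->pcs_usb ?: p->pcs_usb;
}

/* Fixed form: fall back to the PCS base. */
static char *pcs_usb_base_fixed(const struct ex_phy *p)
{
	assert(p->pcs);
	return p->pcs_usb ?: p->pcs;
}
```

With pcs_usb unset, the buggy helper returns NULL, which is what the runtime
resume path then dereferenced.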
I'm announcing the release of the 5.10.151 kernel.
All users of the 5.10 kernel series must upgrade.
The updated 5.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.10.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 5 ++++-
scripts/link-vmlinux.sh | 2 +-
scripts/pahole-flags.sh | 21 +++++++++++++++++++++
3 files changed, 26 insertions(+), 2 deletions(-)
Andrii Nakryiko (1):
kbuild: skip per-CPU BTF generation for pahole v1.18-v1.21
Greg Kroah-Hartman (1):
Linux 5.10.151
Ilya Leoshkevich (1):
bpf: Generate BTF_KIND_FLOAT when linking vmlinux
Javier Martinez Canillas (1):
kbuild: Quote OBJCOPY var to avoid a pahole call break the build
Jiri Olsa (1):
kbuild: Unify options for BTF generation for vmlinux and modules
Martin Rodriguez Reboredo (1):
kbuild: Add skip_encoding_btf_enum64 option to pahole
Until now, simpledrm unconditionally advertised all formats that can be
supported natively as conversions. However, we don't actually have a
full conversion matrix of helpers. Although the list is arguably
provided to userspace in precedence order, userspace can pick something
out-of-order (and thus break when it shouldn't), or simply only support
a format that is unsupported (and thus think it can work, which results
in the appearance of a hang as FB blits fail later on, instead of the
initialization error you'd expect in this case).
Split up the format table into separate ones for each required subset,
and then pick one based on the native format. Also remove the
native<->conversion overlap check from the helper (which doesn't make
sense any more, since the native format is advertised anyway and this
way RGB565/RGB888 can share a format table), and instead print the same
message in simpledrm when the native format is not one for which we have
conversions at all.
This fixes a real user regression where the ?RGB2101010 support commit
started advertising it unconditionally where not supported, and KWin
decided to start to use it over the native format, but it also fixes
the spurious RGB565/RGB888 formats which have been wrongly
unconditionally advertised since the dawn of simpledrm.
Note: this patch is merged because splitting it into two patches, one
for the helper and one for simpledrm, would regress at the midpoint
regardless of the order. If simpledrm is changed first, that would break
working conversions to RGB565/RGB888 (since those share a table that
does not include the native formats). If the helper is changed first, it
would start spuriously advertising all conversion formats when the
native format doesn't have any supported conversions at all.
Acked-by: Pekka Paalanen <pekka.paalanen(a)collabora.com>
Fixes: 6ea966fca084 ("drm/simpledrm: Add [AX]RGB2101010 formats")
Fixes: 11e8f5fd223b ("drm: Add simpledrm driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hector Martin <marcan(a)marcan.st>
---
drivers/gpu/drm/drm_format_helper.c | 15 -------
drivers/gpu/drm/tiny/simpledrm.c | 62 +++++++++++++++++++++++++----
2 files changed, 55 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c
index e2f76621453c..c60c13f3a872 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -864,20 +864,6 @@ size_t drm_fb_build_fourcc_list(struct drm_device *dev,
++fourccs;
}
- /*
- * The plane's atomic_update helper converts the framebuffer's color format
- * to a native format when copying to device memory.
- *
- * If there is not a single format supported by both, device and
- * driver, the native formats are likely not supported by the conversion
- * helpers. Therefore *only* support the native formats and add a
- * conversion helper ASAP.
- */
- if (!found_native) {
- drm_warn(dev, "Format conversion helpers required to add extra formats.\n");
- goto out;
- }
-
/*
* The extra formats, emulated by the driver, go second.
*/
@@ -898,7 +884,6 @@ size_t drm_fb_build_fourcc_list(struct drm_device *dev,
++fourccs;
}
-out:
return fourccs - fourccs_out;
}
EXPORT_SYMBOL(drm_fb_build_fourcc_list);
diff --git a/drivers/gpu/drm/tiny/simpledrm.c b/drivers/gpu/drm/tiny/simpledrm.c
index 18489779fb8a..1257411f3d44 100644
--- a/drivers/gpu/drm/tiny/simpledrm.c
+++ b/drivers/gpu/drm/tiny/simpledrm.c
@@ -446,22 +446,48 @@ static int simpledrm_device_init_regulators(struct simpledrm_device *sdev)
*/
/*
- * Support all formats of simplefb and maybe more; in order
- * of preference. The display's update function will do any
+ * Support the subset of formats that we have conversion helpers for,
+ * in order of preference. The display's update function will do any
* conversion necessary.
*
* TODO: Add blit helpers for remaining formats and uncomment
* constants.
*/
-static const uint32_t simpledrm_primary_plane_formats[] = {
+
+/*
+ * Supported conversions to RGB565 and RGB888:
+ * from [AX]RGB8888
+ */
+static const uint32_t simpledrm_primary_plane_formats_base[] = {
+ DRM_FORMAT_XRGB8888,
+ DRM_FORMAT_ARGB8888,
+};
+
+/*
+ * Supported conversions to [AX]RGB8888:
+ * A/X variants (no-op)
+ * from RGB565
+ * from RGB888
+ */
+static const uint32_t simpledrm_primary_plane_formats_xrgb8888[] = {
DRM_FORMAT_XRGB8888,
DRM_FORMAT_ARGB8888,
+ DRM_FORMAT_RGB888,
DRM_FORMAT_RGB565,
//DRM_FORMAT_XRGB1555,
//DRM_FORMAT_ARGB1555,
- DRM_FORMAT_RGB888,
+};
+
+/*
+ * Supported conversions to [AX]RGB2101010:
+ * A/X variants (no-op)
+ * from [AX]RGB8888
+ */
+static const uint32_t simpledrm_primary_plane_formats_xrgb2101010[] = {
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_ARGB2101010,
+ DRM_FORMAT_XRGB8888,
+ DRM_FORMAT_ARGB8888,
};
static const uint64_t simpledrm_primary_plane_format_modifiers[] = {
@@ -642,7 +668,8 @@ static struct simpledrm_device *simpledrm_device_create(struct drm_driver *drv,
struct drm_encoder *encoder;
struct drm_connector *connector;
unsigned long max_width, max_height;
- size_t nformats;
+ const uint32_t *conv_formats;
+ size_t conv_nformats, nformats;
int ret;
sdev = devm_drm_dev_alloc(&pdev->dev, drv, struct simpledrm_device, dev);
@@ -755,10 +782,31 @@ static struct simpledrm_device *simpledrm_device_create(struct drm_driver *drv,
dev->mode_config.funcs = &simpledrm_mode_config_funcs;
/* Primary plane */
+ switch (format->format) {
+ case DRM_FORMAT_RGB565:
+ case DRM_FORMAT_RGB888:
+ conv_formats = simpledrm_primary_plane_formats_base;
+ conv_nformats = ARRAY_SIZE(simpledrm_primary_plane_formats_base);
+ break;
+ case DRM_FORMAT_XRGB8888:
+ case DRM_FORMAT_ARGB8888:
+ conv_formats = simpledrm_primary_plane_formats_xrgb8888;
+ conv_nformats = ARRAY_SIZE(simpledrm_primary_plane_formats_xrgb8888);
+ break;
+ case DRM_FORMAT_XRGB2101010:
+ case DRM_FORMAT_ARGB2101010:
+ conv_formats = simpledrm_primary_plane_formats_xrgb2101010;
+ conv_nformats = ARRAY_SIZE(simpledrm_primary_plane_formats_xrgb2101010);
+ break;
+ default:
+ conv_formats = NULL;
+ conv_nformats = 0;
+ drm_warn(dev, "Format conversion helpers required to add extra formats.\n");
+ break;
+ }
nformats = drm_fb_build_fourcc_list(dev, &format->format, 1,
- simpledrm_primary_plane_formats,
- ARRAY_SIZE(simpledrm_primary_plane_formats),
+ conv_formats, conv_nformats,
sdev->formats, ARRAY_SIZE(sdev->formats));
primary_plane = &sdev->primary_plane;
--
2.35.1
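The table selection introduced by this patch can be modelled outside the
kernel. In the sketch below the enum values are hypothetical placeholders for
the DRM_FORMAT_* fourcc codes and ex_pick_conversions() is an illustrative
name; only the table contents and the switch structure follow the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholders for the DRM_FORMAT_* fourcc codes. */
enum ex_format {
	EX_XRGB8888, EX_ARGB8888, EX_RGB565, EX_RGB888,
	EX_XRGB2101010, EX_ARGB2101010, EX_OTHER,
};

#define EX_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const enum ex_format ex_formats_base[] = {
	EX_XRGB8888, EX_ARGB8888,
};
static const enum ex_format ex_formats_xrgb8888[] = {
	EX_XRGB8888, EX_ARGB8888, EX_RGB888, EX_RGB565,
};
static const enum ex_format ex_formats_xrgb2101010[] = {
	EX_XRGB2101010, EX_ARGB2101010, EX_XRGB8888, EX_ARGB8888,
};

/* Pick the conversion table for a native format; returns its length. */
static size_t ex_pick_conversions(enum ex_format native,
				  const enum ex_format **out)
{
	assert(out);
	switch (native) {
	case EX_RGB565:
	case EX_RGB888:
		*out = ex_formats_base;
		return EX_ARRAY_SIZE(ex_formats_base);
	case EX_XRGB8888:
	case EX_ARGB8888:
		*out = ex_formats_xrgb8888;
		return EX_ARRAY_SIZE(ex_formats_xrgb8888);
	case EX_XRGB2101010:
	case EX_ARGB2101010:
		*out = ex_formats_xrgb2101010;
		return EX_ARRAY_SIZE(ex_formats_xrgb2101010);
	default:
		*out = NULL;	/* no conversion helpers: native format only */
		return 0;
	}
}
```

A native format without any helpers gets an empty table, so only the native
format itself ends up being advertised.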
The following commit has been merged into the ras/core branch of tip:
Commit-ID: bc1b705b0eee4c645ad8b3bbff3c8a66e9688362
Gitweb: https://git.kernel.org/tip/bc1b705b0eee4c645ad8b3bbff3c8a66e9688362
Author: Yazen Ghannam <yazen.ghannam(a)amd.com>
AuthorDate: Tue, 21 Jun 2022 15:59:43
Committer: Borislav Petkov <bp(a)suse.de>
CommitterDate: Thu, 27 Oct 2022 17:01:25 +02:00
x86/MCE/AMD: Clear DFR errors found in THR handler
AMD's MCA Thresholding feature counts errors of all severity levels, not
just correctable errors. If a deferred error causes the threshold limit
to be reached (it was the error that caused the overflow), then both a
deferred error interrupt and a thresholding interrupt will be triggered.
The order of the interrupts is not guaranteed. If the threshold
interrupt handler is executed first, then it will clear MCA_STATUS for
the error. It will not check or clear MCA_DESTAT which also holds a copy
of the deferred error. When the deferred error interrupt handler runs it
will not find an error in MCA_STATUS, but it will find the error in
MCA_DESTAT. This will cause two errors to be logged.
Check for deferred errors when handling a threshold interrupt. If a bank
contains a deferred error, then clear the bank's MCA_DESTAT register.
Define a new helper function to do the deferred error check and clearing
of MCA_DESTAT.
[ bp: Simplify, convert comment to passive voice. ]
Fixes: 37d43acfd79f ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers")
Signed-off-by: Yazen Ghannam <yazen.ghannam(a)amd.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20220621155943.33623-1-yazen.ghannam@amd.com
---
arch/x86/kernel/cpu/mce/amd.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1c87501..10fb5b5 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -788,6 +788,24 @@ _log_error_bank(unsigned int bank, u32 msr_stat, u32 msr_addr, u64 misc)
return status & MCI_STATUS_DEFERRED;
}
+static bool _log_error_deferred(unsigned int bank, u32 misc)
+{
+ if (!_log_error_bank(bank, mca_msr_reg(bank, MCA_STATUS),
+ mca_msr_reg(bank, MCA_ADDR), misc))
+ return false;
+
+ /*
+ * Non-SMCA systems don't have MCA_DESTAT/MCA_DEADDR registers.
+ * Return true here to avoid accessing these registers.
+ */
+ if (!mce_flags.smca)
+ return true;
+
+ /* Clear MCA_DESTAT if the deferred error was logged from MCA_STATUS. */
+ wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
+ return true;
+}
+
/*
* We have three scenarios for checking for Deferred errors:
*
@@ -799,19 +817,8 @@ _log_error_bank(unsigned int bank, u32 msr_stat, u32 msr_addr, u64 misc)
*/
static void log_error_deferred(unsigned int bank)
{
- bool defrd;
-
- defrd = _log_error_bank(bank, mca_msr_reg(bank, MCA_STATUS),
- mca_msr_reg(bank, MCA_ADDR), 0);
-
- if (!mce_flags.smca)
- return;
-
- /* Clear MCA_DESTAT if we logged the deferred error from MCA_STATUS. */
- if (defrd) {
- wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
+ if (_log_error_deferred(bank, 0))
return;
- }
/*
* Only deferred errors are logged in MCA_DE{STAT,ADDR} so just check
@@ -832,7 +839,7 @@ static void amd_deferred_error_interrupt(void)
static void log_error_thresholding(unsigned int bank, u64 misc)
{
- _log_error_bank(bank, mca_msr_reg(bank, MCA_STATUS), mca_msr_reg(bank, MCA_ADDR), misc);
+ _log_error_deferred(bank, misc);
}
static void log_and_reset_block(struct threshold_block *block)
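The double-logging scenario described in this commit can be reproduced with a
toy model of one bank. Everything here is a simplification for illustration:
struct ex_bank, the ex_ helpers and the two status bits are assumptions, real
MSR access is replaced by plain fields, and the non-SMCA path is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_STATUS_VAL      (1ULL << 63)	/* illustrative "valid" bit */
#define EX_STATUS_DEFERRED (1ULL << 44)	/* illustrative "deferred" bit */

struct ex_bank {
	uint64_t status;	/* models MCA_STATUS */
	uint64_t destat;	/* models MCA_DESTAT */
	int logged;		/* number of errors logged */
};

/* Log and clear one register; returns true if a deferred error was logged. */
static bool ex_log_from(struct ex_bank *b, uint64_t *reg)
{
	uint64_t v = *reg;

	assert(b && reg);
	if (!(v & EX_STATUS_VAL))
		return false;
	b->logged++;
	*reg = 0;
	return v & EX_STATUS_DEFERRED;
}

/* Threshold handler; 'fixed' selects the behaviour after this commit. */
static void ex_threshold_irq(struct ex_bank *b, bool fixed)
{
	bool deferred = ex_log_from(b, &b->status);

	if (fixed && deferred)
		b->destat = 0;	/* also drop the copy in MCA_DESTAT */
}

static void ex_deferred_irq(struct ex_bank *b)
{
	if (ex_log_from(b, &b->status)) {
		b->destat = 0;
		return;
	}
	ex_log_from(b, &b->destat);
}

/* One deferred error, threshold interrupt handled first. */
static int ex_run(bool fixed)
{
	struct ex_bank b = {
		.status = EX_STATUS_VAL | EX_STATUS_DEFERRED,
		.destat = EX_STATUS_VAL | EX_STATUS_DEFERRED,
		.logged = 0,
	};

	ex_threshold_irq(&b, fixed);
	ex_deferred_irq(&b);
	return b.logged;
}
```

In this model the old threshold handler leaves the MCA_DESTAT copy behind, so
the deferred handler logs the same error a second time.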
pahole 1.24 broke builds on kernel <6.0 which do not have the
new BTF_KIND_ENUM64 BTF tag.
The 5.15 branch fixed this in commit b775fbf532dc01ae53a6fc56168fd30cb4b0c658
("kbuild: Add skip_encoding_btf_enum64 option to pahole"), which
we cannot use directly for 5.10 because 5.10 does not have the
pahole-flags.sh script, itself introduced in upstream commit
0baced0e0938f2895ceba54038eaf15ed91032e7 ("kbuild: Unify options
for BTF generation for vmlinux and modules").
That last commit is difficult to backport as 5.10 does not have BTF
for modules support: work around the problem by just copying the
pahole-flags.sh script and calling it directly in link-vmlinux.sh,
which is hopefully acceptable as the flags are not shared in this tree.
Note that compared to 5.15 the flags script does not have
--btf_gen_floats as linux 5.10 did not have that BTF tag yet;
but any new flag added to 5.15 will not be able to be added to 5.10 in
an identical way for any future breakage.
Cc: Martin Rodriguez Reboredo <yakoyoku(a)gmail.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
CC: Andrii Nakryiko <andrii(a)kernel.org>
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
---
This came up after updating nixpkgs to pahole 1.24.
https://github.com/NixOS/nixpkgs/pull/194551
Their 5.15's kernel built just fine as it already got some special
handling added, but since that handling was not added to other stable
kernels it started breaking builds after merging...
This shouldn't break anything, and should also as a byproduct fix some
builds with pahole 1.18 through 1.21 although I'm not sure if it never
has been backported to 5.10 because it's not a problem there or because
nobody cared (I probably only started caring after the 1.22 release)
Anyway, if more can be shared I think it'll make things simpler for
everyone going forward :)
scripts/link-vmlinux.sh | 2 +-
scripts/pahole-flags.sh | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)
create mode 100755 scripts/pahole-flags.sh
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index d0b44bee9286..c24da7b68619 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -161,7 +161,7 @@ gen_btf()
vmlinux_link ${1}
info "BTF" ${2}
- LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1}
+ LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J $("${srctree}/scripts/pahole-flags.sh") ${1}
# Create ${2} which contains just .BTF section but no symbols. Add
# SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
new file mode 100755
index 000000000000..8c82173e42e5
--- /dev/null
+++ b/scripts/pahole-flags.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+extra_paholeopt=
+
+if ! [ -x "$(command -v ${PAHOLE})" ]; then
+ exit 0
+fi
+
+pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/')
+
+if [ "${pahole_ver}" -ge "118" ] && [ "${pahole_ver}" -le "121" ]; then
+ # pahole 1.18 through 1.21 can't handle zero-sized per-CPU vars
+ extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_vars"
+fi
+
+if [ "${pahole_ver}" -ge "124" ]; then
+ extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_enum64"
+fi
+
+echo ${extra_paholeopt}
--
2.37.3
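The version handling in pahole-flags.sh can be restated in C to make the thresholds easier to check. This is only a sketch of the script's logic, not part of the patch: `pahole_version()` and `pahole_extra_flags()` are hypothetical names, and unlike the sed expression (which passes unmatched input through unchanged) the parser here returns -1 on malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Mirror of sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/': "v1.24" becomes 124.
 * Returns -1 if the string does not start with "v<digits>.<digits>".
 */
static int pahole_version(const char *s)
{
	size_t major_len, minor_len, i;
	int ver = 0;

	assert(s != NULL);
	if (*s != 'v')
		return -1;
	s++;

	major_len = strspn(s, "0123456789");
	if (major_len == 0 || s[major_len] != '.')
		return -1;
	minor_len = strspn(s + major_len + 1, "0123456789");
	if (minor_len == 0)
		return -1;

	/* Concatenate the major and minor digits, skipping the dot. */
	for (i = 0; i < major_len + 1 + minor_len; i++)
		if (s[i] != '.')
			ver = ver * 10 + (s[i] - '0');
	return ver;
}

/* Same thresholds as the script; the two ranges cannot overlap. */
static const char *pahole_extra_flags(int ver)
{
	if (ver >= 118 && ver <= 121)
		return "--skip_encoding_btf_vars";
	if (ver >= 124)
		return "--skip_encoding_btf_enum64";
	return "";
}
```

With this model, `pahole_extra_flags(pahole_version("v1.24"))` selects the enum64 flag, while versions 1.22 and 1.23 get no extra flag at all.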
The patch titled
Subject: mm, compaction: fix fast_isolate_around() to stay within boundaries
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-compaction-fix-fast_isolate_around-to-stay-within-boundaries.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: NARIBAYASHI Akira <a.naribayashi(a)fujitsu.com>
Subject: mm, compaction: fix fast_isolate_around() to stay within boundaries
Date: Wed, 26 Oct 2022 20:24:38 +0900
Depending on the memory configuration, isolate_freepages_block() may scan
pages out of the target range and cause a panic.
The problem is that pfn as argument of fast_isolate_around() could be out
of the target range. Therefore we should consider the case where pfn <
start_pfn, and also the case where end_pfn < pfn.
This problem should have been addressed by the commit 6e2b7044c199 ("mm,
compaction: make fast_isolate_freepages() stay within zone") but there was
an oversight.
Case1: pfn < start_pfn
<at memory compaction for node Y>
| node X's zone | node Y's zone
+-----------------+------------------------------...
pageblock ^ ^ ^
+-----------+-----------+-----------+-----------+...
^ ^ ^
^ ^ end_pfn
^ start_pfn = cc->zone->zone_start_pfn
pfn
<---------> scanned range by "Scan After"
Case2: end_pfn < pfn
<at memory compaction for node X>
| node X's zone | node Y's zone
+-----------------+------------------------------...
pageblock ^ ^ ^
+-----------+-----------+-----------+-----------+...
^ ^ ^
^ ^ pfn
^ end_pfn
start_pfn
<---------> scanned range by "Scan Before"
It seems that there is no good reason to skip nr_isolated pages just after
the given pfn. So let's perform a simple scan from start to end instead of
dividing the scan into "Before" and "After".
Link: https://lkml.kernel.org/r/20221026112438.236336-1-a.naribayashi@fujitsu.com
Fixes: 6e2b7044c199 ("mm, compaction: make fast_isolate_freepages() stay within zone")
Signed-off-by: NARIBAYASHI Akira <a.naribayashi(a)fujitsu.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/compaction.c~mm-compaction-fix-fast_isolate_around-to-stay-within-boundaries
+++ a/mm/compaction.c
@@ -1344,7 +1344,7 @@ move_freelist_tail(struct list_head *fre
}
static void
-fast_isolate_around(struct compact_control *cc, unsigned long pfn, unsigned long nr_isolated)
+fast_isolate_around(struct compact_control *cc, unsigned long pfn)
{
unsigned long start_pfn, end_pfn;
struct page *page;
@@ -1365,21 +1365,13 @@ fast_isolate_around(struct compact_contr
if (!page)
return;
- /* Scan before */
- if (start_pfn != pfn) {
- isolate_freepages_block(cc, &start_pfn, pfn, &cc->freepages, 1, false);
- if (cc->nr_freepages >= cc->nr_migratepages)
- return;
- }
-
- /* Scan after */
- start_pfn = pfn + nr_isolated;
- if (start_pfn < end_pfn)
- isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false);
+ isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false);
/* Skip this pageblock in the future as it's full or nearly full */
if (cc->nr_freepages < cc->nr_migratepages)
set_pageblock_skip(page);
+
+ return;
}
/* Search orders in round-robin fashion */
@@ -1556,7 +1548,7 @@ fast_isolate_freepages(struct compact_co
return cc->free_pfn;
low_pfn = page_to_pfn(page);
- fast_isolate_around(cc, low_pfn, nr_isolated);
+ fast_isolate_around(cc, low_pfn);
return low_pfn;
}
_
Patches currently in -mm which might be from a.naribayashi(a)fujitsu.com are
mm-compaction-fix-fast_isolate_around-to-stay-within-boundaries.patch
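The two cases in the changelog can be checked with a small userspace model. This is a sketch under assumptions: `clamp_scan_range()` is a hypothetical helper, and the clamping of the pageblock to the zone boundaries is inferred from the diagrams (start_pfn = cc->zone->zone_start_pfn), since that part of fast_isolate_around() is not shown in the diff. The point is that the single scan covers [start, end) regardless of where pfn lies relative to that range.

```c
#include <assert.h>

struct pfn_range {
	unsigned long start;	/* first pfn to scan */
	unsigned long end;	/* one past the last pfn to scan */
};

/* Pageblock containing pfn, clamped to [zone_start_pfn, zone_end_pfn). */
static struct pfn_range clamp_scan_range(unsigned long pfn,
					 unsigned long pageblock_nr_pages,
					 unsigned long zone_start_pfn,
					 unsigned long zone_end_pfn)
{
	struct pfn_range r;

	/* Pageblocks are assumed to be power-of-two sized and aligned. */
	assert(pageblock_nr_pages &&
	       (pageblock_nr_pages & (pageblock_nr_pages - 1)) == 0);

	r.start = pfn & ~(pageblock_nr_pages - 1);
	r.end = r.start + pageblock_nr_pages;

	if (r.start < zone_start_pfn)
		r.start = zone_start_pfn;
	if (r.end > zone_end_pfn)
		r.end = zone_end_pfn;
	return r;
}
```

For Case 1, a zone starting at pfn 1536 with 1024-page pageblocks and pfn = 1100 gives the range [1536, 2048), so the old "Scan After" start of pfn + nr_isolated would have been below the zone. For Case 2, a zone ending at 1536 with pfn = 1600 gives [1024, 1536), so the old "Scan Before" end of pfn would have been past the zone.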
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
12df140f0bdf ("mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages")
db71ef79b59b ("hugetlb: make free_huge_page irq safe")
10c6ec49802b ("hugetlb: change free_pool_huge_page to remove_pool_huge_page")
1121828a0c21 ("hugetlb: call update_and_free_page without hugetlb_lock")
6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
2938396771c8 ("hugetlb: add per-hstate mutex to synchronize user adjustments")
5c8ecb131a65 ("mm/hugetlb_cgroup: remove unnecessary VM_BUG_ON_PAGE in hugetlb_cgroup_migrate()")
5af1ab1d24e0 ("mm/hugetlb: optimize the surplus state transfer code in move_hugetlb_state()")
6c0371490140 ("hugetlb: convert PageHugeFreed to HPageFreed flag")
9157c31186c3 ("hugetlb: convert PageHugeTemporary() to HPageTemporary flag")
8f251a3d5ce3 ("hugetlb: convert page_huge_active() HPageMigratable flag")
d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
dbfee5aee7e5 ("hugetlb: fix update_and_free_page contig page struct assumption")
3f1b0162f6f6 ("mm/hugetlb: remove unnecessary VM_BUG_ON_PAGE on putback_active_hugepage()")
1d88433bb008 ("mm/hugetlb: fix use after free when subpool max_hpages accounting is not enabled")
0aa7f3544aaa ("mm/hugetlb: avoid unnecessary hugetlb_acct_memory() call")
ecbf4724e606 ("mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active")
0eb2df2b5629 ("mm: hugetlb: fix a race between isolating and freeing page")
7ffddd499ba6 ("mm: hugetlb: fix a race between freeing and dissolving the page")
585fc0d2871c ("mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 12df140f0bdfae5dcfc81800970dd7f6f632e00c Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel(a)surriel.com>
Date: Mon, 17 Oct 2022 20:25:05 -0400
Subject: [PATCH] mm,hugetlb: take hugetlb_lock before decrementing
h->resv_huge_pages
The h->*_huge_pages counters are protected by the hugetlb_lock, but
alloc_huge_page has a corner case where it can decrement the counter
outside of the lock.
This could lead to a corrupted value of h->resv_huge_pages, which we have
observed on our systems.
Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a
potential race.
Link: https://lkml.kernel.org/r/20221017202505.0e6a4fcd@imladris.surriel.com
Fixes: a88c76954804 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count")
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Glen McCready <gkmccready(a)meta.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b586cdd75930..dede0337c07c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2924,11 +2924,11 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
page = alloc_buddy_huge_page_with_mpol(h, vma, addr);
if (!page)
goto out_uncharge_cgroup;
+ spin_lock_irq(&hugetlb_lock);
if (!avoid_reserve && vma_has_reserves(vma, gbl_chg)) {
SetHPageRestoreReserve(page);
h->resv_huge_pages--;
}
- spin_lock_irq(&hugetlb_lock);
list_add(&page->lru, &h->hugepage_activelist);
set_page_refcounted(page);
/* Fall through */
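The ordering change can be illustrated with a single-threaded model that only records whether the lock is held when the counter is touched. This is a sketch, not kernel code: `struct hstate_model` and every function below are hypothetical, and the real race of course needs two CPUs to show up. The model just makes the rule "decrement only under hugetlb_lock" explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hstate_model {
	bool lock_held;		/* stands in for hugetlb_lock */
	long resv_huge_pages;
};

static void model_lock(struct hstate_model *h)   { h->lock_held = true; }
static void model_unlock(struct hstate_model *h) { h->lock_held = false; }

/* Refuses to touch the counter unless the lock is held. */
static bool consume_reserve(struct hstate_model *h)
{
	assert(h != NULL);
	if (!h->lock_held)
		return false;
	h->resv_huge_pages--;
	return true;
}

/* Ordering before the patch: decrement first, lock afterwards. */
static bool alloc_old_order(struct hstate_model *h)
{
	bool ok = consume_reserve(h);

	model_lock(h);
	/* list_add() and set_page_refcounted() would go here */
	model_unlock(h);
	return ok;
}

/* Ordering after the patch: lock first, then decrement. */
static bool alloc_new_order(struct hstate_model *h)
{
	bool ok;

	model_lock(h);
	ok = consume_reserve(h);
	model_unlock(h);
	return ok;
}
```

In this model alloc_old_order() reports the unlocked access and leaves the counter alone, while alloc_new_order() decrements it with the lock held.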
The same patch, 12df140f0bdf ("mm,hugetlb: take hugetlb_lock before
decrementing h->resv_huge_pages"), does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
The possible dependencies and the original commit are identical to those
listed in the 5.10-stable notice above.
thanks,
greg k-h
The same patch, 12df140f0bdf ("mm,hugetlb: take hugetlb_lock before
decrementing h->resv_huge_pages"), does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
The possible dependencies and the original commit are identical to those
listed in the 5.10-stable notice above.
thanks,
greg k-h
The same patch, 12df140f0bdf ("mm,hugetlb: take hugetlb_lock before
decrementing h->resv_huge_pages"), does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
The possible dependencies and the original commit are identical to those
listed in the 5.10-stable notice above.
thanks,
greg k-h
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
kvaser_usb uses completions to signal when a response event is received
for outgoing commands.
However, it uses init_completion() to reinitialize the start_comp and
stop_comp completions before sending the start/stop commands.
In case the device sends the corresponding response just before the
actual command is sent, complete() may be called concurrently with
init_completion() which is not safe.
This might be triggerable even with a properly functioning device by
stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which
also causes the driver to send CMD_STOP_CHIP when restart-ms is off),
but that was not tested.
Fix the issue by using reinit_completion() instead.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices")
Tested-by: Jimmy Assarsson <extja(a)kvaser.com>
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Signed-off-by: Jimmy Assarsson <extja(a)kvaser.com>
Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c | 4 ++--
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
index 7b52fda73d82..66f672ea631b 100644
--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
@@ -1875,7 +1875,7 @@ static int kvaser_usb_hydra_start_chip(struct kvaser_usb_net_priv *priv)
{
int err;
- init_completion(&priv->start_comp);
+ reinit_completion(&priv->start_comp);
err = kvaser_usb_hydra_send_simple_cmd(priv->dev, CMD_START_CHIP_REQ,
priv->channel);
@@ -1893,7 +1893,7 @@ static int kvaser_usb_hydra_stop_chip(struct kvaser_usb_net_priv *priv)
{
int err;
- init_completion(&priv->stop_comp);
+ reinit_completion(&priv->stop_comp);
/* Make sure we do not report invalid BUS_OFF from CMD_CHIP_STATE_EVENT
* see comment in kvaser_usb_hydra_update_state()
diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
index 50f2ac8319ff..19958037720f 100644
--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
@@ -1320,7 +1320,7 @@ static int kvaser_usb_leaf_start_chip(struct kvaser_usb_net_priv *priv)
{
int err;
- init_completion(&priv->start_comp);
+ reinit_completion(&priv->start_comp);
err = kvaser_usb_leaf_send_simple_cmd(priv->dev, CMD_START_CHIP,
priv->channel);
@@ -1338,7 +1338,7 @@ static int kvaser_usb_leaf_stop_chip(struct kvaser_usb_net_priv *priv)
{
int err;
- init_completion(&priv->stop_comp);
+ reinit_completion(&priv->stop_comp);
err = kvaser_usb_leaf_send_simple_cmd(priv->dev, CMD_STOP_CHIP,
priv->channel);
base-commit: e2badb4bd33abe13ddc35975bd7f7f8693955a4b
--
2.35.1
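The difference between the two calls can be modelled in userspace. This is a sketch based on my understanding that init_completion() resets the done counter and also re-initialises the wait queue (including its lock), whereas reinit_completion() only resets the done counter. `struct completion_model` and the `model_*` functions are hypothetical stand-ins, with a counter in place of the real wait queue.

```c
#include <assert.h>
#include <stddef.h>

struct completion_model {
	unsigned int done;		/* pending completions */
	unsigned int waitq_inits;	/* times the wait queue was (re)initialised */
};

/* Full initialisation: only safe when nobody else can touch the object. */
static void model_init_completion(struct completion_model *x)
{
	assert(x != NULL);
	x->done = 0;
	x->waitq_inits++;
}

/* Re-arm for reuse: the wait queue and its lock are left alone. */
static void model_reinit_completion(struct completion_model *x)
{
	assert(x != NULL);
	x->done = 0;
}

static void model_complete(struct completion_model *x)
{
	assert(x != NULL);
	x->done++;
}
```

Initialising once at setup time and calling the reinit variant before each start/stop command keeps `waitq_inits` at 1, which is the property that makes a concurrent complete() tolerable.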
Commit 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value
seq_file") introduced a NULL dereference in show_smaps_rollup() when the
task has no VMAs.
Fixes: 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value seq_file")
Signed-off-by: Seth Jenkins <sethjenkins(a)google.com>
Reviewed-by: Alexey Dobriyan <adobriyan(a)gmail.com>
Tested-by: Alexey Dobriyan <adobriyan(a)gmail.com>
---
c4c84f06285e on upstream resolves this issue as part of the switch to using
maple trees for VMA lookups, but a fix must still be applied to stable trees
4.19-5.19.
fs/proc/task_mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4e0023643f8b..1e7bbc0873a4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -969,7 +969,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
vma = vma->vm_next;
}
- show_vma_header_prefix(m, priv->mm->mmap->vm_start,
+ show_vma_header_prefix(m, priv->mm->mmap ? priv->mm->mmap->vm_start : 0,
last_vma_end, 0, 0, 0, 0);
seq_pad(m, ' ');
seq_puts(m, "[rollup]\n");
--
2.38.0.rc1.362.ged0d419d3c-goog
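The guard added by the one-line fix can be shown in isolation. This is a minimal sketch with hypothetical `vma_model` and `mm_model` types that keep only the fields the expression touches. `rollup_start()` is an illustrative name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

struct vma_model {
	unsigned long vm_start;
	struct vma_model *vm_next;
};

struct mm_model {
	struct vma_model *mmap;	/* head of the VMA list, NULL if empty */
};

/* Start address for the rollup header; 0 when the task has no VMAs. */
static unsigned long rollup_start(const struct mm_model *mm)
{
	assert(mm != NULL);
	return mm->mmap ? mm->mmap->vm_start : 0;
}
```

An mm with an empty VMA list now yields 0 instead of dereferencing a NULL list head.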
From: Sean Christopherson <seanjc(a)google.com>
Reject kvm_gpc_check() and kvm_gpc_refresh() if the cache is inactive.
Not checking the active flag during refresh is particularly egregious, as
KVM can end up with a valid, inactive cache, which can lead to a variety
of use-after-free bugs, e.g. consuming a NULL kernel pointer or missing
an mmu_notifier invalidation due to the cache not being on the list of
gfns to invalidate.
Note, "active" needs to be set if and only if the cache is on the list
of caches, i.e. is reachable via mmu_notifier events. If a relevant
mmu_notifier event occurs while the cache is "active" but not on the
list, KVM will not acquire the cache's lock and so will not serialize
the mmu_notifier event with active users and/or kvm_gpc_refresh().
A race between KVM_XEN_ATTR_TYPE_SHARED_INFO and KVM_XEN_HVM_EVTCHN_SEND
can be exploited to trigger the bug.
1. Deactivate shinfo cache:
kvm_xen_hvm_set_attr
case KVM_XEN_ATTR_TYPE_SHARED_INFO
kvm_gpc_deactivate
kvm_gpc_unmap
gpc->valid = false
gpc->khva = NULL
gpc->active = false
Result: active = false, valid = false
2. Cause cache refresh:
kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
kvm_xen_hvm_evtchn_send
kvm_xen_set_evtchn
kvm_xen_set_evtchn_fast
kvm_gpc_check
return -EWOULDBLOCK because !gpc->valid
kvm_xen_set_evtchn_fast
return -EWOULDBLOCK
kvm_gpc_refresh
hva_to_pfn_retry
gpc->valid = true
gpc->khva = not NULL
Result: active = false, valid = true
3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl
KVM_XEN_ATTR_TYPE_SHARED_INFO:
kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
kvm_xen_hvm_evtchn_send
kvm_xen_set_evtchn
kvm_xen_set_evtchn_fast
read_lock gpc->lock
kvm_xen_hvm_set_attr case
KVM_XEN_ATTR_TYPE_SHARED_INFO
mutex_lock kvm->lock
kvm_xen_shared_info_init
kvm_gpc_activate
gpc->khva = NULL
kvm_gpc_check
[ Check passes because gpc->valid is
still true, even though gpc->khva
is already NULL. ]
shinfo = gpc->khva
pending_bits = shinfo->evtchn_pending
CRASH: test_and_set_bit(..., pending_bits)
Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Cc: stable(a)vger.kernel.org
Reported-by: Michal Luczaj <mhal(a)rbox.co>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20221013211234.1318131-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
virt/kvm/pfncache.c | 41 ++++++++++++++++++++++++++++++++++-------
1 file changed, 34 insertions(+), 7 deletions(-)
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 08f97cf97264..346e47f15572 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -81,6 +81,9 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
{
struct kvm_memslots *slots = kvm_memslots(kvm);
+ if (!gpc->active)
+ return false;
+
if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
return false;
@@ -240,10 +243,11 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
{
struct kvm_memslots *slots = kvm_memslots(kvm);
unsigned long page_offset = gpa & ~PAGE_MASK;
- kvm_pfn_t old_pfn, new_pfn;
+ bool unmap_old = false;
unsigned long old_uhva;
+ kvm_pfn_t old_pfn;
void *old_khva;
- int ret = 0;
+ int ret;
/*
* If must fit within a single page. The 'len' argument is
@@ -261,6 +265,11 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
write_lock_irq(&gpc->lock);
+ if (!gpc->active) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
old_pfn = gpc->pfn;
old_khva = gpc->khva - offset_in_page(gpc->khva);
old_uhva = gpc->uhva;
@@ -291,6 +300,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
/* If the HVA→PFN mapping was already valid, don't unmap it. */
old_pfn = KVM_PFN_ERR_FAULT;
old_khva = NULL;
+ ret = 0;
}
out:
@@ -305,14 +315,15 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
gpc->khva = NULL;
}
- /* Snapshot the new pfn before dropping the lock! */
- new_pfn = gpc->pfn;
+ /* Detect a pfn change before dropping the lock! */
+ unmap_old = (old_pfn != gpc->pfn);
+out_unlock:
write_unlock_irq(&gpc->lock);
mutex_unlock(&gpc->refresh_lock);
- if (old_pfn != new_pfn)
+ if (unmap_old)
gpc_unmap_khva(kvm, old_pfn, old_khva);
return ret;
@@ -366,11 +377,19 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
gpc->vcpu = vcpu;
gpc->usage = usage;
gpc->valid = false;
- gpc->active = true;
spin_lock(&kvm->gpc_lock);
list_add(&gpc->list, &kvm->gpc_list);
spin_unlock(&kvm->gpc_lock);
+
+ /*
+ * Activate the cache after adding it to the list, a concurrent
+ * refresh must not establish a mapping until the cache is
+ * reachable by mmu_notifier events.
+ */
+ write_lock_irq(&gpc->lock);
+ gpc->active = true;
+ write_unlock_irq(&gpc->lock);
}
return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
}
@@ -379,12 +398,20 @@ EXPORT_SYMBOL_GPL(kvm_gpc_activate);
void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
{
if (gpc->active) {
+ /*
+ * Deactivate the cache before removing it from the list, KVM
+ * must stall mmu_notifier events until all users go away, i.e.
+ * until gpc->lock is dropped and refresh is guaranteed to fail.
+ */
+ write_lock_irq(&gpc->lock);
+ gpc->active = false;
+ write_unlock_irq(&gpc->lock);
+
spin_lock(&kvm->gpc_lock);
list_del(&gpc->list);
spin_unlock(&kvm->gpc_lock);
kvm_gfn_to_pfn_cache_unmap(kvm, gpc);
- gpc->active = false;
}
}
EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
--
2.31.1
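The effect of the new active checks on steps 1 and 2 of the scenario can be traced with a small state model. This is a sketch only: `struct gpc_model` and the `gpc_*` helpers are hypothetical, locking is omitted, and the real check and refresh functions validate much more (gpa, length, memslot generation). What it shows is that a refresh on an inactive cache now fails with -EINVAL instead of leaving the cache valid but inactive.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct gpc_model {
	bool active;
	bool valid;
	void *khva;
};

/* With the fix, an inactive cache never passes the check. */
static bool gpc_check(const struct gpc_model *gpc)
{
	assert(gpc != NULL);
	return gpc->active && gpc->valid;
}

/* With the fix, refresh refuses to map anything for an inactive cache. */
static int gpc_refresh(struct gpc_model *gpc, void *mapping)
{
	assert(gpc != NULL);
	if (!gpc->active)
		return -EINVAL;
	gpc->khva = mapping;
	gpc->valid = true;
	return 0;
}

/* Step 1 of the scenario: unmap and mark the cache inactive. */
static void gpc_deactivate(struct gpc_model *gpc)
{
	assert(gpc != NULL);
	gpc->active = false;
	gpc->valid = false;
	gpc->khva = NULL;
}
```

Replaying step 2 against this model, the refresh after deactivation returns -EINVAL and the state stays at active = false, valid = false, so the precondition for the crash in step 3 is never established.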
The vma->anon_vma of the child process may be NULL because
the entire vma does not contain anonymous pages. In this
case, a BUG will occur when the copy_present_page() passes
a copy of a non-anonymous page of that vma to the
page_add_new_anon_rmap() to set up new anonymous rmap.
------------[ cut here ]------------
kernel BUG at mm/rmap.c:1044!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in:
CPU: 2 PID: 3617 Comm: test Not tainted 5.10.149 #1
Hardware name: linux,dummy-virt (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : __page_set_anon_rmap+0xbc/0xf8
lr : __page_set_anon_rmap+0xbc/0xf8
sp : ffff800014c1b870
x29: ffff800014c1b870 x28: 0000000000000001
x27: 0000000010100073 x26: ffff1d65c517baa8
x25: ffff1d65cab0f000 x24: ffff1d65c416d800
x23: ffff1d65cab5f248 x22: 0000000020000000
x21: 0000000000000001 x20: 0000000000000000
x19: fffffe75970023c0 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : ffffc3096d5fb858 x8 : 0000000000000000
x7 : 0000000000000011 x6 : ffff5a5c9089c000
x5 : 0000000000020000 x4 : ffff5a5c9089c000
x3 : ffffc3096d200000 x2 : ffffc3096e8d0000
x1 : ffff1d65ca3da740 x0 : 0000000000000000
Call trace:
__page_set_anon_rmap+0xbc/0xf8
page_add_new_anon_rmap+0x1e0/0x390
copy_pte_range+0xd00/0x1248
copy_page_range+0x39c/0x620
dup_mmap+0x2e0/0x5a8
dup_mm+0x78/0x140
copy_process+0x918/0x1a20
kernel_clone+0xac/0x638
__do_sys_clone+0x78/0xb0
__arm64_sys_clone+0x30/0x40
el0_svc_common.constprop.0+0xb0/0x308
do_el0_svc+0x48/0xb8
el0_svc+0x24/0x38
el0_sync_handler+0x160/0x168
el0_sync+0x180/0x1c0
Code: 97f8ff85 f9400294 17ffffeb 97f8ff82 (d4210000)
---[ end trace a972347688dc9bd4 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: 0x43095d200000 from 0xffff800010000000
PHYS_OFFSET: 0xffffe29a80000000
CPU features: 0x08200022,61806082
Memory Limit: none
---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
This problem has been fixed by commit fb3d824d1a46
("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()"),
but it still exists in the linux-5.10.y branch.
That commit cannot be applied to this version because the
code bases differ too much. Therefore, fix the problem by
adding a non-anonymous page check in copy_present_page().
Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes")
Signed-off-by: Yuanzheng Song <songyuanzheng(a)huawei.com>
---
mm/memory.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index cc50fa0f4590..45973fd97be8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -823,6 +823,17 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
if (likely(!page_maybe_dma_pinned(page)))
return 1;
+ /*
+ * The vma->anon_vma of the child process may be NULL
+ * because the entire vma does not contain anonymous pages.
+ * A BUG will occur when the copy_present_page() passes
+ * a copy of a non-anonymous page of that vma to the
+ * page_add_new_anon_rmap() to set up new anonymous rmap.
+ * Return 1 if the page is not an anonymous page.
+ */
+ if (!PageAnon(page))
+ return 1;
+
new_page = *prealloc;
if (!new_page)
return -EAGAIN;
--
2.25.1
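The guard added by this patch can be looked at in isolation. What follows is a minimal userspace sketch, not the kernel code: struct page_model, its two fields and copy_present_page_model() are hypothetical stand-ins for struct page, PageAnon(), page_maybe_dma_pinned() and copy_present_page(). It only models the order of the two early returns.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the two tested properties. */
struct page_model {
	bool anon;         /* models PageAnon(page) */
	bool maybe_pinned; /* models page_maybe_dma_pinned(page) */
};

/*
 * Returns 1 when the caller should take the normal (shared) path,
 * 0 when an early copy of the page would be made.
 */
int copy_present_page_model(const struct page_model *page)
{
	/* Pages that are not pinned never need the early copy. */
	if (!page->maybe_pinned)
		return 1;

	/* New check: a non-anonymous page must not get an anon rmap. */
	if (!page->anon)
		return 1;

	return 0;
}
```

With the second check in place, a pinned file-backed page takes the same path as an unpinned page, so the model never reaches the code that would set up an anonymous rmap for it.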
The following patches have been applied to 6.0, but only patch#2 below
has been applied to stable. This caused a regression with nfs tests in
all stable releases.
This patchset backports patches 1 and 3-6 to stable.
1. 868941b14441 fs: remove no_llseek
2. 97ef77c52b78 fs: check FMODE_LSEEK to control internal pipe splicing
3. 54ef7a47f67d vfio: do not set FMODE_LSEEK flag
4. c9eb2d427c1c dma-buf: remove useless FMODE_LSEEK flag
5. 4e3299eaddff fs: do not compare against ->llseek
6. e7478158e137 fs: clear or set FMODE_LSEEK based on llseek function
For 5.10.y and 5.4.y only, a revert of patch#2 is already included.
Please apply patch#2, for 5.4.y and 5.10.y as well.
Thanks,
Saeed
Jason A. Donenfeld (5):
fs: clear or set FMODE_LSEEK based on llseek function
fs: do not compare against ->llseek
dma-buf: remove useless FMODE_LSEEK flag
vfio: do not set FMODE_LSEEK flag
fs: remove no_llseek
Documentation/filesystems/porting.rst | 8 ++++++++
drivers/dma-buf/dma-buf.c | 1 -
drivers/gpu/drm/drm_file.c | 3 +--
drivers/vfio/vfio.c | 2 +-
fs/coredump.c | 4 ++--
fs/file_table.c | 2 ++
fs/open.c | 4 ++++
fs/overlayfs/copy_up.c | 3 +--
fs/read_write.c | 17 +++--------------
include/linux/fs.h | 2 +-
kernel/bpf/bpf_iter.c | 3 +--
11 files changed, 24 insertions(+), 25 deletions(-)
--
2.31.1
Hi
Here I'm submitting backports of patches
8238b4579866b7c1bb99883cfe102a43db5506ff and
d6ffe6067a54972564552ea45d320fb98db1ac5e to the stable branches.
Mikulas
From: Yu Kuai <yukuai3(a)huawei.com>
Lei Chen (1):
block: wbt: Remove unnecessary invoking of wbt_update_limits in
wbt_init
Yu Kuai (2):
blk-wbt: call rq_qos_add() after wb_normal is initialized
blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
block/blk-wbt.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
--
2.31.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
It is not allowed to call kfree_skb() from hardware interrupt context
or with interrupts being disabled. The skb is unlinked from the queue,
so it can be freed after spin_unlock_irqrestore().
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Acked-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.c…
Cc: stable(a)vger.kernel.org
[mkl: adjust subject]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/j1939/transport.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index d7d86c944d76..55f29c9f9e08 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -342,10 +342,12 @@ static void j1939_session_skb_drop_old(struct j1939_session *session)
__skb_unlink(do_skb, &session->skb_queue);
/* drop ref taken in j1939_session_skb_queue() */
skb_unref(do_skb);
+ spin_unlock_irqrestore(&session->skb_queue.lock, flags);
kfree_skb(do_skb);
+ } else {
+ spin_unlock_irqrestore(&session->skb_queue.lock, flags);
}
- spin_unlock_irqrestore(&session->skb_queue.lock, flags);
}
void j1939_session_skb_queue(struct j1939_session *session,
--
2.35.1
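The rule this patch enforces (unlink under the lock, free only after the lock is dropped) can be sketched outside the kernel. This is a simplified model under stated assumptions: struct queue_model, free_buf() and drop_old() are hypothetical names, the lock is a plain flag rather than a spinlock, and the queue holds at most one buffer. Unlike the patch, which unlocks in both branches, the sketch unlocks once and defers the free, which has the same effect.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a queue protected by an IRQ-disabling lock. */
struct queue_model {
	bool locked;            /* models the lock held with IRQs disabled */
	void *head;             /* single queued buffer, or NULL */
	int freed_while_locked; /* counts frees that happened under the lock */
};

/* Models kfree_skb(): records a violation if called with the lock held. */
static void free_buf(struct queue_model *q, void *buf)
{
	if (q->locked)
		q->freed_while_locked++;
	free(buf);
}

void drop_old(struct queue_model *q)
{
	void *victim;

	q->locked = true;  /* models spin_lock_irqsave() */
	victim = q->head;  /* unlink while the lock is held */
	q->head = NULL;
	q->locked = false; /* models spin_unlock_irqrestore() */

	/* The buffer is already unlinked, so freeing it unlocked is safe. */
	if (victim)
		free_buf(q, victim);
}
```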
From: Biju Das <biju.das.jz(a)bp.renesas.com>
We are seeing an IRQ storm on the global receive IRQ line under heavy
CAN bus load conditions with both CAN channels enabled.
Conditions:
The global receive IRQ line is shared between can0 and can1, either of
the channels can trigger interrupt while the other channel's IRQ line
is disabled (RFIE).
When a global receive IRQ interrupt occurs, we mask the interrupt in
the IRQ handler. Clearing and unmasking of the interrupt is happening
in rx_poll(). There is a race condition where rx_poll() unmasks the
interrupt, but the next IRQ handler does not mask the IRQ due to
NAPIF_STATE_MISSED flag (e.g.: can0 RX FIFO interrupt is disabled and
can1 is triggering RX interrupt, the delay in rx_poll() processing
results in setting NAPIF_STATE_MISSED flag) leading to an IRQ storm.
This patch fixes the issue by checking IRQ active and enabled before
handling the IRQ on a particular channel.
Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Suggested-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
Signed-off-by: Biju Das <biju.das.jz(a)bp.renesas.com>
Link: https://lore.kernel.org/all/20221025155657.1426948-2-biju.das.jz@bp.renesas…
Cc: stable(a)vger.kernel.org
[mkl: adjust commit message]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/rcar/rcar_canfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 567620d215f8..ea828c1bd3a1 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1157,11 +1157,13 @@ static void rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
{
struct rcar_canfd_channel *priv = gpriv->ch[ch];
u32 ridx = ch + RCANFD_RFFIFO_IDX;
- u32 sts;
+ u32 sts, cc;
/* Handle Rx interrupts */
sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv, ridx));
- if (likely(sts & RCANFD_RFSTS_RFIF)) {
+ cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
+ if (likely(sts & RCANFD_RFSTS_RFIF &&
+ cc & RCANFD_RFCC_RFIE)) {
if (napi_schedule_prep(&priv->napi)) {
/* Disable Rx FIFO interrupts */
rcar_canfd_clear_bit(priv->base,
--
2.35.1
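The condition introduced by this patch is "interrupt flag set AND interrupt enabled for this channel". A minimal sketch of that test follows. The bit positions are placeholders chosen for illustration, not the real RCANFD_RFSTS_RFIF and RCANFD_RFCC_RFIE values, and rx_fifo_irq_should_handle() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions; the real values live in rcar_canfd.c. */
#define RFSTS_RFIF_MODEL (UINT32_C(1) << 3) /* RX FIFO interrupt flag */
#define RFCC_RFIE_MODEL  (UINT32_C(1) << 1) /* RX FIFO interrupt enable */

/*
 * A channel is only serviced when its flag is raised and its
 * interrupt is currently enabled (i.e. not masked by NAPI polling).
 */
bool rx_fifo_irq_should_handle(uint32_t sts, uint32_t cc)
{
	return (sts & RFSTS_RFIF_MODEL) && (cc & RFCC_RFIE_MODEL);
}
```

In the scenario from the commit message, can0 has its flag set but its enable bit cleared while rx_poll() is still running, so the handler skips can0 and only services can1.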
commit 3dbc80a3e4c55c4a5b89ef207bed7b7de36157b4 upstream.
This commit is very different from the upstream commit! It fixes the same
issue by adding more quirks, rather than the general fix from the 6.1
kernel, because the general fix from the 6.1 kernel is part of a larger
refactoring of the backlight code which is not suitable for the stable
series.
As described in "ACPI: video: Drop NL5x?U, PF4NU1F and PF5?U??
acpi_backlight=native quirks" (10212754a0d2) the upstream commit "ACPI:
video: Make backlight class device registration a separate step (v2)"
(3dbc80a3e4c5) makes these quirks unnecessary. However as mentioned in this
bugtracker ticket https://bugzilla.kernel.org/show_bug.cgi?id=215683#c17
the upstream fix is part of a larger patchset that is overall too complex
for stable.
The TongFang GKxNRxx, GMxNGxx, GMxZGxx, and GMxRGxx / TUXEDO
Stellaris/Polaris Gen 1-4, have the same problem as the Clevo NL5xRU and
NL5xNU / TUXEDO Aura 15 Gen1 and Gen2:
They have a working native and video interface for screen backlight.
However the default detection mechanism first registers the video interface
before unregistering it again and switching to the native interface during
boot. This results in a dangling SBIOS request for backlight change for
some reason, causing the backlight to switch to ~2% once per boot on the
first power cord connect or disconnect event. Setting the native interface
explicitly circumvents this buggy behaviour by avoiding the unregistering
process.
Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Werner Sembach <wse(a)tuxedocomputers.com>
---
drivers/acpi/video_detect.c | 64 +++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index e39d59ad64964..b13713199ad94 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -500,6 +500,70 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
DMI_MATCH(DMI_BOARD_NAME, "PF5LUXG"),
},
},
+ /*
+ * More Tongfang devices with the same issue as the Clevo NL5xRU and
+ * NL5xNU/TUXEDO Aura 15 Gen1 and Gen2. See the description above.
+ */
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GKxNRxx",
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "GKxNRxx"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GKxNRxx",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+ DMI_MATCH(DMI_BOARD_NAME, "POLARIS1501A1650TI"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GKxNRxx",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+ DMI_MATCH(DMI_BOARD_NAME, "POLARIS1501A2060"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GKxNRxx",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+ DMI_MATCH(DMI_BOARD_NAME, "POLARIS1701A1650TI"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GKxNRxx",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+ DMI_MATCH(DMI_BOARD_NAME, "POLARIS1701A2060"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GMxNGxx",
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "GMxNGxx"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GMxZGxx",
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "GMxZGxx"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "TongFang GMxRGxx",
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "GMxRGxx"),
+ },
+ },
/*
* Desktops which falsely report a backlight and which our heuristics
* for this do not catch.
--
2.34.1
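The quirk entries above are matched against the machine's DMI strings. The sketch below models that lookup in userspace. It assumes substring matching, which to my understanding is how DMI_MATCH behaves (as opposed to DMI_EXACT_MATCH). struct quirk_model, force_native[] and needs_native_backlight() are hypothetical names, and only three of the entries from the patch are reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one quirk entry; vendor == NULL means "any vendor". */
struct quirk_model {
	const char *vendor;
	const char *board;
};

static const struct quirk_model force_native[] = {
	{ NULL,     "GKxNRxx" },
	{ "TUXEDO", "POLARIS1501A1650TI" },
	{ NULL,     "GMxNGxx" },
};

/* Returns 1 if any entry matches the given DMI strings, else 0. */
int needs_native_backlight(const char *vendor, const char *board)
{
	size_t i;

	for (i = 0; i < sizeof(force_native) / sizeof(force_native[0]); i++) {
		const struct quirk_model *q = &force_native[i];

		/* Every field present in the entry has to match. */
		if (q->vendor && !strstr(vendor, q->vendor))
			continue;
		if (strstr(board, q->board))
			return 1;
	}
	return 0;
}
```

The TUXEDO entries carry a vendor match as well, so a board with the same name from another vendor is left on the default detection path.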
Dear stable kernel maintainers,
Our production kernel team and ChromeOS kernel teams are reporting
that they are unable to symbolize addresses of symbols defined in
assembly sources due to a regression I caused with
commit a66049e2cf0e ("Kbuild: make DWARF version a choice")
I fixed this upstream with
commit 32ef9e5054ec ("Makefile.debug: re-enable debug info for .S files")
but I think this is infeasible to backport through to 4.19.y.
Do the attached branch-specific variants look acceptable?
--
Thanks,
~Nick Desaulniers
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
33806e7cb8d5 ("x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB")
c6dbd3e5e69c ("x86/mmx_32: Remove X86_USE_3DNOW")
6bf8a55d8344 ("x86: Fix misspelled Kconfig symbols")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 33806e7cb8d50379f55c3e8f335e91e1b359dc7b Mon Sep 17 00:00:00 2001
From: Nathan Chancellor <nathan(a)kernel.org>
Date: Thu, 29 Sep 2022 08:20:10 -0700
Subject: [PATCH] x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB
A recent change in LLVM made CONFIG_EFI_STUB unselectable because it no
longer pretends to support -mabi=ms, breaking the dependency in
Kconfig. Lack of CONFIG_EFI_STUB can prevent kernels from booting via
EFI in certain circumstances.
This check was added by
8f24f8c2fc82 ("efi/libstub: Annotate firmware routines as __efiapi")
to ensure that __attribute__((ms_abi)) was available, as -mabi=ms is
not actually used in any cflags.
According to the GCC documentation, this attribute has been supported
since GCC 4.4.7. The kernel currently requires GCC 5.1 so this check is
not necessary; even when that change landed in 5.6, the kernel required
GCC 4.9 so it was unnecessary then as well.
Clang supports __attribute__((ms_abi)) for all versions that are
supported for building the kernel so no additional check is needed.
Remove the 'depends on' line altogether to allow CONFIG_EFI_STUB to be
selected when CONFIG_EFI is enabled, regardless of compiler.
Fixes: 8f24f8c2fc82 ("efi/libstub: Annotate firmware routines as __efiapi")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/d1ad006a8f64bdc17f618deffa9e7c9…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6d1879ef933a..67745ceab0db 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1973,7 +1973,6 @@ config EFI
config EFI_STUB
bool "EFI stub support"
depends on EFI
- depends on $(cc-option,-mabi=ms) || X86_32
select RELOCATABLE
help
This kernel feature allows a bzImage to be loaded directly
Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx
is not set in the host MSR.
When booting a guest, KVM checks that the cpuid bit is actually set
in vmx.c, and if not, it does not enable the feature.
However, in nesting this control bit is blindly set, and will be
propagated to VMCS12 and VMCS02. Therefore, when L1 tries to boot
the guest, the host will try to execute VMLOAD with VMCS02 containing
a feature that the hardware does not support, making it fail with
hardware error 0x7.
According to section "Secondary Processor-Based VM-Execution Controls"
in the Intel SDM, software should *always* check the value in the
actual MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.
Not updating enable_sgx is responsible for a second bug:
vmx_set_cpu_caps() doesn't clear the SGX bits when hardware support is
unavailable. This is a much less problematic bug as it only pops up
if SGX is soft-disabled (the case being handled by cpu_has_sgx()) or if
SGX is supported for bare metal but not in the VMCS (will never happen
when running on bare metal, but can theoretically happen when running in
a VM).
Last but not least, KVM should ideally have module params reflect KVM's
actual configuration.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable(a)vger.kernel.org
Suggested-by: Sean Christopherson <seanjc(a)google.com>
Suggested-by: Bandan Das <bsd(a)redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
arch/x86/kvm/vmx/vmx.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9dba04b6b019..ea0c65d3c08a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8263,6 +8263,11 @@ static __init int hardware_setup(void)
if (!cpu_has_virtual_nmis())
enable_vnmi = 0;
+ #ifdef CONFIG_X86_SGX_KVM
+ if (!cpu_has_vmx_encls_vmexit())
+ enable_sgx = false;
+ #endif
+
/*
* set_apic_access_page_addr() is used to reload apic access
* page upon invalidation. No need to do anything if not
--
2.31.1
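The rule from the SDM quoted above (consult the capability MSR before enabling a control bit) can be sketched as follows. This is an illustration under assumptions, not the KVM implementation: it assumes the usual VMX capability MSR layout where the upper 32 bits hold the allowed-1 settings, and it assumes ENCLS exiting is bit 15 of the secondary controls. encls_exiting_allowed() and resolve_enable_sgx() are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of "ENCLS exiting" in the secondary controls. */
#define ENCLS_EXITING_MODEL (UINT32_C(1) << 15)

/*
 * Assumed layout: the upper 32 bits of the capability MSR say which
 * control bits are allowed to be set to 1.
 */
bool encls_exiting_allowed(uint64_t procbased_ctls2_msr)
{
	uint32_t allowed1 = (uint32_t)(procbased_ctls2_msr >> 32);

	return (allowed1 & ENCLS_EXITING_MODEL) != 0;
}

/* Models the fix: the module parameter is cleared if hardware lacks support. */
bool resolve_enable_sgx(bool requested, uint64_t procbased_ctls2_msr)
{
	return requested && encls_exiting_allowed(procbased_ctls2_msr);
}
```

Once the parameter reflects the hardware, every later user of it (including the nested VMX setup) sees a consistent answer.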
commit 456797da792fa7cbf6698febf275fe9b36691f78 upstream.
arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.
This appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.
With the MPIDR references removed, store_cpu_topology() can be moved to
the common arch_topology code.
Reviewed-by: Sudeep Holla <sudeep.holla(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Reviewed-by: Atish Patra <atishp(a)rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
arch/arm64/kernel/topology.c | 40 ------------------------------------
drivers/base/arch_topology.c | 19 +++++++++++++++++
2 files changed, 19 insertions(+), 40 deletions(-)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 4358bc319306..f35af19b7055 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -22,46 +22,6 @@
#include <asm/cputype.h>
#include <asm/topology.h>
-void store_cpu_topology(unsigned int cpuid)
-{
- struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
- u64 mpidr;
-
- if (cpuid_topo->package_id != -1)
- goto topology_populated;
-
- mpidr = read_cpuid_mpidr();
-
- /* Uniprocessor systems can rely on default topology values */
- if (mpidr & MPIDR_UP_BITMASK)
- return;
-
- /*
- * This would be the place to create cpu topology based on MPIDR.
- *
- * However, it cannot be trusted to depict the actual topology; some
- * pieces of the architecture enforce an artificial cap on Aff0 values
- * (e.g. GICv3's ICC_SGI1R_EL1 limits it to 15), leading to an
- * artificial cycling of Aff1, Aff2 and Aff3 values. IOW, these end up
- * having absolutely no relationship to the actual underlying system
- * topology, and cannot be reasonably used as core / package ID.
- *
- * If the MT bit is set, Aff0 *could* be used to define a thread ID, but
- * we still wouldn't be able to obtain a sane core ID. This means we
- * need to entirely ignore MPIDR for any topology deduction.
- */
- cpuid_topo->thread_id = -1;
- cpuid_topo->core_id = cpuid;
- cpuid_topo->package_id = cpu_to_node(cpuid);
-
- pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n",
- cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
- cpuid_topo->thread_id, mpidr);
-
-topology_populated:
- update_siblings_masks(cpuid);
-}
-
#ifdef CONFIG_ACPI
static bool __init acpi_cpu_is_threaded(int cpu)
{
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 8272a3a002a3..51647926e605 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -596,4 +596,23 @@ void __init init_cpu_topology(void)
else if (of_have_populated_dt() && parse_dt_topology())
reset_cpu_topology();
}
+
+void store_cpu_topology(unsigned int cpuid)
+{
+ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
+
+ if (cpuid_topo->package_id != -1)
+ goto topology_populated;
+
+ cpuid_topo->thread_id = -1;
+ cpuid_topo->core_id = cpuid;
+ cpuid_topo->package_id = cpu_to_node(cpuid);
+
+ pr_debug("CPU%u: package %d core %d thread %d\n",
+ cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
+ cpuid_topo->thread_id);
+
+topology_populated:
+ update_siblings_masks(cpuid);
+}
#endif
--
2.38.0
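The default assignment that the moved function performs is small enough to model directly. The sketch below is a userspace illustration: struct cpu_topology_model and store_cpu_topology_model() are hypothetical, the NUMA node is passed in instead of calling cpu_to_node(), and the update_siblings_masks() call that the real function makes on both paths is left out.

```c
/* Hypothetical model of the three IDs the function fills in. */
struct cpu_topology_model {
	int thread_id;
	int core_id;
	int package_id;
};

/*
 * Fill in default topology values for one CPU unless they were
 * already populated (package_id != -1).
 */
void store_cpu_topology_model(struct cpu_topology_model *topo,
			      unsigned int cpuid, int node)
{
	if (topo->package_id != -1)
		return; /* already populated, keep existing values */

	topo->thread_id = -1;      /* no SMT information by default */
	topo->core_id = (int)cpuid;
	topo->package_id = node;   /* stands in for cpu_to_node(cpuid) */
}
```

Nothing in this path depends on MPIDR any more, which is why the function is no longer arm64 specific.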
This patch series backports a bunch of patches related to IRQ handling,
specifically freeing the IRQ line while an IRQ is in flight at the CPU
or at the hardware level.
Recently we saw this issue in the serial 8250 driver, where the IRQ was being
freed while the IRQ was in flight or not yet delivered to the CPU. As a
result the irqchip was going into a wedged state and the IRQ was not getting
delivered to the CPU. These patches helped fix the issue in the 4.14
kernel.
Let us know if more patches need backporting.
Lukas Wunner (2):
genirq: Update code comments wrt recycled thread_mask
genirq: Synchronize only with single thread on free_irq()
Thomas Gleixner (4):
genirq: Delay deactivation in free_irq()
genirq: Fix misleading synchronize_irq() documentation
genirq: Add optional hardware synchronization for shutdown
x86/ioapic: Implement irq_get_irqchip_state() callback
arch/x86/kernel/apic/io_apic.c | 46 ++++++++++++++
kernel/irq/autoprobe.c | 6 +-
kernel/irq/chip.c | 6 ++
kernel/irq/cpuhotplug.c | 2 +-
kernel/irq/internals.h | 5 ++
kernel/irq/manage.c | 106 ++++++++++++++++++++++-----------
6 files changed, 133 insertions(+), 38 deletions(-)
--
2.37.1
Before adding this quirk, this (mechanical keyboard) device would not be
recognized, logging:
new full-speed USB device number 56 using xhci_hcd
unable to read config index 0 descriptor/start: -32
chopping to 0 config(s)
It would take dozens of plugging/unplugging cycles for the keyboard to
be recognized. The keyboard seems to simply work after applying this quirk.
This issue had been reported by users in two places already ([1], [2])
but nobody tried upstreaming a patch yet. After testing I believe their
suggested fix (DELAY_INIT + NO_LPM + DEVICE_QUALIFIER) was probably a
little overkill. I assume this particular combination was tested because
it had been previously suggested in [3], but only NO_LPM seems
sufficient for this device.
[1]: https://qiita.com/float168/items/fed43d540c8e2201b543
[2]: https://blog.kostic.dev/posts/making-the-realforce-87ub-work-with-usb30-on-…
[3]: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678477
Cc: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dumazet <ndumazet(a)google.com>
---
drivers/usb/core/quirks.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 0722d2131305..e775d1bbea4d 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -532,6 +532,9 @@ static const struct usb_device_id usb_quirk_list[] = {
/* INTEL VALUE SSD */
{ USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME },
+ /* Realforce 87U Keyboard */
+ { USB_DEVICE(0x0853, 0x011b), .driver_info = USB_QUIRK_NO_LPM },
+
{ } /* terminating entry must be last */
};
--
2.38.0.135.g90850a2211-goog
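For readers unfamiliar with the quirk table, the lookup it drives can be sketched like this. It is a simplified model: the flag values, struct usb_quirk_model and lookup_quirks() are hypothetical, and only the two entries visible in the hunk above are included. The vendor and product IDs are the ones from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define QUIRK_RESET_RESUME_MODEL (1u << 0)
#define QUIRK_NO_LPM_MODEL       (1u << 1)

struct usb_quirk_model {
	uint16_t vid;
	uint16_t pid;
	unsigned int flags;
};

static const struct usb_quirk_model quirk_list[] = {
	{ 0x8086, 0xf1a5, QUIRK_RESET_RESUME_MODEL }, /* INTEL VALUE SSD */
	{ 0x0853, 0x011b, QUIRK_NO_LPM_MODEL },       /* Realforce 87U Keyboard */
	{ 0, 0, 0 },                                  /* terminating entry */
};

/* Returns the quirk flags for a device, or 0 if it has no entry. */
unsigned int lookup_quirks(uint16_t vid, uint16_t pid)
{
	const struct usb_quirk_model *q;

	for (q = quirk_list; q->vid || q->pid; q++)
		if (q->vid == vid && q->pid == pid)
			return q->flags;
	return 0;
}
```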
On Thu, Oct 27, 2022 at 03:53:08AM +0200, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
> mmc: core: Support zeroout using TRIM for eMMC
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mmc-core-support-zeroout-using-trim-for-emmc.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
> commit d3cc702e4a9f80cf1ae41399593d843d4c509e08
> Author: Vincent Whitchurch <vincent.whitchurch(a)axis.com>
> Date: Fri Apr 29 17:21:18 2022 +0200
>
> mmc: core: Support zeroout using TRIM for eMMC
>
> [ Upstream commit f7b6fc327327698924ef3afa0c3e87a5b7466af3 ]
>
> If an eMMC card supports TRIM and indicates that it erases to zeros, we can
> use it to support hardware offloading of REQ_OP_WRITE_ZEROES, so let's add
> support for this.
>
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch(a)axis.com>
> Reviewed-by: Avri Altman <Avri.Altman(a)wdc.com>
> Link: https://lore.kernel.org/r/20220429152118.3617303-1-vincent.whitchurch@axis.…
> Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
> Stable-dep-of: 07d2872bf4c8 ("mmc: core: Add SD card quirk for broken discard")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
This patch is not stable material, please do not backport it.
Commit 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value
seq_file") introduced a NULL dereference in show_smaps_rollup() if the
task has no VMAs.
Fixes: 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value seq_file")
Signed-off-by: Seth Jenkins <sethjenkins(a)google.com>
Reviewed-by: Alexey Dobriyan <adobriyan(a)gmail.com>
Tested-by: Alexey Dobriyan <adobriyan(a)gmail.com>
---
c4c84f06285e on upstream resolves this issue, but a fix must still be applied to stable trees 4.19-5.19.
fs/proc/task_mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4e0023643f8b..1e7bbc0873a4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -969,7 +969,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
vma = vma->vm_next;
}
- show_vma_header_prefix(m, priv->mm->mmap->vm_start,
+ show_vma_header_prefix(m, priv->mm->mmap ? priv->mm->mmap->vm_start : 0,
last_vma_end, 0, 0, 0, 0);
seq_pad(m, ' ');
seq_puts(m, "[rollup]\n");
--
2.38.0.rc1.362.ged0d419d3c-goog
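The one-line fix guards the head of the VMA list before reading from it. A minimal userspace sketch of the same guard, with hypothetical struct vma_model and struct mm_model types standing in for the kernel structures:

```c
#include <stddef.h>

/* Hypothetical stand-ins for vm_area_struct and mm_struct. */
struct vma_model {
	unsigned long vm_start;
	struct vma_model *vm_next;
};

struct mm_model {
	struct vma_model *mmap; /* head of the VMA list, NULL if empty */
};

/* Start address to print in the rollup header; 0 when there are no VMAs. */
unsigned long rollup_start(const struct mm_model *mm)
{
	return mm->mmap ? mm->mmap->vm_start : 0;
}
```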
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Systems based on Alder Lake P see significant boot time delay if
boot firmware tries to control usb ports in unexpected link states.
This is seen with self-powered usb devices that survive in U3 link
suspended state over S5.
A more generic solution to power off ports at shutdown was attempted in
commit 83810f84ecf1 ("xhci: turn off port power in shutdown")
but it caused a regression.
Add host specific XHCI_RESET_TO_DEFAULT quirk which will reset host and
ports back to default state in shutdown.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-pci.c | 4 ++++
drivers/usb/host/xhci.c | 10 ++++++++--
drivers/usb/host/xhci.h | 1 +
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 6dd3102749b7..fbbd547ba12a 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -257,6 +257,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == PCI_DEVICE_ID_INTEL_DNV_XHCI))
xhci->quirks |= XHCI_MISSING_CAS;
+ if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI)
+ xhci->quirks |= XHCI_RESET_TO_DEFAULT;
+
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
(pdev->device == PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_2C_XHCI ||
pdev->device == PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_XHCI ||
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5176765c4013..79d7931c048a 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -810,9 +810,15 @@ void xhci_shutdown(struct usb_hcd *hcd)
spin_lock_irq(&xhci->lock);
xhci_halt(xhci);
- /* Workaround for spurious wakeups at shutdown with HSW */
- if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
+
+ /*
+ * Workaround for spurious wakeups at shutdown with HSW, and for boot
+ * firmware delay in ADL-P PCH if ports are left in U3 at shutdown
+ */
+ if (xhci->quirks & XHCI_SPURIOUS_WAKEUP ||
+ xhci->quirks & XHCI_RESET_TO_DEFAULT)
xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
+
spin_unlock_irq(&xhci->lock);
xhci_cleanup_msix(xhci);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index c0964fe8ac12..cc084d9505cd 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1897,6 +1897,7 @@ struct xhci_hcd {
#define XHCI_BROKEN_D3COLD BIT_ULL(41)
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
+#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
unsigned int num_active_eps;
unsigned int limit_active_eps;
--
2.25.1
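The shutdown decision in this patch reduces to a test on a 64-bit quirk mask. The sketch below models it. BIT_ULL(44) for the new quirk is taken from the hunk above, while the bit used for the spurious-wakeup quirk is an assumption made for illustration, and reset_on_shutdown() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL_MODEL(n) (UINT64_C(1) << (n))

/* Bit 18 is an assumed value; bit 44 is from the patch above. */
#define QUIRK_SPURIOUS_WAKEUP_MODEL  BIT_ULL_MODEL(18)
#define QUIRK_RESET_TO_DEFAULT_MODEL BIT_ULL_MODEL(44)

/* The controller is reset at shutdown if either quirk is set. */
bool reset_on_shutdown(uint64_t quirks)
{
	return (quirks & (QUIRK_SPURIOUS_WAKEUP_MODEL |
			  QUIRK_RESET_TO_DEFAULT_MODEL)) != 0;
}
```

Bit 44 does not fit in a 32-bit integer, so the mask has to be held and tested in a 64-bit type.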
Hi, this is your Linux kernel regression tracker speaking.
I noticed a regression report in bugzilla.kernel.org. As many (most?)
kernel developers don't keep an eye on it, I decided to forward it by
mail. Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=216613 :
> Grzegorz Alibożek 2022-10-21 19:26:43 UTC
>
> After upgrade kernel from 6.0.2 to 6.0.3 on Lenovo T14 Gen2i, sound stopped working.
> dmesg:
>
> paź 21 21:11:45 kernel: snd_hda_codec_hdmi ehdaudio0D2: failed to create hda codec -12
> paź 21 21:11:45 kernel: snd_hda_codec_hdmi ehdaudio0D2: ASoC: error at snd_soc_component_probe on ehdaudio0D2: -12
> paź 21 21:11:45 kernel: skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: failed to instantiate card -12
>
> [reply] [−] Comment 1 Grzegorz Alibożek 2022-10-21 19:56:43 UTC
>
> Created attachment 303070 [details]
> trace
See the ticket for more details.
BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:
#regzbot introduced: v6.0.2..v6.0.3
https://bugzilla.kernel.org/show_bug.cgi?id=216613
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.
From: ChiYuan Huang <cy_huang(a)richtek.com>
Fix the potential risk of the virtual bank index exceeding the maximum.
Refer to the discussion on mt6370:
https://lore.kernel.org/all/20220914013345.GA5802@cyhuang-hp-elitebook-840-…
Without the boundary check, mt6360 has the same issue.
For mt6360 register virtual mapping, the normal range is 0 to 0x3FF.
Below are the backtraces from my experiment accessing mt6360 register 0x400
with the regmap_raw_read() and regmap_raw_write() functions.
1) regmap_raw_read()
Unable to handle kernel execute from non-executable memory at virtual
address ffffffd4940c4c20
pc : platform_bus+0x8/0x2e8
lr : i2c_smbus_xfer+0x60/0x120
Call trace:
platform_bus+0x8/0x2e8
i2c_smbus_read_i2c_block_data+0x74/0xc0
mt6360_regmap_read+0x9c/0x180 [mt6360_core]
_regmap_raw_read+0xe4/0x278
regmap_raw_read+0xec/0x240
2) regmap_raw_write()
Unable to handle kernel execute from non-executable memory at virtual
address ffffffe4a0ac4c20
pc : platform_bus+0x8/0x2e8
lr : i2c_smbus_xfer+0x60/0x120
Call trace:
platform_bus+0x8/0x2e8
i2c_smbus_write_i2c_block_data+0x84/0xd0
mt6360_regmap_write+0xa8/0x150 [mt6360_core]
_regmap_raw_write_impl+0x6e8/0x828
_regmap_raw_write+0xb4/0x130
regmap_raw_write+0x74/0xb0
After adding the boundary check, the above two cases can be solved.
Fixes: 3b0850440a06c (mfd: mt6360: Merge different sub-devices I2C read/write)
Cc: stable(a)vger.kernel.org
Signed-off-by: ChiYuan Huang <cy_huang(a)richtek.com>
---
Since v3:
- Add backtrace log to help understanding what the potential risk is.
Since v2:
- Assign i2c bank variable after bank index is already checked.
---
drivers/mfd/mt6360-core.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/mfd/mt6360-core.c b/drivers/mfd/mt6360-core.c
index 6eaa677..d3b32eb 100644
--- a/drivers/mfd/mt6360-core.c
+++ b/drivers/mfd/mt6360-core.c
@@ -402,7 +402,7 @@ static int mt6360_regmap_read(void *context, const void *reg, size_t reg_size,
struct mt6360_ddata *ddata = context;
u8 bank = *(u8 *)reg;
u8 reg_addr = *(u8 *)(reg + 1);
- struct i2c_client *i2c = ddata->i2c[bank];
+ struct i2c_client *i2c;
bool crc_needed = false;
u8 *buf;
int buf_len = MT6360_ALLOC_READ_SIZE(val_size);
@@ -410,6 +410,11 @@ static int mt6360_regmap_read(void *context, const void *reg, size_t reg_size,
u8 crc;
int ret;
+ if (bank >= MT6360_SLAVE_MAX)
+ return -EINVAL;
+
+ i2c = ddata->i2c[bank];
+
if (bank == MT6360_SLAVE_PMIC || bank == MT6360_SLAVE_LDO) {
crc_needed = true;
ret = mt6360_xlate_pmicldo_addr(&reg_addr, val_size);
@@ -453,13 +458,18 @@ static int mt6360_regmap_write(void *context, const void *val, size_t val_size)
struct mt6360_ddata *ddata = context;
u8 bank = *(u8 *)val;
u8 reg_addr = *(u8 *)(val + 1);
- struct i2c_client *i2c = ddata->i2c[bank];
+ struct i2c_client *i2c;
bool crc_needed = false;
u8 *buf;
int buf_len = MT6360_ALLOC_WRITE_SIZE(val_size);
int write_size = val_size - MT6360_REGMAP_REG_BYTE_SIZE;
int ret;
+ if (bank >= MT6360_SLAVE_MAX)
+ return -EINVAL;
+
+ i2c = ddata->i2c[bank];
+
if (bank == MT6360_SLAVE_PMIC || bank == MT6360_SLAVE_LDO) {
crc_needed = true;
ret = mt6360_xlate_pmicldo_addr(&reg_addr, val_size - MT6360_REGMAP_REG_BYTE_SIZE);
--
2.7.4
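The pattern applied in both hunks is "validate the index, then use it". A small userspace sketch follows. It assumes four banks, which is consistent with the 0 to 0x3FF range stated above when the high byte of the register selects the bank, but SLAVE_MAX_MODEL, struct ddata_model and lookup_client() are hypothetical and do not mirror the driver's types.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SLAVE_MAX_MODEL 4 /* assumed number of banks (0x000 to 0x3FF) */

/* Hypothetical model: one integer handle per bank instead of an i2c client. */
struct ddata_model {
	int clients[SLAVE_MAX_MODEL];
};

/*
 * reg[0] selects the bank and reg[1] is the address within it.
 * The bank is checked before it is used as an array index.
 */
int lookup_client(const struct ddata_model *ddata, const uint8_t *reg, int *out)
{
	uint8_t bank = reg[0];

	if (bank >= SLAVE_MAX_MODEL)
		return -EINVAL;

	*out = ddata->clients[bank];
	return 0;
}
```

Register 0x400 decodes to bank 4, which is one past the end of the array. Before the fix that index was used in the initializer, ahead of any check.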