The patch titled
Subject: cachestat: do not flush stats in recency check
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
cachestat-do-not-flush-stats-in-recency-check.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nhat Pham <nphamcs(a)gmail.com>
Subject: cachestat: do not flush stats in recency check
Date: Thu, 27 Jun 2024 13:17:37 -0700
syzbot detects that cachestat() is flushing stats, which can sleep, in its
RCU read section (see [1]). This is done in the workingset_test_recent()
step (which checks if the folio's eviction is recent).
Move the stat flushing step to before the RCU read section of cachestat,
and skip stat flushing during the recency check.
[1]: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
Link: https://lkml.kernel.org/r/20240627201737.3506959-1-nphamcs@gmail.com
Fixes: b00684722262 ("mm: workingset: move the stats flush into workingset_test_recent()")
Signed-off-by: Nhat Pham <nphamcs(a)gmail.com>
Reported-by: syzbot+b7f13b2d0cc156edf61a(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
Debugged-by: Johannes Weiner <hannes(a)cmpxchg.org>
Suggested-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: <stable(a)vger.kernel.org> [6.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/swap.h | 3 ++-
mm/filemap.c | 5 ++++-
mm/workingset.c | 14 +++++++++++---
3 files changed, 17 insertions(+), 5 deletions(-)
--- a/include/linux/swap.h~cachestat-do-not-flush-stats-in-recency-check
+++ a/include/linux/swap.h
@@ -354,7 +354,8 @@ static inline swp_entry_t page_swap_entr
}
/* linux/mm/workingset.c */
-bool workingset_test_recent(void *shadow, bool file, bool *workingset);
+bool workingset_test_recent(void *shadow, bool file, bool *workingset,
+ bool flush);
void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg);
void workingset_refault(struct folio *folio, void *shadow);
--- a/mm/filemap.c~cachestat-do-not-flush-stats-in-recency-check
+++ a/mm/filemap.c
@@ -4248,6 +4248,9 @@ static void filemap_cachestat(struct add
XA_STATE(xas, &mapping->i_pages, first_index);
struct folio *folio;
+ /* Flush stats (and potentially sleep) outside the RCU read section. */
+ mem_cgroup_flush_stats_ratelimited(NULL);
+
rcu_read_lock();
xas_for_each(&xas, folio, last_index) {
int order;
@@ -4311,7 +4314,7 @@ static void filemap_cachestat(struct add
goto resched;
}
#endif
- if (workingset_test_recent(shadow, true, &workingset))
+ if (workingset_test_recent(shadow, true, &workingset, false))
cs->nr_recently_evicted += nr_pages;
goto resched;
--- a/mm/workingset.c~cachestat-do-not-flush-stats-in-recency-check
+++ a/mm/workingset.c
@@ -412,10 +412,12 @@ void *workingset_eviction(struct folio *
* @file: whether the corresponding folio is from the file lru.
* @workingset: where the workingset value unpacked from shadow should
* be stored.
+ * @flush: whether to flush cgroup rstat.
*
* Return: true if the shadow is for a recently evicted folio; false otherwise.
*/
-bool workingset_test_recent(void *shadow, bool file, bool *workingset)
+bool workingset_test_recent(void *shadow, bool file, bool *workingset,
+ bool flush)
{
struct mem_cgroup *eviction_memcg;
struct lruvec *eviction_lruvec;
@@ -467,10 +469,16 @@ bool workingset_test_recent(void *shadow
/*
* Flush stats (and potentially sleep) outside the RCU read section.
+ *
+ * Note that workingset_test_recent() itself might be called in RCU read
+ * section (for e.g, in cachestat) - these callers need to skip flushing
+ * stats (via the flush argument).
+ *
* XXX: With per-memcg flushing and thresholding, is ratelimiting
* still needed here?
*/
- mem_cgroup_flush_stats_ratelimited(eviction_memcg);
+ if (flush)
+ mem_cgroup_flush_stats_ratelimited(eviction_memcg);
eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
refault = atomic_long_read(&eviction_lruvec->nonresident_age);
@@ -558,7 +566,7 @@ void workingset_refault(struct folio *fo
mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
- if (!workingset_test_recent(shadow, file, &workingset))
+ if (!workingset_test_recent(shadow, file, &workingset, true))
return;
folio_set_active(folio);
_
Patches currently in -mm which might be from nphamcs(a)gmail.com are
cachestat-do-not-flush-stats-in-recency-check.patch
On Wed, 2024-06-26 at 15:07 -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> scsi: mpt3sas: Add ioc_<level> logging macros
>
> to the 4.19-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> scsi-mpt3sas-add-ioc_-level-logging-macros.patch
> and it can be found in the queue-4.19 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
[]
> Stable-dep-of: 4254dfeda82f ("scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory")
Huh? This doesn't make sense as far as I can tell.
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
[]
> @@ -160,6 +160,15 @@ struct mpt3sas_nvme_cmd {
> */
> #define MPT3SAS_FMT "%s: "
>
> +#define ioc_err(ioc, fmt, ...) \
> + pr_err("%s: " fmt, (ioc)->name, ##__VA_ARGS__)
> +#define ioc_notice(ioc, fmt, ...) \
> + pr_notice("%s: " fmt, (ioc)->name, ##__VA_ARGS__)
> +#define ioc_warn(ioc, fmt, ...) \
> + pr_warn("%s: " fmt, (ioc)->name, ##__VA_ARGS__)
> +#define ioc_info(ioc, fmt, ...) \
> + pr_info("%s: " fmt, (ioc)->name, ##__VA_ARGS__)
> +
This only adds ioc_<level> macros and the
nominal stable dep patch below doesn't use them.
$ git log --stat -p -1 4254dfeda82f
commit 4254dfeda82f20844299dca6c38cbffcfd499f41
Author: Breno Leitao <leitao(a)debian.org>
Date: Wed Jun 5 01:55:29 2024 -0700
scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
There is a potential out-of-bounds access when using test_bit() on a single
word. The test_bit() and set_bit() functions operate on long values, and
when testing or setting a single word, they can exceed the word
boundary. KASAN detects this issue and produces a dump:
BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965
For full log, please look at [1].
Make the allocation at least the size of sizeof(unsigned long) so that
set_bit() and test_bit() have sufficient room for read/write operations
without overwriting unallocated memory.
[1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/
Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path")
Cc: stable(a)vger.kernel.org
Suggested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Link: https://lore.kernel.org/r/20240605085530.499432-1-leitao@debian.org
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 258647fc6bddb..1092497563b22 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -8512,6 +8512,12 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
ioc->pd_handles_sz = (ioc->facts.MaxDevHandle / 8);
if (ioc->facts.MaxDevHandle % 8)
ioc->pd_handles_sz++;
+ /*
+ * pd_handles_sz should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory touch may occur.
+ */
+ ioc->pd_handles_sz = ALIGN(ioc->pd_handles_sz, sizeof(unsigned long));
+
ioc->pd_handles = kzalloc(ioc->pd_handles_sz,
GFP_KERNEL);
if (!ioc->pd_handles) {
@@ -8529,6 +8535,13 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
ioc->pend_os_device_add_sz = (ioc->facts.MaxDevHandle / 8);
if (ioc->facts.MaxDevHandle % 8)
ioc->pend_os_device_add_sz++;
+
+ /*
+ * pend_os_device_add_sz should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory may occur.
+ */
+ ioc->pend_os_device_add_sz = ALIGN(ioc->pend_os_device_add_sz,
+ sizeof(unsigned long));
ioc->pend_os_device_add = kzalloc(ioc->pend_os_device_add_sz,
GFP_KERNEL);
if (!ioc->pend_os_device_add) {
@@ -8820,6 +8833,12 @@ _base_check_ioc_facts_changes(struct MPT3SAS_ADAPTER *ioc)
if (ioc->facts.MaxDevHandle % 8)
pd_handles_sz++;
+ /*
+ * pd_handles should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory touch may
+ * occur.
+ */
+ pd_handles_sz = ALIGN(pd_handles_sz, sizeof(unsigned long));
pd_handles = krealloc(ioc->pd_handles, pd_handles_sz,
GFP_KERNEL);
if (!pd_handles) {
From: Murali Nalajala <quic_mnalajal(a)quicinc.com>
Currently get_wq_ctx() is wrongly configured as a
standard call. When two SMC calls are in sleep and one
SMC wakes up, it calls get_wq_ctx() to resume the
corresponding sleeping thread. But if get_wq_ctx() is
interrupted, goes to sleep and another SMC call is
waiting to be allocated a waitq context, it leads to a
deadlock.
To avoid this get_wq_ctx() must be an atomic call and
can't be a standard SMC call. Hence mark get_wq_ctx()
as a fast call.
Fixes: 6bf325992236 ("firmware: qcom: scm: Add wait-queue handling logic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Murali Nalajala <quic_mnalajal(a)quicinc.com>
Signed-off-by: Unnathi Chalicheemala <quic_uchalich(a)quicinc.com>
---
drivers/firmware/qcom/qcom_scm-smc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
index 16cf88acfa8e..0a2a2c794d0e 100644
--- a/drivers/firmware/qcom/qcom_scm-smc.c
+++ b/drivers/firmware/qcom/qcom_scm-smc.c
@@ -71,7 +71,7 @@ int scm_get_wq_ctx(u32 *wq_ctx, u32 *flags, u32 *more_pending)
struct arm_smccc_res get_wq_res;
struct arm_smccc_args get_wq_ctx = {0};
- get_wq_ctx.args[0] = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
+ get_wq_ctx.args[0] = ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,
ARM_SMCCC_SMC_64, ARM_SMCCC_OWNER_SIP,
SCM_SMC_FNID(QCOM_SCM_SVC_WAITQ, QCOM_SCM_WAITQ_GET_WQ_CTX));
--
2.34.1
Currently during driver initialization, mac registration is allowed
only for devices that advertise WMI_TLV_SERVICE_WOW, but QCN9274
doesn't support WoW and hence mac registration is aborted and driver
is de-initialized.
Allow mac registration to proceed without WoW Support for devices
that don't advertise WMI_TLV_SERVICE_WOW.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Cc: stable(a)vger.kernel.org
Fixes: 4a3c212eee0e ("wifi: ath12k: add basic WoW functionalities")
Signed-off-by: Rameshkumar Sundaram <quic_ramess(a)quicinc.com>
---
drivers/net/wireless/ath/ath12k/wow.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/wow.c b/drivers/net/wireless/ath/ath12k/wow.c
index 685e8e98d845..c5cba825a84a 100644
--- a/drivers/net/wireless/ath/ath12k/wow.c
+++ b/drivers/net/wireless/ath/ath12k/wow.c
@@ -1001,8 +1001,8 @@ int ath12k_wow_op_resume(struct ieee80211_hw *hw)
int ath12k_wow_init(struct ath12k *ar)
{
- if (WARN_ON(!test_bit(WMI_TLV_SERVICE_WOW, ar->wmi->wmi_ab->svc_map)))
- return -EINVAL;
+ if (!test_bit(WMI_TLV_SERVICE_WOW, ar->wmi->wmi_ab->svc_map))
+ return 0;
ar->wow.wowlan_support = ath12k_wowlan_support;
base-commit: dadc1101eabb54ace51aa6fc58c902bf43ac0ed7
--
2.25.1
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex(a)gmail.com>
---
arch/riscv/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 05a16b1f0aee..51ebfd23e007 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
regs->epc += 4;
regs->orig_a0 = regs->a0;
+ regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
@@ -328,8 +329,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
if (syscall >= 0 && syscall < NR_syscalls)
syscall_handler(regs, syscall);
- else if (syscall != -1)
- regs->a0 = -ENOSYS;
+
/*
* Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
* so the maximum stack offset is 1k bytes (10 bits).
--
2.45.2
Usb device connect may not be detected after runtime resume if
xHC is reset during resume.
In runtime resume cases xhci_resume() will only resume roothubs if there
are pending port events. If the xHC host is reset during runtime resume
due to a Save/Restore Error (SRE) then these pending port events won't be
detected as PORTSC change bits are not immediately set by host after reset.
Unconditionally resume roothubs if xHC is reset during resume to ensure
device connections are detected.
Also return early with error code if starting xHC fails after reset.
Issue was debugged and a similar solution suggested by Remi Pommarel.
Using this instead as it simplifies future refactoring.
Reported-by: Remi Pommarel <repk(a)triplefau.lt>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218987
Suggested-by: Remi Pommarel <repk(a)triplefau.lt>
Tested-by: Remi Pommarel <repk(a)triplefau.lt>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 37eb37b0affa..0a8cf6c17f82 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1125,10 +1125,20 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
xhci_dbg(xhci, "Start the secondary HCD\n");
retval = xhci_run(xhci->shared_hcd);
}
-
+ if (retval)
+ return retval;
+ /*
+ * Resume roothubs unconditionally as PORTSC change bits are not
+ * immediately visible after xHC reset
+ */
hcd->state = HC_STATE_SUSPENDED;
- if (xhci->shared_hcd)
+
+ if (xhci->shared_hcd) {
xhci->shared_hcd->state = HC_STATE_SUSPENDED;
+ usb_hcd_resume_root_hub(xhci->shared_hcd);
+ }
+ usb_hcd_resume_root_hub(hcd);
+
goto done;
}
@@ -1152,7 +1162,6 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
xhci_dbc_resume(xhci);
- done:
if (retval == 0) {
/*
* Resume roothubs only if there are pending events.
@@ -1178,6 +1187,7 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
usb_hcd_resume_root_hub(hcd);
}
}
+done:
/*
* If system is subject to the Quirk, Compliance Mode Timer needs to
* be re-initialized Always after a system resume. Ports are subject
--
2.25.1