The patch titled
Subject: mm: shrinker: avoid memleak in alloc_shrinker_info
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-shrinker-avoid-memleak-in-alloc_shrinker_info.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Chen Ridong <chenridong(a)huawei.com>
Subject: mm: shrinker: avoid memleak in alloc_shrinker_info
Date: Fri, 25 Oct 2024 06:09:42 +0000
A memleak was found as below:
unreferenced object 0xffff8881010d2a80 (size 32):
comm "mkdir", pid 1559, jiffies 4294932666
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @...............
backtrace (crc 2e7ef6fa):
[<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470
[<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0
[<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0
[<ffffffff81198dd9>] online_css+0x29/0xa0
[<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360
[<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0
[<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90
[<ffffffff813dbb24>] vfs_mkdir+0x144/0x220
[<ffffffff813e1c97>] do_mkdirat+0x87/0x130
[<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70
[<ffffffff81f8c928>] do_syscall_64+0x68/0x140
[<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
alloc_shrinker_info(), when shrinker_unit_alloc() returns an errer, the
info won't be freed. Just fix it.
Link: https://lkml.kernel.org/r/20241025060942.1049263-1-chenridong@huaweicloud.c…
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Chen Ridong <chenridong(a)huawei.com>
Acked-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Acked-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Wang Weiyang <wangweiyang2(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shrinker.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/mm/shrinker.c~mm-shrinker-avoid-memleak-in-alloc_shrinker_info
+++ a/mm/shrinker.c
@@ -76,19 +76,21 @@ void free_shrinker_info(struct mem_cgrou
int alloc_shrinker_info(struct mem_cgroup *memcg)
{
- struct shrinker_info *info;
int nid, ret = 0;
int array_size = 0;
mutex_lock(&shrinker_mutex);
array_size = shrinker_unit_size(shrinker_nr_max);
for_each_node(nid) {
- info = kvzalloc_node(sizeof(*info) + array_size, GFP_KERNEL, nid);
+ struct shrinker_info *info = kvzalloc_node(sizeof(*info) + array_size,
+ GFP_KERNEL, nid);
if (!info)
goto err;
info->map_nr_max = shrinker_nr_max;
- if (shrinker_unit_alloc(info, NULL, nid))
+ if (shrinker_unit_alloc(info, NULL, nid)) {
+ kvfree(info);
goto err;
+ }
rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
}
mutex_unlock(&shrinker_mutex);
_
Patches currently in -mm which might be from chenridong(a)huawei.com are
mm-shrinker-avoid-memleak-in-alloc_shrinker_info.patch
We notice some platforms set "snps,dis_u3_susphy_quirk" and
"snps,dis_u2_susphy_quirk" when they should not need to. Just make sure that
the GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are clear during
initialization. The host initialization involved xhci. So the dwc3 needs to
implement the xhci_plat_priv->plat_start() for xhci to re-enable the suspend
bits.
Since there's a prerequisite patch to drivers/usb/host/xhci-plat.h that's not a
fix patch, this series should go on Greg's usb-testing branch instead of
usb-linus.
Thinh Nguyen (2):
usb: xhci-plat: Don't include xhci.h
usb: dwc3: core: Prevent phy suspend during init
drivers/usb/dwc3/core.c | 90 +++++++++++++++---------------------
drivers/usb/dwc3/core.h | 1 +
drivers/usb/dwc3/gadget.c | 2 +
drivers/usb/dwc3/host.c | 27 +++++++++++
drivers/usb/host/xhci-plat.h | 4 +-
5 files changed, 71 insertions(+), 53 deletions(-)
base-commit: 3d122e6d27e417a9fa91181922743df26b2cd679
--
2.28.0
The patch titled
Subject: vmscan,migrate: fix double-decrement on node stats when demoting pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Gregory Price <gourry(a)gourry.net>
Subject: vmscan,migrate: fix double-decrement on node stats when demoting pages
Date: Fri, 25 Oct 2024 10:17:24 -0400
When numa balancing is enabled with demotion, vmscan will call
migrate_pages when shrinking LRUs. Successful demotions will cause node
vmstat numbers to double-decrement, leading to an imbalanced page count.
The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212
[77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count
shrink_folio_list
demote_folio_list
migrate_pages
migrate_pages_batch
migrate_folio_move
migrate_folio_done
mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically
callers to migrate_pages are required to handle putback/accounting for
failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when
the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is
the only source of MR_DEMOTION.
Link: https://lkml.kernel.org/r/20241025141724.17927-1-gourry@gourry.net
Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim")
Signed-off-by: Gregory Price <gourry(a)gourry.net>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/migrate.c~vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages
+++ a/mm/migrate.c
@@ -1178,7 +1178,7 @@ static void migrate_folio_done(struct fo
* not accounted to NR_ISOLATED_*. They can be recognized
* as __folio_test_movable
*/
- if (likely(!__folio_test_movable(src)))
+ if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION)
mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
folio_is_file_lru(src), -folio_nr_pages(src));
_
Patches currently in -mm which might be from gourry(a)gourry.net are
vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages.patch
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending synchronous external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
arch/arm64/kvm/mmio.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
return data;
}
+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
/**
* kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
* or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
--
2.47.0.163.g1226f6d8fa-goog
The patch titled
Subject: sched/numa: fix the potential null pointer dereference in task_numa_work()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Shawn Wang <shawnwang(a)linux.alibaba.com>
Subject: sched/numa: fix the potential null pointer dereference in task_numa_work()
Date: Fri, 25 Oct 2024 10:22:08 +0800
When running stress-ng-vm-segv test, we found a null pointer dereference
error in task_numa_work(). Here is the backtrace:
[323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
......
[323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
......
[323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
[323676.067115] pc : vma_migratable+0x1c/0xd0
[323676.067122] lr : task_numa_work+0x1ec/0x4e0
[323676.067127] sp : ffff8000ada73d20
[323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
[323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
[323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
[323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
[323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
[323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
[323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
[323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
[323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
[323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
[323676.067152] Call trace:
[323676.067153] vma_migratable+0x1c/0xd0
[323676.067155] task_numa_work+0x1ec/0x4e0
[323676.067157] task_work_run+0x78/0xd8
[323676.067161] do_notify_resume+0x1ec/0x290
[323676.067163] el0_svc+0x150/0x160
[323676.067167] el0t_64_sync_handler+0xf8/0x128
[323676.067170] el0t_64_sync+0x17c/0x180
[323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
[323676.067177] SMP: stopping secondary CPUs
[323676.070184] Starting crashdump kernel...
stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
handling function of the system, which tries to cause a SIGSEGV error on
return from unmapping the whole address space of the child process.
Normally this program will not cause kernel crashes. But before the
munmap system call returns to user mode, a potential task_numa_work() for
numa balancing could be added and executed. In this scenario, since the
child process has no vma after munmap, the vma_next() in task_numa_work()
will return a null pointer even if the vma iterator restarts from 0.
Recheck the vma pointer before dereferencing it in task_numa_work().
Link: https://lkml.kernel.org/r/20241025022208.125527-1-shawnwang@linux.alibaba.c…
Fixes: 214dbc428137 ("sched: convert to vma iterator")
Signed-off-by: Shawn Wang <shawnwang(a)linux.alibaba.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ben Segall <bsegall(a)google.com>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: <stable(a)vger.kernel.org> [6.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sched/fair.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/sched/fair.c~sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work
+++ a/kernel/sched/fair.c
@@ -3369,7 +3369,7 @@ retry_pids:
vma = vma_next(&vmi);
}
- do {
+ for (; vma; vma = vma_next(&vmi)) {
if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP)) {
trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_UNSUITABLE);
@@ -3491,7 +3491,7 @@ retry_pids:
*/
if (vma_pids_forced)
break;
- } for_each_vma(vmi, vma);
+ }
/*
* If no VMAs are remaining and VMAs were skipped due to the PID
_
Patches currently in -mm which might be from shawnwang(a)linux.alibaba.com are
sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work.patch
Commit 723e8462a4fe ("pinctrl: qcom: spmi-gpio: Fix the GPIO strength
mapping") fixed a long-standing issue in the Qualcomm SPMI PMIC gpio
driver which had the 'low' and 'high' drive strength settings switched
but failed to update the debugfs interface which still gets this wrong.
Fix the debugfs code so that the exported values match the hardware
settings.
Note that this probably means that most devicetrees that try to describe
the firmware settings got this wrong if the settings were derived from
debugfs. Before the above mentioned commit the settings would have
actually matched the firmware settings even if they were described
incorrectly, but now they are inverted.
Fixes: 723e8462a4fe ("pinctrl: qcom: spmi-gpio: Fix the GPIO strength mapping")
Fixes: eadff3024472 ("pinctrl: Qualcomm SPMI PMIC GPIO pin controller driver")
Cc: Anjelique Melendez <quic_amelende(a)quicinc.com>
Cc: stable(a)vger.kernel.org # 3.19
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
index 3d03293f6320..3a12304e2b7d 100644
--- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
+++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
@@ -667,7 +667,7 @@ static void pmic_gpio_config_dbg_show(struct pinctrl_dev *pctldev,
"push-pull", "open-drain", "open-source"
};
static const char *const strengths[] = {
- "no", "high", "medium", "low"
+ "no", "low", "medium", "high"
};
pad = pctldev->desc->pins[pin].drv_data;
--
2.45.2
The quilt patch titled
Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
has been removed from the -mm tree. Its filename was
mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Dongjoo Seo <dongjoo.linux.dev(a)gmail.com>
Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
Date: Wed, 23 Oct 2024 10:50:37 -0700
In the case of memoryless node, when a process prefers a node with no
memory(e.g., because it is running on a CPU local to that node), the
kernel treats a nearby node with memory as the preferred node. As a
result, such allocations do not increment the numa_foreign counter on the
memoryless node, leading to skewed NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN
stats for the nearest node.
This patch corrects this issue by:
1. Checking if the zone or preferred zone is CPU-less before updating
the NUMA stats.
2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
CPU-less.
Example Before and After Patch:
- Before Patch:
node0 node1 node2
numa_hit 86333181 114338269 5108
numa_miss 5199455 0 56844591
numa_foreign 32281033 29763013 0
interleave_hit 91 91 0
local_node 86326417 114288458 0
other_node 5206219 49768 56849702
- After Patch:
node0 node1 node2
numa_hit 2523058 9225528 0
numa_miss 150213 10226 21495942
numa_foreign 17144215 4501270 0
interleave_hit 91 94 0
local_node 2493918 9208226 0
other_node 179351 27528 21495942
Similarly, in the context of cpuless nodes, this patch ensures that NUMA
statistics are accurately updated by adding checks to prevent the
miscounting of memory allocations when the involved nodes have no CPUs.
This ensures more precise tracking of memory access patterns across all
nodes, regardless of whether they have CPUs or not, improving the overall
reliability of NUMA stat. The reason is that page allocation from
dev_dax, cpuset, memcg .. comes with preferred allocating zone in cpuless
node and it's hard to track the zone info for miss information.
Link: https://lkml.kernel.org/r/20241023175037.9125-1-dongjoo.linux.dev@gmail.com
Signed-off-by: Dongjoo Seo <dongjoo.linux.dev(a)gmail.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Fan Ni <nifan(a)outlook.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Adam Manzanares <a.manzanares(a)samsung.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes
+++ a/mm/page_alloc.c
@@ -2858,19 +2858,21 @@ static inline void zone_statistics(struc
{
#ifdef CONFIG_NUMA
enum numa_stat_item local_stat = NUMA_LOCAL;
+ bool z_is_cpuless = !node_state(zone_to_nid(z), N_CPU);
+ bool pref_is_cpuless = !node_state(zone_to_nid(preferred_zone), N_CPU);
- /* skip numa counters update if numa stats is disabled */
if (!static_branch_likely(&vm_numa_stat_key))
return;
- if (zone_to_nid(z) != numa_node_id())
+ if (zone_to_nid(z) != numa_node_id() || z_is_cpuless)
local_stat = NUMA_OTHER;
- if (zone_to_nid(z) == zone_to_nid(preferred_zone))
+ if (zone_to_nid(z) == zone_to_nid(preferred_zone) && !z_is_cpuless)
__count_numa_events(z, NUMA_HIT, nr_account);
else {
__count_numa_events(z, NUMA_MISS, nr_account);
- __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
+ if (!pref_is_cpuless)
+ __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
}
__count_numa_events(z, local_stat, nr_account);
#endif
_
Patches currently in -mm which might be from dongjoo.linux.dev(a)gmail.com are
Changes since v1 [1]:
- Fix some misspellings missed by checkpatch in changelogs (Jonathan)
- Add comments explaining the order of objects in drivers/cxl/Makefile
(Jonathan)
- Rename attach_device => cxl_rescan_attach (Jonathan)
- Fixup Zijun's email (Zijun)
[1]: http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwill…
---
Original cover:
Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to
delayed arrival of the CXL "root" infrastructure [1] prompted questions
of how the existing mechanism for retrying cxl_mem_probe() could be
failing.
The critical missing piece in the debug was that Gregory's setup had
almost all CXL modules built-in to the kernel.
On the way to that discovery several other bugs and init-order corner
cases were discovered.
The main fix is to make sure the drivers/cxl/Makefile object order
supports root CXL ports being fully initialized upon cxl_acpi_probe()
exit. The modular case has some similar potential holes that are fixed
with MODULE_SOFTDEP() and other fix ups. Finally, an attempt to update
cxl_test to reproduce the original report resulted in the discovery of a
separate long standing use after free bug in cxl_region_detach().
[2]: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
---
Dan Williams (6):
cxl/port: Fix CXL port initialization order when the subsystem is built-in
cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
cxl/acpi: Ensure ports ready at cxl_acpi_probe() return
cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
cxl/port: Prevent out-of-order decoder allocation
cxl/test: Improve init-order fidelity relative to real-world systems
drivers/base/core.c | 35 +++++++
drivers/cxl/Kconfig | 1
drivers/cxl/Makefile | 20 +++-
drivers/cxl/acpi.c | 7 +
drivers/cxl/core/hdm.c | 50 +++++++++--
drivers/cxl/core/port.c | 13 ++-
drivers/cxl/core/region.c | 91 ++++++++++---------
drivers/cxl/cxl.h | 3 -
include/linux/device.h | 3 +
tools/testing/cxl/test/cxl.c | 200 +++++++++++++++++++++++-------------------
tools/testing/cxl/test/mem.c | 1
11 files changed, 269 insertions(+), 155 deletions(-)
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
From: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
KASAN reports that the GPU metrics table allocated in
vangogh_tables_init() is not large enough for the memset done in
smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
[ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
[ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
...
[ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
[ 33.861816] Tainted: [W]=WARN
[ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
[ 33.861822] Call Trace:
[ 33.861826] <TASK>
[ 33.861829] dump_stack_lvl+0x66/0x90
[ 33.861838] print_report+0xce/0x620
[ 33.861853] kasan_report+0xda/0x110
[ 33.862794] kasan_check_range+0xfd/0x1a0
[ 33.862799] __asan_memset+0x23/0x40
[ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.867135] dev_attr_show+0x43/0xc0
[ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0
[ 33.867155] seq_read_iter+0x3f8/0x1140
[ 33.867173] vfs_read+0x76c/0xc50
[ 33.867198] ksys_read+0xfb/0x1d0
[ 33.867214] do_syscall_64+0x90/0x160
...
[ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
[ 33.867358] kasan_save_stack+0x33/0x50
[ 33.867364] kasan_save_track+0x17/0x60
[ 33.867367] __kasan_kmalloc+0x87/0x90
[ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
[ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu]
[ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu]
[ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
[ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
[ 33.869608] local_pci_probe+0xda/0x180
[ 33.869614] pci_device_probe+0x43f/0x6b0
Empirically we can confirm that the former allocates 152 bytes for the
table, while the latter memsets the 168 large block.
This is somewhat alleviated by the fact that allocation goes into a 192
SLAB bucket, but then for v3_0 metrics the table grows to 264 bytes which
would definitely be a problem.
Root cause appears that when GPU metrics tables for v2_4 parts were added
it was not considered to enlarge the table to fit.
The fix in this patch is rather "brute force" and perhaps later should be
done in a smarter way, by extracting and consolidating the part version to
size logic to a common helper, instead of brute forcing the largest
possible allocation. Nevertheless, for now this works and fixes the out of
bounds write.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Evan Quan <evan.quan(a)amd.com>
Cc: Wenyou Yang <WenYou.Yang(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.6+
---
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 22737b11b1bf..36f4a4651918 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -242,7 +242,10 @@ static int vangogh_tables_init(struct smu_context *smu)
goto err0_out;
smu_table->metrics_time = 0;
- smu_table->gpu_metrics_table_size = max(sizeof(struct gpu_metrics_v2_3), sizeof(struct gpu_metrics_v2_2));
+ smu_table->gpu_metrics_table_size = sizeof(struct gpu_metrics_v2_2);
+ smu_table->gpu_metrics_table_size = max(smu_table->gpu_metrics_table_size, sizeof(struct gpu_metrics_v2_3));
+ smu_table->gpu_metrics_table_size = max(smu_table->gpu_metrics_table_size, sizeof(struct gpu_metrics_v2_4));
+ smu_table->gpu_metrics_table_size = max(smu_table->gpu_metrics_table_size, sizeof(struct gpu_metrics_v3_0));
smu_table->gpu_metrics_table = kzalloc(smu_table->gpu_metrics_table_size, GFP_KERNEL);
if (!smu_table->gpu_metrics_table)
goto err1_out;
--
2.46.0
The previous implementation limited the tracing capabilities when perf
was run in the init PID namespace, making it impossible to trace
applications in non-init PID namespaces.
This update improves the tracing process by verifying the event owner.
This allows us to determine whether the user has the necessary
permissions to trace the application.
Cc: stable(a)vger.kernel.org
Fixes: aab473867fed ("coresight: etm4x: Don't trace PID for non-root PID namespace")
Signed-off-by: Julien Meunier <julien.meunier(a)nokia.com>
---
drivers/hwtracing/coresight/coresight-etm4x-core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index bf01f01964cf..8365307b1aec 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -695,7 +695,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
/* Only trace contextID when runs in root PID namespace */
if ((attr->config & BIT(ETM_OPT_CTXTID)) &&
- task_is_in_init_pid_ns(current))
+ task_is_in_init_pid_ns(event->owner))
/* bit[6], Context ID tracing bit */
config->cfg |= TRCCONFIGR_CID;
@@ -710,7 +710,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
goto out;
}
/* Only trace virtual contextID when runs in root PID namespace */
- if (task_is_in_init_pid_ns(current))
+ if (task_is_in_init_pid_ns(event->owner))
config->cfg |= TRCCONFIGR_VMID | TRCCONFIGR_VMIDOPT;
}
--
2.34.1
On Fri, Oct 25, 2024 at 06:06:47PM +0200, Nirmoy Das wrote:
> On 10/24/2024 7:22 PM, Matthew Brost wrote:
>
> On Thu, Oct 24, 2024 at 10:14:21AM -0700, John Harrison wrote:
>
> On 10/24/2024 08:18, Nirmoy Das wrote:
>
> Flush xe ordered_wq in case of ufence timeout which is observed
> on LNL and that points to the recent scheduling issue with E-cores.
>
> This is similar to the recent fix:
> commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
> response timeout") and should be removed once there is E core
> scheduling fix.
>
> v2: Add platform check(Himal)
> s/__flush_workqueue/flush_workqueue(Jani)
>
> Cc: Badal Nilawar [1]<badal.nilawar(a)intel.com>
> Cc: Jani Nikula [2]<jani.nikula(a)intel.com>
> Cc: Matthew Auld [3]<matthew.auld(a)intel.com>
> Cc: John Harrison [4]<John.C.Harrison(a)Intel.com>
> Cc: Himal Prasad Ghimiray [5]<himal.prasad.ghimiray(a)intel.com>
> Cc: Lucas De Marchi [6]<lucas.demarchi(a)intel.com>
> Cc: [7]<stable(a)vger.kernel.org> # v6.11+
> Link: [8]https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2754
> Suggested-by: Matthew Brost [9]<matthew.brost(a)intel.com>
> Signed-off-by: Nirmoy Das [10]<nirmoy.das(a)intel.com>
> Reviewed-by: Matthew Brost [11]<matthew.brost(a)intel.com>
> ---
> drivers/gpu/drm/xe/xe_wait_user_fence.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wai
> t_user_fence.c
> index f5deb81eba01..78a0ad3c78fe 100644
> --- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
> +++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
> @@ -13,6 +13,7 @@
> #include "xe_device.h"
> #include "xe_gt.h"
> #include "xe_macros.h"
> +#include "compat-i915-headers/i915_drv.h"
> #include "xe_exec_queue.h"
> static int do_compare(u64 addr, u64 value, u64 mask, u16 op)
> @@ -155,6 +156,19 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *
> data,
> }
> if (!timeout) {
> + if (IS_LUNARLAKE(xe)) {
> + /*
> + * This is analogous to e51527233804 ("drm/xe/gu
> c/ct: Flush g2h
> + * worker in case of g2h response timeout")
> + *
> + * TODO: Drop this change once workqueue schedul
> ing delay issue is
> + * fixed on LNL Hybrid CPU.
> + */
> + flush_workqueue(xe->ordered_wq);
>
> If we are having multiple instances of this workaround, can we wrap them up
> in as 'LNL_FLUSH_WORKQUEUE(q)' or some such? Put the IS_LNL check inside the
> macro and make it pretty obvious exactly where all the instances are by
> having a single macro name to search for.
>
>
> +1, I think Lucas is suggesting something similar to this on the chat to
> make sure we don't lose track of removing these W/A when this gets
> fixed.
>
> Matt
>
> Sounds good. I will add LNL_FLUSH_WORKQUEUE() and use that for all the
> places we need this WA.
>
You will need 2 macros...
- LNL_FLUSH_WORKQUEUE() which accepts xe_device, workqueue_struct
- LNL_FLUSH_WORK() which accepts xe_device, work_struct
Matt
> Regards,
>
> Nirmoy
>
>
>
> John.
>
>
> + err = do_compare(addr, args->value, args->mask,
> args->op);
> + if (err <= 0)
> + break;
> + }
> err = -ETIME;
> break;
> }
>
> References
>
> 1. mailto:badal.nilawar@intel.com
> 2. mailto:jani.nikula@intel.com
> 3. mailto:matthew.auld@intel.com
> 4. mailto:John.C.Harrison@Intel.com
> 5. mailto:himal.prasad.ghimiray@intel.com
> 6. mailto:lucas.demarchi@intel.com
> 7. mailto:stable@vger.kernel.org
> 8. https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2754
> 9. mailto:matthew.brost@intel.com
> 10. mailto:nirmoy.das@intel.com
> 11. mailto:matthew.brost@intel.com
From: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
KASAN reports that the GPU metrics table allocated in
vangogh_tables_init() is not large enough for the memset done in
smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
[ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
[ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
...
[ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
[ 33.861816] Tainted: [W]=WARN
[ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
[ 33.861822] Call Trace:
[ 33.861826] <TASK>
[ 33.861829] dump_stack_lvl+0x66/0x90
[ 33.861838] print_report+0xce/0x620
[ 33.861853] kasan_report+0xda/0x110
[ 33.862794] kasan_check_range+0xfd/0x1a0
[ 33.862799] __asan_memset+0x23/0x40
[ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.867135] dev_attr_show+0x43/0xc0
[ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0
[ 33.867155] seq_read_iter+0x3f8/0x1140
[ 33.867173] vfs_read+0x76c/0xc50
[ 33.867198] ksys_read+0xfb/0x1d0
[ 33.867214] do_syscall_64+0x90/0x160
...
[ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
[ 33.867358] kasan_save_stack+0x33/0x50
[ 33.867364] kasan_save_track+0x17/0x60
[ 33.867367] __kasan_kmalloc+0x87/0x90
[ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
[ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu]
[ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu]
[ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
[ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
[ 33.869608] local_pci_probe+0xda/0x180
[ 33.869614] pci_device_probe+0x43f/0x6b0
Empirically we can confirm that the former allocates 152 bytes for the
table, while the latter memsets the 168 large block.
Root cause appears that when GPU metrics tables for v2_4 parts were added
it was not considered to enlarge the table to fit.
The fix in this patch is rather "brute force" and perhaps later should be
done in a smarter way, by extracting and consolidating the part version to
size logic to a common helper, instead of brute forcing the largest
possible allocation. Nevertheless, for now this works and fixes the out of
bounds write.
v2:
* Drop impossible v3_0 case. (Mario)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Evan Quan <evan.quan(a)amd.com>
Cc: Wenyou Yang <WenYou.Yang(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.6+
---
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 22737b11b1bf..1fe020f1f4db 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -242,7 +242,9 @@ static int vangogh_tables_init(struct smu_context *smu)
goto err0_out;
smu_table->metrics_time = 0;
- smu_table->gpu_metrics_table_size = max(sizeof(struct gpu_metrics_v2_3), sizeof(struct gpu_metrics_v2_2));
+ smu_table->gpu_metrics_table_size = sizeof(struct gpu_metrics_v2_2);
+ smu_table->gpu_metrics_table_size = max(smu_table->gpu_metrics_table_size, sizeof(struct gpu_metrics_v2_3));
+ smu_table->gpu_metrics_table_size = max(smu_table->gpu_metrics_table_size, sizeof(struct gpu_metrics_v2_4));
smu_table->gpu_metrics_table = kzalloc(smu_table->gpu_metrics_table_size, GFP_KERNEL);
if (!smu_table->gpu_metrics_table)
goto err1_out;
--
2.46.0
In an event where the platform running the device control plane
is rebooted, reset is detected on the driver. It releases
all the resources and waits for the reset to complete. Once the
reset is done, it tries to build the resources back. At this
time if the device control plane is not yet started, then
the driver timeouts on the virtchnl message and retries to
establish the mailbox again.
In the retry flow, mailbox is deinitialized but the mailbox
workqueue is still alive and polling for the mailbox message.
This results in accessing the released control queue leading to
null-ptr-deref. Fix it by unrolling the work queue cancellation
and mailbox deinitialization in the order which they got
initialized.
Also remove the redundant scheduling of the mailbox task in
idpf_vc_core_init.
Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request")
Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager")
Cc: stable(a)vger.kernel.org # 6.9+
Reviewed-by: Tarun K Singh <tarun.k.singh(a)intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga(a)intel.com>
---
drivers/net/ethernet/intel/idpf/idpf_lib.c | 1 +
drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 7 -------
2 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index c3848e10e7db..b4fbb99bfad2 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1786,6 +1786,7 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter)
*/
err = idpf_vc_core_init(adapter);
if (err) {
+ cancel_delayed_work_sync(&adapter->mbx_task);
idpf_deinit_dflt_mbx(adapter);
goto unlock_mutex;
}
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 3be883726b87..d77d6c3805e2 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -3017,11 +3017,6 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
goto err_netdev_alloc;
}
- /* Start the mailbox task before requesting vectors. This will ensure
- * vector information response from mailbox is handled
- */
- queue_delayed_work(adapter->mbx_wq, &adapter->mbx_task, 0);
-
queue_delayed_work(adapter->serv_wq, &adapter->serv_task,
msecs_to_jiffies(5 * (adapter->pdev->devfn & 0x07)));
@@ -3046,7 +3041,6 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
err_intr_req:
cancel_delayed_work_sync(&adapter->serv_task);
- cancel_delayed_work_sync(&adapter->mbx_task);
idpf_vport_params_buf_rel(adapter);
err_netdev_alloc:
kfree(adapter->vports);
@@ -3070,7 +3064,6 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
adapter->state = __IDPF_VER_CHECK;
if (adapter->vcxn_mngr)
idpf_vc_xn_shutdown(adapter->vcxn_mngr);
- idpf_deinit_dflt_mbx(adapter);
set_bit(IDPF_HR_DRV_LOAD, adapter->flags);
queue_delayed_work(adapter->vc_event_wq, &adapter->vc_event_task,
msecs_to_jiffies(task_delay));
--
2.43.0
PD3.1 spec ("8.3.3.3.3 PE_SNK_Wait_for_Capabilities State") mandates
that the policy engine perform a hard reset when SinkWaitCapTimer
expires. Instead the code explicitly does a GET_SOURCE_CAP when the
timer expires as part of SNK_WAIT_CAPABILITIES_TIMEOUT. Due to this the
following compliance test failures are reported by the compliance tester
(added excerpts from the PD Test Spec):
* COMMON.PROC.PD.2#1:
The Tester receives a Get_Source_Cap Message from the UUT. This
message is valid except the following conditions: [COMMON.PROC.PD.2#1]
a. The check fails if the UUT sends this message before the Tester
has established an Explicit Contract
...
* TEST.PD.PROT.SNK.4:
...
4. The check fails if the UUT does not send a Hard Reset between
tTypeCSinkWaitCap min and max. [TEST.PD.PROT.SNK.4#1] The delay is
between the VBUS present vSafe5V min and the time of the first bit
of Preamble of the Hard Reset sent by the UUT.
For the purpose of interoperability, restrict the quirk introduced in
https://lore.kernel.org/all/20240523171806.223727-1-sebastian.reichel@colla…
to only non self-powered devices as battery powered devices will not
have the issue mentioned in that commit.
Cc: stable(a)vger.kernel.org
Fixes: 122968f8dda8 ("usb: typec: tcpm: avoid resets for missing source capability messages")
Reported-by: Badhri Jagan Sridharan <badhri(a)google.com>
Closes: https://lore.kernel.org/all/CAPTae5LAwsVugb0dxuKLHFqncjeZeJ785nkY4Jfd+M-tCj…
Signed-off-by: Amit Sunil Dhamne <amitsd(a)google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri(a)google.com>
---
drivers/usb/typec/tcpm/tcpm.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index d6f2412381cf..c8f467d24fbb 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -4515,7 +4515,8 @@ static inline enum tcpm_state hard_reset_state(struct tcpm_port *port)
return ERROR_RECOVERY;
if (port->pwr_role == TYPEC_SOURCE)
return SRC_UNATTACHED;
- if (port->state == SNK_WAIT_CAPABILITIES_TIMEOUT)
+ if (port->state == SNK_WAIT_CAPABILITIES ||
+ port->state == SNK_WAIT_CAPABILITIES_TIMEOUT)
return SNK_READY;
return SNK_UNATTACHED;
}
@@ -5043,8 +5044,11 @@ static void run_state_machine(struct tcpm_port *port)
tcpm_set_state(port, SNK_SOFT_RESET,
PD_T_SINK_WAIT_CAP);
} else {
- tcpm_set_state(port, SNK_WAIT_CAPABILITIES_TIMEOUT,
- PD_T_SINK_WAIT_CAP);
+ if (!port->self_powered)
+ upcoming_state = SNK_WAIT_CAPABILITIES_TIMEOUT;
+ else
+ upcoming_state = hard_reset_state(port);
+ tcpm_set_state(port, upcoming_state, PD_T_SINK_WAIT_CAP);
}
break;
case SNK_WAIT_CAPABILITIES_TIMEOUT:
base-commit: c6d9e43954bfa7415a1e9efdb2806ec1d8a8afc8
--
2.47.0.105.g07ac214952-goog
ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup()
for the same controller. So it's better to nullify it after release on
error path in order to avoid double free later in nvmet_destroy_auth().
Found by Linux Verification Center (linuxtesting.org) with Svace.
Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vitaliy Shevtsov <v.shevtsov(a)maxima.ru>
---
drivers/nvme/target/auth.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/target/auth.c b/drivers/nvme/target/auth.c
index e900525b7866..7bca64de4a2f 100644
--- a/drivers/nvme/target/auth.c
+++ b/drivers/nvme/target/auth.c
@@ -101,6 +101,7 @@ int nvmet_setup_dhgroup(struct nvmet_ctrl *ctrl, u8 dhgroup_id)
pr_debug("%s: ctrl %d failed to generate private key, err %d\n",
__func__, ctrl->cntlid, ret);
kfree_sensitive(ctrl->dh_key);
+ ctrl->dh_key = NULL;
return ret;
}
ctrl->dh_keysize = crypto_kpp_maxsize(ctrl->dh_tfm);
--
2.46.1
As there is very little ordering in the KVM API, userspace can
instanciate a half-baked GIC (missing its memory map, for example)
at almost any time.
This means that, with the right timing, a thread running vcpu-0
can enter the kernel without a GIC configured and get a GIC created
behind its back by another thread. Amusingly, it will pick up
that GIC and start messing with the data structures without the
GIC having been fully initialised.
Similarly, a thread running vcpu-1 can enter the kernel, and try
to init the GIC that was previously created. Since this GIC isn't
properly configured (no memory map), it fails to correctly initialise.
And that's the point where we decide to teardown the GIC, freeing all
its resources. Behind vcpu-0's back. Things stop pretty abruptly,
with a variety of symptoms. Clearly, this isn't good, we should be
a bit more careful about this.
It is obvious that this guest is not viable, as it is missing some
important part of its configuration. So instead of trying to tear
bits of it down, let's just mark it as *dead*. It means that any
further interaction from userspace will result in -EIO. The memory
will be released on the "normal" path, when userspace gives up.
Cc: stable(a)vger.kernel.org
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
---
arch/arm64/kvm/arm.c | 3 +++
arch/arm64/kvm/vgic/vgic-init.c | 6 +++---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a0d01c46e4084..b97ada19f06a7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -997,6 +997,9 @@ static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu)
static int check_vcpu_requests(struct kvm_vcpu *vcpu)
{
if (kvm_request_pending(vcpu)) {
+ if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu))
+ return -EIO;
+
if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
kvm_vcpu_sleep(vcpu);
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index e7c53e8af3d16..c4cbf798e71a4 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -536,10 +536,10 @@ int kvm_vgic_map_resources(struct kvm *kvm)
out:
mutex_unlock(&kvm->arch.config_lock);
out_slots:
- mutex_unlock(&kvm->slots_lock);
-
if (ret)
- kvm_vgic_destroy(kvm);
+ kvm_vm_dead(kvm);
+
+ mutex_unlock(&kvm->slots_lock);
return ret;
}
--
2.39.2
İşlemlerinize ilişkin e-dekontlarınız ekte gönderilmiştir. İyi
günler dileriz. Bu mail otomatik olarak bilgilendirme amacıyla
gönderilmiştir. Lütfen cevaplamayınız.
Türkiye Cumhuriyeti Ziraat Bankası Anonim Şirketi, Finanskent
Mahallesi, Finans Caddesi No:44A Ümraniye/İstanbul
Ticaret Sicil No:475225-5 Mersis No: 0998006967505633 |
www.ziraatbank.com.tr
From: "Jiri Slaby (SUSE)" <jirislaby(a)kernel.org>
[ Upstream commit 602babaa84d627923713acaf5f7e9a4369e77473 ]
Commit af224ca2df29 (serial: core: Prevent unsafe uart port access, part
3) added few uport == NULL checks. It added one to uart_shutdown(), so
the commit assumes, uport can be NULL in there. But right after that
protection, there is an unprotected "uart_port_dtr_rts(uport, false);"
call. That is invoked only if HUPCL is set, so I assume that is the
reason why we do not see lots of these reports.
Or it cannot be NULL at this point at all for some reason :P.
Until the above is investigated, stay on the safe side and move this
dereference to the if too.
I got this inconsistency from Coverity under CID 1585130. Thanks.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby(a)kernel.org>
Cc: Peter Hurley <peter(a)hurleysoftware.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://lore.kernel.org/r/20240805102046.307511-3-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[Adapted over commit 5701cb8bf50e ("tty: Call ->dtr_rts() parameter
active consistently") not in the tree]
Signed-off-by: Tomas Krcka <krckatom(a)amazon.de>
---
drivers/tty/serial/serial_core.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 58e857fb8dee..fd9f635dc247 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -339,14 +339,16 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
/*
* Turn off DTR and RTS early.
*/
- if (uport && uart_console(uport) && tty) {
- uport->cons->cflag = tty->termios.c_cflag;
- uport->cons->ispeed = tty->termios.c_ispeed;
- uport->cons->ospeed = tty->termios.c_ospeed;
- }
+ if (uport) {
+ if (uart_console(uport) && tty) {
+ uport->cons->cflag = tty->termios.c_cflag;
+ uport->cons->ispeed = tty->termios.c_ispeed;
+ uport->cons->ospeed = tty->termios.c_ospeed;
+ }
- if (!tty || C_HUPCL(tty))
- uart_port_dtr_rts(uport, 0);
+ if (!tty || C_HUPCL(tty))
+ uart_port_dtr_rts(uport, 0);
+ }
uart_port_shutdown(port);
}
--
2.40.1
From: Si-Wei Liu <si-wei.liu(a)oracle.com>
When calculating the physical address range based on the iotlb and mr
[start,end) ranges, the offset of mr->start relative to map->start
is not taken into account. This leads to some incorrect and duplicate
mappings.
For the case when mr->start < map->start the code is already correct:
the range in [mr->start, map->start) was handled by a different
iteration.
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Cc: stable(a)vger.kernel.org
Signed-off-by: Si-Wei Liu <si-wei.liu(a)oracle.com>
Signed-off-by: Dragos Tatulea <dtatulea(a)nvidia.com>
---
drivers/vdpa/mlx5/core/mr.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index 2dd21e0b399e..7d0c83b5b071 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -373,7 +373,7 @@ static int map_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr
struct page *pg;
unsigned int nsg;
int sglen;
- u64 pa;
+ u64 pa, offset;
u64 paend;
struct scatterlist *sg;
struct device *dma = mvdev->vdev.dma_dev;
@@ -396,8 +396,10 @@ static int map_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr
sg = mr->sg_head.sgl;
for (map = vhost_iotlb_itree_first(iotlb, mr->start, mr->end - 1);
map; map = vhost_iotlb_itree_next(map, mr->start, mr->end - 1)) {
- paend = map->addr + maplen(map, mr);
- for (pa = map->addr; pa < paend; pa += sglen) {
+ offset = mr->start > map->start ? mr->start - map->start : 0;
+ pa = map->addr + offset;
+ paend = map->addr + offset + maplen(map, mr);
+ for (; pa < paend; pa += sglen) {
pg = pfn_to_page(__phys_to_pfn(pa));
if (!sg) {
mlx5_vdpa_warn(mvdev, "sg null. start 0x%llx, end 0x%llx\n",
--
2.46.1
On a x86 system under test with 1780 CPUs, topology_span_sane() takes
around 8 seconds cumulatively for all the iterations. It is an expensive
operation which does the sanity of non-NUMA topology masks.
CPU topology is not something which changes very frequently hence make
this check optional for the systems where the topology is trusted and
need faster bootup.
Restrict this to SCHED_DEBUG builds so that this penalty can be avoided
for the systems who wants to avoid it.
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Signed-off-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
---
kernel/sched/topology.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9748a4c8d668..dacc8c6f978b 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2354,6 +2354,7 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
return sd;
}
+#ifdef CONFIG_SCHED_DEBUG
/*
* Ensure topology masks are sane, i.e. there are no conflicts (overlaps) for
* any two given CPUs at this (non-NUMA) topology level.
@@ -2387,6 +2388,7 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
return true;
}
+#endif
/*
* Build sched domains for a given set of CPUs and attach the sched domains
@@ -2417,8 +2419,10 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
sd = NULL;
for_each_sd_topology(tl) {
+#ifdef CONFIG_SCHED_DEBUG
if (WARN_ON(!topology_span_sane(tl, cpu_map, i)))
goto error;
+#endif
sd = build_sched_domain(tl, cpu_map, attr, sd, i);
--
2.43.0
Hello,
I'd like to report a regression to menuconfig.
In "Device Drivers" ---> "Input device support"
There used to be submenus for keyboards, mice etc.
Now, only the entry "Hardware I/O ports" remains.
They can still be accessed (and configured) by searching for them,
then pressing the corresponding number:
/Keyboard
/KEYBOARD_ATKBD
I also determined the commit:
f79dc03fe68c79d388908182e68d702f7f1786bc
kconfig: refactor choice value calculation
#regzbot introduced: f79dc03fe68c
I've only encountered this here but it is not impossible other entries
elsewhere in menuconfig might be missing aswell.
The issue is present in stable 6.11.5 and mainline.
Kind regards,
Edmund Raile.
Hello,
I'd like to report a regression in firewire-ohci that results
in the kernel hardlocking when re-discovering a FireWire device.
TI XIO2213B
RME FireFace 800
It will occur under three conditions:
* power-cycling the FireWire device
* un- and re-plugging the FireWire device
* suspending and then waking the PC
Often it would also occur directly on boot in QEMU but I have not
yet observed this specific behavior on bare metal.
Here is an excerpt from the stack trace (don't know whether it is
acceptable to send in full):
kernel: ------------[ cut here ]------------
kernel: refcount_t: addition on 0; use-after-free.
kernel: WARNING: CPU: 3 PID: 116 at lib/refcount.c:25
refcount_warn_saturate (/build/linux/lib/refcount.c:25 (discriminator
1))
kernel: Workqueue: firewire_ohci bus_reset_work
kernel: RIP: 0010:refcount_warn_saturate
(/build/linux/lib/refcount.c:25 (discriminator 1))
kernel: Call Trace:
kernel: <TASK>
kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
(discriminator 1))
kernel: ? __warn.cold (/build/linux/kernel/panic.c:693)
kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
(discriminator 1))
kernel: ? report_bug (/build/linux/lib/bug.c:180
/build/linux/lib/bug.c:219)
kernel: ? handle_bug (/build/linux/arch/x86/kernel/traps.c:218)
kernel: ? exc_invalid_op (/build/linux/arch/x86/kernel/traps.c:260
(discriminator 1))
kernel: ? asm_exc_invalid_op
(/build/linux/./arch/x86/include/asm/idtentry.h:621)
kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
(discriminator 1))
kernel: for_each_fw_node (/build/linux/./include/linux/refcount.h:190
/build/linux/./include/linux/refcount.h:241
/build/linux/./include/linux/refcount.h:258
/build/linux/drivers/firewire/core.h:199
/build/linux/drivers/firewire/core-topology.c:275)
kernel: ? __pfx_report_found_node (/build/linux/drivers/firewire/core-
topology.c:312)
kernel: fw_core_handle_bus_reset (/build/linux/drivers/firewire/core-
topology.c:399 (discriminator 1) /build/linux/drivers/firewire/core-
topology.c:504 (discriminator 1))
kernel: bus_reset_work (/build/linux/drivers/firewire/ohci.c:2121)
kernel: process_one_work
(/build/linux/./arch/x86/include/asm/jump_label.h:27
/build/linux/./include/linux/jump_label.h:207
/build/linux/./include/trace/events/workqueue.h:110
/build/linux/kernel/workqueue.c:3236)
kernel: worker_thread (/build/linux/kernel/workqueue.c:3306
(discriminator 2) /build/linux/kernel/workqueue.c:3393 (discriminator
2))
kernel: ? __pfx_worker_thread (/build/linux/kernel/workqueue.c:3339)
kernel: kthread (/build/linux/kernel/kthread.c:389)
kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
kernel: ret_from_fork (/build/linux/arch/x86/kernel/process.c:153)
kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
kernel: ret_from_fork_asm (/build/linux/arch/x86/entry/entry_64.S:254)
kernel: </TASK>
I have identified the commit via bisection:
24b7f8e5cd656196a13077e160aec45ad89b58d9
firewire: core: use helper functions for self ID sequence
It was part of the following patch series:
firewire: add tracepoints events for self ID sequence
https://lore.kernel.org/all/20240605235155.116468-6-o-takashi@sakamocchi.jp/
#regzbot introduced: 24b7f8e5cd65
Since this was before v6.10-rc5 and stable 6.10.14 is EOL,
stable v6.11.5 and mainline are affected.
Reversion appears to be non-trivial as it is part of a patch
series, other files have been altered as well and other commits
build on top of it.
Call chain:
core-topology.c fw_core_handle_bus_reset()
-> core-topology.c for_each_fw_node(card, local_node,
report_found_node)
-> core.h fw_node_get(root)
-> refcount.h __refcount_inc(&node)
-> refcount.h __refcount_add(1, r, oldp);
-> refcount.h refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
-> refcount.h REFCOUNT_WARN("addition on 0; use-after-free")
Since local_node of fw_core_handle_bus_reset() is retrieved by
local_node = build_tree(card, self_ids, self_id_count);
build_tree() needs to be looked at, it was indeed altered by
24b7f8e5cd65.
After a hard 3 hour look traversing all used functions and comparing
against the original function (as of e404cacfc5ed), this caught my eye:
for (port_index = 0; port_index < total_port_count;
++port_index) {
switch (port_status) {
case PHY_PACKET_SELF_ID_PORT_STATUS_PARENT:
node->color = i;
In both for loops, "port_index" was replaced by "i"
"i" remains in use above:
for (i = 0, h = &stack; i < child_port_count; i++)
h = h->prev;
While the original also used the less descriptive i in the loop
for (i = 0; i < port_count; i++) {
switch (get_port_type(sid, i)) {
case SELFID_PORT_PARENT:
node->color = i;
but reset it to 0 at the beginning of the loop.
So the stray "i" in the for loop should be replaced with the loop
iterator "port_index" as it is meant to be synchronous with the
loop iterator (i.e. the port_index), no?
diff --git a/drivers/firewire/core-topology.c b/drivers/firewire/core-
topology.c
index 8c10f47cc8fc..7fd91ba9c9c4 100644
--- a/drivers/firewire/core-topology.c
+++ b/drivers/firewire/core-topology.c
@@ -207,7 +207,7 @@ static struct fw_node *build_tree(struct fw_card
*card, const u32 *sid, int self
// the node->ports array where the
parent node should be. Later,
// when we handle the parent node, we
fix up the reference.
++parent_count;
- node->color = i;
+ node->color = port_index;
break;
What threw me off was discaridng node->color as it would be replaced
later anyways (can't be important!), or so I thought.
Please tell me, is this line of reasoning correct or am I missing
something?
Compiling 24b7f8e5cd65 and later mainline with the patch above
resulted in a kernel that didn't crash!
In case my solution should turn out to be correct, I will gladly
submit the patch.
Kind regards,
Edmund Raile.
With the transition of pd-mapper into the kernel, the timing was altered
such that on some targets the initial rpmsg_send() requests from
pmic_glink clients would be attempted before the firmware had announced
intents, and the firmware reject intent requests.
Fix this
Signed-off-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
---
Changes in v2:
- Introduced "intents" and fixed a few spelling mistakes in the commit
message of patch 1
- Cleaned up log snippet in commit message of patch 2, added battery
manager log
- Changed the arbitrary 10 second timeout to 5... Ought to be enough for
anybody.
- Added a small sleep in the send-loop in patch 2, and by that
refactored the loop completely.
- Link to v1: https://lore.kernel.org/r/20241022-pmic-glink-ecancelled-v1-0-9e26fc74e0a3@…
---
Bjorn Andersson (2):
rpmsg: glink: Handle rejected intent request better
soc: qcom: pmic_glink: Handle GLINK intent allocation rejections
drivers/rpmsg/qcom_glink_native.c | 10 +++++++---
drivers/soc/qcom/pmic_glink.c | 25 ++++++++++++++++++++++---
2 files changed, 29 insertions(+), 6 deletions(-)
---
base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
change-id: 20241022-pmic-glink-ecancelled-d899a9ca0358
Best regards,
--
Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
Adds support for detecting and reporting the speed of unaligned vector
accesses on RISC-V CPUs. Adds vec_misaligned_speed key to the hwprobe
adds Zicclsm to cpufeature and fixes the check for scalar unaligned
emulated all CPUs. The vec_misaligned_speed key keeps the same format
as the scalar unaligned access speed key.
This set does not emulate unaligned vector accesses on CPUs that do not
support them. Only reports if userspace can run them and speed of
unaligned vector accesses if supported.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
Signed-off-by: Jesse Taube <jesse(a)rivosinc.com>
---
Changes in V6:
Added ("RISC-V: Scalar unaligned access emulated on hotplug CPUs")
Changes in V8:
Dropped Zicclsm
s/RISCV_HWPROBE_VECTOR_MISALIGNED/RISCV_HWPROBE_MISALIGNED_VECTOR/g
to match RISCV_HWPROBE_MISALIGNED_SCALAR_*
Rebased onto palmer/fixes (32d5f7add080a936e28ab4142bfeea6b06999789)
Changes in V9:
Missed a RISCV_HWPROBE_VECTOR_MISALIGNED...
Changes in V10:
- I sent on behalf of Jesse
- Remove v0 from clobber args in inline asm and leave comment
---
Jesse Taube (6):
RISC-V: Check scalar unaligned access on all CPUs
RISC-V: Scalar unaligned access emulated on hotplug CPUs
RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
RISC-V: Detect unaligned vector accesses supported
RISC-V: Report vector unaligned access speed hwprobe
RISC-V: hwprobe: Document unaligned vector perf key
Documentation/arch/riscv/hwprobe.rst | 16 +++
arch/riscv/Kconfig | 58 ++++++++++-
arch/riscv/include/asm/cpufeature.h | 10 +-
arch/riscv/include/asm/entry-common.h | 11 --
arch/riscv/include/asm/hwprobe.h | 2 +-
arch/riscv/include/asm/vector.h | 2 +
arch/riscv/include/uapi/asm/hwprobe.h | 5 +
arch/riscv/kernel/Makefile | 3 +-
arch/riscv/kernel/copy-unaligned.h | 5 +
arch/riscv/kernel/fpu.S | 4 +-
arch/riscv/kernel/sys_hwprobe.c | 41 ++++++++
arch/riscv/kernel/traps_misaligned.c | 139 +++++++++++++++++++++++--
arch/riscv/kernel/unaligned_access_speed.c | 156 +++++++++++++++++++++++++++--
arch/riscv/kernel/vec-copy-unaligned.S | 58 +++++++++++
arch/riscv/kernel/vector.c | 2 +-
15 files changed, 474 insertions(+), 38 deletions(-)
---
base-commit: 98f7e32f20d28ec452afb208f9cffc08448a2652
change-id: 20240920-jesse_unaligned_vector-7083fd28659c
--
- Charlie
The raw value conversion to obtain a measurement in lux as
INT_PLUS_MICRO does not calculate the decimal part properly to display
it as micro (in this case microlux). It only calculates the module to
obtain the decimal part from a resolution that is 10000 times the
provided in the datasheet (0.5376 lux/cnt for the veml6030). The
resulting value must still be multiplied by 100 to make it micro.
This bug was introduced with the original implementation of the driver.
Cc: stable(a)vger.kernel.org
Fixes: 7b779f573c48 ("iio: light: add driver for veml6030 ambient light sensor")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
I found this almost by chance while testing new supported devices. The
decimal part was always suspiciously small, and when I compared samples
to the expected value according to the datasheet, it became clear what was
going on.
Example with a veml7700 (same resolution as the veml6030):
Resolution for gain = 1/8, IT = 100 ms: 0.5736 lux/cnt.
cat in_illuminance_raw in_illuminance_input
40
21.005040 -> wrong! 40 * 0.5736 is 21.504.
Tested with a veml6035 and a veml7700, the same will happen with the
original veml6030, as the operation is identical for all devices.
---
drivers/iio/light/veml6030.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/light/veml6030.c b/drivers/iio/light/veml6030.c
index d6f3b104b0e6..a0bf03e37df7 100644
--- a/drivers/iio/light/veml6030.c
+++ b/drivers/iio/light/veml6030.c
@@ -691,7 +691,7 @@ static int veml6030_read_raw(struct iio_dev *indio_dev,
}
if (mask == IIO_CHAN_INFO_PROCESSED) {
*val = (reg * data->cur_resolution) / 10000;
- *val2 = (reg * data->cur_resolution) % 10000;
+ *val2 = (reg * data->cur_resolution) % 10000 * 100;
return IIO_VAL_INT_PLUS_MICRO;
}
*val = reg;
---
base-commit: 15e7d45e786a62a211dd0098fee7c57f84f8c681
change-id: 20241016-veml6030-fix-processed-micro-616d00d555dc
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Flush the g2h worker explicitly if TLB timeout happens which is
observed on LNL and that points to the recent scheduling issue with
E-cores on LNL.
This is similar to the recent fix:
commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
response timeout") and should be removed once there is E core
scheduling fix.
v2: Add platform check(Himal)
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2687
Cc: Badal Nilawar <badal.nilawar(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: John Harrison <John.C.Harrison(a)Intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.11+
Signed-off-by: Nirmoy Das <nirmoy.das(a)intel.com>
Reviewed-by: Matthew Brost <matthew.brost(a)intel.com>
Acked-by: Badal Nilawar <badal.nilawar(a)intel.com>
---
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index 773de1f08db9..5aba6ed950b7 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -6,6 +6,7 @@
#include "xe_gt_tlb_invalidation.h"
#include "abi/guc_actions_abi.h"
+#include "compat-i915-headers/i915_drv.h"
#include "xe_device.h"
#include "xe_force_wake.h"
#include "xe_gt.h"
@@ -72,6 +73,16 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work)
struct xe_device *xe = gt_to_xe(gt);
struct xe_gt_tlb_invalidation_fence *fence, *next;
+ /*
+ * This is analogous to e51527233804 ("drm/xe/guc/ct: Flush g2h worker
+ * in case of g2h response timeout")
+ *
+ * TODO: Drop this change once workqueue scheduling delay issue is
+ * fixed on LNL Hybrid CPU.
+ */
+ if (IS_LUNARLAKE(xe))
+ flush_work(>->uc.guc.ct.g2h_worker);
+
spin_lock_irq(>->tlb_invalidation.pending_lock);
list_for_each_entry_safe(fence, next,
>->tlb_invalidation.pending_fences, link) {
--
2.46.0
In the original code, the device reset would not be triggered
when the driver is set to multi-master and bus is free.
It needs to be considered with multi-master condition.
Fixes: <f327c686d3ba> ("i2c: aspeed: Reset the i2c controller when timeout occurs")
Signed-off-by: Tommy Huang <tommy_huang(a)aspeedtech.com>
---
drivers/i2c/busses/i2c-aspeed.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index cc5a26637fd5..7639ae3ace67 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -716,14 +716,15 @@ static int aspeed_i2c_master_xfer(struct i2c_adapter *adap,
if (time_left == 0) {
/*
* In a multi-master setup, if a timeout occurs, attempt
- * recovery. But if the bus is idle, we still need to reset the
- * i2c controller to clear the remaining interrupts.
+ * recovery device. But if the bus is idle,
+ * we still need to reset the i2c controller to clear
+ * the remaining interrupts or reset device abnormal condition.
*/
- if (bus->multi_master &&
- (readl(bus->base + ASPEED_I2C_CMD_REG) &
- ASPEED_I2CD_BUS_BUSY_STS))
- aspeed_i2c_recover_bus(bus);
- else
+ if ((readl(bus->base + ASPEED_I2C_CMD_REG) &
+ ASPEED_I2CD_BUS_BUSY_STS)){
+ if (bus->multi_master)
+ aspeed_i2c_recover_bus(bus);
+ } else
aspeed_i2c_reset(bus);
/*
--
2.25.1
Hi there,
We are excited to offer you a comprehensive email list of school districts that includes key contact information such as phone numbers, email addresses, mailing addresses, company revenue, size, and web addresses. Our databases also cover related industries such as:
* K-12 schools
* Universities
* Vocational schools and training programs
* Performing arts schools
* Fitness centers and gyms
* Child care services and providers
* Educational publishers and suppliers
If you're interested, we would be happy to provide you with relevant counts and a test file based on your specific requirements.
Thank you for your time and consideration, and please let us know if you have any questions or concerns.
Thanks,
Ava Phillips
To remove from this mailing reply with the subject line " LEAVE US".
tpm2_sessions_init() does not ignore the result of
tpm2_create_null_primary(). Address this by returning -ENODEV to the
caller. Given that upper layers cannot help healing the situation
further, deal with the TPM error here by
Cc: stable(a)vger.kernel.org # v6.10+
Fixes: d2add27cf2b8 ("tpm: Add NULL primary creation")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v7:
- Add the error message back but fix it up a bit:
1. Remove 'TPM:' given dev_err().
2. s/NULL/null/ as this has nothing to do with the macro in libc.
3. Fix the reasoning: null key creation failed
v6:
- Address:
https://lore.kernel.org/linux-integrity/69c893e7-6b87-4daa-80db-44d1120e80f…
as TPM RC is taken care of at the call site. Add also the missing
documentation for the return values.
v5:
- Do not print klog messages on error, as tpm2_save_context() already
takes care of this.
v4:
- Fixed up stable version.
v3:
- Handle TPM and POSIX error separately and return -ENODEV always back
to the caller.
v2:
- Refined the commit message.
---
drivers/char/tpm/tpm2-sessions.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index d3521aadd43e..1e12e0b2492e 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -1347,14 +1347,21 @@ static int tpm2_create_null_primary(struct tpm_chip *chip)
*
* Derive and context save the null primary and allocate memory in the
* struct tpm_chip for the authorizations.
+ *
+ * Return:
+ * * 0 - OK
+ * * -errno - A system error
+ * * TPM_RC - A TPM error
*/
int tpm2_sessions_init(struct tpm_chip *chip)
{
int rc;
rc = tpm2_create_null_primary(chip);
- if (rc)
- dev_err(&chip->dev, "TPM: security failed (NULL seed derivation): %d\n", rc);
+ if (rc) {
+ dev_err(&chip->dev, "null primary key creation failed with %d\n", rc);
+ return rc;
+ }
chip->auth = kmalloc(sizeof(*chip->auth), GFP_KERNEL);
if (!chip->auth)
--
2.47.0
The existing code moves VF to the same namespace as the synthetic NIC
during netvsc_register_vf(). But, if the synthetic device is moved to a
new namespace after the VF registration, the VF won't be moved together.
To make the behavior more consistent, add a namespace check for synthetic
NIC's NETDEV_REGISTER event (generated during its move), and move the VF
if it is not in the same namespace.
Cc: stable(a)vger.kernel.org
Fixes: c0a41b887ce6 ("hv_netvsc: move VF to same namespace as netvsc device")
Suggested-by: Stephen Hemminger <stephen(a)networkplumber.org>
Signed-off-by: Haiyang Zhang <haiyangz(a)microsoft.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
---
v3: Use RCT order as suggested by Simon.
v2: Move my fix to synthetic NIC's NETDEV_REGISTER event as suggested by Stephen.
---
drivers/net/hyperv/netvsc_drv.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 153b97f8ec0d..23180f7b67b6 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2798,6 +2798,31 @@ static struct hv_driver netvsc_drv = {
},
};
+/* Set VF's namespace same as the synthetic NIC */
+static void netvsc_event_set_vf_ns(struct net_device *ndev)
+{
+ struct net_device_context *ndev_ctx = netdev_priv(ndev);
+ struct net_device *vf_netdev;
+ int ret;
+
+ vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
+ if (!vf_netdev)
+ return;
+
+ if (!net_eq(dev_net(ndev), dev_net(vf_netdev))) {
+ ret = dev_change_net_namespace(vf_netdev, dev_net(ndev),
+ "eth%d");
+ if (ret)
+ netdev_err(vf_netdev,
+ "Cannot move to same namespace as %s: %d\n",
+ ndev->name, ret);
+ else
+ netdev_info(vf_netdev,
+ "Moved VF to namespace with: %s\n",
+ ndev->name);
+ }
+}
+
/*
* On Hyper-V, every VF interface is matched with a corresponding
* synthetic interface. The synthetic interface is presented first
@@ -2810,6 +2835,11 @@ static int netvsc_netdev_event(struct notifier_block *this,
struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
int ret = 0;
+ if (event_dev->netdev_ops == &device_ops && event == NETDEV_REGISTER) {
+ netvsc_event_set_vf_ns(event_dev);
+ return NOTIFY_DONE;
+ }
+
ret = check_dev_is_matching_vf(event_dev);
if (ret != 0)
return NOTIFY_DONE;
--
2.34.1
On Tue, Mar 12, 2024 at 07:04:10AM -0400, Eric Hagberg wrote:
> On Thu, Mar 7, 2024 at 11:33 AM Steve Wahl <steve.wahl(a)hpe.com> wrote:
> > What Linux Distribution are you running on that machine? My guess
> > would be that this is not distro related; if you are running something
> > quite different from Pavin that would confirm this.
>
> Distro in use is Rocky 8, so it’s pretty clear not to be distro-specific.
>
> > I found an AMD based system to try to reproduce this on.
>
> yeah, it probably requires either a specific cpu or set or devices plus cpu
> to trigger… found that it also affects Dell R7625 servers in addition to
> the R6615s
I agree that it's likely the CPU or particular set of surrounding
devices that trigger the problem.
I have not succeeded in reproducing the problem yet. I tried an AMD
based system lent to me, but it's probably the wrong generation (AMD
EPYC 7251) and I didn't see the problem. I have a line on a system
that's more in line with the systems the bug was reported on that I
should be able to try tomorrow.
I would love to have some direction from the community at large on
this. The fact that nogbpages on the command line causes the same
problem without my patch suggests it's not bad code directly in my
patch, but something in the way kexec reacts to the resulting identity
map. One quick solution would be a kernel command line parameter to
select between the previous identity map creation behavior and the new
behavior. E.g. in addition to "nogbpages", we could have
"somegbpages" and "allgbpages" -- or gbpages=[all, some, none] with
nogbpages a synonym for backwards compatibility.
But I don't want to introduce a new command line parameter if the
actual problem can be understood and fixed. The question is how much
time do I have to persue a direct fix before some other action needs
to be taken?
Thanks,
--> Steve Wahl
--
Steve Wahl, Hewlett Packard Enterprise
When ata_qc_complete() schedules a command for EH using
ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to
req->q->mq_ops->timeout() / scsi_timeout() being called.
scsi_timeout(), if the LLDD has no abort handler (libata has no abort
handler), will set host byte to DID_TIME_OUT, and then call
scsi_eh_scmd_add() to add the command to EH.
Thus, when commands first enter libata's EH strategy_handler, all the
commands that have been added to EH will have DID_TIME_OUT set.
Commit e5dd410acb34 ("ata: libata: Clear DID_TIME_OUT for ATA PT commands
with sense data") clears this bogus DID_TIME_OUT flag for all commands
that reached libata's EH strategy_handler.
libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that
have not received a completion at the time of entering EH.
ata_eh_worth_retry() has no special handling for AC_ERR_TIMEOUT, so by
default timed out commands will get flag ATA_QCFLAG_RETRY set and will be
retried after the port has been reset (ata_eh_link_autopsy() always
triggers a port reset if any command has AC_ERR_TIMEOUT set).
For commands that have ATA_QCFLAG_RETRY set, but also has an error flag
set (e.g. AC_ERR_TIMEOUT), ata_eh_finish() will not increment
scmd->allowed, so the command will at most be retried (scmd->allowed
number of times, which by default is set to 3).
However, scsi_eh_flush_done_q() will only retry commands for which
scsi_noretry_cmd() returns false.
For commands that has DID_TIME_OUT set, if the command is either
has FAILFAST or if the command is a passthrough command, scsi_noretry_cmd()
will return true. Thus, such commands will never be retried.
Thus, make sure that libata sets SCSI's DID_TIME_OUT flag for commands that
actually timed out (libata's AC_ERR_TIMEOUT flag), such that timed out
commands will once again not be retried if they are also a FAILFAST or
passthrough command.
Cc: stable(a)vger.kernel.org
Fixes: e5dd410acb34 ("ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data")
Reported-by: Lai, Yi <yi1.lai(a)linux.intel.com>
Closes: https://lore.kernel.org/linux-ide/ZxYz871I3Blsi30F@ly-workstation/
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/ata/libata-eh.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index fa41ea57a978..3b303d4ae37a 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -651,6 +651,7 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
/* the scmd has an associated qc */
if (!(qc->flags & ATA_QCFLAG_EH)) {
/* which hasn't failed yet, timeout */
+ set_host_byte(scmd, DID_TIME_OUT);
qc->err_mask |= AC_ERR_TIMEOUT;
qc->flags |= ATA_QCFLAG_EH;
nr_timedout++;
--
2.47.0
The patch titled
Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Dongjoo Seo <dongjoo.linux.dev(a)gmail.com>
Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
Date: Wed, 23 Oct 2024 10:50:37 -0700
In the case of memoryless node, when a process prefers a node with no
memory(e.g., because it is running on a CPU local to that node), the
kernel treats a nearby node with memory as the preferred node. As a
result, such allocations do not increment the numa_foreign counter on the
memoryless node, leading to skewed NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN
stats for the nearest node.
This patch corrects this issue by:
1. Checking if the zone or preferred zone is CPU-less before updating
the NUMA stats.
2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
CPU-less.
Example Before and After Patch:
- Before Patch:
node0 node1 node2
numa_hit 86333181 114338269 5108
numa_miss 5199455 0 56844591
numa_foreign 32281033 29763013 0
interleave_hit 91 91 0
local_node 86326417 114288458 0
other_node 5206219 49768 56849702
- After Patch:
node0 node1 node2
numa_hit 2523058 9225528 0
numa_miss 150213 10226 21495942
numa_foreign 17144215 4501270 0
interleave_hit 91 94 0
local_node 2493918 9208226 0
other_node 179351 27528 21495942
Similarly, in the context of cpuless nodes, this patch ensures that NUMA
statistics are accurately updated by adding checks to prevent the
miscounting of memory allocations when the involved nodes have no CPUs.
This ensures more precise tracking of memory access patterns accross all
nodes, regardless of whether they have CPUs or not, improving the overall
reliability of NUMA stat. The reason is that page allocation from
dev_dax, cpuset, memcg .. comes with preferred allocating zone in cpuless
node and its hard to track the zone info for miss information.
Link: https://lkml.kernel.org/r/20241023175037.9125-1-dongjoo.linux.dev@gmail.com
Signed-off-by: Dongjoo Seo <dongjoo.linux.dev(a)gmail.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Fan Ni <nifan(a)outlook.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Adam Manzanares <a.manzanares(a)samsung.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes
+++ a/mm/page_alloc.c
@@ -2858,19 +2858,21 @@ static inline void zone_statistics(struc
{
#ifdef CONFIG_NUMA
enum numa_stat_item local_stat = NUMA_LOCAL;
+ bool z_is_cpuless = !node_state(zone_to_nid(z), N_CPU);
+ bool pref_is_cpuless = !node_state(zone_to_nid(preferred_zone), N_CPU);
- /* skip numa counters update if numa stats is disabled */
if (!static_branch_likely(&vm_numa_stat_key))
return;
- if (zone_to_nid(z) != numa_node_id())
+ if (zone_to_nid(z) != numa_node_id() || z_is_cpuless)
local_stat = NUMA_OTHER;
- if (zone_to_nid(z) == zone_to_nid(preferred_zone))
+ if (zone_to_nid(z) == zone_to_nid(preferred_zone) && !z_is_cpuless)
__count_numa_events(z, NUMA_HIT, nr_account);
else {
__count_numa_events(z, NUMA_MISS, nr_account);
- __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
+ if (!pref_is_cpuless)
+ __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
}
__count_numa_events(z, local_stat, nr_account);
#endif
_
Patches currently in -mm which might be from dongjoo.linux.dev(a)gmail.com are
mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch
On 22. 10. 24, 19:54, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> tty/serial: Make ->dcd_change()+uart_handle_dcd_change() status bool active
This is a cleanup, not needed in stable. (Unless something
context-depends on it.)
--
js
suse labs
The quilt patch titled
Subject: Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM"
has been removed from the -mm tree. Its filename was
revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM"
Date: Fri, 18 Oct 2024 17:17:22 +0000
Patch series "selftests/mm: revert pthread_barrier change"
On Android arm, pthread_create followed by a fork caused a deadlock in
the case where the fork required work to be completed by the created
thread.
The previous patches incorrectly assumed that the parent would
always initialize the pthread_barrier for the child thread. This
reverts the change and replaces the fix for wp-fork-with-event with the
original use of atomic_bool.
This patch (of 3):
This reverts commit e142cc87ac4ec618f2ccf5f68aedcd6e28a59d9d.
fork_event_consumer may be called by other tests that do not initialize
the pthread_barrier, so this approach is not correct. The subsequent
patch will revert to using atomic_bool instead.
Link: https://lkml.kernel.org/r/20241018171734.2315053-1-edliaw@google.com
Link: https://lkml.kernel.org/r/20241018171734.2315053-2-edliaw@google.com
Fixes: e142cc87ac4e ("fix deadlock for fork after pthread_create on ARM")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 7 -------
1 file changed, 7 deletions(-)
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -241,9 +241,6 @@ static void *fork_event_consumer(void *d
fork_event_args *args = data;
struct uffd_msg msg = { 0 };
- /* Ready for parent thread to fork */
- pthread_barrier_wait(&ready_for_fork);
-
/* Read until a full msg received */
while (uffd_read_msg(args->parent_uffd, &msg));
@@ -311,12 +308,8 @@ static int pagemap_test_fork(int uffd, b
/* Prepare a thread to resolve EVENT_FORK */
if (with_event) {
- pthread_barrier_init(&ready_for_fork, NULL, 2);
if (pthread_create(&thread, NULL, fork_event_consumer, &args))
err("pthread_create()");
- /* Wait for child thread to start before forking */
- pthread_barrier_wait(&ready_for_fork);
- pthread_barrier_destroy(&ready_for_fork);
}
child = fork();
_
Patches currently in -mm which might be from edliaw(a)google.com are
The quilt patch titled
Subject: nilfs2: fix kernel bug due to missing clearing of checked flag
has been removed from the -mm tree. Its filename was
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-checked-flag.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix kernel bug due to missing clearing of checked flag
Date: Fri, 18 Oct 2024 04:33:10 +0900
Syzbot reported that in directory operations after nilfs2 detects
filesystem corruption and degrades to read-only,
__block_write_begin_int(), which is called to prepare block writes, may
fail the BUG_ON check for accesses exceeding the folio/page size,
triggering a kernel bug.
This was found to be because the "checked" flag of a page/folio was not
cleared when it was discarded by nilfs2's own routine, which causes the
sanity check of directory entries to be skipped when the directory
page/folio is reloaded. So, fix that.
This was necessary when the use of nilfs2's own page discard routine was
applied to more than just metadata files.
Link: https://lkml.kernel.org/r/20241017193359.5051-1-konishi.ryusuke@gmail.com
Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d6ca2daf692c7a82f959(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/page.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/nilfs2/page.c~nilfs2-fix-kernel-bug-due-to-missing-clearing-of-checked-flag
+++ a/fs/nilfs2/page.c
@@ -400,6 +400,7 @@ void nilfs_clear_folio_dirty(struct foli
folio_clear_uptodate(folio);
folio_clear_mappedtodisk(folio);
+ folio_clear_checked(folio);
head = folio_buffers(folio);
if (head) {
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-potential-deadlock-with-newly-created-symlinks.patch
The quilt patch titled
Subject: ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
has been removed from the -mm tree. Its filename was
ocfs2-pass-u64-to-ocfs2_truncate_inline-maybe-overflow.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Edward Adam Davis <eadavis(a)qq.com>
Subject: ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
Date: Wed, 16 Oct 2024 19:43:47 +0800
Syzbot reported a kernel BUG in ocfs2_truncate_inline. There are two
reasons for this: first, the parameter value passed is greater than
ocfs2_max_inline_data_with_xattr, second, the start and end parameters of
ocfs2_truncate_inline are "unsigned int".
So, we need to add a sanity check for byte_start and byte_len right before
ocfs2_truncate_inline() in ocfs2_remove_inode_range(), if they are greater
than ocfs2_max_inline_data_with_xattr return -EINVAL.
Link: https://lkml.kernel.org/r/tencent_D48DB5122ADDAEDDD11918CFB68D93258C07@qq.c…
Fixes: 1afc32b95233 ("ocfs2: Write support for inline data")
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Reported-by: syzbot+81092778aac03460d6b7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/fs/ocfs2/file.c~ocfs2-pass-u64-to-ocfs2_truncate_inline-maybe-overflow
+++ a/fs/ocfs2/file.c
@@ -1784,6 +1784,14 @@ int ocfs2_remove_inode_range(struct inod
return 0;
if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
+ int id_count = ocfs2_max_inline_data_with_xattr(inode->i_sb, di);
+
+ if (byte_start > id_count || byte_start + byte_len > id_count) {
+ ret = -EINVAL;
+ mlog_errno(ret);
+ goto out;
+ }
+
ret = ocfs2_truncate_inline(inode, di_bh, byte_start,
byte_start + byte_len, 0);
if (ret) {
_
Patches currently in -mm which might be from eadavis(a)qq.com are
The quilt patch titled
Subject: mm: shmem: fix data-race in shmem_getattr()
has been removed from the -mm tree. Its filename was
mm-shmem-fix-data-race-in-shmem_getattr.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jeongjun Park <aha310510(a)gmail.com>
Subject: mm: shmem: fix data-race in shmem_getattr()
Date: Mon, 9 Sep 2024 21:35:58 +0900
I got the following KCSAN report during syzbot testing:
==================================================================
BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current
write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1:
inode_set_ctime_to_ts include/linux/fs.h:1638 [inline]
inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626
shmem_mknod+0x117/0x180 mm/shmem.c:3443
shmem_create+0x34/0x40 mm/shmem.c:3497
lookup_open fs/namei.c:3578 [inline]
open_last_lookups fs/namei.c:3647 [inline]
path_openat+0xdbc/0x1f00 fs/namei.c:3883
do_filp_open+0xf7/0x200 fs/namei.c:3913
do_sys_openat2+0xab/0x120 fs/open.c:1416
do_sys_open fs/open.c:1431 [inline]
__do_sys_openat fs/open.c:1447 [inline]
__se_sys_openat fs/open.c:1442 [inline]
__x64_sys_openat+0xf3/0x120 fs/open.c:1442
x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x76/0x7e
read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0:
inode_get_ctime_nsec include/linux/fs.h:1623 [inline]
inode_get_ctime include/linux/fs.h:1629 [inline]
generic_fillattr+0x1dd/0x2f0 fs/stat.c:62
shmem_getattr+0x17b/0x200 mm/shmem.c:1157
vfs_getattr_nosec fs/stat.c:166 [inline]
vfs_getattr+0x19b/0x1e0 fs/stat.c:207
vfs_statx_path fs/stat.c:251 [inline]
vfs_statx+0x134/0x2f0 fs/stat.c:315
vfs_fstatat+0xec/0x110 fs/stat.c:341
__do_sys_newfstatat fs/stat.c:505 [inline]
__se_sys_newfstatat+0x58/0x260 fs/stat.c:499
__x64_sys_newfstatat+0x55/0x70 fs/stat.c:499
x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x76/0x7e
value changed: 0x2755ae53 -> 0x27ee44d3
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
==================================================================
When calling generic_fillattr(), if you don't hold read lock, data-race
will occur in inode member variables, which can cause unexpected
behavior.
Since there is no special protection when shmem_getattr() calls
generic_fillattr(), data-race occurs by functions such as shmem_unlink()
or shmem_mknod(). This can cause unexpected results, so commenting it out
is not enough.
Therefore, when calling generic_fillattr() from shmem_getattr(), it is
appropriate to protect the inode using inode_lock_shared() and
inode_unlock_shared() to prevent data-race.
Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com
Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
Reported-by: syzbot <syzkaller(a)googlegroup.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/shmem.c~mm-shmem-fix-data-race-in-shmem_getattr
+++ a/mm/shmem.c
@@ -1166,7 +1166,9 @@ static int shmem_getattr(struct mnt_idma
stat->attributes_mask |= (STATX_ATTR_APPEND |
STATX_ATTR_IMMUTABLE |
STATX_ATTR_NODUMP);
+ inode_lock_shared(inode);
generic_fillattr(idmap, request_mask, inode, stat);
+ inode_unlock_shared(inode);
if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0))
stat->blksize = HPAGE_PMD_SIZE;
_
Patches currently in -mm which might be from aha310510(a)gmail.com are
The quilt patch titled
Subject: x86/traps: move kmsan check after instrumentation_begin
has been removed from the -mm tree. Its filename was
x86-traps-move-kmsan-check-after-instrumentation_begin.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Subject: x86/traps: move kmsan check after instrumentation_begin
Date: Wed, 16 Oct 2024 20:24:07 +0500
During x86_64 kernel build with CONFIG_KMSAN, the objtool warns following:
AR built-in.a
AR vmlinux.a
LD vmlinux.o
vmlinux.o: warning: objtool: handle_bug+0x4: call to
kmsan_unpoison_entry_regs() leaves .noinstr.text section
OBJCOPY modules.builtin.modinfo
GEN modules.builtin
MODPOST Module.symvers
CC .vmlinux.export.o
Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes
the warning.
There is decode_bug(regs->ip, &imm) is left before KMSAN unpoisoining, but
it has the return condition and if we include it after
instrumentation_begin() it results the warning "return with
instrumentation enabled", hence, I'm concerned that regs will not be KMSAN
unpoisoned if `ud_type == BUG_NONE` is true.
Link: https://lkml.kernel.org/r/20241016152407.3149001-1-snovitoll@gmail.com
Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()")
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/kernel/traps.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/traps.c~x86-traps-move-kmsan-check-after-instrumentation_begin
+++ a/arch/x86/kernel/traps.c
@@ -261,12 +261,6 @@ static noinstr bool handle_bug(struct pt
int ud_type;
u32 imm;
- /*
- * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug()
- * is a rare case that uses @regs without passing them to
- * irqentry_enter().
- */
- kmsan_unpoison_entry_regs(regs);
ud_type = decode_bug(regs->ip, &imm);
if (ud_type == BUG_NONE)
return handled;
@@ -276,6 +270,12 @@ static noinstr bool handle_bug(struct pt
*/
instrumentation_begin();
/*
+ * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug()
+ * is a rare case that uses @regs without passing them to
+ * irqentry_enter().
+ */
+ kmsan_unpoison_entry_regs(regs);
+ /*
* Since we're emulating a CALL with exceptions, restore the interrupt
* state to what it was at the exception site.
*/
_
Patches currently in -mm which might be from snovitoll(a)gmail.com are
mm-kasan-kmsan-copy_from-to_kernel_nofault.patch
kasan-move-checks-to-do_strncpy_from_user.patch
kasan-migrate-copy_user_test-to-kunit.patch
kasan-delete-config_kasan_module_test.patch
kasan-delete-config_kasan_module_test-fix.patch
The quilt patch titled
Subject: mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves
has been removed from the -mm tree. Its filename was
mm-page_alloc-let-gfp_atomic-order-0-allocs-access-highatomic-reserves.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Matt Fleming <mfleming(a)cloudflare.com>
Subject: mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves
Date: Fri, 11 Oct 2024 13:07:37 +0100
Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
fail even though free pages are available in the highatomic reserves.
GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
since it's only run from reclaim.
Given that such allocations will pass the watermarks in
__zone_watermark_unusable_free(), it makes sense to fallback to highatomic
reserves the same way that ALLOC_OOM can.
This fixes order-0 page allocation failures observed on Cloudflare's fleet
when handling network packets:
kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
nodemask=(null),cpuset=/,mems_allowed=0-7
CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G O 6.6.43-CUSTOM #1
Hardware name: MACHINE
Call Trace:
<IRQ>
dump_stack_lvl+0x3c/0x50
warn_alloc+0x13a/0x1c0
__alloc_pages_slowpath.constprop.0+0xc9d/0xd10
__alloc_pages+0x327/0x340
__napi_alloc_skb+0x16d/0x1f0
bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en]
bnxt_rx_pkt+0x201/0x15e0 [bnxt_en]
__bnxt_poll_work+0x156/0x2b0 [bnxt_en]
bnxt_poll+0xd9/0x1c0 [bnxt_en]
__napi_poll+0x2b/0x1b0
bpf_trampoline_6442524138+0x7d/0x1000
__napi_poll+0x5/0x1b0
net_rx_action+0x342/0x740
handle_softirqs+0xcf/0x2b0
irq_exit_rcu+0x6c/0x90
sysvec_apic_timer_interrupt+0x72/0x90
</IRQ>
[mfleming(a)cloudflare.com: update comment]
Link: https://lkml.kernel.org/r/20241015125158.3597702-1-matt@readmodwrite.com
Link: https://lkml.kernel.org/r/20241011120737.3300370-1-matt@readmodwrite.com
Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytz…
Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs")
Signed-off-by: Matt Fleming <mfleming(a)cloudflare.com>
Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-let-gfp_atomic-order-0-allocs-access-highatomic-reserves
+++ a/mm/page_alloc.c
@@ -2893,12 +2893,12 @@ struct page *rmqueue_buddy(struct zone *
page = __rmqueue(zone, order, migratetype, alloc_flags);
/*
- * If the allocation fails, allow OOM handling access
- * to HIGHATOMIC reserves as failing now is worse than
- * failing a high-order atomic allocation in the
- * future.
+ * If the allocation fails, allow OOM handling and
+ * order-0 (atomic) allocs access to HIGHATOMIC
+ * reserves as failing now is worse than failing a
+ * high-order atomic allocation in the future.
*/
- if (!page && (alloc_flags & ALLOC_OOM))
+ if (!page && (alloc_flags & (ALLOC_OOM|ALLOC_NON_BLOCK)))
page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
if (!page) {
_
Patches currently in -mm which might be from mfleming(a)cloudflare.com are
The quilt patch titled
Subject: fork: only invoke khugepaged, ksm hooks if no error
has been removed from the -mm tree. Its filename was
fork-only-invoke-khugepaged-ksm-hooks-if-no-error.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: fork: only invoke khugepaged, ksm hooks if no error
Date: Tue, 15 Oct 2024 18:56:06 +0100
There is no reason to invoke these hooks early against an mm that is in an
incomplete state.
The change in commit d24062914837 ("fork: use __mt_dup() to duplicate
maple tree in dup_mmap()") makes this more pertinent as we may be in a
state where entries in the maple tree are not yet consistent.
Their placement early in dup_mmap() only appears to have been meaningful
for early error checking, and since functionally it'd require a very small
allocation to fail (in practice 'too small to fail') that'd only occur in
the most dire circumstances, meaning the fork would fail or be OOM'd in
any case.
Since both khugepaged and KSM tracking are there to provide optimisations
to memory performance rather than critical functionality, it doesn't
really matter all that much if, under such dire memory pressure, we fail
to register an mm with these.
As a result, we follow the example of commit d2081b2bf819 ("mm:
khugepaged: make khugepaged_enter() void function") and make ksm_fork() a
void function also.
We only expose the mm to these functions once we are done with them and
only if no error occurred in the fork operation.
Link: https://lkml.kernel.org/r/e0cb8b840c9d1d5a6e84d4f8eff5f3f2022aa10c.17290143…
Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Jann Horn <jannh(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Linus Torvalds <torvalds(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/ksm.h | 10 ++++------
kernel/fork.c | 7 ++-----
2 files changed, 6 insertions(+), 11 deletions(-)
--- a/include/linux/ksm.h~fork-only-invoke-khugepaged-ksm-hooks-if-no-error
+++ a/include/linux/ksm.h
@@ -54,12 +54,11 @@ static inline long mm_ksm_zero_pages(str
return atomic_long_read(&mm->ksm_zero_pages);
}
-static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
+ /* Adding mm to ksm is best effort on fork. */
if (test_bit(MMF_VM_MERGEABLE, &oldmm->flags))
- return __ksm_enter(mm);
-
- return 0;
+ __ksm_enter(mm);
}
static inline int ksm_execve(struct mm_struct *mm)
@@ -107,9 +106,8 @@ static inline int ksm_disable(struct mm_
return 0;
}
-static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
- return 0;
}
static inline int ksm_execve(struct mm_struct *mm)
--- a/kernel/fork.c~fork-only-invoke-khugepaged-ksm-hooks-if-no-error
+++ a/kernel/fork.c
@@ -653,11 +653,6 @@ static __latent_entropy int dup_mmap(str
mm->exec_vm = oldmm->exec_vm;
mm->stack_vm = oldmm->stack_vm;
- retval = ksm_fork(mm, oldmm);
- if (retval)
- goto out;
- khugepaged_fork(mm, oldmm);
-
/* Use __mt_dup() to efficiently build an identical maple tree. */
retval = __mt_dup(&oldmm->mm_mt, &mm->mm_mt, GFP_KERNEL);
if (unlikely(retval))
@@ -760,6 +755,8 @@ loop_out:
vma_iter_free(&vmi);
if (!retval) {
mt_set_in_rcu(vmi.mas.tree);
+ ksm_fork(mm, oldmm);
+ khugepaged_fork(mm, oldmm);
} else if (mpnt) {
/*
* The entire maple tree has already been duplicated. If the
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems-fix.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
mm-pagewalk-add-the-ability-to-install-ptes.patch
mm-add-pte_marker_guard-pte-marker.patch
mm-madvise-implement-lightweight-guard-page-mechanism.patch
tools-testing-update-tools-uapi-header-for-mman-commonh.patch
selftests-mm-add-self-tests-for-guard-page-feature.patch
The quilt patch titled
Subject: fork: do not invoke uffd on fork if error occurs
has been removed from the -mm tree. Its filename was
fork-do-not-invoke-uffd-on-fork-if-error-occurs.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: fork: do not invoke uffd on fork if error occurs
Date: Tue, 15 Oct 2024 18:56:05 +0100
Patch series "fork: do not expose incomplete mm on fork".
During fork we may place the virtual memory address space into an
inconsistent state before the fork operation is complete.
In addition, we may encounter an error during the fork operation that
indicates that the virtual memory address space is invalidated.
As a result, we should not be exposing it in any way to external machinery
that might interact with the mm or VMAs, machinery that is not designed to
deal with incomplete state.
We specifically update the fork logic to defer khugepaged and ksm to the
end of the operation and only to be invoked if no error arose, and
disallow uffd from observing fork events should an error have occurred.
This patch (of 2):
Currently on fork we expose the virtual address space of a process to
userland unconditionally if uffd is registered in VMAs, regardless of
whether an error arose in the fork.
This is performed in dup_userfaultfd_complete() which is invoked
unconditionally, and performs two duties - invoking registered handlers
for the UFFD_EVENT_FORK event via dup_fctx(), and clearing down
userfaultfd_fork_ctx objects established in dup_userfaultfd().
This is problematic, because the virtual address space may not yet be
correctly initialised if an error arose.
The change in commit d24062914837 ("fork: use __mt_dup() to duplicate
maple tree in dup_mmap()") makes this more pertinent as we may be in a
state where entries in the maple tree are not yet consistent.
We address this by, on fork error, ensuring that we roll back state that
we would otherwise expect to clean up through the event being handled by
userland and perform the memory freeing duty otherwise performed by
dup_userfaultfd_complete().
We do this by implementing a new function, dup_userfaultfd_fail(), which
performs the same loop, only decrementing reference counts.
Note that we perform mmgrab() on the parent and child mm's, however
userfaultfd_ctx_put() will mmdrop() this once the reference count drops to
zero, so we will avoid memory leaks correctly here.
Link: https://lkml.kernel.org/r/cover.1729014377.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/d3691d58bb58712b6fb3df2be441d175bd3cdf07.17290143…
Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Linus Torvalds <torvalds(a)linuxfoundation.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 28 ++++++++++++++++++++++++++++
include/linux/userfaultfd_k.h | 5 +++++
kernel/fork.c | 5 ++++-
3 files changed, 37 insertions(+), 1 deletion(-)
--- a/fs/userfaultfd.c~fork-do-not-invoke-uffd-on-fork-if-error-occurs
+++ a/fs/userfaultfd.c
@@ -692,6 +692,34 @@ void dup_userfaultfd_complete(struct lis
}
}
+void dup_userfaultfd_fail(struct list_head *fcs)
+{
+ struct userfaultfd_fork_ctx *fctx, *n;
+
+ /*
+ * An error has occurred on fork, we will tear memory down, but have
+ * allocated memory for fctx's and raised reference counts for both the
+ * original and child contexts (and on the mm for each as a result).
+ *
+ * These would ordinarily be taken care of by a user handling the event,
+ * but we are no longer doing so, so manually clean up here.
+ *
+ * mm tear down will take care of cleaning up VMA contexts.
+ */
+ list_for_each_entry_safe(fctx, n, fcs, list) {
+ struct userfaultfd_ctx *octx = fctx->orig;
+ struct userfaultfd_ctx *ctx = fctx->new;
+
+ atomic_dec(&octx->mmap_changing);
+ VM_BUG_ON(atomic_read(&octx->mmap_changing) < 0);
+ userfaultfd_ctx_put(octx);
+ userfaultfd_ctx_put(ctx);
+
+ list_del(&fctx->list);
+ kfree(fctx);
+ }
+}
+
void mremap_userfaultfd_prep(struct vm_area_struct *vma,
struct vm_userfaultfd_ctx *vm_ctx)
{
--- a/include/linux/userfaultfd_k.h~fork-do-not-invoke-uffd-on-fork-if-error-occurs
+++ a/include/linux/userfaultfd_k.h
@@ -249,6 +249,7 @@ static inline bool vma_can_userfault(str
extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *);
extern void dup_userfaultfd_complete(struct list_head *);
+void dup_userfaultfd_fail(struct list_head *);
extern void mremap_userfaultfd_prep(struct vm_area_struct *,
struct vm_userfaultfd_ctx *);
@@ -351,6 +352,10 @@ static inline void dup_userfaultfd_compl
{
}
+static inline void dup_userfaultfd_fail(struct list_head *l)
+{
+}
+
static inline void mremap_userfaultfd_prep(struct vm_area_struct *vma,
struct vm_userfaultfd_ctx *ctx)
{
--- a/kernel/fork.c~fork-do-not-invoke-uffd-on-fork-if-error-occurs
+++ a/kernel/fork.c
@@ -775,7 +775,10 @@ out:
mmap_write_unlock(mm);
flush_tlb_mm(oldmm);
mmap_write_unlock(oldmm);
- dup_userfaultfd_complete(&uf);
+ if (!retval)
+ dup_userfaultfd_complete(&uf);
+ else
+ dup_userfaultfd_fail(&uf);
fail_uprobe_end:
uprobe_end_dup_mmap();
return retval;
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems-fix.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
mm-pagewalk-add-the-ability-to-install-ptes.patch
mm-add-pte_marker_guard-pte-marker.patch
mm-madvise-implement-lightweight-guard-page-mechanism.patch
tools-testing-update-tools-uapi-header-for-mman-commonh.patch
selftests-mm-add-self-tests-for-guard-page-feature.patch
From: Gustavo Sousa <gustavo.sousa(a)intel.com>
commit 9fc97277eb2d17492de636b68cf7d2f5c4f15c1b upstream.
Starting with Xe_LPD+, although FIA is still used to readout Type-C pin
assignment, part of Type-C support is moved to PICA and programming
PORT_TX_DFLEXDPMLE1(*) registers is not applicable anymore like it was
for previous display IPs (e.g. see BSpec 49190).
v2:
- Mention Bspec 49190 as a reference of instructions for previous
IPs. (Shekhar Chauhan)
- s/Xe_LPDP/Xe_LPD+/ in the commit message. (Matt Roper)
- Update commit message to be more accurate to the changes in the IP.
(Imre Deak)
Bspec: 65750, 65448
Reviewed-by: Shekhar Chauhan <shekhar.chauhan(a)intel.com>
Reviewed-by: Imre Deak <imre.deak(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625202652.315936-1-gusta…
Signed-off-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index 9887967b2ca5c..6f2ee7dbc43b3 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -393,6 +393,9 @@ void intel_tc_port_set_fia_lane_count(struct intel_digital_port *dig_port,
bool lane_reversal = dig_port->saved_port_bits & DDI_BUF_PORT_REVERSAL;
u32 val;
+ if (DISPLAY_VER(i915) >= 14)
+ return;
+
drm_WARN_ON(&i915->drm,
lane_reversal && tc->mode != TC_PORT_LEGACY);
--
2.47.0
The patch titled
Subject: lib: string_helpers: fix potential snprintf() output truncation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
lib-string_helpers-fix-potential-snprintf-output-truncation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Subject: lib: string_helpers: fix potential snprintf() output truncation
Date: Mon, 21 Oct 2024 11:14:17 +0200
The output of ".%03u" with the unsigned int in range [0, 4294966295] may
get truncated if the target buffer is not 12 bytes.
Link: https://lkml.kernel.org/r/20241021091417.37796-1-brgl@bgdev.pl
Fixes: 3c9f3681d0b4 ("[SCSI] lib: add generic helper to print sizes rounded to the correct SI range")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Cc: James E.J. Bottomley <James.Bottomley(a)HansenPartnership.com>
Cc: Kees Cook <kees(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/string_helpers.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/lib/string_helpers.c~lib-string_helpers-fix-potential-snprintf-output-truncation
+++ a/lib/string_helpers.c
@@ -57,7 +57,7 @@ int string_get_size(u64 size, u64 blk_si
static const unsigned int rounding[] = { 500, 50, 5 };
int i = 0, j;
u32 remainder = 0, sf_cap;
- char tmp[8];
+ char tmp[12];
const char *unit;
tmp[0] = '\0';
_
Patches currently in -mm which might be from bartosz.golaszewski(a)linaro.org are
lib-string_helpers-fix-potential-snprintf-output-truncation.patch
The patch titled
Subject: mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-multi-gen-lru-remove-mm_leaf_old-and-mm_nonleaf_total-stats.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yu Zhao <yuzhao(a)google.com>
Subject: mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats
Date: Sat, 19 Oct 2024 01:29:38 +0000
Patch series "mm: multi-gen LRU: Have secondary MMUs participate in
MM_WALK".
Today, the MM_WALK capability causes MGLRU to clear the young bit from
PMDs and PTEs during the page table walk before eviction, but MGLRU does
not call the clear_young() MMU notifier in this case. By not calling this
notifier, the MM walk takes less time/CPU, but it causes pages that are
accessed mostly through KVM / secondary MMUs to appear younger than they
should be.
We do call the clear_young() notifier today, but only when attempting to
evict the page, so we end up clearing young/accessed information less
frequently for secondary MMUs than for mm PTEs, and therefore they appear
younger and are less likely to be evicted. Therefore, memory that is
*not* being accessed mostly by KVM will be evicted *more* frequently,
worsening performance.
ChromeOS observed a tab-open latency regression when enabling MGLRU with a
setup that involved running a VM:
Tab-open latency histogram (ms)
Version p50 mean p95 p99 max
base 1315 1198 2347 3454 10319
mglru 2559 1311 7399 12060 43758
fix 1119 926 2470 4211 6947
This series replaces the final non-selftest patchs from this series[1],
which introduced a similar change (and a new MMU notifier) with KVM
optimizations. I'll send a separate series (to Sean and Paolo) for the
KVM optimizations.
This series also makes proactive reclaim with MGLRU possible for KVM
memory. I have verified that this functions correctly with the selftest
from [1], but given that that test is a KVM selftest, I'll send it with
the rest of the KVM optimizations later. Andrew, let me know if you'd
like to take the test now anyway.
[1]: https://lore.kernel.org/linux-mm/20240926013506.860253-18-jthoughton@google…
This patch (of 2):
The removed stats, MM_LEAF_OLD and MM_NONLEAF_TOTAL, are not very helpful
and become more complicated to properly compute when adding
test/clear_young() notifiers in MGLRU's mm walk.
Link: https://lkml.kernel.org/r/20241019012940.3656292-1-jthoughton@google.com
Link: https://lkml.kernel.org/r/20241019012940.3656292-2-jthoughton@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: James Houghton <jthoughton(a)google.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: David Matlack <dmatlack(a)google.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: David Stevens <stevensd(a)google.com>
Cc: Oliver Upton <oliver.upton(a)linux.dev>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Wei Xu <weixugc(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mmzone.h | 2 --
mm/vmscan.c | 14 +++++---------
2 files changed, 5 insertions(+), 11 deletions(-)
--- a/include/linux/mmzone.h~mm-multi-gen-lru-remove-mm_leaf_old-and-mm_nonleaf_total-stats
+++ a/include/linux/mmzone.h
@@ -458,9 +458,7 @@ struct lru_gen_folio {
enum {
MM_LEAF_TOTAL, /* total leaf entries */
- MM_LEAF_OLD, /* old leaf entries */
MM_LEAF_YOUNG, /* young leaf entries */
- MM_NONLEAF_TOTAL, /* total non-leaf entries */
MM_NONLEAF_FOUND, /* non-leaf entries found in Bloom filters */
MM_NONLEAF_ADDED, /* non-leaf entries added to Bloom filters */
NR_MM_STATS
--- a/mm/vmscan.c~mm-multi-gen-lru-remove-mm_leaf_old-and-mm_nonleaf_total-stats
+++ a/mm/vmscan.c
@@ -3399,7 +3399,6 @@ restart:
continue;
if (!pte_young(ptent)) {
- walk->mm_stats[MM_LEAF_OLD]++;
continue;
}
@@ -3552,7 +3551,6 @@ restart:
walk->mm_stats[MM_LEAF_TOTAL]++;
if (!pmd_young(val)) {
- walk->mm_stats[MM_LEAF_OLD]++;
continue;
}
@@ -3564,8 +3562,6 @@ restart:
continue;
}
- walk->mm_stats[MM_NONLEAF_TOTAL]++;
-
if (!walk->force_scan && should_clear_pmd_young()) {
if (!pmd_young(val))
continue;
@@ -5254,11 +5250,11 @@ static void lru_gen_seq_show_full(struct
for (tier = 0; tier < MAX_NR_TIERS; tier++) {
seq_printf(m, " %10d", tier);
for (type = 0; type < ANON_AND_FILE; type++) {
- const char *s = " ";
+ const char *s = "xxx";
unsigned long n[3] = {};
if (seq == max_seq) {
- s = "RT ";
+ s = "RTx";
n[0] = READ_ONCE(lrugen->avg_refaulted[type][tier]);
n[1] = READ_ONCE(lrugen->avg_total[type][tier]);
} else if (seq == min_seq[type] || NR_HIST_GENS > 1) {
@@ -5280,14 +5276,14 @@ static void lru_gen_seq_show_full(struct
seq_puts(m, " ");
for (i = 0; i < NR_MM_STATS; i++) {
- const char *s = " ";
+ const char *s = "xxxx";
unsigned long n = 0;
if (seq == max_seq && NR_HIST_GENS == 1) {
- s = "LOYNFA";
+ s = "TYFA";
n = READ_ONCE(mm_state->stats[hist][i]);
} else if (seq != max_seq && NR_HIST_GENS > 1) {
- s = "loynfa";
+ s = "tyfa";
n = READ_ONCE(mm_state->stats[hist][i]);
}
_
Patches currently in -mm which might be from yuzhao(a)google.com are
mm-allow-set-clear-page_type-again.patch
mm-multi-gen-lru-remove-mm_leaf_old-and-mm_nonleaf_total-stats.patch
mm-multi-gen-lru-use-pteppmdp_clear_young_notify.patch
The patch titled
Subject: mm: allow set/clear page_type again
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-allow-set-clear-page_type-again.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yu Zhao <yuzhao(a)google.com>
Subject: mm: allow set/clear page_type again
Date: Sat, 19 Oct 2024 22:22:12 -0600
Some page flags (page->flags) were converted to page types
(page->page_types). A recent example is PG_hugetlb.
From the exclusive writer's perspective, e.g., a thread doing
__folio_set_hugetlb(), there is a difference between the page flag and
type APIs: the former allows the same non-atomic operation to be repeated
whereas the latter does not. For example, calling __folio_set_hugetlb()
twice triggers VM_BUG_ON_FOLIO(), since the second call expects the type
(PG_hugetlb) not to be set previously.
Using add_hugetlb_folio() as an example, it calls __folio_set_hugetlb() in
the following error-handling path. And when that happens, it triggers the
aforementioned VM_BUG_ON_FOLIO().
if (folio_test_hugetlb(folio)) {
rc = hugetlb_vmemmap_restore_folio(h, folio);
if (rc) {
spin_lock_irq(&hugetlb_lock);
add_hugetlb_folio(h, folio, false);
...
It is possible to make hugeTLB comply with the new requirements from the
page type API. However, a straightforward fix would be to just allow the
same page type to be set or cleared again inside the API, to avoid any
changes to its callers.
Link: https://lkml.kernel.org/r/20241020042212.296781-1-yuzhao@google.com
Fixes: d99e3140a4d3 ("mm: turn folio_test_hugetlb into a PageType")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/page-flags.h | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/include/linux/page-flags.h~mm-allow-set-clear-page_type-again
+++ a/include/linux/page-flags.h
@@ -975,12 +975,16 @@ static __always_inline bool folio_test_#
} \
static __always_inline void __folio_set_##fname(struct folio *folio) \
{ \
+ if (folio_test_##fname(folio)) \
+ return; \
VM_BUG_ON_FOLIO(data_race(folio->page.page_type) != UINT_MAX, \
folio); \
folio->page.page_type = (unsigned int)PGTY_##lname << 24; \
} \
static __always_inline void __folio_clear_##fname(struct folio *folio) \
{ \
+ if (folio->page.page_type == UINT_MAX) \
+ return; \
VM_BUG_ON_FOLIO(!folio_test_##fname(folio), folio); \
folio->page.page_type = UINT_MAX; \
}
@@ -993,11 +997,15 @@ static __always_inline int Page##uname(c
} \
static __always_inline void __SetPage##uname(struct page *page) \
{ \
+ if (Page##uname(page)) \
+ return; \
VM_BUG_ON_PAGE(data_race(page->page_type) != UINT_MAX, page); \
page->page_type = (unsigned int)PGTY_##lname << 24; \
} \
static __always_inline void __ClearPage##uname(struct page *page) \
{ \
+ if (page->page_type == UINT_MAX) \
+ return; \
VM_BUG_ON_PAGE(!Page##uname(page), page); \
page->page_type = UINT_MAX; \
}
_
Patches currently in -mm which might be from yuzhao(a)google.com are
mm-allow-set-clear-page_type-again.patch
The patch titled
Subject: nilfs2: fix potential deadlock with newly created symlinks
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-potential-deadlock-with-newly-created-symlinks.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix potential deadlock with newly created symlinks
Date: Sun, 20 Oct 2024 13:51:28 +0900
Syzbot reported that page_symlink(), called by nilfs_symlink(), triggers
memory reclamation involving the filesystem layer, which can result in
circular lock dependencies among the reader/writer semaphore
nilfs->ns_segctor_sem, s_writers percpu_rwsem (intwrite) and the
fs_reclaim pseudo lock.
This is because after commit 21fc61c73c39 ("don't put symlink bodies in
pagecache into highmem"), the gfp flags of the page cache for symbolic
links are overwritten to GFP_KERNEL via inode_nohighmem().
This is not a problem for symlinks read from the backing device, because
the __GFP_FS flag is dropped after inode_nohighmem() is called. However,
when a new symlink is created with nilfs_symlink(), the gfp flags remain
overwritten to GFP_KERNEL. Then, memory allocation called from
page_symlink() etc. triggers memory reclamation including the FS layer,
which may call nilfs_evict_inode() or nilfs_dirty_inode(). And these can
cause a deadlock if they are called while nilfs->ns_segctor_sem is held:
Fix this issue by dropping the __GFP_FS flag from the page cache GFP flags
of newly created symlinks in the same way that nilfs_new_inode() and
__nilfs_read_inode() do, as a workaround until we adopt nofs allocation
scope consistently or improve the locking constraints.
Link: https://lkml.kernel.org/r/20241020050003.4308-1-konishi.ryusuke@gmail.com
Fixes: 21fc61c73c39 ("don't put symlink bodies in pagecache into highmem")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+9ef37ac20608f4836256(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9ef37ac20608f4836256
Tested-by: syzbot+9ef37ac20608f4836256(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/namei.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/nilfs2/namei.c~nilfs2-fix-potential-deadlock-with-newly-created-symlinks
+++ a/fs/nilfs2/namei.c
@@ -157,6 +157,9 @@ static int nilfs_symlink(struct mnt_idma
/* slow symlink */
inode->i_op = &nilfs_symlink_inode_operations;
inode_nohighmem(inode);
+ mapping_set_gfp_mask(inode->i_mapping,
+ mapping_gfp_constraint(inode->i_mapping,
+ ~__GFP_FS));
inode->i_mapping->a_ops = &nilfs_aops;
err = page_symlink(inode, symname, l);
if (err)
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-checked-flag.patch
nilfs2-fix-potential-deadlock-with-newly-created-symlinks.patch
The patch titled
Subject: mm/damon/vaddr: fix issue in damon_va_evenly_split_region()
has been added to the -mm mm-unstable branch. Its filename is
mm-damon-vaddr-fix-issue-in-damon_va_evenly_split_region.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Zheng Yejian <zhengyejian(a)huaweicloud.com>
Subject: mm/damon/vaddr: fix issue in damon_va_evenly_split_region()
Date: Tue, 22 Oct 2024 16:39:26 +0800
Patch series "mm/damon/vaddr: Fix issue in
damon_va_evenly_split_region()". v2.
According to the logic of damon_va_evenly_split_region(), currently
following split case would not meet the expectation:
Suppose DAMON_MIN_REGION=0x1000,
Case: Split [0x0, 0x3000) into 2 pieces, then the result would be
acutually 3 regions:
[0x0, 0x1000), [0x1000, 0x2000), [0x2000, 0x3000)
but NOT the expected 2 regions:
[0x0, 0x1000), [0x1000, 0x3000) !!!
The root cause is that when calculating size of each split piece in
damon_va_evenly_split_region():
`sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, DAMON_MIN_REGION);`
both the dividing and the ALIGN_DOWN may cause loss of precision, then
each time split one piece of size 'sz_piece' from origin 'start' to 'end'
would cause more pieces are split out than expected!!!
To fix it, count for each piece split and make sure no more than
'nr_pieces'. In addition, add above case into damon_test_split_evenly().
And add 'nr_piece == 1' check in damon_va_evenly_split_region() for better
code readability and add a corresponding kunit testcase.
This patch (of 2):
According to the logic of damon_va_evenly_split_region(), currently
following split case would not meet the expectation:
Suppose DAMON_MIN_REGION=0x1000,
Case: Split [0x0, 0x3000) into 2 pieces, then the result would be
acutually 3 regions:
[0x0, 0x1000), [0x1000, 0x2000), [0x2000, 0x3000)
but NOT the expected 2 regions:
[0x0, 0x1000), [0x1000, 0x3000) !!!
The root cause is that when calculating size of each split piece in
damon_va_evenly_split_region():
`sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, DAMON_MIN_REGION);`
both the dividing and the ALIGN_DOWN may cause loss of precision,
then each time split one piece of size 'sz_piece' from origin 'start' to
'end' would cause more pieces are split out than expected!!!
To fix it, count for each piece split and make sure no more than
'nr_pieces'. In addition, add above case into damon_test_split_evenly().
After this patch, damon-operations test passed:
# ./tools/testing/kunit/kunit.py run damon-operations
[...]
============== damon-operations (6 subtests) ===============
[PASSED] damon_test_three_regions_in_vmas
[PASSED] damon_test_apply_three_regions1
[PASSED] damon_test_apply_three_regions2
[PASSED] damon_test_apply_three_regions3
[PASSED] damon_test_apply_three_regions4
[PASSED] damon_test_split_evenly
================ [PASSED] damon-operations =================
Link: https://lkml.kernel.org/r/20241022083927.3592237-1-zhengyejian@huaweicloud.…
Link: https://lkml.kernel.org/r/20241022083927.3592237-2-zhengyejian@huaweicloud.…
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: Zheng Yejian <zhengyejian(a)huaweicloud.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Fernand Sieber <sieberf(a)amazon.com>
Cc: Leonard Foerster <foersleo(a)amazon.de>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Ye Weihua <yeweihua4(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/tests/vaddr-kunit.h | 1 +
mm/damon/vaddr.c | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/mm/damon/tests/vaddr-kunit.h~mm-damon-vaddr-fix-issue-in-damon_va_evenly_split_region
+++ a/mm/damon/tests/vaddr-kunit.h
@@ -300,6 +300,7 @@ static void damon_test_split_evenly(stru
damon_test_split_evenly_fail(test, 0, 100, 0);
damon_test_split_evenly_succ(test, 0, 100, 10);
damon_test_split_evenly_succ(test, 5, 59, 5);
+ damon_test_split_evenly_succ(test, 0, 3, 2);
damon_test_split_evenly_fail(test, 5, 6, 2);
}
--- a/mm/damon/vaddr.c~mm-damon-vaddr-fix-issue-in-damon_va_evenly_split_region
+++ a/mm/damon/vaddr.c
@@ -67,6 +67,7 @@ static int damon_va_evenly_split_region(
unsigned long sz_orig, sz_piece, orig_end;
struct damon_region *n = NULL, *next;
unsigned long start;
+ unsigned int i;
if (!r || !nr_pieces)
return -EINVAL;
@@ -80,8 +81,7 @@ static int damon_va_evenly_split_region(
r->ar.end = r->ar.start + sz_piece;
next = damon_next_region(r);
- for (start = r->ar.end; start + sz_piece <= orig_end;
- start += sz_piece) {
+ for (start = r->ar.end, i = 1; i < nr_pieces; start += sz_piece, i++) {
n = damon_new_region(start, start + sz_piece);
if (!n)
return -ENOMEM;
_
Patches currently in -mm which might be from zhengyejian(a)huaweicloud.com are
mm-damon-vaddr-fix-issue-in-damon_va_evenly_split_region.patch
mm-damon-vaddr-add-nr_piece-==-1-check-in-damon_va_evenly_split_region.patch
From: Peter Wang <peter.wang(a)mediatek.com>
When ufshcd_rtc_work calls ufshcd_rpm_put_sync and the pm's
usage_count is 0, it will enter the runtime suspend callback.
However, the runtime suspend callback will wait to flush
ufshcd_rtc_work, causing a deadlock.
Replacing ufshcd_rpm_put_sync with ufshcd_rpm_put can avoid
the deadlock.
Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support")
Cc: <stable(a)vger.kernel.org> #6.11.x
Signed-off-by: Peter Wang <peter.wang(a)mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
---
drivers/ufs/core/ufshcd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index a63dcf48e59d..f5846598d80e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8219,7 +8219,7 @@ static void ufshcd_update_rtc(struct ufs_hba *hba)
err = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_WRITE_ATTR, QUERY_ATTR_IDN_SECONDS_PASSED,
0, 0, &val);
- ufshcd_rpm_put_sync(hba);
+ ufshcd_rpm_put(hba);
if (err)
dev_err(hba->dev, "%s: Failed to update rtc %d\n", __func__, err);
--
2.18.0
From: Peter Wang <peter.wang(a)mediatek.com>
When ufshcd_rtc_work calls ufshcd_rpm_put_sync and the pm's
usage_count is 0, it will enter the runtime suspend callback.
However, the runtime suspend callback will wait to flush
ufshcd_rtc_work, causing a deadlock.
Replacing ufshcd_rpm_put_sync with ufshcd_rpm_put can avoid
the deadlock.
Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support")
Cc: <stable(a)vger.kernel.org> 6.11.x
Signed-off-by: Peter Wang <peter.wang(a)mediatek.com>
---
drivers/ufs/core/ufshcd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index a63dcf48e59d..f5846598d80e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8219,7 +8219,7 @@ static void ufshcd_update_rtc(struct ufs_hba *hba)
err = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_WRITE_ATTR, QUERY_ATTR_IDN_SECONDS_PASSED,
0, 0, &val);
- ufshcd_rpm_put_sync(hba);
+ ufshcd_rpm_put(hba);
if (err)
dev_err(hba->dev, "%s: Failed to update rtc %d\n", __func__, err);
--
2.18.0
The patch titled
Subject: kasan: remove vmalloc_percpu test
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kasan-remove-vmalloc_percpu-test.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrey Konovalov <andreyknvl(a)gmail.com>
Subject: kasan: remove vmalloc_percpu test
Date: Tue, 22 Oct 2024 18:07:06 +0200
Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the
vmalloc_percpu KASAN test with the assumption that __alloc_percpu always
uses vmalloc internally, which is tagged by KASAN.
However, __alloc_percpu might allocate memory from the first per-CPU
chunk, which is not allocated via vmalloc(). As a result, the test might
fail.
Remove the test until proper KASAN annotation for the per-CPU allocated
are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019.
Link: https://lkml.kernel.org/r/20241022160706.38943-1-andrey.konovalov@linux.dev
Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests")
Signed-off-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Reported-by: Samuel Holland <samuel.holland(a)sifive.com>
Link: https://lore.kernel.org/all/4a245fff-cc46-44d1-a5f9-fd2f1c3764ae@sifive.com/
Reported-by: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW7…
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan_test_c.c | 27 ---------------------------
1 file changed, 27 deletions(-)
--- a/mm/kasan/kasan_test_c.c~kasan-remove-vmalloc_percpu-test
+++ a/mm/kasan/kasan_test_c.c
@@ -1810,32 +1810,6 @@ static void vm_map_ram_tags(struct kunit
free_pages((unsigned long)p_ptr, 1);
}
-static void vmalloc_percpu(struct kunit *test)
-{
- char __percpu *ptr;
- int cpu;
-
- /*
- * This test is specifically crafted for the software tag-based mode,
- * the only tag-based mode that poisons percpu mappings.
- */
- KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_SW_TAGS);
-
- ptr = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
-
- for_each_possible_cpu(cpu) {
- char *c_ptr = per_cpu_ptr(ptr, cpu);
-
- KUNIT_EXPECT_GE(test, (u8)get_tag(c_ptr), (u8)KASAN_TAG_MIN);
- KUNIT_EXPECT_LT(test, (u8)get_tag(c_ptr), (u8)KASAN_TAG_KERNEL);
-
- /* Make sure that in-bounds accesses don't crash the kernel. */
- *c_ptr = 0;
- }
-
- free_percpu(ptr);
-}
-
/*
* Check that the assigned pointer tag falls within the [KASAN_TAG_MIN,
* KASAN_TAG_KERNEL) range (note: excluding the match-all tag) for tag-based
@@ -2023,7 +1997,6 @@ static struct kunit_case kasan_kunit_tes
KUNIT_CASE(vmalloc_oob),
KUNIT_CASE(vmap_tags),
KUNIT_CASE(vm_map_ram_tags),
- KUNIT_CASE(vmalloc_percpu),
KUNIT_CASE(match_all_not_assigned),
KUNIT_CASE(match_all_ptr_tag),
KUNIT_CASE(match_all_mem_tag),
_
Patches currently in -mm which might be from andreyknvl(a)gmail.com are
kasan-remove-vmalloc_percpu-test.patch
linux-stable-mirror(a)lists.linaro.org
`A8Y7D �0C
gE5w0B�1A�C7LinkedInSD1�01~D9`A8v84�A2S55�F7l42
N70[B6YD3T0D
e36NF6NBAu35[50�AENF6
IPW30W40
e70�CF
NA7TC1cCF�F0:
N70[B6VFD[B6
Andre Torres
sales.oiuy(a)outlook.com
78.54.***.***
***pcs
General goods,mol...
Spain
zCBS73VDEY0Dm88`6F
� 2024 LinkedIn Corporation, 1000 West Maude Avenue, Sunnyvale, CA 94085.
LinkedIn and the LinkedIn logo are registered trademarks of LinkedIn.
The patch titled
Subject: tools/mm: -Werror fixes in page-types/slabinfo
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
tools-mm-werror-fixes-in-page-types-slabinfo.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Wladislav Wiebe <wladislav.kw(a)gmail.com>
Subject: tools/mm: -Werror fixes in page-types/slabinfo
Date: Tue, 22 Oct 2024 19:21:13 +0200
Commit e6d2c436ff693 ("tools/mm: allow users to provide additional
cflags/ldflags") passes now CFLAGS to Makefile. With this, build systems
with default -Werror enabled found:
slabinfo.c:1300:25: error: ignoring return value of 'chdir'
declared with attribute 'warn_unused_result' [-Werror=unused-result]
������������������������������������������������ chdir("..");
������������������������������������������������ ^~~~~~~~~~~
page-types.c:397:35: error: format '%lu' expects argument of type
'long unsigned int', but argument 2 has type 'uint64_t'
{aka 'long long unsigned int'} [-Werror=format=]
������������������������������������������������ printf("%lu\t", mapcnt0);
���������������������������������������������������������������� ~~^�������� ~~~~~~~
..
Fix page-types by using PRIu64 for uint64_t prints and check in slabinfo
for return code on chdir("..").
Link: https://lkml.kernel.org/r/c1ceb507-94bc-461c-934d-c19b77edd825@gmail.com
Fixes: e6d2c436ff693 ("tools/mm: allow users to provide additional cflags/ldflags")
Signed-off-by: Wladislav Wiebe <wladislav.kw(a)gmail.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Herton R. Krzesinski <herton(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/mm/page-types.c | 9 +++++----
tools/mm/slabinfo.c | 4 +++-
2 files changed, 8 insertions(+), 5 deletions(-)
--- a/tools/mm/page-types.c~tools-mm-werror-fixes-in-page-types-slabinfo
+++ a/tools/mm/page-types.c
@@ -22,6 +22,7 @@
#include <time.h>
#include <setjmp.h>
#include <signal.h>
+#include <inttypes.h>
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/fcntl.h>
@@ -391,9 +392,9 @@ static void show_page_range(unsigned lon
if (opt_file)
printf("%lx\t", voff);
if (opt_list_cgroup)
- printf("@%llu\t", (unsigned long long)cgroup0);
+ printf("@%" PRIu64 "\t", cgroup0);
if (opt_list_mapcnt)
- printf("%lu\t", mapcnt0);
+ printf("%" PRIu64 "\t", mapcnt0);
printf("%lx\t%lx\t%s\n",
index, count, page_flag_name(flags0));
}
@@ -419,9 +420,9 @@ static void show_page(unsigned long voff
if (opt_file)
printf("%lx\t", voffset);
if (opt_list_cgroup)
- printf("@%llu\t", (unsigned long long)cgroup);
+ printf("@%" PRIu64 "\t", cgroup)
if (opt_list_mapcnt)
- printf("%lu\t", mapcnt);
+ printf("%" PRIu64 "\t", mapcnt);
printf("%lx\t%s\n", offset, page_flag_name(flags));
}
--- a/tools/mm/slabinfo.c~tools-mm-werror-fixes-in-page-types-slabinfo
+++ a/tools/mm/slabinfo.c
@@ -1297,7 +1297,9 @@ static void read_slab_dir(void)
slab->cpu_partial_free = get_obj("cpu_partial_free");
slab->alloc_node_mismatch = get_obj("alloc_node_mismatch");
slab->deactivate_bypass = get_obj("deactivate_bypass");
- chdir("..");
+ if (chdir(".."))
+ fatal("Unable to chdir from slab ../%s\n",
+ slab->name);
if (slab->name[0] == ':')
alias_targets++;
slab++;
_
Patches currently in -mm which might be from wladislav.kw(a)gmail.com are
tools-mm-werror-fixes-in-page-types-slabinfo.patch
The patch titled
Subject: ipc: fix memleak if msg_init_ns failed in create_ipc_ns
has been added to the -mm mm-nonmm-unstable branch. Its filename is
ipc-fix-memleak-if-msg_init_ns-failed-in-create_ipc_ns.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Wupeng Ma <mawupeng1(a)huawei.com>
Subject: ipc: fix memleak if msg_init_ns failed in create_ipc_ns
Date: Wed, 23 Oct 2024 17:31:29 +0800
Percpu memory allocation may failed during create_ipc_ns however this
fail is not handled properly since ipc sysctls and mq sysctls is not
released properly. Fix this by release these two resource when failure.
Here is the kmemleak stack when percpu failed:
unreferenced object 0xffff88819de2a600 (size 512):
comm "shmem_2nstest", pid 120711, jiffies 4300542254
hex dump (first 32 bytes):
60 aa 9d 84 ff ff ff ff fc 18 48 b2 84 88 ff ff `.........H.....
04 00 00 00 a4 01 00 00 20 e4 56 81 ff ff ff ff ........ .V.....
backtrace (crc be7cba35):
[<ffffffff81b43f83>] __kmalloc_node_track_caller_noprof+0x333/0x420
[<ffffffff81a52e56>] kmemdup_noprof+0x26/0x50
[<ffffffff821b2f37>] setup_mq_sysctls+0x57/0x1d0
[<ffffffff821b29cc>] copy_ipcs+0x29c/0x3b0
[<ffffffff815d6a10>] create_new_namespaces+0x1d0/0x920
[<ffffffff815d7449>] copy_namespaces+0x2e9/0x3e0
[<ffffffff815458f3>] copy_process+0x29f3/0x7ff0
[<ffffffff8154b080>] kernel_clone+0xc0/0x650
[<ffffffff8154b6b1>] __do_sys_clone+0xa1/0xe0
[<ffffffff843df8ff>] do_syscall_64+0xbf/0x1c0
[<ffffffff846000b0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
Link: https://lkml.kernel.org/r/20241023093129.3074301-1-mawupeng1@huawei.com
Fixes: 72d1e611082e ("ipc/msg: mitigate the lock contention with percpu counter")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
ipc/namespace.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/ipc/namespace.c~ipc-fix-memleak-if-msg_init_ns-failed-in-create_ipc_ns
+++ a/ipc/namespace.c
@@ -83,13 +83,15 @@ static struct ipc_namespace *create_ipc_
err = msg_init_ns(ns);
if (err)
- goto fail_put;
+ goto fail_ipc;
sem_init_ns(ns);
shm_init_ns(ns);
return ns;
+fail_ipc:
+ retire_ipc_sysctls(ns);
fail_mq:
retire_mq_sysctls(ns);
_
Patches currently in -mm which might be from mawupeng1(a)huawei.com are
ipc-fix-memleak-if-msg_init_ns-failed-in-create_ipc_ns.patch
Hi,
I have tested 6.11.4 with these additional fixes for the xe and i915
drivers:
f0ffa657e9f3913c7921cbd4d876343401f15f52
4551d60299b5ddc2655b6b365a4b92634e14e04f
ecabb5e6ce54711c28706fc794d77adb3ecd0605
2009e808bc3e0df6d4d83e2271bc25ae63a4ac05
e4ac526c440af8aa94d2bdfe6066339dd93b4db2
ab0d6ef864c5fa820e894ee1a07f861e63851664
7929ffce0f8b9c76cb5c2a67d1966beaed20ab61
da9a73b7b25eab574cb9c984fcce0b5e240bdd2c
014125c64d09e58e90dde49fbb57d802a13e2559
4cce34b3835b6f7dc52ee2da95c96b6364bb72e5
a8efd8ce280996fe29f2564f705e96e18da3fa62
f15e5587448989a55cf8b4feaad0df72ca3aa6a0
a9556637a23311dea96f27fa3c3e5bfba0b38ae4
c7085d08c7e53d9aef0cdd4b20798356f6f5d469
eb53e5b933b9ff315087305b3dc931af3067d19c
3e307d6c28e7bc7d94b5699d0ed7fe07df6db094
d34f4f058edf1235c103ca9c921dc54820d14d40
31b42af516afa1e184d1a9f9dd4096c54044269a
7fbad577c82c5dd6db7217855c26f51554e53d85
b2013783c4458a1fe8b25c0b249d2e878bcf6999
c55f79f317ab428ae6d005965bc07e37496f209f
9fc97277eb2d17492de636b68cf7d2f5c4f15c1b
I have them applied locally and could submit that if preferred, but
there were no conflicts (since it also brings some additional patches as
required for fixes to apply), so it should be trivial.
All of these patches are already in upstream. Some of them are brought
as dependency. The ones mentioning "performance changes" are knobs to
follow the hw spec and could be considered as fixes too. These patches
are also enabled downstream in Ubuntu 24.10 in order to allow the new
Lunar Lake and Battlemage to work correctly. They have more patches not
included here, but I may follow up with more depending on the acceptance
of these patches.
thanks
Lucas De Marchi
The patch titled
Subject: mm: resolve faulty mmap_region() error path behaviour
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm: resolve faulty mmap_region() error path behaviour
Date: Wed, 23 Oct 2024 21:38:29 +0100
The mmap_region() function is somewhat terrifying, with spaghetti-like
control flow and numerous means by which issues can arise and incomplete
state, memory leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently
observed issues with resource leaks and observable inconsistent state.
Taking advantage of previous patches in this series we move a number of
checks earlier in the code, simplifying things by moving the core of the
logic into a static internal function __mmap_region().
Doing this allows us to perform a number of checks up front before we do
any real work, and allows us to unwind the writable unmap check
unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE
validation unconditionally also.
We move a number of things here:
1. We preallocate memory for the iterator before we call the file-backed
memory hook, allowing us to exit early and avoid having to perform
complicated and error-prone close/free logic. We carefully free
iterator state on both success and error paths.
2. The enclosing mmap_region() function handles the mapping_map_writable()
logic early. Previously the logic had the mapping_map_writable() at the
point of mapping a newly allocated file-backed VMA, and a matching
mapping_unmap_writable() on success and error paths.
We now do this unconditionally if this is a file-backed, shared writable
mapping. If a driver changes the flags to eliminate VM_MAYWRITE, however
doing so does not invalidate the seal check we just performed, and we in
any case always decrement the counter in the wrapper.
We perform a debug assert to ensure a driver does not attempt to do the
opposite.
3. We also move arch_validate_flags() up into the mmap_region()
function. This is only relevant on arm64 and sparc64, and the check is
only meaningful for SPARC with ADI enabled. We explicitly add a warning
for this arch if a driver invalidates this check, though the code ought
eventually to be fixed to eliminate the need for this.
With all of these measures in place, we no longer need to explicitly close
the VMA on error paths, as we place all checks which might fail prior to a
call to any driver mmap hook.
This eliminates an entire class of errors, makes the code easier to reason
about and more robust.
Link: https://lkml.kernel.org/r/6e8deda970b982e1e8ffd876e3cef342c292fbb5.17297152…
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 121 ++++++++++++++++++++++++++++------------------------
1 file changed, 66 insertions(+), 55 deletions(-)
--- a/mm/mmap.c~mm-resolve-faulty-mmap_region-error-path-behaviour
+++ a/mm/mmap.c
@@ -1357,20 +1357,18 @@ int do_munmap(struct mm_struct *mm, unsi
return do_vmi_munmap(&vmi, mm, start, len, uf, false);
}
-unsigned long mmap_region(struct file *file, unsigned long addr,
+static unsigned long __mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
struct list_head *uf)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma = NULL;
pgoff_t pglen = PHYS_PFN(len);
- struct vm_area_struct *merge;
unsigned long charged = 0;
struct vma_munmap_struct vms;
struct ma_state mas_detach;
struct maple_tree mt_detach;
unsigned long end = addr + len;
- bool writable_file_mapping = false;
int error;
VMA_ITERATOR(vmi, mm, addr);
VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff);
@@ -1444,28 +1442,26 @@ unsigned long mmap_region(struct file *f
vm_flags_init(vma, vm_flags);
vma->vm_page_prot = vm_get_page_prot(vm_flags);
+ if (vma_iter_prealloc(&vmi, vma)) {
+ error = -ENOMEM;
+ goto free_vma;
+ }
+
if (file) {
vma->vm_file = get_file(file);
error = mmap_file(file, vma);
if (error)
- goto unmap_and_free_vma;
-
- if (vma_is_shared_maywrite(vma)) {
- error = mapping_map_writable(file->f_mapping);
- if (error)
- goto close_and_free_vma;
-
- writable_file_mapping = true;
- }
+ goto unmap_and_free_file_vma;
+ /* Drivers cannot alter the address of the VMA. */
+ WARN_ON_ONCE(addr != vma->vm_start);
/*
- * Expansion is handled above, merging is handled below.
- * Drivers should not alter the address of the VMA.
+ * Drivers should not permit writability when previously it was
+ * disallowed.
*/
- if (WARN_ON((addr != vma->vm_start))) {
- error = -EINVAL;
- goto close_and_free_vma;
- }
+ VM_WARN_ON_ONCE(vm_flags != vma->vm_flags &&
+ !(vm_flags & VM_MAYWRITE) &&
+ (vma->vm_flags & VM_MAYWRITE));
vma_iter_config(&vmi, addr, end);
/*
@@ -1473,6 +1469,8 @@ unsigned long mmap_region(struct file *f
* vma again as we may succeed this time.
*/
if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
+ struct vm_area_struct *merge;
+
vmg.flags = vma->vm_flags;
/* If this fails, state is reset ready for a reattempt. */
merge = vma_merge_new_range(&vmg);
@@ -1490,7 +1488,7 @@ unsigned long mmap_region(struct file *f
vma = merge;
/* Update vm_flags to pick up the change. */
vm_flags = vma->vm_flags;
- goto unmap_writable;
+ goto file_expanded;
}
vma_iter_config(&vmi, addr, end);
}
@@ -1499,26 +1497,15 @@ unsigned long mmap_region(struct file *f
} else if (vm_flags & VM_SHARED) {
error = shmem_zero_setup(vma);
if (error)
- goto free_vma;
+ goto free_iter_vma;
} else {
vma_set_anonymous(vma);
}
- if (map_deny_write_exec(vma->vm_flags, vma->vm_flags)) {
- error = -EACCES;
- goto close_and_free_vma;
- }
-
- /* Allow architectures to sanity-check the vm_flags */
- if (!arch_validate_flags(vma->vm_flags)) {
- error = -EINVAL;
- goto close_and_free_vma;
- }
-
- if (vma_iter_prealloc(&vmi, vma)) {
- error = -ENOMEM;
- goto close_and_free_vma;
- }
+#ifdef CONFIG_SPARC64
+ /* TODO: Fix SPARC ADI! */
+ WARN_ON_ONCE(!arch_validate_flags(vm_flags));
+#endif
/* Lock the VMA since it is modified after insertion into VMA tree */
vma_start_write(vma);
@@ -1532,10 +1519,7 @@ unsigned long mmap_region(struct file *f
*/
khugepaged_enter_vma(vma, vma->vm_flags);
- /* Once vma denies write, undo our temporary denial count */
-unmap_writable:
- if (writable_file_mapping)
- mapping_unmap_writable(file->f_mapping);
+file_expanded:
file = vma->vm_file;
ksm_add_vma(vma);
expanded:
@@ -1568,23 +1552,17 @@ expanded:
vma_set_page_prot(vma);
- validate_mm(mm);
return addr;
-close_and_free_vma:
- vma_close(vma);
-
- if (file || vma->vm_file) {
-unmap_and_free_vma:
- fput(vma->vm_file);
- vma->vm_file = NULL;
-
- vma_iter_set(&vmi, vma->vm_end);
- /* Undo any partial mapping done by a device driver. */
- unmap_region(&vmi.mas, vma, vmg.prev, vmg.next);
- }
- if (writable_file_mapping)
- mapping_unmap_writable(file->f_mapping);
+unmap_and_free_file_vma:
+ fput(vma->vm_file);
+ vma->vm_file = NULL;
+
+ vma_iter_set(&vmi, vma->vm_end);
+ /* Undo any partial mapping done by a device driver. */
+ unmap_region(&vmi.mas, vma, vmg.prev, vmg.next);
+free_iter_vma:
+ vma_iter_free(&vmi);
free_vma:
vm_area_free(vma);
unacct_error:
@@ -1594,10 +1572,43 @@ unacct_error:
abort_munmap:
vms_abort_munmap_vmas(&vms, &mas_detach);
gather_failed:
- validate_mm(mm);
return error;
}
+unsigned long mmap_region(struct file *file, unsigned long addr,
+ unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
+ struct list_head *uf)
+{
+ unsigned long ret;
+ bool writable_file_mapping = false;
+
+ /* Check to see if MDWE is applicable. */
+ if (map_deny_write_exec(vm_flags, vm_flags))
+ return -EACCES;
+
+ /* Allow architectures to sanity-check the vm_flags. */
+ if (!arch_validate_flags(vm_flags))
+ return -EINVAL;
+
+ /* Map writable and ensure this isn't a sealed memfd. */
+ if (file && is_shared_maywrite(vm_flags)) {
+ int error = mapping_map_writable(file->f_mapping);
+
+ if (error)
+ return error;
+ writable_file_mapping = true;
+ }
+
+ ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf);
+
+ /* Clear our write mapping regardless of error. */
+ if (writable_file_mapping)
+ mapping_unmap_writable(file->f_mapping);
+
+ validate_mm(current->mm);
+ return ret;
+}
+
static int __vm_munmap(unsigned long start, size_t len, bool unlock)
{
int ret;
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
fork-do-not-invoke-uffd-on-fork-if-error-occurs.patch
fork-only-invoke-khugepaged-ksm-hooks-if-no-error.patch
mm-vma-add-expand-only-vma-merge-mode-and-optimise-do_brk_flags.patch
tools-testing-add-expand-only-mode-vma-test.patch
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
The patch titled
Subject: mm: refactor map_deny_write_exec()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-refactor-map_deny_write_exec.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm: refactor map_deny_write_exec()
Date: Wed, 23 Oct 2024 21:38:28 +0100
Refactor the map_deny_write_exec() to not unnecessarily require a VMA
parameter but rather to accept VMA flags parameters, which allows us to
use this function early in mmap_region() in a subsequent commit.
While we're here, we refactor the function to be more readable and add
some additional documentation.
Link: https://lkml.kernel.org/r/e82d23cebaa87fe5b767fecf9e730e0059066e63.17297152…
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Jann Horn <jannh(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mman.h | 21 ++++++++++++++++++---
mm/mmap.c | 2 +-
mm/mprotect.c | 2 +-
mm/vma.h | 2 +-
4 files changed, 21 insertions(+), 6 deletions(-)
--- a/include/linux/mman.h~mm-refactor-map_deny_write_exec
+++ a/include/linux/mman.h
@@ -188,16 +188,31 @@ static inline bool arch_memory_deny_writ
*
* d) mmap(PROT_READ | PROT_EXEC)
* mmap(PROT_READ | PROT_EXEC | PROT_BTI)
+ *
+ * This is only applicable if the user has set the Memory-Deny-Write-Execute
+ * (MDWE) protection mask for the current process.
+ *
+ * @old specifies the VMA flags the VMA originally possessed, and @new the ones
+ * we propose to set.
+ *
+ * Return: false if proposed change is OK, true if not ok and should be denied.
*/
-static inline bool map_deny_write_exec(struct vm_area_struct *vma, unsigned long vm_flags)
+static inline bool map_deny_write_exec(unsigned long old, unsigned long new)
{
+ /* If MDWE is disabled, we have nothing to deny. */
if (!test_bit(MMF_HAS_MDWE, ¤t->mm->flags))
return false;
- if ((vm_flags & VM_EXEC) && (vm_flags & VM_WRITE))
+ /* If the new VMA is not executable, we have nothing to deny. */
+ if (!(new & VM_EXEC))
+ return false;
+
+ /* Under MDWE we do not accept newly writably executable VMAs... */
+ if (new & VM_WRITE)
return true;
- if (!(vma->vm_flags & VM_EXEC) && (vm_flags & VM_EXEC))
+ /* ...nor previously non-executable VMAs becoming executable. */
+ if (!(old & VM_EXEC))
return true;
return false;
--- a/mm/mmap.c~mm-refactor-map_deny_write_exec
+++ a/mm/mmap.c
@@ -1504,7 +1504,7 @@ unsigned long mmap_region(struct file *f
vma_set_anonymous(vma);
}
- if (map_deny_write_exec(vma, vma->vm_flags)) {
+ if (map_deny_write_exec(vma->vm_flags, vma->vm_flags)) {
error = -EACCES;
goto close_and_free_vma;
}
--- a/mm/mprotect.c~mm-refactor-map_deny_write_exec
+++ a/mm/mprotect.c
@@ -810,7 +810,7 @@ static int do_mprotect_pkey(unsigned lon
break;
}
- if (map_deny_write_exec(vma, newflags)) {
+ if (map_deny_write_exec(vma->vm_flags, newflags)) {
error = -EACCES;
break;
}
--- a/mm/vma.h~mm-refactor-map_deny_write_exec
+++ a/mm/vma.h
@@ -42,7 +42,7 @@ struct vma_munmap_struct {
int vma_count; /* Number of vmas that will be removed */
bool unlock; /* Unlock after the munmap */
bool clear_ptes; /* If there are outstanding PTE to be cleared */
- /* 1 byte hole */
+ /* 2 byte hole */
unsigned long nr_pages; /* Number of pages being removed */
unsigned long locked_vm; /* Number of locked pages */
unsigned long nr_accounted; /* Number of VM_ACCOUNT pages */
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
fork-do-not-invoke-uffd-on-fork-if-error-occurs.patch
fork-only-invoke-khugepaged-ksm-hooks-if-no-error.patch
mm-vma-add-expand-only-vma-merge-mode-and-optimise-do_brk_flags.patch
tools-testing-add-expand-only-mode-vma-test.patch
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
The patch titled
Subject: mm: unconditionally close VMAs on error
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-unconditionally-close-vmas-on-error.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm: unconditionally close VMAs on error
Date: Wed, 23 Oct 2024 21:38:27 +0100
Incorrect invokation of VMA callbacks when the VMA is no longer in a
consistent state is bug prone and risky to perform.
With regards to the important vm_ops->close() callback We have gone to
great lengths to try to track whether or not we ought to close VMAs.
Rather than doing so and risking making a mistake somewhere, instead
unconditionally close and reset vma->vm_ops to an empty dummy operations
set with a NULL .close operator.
We introduce a new function to do so - vma_close() - and simplify existing
vms logic which tracked whether we needed to close or not.
This simplifies the logic, avoids incorrect double-calling of the .close()
callback and allows us to update error paths to simply call vma_close()
unconditionally - making VMA closure idempotent.
Link: https://lkml.kernel.org/r/72a81a6fb997508db644313a5fdf4d239f620da6.17297152…
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 18 ++++++++++++++++++
mm/mmap.c | 5 ++---
mm/nommu.c | 3 +--
mm/vma.c | 14 +++++---------
mm/vma.h | 4 +---
5 files changed, 27 insertions(+), 17 deletions(-)
--- a/mm/internal.h~mm-unconditionally-close-vmas-on-error
+++ a/mm/internal.h
@@ -135,6 +135,24 @@ static inline int mmap_file(struct file
return err;
}
+/*
+ * If the VMA has a close hook then close it, and since closing it might leave
+ * it in an inconsistent state which makes the use of any hooks suspect, clear
+ * them down by installing dummy empty hooks.
+ */
+static inline void vma_close(struct vm_area_struct *vma)
+{
+ if (vma->vm_ops && vma->vm_ops->close) {
+ vma->vm_ops->close(vma);
+
+ /*
+ * The mapping is in an inconsistent state, and no further hooks
+ * may be invoked upon it.
+ */
+ vma->vm_ops = &vma_dummy_vm_ops;
+ }
+}
+
#ifdef CONFIG_MMU
/* Flags for folio_pte_batch(). */
--- a/mm/mmap.c~mm-unconditionally-close-vmas-on-error
+++ a/mm/mmap.c
@@ -1572,8 +1572,7 @@ expanded:
return addr;
close_and_free_vma:
- if (file && !vms.closed_vm_ops && vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (file || vma->vm_file) {
unmap_and_free_vma:
@@ -1933,7 +1932,7 @@ void exit_mmap(struct mm_struct *mm)
do {
if (vma->vm_flags & VM_ACCOUNT)
nr_accounted += vma_pages(vma);
- remove_vma(vma, /* unreachable = */ true, /* closed = */ false);
+ remove_vma(vma, /* unreachable = */ true);
count++;
cond_resched();
vma = vma_next(&vmi);
--- a/mm/nommu.c~mm-unconditionally-close-vmas-on-error
+++ a/mm/nommu.c
@@ -589,8 +589,7 @@ static int delete_vma_from_mm(struct vm_
*/
static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
{
- if (vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (vma->vm_file)
fput(vma->vm_file);
put_nommu_region(vma->vm_region);
--- a/mm/vma.c~mm-unconditionally-close-vmas-on-error
+++ a/mm/vma.c
@@ -323,11 +323,10 @@ static bool can_vma_merge_right(struct v
/*
* Close a vm structure and free it.
*/
-void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed)
+void remove_vma(struct vm_area_struct *vma, bool unreachable)
{
might_sleep();
- if (!closed && vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (vma->vm_file)
fput(vma->vm_file);
mpol_put(vma_policy(vma));
@@ -1115,9 +1114,7 @@ void vms_clean_up_area(struct vma_munmap
vms_clear_ptes(vms, mas_detach, true);
mas_set(mas_detach, 0);
mas_for_each(mas_detach, vma, ULONG_MAX)
- if (vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
- vms->closed_vm_ops = true;
+ vma_close(vma);
}
/*
@@ -1160,7 +1157,7 @@ void vms_complete_munmap_vmas(struct vma
/* Remove and clean up vmas */
mas_set(mas_detach, 0);
mas_for_each(mas_detach, vma, ULONG_MAX)
- remove_vma(vma, /* = */ false, vms->closed_vm_ops);
+ remove_vma(vma, /* unreachable = */ false);
vm_unacct_memory(vms->nr_accounted);
validate_mm(mm);
@@ -1684,8 +1681,7 @@ struct vm_area_struct *copy_vma(struct v
return new_vma;
out_vma_link:
- if (new_vma->vm_ops && new_vma->vm_ops->close)
- new_vma->vm_ops->close(new_vma);
+ vma_close(new_vma);
if (new_vma->vm_file)
fput(new_vma->vm_file);
--- a/mm/vma.h~mm-unconditionally-close-vmas-on-error
+++ a/mm/vma.h
@@ -42,7 +42,6 @@ struct vma_munmap_struct {
int vma_count; /* Number of vmas that will be removed */
bool unlock; /* Unlock after the munmap */
bool clear_ptes; /* If there are outstanding PTE to be cleared */
- bool closed_vm_ops; /* call_mmap() was encountered, so vmas may be closed */
/* 1 byte hole */
unsigned long nr_pages; /* Number of pages being removed */
unsigned long locked_vm; /* Number of locked pages */
@@ -198,7 +197,6 @@ static inline void init_vma_munmap(struc
vms->unmap_start = FIRST_USER_ADDRESS;
vms->unmap_end = USER_PGTABLES_CEILING;
vms->clear_ptes = false;
- vms->closed_vm_ops = false;
}
#endif
@@ -269,7 +267,7 @@ int do_vmi_munmap(struct vma_iterator *v
unsigned long start, size_t len, struct list_head *uf,
bool unlock);
-void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed);
+void remove_vma(struct vm_area_struct *vma, bool unreachable);
void unmap_region(struct ma_state *mas, struct vm_area_struct *vma,
struct vm_area_struct *prev, struct vm_area_struct *next);
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
fork-do-not-invoke-uffd-on-fork-if-error-occurs.patch
fork-only-invoke-khugepaged-ksm-hooks-if-no-error.patch
mm-vma-add-expand-only-vma-merge-mode-and-optimise-do_brk_flags.patch
tools-testing-add-expand-only-mode-vma-test.patch
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
The patch titled
Subject: mm: avoid unsafe VMA hook invocation when error arises on mmap hook
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm: avoid unsafe VMA hook invocation when error arises on mmap hook
Date: Wed, 23 Oct 2024 21:38:26 +0100
Patch series "fix error handling in mmap_region() and refactor", v2.
The mmap_region() function is somewhat terrifying, with spaghetti-like
control flow and numerous means by which issues can arise and incomplete
state, memory leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently observed
issues with resource leaks and observable inconsistent state.
This series goes to great lengths to simplify how mmap_region() works and
to avoid unwinding errors late on in the process of setting up the VMA for
the new mapping, and equally avoids such operations occurring while the VMA
is in an inconsistent state.
The first four patches are intended for backporting to correct the
possibility of people encountering corrupted state while invoking mmap()
which is otherwise at risk of happening.
After this we go further, refactoring the code, placing it in mm/vma.c in
order to make it eventually userland testable, and significantly
simplifying the logic to avoid this issue arising in future.
This patch (of 4):
After an attempted mmap() fails, we are no longer in a situation where we
can safely interact with VMA hooks. This is currently not enforced,
meaning that we need complicated handling to ensure we do not incorrectly
call these hooks.
We can avoid the whole issue by treating the VMA as suspect the moment
that the file->f_ops->mmap() function reports an error by replacing
whatever VMA operations were installed with a dummy empty set of VMA
operations.
We do so through a new helper function internal to mm - mmap_file() -
which is both more logically named than the existing call_mmap() function
and correctly isolates handling of the vm_op reassignment to mm.
All the existing invocations of call_mmap() outside of mm are ultimately
nested within the call_mmap() from mm, which we now replace.
It is therefore safe to leave call_mmap() in place as a convenience
function (and to avoid churn). The invokers are:
ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap()
coda_file_operations -> mmap -> coda_file_mmap()
shm_file_operations -> shm_mmap()
shm_file_operations_huge -> shm_mmap()
dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops
-> i915_gem_dmabuf_mmap()
None of these callers interact with vm_ops or mappings in a problematic way
on error, quickly exiting out.
Link: https://lkml.kernel.org/r/cover.1729715266.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/69f3c04df1ece2b7d402a29451ec19290ff429a4.17297152…
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Jann Horn <jannh(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 27 +++++++++++++++++++++++++++
mm/mmap.c | 6 +++---
mm/nommu.c | 4 ++--
3 files changed, 32 insertions(+), 5 deletions(-)
--- a/mm/internal.h~mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook
+++ a/mm/internal.h
@@ -108,6 +108,33 @@ static inline void *folio_raw_mapping(co
return (void *)(mapping & ~PAGE_MAPPING_FLAGS);
}
+/*
+ * This is a file-backed mapping, and is about to be memory mapped - invoke its
+ * mmap hook and safely handle error conditions. On error, VMA hooks will be
+ * mutated.
+ *
+ * @file: File which backs the mapping.
+ * @vma: VMA which we are mapping.
+ *
+ * Returns: 0 if success, error otherwise.
+ */
+static inline int mmap_file(struct file *file, struct vm_area_struct *vma)
+{
+ int err = call_mmap(file, vma);
+
+ if (likely(!err))
+ return 0;
+
+ /*
+ * OK, we tried to call the file hook for mmap(), but an error
+ * arose. The mapping is in an inconsistent state and we most not invoke
+ * any further hooks on it.
+ */
+ vma->vm_ops = &vma_dummy_vm_ops;
+
+ return err;
+}
+
#ifdef CONFIG_MMU
/* Flags for folio_pte_batch(). */
--- a/mm/mmap.c~mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook
+++ a/mm/mmap.c
@@ -1421,7 +1421,7 @@ unsigned long mmap_region(struct file *f
/*
* clear PTEs while the vma is still in the tree so that rmap
* cannot race with the freeing later in the truncate scenario.
- * This is also needed for call_mmap(), which is why vm_ops
+ * This is also needed for mmap_file(), which is why vm_ops
* close function is called.
*/
vms_clean_up_area(&vms, &mas_detach);
@@ -1446,7 +1446,7 @@ unsigned long mmap_region(struct file *f
if (file) {
vma->vm_file = get_file(file);
- error = call_mmap(file, vma);
+ error = mmap_file(file, vma);
if (error)
goto unmap_and_free_vma;
@@ -1469,7 +1469,7 @@ unsigned long mmap_region(struct file *f
vma_iter_config(&vmi, addr, end);
/*
- * If vm_flags changed after call_mmap(), we should try merge
+ * If vm_flags changed after mmap_file(), we should try merge
* vma again as we may succeed this time.
*/
if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
--- a/mm/nommu.c~mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook
+++ a/mm/nommu.c
@@ -885,7 +885,7 @@ static int do_mmap_shared_file(struct vm
{
int ret;
- ret = call_mmap(vma->vm_file, vma);
+ ret = mmap_file(vma->vm_file, vma);
if (ret == 0) {
vma->vm_region->vm_top = vma->vm_region->vm_end;
return 0;
@@ -918,7 +918,7 @@ static int do_mmap_private(struct vm_are
* happy.
*/
if (capabilities & NOMMU_MAP_DIRECT) {
- ret = call_mmap(vma->vm_file, vma);
+ ret = mmap_file(vma->vm_file, vma);
/* shouldn't return success if we're not sharing */
if (WARN_ON_ONCE(!is_nommu_shared_mapping(vma->vm_flags)))
ret = -ENOSYS;
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
fork-do-not-invoke-uffd-on-fork-if-error-occurs.patch
fork-only-invoke-khugepaged-ksm-hooks-if-no-error.patch
mm-vma-add-expand-only-vma-merge-mode-and-optimise-do_brk_flags.patch
tools-testing-add-expand-only-mode-vma-test.patch
mm-avoid-unsafe-vma-hook-invocation-when-error-arises-on-mmap-hook.patch
mm-unconditionally-close-vmas-on-error.patch
mm-refactor-map_deny_write_exec.patch
mm-resolve-faulty-mmap_region-error-path-behaviour.patch
tools-testing-add-additional-vma_internalh-stubs.patch
mm-isolate-mmap-internal-logic-to-mm-vmac.patch
mm-refactor-__mmap_region.patch
mm-defer-second-attempt-at-merge-on-mmap.patch
selftests-mm-add-pkey_sighandler_xx-hugetlb_dio-to-gitignore.patch
mm-refactor-mm_access-to-not-return-null.patch
mm-refactor-mm_access-to-not-return-null-fix.patch
mm-madvise-unrestrict-process_madvise-for-current-process.patch
maple_tree-do-not-hash-pointers-on-dump-in-debug-mode.patch
tools-testing-fix-phys_addr_t-size-on-64-bit-systems.patch
With the transition of pd-mapper into the kernel, the timing was altered
such that on some targets the initial rpmsg_send() requests from
pmic_glink clients would be attempted before the firmware had announced
intents, and the firmware reject intent requests.
Fix this
Signed-off-by: Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
---
Bjorn Andersson (2):
rpmsg: glink: Handle rejected intent request better
soc: qcom: pmic_glink: Handle GLINK intent allocation rejections
drivers/rpmsg/qcom_glink_native.c | 10 +++++++---
drivers/soc/qcom/pmic_glink.c | 18 +++++++++++++++---
2 files changed, 22 insertions(+), 6 deletions(-)
---
base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
change-id: 20241022-pmic-glink-ecancelled-d899a9ca0358
Best regards,
--
Bjorn Andersson <bjorn.andersson(a)oss.qualcomm.com>
From: Jeff Xu <jeffxu(a)chromium.org>
Two fixes for madvise(MADV_DONTNEED) when sealed.
For PROT_NONE mappings, the previous blocking of
madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits
memory access, madvise(MADV_DONTNEED) should be allowed to proceed in
order to free the page.
For file-backed, private, read-only memory mappings, we previously did
not block the madvise(MADV_DONTNEED). This was based on
the assumption that the memory's content, being file-backed, could be
retrieved from the file if accessed again. However, this assumption
failed to consider scenarios where a mapping is initially created as
read-write, modified, and subsequently changed to read-only. The newly
introduced VM_WASWRITE flag addresses this oversight.
Reported-by: Pedro Falcato <pedro.falcato(a)gmail.com>
Link:https://lore.kernel.org/all/CABi2SkW2XzuZ2-TunWOVzTEX1qc29LhjfNQ3hD4Ny…
Fixes: 8be7258aad44 ("mseal: add mseal syscall")
Cc: <stable(a)vger.kernel.org> # 6.11.y: 4d1b3416659b: mm: move can_modify_vma to mm/vma.h
Cc: <stable(a)vger.kernel.org> # 6.11.y: 4a2dd02b0916: mm/mprotect: replace can_modify_mm with can_modify_vma
Cc: <stable(a)vger.kernel.org> # 6.11.y: 23c57d1fa2b9: mseal: replace can_modify_mm_madv with a vma variant
Cc: <stable(a)vger.kernel.org> # 6.11.y
Signed-off-by: Jeff Xu <jeffxu(a)chromium.org>
---
include/linux/mm.h | 2 ++
mm/mprotect.c | 3 +++
mm/mseal.c | 42 ++++++++++++++++++++++++++++++++++++------
3 files changed, 41 insertions(+), 6 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4c32003c8404..b402eca2565a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -430,6 +430,8 @@ extern unsigned int kobjsize(const void *objp);
#ifdef CONFIG_64BIT
/* VM is sealed, in vm_flags */
#define VM_SEALED _BITUL(63)
+/* VM was writable */
+#define VM_WASWRITE _BITUL(62)
#endif
/* Bits set in the VMA until the stack is in its final location */
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 0c5d6d06107d..6397135ca526 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -821,6 +821,9 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
break;
}
+ if ((vma->vm_flags & VM_WRITE) && !(newflags & VM_WRITE))
+ newflags |= VM_WASWRITE;
+
error = security_file_mprotect(vma, reqprot, prot);
if (error)
break;
diff --git a/mm/mseal.c b/mm/mseal.c
index ece977bd21e1..28f28487be17 100644
--- a/mm/mseal.c
+++ b/mm/mseal.c
@@ -36,12 +36,8 @@ static bool is_madv_discard(int behavior)
return false;
}
-static bool is_ro_anon(struct vm_area_struct *vma)
+static bool anon_is_ro(struct vm_area_struct *vma)
{
- /* check anonymous mapping. */
- if (vma->vm_file || vma->vm_flags & VM_SHARED)
- return false;
-
/*
* check for non-writable:
* PROT=RO or PKRU is not writeable.
@@ -53,6 +49,22 @@ static bool is_ro_anon(struct vm_area_struct *vma)
return false;
}
+static bool vma_is_prot_none(struct vm_area_struct *vma)
+{
+ if ((vma->vm_flags & VM_ACCESS_FLAGS) == VM_NONE)
+ return true;
+
+ return false;
+}
+
+static bool vma_was_writable_turn_readonly(struct vm_area_struct *vma)
+{
+ if (!(vma->vm_flags & VM_WRITE) && vma->vm_flags & VM_WASWRITE)
+ return true;
+
+ return false;
+}
+
/*
* Check if a vma is allowed to be modified by madvise.
*/
@@ -61,7 +73,25 @@ bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior)
if (!is_madv_discard(behavior))
return true;
- if (unlikely(!can_modify_vma(vma) && is_ro_anon(vma)))
+ /* not sealed */
+ if (likely(can_modify_vma(vma)))
+ return true;
+
+ /* PROT_NONE mapping */
+ if (vma_is_prot_none(vma))
+ return true;
+
+ /* file-backed private mapping */
+ if (vma->vm_file) {
+ /* read-only but was writeable */
+ if (vma_was_writable_turn_readonly(vma))
+ return false;
+
+ return true;
+ }
+
+ /* anonymous mapping is read-only */
+ if (anon_is_ro(vma))
return false;
/* Allow by default. */
--
2.47.0.rc1.288.g06298d1525-goog
tpm2_load_null() has weak and broken error handling:
- The return value of tpm2_create_primary() is ignored.
- Leaks TPM return codes from tpm2_load_context() to the caller.
- If the key name comparison succeeds returns previous error
instead of zero to the caller.
Implement a proper error rollback.
Cc: stable(a)vger.kernel.org # v6.10+
Fixes: eb24c9788cd9 ("tpm: disable the TPM if NULL name changes")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
--
v6:
- Address Stefan's remark:
https://lore.kernel.org/linux-integrity/def4ec2d-584b-405f-9d5e-99267013c3c…
v5:
- Fix the TPM error code leak from tpm2_load_context().
v4:
- No changes.
v3:
- Update log messages. Previously the log message incorrectly stated
on load failure that integrity check had been failed, even tho the
check is done *after* the load operation.
v2:
- Refined the commit message.
- Reverted tpm2_create_primary() changes. They are not required if
tmp_null_key is used as the parameter.
---
drivers/char/tpm/tpm2-sessions.c | 43 +++++++++++++++++---------------
1 file changed, 23 insertions(+), 20 deletions(-)
diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index 1e12e0b2492e..bdac11964b55 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -915,33 +915,36 @@ static int tpm2_parse_start_auth_session(struct tpm2_auth *auth,
static int tpm2_load_null(struct tpm_chip *chip, u32 *null_key)
{
- int rc;
unsigned int offset = 0; /* dummy offset for null seed context */
u8 name[SHA256_DIGEST_SIZE + 2];
+ u32 tmp_null_key;
+ int rc;
rc = tpm2_load_context(chip, chip->null_key_context, &offset,
- null_key);
- if (rc != -EINVAL)
- return rc;
+ &tmp_null_key);
+ if (rc != -EINVAL) {
+ if (!rc)
+ *null_key = tmp_null_key;
+ goto err;
+ }
- /* an integrity failure may mean the TPM has been reset */
- dev_err(&chip->dev, "NULL key integrity failure!\n");
- /* check the null name against what we know */
- tpm2_create_primary(chip, TPM2_RH_NULL, NULL, name);
- if (memcmp(name, chip->null_key_name, sizeof(name)) == 0)
- /* name unchanged, assume transient integrity failure */
- return rc;
- /*
- * Fatal TPM failure: the NULL seed has actually changed, so
- * the TPM must have been illegally reset. All in-kernel TPM
- * operations will fail because the NULL primary can't be
- * loaded to salt the sessions, but disable the TPM anyway so
- * userspace programmes can't be compromised by it.
- */
- dev_err(&chip->dev, "NULL name has changed, disabling TPM due to interference\n");
+ rc = tpm2_create_primary(chip, TPM2_RH_NULL, &tmp_null_key, name);
+ if (rc)
+ goto err;
+
+ /* Return the null key if the name has not been changed: */
+ if (memcmp(name, chip->null_key_name, sizeof(name)) == 0) {
+ *null_key = tmp_null_key;
+ return 0;
+ }
+
+ /* Deduce from the name change TPM interference: */
+ dev_err(&chip->dev, "the null key integrity check failedh\n");
+ tpm2_flush_context(chip, tmp_null_key);
chip->flags |= TPM_CHIP_FLAG_DISABLE;
- return rc;
+err:
+ return rc ? -ENODEV : 0;
}
/**
--
2.47.0
From: Oliver Neukum <oneukum(a)suse.com>
[ Upstream commit e5876b088ba03a62124266fa20d00e65533c7269 ]
ipheth_sndbulk_callback() can submit carrier_work
as a part of its error handling. That means that
the driver must make sure that the work is cancelled
after it has made sure that no more URB can terminate
with an error condition.
Hence the order of actions in ipheth_close() needs
to be inverted.
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Foster Snowhill <forst(a)pen.gy>
Tested-by: Georgi Valkov <gvalkov(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/usb/ipheth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c
index 73ad78f47763c..7814856636907 100644
--- a/drivers/net/usb/ipheth.c
+++ b/drivers/net/usb/ipheth.c
@@ -353,8 +353,8 @@ static int ipheth_close(struct net_device *net)
{
struct ipheth_device *dev = netdev_priv(net);
- cancel_delayed_work_sync(&dev->carrier_work);
netif_stop_queue(net);
+ cancel_delayed_work_sync(&dev->carrier_work);
return 0;
}
--
2.43.0
commit f956052e00de211b5c9ebaa1958366c23f82ee9e upstream.
font.data may not initialize all memory spaces depending on the implementation
of vc->vc_sw->con_font_get. This may cause info-leak, so to prevent this, it
is safest to modify it to initialize the allocated memory space to 0, and it
generally does not affect the overall performance of the system.
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+955da2d57931604ee691(a)syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
Link: https://lore.kernel.org/r/20241010174619.59662-1-aha310510@gmail.com
---
drivers/tty/vt/vt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 5f1183b0b89d..800979e8d5b6 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -4398,7 +4398,7 @@ static int con_font_get(struct vc_data *vc, struct console_font_op *op)
int c;
if (op->data) {
- font.data = kmalloc(max_font_size, GFP_KERNEL);
+ font.data = kzalloc(max_font_size, GFP_KERNEL);
if (!font.data)
return -ENOMEM;
} else
--
From: Dominique Martinet <asmadeus(a)codewreck.org>
[ Upstream commit 38d222b3163f7b7d737e5d999ffc890a12870e36 ]
It's possible for v9fs_fid_find "find by dentry" branch to not turn up
anything despite having an entry set (because e.g. uid doesn't match),
in which case the calling code will generally make an extra lookup
to the server.
In this case we might have had better luck looking by inode, so fall
back to look up by inode if we have one and the lookup by dentry failed.
Message-Id: <20240523210024.1214386-1-asmadeus(a)codewreck.org>
Reviewed-by: Christian Schoenebeck <linux_oss(a)crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/fid.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index de009a33e0e26..f84412290a30c 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -131,10 +131,9 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
}
}
spin_unlock(&dentry->d_lock);
- } else {
- if (dentry->d_inode)
- ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any);
}
+ if (!ret && dentry->d_inode)
+ ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any);
return ret;
}
--
2.43.0
From: Dominique Martinet <asmadeus(a)codewreck.org>
[ Upstream commit 38d222b3163f7b7d737e5d999ffc890a12870e36 ]
It's possible for v9fs_fid_find "find by dentry" branch to not turn up
anything despite having an entry set (because e.g. uid doesn't match),
in which case the calling code will generally make an extra lookup
to the server.
In this case we might have had better luck looking by inode, so fall
back to look up by inode if we have one and the lookup by dentry failed.
Message-Id: <20240523210024.1214386-1-asmadeus(a)codewreck.org>
Reviewed-by: Christian Schoenebeck <linux_oss(a)crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/fid.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index de009a33e0e26..f84412290a30c 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -131,10 +131,9 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
}
}
spin_unlock(&dentry->d_lock);
- } else {
- if (dentry->d_inode)
- ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any);
}
+ if (!ret && dentry->d_inode)
+ ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any);
return ret;
}
--
2.43.0
This problem reported by Clement LE GOFFIC manifest when
using CONFIG_KASAN_IN_VMALLOC and VMAP_STACK:
https://lore.kernel.org/linux-arm-kernel/a1a1d062-f3a2-4d05-9836-3b098de9db…
After some analysis it seems we are missing to sync the
VMALLOC shadow memory in top level PGD to all CPUs.
Add some code to perform this sync, and the bug appears
to go away.
As suggested by Ard, also perform a dummy read from the
shadow memory of the new VMAP_STACK in the low level
assembly.
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
Changes in v4:
- Since Kasan is not using header stubs, it is necessary to avoid
kasan_*() calls using ifdef when compiling without KASAN.
- Lift a line aligning the end of vmalloc from Melon Liu's
very similar patch so we have feature parity, credit Melon
as co-developer.
- Include the atomic_read_acquire() patch in the series due
to context requirements.
- Verify that the after the patch the kernel still builds and boots
without Kasan.
- Link to v3: https://lore.kernel.org/r/20241017-arm-kasan-vmalloc-crash-v3-0-d2a34cd5b66…
Changes in v3:
- Collect Mark Rutlands ACK on patch 1
- Change the simplified assembly add r2, ip, lsr #n to the canonical
add r2, r2, ip, lsr #n in patch 2.
- Link to v2: https://lore.kernel.org/r/20241016-arm-kasan-vmalloc-crash-v2-0-0a52fd086ee…
Changes in v2:
- Implement the two helper functions suggested by Russell
making the KASAN PGD copying less messy.
- Link to v1: https://lore.kernel.org/r/20241015-arm-kasan-vmalloc-crash-v1-0-dbb23592ca8…
---
Linus Walleij (3):
ARM: ioremap: Sync PGDs for VMALLOC shadow
ARM: entry: Do a dummy read from VMAP shadow
mm: Pair atomic_set_release() with _read_acquire()
arch/arm/kernel/entry-armv.S | 8 ++++++++
arch/arm/mm/ioremap.c | 35 ++++++++++++++++++++++++++++++-----
2 files changed, 38 insertions(+), 5 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241015-arm-kasan-vmalloc-crash-fcbd51416457
Best regards,
--
Linus Walleij <linus.walleij(a)linaro.org>
On 2024/10/23 1:44, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> selftests: mm: fix the incorrect usage() info of khugepaged
>
> to the 6.11-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> selftests-mm-fix-the-incorrect-usage-info-of-khugepa.patch
> and it can be found in the queue-6.11 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi,
I don't think this patch needs to be added to the stable tree because
this just fixes usage
information, as Andrew had previously said:
https://lore.kernel.org/lkml/20241017001441.2db5adaaa63dc3faa0934204@linux-…
>
>
>
> commit ad8b93ffe0a86e3b6be297826cd34b12080fc877
> Author: Nanyong Sun <sunnanyong(a)huawei.com>
> Date: Tue Oct 15 10:02:57 2024 +0800
>
> selftests: mm: fix the incorrect usage() info of khugepaged
>
> [ Upstream commit 3e822bed2fbd1527d88f483342b1d2a468520a9a ]
>
> The mount option of tmpfs should be huge=advise, not madvise which is not
> supported and may mislead the users.
>
> Link: https://lkml.kernel.org/r/20241015020257.139235-1-sunnanyong@huawei.com
> Fixes: 1b03d0d558a2 ("selftests/vm: add thp collapse file and tmpfs testing")
> Signed-off-by: Nanyong Sun <sunnanyong(a)huawei.com>
> Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
> Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
> Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
> Cc: Shuah Khan <shuah(a)kernel.org>
> Cc: Zach O'Keefe <zokeefe(a)google.com>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
> index 829320a519e72..89dec42986825 100644
> --- a/tools/testing/selftests/mm/khugepaged.c
> +++ b/tools/testing/selftests/mm/khugepaged.c
> @@ -1091,7 +1091,7 @@ static void usage(void)
> fprintf(stderr, "\n\t\"file,all\" mem_type requires kernel built with\n");
> fprintf(stderr, "\tCONFIG_READ_ONLY_THP_FOR_FS=y\n");
> fprintf(stderr, "\n\tif [dir] is a (sub)directory of a tmpfs mount, tmpfs must be\n");
> - fprintf(stderr, "\tmounted with huge=madvise option for khugepaged tests to work\n");
> + fprintf(stderr, "\tmounted with huge=advise option for khugepaged tests to work\n");
> fprintf(stderr, "\n\tSupported Options:\n");
> fprintf(stderr, "\t\t-h: This help message.\n");
> fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n");
> .
This is the start of the stable review cycle for the 6.1.114 release.
There are 91 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 23 Oct 2024 10:22:25 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.114-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.114-rc1
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
Nicholas Piggin <npiggin(a)gmail.com>
powerpc/64: Add big-endian ELFv2 flavour to crypto VMX asm generation
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: propagate directory read errors from nilfs_find_entry()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: prevent MPC handshake on port-based signal endpoints
Paolo Abeni <pabeni(a)redhat.com>
tcp: fix mptcp DSS corruption due to large pmtu xmit
Nam Cao <namcao(a)linutronix.de>
irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()
Marc Zyngier <maz(a)kernel.org>
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
Ma Ke <make24(a)iscas.ac.cn>
pinctrl: apple: check devm_kasprintf() returned value
Sergey Matsievskiy <matsievskiysv(a)gmail.com>
pinctrl: ocelot: fix system hang on level based interrupts
Longlong Xia <xialonglong(a)kylinos.cn>
tty: n_gsm: Fix use-after-free in gsm_cleanup_mux
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Clear CPU buffers after register restore in NMI return
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Do not clobber user EFLAGS.ZF
Zhang Rui <rui.zhang(a)intel.com>
x86/apic: Always explicitly disarm TSC-deadline timer
Nathan Chancellor <nathan(a)kernel.org>
x86/resctrl: Annotate get_mem_config() functions as __init
Takashi Iwai <tiwai(a)suse.de>
parport: Proper fix for array out-of-bounds access
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
Daniele Palmas <dnlplm(a)gmail.com>
USB: serial: option: add Telit FN920C04 MBIM compositions
Benjamin B. Frost <benjamin(a)geanix.com>
USB: serial: option: add support for Quectel EG916Q-GL
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Mitigate failed set dequeue pointer commands
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Fix incorrect stream context type macro
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Aaron Thompson <dev(a)aaront.org>
Bluetooth: ISO: Fix multiple init when debugfs is disabled
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Remove debugfs directory on module init failure
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Call iso_exit() on module unload
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
Emil Gedenryd <emil.gedenryd(a)axis.com>
iio: light: opt3001: add missing full-scale range value
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix IIO device retrieval from embedded device
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix ALS sensor resolution
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
Mohammed Anees <pvmohammedanees2003(a)gmail.com>
drm/amdgpu: prevent BO_HANDLES error from being overwritten
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu/swsmu: Only force workload setup on init
Nikolay Kuratov <kniv(a)yandex-team.ru>
drm/vmwgfx: Handle surface check failure correctly
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/radeon: Fix encoder->possible_clones
Seunghwan Baek <sh8267.baek(a)samsung.com>
scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: close race on waiting for sqring entries
Omar Sandoval <osandov(a)fb.com>
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Skip RSB fill at VMEXIT
Johannes Wikner <kwikner(a)ethz.ch>
x86/entry: Have entry_ibpb() invalidate return predictions
Johannes Wikner <kwikner(a)ethz.ch>
x86/cpufeatures: Add a IBPB_NO_RET BUG flag
Jim Mattson <jmattson(a)google.com>
x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
Michael Mueller <mimu(a)linux.ibm.com>
KVM: s390: Change virtual to physical address access in diag 0x258 handler
Nico Boehr <nrb(a)linux.ibm.com>
KVM: s390: gaccess: Check if guest address is in memslot
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp: Deactivate sclp after all its users
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
Wachowski, Karol <karol.wachowski(a)intel.com>
drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
maple_tree: correct tree corruption on spanning store
Jakub Kicinski <kuba(a)kernel.org>
devlink: bump the instance index directly when iterating
Jakub Kicinski <kuba(a)kernel.org>
devlink: drop the filter argument from devlinks_xa_find_get
Liu Shixin <liushixin2(a)huawei.com>
mm/swapfile: skip HugeTLB pages for unuse_vma
OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
fat: fix uninitialized variable
Nianyao Tang <tangnianyao(a)huawei.com>
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
Oleksij Rempel <linux(a)rempel-privat.de>
net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Fix simulate_ldr*_literal()
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Remove broken LDR (literal) uprobe support
Jinjie Ruan <ruanjinjie(a)huawei.com>
posix-clock: Fix missing timespec64 check in pc_clock_settime()
Wei Fang <wei.fang(a)nxp.com>
net: enetc: add missing static descriptor and inline keyword
Wei Fang <wei.fang(a)nxp.com>
net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
Jan Kara <jack(a)suse.cz>
udf: Don't return bh from udf_expand_dir_adinicb()
Jan Kara <jack(a)suse.cz>
udf: Handle error when expanding directory
Jan Kara <jack(a)suse.cz>
udf: Remove old directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_link() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_mkdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_add_nondir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Implement adding of dir entries using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_unlink() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rmdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert empty_dir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_get_parent() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_lookup() to use new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_readdir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rename() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Provide function to mark entry as deleted using new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Implement searching for directory entry using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Move udf_expand_dir_adinicb() to its callsite
Jan Kara <jack(a)suse.cz>
udf: Convert udf_expand_dir_adinicb() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: New directory iteration code
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix user-after-free from session log off
Roi Martin <jroi.martin(a)gmail.com>
btrfs: fix uninitialized pointer free on read_alloc_one_name() error
Roi Martin <jroi.martin(a)gmail.com>
btrfs: fix uninitialized pointer free in add_inode_ref()
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/kernel/probes/decode-insn.c | 16 +-
arch/arm64/kernel/probes/simulate-insn.c | 18 +-
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/gaccess.c | 4 +
arch/s390/kvm/gaccess.h | 14 +-
arch/x86/entry/entry.S | 5 +
arch/x86/entry/entry_32.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/kernel/apic/apic.c | 14 +-
arch/x86/kernel/cpu/bugs.c | 32 +
arch/x86/kernel/cpu/common.c | 3 +
arch/x86/kernel/cpu/resctrl/core.c | 4 +-
block/blk-rq-qos.c | 2 +-
drivers/bluetooth/btusb.c | 13 +-
drivers/crypto/vmx/Makefile | 12 +-
drivers/crypto/vmx/ppc-xlate.pl | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 6 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +
drivers/gpu/drm/radeon/radeon_encoders.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 1 +
drivers/iio/adc/Kconfig | 4 +
drivers/iio/amplifiers/Kconfig | 1 +
.../iio/common/hid-sensors/hid-sensor-trigger.c | 2 +-
drivers/iio/dac/Kconfig | 7 +
drivers/iio/light/opt3001.c | 4 +
drivers/iio/light/veml6030.c | 5 +-
drivers/iio/proximity/Kconfig | 2 +
drivers/iommu/intel/iommu.c | 4 +-
drivers/irqchip/irq-gic-v3-its.c | 26 +-
drivers/irqchip/irq-sifive-plic.c | 21 +-
drivers/net/ethernet/cadence/macb_main.c | 14 +-
drivers/net/ethernet/freescale/enetc/enetc.c | 2 +-
drivers/parport/procfs.c | 22 +-
drivers/pinctrl/pinctrl-apple-gpio.c | 3 +
drivers/pinctrl/pinctrl-ocelot.c | 8 +-
drivers/s390/char/sclp.c | 3 +-
drivers/s390/char/sclp_vt220.c | 4 +-
drivers/tty/n_gsm.c | 2 +
drivers/ufs/core/ufshcd.c | 4 +-
drivers/usb/dwc3/gadget.c | 10 +-
drivers/usb/host/xhci-ring.c | 2 +-
drivers/usb/host/xhci.h | 2 +-
drivers/usb/serial/option.c | 8 +
fs/btrfs/tree-log.c | 6 +-
fs/fat/namei_vfat.c | 2 +-
fs/nilfs2/dir.c | 50 +-
fs/nilfs2/namei.c | 39 +-
fs/nilfs2/nilfs.h | 2 +-
fs/smb/server/mgmt/user_session.c | 26 +-
fs/smb/server/mgmt/user_session.h | 4 +
fs/smb/server/server.c | 2 +
fs/smb/server/smb2pdu.c | 8 +-
fs/udf/dir.c | 148 +--
fs/udf/directory.c | 594 ++++++++---
fs/udf/inode.c | 90 --
fs/udf/namei.c | 1037 +++++++-------------
fs/udf/udfdecl.h | 45 +-
include/linux/fsl/enetc_mdio.h | 3 +-
include/linux/irqchip/arm-gic-v4.h | 4 +-
io_uring/io_uring.h | 9 +-
kernel/time/posix-clock.c | 3 +
lib/maple_tree.c | 12 +-
mm/swapfile.c | 2 +-
net/bluetooth/af_bluetooth.c | 3 +
net/bluetooth/iso.c | 6 +-
net/devlink/leftover.c | 40 +-
net/ipv4/tcp_output.c | 4 +-
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/pm_netlink.c | 3 +-
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 11 +
sound/pci/hda/patch_conexant.c | 19 +
75 files changed, 1237 insertions(+), 1275 deletions(-)
From: Chris Wilson <chris.p.wilson(a)intel.com>
commit 78a033433a5ae4fee85511ee075bc9a48312c79e upstream.
If we abort driver initialisation in the middle of gt/engine discovery,
some engines will be fully setup and some not. Those incompletely setup
engines only have 'engine->release == NULL' and so will leak any of the
common objects allocated.
v2:
- Drop the destroy_pinned_context() helper for now. It's not really
worth it with just a single callsite at the moment. (Janusz)
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Janusz Krzysztofik <janusz.krzysztofik(a)linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper(a)intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220915232654.3283095-2-matt…
Cc: <stable(a)vger.kernel.org> # 5.10+
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a19537706ed1..eb6f4d7f1e34 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -904,8 +904,13 @@ int intel_engines_init(struct intel_gt *gt)
return err;
err = setup(engine);
- if (err)
+ if (err) {
+ intel_engine_cleanup_common(engine);
return err;
+ }
+
+ /* The backend should now be responsible for cleanup */
+ GEM_BUG_ON(engine->release == NULL);
err = engine_init_common(engine);
if (err)
--
2.34.1
From: Johannes Berg <johannes.berg(a)intel.com>
If more than 255 colocated APs exist for the set of all
APs found during 2.5/5 GHz scanning, then the 6 GHz scan
construction will loop forever since the loop variable
has type u8, which can never reach the number found when
that's bigger than 255, and is stored in a u32 variable.
Also move it into the loops to have a smaller scope.
Using a u32 there is fine, we limit the number of APs in
the scan list and each has a limit on the number of RNR
entries due to the frame size. With a limit of 1000 scan
results, a frame size upper bound of 4096 (really it's
more like ~2300) and a TBTT entry size of at least 11,
we get an upper bound for the number of ~372k, well in
the bounds of a u32.
Cc: stable(a)vger.kernel.org
Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
index 3ce9150213a7..ddcbd80a49fb 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
@@ -1774,7 +1774,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
&cp->channel_config[ch_cnt];
u32 s_ssid_bitmap = 0, bssid_bitmap = 0, flags = 0;
- u8 j, k, n_s_ssids = 0, n_bssids = 0;
+ u8 k, n_s_ssids = 0, n_bssids = 0;
u8 max_s_ssids, max_bssids;
bool force_passive = false, found = false, allow_passive = true,
unsolicited_probe_on_chan = false, psc_no_listen = false;
@@ -1799,7 +1799,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
cfg->v5.iter_count = 1;
cfg->v5.iter_interval = 0;
- for (j = 0; j < params->n_6ghz_params; j++) {
+ for (u32 j = 0; j < params->n_6ghz_params; j++) {
s8 tmp_psd_20;
if (!(scan_6ghz_params[j].channel_idx == i))
@@ -1873,7 +1873,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
* SSID.
* TODO: improve this logic
*/
- for (j = 0; j < params->n_6ghz_params; j++) {
+ for (u32 j = 0; j < params->n_6ghz_params; j++) {
if (!(scan_6ghz_params[j].channel_idx == i))
continue;
--
2.47.0
On 22. 10. 24, 19:45, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> xhci: dbgtty: remove kfifo_out() wrapper
This is a cleanup, not needed in stable.
--
js
suse labs
The comment before the config of the GPLL3 PLL says that the
PLL should run at 930 MHz. In contrary to this, calculating
the frequency from the current configuration values by using
19.2 MHz as input frequency defined in 'qcs404.dtsi', it gives
921.6 MHz:
$ xo=19200000; l=48; alpha=0x0; alpha_hi=0x0
$ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
921600000.00000000000000000000
Set 'alpha_hi' in the configuration to a value used in downstream
kernels [1][2] in order to get the correct output rate:
$ xo=19200000; l=48; alpha=0x0; alpha_hi=0x70
$ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
930000000.00000000000000000000
The change is based on static code analysis, compile tested only.
[1] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/kernel.lnx.5.4.r56-…
[2} https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/blob/kernel.lnx.5.15.r4…
Cc: stable(a)vger.kernel.org
Fixes: 652f1813c113 ("clk: qcom: gcc: Add global clock controller driver for QCS404")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Note: due to a bug in the clk_alpha_pll_configure() function, the following
patch is also needed in order for this fix to take effect:
https://lore.kernel.org/all/20241019-qcs615-mm-clockcontroller-v1-1-4cfb96d…
---
drivers/clk/qcom/gcc-qcs404.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/clk/qcom/gcc-qcs404.c b/drivers/clk/qcom/gcc-qcs404.c
index c3cfd572e7c1e0a987519be2cb2050c9bc7992c7..5ca003c9bfba89bee2e626b3c35936452cc02765 100644
--- a/drivers/clk/qcom/gcc-qcs404.c
+++ b/drivers/clk/qcom/gcc-qcs404.c
@@ -131,6 +131,7 @@ static struct clk_alpha_pll gpll1_out_main = {
/* 930MHz configuration */
static const struct alpha_pll_config gpll3_config = {
.l = 48,
+ .alpha_hi = 0x70,
.alpha = 0x0,
.alpha_en_mask = BIT(24),
.post_div_mask = 0xf << 8,
---
base-commit: 03dc72319cee7d0dfefee9ae7041b67732f6b8cd
change-id: 20241021-fix-gcc-qcs404-gpll3-f314335c8ecf
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\--------------\ \-------------\ \-------------\|
V V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 1 2 4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\------------------------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
bfqq3 will be used to handle IO from P1, this is not expected, IO
should be redirected to bfqq4;
With the patch:
-------------------------------------------
| |
Process 1 Process 2 Process 3 | Process 4
(BIC1) (BIC2) (BIC3) | (BIC4)
| | | |
\-------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
IO is redirected to bfqq4, however, procress reference of bfqq3 is still
2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfqq_merge_bfqqs() to return new_bfqq to simplify
code.
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 33 ++++++++++++++++-----------------
1 file changed, 16 insertions(+), 17 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 515e3c1a5475..c1600e3ac333 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2774,10 +2774,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -2845,6 +2847,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
new_bfqq->pid = -1;
bfqq->bic = NULL;
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -2880,14 +2884,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5444,6 +5442,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -5459,18 +5458,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\--------------\ \-------------\ \-------------\|
V V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 1 2 4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\------------------------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
bfqq3 will be used to handle IO from P1, this is not expected, IO
should be redirected to bfqq4;
With the patch:
-------------------------------------------
| |
Process 1 Process 2 Process 3 | Process 4
(BIC1) (BIC2) (BIC3) | (BIC4)
| | | |
\-------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
IO is redirected to bfqq4, however, procress reference of bfqq3 is still
2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfqq_merge_bfqqs() to return new_bfqq to simplify
code.
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index b0bdb5197530..c985c944fa65 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2981,10 +2981,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -3078,6 +3080,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
bfq_reassign_last_bfqq(bfqq, new_bfqq);
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -3113,14 +3117,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5482,9 +5480,7 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
* state before killing it.
*/
bfqq->bic = bic;
- bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
-
- return new_bfqq;
+ return bfq_merge_bfqqs(bfqd, bic, bfqq);
}
/*
@@ -5916,6 +5912,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -5931,18 +5928,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\--------------\ \-------------\ \-------------\|
V V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 1 2 4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
Process 1 Process 2 Process 3 Process 4
(BIC1) (BIC2) (BIC3) (BIC4)
Λ | | |
\------------------------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
bfqq3 will be used to handle IO from P1, this is not expected, IO
should be redirected to bfqq4;
With the patch:
-------------------------------------------
| |
Process 1 Process 2 Process 3 | Process 4
(BIC1) (BIC2) (BIC3) | (BIC4)
| | | |
\-------------\ \-------------\|
V V
bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
ref 0 0 2 4
IO is redirected to bfqq4, however, procress reference of bfqq3 is still
2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfqq_merge_bfqqs() to return new_bfqq to simplify
code.
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index bfce6343a577..8e797782cfe3 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3117,10 +3117,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -3214,6 +3216,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
bfq_reassign_last_bfqq(bfqq, new_bfqq);
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -3249,14 +3253,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5616,9 +5614,7 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
* state before killing it.
*/
bfqq->bic = bic;
- bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
-
- return new_bfqq;
+ return bfq_merge_bfqqs(bfqd, bic, bfqq);
}
/*
@@ -6066,6 +6062,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -6081,18 +6078,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Huacai Chen <chenhuacai(a)loongson.cn>
mainline inclusion
from mainline-v6.11-rc1
commit 7697a0fe0154468f5df35c23ebd7aa48994c2cdc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IAZ33N
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
Chromium sandbox apparently wants to deny statx [1] so it could properly
inspect arguments after the sandboxed process later falls back to fstat.
Because there's currently not a "fd-only" version of statx, so that the
sandbox has no way to ensure the path argument is empty without being
able to peek into the sandboxed process's memory. For architectures able
to do newfstatat though, glibc falls back to newfstatat after getting
-ENOSYS for statx, then the respective SIGSYS handler [2] takes care of
inspecting the path argument, transforming allowed newfstatat's into
fstat instead which is allowed and has the same type of return value.
But, as LoongArch is the first architecture to not have fstat nor
newfstatat, the LoongArch glibc does not attempt falling back at all
when it gets -ENOSYS for statx -- and you see the problem there!
Actually, back when the LoongArch port was under review, people were
aware of the same problem with sandboxing clone3 [3], so clone was
eventually kept. Unfortunately it seemed at that time no one had noticed
statx, so besides restoring fstat/newfstatat to LoongArch uapi (and
postponing the problem further), it seems inevitable that we would need
to tackle seccomp deep argument inspection.
However, this is obviously a decision that shouldn't be taken lightly,
so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT
in unistd.h. This is the simplest solution for now, and so we hope the
community will tackle the long-standing problem of seccomp deep argument
inspection in the future [4][5].
Also add "newstat" to syscall_abis_64 in Makefile.syscalls due to
upstream asm-generic changes.
More infomation please reading this thread [6].
[1] https://chromium-review.googlesource.com/c/chromium/src/+/2823150
[2] https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux…
[3] https://lore.kernel.org/linux-arch/20220511211231.GG7074@brightrain.aerifal…
[4] https://lwn.net/Articles/799557/
[5] https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-i…
[6] https://lore.kernel.org/loongarch/20240226-granit-seilschaft-eccc2433014d@b…
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/include/asm/unistd.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/include/asm/unistd.h b/arch/loongarch/include/asm/unistd.h
index cfddb0116a8c..f1d2b7e5c062 100644
--- a/arch/loongarch/include/asm/unistd.h
+++ b/arch/loongarch/include/asm/unistd.h
@@ -8,4 +8,5 @@
#include <uapi/asm/unistd.h>
+#define __ARCH_WANT_NEW_STAT
#define NR_syscalls (__NR_syscalls)
--
2.33.0
From: Barry Song <v-songbaohua(a)oppo.com>
Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
introduced an unconditional one-tick sleep when `swapcache_prepare()`
fails, which has led to reports of UI stuttering on latency-sensitive
Android devices. To address this, we can use a waitqueue to wake up
tasks that fail `swapcache_prepare()` sooner, instead of always
sleeping for a full tick. While tasks may occasionally be woken by an
unrelated `do_swap_page()`, this method is preferable to two scenarios:
rapid re-entry into page faults, which can cause livelocks, and
multiple millisecond sleeps, which visibly degrade user experience.
Oven's testing shows that a single waitqueue resolves the UI
stuttering issue. If a 'thundering herd' problem becomes apparent
later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]`
for page bit locks can be introduced.
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Cc: Kairui Song <kasong(a)tencent.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Reported-by: Oven Liyang <liyangouwen1(a)oppo.com>
Tested-by: Oven Liyang <liyangouwen1(a)oppo.com>
Signed-off-by: Barry Song <v-songbaohua(a)oppo.com>
---
mm/memory.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 2366578015ad..6913174f7f41 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
/*
* We enter with non-exclusive mmap_lock (to exclude vma changes,
* but allow concurrent faults), and pte mapped but not yet locked.
@@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;
struct folio *swapcache, *folio = NULL;
+ DECLARE_WAITQUEUE(wait, current);
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
@@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
* Relax a bit to prevent rapid
* repeated page faults.
*/
+ add_wait_queue(&swapcache_wq, &wait);
schedule_timeout_uninterruptible(1);
+ remove_wait_queue(&swapcache_wq, &wait);
goto out_page;
}
need_clear_cache = true;
@@ -4609,8 +4614,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
/* Clear the swap cache pin for direct swapin after PTL unlock */
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
@@ -4625,8 +4632,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
--
2.34.1
Make sure that the tag_list_lock mutex is no longer held than necessary.
This change reduces latency if e.g. blk_mq_quiesce_tagset() is called
concurrently from more than one thread. This function is used by the
NVMe core and also by the UFS driver.
Reported-by: Peter Wang <peter.wang(a)mediatek.com>
Cc: Chao Leng <lengchao(a)huawei.com>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: commit 414dd48e882c ("blk-mq: add tagset quiesce interface")
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
block/blk-mq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4b2c8e940f59..1ef227dfb9ba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -283,8 +283,9 @@ void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
if (!blk_queue_skip_tagset_quiesce(q))
blk_mq_quiesce_queue_nowait(q);
}
- blk_mq_wait_quiesce_done(set);
mutex_unlock(&set->tag_list_lock);
+
+ blk_mq_wait_quiesce_done(set);
}
EXPORT_SYMBOL_GPL(blk_mq_quiesce_tagset);
Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to
delayed arrival of the CXL "root" infrastructure [1] prompted questions
of how the existing mechanism for retrying cxl_mem_probe() could be
failing.
The critical missing piece in the debug was that Gregory's setup had
almost all CXL modules built-in to the kernel.
On the way to that discovery several other bugs and init-order corner
cases were discovered.
The main fix is to make sure the drivers/cxl/Makefile object order
supports root CXL ports being fully initialized upon cxl_acpi_probe()
exit. The modular case has some similar potential holes that are fixed
with MODULE_SOFTDEP() and other fix ups. Finally, an attempt to update
cxl_test to reproduce the original report resulted in the discovery of a
separate long standing use after free bug in cxl_region_detach().
[1]: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
---
Dan Williams (5):
cxl/port: Fix CXL port initialization order when the subsystem is built-in
cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
cxl/acpi: Ensure ports ready at cxl_acpi_probe() return
cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
cxl/test: Improve init-order fidelity relative to real-world systems
drivers/base/core.c | 35 +++++++
drivers/cxl/Kconfig | 1
drivers/cxl/Makefile | 12 +--
drivers/cxl/acpi.c | 7 +
drivers/cxl/core/hdm.c | 50 +++++++++--
drivers/cxl/core/port.c | 13 ++-
drivers/cxl/core/region.c | 48 +++-------
drivers/cxl/cxl.h | 3 -
include/linux/device.h | 3 +
tools/testing/cxl/test/cxl.c | 200 +++++++++++++++++++++++-------------------
tools/testing/cxl/test/mem.c | 1
11 files changed, 228 insertions(+), 145 deletions(-)
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
The quilt patch titled
Subject: nilfs2: fix kernel bug due to missing clearing of buffer delay flag
has been removed from the -mm tree. Its filename was
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-buffer-delay-flag.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix kernel bug due to missing clearing of buffer delay flag
Date: Wed, 16 Oct 2024 06:32:07 +0900
Syzbot reported that after nilfs2 reads a corrupted file system image and
degrades to read-only, the BUG_ON check for the buffer delay flag in
submit_bh_wbc() may fail, causing a kernel bug.
This is because the buffer delay flag is not cleared when clearing the
buffer state flags to discard a page/folio or a buffer head. So, fix
this.
This became necessary when the use of nilfs2's own page clear routine was
expanded. This state inconsistency does not occur if the buffer is
written normally by log writing.
Link: https://lkml.kernel.org/r/20241015213300.7114-1-konishi.ryusuke@gmail.com
Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+985ada84bf055a575c07(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985ada84bf055a575c07
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/page.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/nilfs2/page.c~nilfs2-fix-kernel-bug-due-to-missing-clearing-of-buffer-delay-flag
+++ a/fs/nilfs2/page.c
@@ -77,7 +77,8 @@ void nilfs_forget_buffer(struct buffer_h
const unsigned long clear_bits =
(BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) |
BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) |
- BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected));
+ BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) |
+ BIT(BH_Delay));
lock_buffer(bh);
set_mask_bits(&bh->b_state, clear_bits, 0);
@@ -406,7 +407,8 @@ void nilfs_clear_folio_dirty(struct foli
const unsigned long clear_bits =
(BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) |
BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) |
- BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected));
+ BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) |
+ BIT(BH_Delay));
bh = head;
do {
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-checked-flag.patch
The following commit has been merged into the locking/core branch of tip:
Commit-ID: d7fe143cb115076fed0126ad8cf5ba6c3e575e43
Gitweb: https://git.kernel.org/tip/d7fe143cb115076fed0126ad8cf5ba6c3e575e43
Author: Ahmed Ehab <bottaawesome633(a)gmail.com>
AuthorDate: Sun, 25 Aug 2024 01:10:30 +03:00
Committer: Boqun Feng <boqun.feng(a)gmail.com>
CommitterDate: Thu, 17 Oct 2024 20:07:23 -07:00
locking/lockdep: Avoid creating new name string literals in lockdep_set_subclass()
Syzbot reports a problem that a warning will be triggered while
searching a lock class in look_up_lock_class().
The cause of the issue is that a new name is created and used by
lockdep_set_subclass() instead of using the existing one. This results
in a lock instance has a different name pointer than previous registered
one stored in lock class, and WARN_ONCE() is triggered because of that
in look_up_lock_class().
To fix this, change lockdep_set_subclass() to use the existing name
instead of a new one. Hence, no new name will be created by
lockdep_set_subclass(). Hence, the warning is avoided.
[boqun: Reword the commit log to state the correct issue]
Reported-by: <syzbot+7f4a6f7f7051474e40ad(a)syzkaller.appspotmail.com>
Fixes: de8f5e4f2dc1f ("lockdep: Introduce wait-type checks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ahmed Ehab <bottaawesome633(a)gmail.com>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Link: https://lore.kernel.org/lkml/20240824221031.7751-1-bottaawesome633@gmail.co…
---
include/linux/lockdep.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 217f7ab..67964dc 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -173,7 +173,7 @@ static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
(lock)->dep_map.lock_type)
#define lockdep_set_subclass(lock, sub) \
- lockdep_init_map_type(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
+ lockdep_init_map_type(&(lock)->dep_map, (lock)->dep_map.name, (lock)->dep_map.key, sub,\
(lock)->dep_map.wait_type_inner, \
(lock)->dep_map.wait_type_outer, \
(lock)->dep_map.lock_type)
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed said checks are WARN_ON'ed instead, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check even though it's mindless
nonsense checking for guarantee that isn't given so drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 6e5324c7e9b6..7144c541818f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -144,13 +144,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -919,16 +917,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -938,7 +936,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed said checks are WARN_ON'ed instead, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check even though it's mindless
nonsense checking for guarantee that isn't given so drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 26f0b79cb4f9..8395e7ff7b94 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -142,13 +142,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -919,16 +917,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -938,7 +936,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed said checks are WARN_ON'ed instead, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check even though it's mindless
nonsense checking for guarantee that isn't given so drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 65d3ebc24fd3..a42c9b8b070d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -141,13 +141,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -927,16 +925,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -946,7 +944,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed said checks are WARN_ON'ed instead, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check even though it's mindless
nonsense checking for guarantee that isn't given so drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index f49b352a6032..7776209d98c1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -143,13 +143,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
error = -ENOEXEC;
@@ -925,23 +923,22 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
if (err)
goto exit;
-out:
return file;
exit:
--
2.34.1
This is the start of the stable review cycle for the 5.15.169 release.
There are 82 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 23 Oct 2024 10:22:25 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.169-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.169-rc1
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: propagate directory read errors from nilfs_find_entry()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: prevent MPC handshake on port-based signal endpoints
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: fallback when MPTCP opts are dropped after 1st data
Paolo Abeni <pabeni(a)redhat.com>
tcp: fix mptcp DSS corruption due to large pmtu xmit
Paolo Abeni <pabeni(a)redhat.com>
mptcp: handle consistently DSS corruption
Geliang Tang <geliang.tang(a)suse.com>
mptcp: track and update contiguous data status
Marc Zyngier <maz(a)kernel.org>
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
Sergey Matsievskiy <matsievskiysv(a)gmail.com>
pinctrl: ocelot: fix system hang on level based interrupts
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Clear CPU buffers after register restore in NMI return
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Do not clobber user EFLAGS.ZF
Zhang Rui <rui.zhang(a)intel.com>
x86/apic: Always explicitly disarm TSC-deadline timer
Nathan Chancellor <nathan(a)kernel.org>
x86/resctrl: Annotate get_mem_config() functions as __init
Takashi Iwai <tiwai(a)suse.de>
parport: Proper fix for array out-of-bounds access
Daniele Palmas <dnlplm(a)gmail.com>
USB: serial: option: add Telit FN920C04 MBIM compositions
Benjamin B. Frost <benjamin(a)geanix.com>
USB: serial: option: add support for Quectel EG916Q-GL
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Mitigate failed set dequeue pointer commands
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Fix incorrect stream context type macro
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Remove debugfs directory on module init failure
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Emil Gedenryd <emil.gedenryd(a)axis.com>
iio: light: opt3001: add missing full-scale range value
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix IIO device retrieval from embedded device
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix ALS sensor resolution
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
Nikolay Kuratov <kniv(a)yandex-team.ru>
drm/vmwgfx: Handle surface check failure correctly
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/radeon: Fix encoder->possible_clones
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: close race on waiting for sqring entries
Omar Sandoval <osandov(a)fb.com>
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Skip RSB fill at VMEXIT
Johannes Wikner <kwikner(a)ethz.ch>
x86/entry: Have entry_ibpb() invalidate return predictions
Johannes Wikner <kwikner(a)ethz.ch>
x86/cpufeatures: Add a IBPB_NO_RET BUG flag
Jim Mattson <jmattson(a)google.com>
x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
Michael Mueller <mimu(a)linux.ibm.com>
KVM: s390: Change virtual to physical address access in diag 0x258 handler
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
Felix Moessbauer <felix.moessbauer(a)siemens.com>
io_uring/sqpoll: do not put cpumask on stack
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: retain test for whether the CPU is valid
Felix Moessbauer <felix.moessbauer(a)siemens.com>
io_uring/sqpoll: do not allow pinning outside of cpuset
Wachowski, Karol <karol.wachowski(a)intel.com>
drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
Breno Leitao <leitao(a)debian.org>
KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
Mikulas Patocka <mpatocka(a)redhat.com>
dm-crypt, dm-verity: disable tasklets
Johannes Berg <johannes.berg(a)intel.com>
wifi: mac80211: fix potential key use-after-free
Patrick Roy <roypat(a)amazon.co.uk>
secretmem: disable memfd_secret() if arch cannot set direct map
Liu Shixin <liushixin2(a)huawei.com>
mm/swapfile: skip HugeTLB pages for unuse_vma
OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
fat: fix uninitialized variable
Nianyao Tang <tangnianyao(a)huawei.com>
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
Oleksij Rempel <linux(a)rempel-privat.de>
net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Fix simulate_ldr*_literal()
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Remove broken LDR (literal) uprobe support
Jinjie Ruan <ruanjinjie(a)huawei.com>
posix-clock: Fix missing timespec64 check in pc_clock_settime()
Wei Fang <wei.fang(a)nxp.com>
net: enetc: add missing static descriptor and inline keyword
Wei Fang <wei.fang(a)nxp.com>
net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
Jan Kara <jack(a)suse.cz>
udf: Fix bogus checksum computation in udf_rename()
Jan Kara <jack(a)suse.cz>
udf: Don't return bh from udf_expand_dir_adinicb()
Jan Kara <jack(a)suse.cz>
udf: Handle error when expanding directory
Jan Kara <jack(a)suse.cz>
udf: Remove old directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_link() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_mkdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_add_nondir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Implement adding of dir entries using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_unlink() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rmdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert empty_dir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_get_parent() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_lookup() to use new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_readdir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rename() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Provide function to mark entry as deleted using new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Implement searching for directory entry using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Move udf_expand_dir_adinicb() to its callsite
Jan Kara <jack(a)suse.cz>
udf: Convert udf_expand_dir_adinicb() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: New directory iteration code
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/kernel/probes/decode-insn.c | 16 +-
arch/arm64/kernel/probes/simulate-insn.c | 18 +-
arch/powerpc/mm/numa.c | 6 +-
arch/s390/kvm/diag.c | 2 +-
arch/x86/entry/entry.S | 5 +
arch/x86/entry/entry_32.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/kernel/apic/apic.c | 14 +-
arch/x86/kernel/cpu/bugs.c | 32 +
arch/x86/kernel/cpu/common.c | 3 +
arch/x86/kernel/cpu/resctrl/core.c | 4 +-
block/blk-rq-qos.c | 2 +-
drivers/bluetooth/btusb.c | 13 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +
drivers/gpu/drm/radeon/radeon_encoders.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 1 +
drivers/iio/adc/Kconfig | 4 +
.../iio/common/hid-sensors/hid-sensor-trigger.c | 2 +-
drivers/iio/dac/Kconfig | 3 +
drivers/iio/light/opt3001.c | 4 +
drivers/iio/light/veml6030.c | 5 +-
drivers/iio/proximity/Kconfig | 2 +
drivers/iommu/intel/iommu.c | 4 +-
drivers/irqchip/irq-gic-v3-its.c | 26 +-
drivers/md/dm-crypt.c | 37 +-
drivers/net/ethernet/cadence/macb_main.c | 14 +-
drivers/net/ethernet/freescale/enetc/enetc.c | 2 +-
drivers/parport/procfs.c | 22 +-
drivers/pinctrl/pinctrl-ocelot.c | 8 +-
drivers/s390/char/sclp_vt220.c | 4 +-
drivers/usb/host/xhci-ring.c | 2 +-
drivers/usb/host/xhci.h | 2 +-
drivers/usb/serial/option.c | 8 +
fs/fat/namei_vfat.c | 2 +-
fs/nilfs2/dir.c | 50 +-
fs/nilfs2/namei.c | 39 +-
fs/nilfs2/nilfs.h | 2 +-
fs/udf/dir.c | 148 +--
fs/udf/directory.c | 594 ++++++++---
fs/udf/inode.c | 90 --
fs/udf/namei.c | 1038 +++++++-------------
fs/udf/udfdecl.h | 45 +-
include/linux/fsl/enetc_mdio.h | 3 +-
include/linux/irqchip/arm-gic-v4.h | 4 +-
io_uring/io_uring.c | 21 +-
kernel/time/posix-clock.c | 3 +
mm/secretmem.c | 4 +-
mm/swapfile.c | 2 +-
net/bluetooth/af_bluetooth.c | 1 +
net/ipv4/tcp_output.c | 2 +-
net/mac80211/cfg.c | 3 +
net/mac80211/key.c | 2 +-
net/mptcp/mib.c | 3 +
net/mptcp/mib.h | 3 +
net/mptcp/pm_netlink.c | 3 +-
net/mptcp/protocol.c | 23 +-
net/mptcp/protocol.h | 2 +
net/mptcp/subflow.c | 19 +-
sound/pci/hda/patch_conexant.c | 19 +
virt/kvm/kvm_main.c | 5 +-
61 files changed, 1172 insertions(+), 1242 deletions(-)
From: Christoph Hellwig <hch(a)lst.de>
upstream 936e114a245b6e38e0dbf706a67e7611fc993da1 commit.
Move the ki_pos update down a bit to prepare for a better common helper
that invalidates pages based of an iocb.
Link: https://lkml.kernel.org/r/20230601145904.1385409-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Andreas Gruenbacher <agruenba(a)redhat.com>
Cc: Anna Schumaker <anna(a)kernel.org>
Cc: Chao Yu <chao(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Ilya Dryomov <idryomov(a)gmail.com>
Cc: Jaegeuk Kim <jaegeuk(a)kernel.org>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Miklos Szeredi <miklos(a)szeredi.hu>
Cc: Miklos Szeredi <mszeredi(a)redhat.com>
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Mahmoud Adam <mngyadam(a)amazon.com>
---
This fixes the ext3/4 data corruption casued by dde4c1e1663b6 ("ext4:
properly sync file size update after O_SYNC direct IO").
reported here: https://lore.kernel.org/all/2024102130-thieving-parchment-7885@gregkh/T/#mf…
fs/iomap/direct-io.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 933f234d5becd..8a49c0d3a7b46 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -93,7 +93,6 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
if (offset + ret > dio->i_size &&
!(dio->flags & IOMAP_DIO_WRITE))
ret = dio->i_size - offset;
- iocb->ki_pos += ret;
}
/*
@@ -119,15 +118,18 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
}
inode_dio_end(file_inode(iocb->ki_filp));
- /*
- * If this is a DSYNC write, make sure we push it to stable storage now
- * that we've written data.
- */
- if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
- ret = generic_write_sync(iocb, ret);
- kfree(dio);
+ if (ret > 0) {
+ iocb->ki_pos += ret;
+ /*
+ * If this is a DSYNC write, make sure we push it to stable
+ * storage now that we've written data.
+ */
+ if (dio->flags & IOMAP_DIO_NEED_SYNC)
+ ret = generic_write_sync(iocb, ret);
+ }
+ kfree(dio);
return ret;
}
EXPORT_SYMBOL_GPL(iomap_dio_complete);
--
2.40.1
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
remap_file_pages()") fixed a security issue, it added an LSM check when
trying to remap file pages, so that LSMs have the opportunity to evaluate
such action like for other memory operations such as mmap() and mprotect().
However, that commit called security_mmap_file() inside the mmap_lock lock,
while the other calls do it before taking the lock, after commit
8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
This caused lock inversion issue with IMA which was taking the mmap_lock
and i_mutex lock in the opposite way when the remap_file_pages() system
call was called.
Solve the issue by splitting the critical region in remap_file_pages() in
two regions: the first takes a read lock of mmap_lock, retrieves the VMA
and the file descriptor associated, and calculates the 'prot' and 'flags'
variables; the second takes a write lock on mmap_lock, checks that the VMA
flags and the VMA file descriptor are the same as the ones obtained in the
first critical region (otherwise the system call fails), and calls
do_mmap().
In between, after releasing the read lock and before taking the write lock,
call security_mmap_file(), and solve the lock inversion issue.
Cc: stable(a)vger.kernel.org # v6.12-rcx
Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
Reported-by: syzbot+1cd571a672400ef3a930(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.…
Reviewed-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Tested-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Tested-by: syzbot+1cd571a672400ef3a930(a)syzkaller.appspotmail.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
---
mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 52 insertions(+), 17 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 9c0fb43064b5..f731dd69e162 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
unsigned long populate = 0;
unsigned long ret = -EINVAL;
struct file *file;
+ vm_flags_t vm_flags;
pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
current->comm, current->pid);
@@ -1656,12 +1657,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
if (pgoff + (size >> PAGE_SHIFT) < pgoff)
return ret;
- if (mmap_write_lock_killable(mm))
+ if (mmap_read_lock_killable(mm))
return -EINTR;
+ /*
+ * Look up VMA under read lock first so we can perform the security
+ * without holding locks (which can be problematic). We reacquire a
+ * write lock later and check nothing changed underneath us.
+ */
vma = vma_lookup(mm, start);
- if (!vma || !(vma->vm_flags & VM_SHARED))
+ if (!vma || !(vma->vm_flags & VM_SHARED)) {
+ mmap_read_unlock(mm);
+ return -EINVAL;
+ }
+
+ prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
+ prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
+ prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
+
+ flags &= MAP_NONBLOCK;
+ flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
+ if (vma->vm_flags & VM_LOCKED)
+ flags |= MAP_LOCKED;
+
+ /* Save vm_flags used to calculate prot and flags, and recheck later. */
+ vm_flags = vma->vm_flags;
+ file = get_file(vma->vm_file);
+
+ mmap_read_unlock(mm);
+
+ /* Call outside mmap_lock to be consistent with other callers. */
+ ret = security_mmap_file(file, prot, flags);
+ if (ret) {
+ fput(file);
+ return ret;
+ }
+
+ ret = -EINVAL;
+
+ /* OK security check passed, take write lock + let it rip. */
+ if (mmap_write_lock_killable(mm)) {
+ fput(file);
+ return -EINTR;
+ }
+
+ vma = vma_lookup(mm, start);
+
+ if (!vma)
+ goto out;
+
+ /* Make sure things didn't change under us. */
+ if (vma->vm_flags != vm_flags)
+ goto out;
+ if (vma->vm_file != file)
goto out;
if (start + size > vma->vm_end) {
@@ -1689,25 +1738,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
goto out;
}
- prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
- prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
- prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
-
- flags &= MAP_NONBLOCK;
- flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
- if (vma->vm_flags & VM_LOCKED)
- flags |= MAP_LOCKED;
-
- file = get_file(vma->vm_file);
- ret = security_mmap_file(vma->vm_file, prot, flags);
- if (ret)
- goto out_fput;
ret = do_mmap(vma->vm_file, start, size,
prot, flags, 0, pgoff, &populate, NULL);
-out_fput:
- fput(file);
out:
mmap_write_unlock(mm);
+ fput(file);
if (populate)
mm_populate(ret, populate);
if (!IS_ERR_VALUE(ret))
--
2.34.1
During the aborting of a command, the software receives a command
completion event for the command ring stopped, with the TRB pointing
to the next TRB after the aborted command.
If the command we abort is located just before the Link TRB in the
command ring, then during the 'command ring stopped' completion event,
the xHC gives the Link TRB in the event's cmd DMA, which causes a
mismatch in handling command completion event.
To handle this situation, an additional check has been added to ignore
the mismatch error and continue the operation.
Fixes: 7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Faisal Hassan <quic_faisalh(a)quicinc.com>
---
Changes in v2:
- Removed traversing of TRBs with in_range() API.
- Simplified the if condition check.
v1 link:
https://lore.kernel.org/all/20241018195953.12315-1-quic_faisalh@quicinc.com
drivers/usb/host/xhci-ring.c | 43 +++++++++++++++++++++++++++++++-----
1 file changed, 38 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b2950c35c740..de375c9f08ca 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -126,6 +126,29 @@ static void inc_td_cnt(struct urb *urb)
urb_priv->num_tds_done++;
}
+/*
+ * Return true if the DMA is pointing to a Link TRB in the ring;
+ * otherwise, return false.
+ */
+static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
+{
+ struct xhci_segment *seg;
+ union xhci_trb *trb;
+
+ seg = ring->first_seg;
+ do {
+ if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE)) {
+ /* found the TRB, check if it's link */
+ trb = &seg->trbs[(dma - seg->dma) / sizeof(*trb)];
+ return trb_is_link(trb);
+ }
+
+ seg = seg->next;
+ } while (seg != ring->first_seg);
+
+ return false;
+}
+
static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
{
if (trb_is_link(trb)) {
@@ -1718,6 +1741,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic);
+ cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
cmd_dequeue_dma = xhci_trb_virt_to_dma(xhci->cmd_ring->deq_seg,
cmd_trb);
/*
@@ -1725,17 +1749,26 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
* command.
*/
if (!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) {
- xhci_warn(xhci,
- "ERROR mismatched command completion event\n");
- return;
+ /*
+ * For the 'command ring stopped' completion event, there
+ * is a risk of a mismatch in dequeue pointers if we abort
+ * the command just before the link TRB in the command ring.
+ * In this scenario, the cmd_dma in the event would point
+ * to a link TRB, while the software dequeue pointer circles
+ * back to the start.
+ */
+ if (!(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
+ is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
+ xhci_warn(xhci,
+ "ERROR mismatched command completion event\n");
+ return;
+ }
}
cmd = list_first_entry(&xhci->cmd_list, struct xhci_command, cmd_list);
cancel_delayed_work(&xhci->cmd_timer);
- cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
-
/* If CMD ring stopped we own the trbs between enqueue and dequeue */
if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
complete_all(&xhci->cmd_ring_stop_completion);
--
2.17.1
From: Johannes Berg <johannes.berg(a)intel.com>
When we free wdev->cqm_config when unregistering, we also
need to clear out the pointer since the same wdev/netdev
may get re-registered in another network namespace, then
destroyed later, running this code again, which results in
a double-free.
Reported-by: syzbot+36218cddfd84b5cc263e(a)syzkaller.appspotmail.com
Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
Cc: stable(a)vger.kernel.org # 6.6+
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
net/wireless/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 4c8d8f167409..d3c7b7978f00 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1280,6 +1280,7 @@ static void _cfg80211_unregister_wdev(struct wireless_dev *wdev,
/* deleted from the list, so can't be found from nl80211 any more */
cqm_config = rcu_access_pointer(wdev->cqm_config);
kfree_rcu(cqm_config, rcu_head);
+ RCU_INIT_POINTER(wdev->cqm_config, NULL);
/*
* Ensure that all events have been processed and
--
2.47.0
daddr can be NULL if there is no neighbour table entry present,
in that case the tx packet should be dropped.
saddr will normally be set by MCTP core, but in case it is NULL it
should be set to the device address.
Incorrect indent of the function arguments is also fixed.
Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Cc: stable(a)vger.kernel.org
Reported-by: Dung Cao <dung(a)os.amperecomputing.com>
Signed-off-by: Matt Johnston <matt(a)codeconstruct.com.au>
---
Changes in v2:
- Set saddr to device address if NULL, mention in commit message
- Fix patch prefix formatting
- Link to v1: https://lore.kernel.org/r/20241018-mctp-i2c-null-dest-v1-1-ba1ab52966e9@cod…
---
drivers/net/mctp/mctp-i2c.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/mctp/mctp-i2c.c b/drivers/net/mctp/mctp-i2c.c
index 4dc057c121f5d0fb9c9c48bf16b6933ae2f7b2ac..c909254e03c21518c17daf8b813e610558e074c1 100644
--- a/drivers/net/mctp/mctp-i2c.c
+++ b/drivers/net/mctp/mctp-i2c.c
@@ -579,7 +579,7 @@ static void mctp_i2c_flow_release(struct mctp_i2c_dev *midev)
static int mctp_i2c_header_create(struct sk_buff *skb, struct net_device *dev,
unsigned short type, const void *daddr,
- const void *saddr, unsigned int len)
+ const void *saddr, unsigned int len)
{
struct mctp_i2c_hdr *hdr;
struct mctp_hdr *mhdr;
@@ -588,8 +588,15 @@ static int mctp_i2c_header_create(struct sk_buff *skb, struct net_device *dev,
if (len > MCTP_I2C_MAXMTU)
return -EMSGSIZE;
- lldst = *((u8 *)daddr);
- llsrc = *((u8 *)saddr);
+ if (daddr)
+ lldst = *((u8 *)daddr);
+ else
+ return -EINVAL;
+
+ if (saddr)
+ llsrc = *((u8 *)saddr);
+ else
+ llsrc = dev->dev_addr;
skb_push(skb, sizeof(struct mctp_i2c_hdr));
skb_reset_mac_header(skb);
---
base-commit: cb560795c8c2ceca1d36a95f0d1b2eafc4074e37
change-id: 20241018-mctp-i2c-null-dest-a0ba271e0c48
Best regards,
--
Matt Johnston <matt(a)codeconstruct.com.au>
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2b0f922323ccfa76219bcaacd35cd50aeaa13592
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101837-mammogram-headsman-2dec@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b0f922323ccfa76219bcaacd35cd50aeaa13592 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Fri, 11 Oct 2024 12:24:45 +0200
Subject: [PATCH] mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma
We (or rather, readahead logic :) ) might be allocating a THP in the
pagecache and then try mapping it into a process that explicitly disabled
THP: we might end up installing PMD mappings.
This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
starting the VM.
For example, starting a VM backed on a file system with large folios
supported makes the VM crash when the VM tries accessing such a mapping
using KVM.
Is it also a problem when the HW disabled THP using
TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case
without X86_FEATURE_PSE.
In the future, we might be able to do better on s390x and only disallow
PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
really wants. For now, fix it by essentially performing the same check as
would be done in __thp_vma_allowable_orders() or in shmem code, where this
works as expected, and disallow PMD mappings, making us fallback to PTE
mappings.
Link: https://lkml.kernel.org/r/20241011102445.934409-3-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Cc: Thomas Huth <thuth(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c0869a962ddd..30feedabc932 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4920,6 +4920,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
pmd_t entry;
vm_fault_t ret = VM_FAULT_FALLBACK;
+ /*
+ * It is too late to allocate a small folio, we already have a large
+ * folio in the pagecache: especially s390 KVM cannot tolerate any
+ * PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any
+ * PMD mappings if THPs are disabled.
+ */
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags))
+ return ret;
+
if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
return ret;
The Voltorb device uses a speaker codec different from the original
Corsola device. When the Voltorb device tree was first added, the new
codec was added as a separate node when it should have just replaced the
existing one.
Merge the two nodes. The only differences are the compatible string and
the GPIO line property name. This keeps the device node path for the
speaker codec the same across the MT8186 Chromebook line. Also rename
the related labels and node names from having rt1019p to speaker codec.
Cc: <stable(a)vger.kernel.org> # v6.11+
Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org>
---
This is technically not a fix, but having the same device tree structure
in stable kernels would be more consistent for consumers of the device
tree. Hence the request for a stable backport.
Changes since v1:
- Dropped Fixes tag, since this is technically a cleanup, not a fix
- Rename existing rt1019p related node names and labels to the generic
"speaker codec" name
---
.../dts/mediatek/mt8186-corsola-voltorb.dtsi | 21 +++++--------------
.../boot/dts/mediatek/mt8186-corsola.dtsi | 8 +++----
2 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi b/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
index 52ec58128d56..b495a241b443 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
@@ -10,12 +10,6 @@
/ {
chassis-type = "laptop";
-
- max98360a: max98360a {
- compatible = "maxim,max98360a";
- sdmode-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
- #sound-dai-cells = <0>;
- };
};
&cpu6 {
@@ -59,19 +53,14 @@ &cluster1_opp_15 {
opp-hz = /bits/ 64 <2200000000>;
};
-&rt1019p{
- status = "disabled";
-};
-
&sound {
compatible = "mediatek,mt8186-mt6366-rt5682s-max98360-sound";
- status = "okay";
+};
- spk-hdmi-playback-dai-link {
- codec {
- sound-dai = <&it6505dptx>, <&max98360a>;
- };
- };
+&speaker_codec {
+ compatible = "maxim,max98360a";
+ sdmode-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
+ /delete-property/ sdb-gpios;
};
&spmi {
diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi b/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
index c7580ac1e2d4..cf288fe7a238 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
@@ -259,15 +259,15 @@ spk-hdmi-playback-dai-link {
mediatek,clk-provider = "cpu";
/* RT1019P and IT6505 connected to the same I2S line */
codec {
- sound-dai = <&it6505dptx>, <&rt1019p>;
+ sound-dai = <&it6505dptx>, <&speaker_codec>;
};
};
};
- rt1019p: speaker-codec {
+ speaker_codec: speaker-codec {
compatible = "realtek,rt1019p";
pinctrl-names = "default";
- pinctrl-0 = <&rt1019p_pins_default>;
+ pinctrl-0 = <&speaker_codec_pins_default>;
#sound-dai-cells = <0>;
sdb-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
};
@@ -1195,7 +1195,7 @@ pins {
};
};
- rt1019p_pins_default: rt1019p-default-pins {
+ speaker_codec_pins_default: speaker-codec-default-pins {
pins-sdb {
pinmux = <PINMUX_GPIO150__FUNC_GPIO150>;
output-low;
--
2.47.0.rc1.288.g06298d1525-goog
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 963756aac1f011d904ddd9548ae82286d3a91f96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101835-eloquent-could-27ce@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 963756aac1f011d904ddd9548ae82286d3a91f96 Mon Sep 17 00:00:00 2001
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Date: Fri, 11 Oct 2024 12:24:44 +0200
Subject: [PATCH] mm: huge_memory: add vma_thp_disabled() and
thp_disabled_by_hw()
Patch series "mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma".
During testing, it was found that we can get PMD mappings in processes
where THP (and more precisely, PMD mappings) are supposed to be disabled.
While it works as expected for anon+shmem, the pagecache is the
problematic bit.
For s390 KVM this currently means that a VM backed by a file located on
filesystem with large folio support can crash when KVM tries accessing the
problematic page, because the readahead logic might decide to use a
PMD-sized THP and faulting it into the page tables will install a PMD
mapping, something that s390 KVM cannot tolerate.
This might also be a problem with HW that does not support PMD mappings,
but I did not try reproducing it.
Fix it by respecting the ways to disable THPs when deciding whether we can
install a PMD mapping. khugepaged should already be taking care of not
collapsing if THPs are effectively disabled for the hw/process/vma.
This patch (of 2):
Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
shmem_allowable_huge_orders() and __thp_vma_allowable_orders().
[david(a)redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Boqiao Fu <bfu(a)redhat.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 67d0ab3c3bba..ef5b80e48599 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -322,6 +322,24 @@ struct thpsize {
(transparent_hugepage_flags & \
(1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
+static inline bool vma_thp_disabled(struct vm_area_struct *vma,
+ unsigned long vm_flags)
+{
+ /*
+ * Explicitly disabled through madvise or prctl, or some
+ * architectures may disable THP for some mappings, for
+ * example, s390 kvm.
+ */
+ return (vm_flags & VM_NOHUGEPAGE) ||
+ test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags);
+}
+
+static inline bool thp_disabled_by_hw(void)
+{
+ /* If the hardware/firmware marked hugepage support disabled. */
+ return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED);
+}
+
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87b49ecc7b1e..2fb328880b50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
if (!vma->vm_mm) /* vdso */
return 0;
- /*
- * Explicitly disabled through madvise or prctl, or some
- * architectures may disable THP for some mappings, for
- * example, s390 kvm.
- * */
- if ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
- return 0;
- /*
- * If the hardware/firmware marked hugepage support disabled.
- */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags))
return 0;
/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
diff --git a/mm/shmem.c b/mm/shmem.c
index 4f11b5506363..c5adb987b23c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
loff_t i_size;
int order;
- if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
- return 0;
-
- /* If the hardware/firmware marked hugepage support disabled. */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
return 0;
global_huge = shmem_huge_global_enabled(inode, index, write_end,
Hi,
This is a backport of [1] series for 6.6.x stable kernel.
It is fixing a user reported [2] NULL dereference kernel panic after
updating the SOF firmware and topology files.
While Meteor Lake is not supported by 6.6 kernel, users might need to be
able to use it as a jumping point to update the kernel to a version which
is supporting Meteor Lake (6.7+), a kernel panic should be avoided to
not block the transition.
[1] https://lore.kernel.org/alsa-devel/20230919103115.30783-1-peter.ujfalusi@li…
[2] https://github.com/thesofproject/sof/issues/9600
Regards,
Peter
---
Peter Ujfalusi (3):
ASoC: SOF: ipc4-topology: Add definition for generic switch/enum
control
ASoC: SOF: ipc4-control: Add support for ALSA switch control
ASoC: SOF: ipc4-control: Add support for ALSA enum control
sound/soc/sof/ipc4-control.c | 175 +++++++++++++++++++++++++++++++++-
sound/soc/sof/ipc4-topology.c | 49 +++++++++-
sound/soc/sof/ipc4-topology.h | 19 +++-
3 files changed, 237 insertions(+), 6 deletions(-)
--
2.47.0
[CCing Greg and the stable list, to ensure he is aware of this, as well
as the regressions list]
On 21.10.24 11:45, Pablo Neira Ayuso wrote:
> - There is no NFPROTO_IPV6 family for mark and NFLOG.
> - TRACE is also missing module autoload with NFPROTO_IPV6.
>
> This results in ip6tables failing to restore a ruleset. This issue has been
> reported by several users providing incomplete patches.
>
> Very similar to Ilya Katsnelson's patch including a missing chunk in the
> TRACE extension.
>
> Fixes: 0bfcb7b71e73 ("netfilter: xtables: avoid NFPROTO_UNSPEC where needed")
> [...]
Just FYI as the culprit recently hit various stable series (v6.11.4,
v6.6.57, v6.1.113, v5.15.168) quite a few reports came in that look like
issues that might be fixed by this to my untrained eyes. I suppose they
won't tell you anything new and maybe you even have seen them, but on
the off-chance that this might not be the case you can find them here:
https://bugzilla.kernel.org/show_bug.cgi?id=219397https://bugzilla.kernel.org/show_bug.cgi?id=219402https://bugzilla.kernel.org/show_bug.cgi?id=219409
Ciao, Thorsten
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102152-salvage-pursuable-3b7c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (hence forth called 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resovles these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
Hello, I wanted to check on the backport of the fix for CVE-2024-26800
on the 5.15 kernel.
Here is the commit fixing the issue:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
and as you can see the commit it says it fixes was backported to 5.15
to fix a different CVE but this one wasn't as far as I can tell.
Thank You,
Michael Kochera
> Jeongjun Park <aha310510(a)gmail.com> wrote:
>
>
>
>> Kalle Valo <kvalo(a)kernel.org> wrote:
>>
>> Jeongjun Park <aha310510(a)gmail.com> wrote:
>>
>>> I found the following bug in my fuzzer:
>>>
>>> UBSAN: array-index-out-of-bounds in drivers/net/wireless/ath/ath9k/htc_hst.c:26:51
>>> index 255 is out of range for type 'htc_endpoint [22]'
>>> CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.11.0-rc6-dirty #14
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>>> Workqueue: events request_firmware_work_func
>>> Call Trace:
>>> <TASK>
>>> dump_stack_lvl+0x180/0x1b0
>>> __ubsan_handle_out_of_bounds+0xd4/0x130
>>> htc_issue_send.constprop.0+0x20c/0x230
>>> ? _raw_spin_unlock_irqrestore+0x3c/0x70
>>> ath9k_wmi_cmd+0x41d/0x610
>>> ? mark_held_locks+0x9f/0xe0
>>> ...
>>>
>>> Since this bug has been confirmed to be caused by insufficient verification
>>> of conn_rsp_epid, I think it would be appropriate to add a range check for
>>> conn_rsp_epid to htc_connect_service() to prevent the bug from occurring.
>>>
>>> Fixes: fb9987d0f748 ("ath9k_htc: Support for AR9271 chipset.")
>>> Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
>>> Acked-by: Toke Høiland-Jørgensen <toke(a)toke.dk>
>>> Signed-off-by: Kalle Valo <quic_kvalo(a)quicinc.com>
>>
>> Patch applied to ath-next branch of ath.git, thanks.
>>
>
Cc: <stable(a)vger.kernel.org>
> I think this patch should be applied to the next rc version immediately
> to fix the oob vulnerability as soon as possible, and also to the
> stable version.
>
> Regards,
>
> Jeongjun Park
>
>> 8619593634cb wifi: ath9k: add range check for conn_rsp_epid in htc_connect_service()
>>
>> --
>> https://patchwork.kernel.org/project/linux-wireless/patch/20240909103855.68…
>>
>> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatc…
>> https://docs.kernel.org/process/submitting-patches.html
>>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102153-fiddling-unblended-6e63@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 3267cb6d3a174ff83d6287dcd5b0047bbd912452
Gitweb: https://git.kernel.org/tip/3267cb6d3a174ff83d6287dcd5b0047bbd912452
Author: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
AuthorDate: Tue, 23 Jan 2024 19:55:21 -08:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Mon, 21 Oct 2024 15:05:43 -07:00
x86/lam: Disable ADDRESS_MASKING in most cases
Linear Address Masking (LAM) has a weakness related to transient
execution as described in the SLAM paper[1]. Unless Linear Address
Space Separation (LASS) is enabled this weakness may be exploitable.
Until kernel adds support for LASS[2], only allow LAM for COMPILE_TEST,
or when speculation mitigations have been disabled at compile time,
otherwise keep LAM disabled.
There are no processors in market that support LAM yet, so currently
nobody is affected by this issue.
[1] SLAM: https://download.vusec.net/papers/slam_sp24.pdf
[2] LASS: https://lore.kernel.org/lkml/20230609183632.48706-1-alexander.shishkin@linu…
[ dhansen: update SPECULATION_MITIGATIONS -> CPU_MITIGATIONS ]
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta(a)intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/5373262886f2783f054256babdf5a98545dc986b.170606…
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2852fcd..16354df 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2257,6 +2257,7 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
config ADDRESS_MASKING
bool "Linear Address Masking support"
depends on X86_64
+ depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
help
Linear Address Masking (LAM) modifies the checking that is applied
to 64-bit linear addresses, allowing software to use of the
This is something that I've been thinking about for a while. We had a
discussion at LPC 2020 about this[1] but the proposals suggested there
never materialised.
In short, it is quite difficult for userspace to detect the feature
capability of syscalls at runtime. This is something a lot of programs
want to do, but they are forced to create elaborate scenarios to try to
figure out if a feature is supported without causing damage to the
system. For the vast majority of cases, each individual feature also
needs to be tested individually (because syscall results are
all-or-nothing), so testing even a single syscall's feature set can
easily inflate the startup time of programs.
This patchset implements the fairly minimal design I proposed in this
talk[2] and in some old LKML threads (though I can't find the exact
references ATM). The general flow looks like:
1. Userspace will indicate to the kernel that a syscall should a be
no-op by setting the top bit of the extensible struct size argument.
We will almost certainly never support exabyte sized structs, so the
top bits are free for us to use as makeshift flag bits. This is
preferable to using the per-syscall flag field inside the structure
because seccomp can easily detect the bit in the flag and allow the
probe or forcefully return -EEXTSYS_NOOP.
2. The kernel will then fill the provided structure with every valid
bit pattern that the current kernel understands.
For flags or other bitflag-like fields, this is the set of valid
flags or bits. For pointer fields or fields that take an arbitrary
value, the field has every bit set (0xFF... to fill the field) to
indicate that any value is valid in the field.
3. The syscall then returns -EEXTSYS_NOOP which is an errno that will
only ever be used for this purpose (so userspace can be sure that
the request succeeded).
On older kernels, the syscall will return a different error (usually
-E2BIG or -EFAULT) and userspace can do their old-fashioned checks.
4. Userspace can then check which flags and fields are supported by
looking at the fields in the returned structure. Flags are checked
by doing an AND with the flags field, and field support can checked
by comparing to 0. In principle you could just AND the entire
structure if you wanted to do this check generically without caring
about the structure contents (this is what libraries might consider
doing).
Userspace can even find out the internal kernel structure size by
passing a PAGE_SIZE buffer and seeing how many bytes are non-zero.
As with copy_struct_from_user(), this is designed to be forward- and
backwards- compatible.
This allows programas to get a one-shot understanding of what features a
syscall supports without having to do any elaborate setups or tricks to
detect support for destructive features. Flags can simply be ANDed to
check if they are in the supported set, and fields can just be checked
to see if they are non-zero.
This patchset is IMHO the simplest way we can add the ability to
introspect the feature set of extensible struct (copy_struct_from_user)
syscalls. It doesn't preclude the chance of a more generic mechanism
being added later.
The intended way of using this interface to get feature information
looks something like the following (imagine that openat2 has gained a
new field and a new flag in the future):
static bool openat2_no_automount_supported;
static bool openat2_cwd_fd_supported;
int check_openat2_support(void)
{
int err;
struct open_how how = {};
err = openat2(AT_FDCWD, ".", &how, CHECK_FIELDS | sizeof(how));
assert(err < 0);
switch (errno) {
case EFAULT: case E2BIG:
/* Old kernel... */
check_support_the_old_way();
break;
case EEXTSYS_NOOP:
openat2_no_automount_supported = (how.flags & RESOLVE_NO_AUTOMOUNT);
openat2_cwd_fd_supported = (how.cwd_fd != 0);
break;
}
}
This series adds CHECK_FIELDS support for the following extensible
struct syscalls, as they are quite likely to grow flags in the near
future:
* openat2
* clone3
* mount_setattr
[1]: https://lwn.net/Articles/830666/
[2]: https://youtu.be/ggD-eb3yPVs
Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com>
---
Changes in v3:
- Fix copy_struct_to_user() return values in case of clear_user() failure.
- v2: <https://lore.kernel.org/r/20240906-extensible-structs-check_fields-v2-0-0f4…>
Changes in v2:
- Add CHECK_FIELDS support to mount_setattr(2).
- Fix build failure on architectures with custom errno values.
- Rework selftests to use the tools/ uAPI headers rather than custom
defining EEXTSYS_NOOP.
- Make sure we return -EINVAL and -E2BIG for invalid sizes even if
CHECK_FIELDS is set, and add some tests for that.
- v1: <https://lore.kernel.org/r/20240902-extensible-structs-check_fields-v1-0-545…>
---
Aleksa Sarai (10):
uaccess: add copy_struct_to_user helper
sched_getattr: port to copy_struct_to_user
openat2: explicitly return -E2BIG for (usize > PAGE_SIZE)
openat2: add CHECK_FIELDS flag to usize argument
selftests: openat2: add 0xFF poisoned data after misaligned struct
selftests: openat2: add CHECK_FIELDS selftests
clone3: add CHECK_FIELDS flag to usize argument
selftests: clone3: add CHECK_FIELDS selftests
mount_setattr: add CHECK_FIELDS flag to usize argument
selftests: mount_setattr: add CHECK_FIELDS selftest
arch/alpha/include/uapi/asm/errno.h | 3 +
arch/mips/include/uapi/asm/errno.h | 3 +
arch/parisc/include/uapi/asm/errno.h | 3 +
arch/sparc/include/uapi/asm/errno.h | 3 +
fs/namespace.c | 17 ++
fs/open.c | 18 ++
include/linux/uaccess.h | 97 ++++++++
include/uapi/asm-generic/errno.h | 3 +
include/uapi/linux/openat2.h | 2 +
kernel/fork.c | 30 ++-
kernel/sched/syscalls.c | 42 +---
tools/arch/alpha/include/uapi/asm/errno.h | 3 +
tools/arch/mips/include/uapi/asm/errno.h | 3 +
tools/arch/parisc/include/uapi/asm/errno.h | 3 +
tools/arch/sparc/include/uapi/asm/errno.h | 3 +
tools/include/uapi/asm-generic/errno.h | 3 +
tools/include/uapi/asm-generic/posix_types.h | 101 ++++++++
tools/testing/selftests/clone3/.gitignore | 1 +
tools/testing/selftests/clone3/Makefile | 4 +-
.../testing/selftests/clone3/clone3_check_fields.c | 264 +++++++++++++++++++++
tools/testing/selftests/mount_setattr/Makefile | 2 +-
.../selftests/mount_setattr/mount_setattr_test.c | 53 ++++-
tools/testing/selftests/openat2/Makefile | 2 +
tools/testing/selftests/openat2/openat2_test.c | 165 ++++++++++++-
24 files changed, 777 insertions(+), 51 deletions(-)
---
base-commit: 98f7e32f20d28ec452afb208f9cffc08448a2652
change-id: 20240803-extensible-structs-check_fields-a47e94cef691
Best regards,
--
Aleksa Sarai <cyphar(a)cyphar.com>
The first patch simply adds the missing call to fwnode_handle_put() when
the node is no longer required to make it compatible with stable kernels
that don't support the cleanup attribute in its current form. The second
patch adds the __free() macro to make the code more robust and avoid
similar issues in the future.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Changes in v2:
- Add patch for the automatic cleanup facility.
- Link to v1: https://lore.kernel.org/r/20241019-typec-class-fwnode_handle_put-v1-1-a3b5a…
---
Javier Carrasco (2):
usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes()
usb: typec: use cleanup facility for 'altmodes_node'
drivers/usb/typec/class.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-typec-class-fwnode_handle_put-b3648f5bc51b
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102131-blissful-iodize-4056@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
Stuart Hayhurst has found that both at bootup and fullscreen VA-API video
is leading to black screens for around 1 second and kernel WARNING [1] traces
when calling dmub_psr_enable() with Parade 08-01 TCON.
These symptoms all go away with PSR-SU disabled for this TCON, so disable
it for now while DMUB traces [2] from the failure can be analyzed and the failure
state properly root caused.
Cc: stable(a)vger.kernel.org
Cc: Marc Rossi <Marc.Rossi(a)amd.com>
Cc: Hamza Mahfooz <Hamza.Mahfooz(a)amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/uploads/a832dd515b571ee171b3e3b566e9… [1]
Link: https://gitlab.freedesktop.org/drm/amd/uploads/8f13ff3b00963c833e23e68aa811… [2]
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2645
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
---
drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
index e304e8435fb8..477289846a0a 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -841,6 +841,8 @@ bool is_psr_su_specific_panel(struct dc_link *link)
isPSRSUSupported = false;
else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x03)
isPSRSUSupported = false;
+ else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x01)
+ isPSRSUSupported = false;
else if (dpcd_caps->psr_info.force_psrsu_cap == 0x1)
isPSRSUSupported = true;
}
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102130-saturday-bountiful-5087@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102128-omega-phosphate-db6c@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
Commit 9bf4e919ccad worked around an issue introduced after an innocuous
optimisation change in LLVM main:
> len is defined as an 'int' because it is assigned from
> '__user int *optlen'. However, it is clamped against the result of
> sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit
> platforms). This is done with min_t() because min() requires compatible
> types, which results in both len and the result of sizeof() being casted
> to 'unsigned int', meaning len changes signs and the result of sizeof()
> is truncated. From there, len is passed to copy_to_user(), which has a
> third parameter type of 'unsigned long', so it is widened and changes
> signs again. This excessive casting in combination with the KCSAN
> instrumentation causes LLVM to fail to eliminate the __bad_copy_from()
> call, failing the build.
The same issue occurs in rfcomm in functions rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old.
Change the type of len to size_t in both rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old and replace min_t() with min().
Cc: stable(a)vger.kernel.org
Co-authored-by: Aleksei Vetrov <vvvvvv(a)google.com>
Improves: 9bf4e919ccad ("Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old()")
Link: https://github.com/ClangBuiltLinux/linux/issues/2007
Link: https://github.com/llvm/llvm-project/issues/85647
Signed-off-by: Andrej Shadura <andrew.shadura(a)collabora.co.uk>
Reviewed-by: Nathan Chancellor <nathan(a)kernel.org>
---
net/bluetooth/rfcomm/sock.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 37d63d768afb..5f9d370e09b1 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -729,7 +729,8 @@ static int rfcomm_sock_getsockopt_old(struct socket *sock, int optname, char __u
struct sock *l2cap_sk;
struct l2cap_conn *conn;
struct rfcomm_conninfo cinfo;
- int len, err = 0;
+ int err = 0;
+ size_t len;
u32 opt;
BT_DBG("sk %p", sk);
@@ -783,7 +784,7 @@ static int rfcomm_sock_getsockopt_old(struct socket *sock, int optname, char __u
cinfo.hci_handle = conn->hcon->handle;
memcpy(cinfo.dev_class, conn->hcon->dev_class, 3);
- len = min_t(unsigned int, len, sizeof(cinfo));
+ len = min(len, sizeof(cinfo));
if (copy_to_user(optval, (char *) &cinfo, len))
err = -EFAULT;
@@ -802,7 +803,8 @@ static int rfcomm_sock_getsockopt(struct socket *sock, int level, int optname, c
{
struct sock *sk = sock->sk;
struct bt_security sec;
- int len, err = 0;
+ int err = 0;
+ size_t len;
BT_DBG("sk %p", sk);
@@ -827,7 +829,7 @@ static int rfcomm_sock_getsockopt(struct socket *sock, int level, int optname, c
sec.level = rfcomm_pi(sk)->sec_level;
sec.key_size = 0;
- len = min_t(unsigned int, len, sizeof(sec));
+ len = min(len, sizeof(sec));
if (copy_to_user(optval, (char *) &sec, len))
err = -EFAULT;
--
2.43.0
Read buffer is allocated according to max message size, reported by
the firmware and may reach 64K in systems with pxp client.
Contiguous 64k allocation may fail under memory pressure.
Read buffer is used as in-driver message storage and not required
to be contiguous.
Use kvmalloc to allow kernel to allocate non-contiguous memory.
Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
Reported-by: Rohit Agarwal <rohiagar(a)chromium.org>
Closes: https://lore.kernel.org/all/20240813084542.2921300-1-rohiagar@chromium.org/
Tested-by: Brian Geffon <bgeffon(a)google.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
---
Changes since V2:
- add Fixes and CC:stable
Changes since V1:
- add Tested-by and Reported-by
drivers/misc/mei/client.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mei/client.c b/drivers/misc/mei/client.c
index 9d090fa07516..be011cef12e5 100644
--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -321,7 +321,7 @@ void mei_io_cb_free(struct mei_cl_cb *cb)
return;
list_del(&cb->list);
- kfree(cb->buf.data);
+ kvfree(cb->buf.data);
kfree(cb->ext_hdr);
kfree(cb);
}
@@ -497,7 +497,7 @@ struct mei_cl_cb *mei_cl_alloc_cb(struct mei_cl *cl, size_t length,
if (length == 0)
return cb;
- cb->buf.data = kmalloc(roundup(length, MEI_SLOT_SIZE), GFP_KERNEL);
+ cb->buf.data = kvmalloc(roundup(length, MEI_SLOT_SIZE), GFP_KERNEL);
if (!cb->buf.data) {
mei_io_cb_free(cb);
return NULL;
--
2.43.0
The 'altmodes_node' fwnode_handle is never released after it is no
longer required, which leaks the resource.
Add the required call to fwnode_handle_put() when 'altmodes_node' is no
longer required.
Cc: stable(a)vger.kernel.org
Fixes: 7b458a4c5d73 ("usb: typec: Add typec_port_register_altmodes()")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
drivers/usb/typec/class.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
index d61b4c74648d..1eb240604cf6 100644
--- a/drivers/usb/typec/class.c
+++ b/drivers/usb/typec/class.c
@@ -2341,6 +2341,7 @@ void typec_port_register_altmodes(struct typec_port *port,
altmodes[index] = alt;
index++;
}
+ fwnode_handle_put(altmodes_node);
}
EXPORT_SYMBOL_GPL(typec_port_register_altmodes);
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-typec-class-fwnode_handle_put-b3648f5bc51b
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
From: Vimal Agrawal <vimal.agrawal(a)sophos.com>
misc_minor_alloc was allocating id using ida for minor only in case of
MISC_DYNAMIC_MINOR but misc_minor_free was always freeing ids
using ida_free causing a mismatch and following warn:
> > WARNING: CPU: 0 PID: 159 at lib/idr.c:525 ida_free+0x3e0/0x41f
> > ida_free called for id=127 which is not allocated.
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
...
> > [<60941eb4>] ida_free+0x3e0/0x41f
> > [<605ac993>] misc_minor_free+0x3e/0xbc
> > [<605acb82>] misc_deregister+0x171/0x1b3
misc_minor_alloc is changed to allocate id from ida for all minors
falling in the range of dynamic/ misc dynamic minors
Fixes: ab760791c0cf ("char: misc: Increase the maximum number of dynamic misc devices to 1048448")
Signed-off-by: Vimal Agrawal <vimal.agrawal(a)sophos.com>
Reviewed-by: Dirk VanDerMerwe <dirk.vandermerwe(a)sophos.com>
Cc: stable(a)vger.kernel.org
---
v2: Added Fixes:
Added missed case for static minor in misc_minor_alloc
v3: Removed kunit changes as that will be added as second patch in this two patch series
v4: Updated Signed-off-by: to match from:
v5: Used corporate id in from: and Signed-off-by:
drivers/char/misc.c | 39 ++++++++++++++++++++++++++++++---------
1 file changed, 30 insertions(+), 9 deletions(-)
diff --git a/drivers/char/misc.c b/drivers/char/misc.c
index 541edc26ec89..2cf595d2e10b 100644
--- a/drivers/char/misc.c
+++ b/drivers/char/misc.c
@@ -63,16 +63,30 @@ static DEFINE_MUTEX(misc_mtx);
#define DYNAMIC_MINORS 128 /* like dynamic majors */
static DEFINE_IDA(misc_minors_ida);
-static int misc_minor_alloc(void)
+static int misc_minor_alloc(int minor)
{
- int ret;
-
- ret = ida_alloc_max(&misc_minors_ida, DYNAMIC_MINORS - 1, GFP_KERNEL);
- if (ret >= 0) {
- ret = DYNAMIC_MINORS - ret - 1;
+ int ret = 0;
+
+ if (minor == MISC_DYNAMIC_MINOR) {
+ /* allocate free id */
+ ret = ida_alloc_max(&misc_minors_ida, DYNAMIC_MINORS - 1, GFP_KERNEL);
+ if (ret >= 0) {
+ ret = DYNAMIC_MINORS - ret - 1;
+ } else {
+ ret = ida_alloc_range(&misc_minors_ida, MISC_DYNAMIC_MINOR + 1,
+ MINORMASK, GFP_KERNEL);
+ }
} else {
- ret = ida_alloc_range(&misc_minors_ida, MISC_DYNAMIC_MINOR + 1,
- MINORMASK, GFP_KERNEL);
+ /* specific minor, check if it is in dynamic or misc dynamic range */
+ if (minor < DYNAMIC_MINORS) {
+ minor = DYNAMIC_MINORS - minor - 1;
+ ret = ida_alloc_range(&misc_minors_ida, minor, minor, GFP_KERNEL);
+ } else if (minor > MISC_DYNAMIC_MINOR) {
+ ret = ida_alloc_range(&misc_minors_ida, minor, minor, GFP_KERNEL);
+ } else {
+ /* case of non-dynamic minors, no need to allocate id */
+ ret = 0;
+ }
}
return ret;
}
@@ -219,7 +233,7 @@ int misc_register(struct miscdevice *misc)
mutex_lock(&misc_mtx);
if (is_dynamic) {
- int i = misc_minor_alloc();
+ int i = misc_minor_alloc(misc->minor);
if (i < 0) {
err = -EBUSY;
@@ -228,6 +242,7 @@ int misc_register(struct miscdevice *misc)
misc->minor = i;
} else {
struct miscdevice *c;
+ int i;
list_for_each_entry(c, &misc_list, list) {
if (c->minor == misc->minor) {
@@ -235,6 +250,12 @@ int misc_register(struct miscdevice *misc)
goto out;
}
}
+
+ i = misc_minor_alloc(misc->minor);
+ if (i < 0) {
+ err = -EBUSY;
+ goto out;
+ }
}
dev = MKDEV(MISC_MAJOR, misc->minor);
--
2.17.1
This series fixes the handling of an fwnode that is not released in all
error paths and uses the wrong function to release it (spotted by Dmitry
Baryshkov).
To: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
To: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
To: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
To: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
To: Caleb Connolly <caleb.connolly(a)linaro.org>
To: Guenter Roeck <linux(a)roeck-us.net>
Cc: linux-arm-msm(a)vger.kernel.org
Cc: linux-usb(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Changes in v2:
- add patch to use fwnode_handle_put() instead of
fwnode_remove_software-node().
- Link to v1: https://lore.kernel.org/r/20241019-qcom_pmic_typec-fwnode_remove-v1-1-88496…
---
Javier Carrasco (2):
usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes
usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path
drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-qcom_pmic_typec-fwnode_remove-00dc49054cf7
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
When splicing a 0-length bvec, the block layer may loop iterating and not
advancing the bvec, causing lockups or hangs.
Pavel Begunkov (1):
splice: don't generate zero-len segement bvecs
fs/splice.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--
2.34.1
If USB virtualizatoin is enabled, USB2 ports are shared between all
Virtual Functions. The USB2 port number owned by an USB2 root hub in
a Virtual Function may be less than total USB2 phy number supported
by the Tegra XUSB controller.
Using total USB2 phy number as port number to check all PORTSC values
would cause invalid memory access.
[ 116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f
...
[ 117.213640] Call trace:
[ 117.216783] tegra_xusb_enter_elpg+0x23c/0x658
[ 117.222021] tegra_xusb_runtime_suspend+0x40/0x68
[ 117.227260] pm_generic_runtime_suspend+0x30/0x50
[ 117.232847] __rpm_callback+0x84/0x3c0
[ 117.237038] rpm_suspend+0x2dc/0x740
[ 117.241229] pm_runtime_work+0xa0/0xb8
[ 117.245769] process_scheduled_works+0x24c/0x478
[ 117.251007] worker_thread+0x23c/0x328
[ 117.255547] kthread+0x104/0x1b0
[ 117.259389] ret_from_fork+0x10/0x20
[ 117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100)
Cc: <stable(a)vger.kernel.org> # v6.3+
Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls")
Signed-off-by: Henry Lin <henryl(a)nvidia.com>
---
V1 -> V2: Add Fixes tag and the cc stable line
V2 -> V3: Update commit message to clarify issue
V3 -> V4: Resend for patch changelogs that are missing in V3
drivers/usb/host/xhci-tegra.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
index 6246d5ad1468..76f228e7443c 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -2183,7 +2183,7 @@ static int tegra_xusb_enter_elpg(struct tegra_xusb *tegra, bool runtime)
goto out;
}
- for (i = 0; i < tegra->num_usb_phys; i++) {
+ for (i = 0; i < xhci->usb2_rhub.num_ports; i++) {
if (!xhci->usb2_rhub.ports[i])
continue;
portsc = readl(xhci->usb2_rhub.ports[i]->addr);
--
2.25.1
The LG Gram Pro 16 2-in-1 (2024) the 16T90SP has its keybopard IRQ (1)
described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
which breaks the keyboard.
Add the 16T90SP to the irq1_level_low_skip_override[] quirk table to fix
this.
Reported-by: Dirk Holten <dirk.holten(a)gmx.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219382
Cc: stable(a)vger.kernel.org
Suggested-by: Dirk Holten <dirk.holten(a)gmx.de>
Signed-off-by: Christian Heusel <christian(a)heusel.eu>
---
Note that I do not have the relevant hardware since I'm sending in this
quirk at the request of someone else.
---
Changes in v2:
- fix the double initialization warning reported by the kernel test
robot, which accidentially overwrote another quirk
- Link to v1: https://lore.kernel.org/r/20241016-lg-gram-pro-keyboard-v1-1-34306123102f@h…
---
drivers/acpi/resource.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index 129bceb1f4a27df93439bcefdb27fd9c91258028..7fe842dae1ec05ce6726af2ae4fcc8eff3698dcb 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -503,6 +503,13 @@ static const struct dmi_system_id irq1_level_low_skip_override[] = {
DMI_MATCH(DMI_BOARD_NAME, "17U70P"),
},
},
+ {
+ /* LG Electronics 16T90SP */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LG Electronics"),
+ DMI_MATCH(DMI_BOARD_NAME, "16T90SP"),
+ },
+ },
{ }
};
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241016-lg-gram-pro-keyboard-9a9d8b9aa647
Best regards,
--
Christian Heusel <christian(a)heusel.eu>
A few commits from Yu Zhao have been merged into 6.12.
They need to be backported to 6.11.
- c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
- 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
- e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")
---
Changes in v2:
- Add signed off tag
- Link to v1: https://lore.kernel.org/r/20241017-stable-yuzhao-v1-0-3a4566660d44@kernel.o…
---
Yu Zhao (3):
mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO
mm/codetag: fix pgalloc_tag_split()
mm/codetag: add pgalloc_tag_copy()
include/linux/alloc_tag.h | 24 ++++++++-----------
include/linux/mm.h | 57 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/pgalloc_tag.h | 31 ------------------------
mm/huge_memory.c | 2 +-
mm/hugetlb_vmemmap.c | 40 +++++++++++++++----------------
mm/migrate.c | 1 +
mm/page_alloc.c | 4 ++--
7 files changed, 91 insertions(+), 68 deletions(-)
---
base-commit: 8e24a758d14c0b1cd42ab0aea980a1030eea811f
change-id: 20241016-stable-yuzhao-7779910482e8
Best regards,
--
Chris Li <chrisl(a)kernel.org>
commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de upstream.
Currently EROFS can map another compressed buffer for inplace
decompression, that was used to handle the cases that some pages of
compressed data are actually not in-place I/O.
However, like most simple LZ77 algorithms, LZ4 expects the compressed
data is arranged at the end of the decompressed buffer and it
explicitly uses memmove() to handle overlapping:
__________________________________________________________
|_ direction of decompression --> ____ |_ compressed data _|
Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, it was hardly observed since LZ4 only uses memmove() for
short overlapped literals and x86/arm64 memmove implementations seem to
completely cover it up and they don't have this issue. Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".
Let's strictly use the decompressed buffer for lz4 inplace
decompression for now. Later, as an useful improvement, we could try
to tie up these two buffers together in the correct order.
Reported-and-tested-by: Juhyung Park <qkrwngud825(a)gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR…
Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable(a)vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan(a)sjtu.edu.cn>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.…
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
---
The remaining stable patch to address the issue "CVE-2023-52497" for
5.4.y, which is the same as the 5.10.y one [1].
[1] https://lore.kernel.org/r/20240224063248.2157885-1-hsiangkao@linux.alibaba.…
fs/erofs/decompressor.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 38eeec5e3032..d06a3b77fb39 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -24,7 +24,8 @@ struct z_erofs_decompressor {
*/
int (*prepare_destpages)(struct z_erofs_decompress_req *rq,
struct list_head *pagepool);
- int (*decompress)(struct z_erofs_decompress_req *rq, u8 *out);
+ int (*decompress)(struct z_erofs_decompress_req *rq, u8 *out,
+ u8 *obase);
char *name;
};
@@ -114,10 +115,13 @@ static void *generic_copy_inplace_data(struct z_erofs_decompress_req *rq,
return tmp;
}
-static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
+static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out,
+ u8 *obase)
{
+ const uint nrpages_out = PAGE_ALIGN(rq->pageofs_out +
+ rq->outputsize) >> PAGE_SHIFT;
unsigned int inputmargin, inlen;
- u8 *src;
+ u8 *src, *src2;
bool copied, support_0padding;
int ret;
@@ -125,6 +129,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
return -EOPNOTSUPP;
src = kmap_atomic(*rq->in);
+ src2 = src;
inputmargin = 0;
support_0padding = false;
@@ -148,16 +153,15 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
if (rq->inplace_io) {
const uint oend = (rq->pageofs_out +
rq->outputsize) & ~PAGE_MASK;
- const uint nr = PAGE_ALIGN(rq->pageofs_out +
- rq->outputsize) >> PAGE_SHIFT;
-
if (rq->partial_decoding || !support_0padding ||
- rq->out[nr - 1] != rq->in[0] ||
+ rq->out[nrpages_out - 1] != rq->in[0] ||
rq->inputsize - oend <
LZ4_DECOMPRESS_INPLACE_MARGIN(inlen)) {
src = generic_copy_inplace_data(rq, src, inputmargin);
inputmargin = 0;
copied = true;
+ } else {
+ src = obase + ((nrpages_out - 1) << PAGE_SHIFT);
}
}
@@ -178,7 +182,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
if (copied)
erofs_put_pcpubuf(src);
else
- kunmap_atomic(src);
+ kunmap_atomic(src2);
return ret;
}
@@ -248,7 +252,7 @@ static int z_erofs_decompress_generic(struct z_erofs_decompress_req *rq,
return PTR_ERR(dst);
rq->inplace_io = false;
- ret = alg->decompress(rq, dst);
+ ret = alg->decompress(rq, dst, NULL);
if (!ret)
copy_from_pcpubuf(rq->out, dst, rq->pageofs_out,
rq->outputsize);
@@ -282,7 +286,7 @@ static int z_erofs_decompress_generic(struct z_erofs_decompress_req *rq,
dst_maptype = 2;
dstmap_out:
- ret = alg->decompress(rq, dst + rq->pageofs_out);
+ ret = alg->decompress(rq, dst + rq->pageofs_out, dst);
if (!dst_maptype)
kunmap_atomic(dst);
--
2.43.5
Greg recently reported 2 patches that could not be applied without
conflicts in v5.10:
- e32d262c89e2 ("mptcp: handle consistently DSS corruption")
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
Conflicts have been resolved, and documented in each patch.
One extra commit has been backported, to support allow_infinite_fallback
which is used by one commit from the list above:
- 0530020a7c8f ("mptcp: track and update contiguous data status")
Geliang Tang (1):
mptcp: track and update contiguous data status
Paolo Abeni (2):
mptcp: handle consistently DSS corruption
tcp: fix mptcp DSS corruption due to large pmtu xmit
net/ipv4/tcp_output.c | 2 +-
net/mptcp/mib.c | 2 ++
net/mptcp/mib.h | 2 ++
net/mptcp/protocol.c | 26 ++++++++++++++++++++++----
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 3 ++-
6 files changed, 30 insertions(+), 6 deletions(-)
--
2.45.2
Greg recently reported 6 patches that could not be applied without
conflicts in v5.15:
- e32d262c89e2 ("mptcp: handle consistently DSS corruption")
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 119d51e225fe ("mptcp: fallback when MPTCP opts are dropped after 1st
data")
- 7decd1f5904a ("mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow")
- 3d041393ea8c ("mptcp: prevent MPC handshake on port-based signal
endpoints")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved for the 5 first ones, and documented in
each patch.
The last patch has not been backported: this is an extra test for the
selftests validating the previous commit, and there are a lot of
conflicts. That's fine not to backport this test, it is still possible
to use the selftests from a newer version and run them on this older
kernel.
One extra commit has been backported, to support allow_infinite_fallback
which is used by two commits from the list above:
- 0530020a7c8f ("mptcp: track and update contiguous data status")
Geliang Tang (1):
mptcp: track and update contiguous data status
Matthieu Baerts (NGI0) (2):
mptcp: fallback when MPTCP opts are dropped after 1st data
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Paolo Abeni (3):
mptcp: handle consistently DSS corruption
tcp: fix mptcp DSS corruption due to large pmtu xmit
mptcp: prevent MPC handshake on port-based signal endpoints
net/ipv4/tcp_output.c | 2 +-
net/mptcp/mib.c | 3 +++
net/mptcp/mib.h | 3 +++
net/mptcp/pm_netlink.c | 3 ++-
net/mptcp/protocol.c | 23 ++++++++++++++++++++---
net/mptcp/protocol.h | 2 ++
net/mptcp/subflow.c | 19 ++++++++++++++++---
7 files changed, 47 insertions(+), 8 deletions(-)
--
2.45.2
Greg recently reported 3 patches that could not be applied without
conflicts in v6.1:
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 3d041393ea8c ("mptcp: prevent MPC handshake on port-based signal
endpoints")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved for the two first ones, and documented in
each patch.
The last patch has not been backported: this is an extra test for the
selftests validating the previous commit, and there are a lot of
conflicts. That's fine not to backport this test, it is still possible
to use the selftests from a newer version and run them on this older
kernel.
Paolo Abeni (2):
tcp: fix mptcp DSS corruption due to large pmtu xmit
mptcp: prevent MPC handshake on port-based signal endpoints
net/ipv4/tcp_output.c | 4 +---
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/pm_netlink.c | 1 +
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 11 +++++++++++
6 files changed, 16 insertions(+), 3 deletions(-)
--
2.45.2
Greg recently reported 2 patches that could not be applied without
conflict in v6.6:
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved, and documented in each patch.
Note that there are two extra patches:
- 8c6f6b4bb53a ("selftests: mptcp: join: change capture/checksum as
bool"): to avoid some conflicts
- "selftests: mptcp: remove duplicated variables": a dedicated patch
for v6.6, to fix some previous backport issues.
Geliang Tang (1):
selftests: mptcp: join: change capture/checksum as bool
Matthieu Baerts (NGI0) (1):
selftests: mptcp: remove duplicated variables
Paolo Abeni (2):
tcp: fix mptcp DSS corruption due to large pmtu xmit
selftests: mptcp: join: test for prohibited MPC to port-based endp
net/ipv4/tcp_output.c | 4 +-
.../testing/selftests/net/mptcp/mptcp_join.sh | 135 ++++++++++++------
.../testing/selftests/net/mptcp/mptcp_lib.sh | 11 --
3 files changed, 96 insertions(+), 54 deletions(-)
--
2.45.2
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 23f5f5debcaac1399cfeacec215278bf6dbc1d11
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102142-pristine-mayday-9998@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 23f5f5debcaac1399cfeacec215278bf6dbc1d11 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Wed, 9 Oct 2024 16:51:04 +0200
Subject: [PATCH] serial: qcom-geni: fix shutdown race
A commit adding back the stopping of tx on port shutdown failed to add
back the locking which had also been removed by commit e83766334f96
("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART
shutdown").
Holding the port lock is needed to serialise against the console code,
which may update the interrupt enable register and access the port
state.
Fixes: d8aca2f96813 ("tty: serial: qcom-geni-serial: stop operations in progress at shutdown")
Fixes: 947cc4ecc06c ("serial: qcom-geni: fix soft lockup on sw flow control and suspend")
Cc: stable(a)vger.kernel.org # 6.3
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Link: https://lore.kernel.org/r/20241009145110.16847-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 2e4a5361f137..87cd974b76bf 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1114,10 +1114,12 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport)
{
disable_irq(uport->irq);
+ uart_port_lock_irq(uport);
qcom_geni_serial_stop_tx(uport);
qcom_geni_serial_stop_rx(uport);
qcom_geni_serial_cancel_tx_cmd(uport);
+ uart_port_unlock_irq(uport);
}
static void qcom_geni_serial_flush_buffer(struct uart_port *uport)
This problem reported by Clement LE GOFFIC manifest when
using CONFIG_KASAN_IN_VMALLOC and VMAP_STACK:
https://lore.kernel.org/linux-arm-kernel/a1a1d062-f3a2-4d05-9836-3b098de9db…
After some analysis it seems we are missing to sync the
VMALLOC shadow memory in top level PGD to all CPUs.
Add some code to perform this sync, and the bug appears
to go away.
As suggested by Ard, also perform a dummy read from the
shadow memory of the new VMAP_STACK in the low level
assembly.
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
Changes in v2:
- Implement the two helper functions suggested by Russell
making the KASAN PGD copying less messy.
- Link to v1: https://lore.kernel.org/r/20241015-arm-kasan-vmalloc-crash-v1-0-dbb23592ca8…
---
Linus Walleij (2):
ARM: ioremap: Sync PGDs for VMALLOC shadow
ARM: entry: Do a dummy read from VMAP shadow
arch/arm/kernel/entry-armv.S | 8 ++++++++
arch/arm/mm/ioremap.c | 25 +++++++++++++++++++++----
2 files changed, 29 insertions(+), 4 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241015-arm-kasan-vmalloc-crash-fcbd51416457
Best regards,
--
Linus Walleij <linus.walleij(a)linaro.org>
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102101-glacial-outsell-b7f5@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102100-unselect-chevron-960a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102158-vanquish-ignition-83d0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102157-tactical-darkened-7877@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102155-anemia-fructose-ab64@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occur:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102151-shove-lucid-37a2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (hence forth called 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resovles these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102150-sneer-daughter-91eb@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (hence forth called 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resovles these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102149-useable-steep-0218@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (hence forth called 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resovles these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 9499327714de7bc5cf6c792112c1474932d8ad31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102111-sandstone-affected-1fd4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9499327714de7bc5cf6c792112c1474932d8ad31 Mon Sep 17 00:00:00 2001
From: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Date: Sun, 6 Oct 2024 19:26:31 -0400
Subject: [PATCH] usb: gadget: f_uac2: fix return value for
UAC2_ATTRIBUTE_STRING store
The configfs store callback should return the number of bytes consumed
not the total number of bytes we actually stored. These could differ if
for example the passed in string had a newline we did not store.
If the returned value does not match the number of bytes written the
writer might assume a failure or keep trying to write the remaining bytes.
For example the following command will hang trying to write the final
newline over and over again (tested on bash 2.05b):
echo foo > function_name
Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c
index 1cdda44455b3..ce5b77f89190 100644
--- a/drivers/usb/gadget/function/f_uac2.c
+++ b/drivers/usb/gadget/function/f_uac2.c
@@ -2061,7 +2061,7 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
const char *page, size_t len) \
{ \
struct f_uac2_opts *opts = to_f_uac2_opts(item); \
- int ret = 0; \
+ int ret = len; \
\
mutex_lock(&opts->lock); \
if (opts->refcnt) { \
@@ -2072,8 +2072,8 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
if (len && page[len - 1] == '\n') \
len--; \
\
- ret = scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
- "%s", page); \
+ scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
+ "%s", page); \
\
end: \
mutex_unlock(&opts->lock); \
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 9499327714de7bc5cf6c792112c1474932d8ad31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102111-unbalance-roman-5bdd@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9499327714de7bc5cf6c792112c1474932d8ad31 Mon Sep 17 00:00:00 2001
From: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Date: Sun, 6 Oct 2024 19:26:31 -0400
Subject: [PATCH] usb: gadget: f_uac2: fix return value for
UAC2_ATTRIBUTE_STRING store
The configfs store callback should return the number of bytes consumed
not the total number of bytes we actually stored. These could differ if
for example the passed in string had a newline we did not store.
If the returned value does not match the number of bytes written the
writer might assume a failure or keep trying to write the remaining bytes.
For example the following command will hang trying to write the final
newline over and over again (tested on bash 2.05b):
echo foo > function_name
Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c
index 1cdda44455b3..ce5b77f89190 100644
--- a/drivers/usb/gadget/function/f_uac2.c
+++ b/drivers/usb/gadget/function/f_uac2.c
@@ -2061,7 +2061,7 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
const char *page, size_t len) \
{ \
struct f_uac2_opts *opts = to_f_uac2_opts(item); \
- int ret = 0; \
+ int ret = len; \
\
mutex_lock(&opts->lock); \
if (opts->refcnt) { \
@@ -2072,8 +2072,8 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
if (len && page[len - 1] == '\n') \
len--; \
\
- ret = scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
- "%s", page); \
+ scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
+ "%s", page); \
\
end: \
mutex_unlock(&opts->lock); \
When BPF_TRAMP_F_CALL_ORIG is enabled, the address of a bpf_tramp_image
struct on the stack is passed during the size calculation pass and
an address on the heap is passed during code generation. This may
cause a heap buffer overflow if the heap address is tagged because
emit_a64_mov_i64() will emit longer code than it did during the size
calculation pass. The same problem could occur without tag-based
KASAN if one of the 16-bit words of the stack address happened to
be all-ones during the size calculation pass. Fix the problem by
assuming the worst case (4 instructions) when calculating the size
of the bpf_tramp_image address emission.
Fixes: 19d3c179a377 ("bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Link: https://linux-review.googlesource.com/id/I1496f2bc24fba7a1d492e16e2b94cf437…
Cc: stable(a)vger.kernel.org
---
arch/arm64/net/bpf_jit_comp.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 8bbd0b20136a8..5db82bfc9dc11 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2220,7 +2220,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
emit(A64_STR64I(A64_R(20), A64_SP, regs_off + 8), ctx);
if (flags & BPF_TRAMP_F_CALL_ORIG) {
- emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
+ /* for the first pass, assume the worst case */
+ if (!ctx->image)
+ ctx->idx += 4;
+ else
+ emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_enter, ctx);
}
@@ -2264,7 +2268,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
if (flags & BPF_TRAMP_F_CALL_ORIG) {
im->ip_epilogue = ctx->ro_image + ctx->idx;
- emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
+ /* for the first pass, assume the worst case */
+ if (!ctx->image)
+ ctx->idx += 4;
+ else
+ emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_exit, ctx);
}
--
2.47.0.rc1.288.g06298d1525-goog
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102146-theme-encircle-9ead@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfer was lost as xhci-dbgtty then turned
everyting in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102144-cinch-foster-09d5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfer was lost as xhci-dbgtty then turned
everyting in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102143-chevy-extras-add3@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfer was lost as xhci-dbgtty then turned
everyting in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102142-eskimo-lumber-a654@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfer was lost as xhci-dbgtty then turned
everyting in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x fe49df60cdb7c2975aa743dc295f8786e4b7db10
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102120-valid-uncured-dcca@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe49df60cdb7c2975aa743dc295f8786e4b7db10 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 16:59:58 +0300
Subject: [PATCH] xhci: Mitigate failed set dequeue pointer commands
Avoid xHC host from processing a cancelled URB by always turning
cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command.
If the command fails then xHC will start processing the cancelled TD
instead of skipping it once endpoint is restarted, causing issues like
Babble error.
This is not a complete solution as a failed 'Set TR Deq' command does not
guarantee xHC TRB caches are cleared.
Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 4d664ba53fe9..7dedf31bbddd 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1023,7 +1023,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
td_to_noop(xhci, ring, cached_td, false);
cached_td->cancel_status = TD_CLEARED;
}
-
+ td_to_noop(xhci, ring, td, false);
td->cancel_status = TD_CLEARING_CACHE;
cached_td = td;
break;
Hi Greg! Please consider picking up the following two bluetooth fixes
for the next round of stable updates, they fix problems quite a few
users hit in various stable series due to backports:
4084286151fc91 ("Bluetooth: btusb: Fix not being able to reconnect after
suspend") [v6.12-rc4] for 6.11.y
and
2c1dda2acc4192 ("Bluetooth: btusb: Fix regression with fake CSR
controllers 0a12:0001") [v6.12-rc4] for 5.10.y and later
For details see also:
https://lore.kernel.org/all/CABBYNZL0_j4EDWzDS=kXc1Vy0D6ToU+oYnP_uBWTKoXbEa…
tia!
Ciao, Thorsten
From: Eric Biggers <ebiggers(a)google.com>
Fix the kconfig option for the tas2781 HDA driver to select CRC32 rather
than CRC32_SARWATE. CRC32_SARWATE is an option from the kconfig
'choice' that selects the specific CRC32 implementation. Selecting a
'choice' option seems to have no effect, but even if it did work, it
would be incorrect for a random driver to override the user's choice.
CRC32 is the correct option to select for crc32() to be available.
Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
sound/pci/hda/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/hda/Kconfig b/sound/pci/hda/Kconfig
index bb15a0248250c..68f1eee9e5c93 100644
--- a/sound/pci/hda/Kconfig
+++ b/sound/pci/hda/Kconfig
@@ -196,11 +196,11 @@ config SND_HDA_SCODEC_TAS2781_I2C
depends on ACPI
depends on EFI
depends on SND_SOC
select SND_SOC_TAS2781_COMLIB
select SND_SOC_TAS2781_FMWLIB
- select CRC32_SARWATE
+ select CRC32
help
Say Y or M here to include TAS2781 I2C HD-audio side codec support
in snd-hda-intel driver, such as ALC287.
comment "Set to Y if you want auto-loading the side codec driver"
base-commit: 715ca9dd687f89ddaac8ec8ccb3b5e5a30311a99
--
2.47.0