The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070420-fraction-such-da8d@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Date: Wed, 21 Jun 2023 14:24:03 -0700
Subject: [PATCH] hugetlb: revert use of page_cache_next_miss()
Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
Closes tag. The issue showed up after the conversion of hugetlb page
cache lookup code to use page_cache_next_miss. User visible effects are:
- hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
in the file.
- hugetlb pages will not be included in core dumps if they need to be
brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
cache. It may try to allocate a new page and potentially return
ENOMEM as opposed to EEXIST.
Revert the use page_cache_next_miss() in hugetlb code.
IMPORTANT NOTE FOR STABLE BACKPORTS:
This patch will apply cleanly to v6.3. However, due to the change of
filemap_get_folio() return values, it will not function correctly. This
patch must be modified for stable backports.
[dan.carpenter(a)linaro.org: fix hugetlbfs_pagecache_present()]
Link: https://lkml.kernel.org/r/efa86091-6a2c-4064-8f55-9b44e1313015@moroto.mount…
Link: https://lkml.kernel.org/r/20230621212403.174710-2-mike.kravetz@oracle.com
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Reported-by: Ackerley Tng <ackerleytng(a)google.com>
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Reviewed-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Erdem Aktas <erdemaktas(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Vishal Annapurve <vannapurve(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 90361a922cec..7b17ccfa039d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;
cond_resched();
@@ -842,10 +841,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (!IS_ERR(folio)) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
continue;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d76574425da3..bce28cca73a1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5728,13 +5728,13 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;
- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (IS_ERR(folio))
+ return false;
+ folio_put(folio);
+ return true;
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
This is the start of the stable review cycle for the 5.15.120 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.120-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.120-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Krister Johansen <kjlx(a)templeofstupid.com>
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Mike Hommey <mh(a)glandium.org>
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Jason Gerecke <jason.gerecke(a)wacom.com>
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Krister Johansen <kjlx(a)templeofstupid.com>
bpf: ensure main program has an extable
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Use dedicated cache-line for mwait_play_dead()
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Load late on both threads too
Philip Yang <Philip.Yang(a)amd.com>
drm/amdgpu: Set vmbo destroy after pt bo is created
Jane Chu <jane.chu(a)oracle.com>
mm, hwpoison: when copy-on-write hits poison, take page offline
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: try to recover from copy-on write faults
Paolo Abeni <pabeni(a)redhat.com>
mptcp: consolidate fallback and non fallback state machine
Paolo Abeni <pabeni(a)redhat.com>
mptcp: fix possible divide by zero in recvmsg()
-------------
Diffstat:
Makefile | 4 +--
arch/x86/kernel/cpu/microcode/amd.c | 2 +-
arch/x86/kernel/smpboot.c | 24 +++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/hid/hid-logitech-hidpp.c | 2 +-
drivers/hid/wacom_wac.c | 6 ++--
drivers/hid/wacom_wac.h | 2 +-
drivers/thermal/mtk_thermal.c | 14 ++-------
include/linux/highmem.h | 24 ++++++++++++++++
include/linux/mm.h | 5 +++-
kernel/bpf/verifier.c | 7 +++--
mm/memory.c | 33 ++++++++++++++-------
net/can/isotp.c | 5 ++--
net/mptcp/protocol.c | 46 ++++++++++++++----------------
net/mptcp/subflow.c | 17 ++++++-----
scripts/tags.sh | 9 +++++-
tools/perf/util/symbol.c | 17 +++++++++--
18 files changed, 142 insertions(+), 80 deletions(-)
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("Linux 6.4")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 6.4
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Testing by Christian also exposed another aspect of the old
bug fixed in commits fc3d092c6b ("block: fix signed int
overflow in Amiga partition support") and b6f3f28f60
("block: add overflow checks for Amiga partition support"):
Partitons that did overflow the disk size (due to 32 bit int
overflow) were not skipped but truncated to the end of the
disk. Users who missed the warning message during boot would
go on to create a filesystem with a size exceeding the
actual partition size. Now that the 32 bit overflow has been
corrected, such filesystems may refuse to mount with a
'filesystem exceeds partition size' error. Users should
either correct the partition size, or resize the filesystem
before attempting to boot a kernel with the RDB fixes in
place.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("block: add overflow checks for Amiga partition support")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 6.4
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
Tested-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
--
Changes since v1:
- corrected Fixes: tag
- added Tested-by:
- reworded commit message to describe filesystem partition
size mismatch problem
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
This is the start of the stable review cycle for the 6.1.38 release.
There are 11 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.38-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.38-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Krister Johansen <kjlx(a)templeofstupid.com>
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Do not update DRR while BW optimizations pending
Alvin Lee <Alvin.Lee2(a)amd.com>
drm/amd/display: Remove optimization for VRR updates
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/gpu/drm/amd/display/dc/core/dc.c | 49 ++++++++++++++++-------------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++--------
include/linux/mm.h | 4 ++-
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/perf/util/symbol.c | 17 ++++++++--
10 files changed, 130 insertions(+), 46 deletions(-)
__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from setup_nmi_watchdog().
Fixes: e5553a6d0442 ("sparc64: Implement NMI watchdog on capable cpus.")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reported-by: Igor Zhbanov <izh1979(a)gmail.com>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: sparclinux(a)vger.kernel.org
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
---
v2: change From: Igor to Reported-by:
add more Cc's
v3: use Igor's current email address
v4: add Arnd to Cc: list
arch/sparc/kernel/nmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -- a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -279,7 +279,7 @@ static int __init setup_nmi_watchdog(cha
if (!strncmp(str, "panic", 5))
panic_on_timeout = 1;
- return 0;
+ return 1;
}
__setup("nmi_watchdog=", setup_nmi_watchdog);
__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from vdso_setup().
Fixes: 9a08862a5d2e ("vDSO for sparc")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reported-by: Igor Zhbanov <izh1979(a)gmail.com>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: sparclinux(a)vger.kernel.org
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: Nick Alcock <nick.alcock(a)oracle.com>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
---
v2: correct the Fixes: tag (Dan Carpenter)
v3: add more Cc's;
correct Igor's email address;
change From: Igor to Reported-by: Igor;
v4: add Arnd to Cc: list
arch/sparc/vdso/vma.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff -- a/arch/sparc/vdso/vma.c b/arch/sparc/vdso/vma.c
--- a/arch/sparc/vdso/vma.c
+++ b/arch/sparc/vdso/vma.c
@@ -449,9 +449,8 @@ static __init int vdso_setup(char *s)
unsigned long val;
err = kstrtoul(s, 10, &val);
- if (err)
- return err;
- vdso_enabled = val;
- return 0;
+ if (!err)
+ vdso_enabled = val;
+ return 1;
}
__setup("vdso=", vdso_setup);
Hi Greg and Sasha,
On Sat, 10 Jun 2023 17:56:18 +0000 SeongJae Park <sj(a)kernel.org> wrote:
> On Sat, 10 Jun 2023 12:15:55 +0800 David Gow <davidgow(a)google.com> wrote:
>
> > [-- Attachment #1: Type: text/plain, Size: 2275 bytes --]
> >
> > On Sat, 10 Jun 2023 at 03:09, SeongJae Park <sj(a)kernel.org> wrote:
> > >
> > > Hi David and Brendan,
> > >
> > > On Tue, 2 May 2023 08:04:20 +0800 David Gow <davidgow(a)google.com> wrote:
> > >
> > > > [-- Attachment #1: Type: text/plain, Size: 1473 bytes --]
> > > >
> > > > On Tue, 2 May 2023 at 02:16, 'Daniel Latypov' via KUnit Development
> > > > <kunit-dev(a)googlegroups.com> wrote:
> > > > >
> > > > > Writing `subprocess.Popen[str]` requires python 3.9+.
> > > > > kunit.py has an assertion that the python version is 3.7+, so we should
> > > > > try to stay backwards compatible.
> > > > >
> > > > > This conflicts a bit with commit 1da2e6220e11 ("kunit: tool: fix
> > > > > pre-existing `mypy --strict` errors and update run_checks.py"), since
> > > > > mypy complains like so
> > > > > > kunit_kernel.py:95: error: Missing type parameters for generic type "Popen" [type-arg]
> > > > >
> > > > > Note: `mypy --strict --python-version 3.7` does not work.
> > > > >
> > > > > We could annotate each file with comments like
> > > > > `# mypy: disable-error-code="type-arg"
> > > > > but then we might still get nudged to break back-compat in other files.
> > > > >
> > > > > This patch adds a `mypy.ini` file since it seems like the only way to
> > > > > disable specific error codes for all our files.
> > > > >
> > > > > Note: run_checks.py doesn't need to specify `--config_file mypy.ini`,
> > > > > but I think being explicit is better, particularly since most kernel
> > > > > devs won't be familiar with how mypy works.
> > > > >
> > > > > Fixes: 695e26030858 ("kunit: tool: add subscripts for type annotations where appropriate")
> > > > > Reported-by: SeongJae Park <sj(a)kernel.org>
> > > > > Link: https://lore.kernel.org/linux-kselftest/20230501171520.138753-1-sj@kernel.o…
> > > > > Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
> > > > > ---
> > > >
> > > > Thanks for jumping on this.
> > > >
> > > > Looks good to me!
> > > >
> > > > Reviewed-by: David Gow <davidgow(a)google.com>
> > >
> > > Looks like this patch is still not merged in the mainline. May I ask the ETA,
> > > or any concern if you have?
> > >
> > >
> >
> > We've got this queued for 6.5 in the kselftest/kunit tree[1], so it
> > should land during the merge window. But I'll look into getting it
> > applied as a fix for 6.4, beforehand.
>
> Thank you for the kind answer, Gow! I was thinking this would be treated as a
> fix, and hence merged into the mainline before next merge window. I'm actually
> getting my personal test suite failures due to absence of this fix. It's not a
> critical problem, but it would definitely better for me if this could be merged
> into the mainline as early as possible.
This patch is now in the mainline (e30f65c4b3d671115bf2a9d9ef142285387f2aff).
However, this fix is not in 6.4.y yet, so the original issue is reproducible on
6.4.y. Could you please add this to 6.4.y? I confirmed the mainline commit
can cleanly applied on latest 6.1.y tree, and it fixes the issue.
Thanks,
SJ
>
>
> Thanks,
> SJ
>
> >
> > -- David
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/c…
83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled") broke BRLTTY's
ability to simulate keypresses on the console, thus effectively breaking
braille keyboards of blind users.
This restores the TIOCSTI feature for CAP_SYS_ADMIN processes, which
BRLTTY is, thus fixing braille keyboards without re-opening the security
issue.
Signed-off-by: Samuel Thibault <samuel.thibault(a)ens-lyon.org>
Fixes: 83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled")
Index: linux-6.4/drivers/tty/tty_io.c
===================================================================
--- linux-6.4.orig/drivers/tty/tty_io.c
+++ linux-6.4/drivers/tty/tty_io.c
@@ -2276,7 +2276,7 @@ static int tiocsti(struct tty_struct *tt
char ch, mbz = 0;
struct tty_ldisc *ld;
- if (!tty_legacy_tiocsti)
+ if (!tty_legacy_tiocsti && !capable(CAP_SYS_ADMIN))
return -EIO;
if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
This is the start of the stable review cycle for the 6.4.2 release.
There are 13 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.2-rc1.…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.4.2-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Thomas Weißschuh <linux(a)weissschuh.net>
tools/nolibc: x86_64: disable stack protector for _start
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/cxl/core/pci.c | 27 +++--------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 ++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/md/dm-ioctl.c | 33 +++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++----
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/include/nolibc/arch-x86_64.h | 2 +-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ----------
18 files changed, 128 insertions(+), 97 deletions(-)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070344-agenda-establish-d050@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070343-agenda-customs-7f89@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070342-blah-levitate-41a6@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070341-aflame-earwig-4540@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070340-dice-enjoyably-e417@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070339-squire-expend-6932@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
commit 1c249565426e3a9940102c0ba9f63914f7cda73d upstream.
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.
On this system, stext fell within the vmalloc segment. The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information. In this case, however, there were two: a very large
vmalloc segment, and the text segment. This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments. However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace. The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.
Instead, select the smallest matching segment that contains stext
instead of the first matching segment. This allows us to match the text
segment instead of vmalloc, if one is contained within the other.
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: David Reaver <me(a)davidreaver.com>
Cc: Ian Rogers <irogers(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Michael Petlan <mpetlan(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
---
tools/perf/util/symbol.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b1e5fd99e38a..80c54196e0e4 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1357,10 +1357,23 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
/* Find the kernel map using the '_stext' symbol */
if (!kallsyms__get_function_start(kallsyms_filename, "_stext", &stext)) {
+ u64 replacement_size = 0;
+
list_for_each_entry(new_map, &md.maps, node) {
- if (stext >= new_map->start && stext < new_map->end) {
+ u64 new_size = new_map->end - new_map->start;
+
+ if (!(stext >= new_map->start && stext < new_map->end))
+ continue;
+
+ /*
+ * On some architectures, ARM64 for example, the kernel
+ * text can get allocated inside of the vmalloc segment.
+ * Select the smallest matching segment, in case stext
+ * falls within more than one in the list.
+ */
+ if (!replacement_map || new_size < replacement_size) {
replacement_map = new_map;
- break;
+ replacement_size = new_size;
}
}
}
--
2.25.1
commit fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a upstream
Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
Closes tag. The issue showed up after the conversion of hugetlb page
cache lookup code to use page_cache_next_miss. User visible effects are:
- hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
in the file.
- hugetlb pages will not be included in core dumps if they need to be
brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
cache. It may try to allocate a new page and potentially return
ENOMEM as opposed to EEXIST.
Revert the use page_cache_next_miss() in hugetlb code.
The upstream fix[2] cannot be used used directly as the return value for
filemap_get_folio() has been changed between 6.3 and upstream.
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Cc: <stable(a)vger.kernel.org> #v6.3
Reported-by: Ackerley Tng <ackerleytng(a)google.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
[1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.co…
[2] https://lore.kernel.org/lkml/20230621230255.GD4155@monkey/
---
fs/hugetlbfs/inode.c | 8 +++-----
mm/hugetlb.c | 11 +++++------
2 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9062da6da5675..586767afb4cdb 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;
cond_resched();
@@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (folio) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
hugetlb_drop_vma_policy(&pseudo_vma);
continue;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 245038a9fe4ea..29ab27d2a3ef5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5666,13 +5666,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;
- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (folio)
+ folio_put(folio);
+ return folio != NULL;
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
--
2.40.1
split_if_spec expects a NULL-pointer as an end marker for the argument
list, but tuntap_probe never supplied that terminating NULL. As a result
incorrectly formatted interface specification string may cause a crash
because of the random memory access. Fix that by adding NULL terminator
to the split_if_spec argument list.
Cc: stable(a)vger.kernel.org
Fixes: 7282bee78798 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8")
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
Changes v1->v2:
- fix commit message wording and add cc: stable
arch/xtensa/platforms/iss/network.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
index 7b97e6ab85a4..85c82cd42188 100644
--- a/arch/xtensa/platforms/iss/network.c
+++ b/arch/xtensa/platforms/iss/network.c
@@ -237,7 +237,7 @@ static int tuntap_probe(struct iss_net_private *lp, int index, char *init)
init += sizeof(TRANSPORT_TUNTAP_NAME) - 1;
if (*init == ',') {
- rem = split_if_spec(init + 1, &mac_str, &dev_name);
+ rem = split_if_spec(init + 1, &mac_str, &dev_name, NULL);
if (rem != NULL) {
pr_err("%s: extra garbage on specification : '%s'\n",
dev->name, rem);
--
2.30.2
__register_btf_kfunc_id_set() assumes .BTF to be part of the module's
.ko file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case,
the function prints an error message and return an error. As a result,
such modules cannot be loaded.
However, the section could be stripped out during a build process. It
would be better to let the modules loaded, because their basic
functionalities have no problem[1], though the BTF functionalities will
not be supported. Make the function to lower the level of the message
from error to warn, and return no error.
[1] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Reported-by: Alexander Egorenkov <Alexander.Egorenkov(a)ibm.com>
Link: https://lore.kernel.org/bpf/87y228q66f.fsf@oc8242746057.ibm.com/
Suggested-by: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
Link: https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Fixes: c446fdacb10d ("bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF")
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
Changes from v2
(https://lore.kernel.org/bpf/20230628164611.83038-1-sj@kernel.org/)
- Keep the error for vmlinux case.
Changes from v1
(https://lore.kernel.org/all/20230626181120.7086-1-sj@kernel.org/)
- Fix Fixes: tag (Jiri Olsa)
- Add 'Acked-by: ' from Jiri Olsa
Notes
- This is a fix. Hence, would better to merged into bpf tree, not
bpf-next.
- This doesn't cleanly applied on 6.1.y. I will send the backport to
stable@ as soon as this is merged into the mainline.
kernel/bpf/btf.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 29fe21099298..817204d53372 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7891,10 +7891,8 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
pr_err("missing vmlinux BTF, cannot register kfuncs\n");
return -ENOENT;
}
- if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
- pr_err("missing module BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
+ if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
+ pr_warn("missing module BTF, cannot register kfuncs\n");
return 0;
}
if (IS_ERR(btf))
--
2.25.1
__register_btf_kfunc_id_set() assumes .BTF to be part of the module's
.ko file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case,
the function prints an error message and return an error. As a result,
such modules cannot be loaded.
However, the section could be stripped out during a build process. It
would be better to let the modules loaded, because their basic
functionalities have no problem[1], though the BTF functionalities will
not be supported. Make the function to lower the level of the message
from error to warn, and return no error.
[1] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Reported-by: Alexander Egorenkov <Alexander.Egorenkov(a)ibm.com>
Link: https://lore.kernel.org/bpf/87y228q66f.fsf@oc8242746057.ibm.com/
Suggested-by: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
Link: https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Fixes: c446fdacb10d ("bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF")
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
---
Changes from v1
(https://lore.kernel.org/all/20230626181120.7086-1-sj@kernel.org/)
- Fix Fixes: tag (Jiri Olsa)
- Add 'Acked-by: ' from Jiri Olsa
kernel/bpf/btf.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 6b682b8e4b50..d683f034996f 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7848,14 +7848,10 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
btf = btf_get_module_btf(kset->owner);
if (!btf) {
- if (!kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
- pr_err("missing vmlinux BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
- if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
- pr_err("missing module BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
+ if (!kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
+ pr_warn("missing vmlinux BTF, cannot register kfuncs\n");
+ if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
+ pr_warn("missing module BTF, cannot register kfuncs\n");
return 0;
}
if (IS_ERR(btf))
--
2.25.1
Make sure that the soundwire device used for register accesses has been
enumerated and initialised before trying to read the codec variant
during component probe.
This specifically avoids interpreting (a masked and shifted) -EBUSY
errno as the variant:
wcd938x_codec audio-codec: ASoC: error at soc_component_read_no_lock on audio-codec for register: [0x000034b0] -16
in case the soundwire device has not yet been initialised, which in turn
prevents some headphone controls from being registered.
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver")
Cc: stable(a)vger.kernel.org # 5.14
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Reported-by: Steev Klimaszewski <steev(a)kali.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
sound/soc/codecs/wcd938x.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
index e3ae4fb2c4db..4571588fad62 100644
--- a/sound/soc/codecs/wcd938x.c
+++ b/sound/soc/codecs/wcd938x.c
@@ -3080,9 +3080,18 @@ static int wcd938x_irq_init(struct wcd938x_priv *wcd, struct device *dev)
static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
{
struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
+ struct sdw_slave *tx_sdw_dev = wcd938x->tx_sdw_dev;
struct device *dev = component->dev;
+ unsigned long time_left;
int ret, i;
+ time_left = wait_for_completion_timeout(&tx_sdw_dev->initialization_complete,
+ msecs_to_jiffies(2000));
+ if (!time_left) {
+ dev_err(dev, "soundwire device init timeout\n");
+ return -ETIMEDOUT;
+ }
+
snd_soc_component_init_regmap(component, wcd938x->regmap);
ret = pm_runtime_resume_and_get(dev);
--
2.39.3
If nf_conntrack_init_start() fails (for example due to a
register_nf_conntrack_bpf() failure), the nf_conntrack_helper_fini()
clean-up path frees the nf_ct_helper_hash map.
When built with NF_CONNTRACK=y, further netfilter modules (e.g:
netfilter_conntrack_ftp) can still be loaded and call
nf_conntrack_helpers_register(), independently of whether nf_conntrack
initialized correctly. This accesses the nf_ct_helper_hash dangling
pointer and causes a uaf, possibly leading to random memory corruption.
This patch guards nf_conntrack_helper_register() from accessing a freed
or uninitialized nf_ct_helper_hash pointer and fixes possible
uses-after-free when loading a conntrack module.
Cc: stable(a)vger.kernel.org
Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Florent Revest <revest(a)chromium.org>
---
net/netfilter/nf_conntrack_helper.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 0c4db2f2ac43..f22691f83853 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -360,6 +360,9 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
+ if (!nf_ct_helper_hash)
+ return -ENOENT;
+
if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
return -EINVAL;
@@ -515,4 +518,5 @@ int nf_conntrack_helper_init(void)
void nf_conntrack_helper_fini(void)
{
kvfree(nf_ct_helper_hash);
+ nf_ct_helper_hash = NULL;
}
--
2.41.0.255.g8b1d071c50-goog
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c51…
Cc: <stable(a)vger.kernel.org> # 6.1
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Reported-by: Qun-wei Lin (林群崴) <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Steven Price <steven.price(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..fc25764016b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}
+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
/*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
--
2.40.1.698.g37aff9b760-goog
For some reason we ended up with a setup without this flag.
This resulted in inconsistent sound card devices numbers which
are also not starting as expected at dai_link->id.
(Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0
as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
sound/soc/qcom/qdsp6/q6apm-dai.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c
index 5eb0b864c740..c90db6daabbd 100644
--- a/sound/soc/qcom/qdsp6/q6apm-dai.c
+++ b/sound/soc/qcom/qdsp6/q6apm-dai.c
@@ -840,6 +840,7 @@ static const struct snd_soc_component_driver q6apm_fe_dai_component = {
.pointer = q6apm_dai_pointer,
.trigger = q6apm_dai_trigger,
.compress_ops = &q6apm_dai_compress_ops,
+ .use_dai_pcm_id = true,
};
static int q6apm_dai_probe(struct platform_device *pdev)
--
2.21.0
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f4a2c5845bcbd..6c11cf2260542 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
A switch from OSI to PC mode is only possible if all CPUs other than the
calling one are OFF, either through a call to CPU_OFF or not yet booted.
Currently OSI mode is enabled before power domains are created. In cases
where CPUidle states are not using hierarchical CPU topology the bail out
path tries to switch back to PC mode which gets denied by firmware since
other CPUs are online at this point and creates inconsistent state as
firmware is in OSI mode and Linux in PC mode.
This change moves enabling OSI mode after power domains are created,
this would makes sure that hierarchical CPU topology is used before
switching firmware to OSI mode.
Cc: stable(a)vger.kernel.org
Fixes: 70c179b49870 ("cpuidle: psci: Allow PM domain to be initialized even if no OSI mode")
Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com>
Reviewed-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
drivers/cpuidle/cpuidle-psci-domain.c | 39 +++++++++------------------
1 file changed, 13 insertions(+), 26 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index c2d6d9c3c930..b88af1262f1a 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -120,20 +120,6 @@ static void psci_pd_remove(void)
}
}
-static bool psci_pd_try_set_osi_mode(void)
-{
- int ret;
-
- if (!psci_has_osi_support())
- return false;
-
- ret = psci_set_osi_mode(true);
- if (ret)
- return false;
-
- return true;
-}
-
static void psci_cpuidle_domain_sync_state(struct device *dev)
{
/*
@@ -152,15 +138,12 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
{
struct device_node *np = pdev->dev.of_node;
struct device_node *node;
- bool use_osi;
+ bool use_osi = psci_has_osi_support();
int ret = 0, pd_count = 0;
if (!np)
return -ENODEV;
- /* If OSI mode is supported, let's try to enable it. */
- use_osi = psci_pd_try_set_osi_mode();
-
/*
* Parse child nodes for the "#power-domain-cells" property and
* initialize a genpd/genpd-of-provider pair when it's found.
@@ -170,33 +153,37 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
continue;
ret = psci_pd_init(node, use_osi);
- if (ret)
- goto put_node;
+ if (ret) {
+ of_node_put(node);
+ goto exit;
+ }
pd_count++;
}
/* Bail out if not using the hierarchical CPU topology. */
if (!pd_count)
- goto no_pd;
+ return 0;
/* Link genpd masters/subdomains to model the CPU topology. */
ret = dt_idle_pd_init_topology(np);
if (ret)
goto remove_pd;
+ /* let's try to enable OSI. */
+ ret = psci_set_osi_mode(use_osi);
+ if (ret)
+ goto remove_pd;
+
pr_info("Initialized CPU PM domain topology using %s mode\n",
use_osi ? "OSI" : "PC");
return 0;
-put_node:
- of_node_put(node);
remove_pd:
+ dt_idle_pd_remove_topology(np);
psci_pd_remove();
+exit:
pr_err("failed to create CPU PM domains ret=%d\n", ret);
-no_pd:
- if (use_osi)
- psci_set_osi_mode(false);
return ret;
}
--
2.17.1
Hi,
There are two issues that have been identified recently where the lack
of calling ACPI _REG when devices change D-states leads to functional
problems.
The first one is a case where some PCIe devices are not functional after
resuming from suspend (S3 or s0ix).
The second one is a case where as the kernel is initializing it gets
stuck in a busy loop waiting for AML that never returns.
In both cases this is fixed by cherry-picking these two commits:
5557b62634ab ("PCI/ACPI: Validate acpi_pci_set_power_state() parameter")
112a7f9c8edb ("PCI/ACPI: Call _REG when transitioning D-states")
Can you please backport these to 6.1.y and later?
Thanks,
Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes
to update addresses"), two additional passes are performed to avoid
space and CPU time wastage on powerpc. But these extra passes led to
WARN_ON_ONCE() hits in bpf_add_extable_entry() as extable entries are
populated again, during the extra pass, without resetting the index.
Fix it by resetting entry index before repopulating extable entries,
if and when there is an additional pass.
Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update addresses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hari Bathini <hbathini(a)linux.ibm.com>
---
arch/powerpc/net/bpf_jit_comp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index e93aefcfb83f..37043dfc1add 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -101,6 +101,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
bpf_hdr = jit_data->header;
proglen = jit_data->proglen;
extra_pass = true;
+ /* During extra pass, ensure index is reset before repopulating extable entries */
+ cgctx.exentry_idx = 0;
goto skip_init_ctx;
}
--
2.40.0
When building the boot wrapper assembly files with clang after
commit 648a1783fe25 ("powerpc/boot: Fix boot wrapper code generation
with CONFIG_POWER10_CPU"), the following warnings appear for each file
built:
'-prefixed' is not a recognized feature for this target (ignoring feature)
'-pcrel' is not a recognized feature for this target (ignoring feature)
While it is questionable whether or not LLVM should be emitting a
warning when passed negative versions of code generation flags when
building assembly files (since it does not emit a warning for the
altivec and vsx flags), it is easy enough to work around this by just
moving the disabled flags to BOOTCFLAGS after the assignment of
BOOTAFLAGS, so that they are not added when building assembly files.
Do so to silence the warnings.
Cc: stable(a)vger.kernel.org
Fixes: 648a1783fe25 ("powerpc/boot: Fix boot wrapper code generation with CONFIG_POWER10_CPU")
Link: https://github.com/ClangBuiltLinux/linux/issues/1839
Reviewed-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
I do not think that 648a1783fe25 is truly to blame for this but the
Fixes tag will help the stable team ensure that this change gets
backported with 648a1783fe25. This is the minimal fix for the problem
but the true fix is separating AFLAGS and CFLAGS, which should be done
by this in-flight series by Nick:
https://lore.kernel.org/20230426055848.402993-1-npiggin@gmail.com/
---
arch/powerpc/boot/Makefile | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 85cde5bf04b7..771b79423bbc 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -34,8 +34,6 @@ endif
BOOTCFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
- $(call cc-option,-mno-prefixed) $(call cc-option,-mno-pcrel) \
- $(call cc-option,-mno-mma) \
$(call cc-option,-mno-spe) $(call cc-option,-mspe=no) \
-pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
$(LINUXINCLUDE)
@@ -71,6 +69,10 @@ BOOTAFLAGS := -D__ASSEMBLY__ $(BOOTCFLAGS) -nostdinc
BOOTARFLAGS := -crD
+BOOTCFLAGS += $(call cc-option,-mno-prefixed) \
+ $(call cc-option,-mno-pcrel) \
+ $(call cc-option,-mno-mma)
+
ifdef CONFIG_CC_IS_CLANG
BOOTCFLAGS += $(CLANG_FLAGS)
BOOTAFLAGS += $(CLANG_FLAGS)
---
base-commit: 169f8997968ab620d750d9a45e15c5288d498356
change-id: 20230427-remove-power10-args-from-boot-aflags-clang-268c43e8c1fc
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Using `% nr_cpumask_bits` is slow and complicated, and not totally
robust toward dynamic changes to CPU topologies. Rather than storing the
next CPU in the round-robin, just store the last one, and also return
that value. This simplifies the loop drastically into a much more common
pattern.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Cc: stable(a)vger.kernel.org
Reported-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Tested-by: Manuel Leiner <manuel.leiner(a)gmx.de>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/queueing.c | 1 +
drivers/net/wireguard/queueing.h | 25 +++++++++++--------------
drivers/net/wireguard/receive.c | 2 +-
drivers/net/wireguard/send.c | 2 +-
4 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
index 8084e7408c0a..26d235d15235 100644
--- a/drivers/net/wireguard/queueing.c
+++ b/drivers/net/wireguard/queueing.c
@@ -28,6 +28,7 @@ int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function,
int ret;
memset(queue, 0, sizeof(*queue));
+ queue->last_cpu = -1;
ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
if (ret)
return ret;
diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
index 125284b346a7..1ea4f874e367 100644
--- a/drivers/net/wireguard/queueing.h
+++ b/drivers/net/wireguard/queueing.h
@@ -117,20 +117,17 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
return cpu;
}
-/* This function is racy, in the sense that next is unlocked, so it could return
- * the same CPU twice. A race-free version of this would be to instead store an
- * atomic sequence number, do an increment-and-return, and then iterate through
- * every possible CPU until we get to that index -- choose_cpu. However that's
- * a bit slower, and it doesn't seem like this potential race actually
- * introduces any performance loss, so we live with it.
+/* This function is racy, in the sense that it's called while last_cpu is
+ * unlocked, so it could return the same CPU twice. Adding locking or using
+ * atomic sequence numbers is slower though, and the consequences of racing are
+ * harmless, so live with it.
*/
-static inline int wg_cpumask_next_online(int *next)
+static inline int wg_cpumask_next_online(int *last_cpu)
{
- int cpu = *next;
-
- while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
- cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
- *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+ int cpu = cpumask_next(*last_cpu, cpu_online_mask);
+ if (cpu >= nr_cpu_ids)
+ cpu = cpumask_first(cpu_online_mask);
+ *last_cpu = cpu;
return cpu;
}
@@ -159,7 +156,7 @@ static inline void wg_prev_queue_drop_peeked(struct prev_queue *queue)
static inline int wg_queue_enqueue_per_device_and_peer(
struct crypt_queue *device_queue, struct prev_queue *peer_queue,
- struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
+ struct sk_buff *skb, struct workqueue_struct *wq)
{
int cpu;
@@ -173,7 +170,7 @@ static inline int wg_queue_enqueue_per_device_and_peer(
/* Then we queue it up in the device queue, which consumes the
* packet as soon as it can.
*/
- cpu = wg_cpumask_next_online(next_cpu);
+ cpu = wg_cpumask_next_online(&device_queue->last_cpu);
if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
return -EPIPE;
queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 7135d51d2d87..0b3f0c843550 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -524,7 +524,7 @@ static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
goto err;
ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, &peer->rx_queue, skb,
- wg->packet_crypt_wq, &wg->decrypt_queue.last_cpu);
+ wg->packet_crypt_wq);
if (unlikely(ret == -EPIPE))
wg_queue_enqueue_per_peer_rx(skb, PACKET_STATE_DEAD);
if (likely(!ret || ret == -EPIPE)) {
diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
index 5368f7c35b4b..95c853b59e1d 100644
--- a/drivers/net/wireguard/send.c
+++ b/drivers/net/wireguard/send.c
@@ -318,7 +318,7 @@ static void wg_packet_create_data(struct wg_peer *peer, struct sk_buff *first)
goto err;
ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, &peer->tx_queue, first,
- wg->packet_crypt_wq, &wg->encrypt_queue.last_cpu);
+ wg->packet_crypt_wq);
if (unlikely(ret == -EPIPE))
wg_queue_enqueue_per_peer_tx(first, PACKET_STATE_DEAD);
err:
--
2.41.0
Packets bound for peers can queue up prior to the device private key
being set. For example, if persistent keepalive is set, a packet is
queued up to be sent as soon as the device comes up. However, if the
private key hasn't been set yet, the handshake message never sends, and
no timer is armed to retry, since that would be pointless.
But, if a user later sets a private key, the expectation is that those
queued packets, such as a persistent keepalive, are actually sent. So
adjust the configuration logic to account for this edge case, and add a
test case to make sure this works.
Maxim noticed this with a wg-quick(8) config to the tune of:
[Interface]
PostUp = wg set %i private-key somefile
[Peer]
PublicKey = ...
Endpoint = ...
PersistentKeepalive = 25
Here, the private key gets set after the device comes up using a PostUp
script, triggering the bug.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Cc: stable(a)vger.kernel.org
Reported-by: Maxim Cournoyer <maxim.cournoyer(a)gmail.com>
Tested-by: Maxim Cournoyer <maxim.cournoyer(a)gmail.com>
Link: https://lore.kernel.org/wireguard/87fs7xtqrv.fsf@gmail.com/
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/netlink.c | 14 ++++++----
tools/testing/selftests/wireguard/netns.sh | 30 +++++++++++++++++++---
2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index 43c8c84e7ea8..6d1bd9f52d02 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -546,6 +546,7 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
u8 *private_key = nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]);
u8 public_key[NOISE_PUBLIC_KEY_LEN];
struct wg_peer *peer, *temp;
+ bool send_staged_packets;
if (!crypto_memneq(wg->static_identity.static_private,
private_key, NOISE_PUBLIC_KEY_LEN))
@@ -564,14 +565,17 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
}
down_write(&wg->static_identity.lock);
- wg_noise_set_static_identity_private_key(&wg->static_identity,
- private_key);
- list_for_each_entry_safe(peer, temp, &wg->peer_list,
- peer_list) {
+ send_staged_packets = !wg->static_identity.has_identity && netif_running(wg->dev);
+ wg_noise_set_static_identity_private_key(&wg->static_identity, private_key);
+ send_staged_packets = send_staged_packets && wg->static_identity.has_identity;
+
+ wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
+ list_for_each_entry_safe(peer, temp, &wg->peer_list, peer_list) {
wg_noise_precompute_static_static(peer);
wg_noise_expire_current_peer_keypairs(peer);
+ if (send_staged_packets)
+ wg_packet_send_staged_packets(peer);
}
- wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
up_write(&wg->static_identity.lock);
}
skip_set_private_key:
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
index 69c7796c7ca9..405ff262ca93 100755
--- a/tools/testing/selftests/wireguard/netns.sh
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -514,10 +514,32 @@ n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter'
n1 ping -W 1 -c 1 192.168.241.2
[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.3:1" ]]
-ip1 link del veth1
-ip1 link del veth3
-ip1 link del wg0
-ip2 link del wg0
+ip1 link del dev veth3
+ip1 link del dev wg0
+ip2 link del dev wg0
+
+# Make sure persistent keep alives are sent when an adapter comes up
+ip1 link add dev wg0 type wireguard
+n1 wg set wg0 private-key <(echo "$key1") peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+ip1 link set dev wg0 up
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -gt 0 ]]
+ip1 link del dev wg0
+# This should also happen even if the private key is set later
+ip1 link add dev wg0 type wireguard
+n1 wg set wg0 peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+ip1 link set dev wg0 up
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+n1 wg set wg0 private-key <(echo "$key1")
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -gt 0 ]]
+ip1 link del dev veth1
+ip1 link del dev wg0
# We test that Netlink/IPC is working properly by doing things that usually cause split responses
ip0 link add dev wg0 type wireguard
--
2.41.0
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit a24c1aab652ebacf9ea62470a166514174c96fe1 ]
The rcu_data structure's ->rcu_cpu_has_work field can be modified by
any CPU attempting to wake up the rcuc kthread. Therefore, this commit
marks accesses to this field from the rcu_cpu_kthread() function.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 615283404d9dc..98d64f107fbb7 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2457,12 +2457,12 @@ static void rcu_cpu_kthread(unsigned int cpu)
*statusp = RCU_KTHREAD_RUNNING;
local_irq_disable();
work = *workp;
- *workp = 0;
+ WRITE_ONCE(*workp, 0);
local_irq_enable();
if (work)
rcu_core();
local_bh_enable();
- if (*workp == 0) {
+ if (!READ_ONCE(*workp)) {
trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
*statusp = RCU_KTHREAD_WAITING;
return;
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit a24c1aab652ebacf9ea62470a166514174c96fe1 ]
The rcu_data structure's ->rcu_cpu_has_work field can be modified by
any CPU attempting to wake up the rcuc kthread. Therefore, this commit
marks accesses to this field from the rcu_cpu_kthread() function.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index df016f6d0662c..48f3e90c5de53 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2826,12 +2826,12 @@ static void rcu_cpu_kthread(unsigned int cpu)
*statusp = RCU_KTHREAD_RUNNING;
local_irq_disable();
work = *workp;
- *workp = 0;
+ WRITE_ONCE(*workp, 0);
local_irq_enable();
if (work)
rcu_core();
local_bh_enable();
- if (*workp == 0) {
+ if (!READ_ONCE(*workp)) {
trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
*statusp = RCU_KTHREAD_WAITING;
return;
--
2.39.2
From: Mark Rutland <mark.rutland(a)arm.com>
[ Upstream commit ab9b4008092c86dc12497af155a0901cc1156999 ]
Both create_mapping_noalloc() and update_mapping_prot() sanity-check
their 'virt' parameter, but the check itself doesn't make much sense.
The condition used today appears to be a historical accident.
The sanity-check condition:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
... can only be true for the KASAN shadow region or the module region,
and there's no reason to exclude these specifically for creating and
updateing mappings.
When arm64 support was first upstreamed in commit:
c1cc1552616d0f35 ("arm64: MMU initialisation")
... the condition was:
if (virt < VMALLOC_START) {
[ ... warning here ... ]
return;
}
At the time, VMALLOC_START was the lowest kernel address, and this was
checking whether 'virt' would be translated via TTBR1.
Subsequently in commit:
14c127c957c1c607 ("arm64: mm: Flip kernel VA space")
... the condition was changed to:
if ((virt >= VA_START) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
This appear to have been a thinko. The commit moved the linear map to
the bottom of the kernel address space, with VMALLOC_START being at the
halfway point. The old condition would warn for changes to the linear
map below this, and at the time VA_START was the end of the linear map.
Subsequently we cleaned up the naming of VA_START in commit:
77ad4ce69321abbe ("arm64: memory: rename VA_START to PAGE_END")
... keeping the erroneous condition as:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
Correct the condition to check against the start of the TTBR1 address
space, which is currently PAGE_OFFSET. This simplifies the logic, and
more clearly matches the "outside kernel range" message in the warning.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Steve Capper <steve.capper(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Link: https://lore.kernel.org/r/20230615102628.1052103-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/mm/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5cf575f23af28..8e934bb44f12e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -399,7 +399,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift)
static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
+ if (virt < PAGE_OFFSET) {
pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;
@@ -426,7 +426,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
+ if (virt < PAGE_OFFSET) {
pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;
--
2.39.2
From: Hans de Goede <hdegoede(a)redhat.com>
[ Upstream commit 69d6b37695c1f2320cfa330e1e1636d50dd5040a ]
The Nextbook Ares 8A is a x86 ACPI tablet which ships with Android x86
as factory OS. Its DSDT contains a bunch of I2C devices which are not
actually there (the Android x86 kernel fork ignores I2C devices described
in the DSDT).
On this specific model this just not cause resource conflicts, one of
the probe() calls for the non existing i2c_clients actually ends up
toggling a GPIO or executing a _PS3 after a failed probe which turns
the tablet off.
Add a ACPI_QUIRK_SKIP_I2C_CLIENTS for the Nextbook Ares 8 to the
acpi_quirk_skip_dmi_ids table to avoid the bogus i2c_clients and
to fix the tablet turning off during boot because of this.
Also add the "10EC5651" HID for the RealTek ALC5651 codec used
in this tablet to the list of HIDs for which not to skipi2c_client
instantiation, since the Intel SST sound driver relies on
the codec being instantiated through ACPI.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/acpi/x86/utils.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
index e45285d4e62a4..0b65c0d3f5a80 100644
--- a/drivers/acpi/x86/utils.c
+++ b/drivers/acpi/x86/utils.c
@@ -330,7 +330,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
},
{
- /* Nextbook Ares 8 */
+ /* Nextbook Ares 8 (BYT version)*/
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
DMI_MATCH(DMI_PRODUCT_NAME, "M890BAP"),
@@ -338,6 +338,16 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
.driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
},
+ {
+ /* Nextbook Ares 8A (CHT version)*/
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"),
+ DMI_MATCH(DMI_BIOS_VERSION, "M882"),
+ },
+ .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
+ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
+ },
{
/* Whitelabel (sold as various brands) TM800A550L */
.matches = {
@@ -356,6 +366,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
#if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS)
static const struct acpi_device_id i2c_acpi_known_good_ids[] = {
{ "10EC5640", 0 }, /* RealTek ALC5640 audio codec */
+ { "10EC5651", 0 }, /* RealTek ALC5651 audio codec */
{ "INT33F4", 0 }, /* X-Powers AXP288 PMIC */
{ "INT33FD", 0 }, /* Intel Crystal Cove PMIC */
{ "INT34D3", 0 }, /* Intel Whiskey Cove PMIC */
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 008b50da22246..705d330600485 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -553,12 +553,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1274,6 +1268,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 393114c10c285..4191f17379d2f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -590,12 +590,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1372,6 +1366,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 98a7a7b1471b7..dbf1e572f6e7d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb59..43e0a77f21e81 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 952595c678b37..4e419ca6d6114 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -625,7 +625,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 66dc14cf5687e..5abb0cf52803a 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -777,7 +777,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 435c884c02b5c..d49a9d66e0000 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -795,7 +795,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index afa3e1a2f6902..1970ce5f22d40 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -1031,7 +1031,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
The patch titled
Subject: mm: call arch_swap_restore() from do_swap_page()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-call-arch_swap_restore-from-do_swap_page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Collingbourne <pcc(a)google.com>
Subject: mm: call arch_swap_restore() from do_swap_page()
Date: Mon, 22 May 2023 17:43:08 -0700
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance to
restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c51…
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Steven Price <steven.price(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/mm/memory.c~mm-call-arch_swap_restore-from-do_swap_page
+++ a/mm/memory.c
@@ -3954,6 +3954,13 @@ vm_fault_t do_swap_page(struct vm_fault
}
/*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
+ /*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
* yet.
_
Patches currently in -mm which might be from pcc(a)google.com are
mm-call-arch_swap_restore-from-do_swap_page.patch
Hi,
A regression [1] was reported on suspend for AMD Stoney on 6.3.10.
It bisected down to:
1ca399f127e0 ("drm/amd/display: Add wrapper to call planes and stream
update")
This was requested by me to fix a PSR problem [2], which isn't used on
Stoney.
This was also backported into 6.1.36 (as e538342002cb) and 5.15.119 (as
3c1aa91b37f9).
It's fixed on 6.3.y by cherry-picking:
32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations
pending")
It's fixed in 6.1.y by cherry-picking:
3442f4e0e555 ("drm/amd/display: Remove optimization for VRR updates")
32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations
pending")
On 5.15.y it's not a reasonable backport to take the fix to stable
because there is a lot of missing Freesync code. Instead it's better to
revert the patch series that introduced it to 5.15.y because PSR-SU
isn't even introduced until later kernels anyway.
5a24be76af79 ("drm/amd/display: fix the system hang while disable PSR")
3c1aa91b37f9 ("drm/amd/display: Add wrapper to call planes and stream
update")
eea850c025b5 ("drm/amd/display: Use dc_update_planes_and_stream")
97ca308925a5 ("drm/amd/display: Add minimal pipe split transition state")
[1] https://gitlab.freedesktop.org/drm/amd/-/issues/2670
[2]
https://lore.kernel.org/stable/e2ae2999-2e39-31ad-198a-26ab3ae53ae7@amd.com/
Thanks,
Hallow und wie geht es dir heute?
Ich möchte, dass Ihre Partnerschaft Sie als Subunternehmer
präsentiert, damit Sie in meinem Namen 8,6 Millionen US-Dollar aus
Überrechnungsverträgen erhalten können, die wir zu 65 % und 35 %
aufteilen können.
Diese Transaktion ist 100 % risikofrei; Du brauchst keine Angst zu haben.
Bitte senden Sie mir eine E-Mail an (osbornemichel438(a)gmail.com), um
ausführliche Informationen zu erhalten und bei Interesse zu erfahren,
wie wir dies gemeinsam bewältigen können.
Sie müssen es mir also weiterleiten
Ihr vollständiger Name.........................
Telefon.............
Geburtsdatum .........................
Staatsangehörigkeit .................................
Mit freundlichen Grüße,
Osborne Michel.
Hallow and how are you today?
I seek for your partnership to present you as a sub-contractor so
that you can receive 8.6M Over-Invoice contract fund on my behalf and
we can split it 65% 35%.
This transaction is 100% risk -free; you need not to be afraid.
Please email me at ( osbornemichel438(a)gmail.com ) for comprehensive
details and how we can handle this together if interested.
So I need you to forward it to me
your full name.........................
Telephone.............
Date of Birth .........................
Nationality .................................
Kind Regards,
Osborne Michel.
From: Stewart Smith <trawets(a)amazon.com>
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash
table with a hash over the source & destination addresses and ports.
However, the IPv6 hash is insufficient and can lead to a high rate of
collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the
fast jenkins hash, meaning it is possible for an external entity to
ensure the hash collides, thus falling back to a linear search in the
bucket, which is slow.
We take the approach of hash half the data; hash the other half; and
then hash them together. We do this with 3x jenkins hashes rather than
2x to calculate the hashing value for the connection covering the full
length of the addresses and ports.
While this may look like it adds overhead, the reality of modern CPUs
means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing code
was ~5 cycles on Skylake (from a base of ~50), and an extra ~9 on
Nehalem (base of ~62).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash")
netfilter switched from a jenkins hash to a siphash, but even the faster
hsiphash is a more significant overhead (~20-30%) in some preliminary
testing. So, in this patch, we keep to the more conservative approach to
ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the
connection buckets. In both testing and real-world scenarios, we have
not found any measurable performance impact.
Cc: stable(a)vger.kernel.org
Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp")
Fixes: b3da2cf37c5c ("[INET]: Use jhash + random secret for ehash.")
Signed-off-by: Stewart Smith <trawets(a)amazon.com>
Signed-off-by: Samuel Mendoza-Jonas <samjonas(a)amazon.com>
---
include/net/ipv6.h | 4 +---
net/ipv6/inet6_hashtables.c | 5 ++++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..f9bb54869d82 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -752,9 +752,7 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
/* more secured version of ipv6_addr_hash() */
static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval)
{
- u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
-
- return jhash_3words(v,
+ return jhash_3words((__force u32)a->s6_addr32[1],
(__force u32)a->s6_addr32[2],
(__force u32)a->s6_addr32[3],
initval);
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index b64b49012655..bb7198081974 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -33,7 +33,10 @@ u32 inet6_ehashfn(const struct net *net,
net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));
net_get_random_once(&ipv6_hash_secret, sizeof(ipv6_hash_secret));
- lhash = (__force u32)laddr->s6_addr32[3];
+ lhash = jhash_3words((__force u32)laddr->s6_addr32[3],
+ (((u32)lport) << 16) | (__force u32)fport,
+ (__force u32)faddr->s6_addr32[0],
+ ipv6_hash_secret);
fhash = __ipv6_addr_jhash(faddr, ipv6_hash_secret);
return __inet6_ehashfn(lhash, lport, fhash, fport,
--
2.25.1
When calling device_add in the registration of typec_port, it will do
the NULL check on usb_power_delivery handle in typec_port for the
visibility of the device attributes. It is always NULL because port->pd
is set in typec_port_set_usb_power_delivery which is later than the
device_add call.
Set port->pd before device_add and only link the device after that.
Fixes: a7cff92f0635 ("usb: typec: USB Power Delivery helpers for ports and partners")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kyle Tso <kyletso(a)google.com>
---
update v2:
- Add "Cc: stable(a)vger.kernel.org"
drivers/usb/typec/class.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
index faa184ae3dac..3b30948bf4b0 100644
--- a/drivers/usb/typec/class.c
+++ b/drivers/usb/typec/class.c
@@ -2288,6 +2288,8 @@ struct typec_port *typec_register_port(struct device *parent,
return ERR_PTR(ret);
}
+ port->pd = cap->pd;
+
ret = device_add(&port->dev);
if (ret) {
dev_err(parent, "failed to register port (%d)\n", ret);
@@ -2295,7 +2297,7 @@ struct typec_port *typec_register_port(struct device *parent,
return ERR_PTR(ret);
}
- ret = typec_port_set_usb_power_delivery(port, cap->pd);
+ ret = usb_power_delivery_link_device(port->pd, &port->dev);
if (ret) {
dev_err(&port->dev, "failed to link pd\n");
device_unregister(&port->dev);
--
2.41.0.162.gfafddb0af9-goog
This is the start of the stable review cycle for the 6.1.37 release.
There are 31 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 02 Jul 2023 05:56:20 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.37-rc2…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.37-rc2
Linus Torvalds <torvalds(a)linux-foundation.org>
sparc32: fix lock_mm_and_find_vma() conversion
Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Mike Hommey <mh(a)glandium.org>
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Jason Gerecke <jason.gerecke(a)wacom.com>
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Ludvig Michaelsson <ludvig.michaelsson(a)yubico.com>
HID: hidraw: fix data race on device refcount
Zhang Shurong <zhang_shurong(a)foxmail.com>
fbdev: fix potential OOB read in fast_imageblit()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: always expand the stack with the mmap write lock held
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: expand new process stack manually ahead of time
Liam R. Howlett <Liam.Howlett(a)oracle.com>
mm: make find_extend_vma() fail if write lock not held
Linus Torvalds <torvalds(a)linux-foundation.org>
powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm/fault: convert remaining simple cases to lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
arm/mm: Convert to using lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
riscv/mm: Convert to using lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
mips/mm: Convert to using lock_mm_and_find_vma()
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/mm: Convert to using lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
arm64/mm: Convert to using lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: make the page fault mmap locking killable
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: introduce new 'lock_mm_and_find_vma()' page fault helper
Peng Zhang <zhangpeng.00(a)bytedance.com>
maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Cure kexec() vs. mwait_play_dead() breakage
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Use dedicated cache-line for mwait_play_dead()
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Remove pointless wmb()s from native_stop_other_cpus()
Tony Battersby <tonyb(a)cybernetics.com>
x86/smp: Dont access non-existing CPUID leaf
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Make stop_other_cpus() more robust
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Load late on both threads too
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: when copy-on-write hits poison, take page offline
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: try to recover from copy-on write faults
Paolo Abeni <pabeni(a)redhat.com>
mptcp: ensure listener is unhashed before updating the sk status
David Woodhouse <dwmw(a)amazon.co.uk>
mm/mmap: Fix error return in do_vmi_align_munmap()
Liam R. Howlett <Liam.Howlett(a)oracle.com>
mm/mmap: Fix error path in do_vmi_align_munmap()
-------------
Diffstat:
Makefile | 4 +-
arch/alpha/Kconfig | 1 +
arch/alpha/mm/fault.c | 13 +--
arch/arc/Kconfig | 1 +
arch/arc/mm/fault.c | 11 +--
arch/arm/Kconfig | 1 +
arch/arm/mm/fault.c | 63 +++-----------
arch/arm64/Kconfig | 1 +
arch/arm64/mm/fault.c | 46 ++--------
arch/csky/Kconfig | 1 +
arch/csky/mm/fault.c | 22 ++---
arch/hexagon/Kconfig | 1 +
arch/hexagon/mm/vm_fault.c | 18 +---
arch/ia64/mm/fault.c | 36 ++------
arch/loongarch/Kconfig | 1 +
arch/loongarch/mm/fault.c | 16 ++--
arch/m68k/mm/fault.c | 9 +-
arch/microblaze/mm/fault.c | 5 +-
arch/mips/Kconfig | 1 +
arch/mips/mm/fault.c | 12 +--
arch/nios2/Kconfig | 1 +
arch/nios2/mm/fault.c | 17 +---
arch/openrisc/mm/fault.c | 5 +-
arch/parisc/mm/fault.c | 23 +++--
arch/powerpc/Kconfig | 1 +
arch/powerpc/mm/copro_fault.c | 14 +--
arch/powerpc/mm/fault.c | 39 +--------
arch/riscv/Kconfig | 1 +
arch/riscv/mm/fault.c | 31 +++----
arch/s390/mm/fault.c | 5 +-
arch/sh/Kconfig | 1 +
arch/sh/mm/fault.c | 17 +---
arch/sparc/Kconfig | 1 +
arch/sparc/mm/fault_32.c | 32 ++-----
arch/sparc/mm/fault_64.c | 8 +-
arch/um/kernel/trap.c | 11 +--
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cpu.h | 2 +
arch/x86/include/asm/smp.h | 2 +
arch/x86/kernel/cpu/microcode/amd.c | 2 +-
arch/x86/kernel/process.c | 28 +++++-
arch/x86/kernel/smp.c | 73 ++++++++++------
arch/x86/kernel/smpboot.c | 81 ++++++++++++++++--
arch/x86/mm/fault.c | 52 +-----------
arch/xtensa/Kconfig | 1 +
arch/xtensa/mm/fault.c | 14 +--
drivers/hid/hid-logitech-hidpp.c | 2 +-
drivers/hid/hidraw.c | 9 +-
drivers/hid/wacom_wac.c | 6 +-
drivers/hid/wacom_wac.h | 2 +-
drivers/iommu/amd/iommu_v2.c | 4 +-
drivers/iommu/io-pgfault.c | 2 +-
drivers/thermal/mtk_thermal.c | 14 +--
drivers/video/fbdev/core/sysimgblt.c | 2 +-
fs/binfmt_elf.c | 6 +-
fs/exec.c | 38 +++++----
include/linux/highmem.h | 26 ++++++
include/linux/mm.h | 21 ++---
lib/maple_tree.c | 11 +--
mm/Kconfig | 4 +
mm/gup.c | 6 +-
mm/memory.c | 159 ++++++++++++++++++++++++++++++++---
mm/mmap.c | 154 +++++++++++++++++++++++++--------
mm/nommu.c | 17 ++--
net/can/isotp.c | 5 +-
net/mptcp/pm_netlink.c | 1 +
net/mptcp/protocol.c | 26 ++++--
67 files changed, 682 insertions(+), 559 deletions(-)
This is the start of the stable review cycle for the 6.1.37 release.
There are 30 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 01 Jul 2023 18:41:39 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.37-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.37-rc1
Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Mike Hommey <mh(a)glandium.org>
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Jason Gerecke <jason.gerecke(a)wacom.com>
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Ludvig Michaelsson <ludvig.michaelsson(a)yubico.com>
HID: hidraw: fix data race on device refcount
Zhang Shurong <zhang_shurong(a)foxmail.com>
fbdev: fix potential OOB read in fast_imageblit()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: always expand the stack with the mmap write lock held
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: expand new process stack manually ahead of time
Liam R. Howlett <Liam.Howlett(a)oracle.com>
mm: make find_extend_vma() fail if write lock not held
Linus Torvalds <torvalds(a)linux-foundation.org>
powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm/fault: convert remaining simple cases to lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
arm/mm: Convert to using lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
riscv/mm: Convert to using lock_mm_and_find_vma()
Ben Hutchings <ben(a)decadent.org.uk>
mips/mm: Convert to using lock_mm_and_find_vma()
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/mm: Convert to using lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
arm64/mm: Convert to using lock_mm_and_find_vma()
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: make the page fault mmap locking killable
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: introduce new 'lock_mm_and_find_vma()' page fault helper
Peng Zhang <zhangpeng.00(a)bytedance.com>
maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Cure kexec() vs. mwait_play_dead() breakage
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Use dedicated cache-line for mwait_play_dead()
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Remove pointless wmb()s from native_stop_other_cpus()
Tony Battersby <tonyb(a)cybernetics.com>
x86/smp: Dont access non-existing CPUID leaf
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Make stop_other_cpus() more robust
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Load late on both threads too
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: when copy-on-write hits poison, take page offline
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: try to recover from copy-on write faults
Paolo Abeni <pabeni(a)redhat.com>
mptcp: ensure listener is unhashed before updating the sk status
David Woodhouse <dwmw(a)amazon.co.uk>
mm/mmap: Fix error return in do_vmi_align_munmap()
Liam R. Howlett <Liam.Howlett(a)oracle.com>
mm/mmap: Fix error path in do_vmi_align_munmap()
-------------
Diffstat:
Makefile | 4 +-
arch/alpha/Kconfig | 1 +
arch/alpha/mm/fault.c | 13 +--
arch/arc/Kconfig | 1 +
arch/arc/mm/fault.c | 11 +--
arch/arm/Kconfig | 1 +
arch/arm/mm/fault.c | 63 +++-----------
arch/arm64/Kconfig | 1 +
arch/arm64/mm/fault.c | 46 ++--------
arch/csky/Kconfig | 1 +
arch/csky/mm/fault.c | 22 ++---
arch/hexagon/Kconfig | 1 +
arch/hexagon/mm/vm_fault.c | 18 +---
arch/ia64/mm/fault.c | 36 ++------
arch/loongarch/Kconfig | 1 +
arch/loongarch/mm/fault.c | 16 ++--
arch/m68k/mm/fault.c | 9 +-
arch/microblaze/mm/fault.c | 5 +-
arch/mips/Kconfig | 1 +
arch/mips/mm/fault.c | 12 +--
arch/nios2/Kconfig | 1 +
arch/nios2/mm/fault.c | 17 +---
arch/openrisc/mm/fault.c | 5 +-
arch/parisc/mm/fault.c | 23 +++--
arch/powerpc/Kconfig | 1 +
arch/powerpc/mm/copro_fault.c | 14 +--
arch/powerpc/mm/fault.c | 39 +--------
arch/riscv/Kconfig | 1 +
arch/riscv/mm/fault.c | 31 +++----
arch/s390/mm/fault.c | 5 +-
arch/sh/Kconfig | 1 +
arch/sh/mm/fault.c | 17 +---
arch/sparc/Kconfig | 1 +
arch/sparc/mm/fault_32.c | 32 ++-----
arch/sparc/mm/fault_64.c | 8 +-
arch/um/kernel/trap.c | 11 +--
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cpu.h | 2 +
arch/x86/include/asm/smp.h | 2 +
arch/x86/kernel/cpu/microcode/amd.c | 2 +-
arch/x86/kernel/process.c | 28 +++++-
arch/x86/kernel/smp.c | 73 ++++++++++------
arch/x86/kernel/smpboot.c | 81 ++++++++++++++++--
arch/x86/mm/fault.c | 52 +-----------
arch/xtensa/Kconfig | 1 +
arch/xtensa/mm/fault.c | 14 +--
drivers/hid/hid-logitech-hidpp.c | 2 +-
drivers/hid/hidraw.c | 9 +-
drivers/hid/wacom_wac.c | 6 +-
drivers/hid/wacom_wac.h | 2 +-
drivers/iommu/amd/iommu_v2.c | 4 +-
drivers/iommu/io-pgfault.c | 2 +-
drivers/thermal/mtk_thermal.c | 14 +--
drivers/video/fbdev/core/sysimgblt.c | 2 +-
fs/binfmt_elf.c | 6 +-
fs/exec.c | 38 +++++----
include/linux/highmem.h | 26 ++++++
include/linux/mm.h | 21 ++---
lib/maple_tree.c | 11 +--
mm/Kconfig | 4 +
mm/gup.c | 6 +-
mm/memory.c | 159 ++++++++++++++++++++++++++++++++---
mm/mmap.c | 154 +++++++++++++++++++++++++--------
mm/nommu.c | 17 ++--
net/can/isotp.c | 5 +-
net/mptcp/pm_netlink.c | 1 +
net/mptcp/protocol.c | 26 ++++--
67 files changed, 682 insertions(+), 559 deletions(-)
From: Lorenz Brun <lorenz(a)brun.one>
[ Upstream commit 03633c4ef1fb5ee119296dfe0c411656a9b5e04f ]
Currently the ROCK64 device tree specifies two regulators, vcc_host_5v
and vcc_host1_5v for USB VBUS on the device. Both of those are however
specified with RK_PA2 as the GPIO enabling them, causing the following
error when booting:
rockchip-pinctrl pinctrl: pin gpio0-2 already requested by vcc-host-5v-regulator; cannot claim for vcc-host1-5v-regulator
rockchip-pinctrl pinctrl: pin-2 (vcc-host1-5v-regulator) status -22
rockchip-pinctrl pinctrl: could not request pin 2 (gpio0-2) from group usb20-host-drv on device rockchip-pinctrl
reg-fixed-voltage vcc-host1-5v-regulator: Error applying setting, reverse things back
Looking at the schematic, there are in fact three USB regulators,
vcc_host_5v, vcc_host1_5v and vcc_otg_v5. But the enable signal for all
three is driven by Q2604 which is in turn driven by GPIO_A2/PA2.
Since these three regulators are not controllable separately, I removed
the second one which was causing the error and added labels for all
rails to the single regulator.
Signed-off-by: Lorenz Brun <lorenz(a)brun.one>
Tested-by: Diederik de Haas <didi.debian(a)cknow.org>
Link: https://lore.kernel.org/r/20230421213841.3079632-1-lorenz@brun.one
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3328-rock64.dts | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index f69a38f42d2d5..0a27fa5271f57 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -37,7 +37,8 @@ vcc_sd: sdmmc-regulator {
vin-supply = <&vcc_io>;
};
- vcc_host_5v: vcc-host-5v-regulator {
+ /* Common enable line for all of the rails mentioned in the labels */
+ vcc_host_5v: vcc_host1_5v: vcc_otg_5v: vcc-host-5v-regulator {
compatible = "regulator-fixed";
gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
@@ -48,17 +49,6 @@ vcc_host_5v: vcc-host-5v-regulator {
vin-supply = <&vcc_sys>;
};
- vcc_host1_5v: vcc_otg_5v: vcc-host1-5v-regulator {
- compatible = "regulator-fixed";
- gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
- pinctrl-names = "default";
- pinctrl-0 = <&usb20_host_drv>;
- regulator-name = "vcc_host1_5v";
- regulator-always-on;
- regulator-boot-on;
- vin-supply = <&vcc_sys>;
- };
-
vcc_sys: vcc-sys {
compatible = "regulator-fixed";
regulator-name = "vcc_sys";
--
2.39.2
From: Long Li <longli(a)microsoft.com>
The hardware specification specifies that WQE_COUNT should set to 0 for
the Receive Queue. Although currently the hardware doesn't enforce the
check, in the future releases it may check on this value.
Cc: stable(a)vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz(a)microsoft.com>
Reviewed-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Long Li <longli(a)microsoft.com>
---
Change log:
v4:
Split the original patch into two: one for batching doorbell, one for setting the correct wqe count
drivers/net/ethernet/microsoft/mana/gdma_main.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 8f3f78b68592..3765d3389a9a 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -300,8 +300,11 @@ static void mana_gd_ring_doorbell(struct gdma_context *gc, u32 db_index,
void mana_gd_wq_ring_doorbell(struct gdma_context *gc, struct gdma_queue *queue)
{
+ /* Hardware Spec specifies that software client should set 0 for
+ * wqe_cnt for Receive Queues. This value is not used in Send Queues.
+ */
mana_gd_ring_doorbell(gc, queue->gdma_dev->doorbell, queue->type,
- queue->id, queue->head * GDMA_WQE_BU_SIZE, 1);
+ queue->id, queue->head * GDMA_WQE_BU_SIZE, 0);
}
void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
--
2.34.1
From: Long Li <longli(a)microsoft.com>
It's inefficient to ring the doorbell page every time a WQE is posted to
the received queue. Excessive MMIO writes result in CPU spending more
time waiting on LOCK instructions (atomic operations), resulting in
poor scaling performance.
Move the code for ringing doorbell page to where after we have posted all
WQEs to the receive queue during a callback from napi_poll().
With this change, tests showed an improvement from 120G/s to 160G/s on a
200G physical link, with 16 or 32 hardware queues.
Tests showed no regression in network latency benchmarks on single
connection.
Cc: stable(a)vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz(a)microsoft.com>
Reviewed-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Long Li <longli(a)microsoft.com>
---
Change log:
v2:
Check for comp_read > 0 as it might be negative on completion error.
Set rq.wqe_cnt to 0 according to BNIC spec.
v3:
Add details in the commit on the reason of performance increase and test numbers.
Add details in the commit on why rq.wqe_cnt should be set to 0 according to hardware spec.
Add "Reviewed-by" from Haiyang and Dexuan.
v4:
Split the original patch into two: one for batching doorbell, one for setting the correct wqe count
drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index cd4d5ceb9f2d..1d8abe63fcb8 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1383,8 +1383,8 @@ static void mana_post_pkt_rxq(struct mana_rxq *rxq)
recv_buf_oob = &rxq->rx_oobs[curr_index];
- err = mana_gd_post_and_ring(rxq->gdma_rq, &recv_buf_oob->wqe_req,
- &recv_buf_oob->wqe_inf);
+ err = mana_gd_post_work_request(rxq->gdma_rq, &recv_buf_oob->wqe_req,
+ &recv_buf_oob->wqe_inf);
if (WARN_ON_ONCE(err))
return;
@@ -1654,6 +1654,12 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
mana_process_rx_cqe(rxq, cq, &comp[i]);
}
+ if (comp_read > 0) {
+ struct gdma_context *gc = rxq->gdma_rq->gdma_dev->gdma_context;
+
+ mana_gd_wq_ring_doorbell(gc, rxq->gdma_rq);
+ }
+
if (rxq->xdp_flush)
xdp_do_flush();
}
--
2.34.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 81c9ecfa7c7f9..63c0e61b17541 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -465,6 +465,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.39.2
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 453b61b42dd9e..00e6a6e46fe52 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -460,6 +460,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.39.2
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 97eea946b93961fffd29448dcda7398d0d51c4b2 ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 453b61b42dd9e..00e6a6e46fe52 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -460,6 +460,12 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
if (snd_soc_volsw_is_stereo(mc)) {
val_mask = mask << rshift;
val2 = (ucontrol->value.integer.value[1] + min) & mask;
+
+ if (mc->platform_max && val2 > mc->platform_max)
+ return -EINVAL;
+ if (val2 > max)
+ return -EINVAL;
+
val2 = val2 << rshift;
err = snd_soc_component_update_bits(component, reg2, val_mask,
--
2.39.2
From: Lorenz Brun <lorenz(a)brun.one>
[ Upstream commit 03633c4ef1fb5ee119296dfe0c411656a9b5e04f ]
Currently the ROCK64 device tree specifies two regulators, vcc_host_5v
and vcc_host1_5v for USB VBUS on the device. Both of those are however
specified with RK_PA2 as the GPIO enabling them, causing the following
error when booting:
rockchip-pinctrl pinctrl: pin gpio0-2 already requested by vcc-host-5v-regulator; cannot claim for vcc-host1-5v-regulator
rockchip-pinctrl pinctrl: pin-2 (vcc-host1-5v-regulator) status -22
rockchip-pinctrl pinctrl: could not request pin 2 (gpio0-2) from group usb20-host-drv on device rockchip-pinctrl
reg-fixed-voltage vcc-host1-5v-regulator: Error applying setting, reverse things back
Looking at the schematic, there are in fact three USB regulators,
vcc_host_5v, vcc_host1_5v and vcc_otg_v5. But the enable signal for all
three is driven by Q2604 which is in turn driven by GPIO_A2/PA2.
Since these three regulators are not controllable separately, I removed
the second one which was causing the error and added labels for all
rails to the single regulator.
Signed-off-by: Lorenz Brun <lorenz(a)brun.one>
Tested-by: Diederik de Haas <didi.debian(a)cknow.org>
Link: https://lore.kernel.org/r/20230421213841.3079632-1-lorenz@brun.one
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3328-rock64.dts | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index 95ab6928cfd40..d6a288b0e2ab6 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -32,7 +32,8 @@ vcc_sd: sdmmc-regulator {
vin-supply = <&vcc_io>;
};
- vcc_host_5v: vcc-host-5v-regulator {
+ /* Common enable line for all of the rails mentioned in the labels */
+ vcc_host_5v: vcc_host1_5v: vcc_otg_5v: vcc-host-5v-regulator {
compatible = "regulator-fixed";
gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
@@ -43,17 +44,6 @@ vcc_host_5v: vcc-host-5v-regulator {
vin-supply = <&vcc_sys>;
};
- vcc_host1_5v: vcc_otg_5v: vcc-host1-5v-regulator {
- compatible = "regulator-fixed";
- gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
- pinctrl-names = "default";
- pinctrl-0 = <&usb20_host_drv>;
- regulator-name = "vcc_host1_5v";
- regulator-always-on;
- regulator-boot-on;
- vin-supply = <&vcc_sys>;
- };
-
vcc_sys: vcc-sys {
compatible = "regulator-fixed";
regulator-name = "vcc_sys";
--
2.39.2
From: Lorenz Brun <lorenz(a)brun.one>
[ Upstream commit 03633c4ef1fb5ee119296dfe0c411656a9b5e04f ]
Currently the ROCK64 device tree specifies two regulators, vcc_host_5v
and vcc_host1_5v for USB VBUS on the device. Both of those are however
specified with RK_PA2 as the GPIO enabling them, causing the following
error when booting:
rockchip-pinctrl pinctrl: pin gpio0-2 already requested by vcc-host-5v-regulator; cannot claim for vcc-host1-5v-regulator
rockchip-pinctrl pinctrl: pin-2 (vcc-host1-5v-regulator) status -22
rockchip-pinctrl pinctrl: could not request pin 2 (gpio0-2) from group usb20-host-drv on device rockchip-pinctrl
reg-fixed-voltage vcc-host1-5v-regulator: Error applying setting, reverse things back
Looking at the schematic, there are in fact three USB regulators,
vcc_host_5v, vcc_host1_5v and vcc_otg_v5. But the enable signal for all
three is driven by Q2604 which is in turn driven by GPIO_A2/PA2.
Since these three regulators are not controllable separately, I removed
the second one which was causing the error and added labels for all
rails to the single regulator.
Signed-off-by: Lorenz Brun <lorenz(a)brun.one>
Tested-by: Diederik de Haas <didi.debian(a)cknow.org>
Link: https://lore.kernel.org/r/20230421213841.3079632-1-lorenz@brun.one
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3328-rock64.dts | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index 1b0f7e4551ea4..522d2d4281033 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -37,7 +37,8 @@ vcc_sd: sdmmc-regulator {
vin-supply = <&vcc_io>;
};
- vcc_host_5v: vcc-host-5v-regulator {
+ /* Common enable line for all of the rails mentioned in the labels */
+ vcc_host_5v: vcc_host1_5v: vcc_otg_5v: vcc-host-5v-regulator {
compatible = "regulator-fixed";
gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
@@ -48,17 +49,6 @@ vcc_host_5v: vcc-host-5v-regulator {
vin-supply = <&vcc_sys>;
};
- vcc_host1_5v: vcc_otg_5v: vcc-host1-5v-regulator {
- compatible = "regulator-fixed";
- gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
- pinctrl-names = "default";
- pinctrl-0 = <&usb20_host_drv>;
- regulator-name = "vcc_host1_5v";
- regulator-always-on;
- regulator-boot-on;
- vin-supply = <&vcc_sys>;
- };
-
vcc_sys: vcc-sys {
compatible = "regulator-fixed";
regulator-name = "vcc_sys";
--
2.39.2
From: Lorenz Brun <lorenz(a)brun.one>
[ Upstream commit 03633c4ef1fb5ee119296dfe0c411656a9b5e04f ]
Currently the ROCK64 device tree specifies two regulators, vcc_host_5v
and vcc_host1_5v for USB VBUS on the device. Both of those are however
specified with RK_PA2 as the GPIO enabling them, causing the following
error when booting:
rockchip-pinctrl pinctrl: pin gpio0-2 already requested by vcc-host-5v-regulator; cannot claim for vcc-host1-5v-regulator
rockchip-pinctrl pinctrl: pin-2 (vcc-host1-5v-regulator) status -22
rockchip-pinctrl pinctrl: could not request pin 2 (gpio0-2) from group usb20-host-drv on device rockchip-pinctrl
reg-fixed-voltage vcc-host1-5v-regulator: Error applying setting, reverse things back
Looking at the schematic, there are in fact three USB regulators,
vcc_host_5v, vcc_host1_5v and vcc_otg_v5. But the enable signal for all
three is driven by Q2604 which is in turn driven by GPIO_A2/PA2.
Since these three regulators are not controllable separately, I removed
the second one which was causing the error and added labels for all
rails to the single regulator.
Signed-off-by: Lorenz Brun <lorenz(a)brun.one>
Tested-by: Diederik de Haas <didi.debian(a)cknow.org>
Link: https://lore.kernel.org/r/20230421213841.3079632-1-lorenz@brun.one
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3328-rock64.dts | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index f69a38f42d2d5..0a27fa5271f57 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -37,7 +37,8 @@ vcc_sd: sdmmc-regulator {
vin-supply = <&vcc_io>;
};
- vcc_host_5v: vcc-host-5v-regulator {
+ /* Common enable line for all of the rails mentioned in the labels */
+ vcc_host_5v: vcc_host1_5v: vcc_otg_5v: vcc-host-5v-regulator {
compatible = "regulator-fixed";
gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
@@ -48,17 +49,6 @@ vcc_host_5v: vcc-host-5v-regulator {
vin-supply = <&vcc_sys>;
};
- vcc_host1_5v: vcc_otg_5v: vcc-host1-5v-regulator {
- compatible = "regulator-fixed";
- gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
- pinctrl-names = "default";
- pinctrl-0 = <&usb20_host_drv>;
- regulator-name = "vcc_host1_5v";
- regulator-always-on;
- regulator-boot-on;
- vin-supply = <&vcc_sys>;
- };
-
vcc_sys: vcc-sys {
compatible = "regulator-fixed";
regulator-name = "vcc_sys";
--
2.39.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x c2d22806aecb24e2de55c30a06e5d6eb297d161d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062901-public-account-117e@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2d22806aecb24e2de55c30a06e5d6eb297d161d Mon Sep 17 00:00:00 2001
From: Zhang Shurong <zhang_shurong(a)foxmail.com>
Date: Sun, 25 Jun 2023 00:16:49 +0800
Subject: [PATCH] fbdev: fix potential OOB read in fast_imageblit()
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <zhang_shurong(a)foxmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 335e92b813fc..665ef7a0a249 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
- const char *s = image->data, *src;
+ const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x c2d22806aecb24e2de55c30a06e5d6eb297d161d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062953-frosty-dubiously-e01d@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2d22806aecb24e2de55c30a06e5d6eb297d161d Mon Sep 17 00:00:00 2001
From: Zhang Shurong <zhang_shurong(a)foxmail.com>
Date: Sun, 25 Jun 2023 00:16:49 +0800
Subject: [PATCH] fbdev: fix potential OOB read in fast_imageblit()
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <zhang_shurong(a)foxmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 335e92b813fc..665ef7a0a249 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
- const char *s = image->data, *src;
+ const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x c2d22806aecb24e2de55c30a06e5d6eb297d161d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062944-backless-boastful-fd52@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2d22806aecb24e2de55c30a06e5d6eb297d161d Mon Sep 17 00:00:00 2001
From: Zhang Shurong <zhang_shurong(a)foxmail.com>
Date: Sun, 25 Jun 2023 00:16:49 +0800
Subject: [PATCH] fbdev: fix potential OOB read in fast_imageblit()
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <zhang_shurong(a)foxmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 335e92b813fc..665ef7a0a249 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
- const char *s = image->data, *src;
+ const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x c2d22806aecb24e2de55c30a06e5d6eb297d161d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062936-cesarean-acid-5726@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2d22806aecb24e2de55c30a06e5d6eb297d161d Mon Sep 17 00:00:00 2001
From: Zhang Shurong <zhang_shurong(a)foxmail.com>
Date: Sun, 25 Jun 2023 00:16:49 +0800
Subject: [PATCH] fbdev: fix potential OOB read in fast_imageblit()
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <zhang_shurong(a)foxmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 335e92b813fc..665ef7a0a249 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
- const char *s = image->data, *src;
+ const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c2d22806aecb24e2de55c30a06e5d6eb297d161d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062930-passage-rockband-816c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2d22806aecb24e2de55c30a06e5d6eb297d161d Mon Sep 17 00:00:00 2001
From: Zhang Shurong <zhang_shurong(a)foxmail.com>
Date: Sun, 25 Jun 2023 00:16:49 +0800
Subject: [PATCH] fbdev: fix potential OOB read in fast_imageblit()
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <zhang_shurong(a)foxmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index 335e92b813fc..665ef7a0a249 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
- const char *s = image->data, *src;
+ const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
Hi Paul,
A number of stable kernels recently backported this upstream commit:
"""
commit 4ce1f694eb5d8ca607fed8542d32a33b4f1217a5
Author: Paul Moore <paul(a)paul-moore.com>
Date: Wed Apr 12 13:29:11 2023 -0400
selinux: ensure av_permissions.h is built when needed
"""
We're seeing a build issue with this commit where the "crash" tool will fail
to start, it complains that the vmlinux image and /proc/version don't match.
A minimum reproducer would be having "make" version before 4.3 and building
the kernel with:
$ make bzImages
$ make modules
Then compare the version strings in the bzImage and vmlinux images,
we can use "strings" for this. For example, in the 5.10.181 kernel I get:
$ strings vmlinux | egrep '^Linux version'
Linux version 5.10.181 (ec2-user(a)ip-172-31-79-134.ec2.internal) (gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), GNU ld version 2.29.1-31.amzn2) #2 SMP Thu Jun 1 01:26:38 UTC 2023
$ strings ./arch/x86_64/boot/bzImage | egrep 'ld version'
5.10.181 (ec2-user(a)ip-172-31-79-134.ec2.internal) (gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), GNU ld version 2.29.1-31.amzn2) #1 SMP Thu Jun 1 01:23:59 UTC 2023
The version string in the bzImage doesn't have the "Linux version" part, but
I think this is added by the kernel when printing. If you compare the strings,
you'll see that they have a different build date and the "#1" and "#2" are
different.
This only happens with commit 4ce1f694eb5 applied and older "make", in my case I
have "make" version 3.82.
If I revert 4ce1f694eb5 or use "make" version 4.3 I get identical strings (except
for the "Linux version" part):
$ strings vmlinux | egrep '^Linux version'
Linux version 5.10.181+ (ec2-user(a)ip-172-31-79-134.ec2.internal) (gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), GNU ld version 2.29.1-31.amzn2) #1 SMP Thu Jun 1 01:29:11 UTC 2023
$ strings ./arch/x86_64/boot/bzImage | egrep 'ld version'
5.10.181+ (ec2-user(a)ip-172-31-79-134.ec2.internal) (gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), GNU ld version 2.29.1-31.amzn2) #1 SMP Thu Jun 1 01:29:11 UTC 2023
Maybe the grouped target usage in 4ce1f694eb5 with older "make" is causing a
rebuild of the vmlinux image in "make modules"? If yes, is this expected?
I'm afraid this issue could be high impact for distros with older user-space.
- Luiz
Hi,
Would you be interested in reaching out to top level decisions makers to promote your services?
Learning Development
Training / Talent Development
CVD Human Resource
CVD Learning
Corporate Training
Meeting & Events
Owners, CEO, Presidents etc
Industries:
Information Technology |Finance |Advertising & Marketing |Construction and Real estate |Charity and NGO’s |Education |Publishing |Retail | Consumer | Manufacturing |Healthcare | Hospitality |Legal Services |Food & Beverages |Media & Entertainment |Energy and chemicals |Aerospace and Defense |Transportation and Logistics Etc.
We customize the list according to your target requirements: - Job titles, Industries and location.
Let us know your target audience and we will send you more details.
Regards,
Claire
Claire Davis | Marketing Consultant
Reply only opt-out in the subject line to remove from the mailing list.
Hi Greg,
below is the manual backport of an upstream patch to fix the build failure
in kernel v4.14 in imsttfb.c.
It's not sufficient to just return from init_imstt() as the kernel then
may crash later when it tries to access the non-existent framebuffer or
cmap. Instead return failure to imsttfb_probe() so that the kernel
will skip using that hardware/driver.
Can you please apply this patch to the v4.14 stable queue?
Thanks,
Helge
-----------
From: Zheng Wang <zyytlz.wz(a)163.com>
This is a manual backport of upstream patch c75f5a55061091030a13fef71b9995b89bc86213
to fix a build error in imsttfb.c in kernel v4.14.
A use-after-free bug may occur if init_imstt invokes framebuffer_release
and free the info ptr. The caller, imsttfb_probe didn't notice that and
still keep the ptr as private data in pdev.
If we remove the driver which will call imsttfb_remove to make cleanup,
UAF happens.
Fix it by return error code if bad case happens in init_imstt.
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/imsttfb.c b/drivers/video/fbdev/imsttfb.c
index 6589d5f0a5a4..eaaa5f1c0f6f 100644
--- a/drivers/video/fbdev/imsttfb.c
+++ b/drivers/video/fbdev/imsttfb.c
@@ -1348,7 +1348,7 @@ static struct fb_ops imsttfb_ops = {
.fb_ioctl = imsttfb_ioctl,
};
-static void init_imstt(struct fb_info *info)
+static int init_imstt(struct fb_info *info)
{
struct imstt_par *par = info->par;
__u32 i, tmp, *ip, *end;
@@ -1420,7 +1420,7 @@ static void init_imstt(struct fb_info *info)
|| !(compute_imstt_regvals(par, info->var.xres, info->var.yres))) {
printk("imsttfb: %ux%ux%u not supported\n", info->var.xres, info->var.yres, info->var.bits_per_pixel);
framebuffer_release(info);
- return;
+ return -ENODEV;
}
sprintf(info->fix.id, "IMS TT (%s)", par->ramdac == IBM ? "IBM" : "TVP");
@@ -1460,12 +1460,13 @@ static void init_imstt(struct fb_info *info)
if (register_framebuffer(info) < 0) {
fb_dealloc_cmap(&info->cmap);
framebuffer_release(info);
- return;
+ return -ENODEV;
}
tmp = (read_reg_le32(par->dc_regs, SSTATUS) & 0x0f00) >> 8;
fb_info(info, "%s frame buffer; %uMB vram; chip version %u\n",
info->fix.id, info->fix.smem_len >> 20, tmp);
+ return 0;
}
static int imsttfb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
@@ -1474,6 +1475,7 @@ static int imsttfb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
struct imstt_par *par;
struct fb_info *info;
struct device_node *dp;
+ int ret;
dp = pci_device_to_OF_node(pdev);
if(dp)
@@ -1525,10 +1527,10 @@ static int imsttfb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
par->cmap_regs_phys = addr + 0x840000;
par->cmap_regs = (__u8 *)ioremap(addr + 0x840000, 0x1000);
info->pseudo_palette = par->palette;
- init_imstt(info);
-
- pci_set_drvdata(pdev, info);
- return 0;
+ ret = init_imstt(info);
+ if (!ret)
+ pci_set_drvdata(pdev, info);
+ return ret;
}
static void imsttfb_remove(struct pci_dev *pdev)
From: Yi Yang <yiyang13(a)huawei.com>
Kmemleak reported the following leak info in try_smi_init():
unreferenced object 0xffff00018ecf9400 (size 1024):
comm "modprobe", pid 2707763, jiffies 4300851415 (age 773.308s)
backtrace:
[<000000004ca5b312>] __kmalloc+0x4b8/0x7b0
[<00000000953b1072>] try_smi_init+0x148/0x5dc [ipmi_si]
[<000000006460d325>] 0xffff800081b10148
[<0000000039206ea5>] do_one_initcall+0x64/0x2a4
[<00000000601399ce>] do_init_module+0x50/0x300
[<000000003c12ba3c>] load_module+0x7a8/0x9e0
[<00000000c246fffe>] __se_sys_init_module+0x104/0x180
[<00000000eea99093>] __arm64_sys_init_module+0x24/0x30
[<0000000021b1ef87>] el0_svc_common.constprop.0+0x94/0x250
[<0000000070f4f8b7>] do_el0_svc+0x48/0xe0
[<000000005a05337f>] el0_svc+0x24/0x3c
[<000000005eb248d6>] el0_sync_handler+0x160/0x164
[<0000000030a59039>] el0_sync+0x160/0x180
The problem was that when an error occurred before handlers registration
and after allocating `new_smi->si_sm`, the variable wouldn't be freed in
the error handling afterwards since `shutdown_smi()` hadn't been
registered yet. Fix it by adding a `kfree()` in the error handling path
in `try_smi_init()`.
Cc: stable(a)vger.kernel.org # 4.19+
Fixes: 7960f18a5647 ("ipmi_si: Convert over to a shutdown handler")
Signed-off-by: Yi Yang <yiyang13(a)huawei.com>
Co-developed-by: GONG, Ruiqi <gongruiqi(a)huaweicloud.com>
Signed-off-by: GONG, Ruiqi <gongruiqi(a)huaweicloud.com>
---
drivers/char/ipmi/ipmi_si_intf.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index abddd7e43a9a..5cd031f3fc97 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2082,6 +2082,11 @@ static int try_smi_init(struct smi_info *new_smi)
new_smi->io.io_cleanup = NULL;
}
+ if (rv && new_smi->si_sm) {
+ kfree(new_smi->si_sm);
+ new_smi->si_sm = NULL;
+ }
+
return rv;
}
--
2.25.1
From: Xiubo Li <xiubli(a)redhat.com>
If a client sends out a cap update dropping caps with the prior 'seq'
just before an incoming cap revoke request, then the client may drop
the revoke because it believes it's already released the requested
capabilities.
This causes the MDS to wait indefinitely for the client to respond
to the revoke. It's therefore always a good idea to ack the cap
revoke request with the bumped up 'seq'.
Cc: stable(a)vger.kernel.org
Link: https://tracker.ceph.com/issues/61782
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
Reviewed-by: Milind Changire <mchangir(a)redhat.com>
Signed-off-by: Ilya Dryomov <idryomov(a)gmail.com>
---
V3:
- Updated the commit message from Patrick. Thanks!
fs/ceph/caps.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index cef91dd5ef83..e2bb0d0072da 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3566,6 +3566,15 @@ static void handle_cap_grant(struct inode *inode,
}
BUG_ON(cap->issued & ~cap->implemented);
+ /* don't let check_caps skip sending a response to MDS for revoke msgs */
+ if (le32_to_cpu(grant->op) == CEPH_CAP_OP_REVOKE) {
+ cap->mds_wanted = 0;
+ if (cap == ci->i_auth_cap)
+ check_caps = 1; /* check auth cap only */
+ else
+ check_caps = 2; /* check all caps */
+ }
+
if (extra_info->inline_version > 0 &&
extra_info->inline_version >= ci->i_inline_version) {
ci->i_inline_version = extra_info->inline_version;
--
2.40.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0108a4e9f3584a7a2c026d1601b0682ff7335d95
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062341-reunite-senior-f0c0@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0108a4e9f3584a7a2c026d1601b0682ff7335d95 Mon Sep 17 00:00:00 2001
From: Krister Johansen <kjlx(a)templeofstupid.com>
Date: Mon, 12 Jun 2023 17:44:40 -0700
Subject: [PATCH] bpf: ensure main program has an extable
When subprograms are in use, the main program is not jit'd after the
subprograms because jit_subprogs sets a value for prog->bpf_func upon
success. Subsequent calls to the JIT are bypassed when this value is
non-NULL. This leads to a situation where the main program and its
func[0] counterpart are both in the bpf kallsyms tree, but only func[0]
has an extable. Extables are only created during JIT. Now there are
two nearly identical program ksym entries in the tree, but only one has
an extable. Depending upon how the entries are placed, there's a chance
that a fault will call search_extable on the aux with the NULL entry.
Since jit_subprogs already copies state from func[0] to the main
program, include the extable pointer in this state duplication.
Additionally, ensure that the copy of the main program in func[0] is not
added to the bpf_prog_kallsyms table. Instead, let the main program get
added later in bpf_prog_load(). This ensures there is only a single
copy of the main program in the kallsyms table, and that its tag matches
the tag observed by tooling like bpftool.
Cc: stable(a)vger.kernel.org
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
Acked-by: Yonghong Song <yhs(a)fb.com>
Acked-by: Ilya Leoshkevich <iii(a)linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii(a)linux.ibm.com>
Link: https://lore.kernel.org/r/6de9b2f4b4724ef56efbb0339daaa66c8b68b1e7.16866166…
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0dd8adc7a159..cf5f230360f5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17217,9 +17217,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
}
/* finally lock prog and jit images for all functions and
- * populate kallsysm
+ * populate kallsysm. Begin at the first subprogram, since
+ * bpf_prog_load will add the kallsyms for the main program.
*/
- for (i = 0; i < env->subprog_cnt; i++) {
+ for (i = 1; i < env->subprog_cnt; i++) {
bpf_prog_lock_ro(func[i]);
bpf_prog_kallsyms_add(func[i]);
}
@@ -17245,6 +17246,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
prog->jited = 1;
prog->bpf_func = func[0]->bpf_func;
prog->jited_len = func[0]->jited_len;
+ prog->aux->extable = func[0]->aux->extable;
+ prog->aux->num_exentries = func[0]->aux->num_exentries;
prog->aux->func = func;
prog->aux->func_cnt = env->subprog_cnt;
bpf_prog_jit_attempt_done(prog);
This is the start of the stable review cycle for the 4.14.320 release.
There are 26 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 28 Jun 2023 18:07:23 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.320-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.320-rc1
Clark Wang <xiaoning.wang(a)nxp.com>
i2c: imx-lpi2c: fix type char overflow issue when calculating the clock cycle
Dheeraj Kumar Srivastava <dheerajkumar.srivastava(a)amd.com>
x86/apic: Fix kernel panic when booting with intremap=off and x2apic_phys
Min Li <lm0963hack(a)gmail.com>
drm/radeon: fix race condition UAF in radeon_gem_set_domain_ioctl
Min Li <lm0963hack(a)gmail.com>
drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl
Inki Dae <inki.dae(a)samsung.com>
drm/exynos: vidi: fix a wrong error return
Vineeth Vijayan <vneethv(a)linux.ibm.com>
s390/cio: unregister device when the only path is gone
Dan Carpenter <dan.carpenter(a)linaro.org>
usb: gadget: udc: fix NULL dereference in remove()
Helge Deller <deller(a)gmx.de>
fbdev: imsttfb: Release framebuffer and dealloc cmap on error path
Osama Muhammad <osmtendev(a)gmail.com>
nfcsim.c: Fix error checking for debugfs_create_dir
Marc Zyngier <maz(a)kernel.org>
arm64: Add missing Set/Way CMO encodings
Denis Arefev <arefev(a)swemel.ru>
HID: wacom: Add error check to wacom_parse_and_register()
Maurizio Lombardi <mlombard(a)redhat.com>
scsi: target: iscsi: Prevent login threads from racing between each other
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nf_tables: disallow element updates of bound anonymous sets
Ross Lagerwall <ross.lagerwall(a)citrix.com>
be2net: Extend xmit workaround to BE3 chip
Sergey Shtylyov <s.shtylyov(a)omp.ru>
mmc: usdhi60rol0: fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
mmc: omap_hsmmc: fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
mmc: omap: fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
mmc: mtk-sd: fix deferred probing
Stefan Wahren <stefan.wahren(a)i2se.com>
net: qca_spi: Avoid high load if QCA7000 is not available
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
xfrm: Linearize the skb after offloading if needed.
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: prevent general protection fault in nilfs_clear_dirty_page()
Xiu Jianfeng <xiujianfeng(a)huawei.com>
cgroup: Do not corrupt task iteration when rebinding subsystem
Michael Kelley <mikelley(a)microsoft.com>
Drivers: hv: vmbus: Fix vmbus_wait_for_unload() to scan present CPUs
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix buffer corruption due to concurrent device reads
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: reject devices with insufficient block count
Bernhard Seibold <mail(a)bernhard-seibold.de>
serial: lantiq: add missing interrupt ack
-------------
Diffstat:
Makefile | 4 +--
arch/arm64/include/asm/sysreg.h | 6 ++++
arch/x86/kernel/apic/x2apic_phys.c | 5 +++-
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
drivers/gpu/drm/exynos/exynos_drm_vidi.c | 2 --
drivers/gpu/drm/radeon/radeon_gem.c | 4 +--
drivers/hid/wacom_sys.c | 7 ++++-
drivers/hv/channel_mgmt.c | 18 ++++++++++--
drivers/i2c/busses/i2c-imx-lpi2c.c | 4 +--
drivers/mmc/host/mtk-sd.c | 2 +-
drivers/mmc/host/omap.c | 2 +-
drivers/mmc/host/omap_hsmmc.c | 6 ++--
drivers/mmc/host/usdhi6rol0.c | 6 ++--
drivers/net/ethernet/emulex/benet/be_main.c | 4 +--
drivers/net/ethernet/qualcomm/qca_spi.c | 3 +-
drivers/nfc/nfcsim.c | 4 ---
drivers/s390/cio/device.c | 5 +++-
drivers/target/iscsi/iscsi_target_nego.c | 4 ++-
drivers/tty/serial/lantiq.c | 1 +
drivers/usb/gadget/udc/amd5536udc_pci.c | 3 ++
drivers/video/fbdev/imsttfb.c | 6 +++-
fs/nilfs2/page.c | 10 ++++++-
fs/nilfs2/segbuf.c | 6 ++++
fs/nilfs2/segment.c | 7 +++++
fs/nilfs2/super.c | 25 ++++++++++++++--
fs/nilfs2/the_nilfs.c | 44 ++++++++++++++++++++++++++++-
kernel/cgroup/cgroup.c | 20 +++++++++++--
net/ipv4/esp4_offload.c | 3 ++
net/ipv6/esp6_offload.c | 3 ++
net/netfilter/nf_tables_api.c | 7 +++--
30 files changed, 183 insertions(+), 40 deletions(-)
namespace's request queue is frozen and quiesced during error recovering,
writeback IO is blocked in bio_queue_enter(), so fsync_bdev() <- del_gendisk()
can't move on, and causes IO hang. Removal could be from sysfs, hard
unplug or error handling.
Fix this kind of issue by marking controller as DEAD if removal breaks
error recovery.
This ways is reasonable too, because controller can't be recovered any
more after being removed.
Cc: stable(a)vger.kernel.org
Reported-by: Chunguang Xu <brookxu.cn(a)gmail.com>
Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee…
Reported-by: Yi Zhang <yi.zhang(a)redhat.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
V2:
- patch style fix, as suggested by Christoph
- document this handling
drivers/nvme/host/core.c | 9 ++++++++-
drivers/nvme/host/nvme.h | 1 +
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fdfcf2781c85..1419eb35b47a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -567,6 +567,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
}
if (changed) {
+ ctrl->old_state = ctrl->state;
ctrl->state = new_state;
wake_up_all(&ctrl->state_wq);
}
@@ -4054,8 +4055,14 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
* disconnected. In that case, we won't be able to flush any data while
* removing the namespaces' disks; fail all the queues now to avoid
* potentially having to clean up the failed sync later.
+ *
+ * If this removal happens during error recovering, resetting part
+ * may not be started, or controller isn't be recovered completely,
+ * so we have to treat controller as DEAD for avoiding IO hang since
+ * queues can be left as frozen and quiesced.
*/
- if (ctrl->state == NVME_CTRL_DEAD) {
+ if (ctrl->state == NVME_CTRL_DEAD ||
+ ctrl->old_state != NVME_CTRL_LIVE) {
nvme_mark_namespaces_dead(ctrl);
nvme_unquiesce_io_queues(ctrl);
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9a98c14c552a..ce67856d4d4f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -254,6 +254,7 @@ struct nvme_ctrl {
bool comp_seen;
bool identified;
enum nvme_ctrl_state state;
+ enum nvme_ctrl_state old_state;
spinlock_t lock;
struct mutex scan_lock;
const struct nvme_ctrl_ops *ops;
--
2.40.1
From: Sheetal <sheetal(a)nvidia.com>
I2S data sanity tests fail beyond a bit clock frequency of 6.144MHz.
This happens because the AHUB clock rate is too low and it shows
9.83MHz on boot.
The maximum rate of PLLA_OUT0 is 49.152MHz and is used to serve I/O
clocks. It is recommended that AHUB clock operates higher than this.
Thus fix this by using PLLP_OUT0 as parent clock for AHUB instead of
PLLA_OUT0 and fix the rate to 81.6MHz.
Fixes: dc94a94daa39 ("arm64: tegra: Add audio devices on Tegra234")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sheetal <sheetal(a)nvidia.com>
Signed-off-by: Sameer Pujar <spujar(a)nvidia.com>
Reviewed-by: Mohan Kumar D <mkumard(a)nvidia.com>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index f4974e8..0f12a8de 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -180,7 +180,8 @@
clocks = <&bpmp TEGRA234_CLK_AHUB>;
clock-names = "ahub";
assigned-clocks = <&bpmp TEGRA234_CLK_AHUB>;
- assigned-clock-parents = <&bpmp TEGRA234_CLK_PLLA_OUT0>;
+ assigned-clock-parents = <&bpmp TEGRA234_CLK_PLLP_OUT0>;
+ assigned-clock-rates = <81600000>;
status = "disabled";
#address-cells = <2>;
--
2.7.4
From: Sheetal <sheetal(a)nvidia.com>
Byte mask for channel-1 of stream-1 is not getting enabled and this
causes failures during ADX use cases. This happens because the byte
map value 0 matches the byte map array and put() callback returns
without enabling the corresponding bits in the byte mask.
ADX supports 4 output streams and each stream can have a maximum of
16 channels. Each byte in the input frame is uniquely mapped to a
byte in one of these 4 outputs. This mapping is done with the help of
byte map array via user space control setting. The byte map array
size in the driver is 16 and each array element is of size 4 bytes.
This corresponds to 64 byte map values.
Each byte in the byte map array can have any value between 0 to 255
to enable the corresponding bits in the byte mask. The value 256 is
used as a way to disable the byte map. However the byte map array
element cannot store this value. The put() callback disables the byte
mask for 256 value and byte map value is reset to 0 for this case.
This causes problems during subsequent runs since put() callback,
for value of 0, just returns without enabling the byte mask. In short,
the problem is coming because 0 and 256 control values are stored as
0 in the byte map array.
Right now fix the put() callback by actually looking at the byte mask
array state to identify if any change is needed and update the fields
accordingly. The get() callback needs an update as well to return the
correct control value that user has set before. Note that when user
set 256, the value is stored as 0 and byte mask is disabled. So byte
mask state is used to either return 256 or the value from byte map
array.
Given above, this looks bit complicated and all this happens because
the byte map array is tightly packed and cannot actually store the 256
value. Right now the priority is to fix the existing failure and a TODO
item is put to improve this logic.
Fixes: 3c97881b8c8a ("ASoC: tegra: Fix kcontrol put callback in ADX")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sheetal <sheetal(a)nvidia.com>
Reviewed-by: Mohan Kumar D <mkumard(a)nvidia.com>
Reviewed-by: Sameer Pujar <spujar(a)nvidia.com>
---
sound/soc/tegra/tegra210_adx.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/sound/soc/tegra/tegra210_adx.c b/sound/soc/tegra/tegra210_adx.c
index bd0b10c..7d003f0 100644
--- a/sound/soc/tegra/tegra210_adx.c
+++ b/sound/soc/tegra/tegra210_adx.c
@@ -2,7 +2,7 @@
//
// tegra210_adx.c - Tegra210 ADX driver
//
-// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved.
+// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
#include <linux/clk.h>
#include <linux/device.h>
@@ -175,10 +175,20 @@ static int tegra210_adx_get_byte_map(struct snd_kcontrol *kcontrol,
mc = (struct soc_mixer_control *)kcontrol->private_value;
enabled = adx->byte_mask[mc->reg / 32] & (1 << (mc->reg % 32));
+ /*
+ * TODO: Simplify this logic to just return from bytes_map[]
+ *
+ * Presently below is required since bytes_map[] is
+ * tightly packed and cannot store the control value of 256.
+ * Byte mask state is used to know if 256 needs to be returned.
+ * Note that for control value of 256, the put() call stores 0
+ * in the bytes_map[] and disables the corresponding bit in
+ * byte_mask[].
+ */
if (enabled)
ucontrol->value.integer.value[0] = bytes_map[mc->reg];
else
- ucontrol->value.integer.value[0] = 0;
+ ucontrol->value.integer.value[0] = 256;
return 0;
}
@@ -192,19 +202,19 @@ static int tegra210_adx_put_byte_map(struct snd_kcontrol *kcontrol,
int value = ucontrol->value.integer.value[0];
struct soc_mixer_control *mc =
(struct soc_mixer_control *)kcontrol->private_value;
+ unsigned int mask_val = adx->byte_mask[mc->reg / 32];
- if (value == bytes_map[mc->reg])
+ if (value >= 0 && value <= 255)
+ mask_val |= (1 << (mc->reg % 32));
+ else
+ mask_val &= ~(1 << (mc->reg % 32));
+
+ if (mask_val == adx->byte_mask[mc->reg / 32])
return 0;
- if (value >= 0 && value <= 255) {
- /* update byte map and enable slot */
- bytes_map[mc->reg] = value;
- adx->byte_mask[mc->reg / 32] |= (1 << (mc->reg % 32));
- } else {
- /* reset byte map and disable slot */
- bytes_map[mc->reg] = 0;
- adx->byte_mask[mc->reg / 32] &= ~(1 << (mc->reg % 32));
- }
+ /* Update byte map and slot */
+ bytes_map[mc->reg] = value % 256;
+ adx->byte_mask[mc->reg / 32] = mask_val;
return 1;
}
--
2.7.4
From: Sheetal <sheetal(a)nvidia.com>
Byte mask for channel-1 of stream-1 is not getting enabled and this
causes failures during AMX use cases. This happens because the byte
map value 0 matches the byte map array and put() callback returns
without enabling the corresponding bits in the byte mask.
AMX supports 4 input streams and each stream can take a maximum of
16 channels. Each byte in the output frame is uniquely mapped to a
byte in one of these 4 inputs. This mapping is done with the help of
byte map array via user space control setting. The byte map array
size in the driver is 16 and each array element is of size 4 bytes.
This corresponds to 64 byte map values.
Each byte in the byte map array can have any value between 0 to 255
to enable the corresponding bits in the byte mask. The value 256 is
used as a way to disable the byte map. However the byte map array
element cannot store this value. The put() callback disables the byte
mask for 256 value and byte map value is reset to 0 for this case.
This causes problems during subsequent runs since put() callback,
for value of 0, just returns without enabling the byte mask. In short,
the problem is coming because 0 and 256 control values are stored as
0 in the byte map array.
Right now fix the put() callback by actually looking at the byte mask
array state to identify if any change is needed and update the fields
accordingly. The get() callback needs an update as well to return the
correct control value that user has set before. Note that when user
sets 256, the value is stored as 0 and byte mask is disabled. So byte
mask state is used to either return 256 or the value from byte map
array.
Given above, this looks bit complicated and all this happens because
the byte map array is tightly packed and cannot actually store the 256
value. Right now the priority is to fix the existing failure and a TODO
item is put to improve this logic.
Fixes: 8db78ace1ba8 ("ASoC: tegra: Fix kcontrol put callback in AMX")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sheetal <sheetal(a)nvidia.com>
Reviewed-by: Mohan Kumar D <mkumard(a)nvidia.com>
Reviewed-by: Sameer Pujar <spujar(a)nvidia.com>
---
sound/soc/tegra/tegra210_amx.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/sound/soc/tegra/tegra210_amx.c b/sound/soc/tegra/tegra210_amx.c
index 782a141..1798769 100644
--- a/sound/soc/tegra/tegra210_amx.c
+++ b/sound/soc/tegra/tegra210_amx.c
@@ -2,7 +2,7 @@
//
// tegra210_amx.c - Tegra210 AMX driver
//
-// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved.
+// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
#include <linux/clk.h>
#include <linux/device.h>
@@ -203,10 +203,20 @@ static int tegra210_amx_get_byte_map(struct snd_kcontrol *kcontrol,
else
enabled = amx->byte_mask[0] & (1 << reg);
+ /*
+ * TODO: Simplify this logic to just return from bytes_map[]
+ *
+ * Presently below is required since bytes_map[] is
+ * tightly packed and cannot store the control value of 256.
+ * Byte mask state is used to know if 256 needs to be returned.
+ * Note that for control value of 256, the put() call stores 0
+ * in the bytes_map[] and disables the corresponding bit in
+ * byte_mask[].
+ */
if (enabled)
ucontrol->value.integer.value[0] = bytes_map[reg];
else
- ucontrol->value.integer.value[0] = 0;
+ ucontrol->value.integer.value[0] = 256;
return 0;
}
@@ -221,25 +231,19 @@ static int tegra210_amx_put_byte_map(struct snd_kcontrol *kcontrol,
unsigned char *bytes_map = (unsigned char *)&amx->map;
int reg = mc->reg;
int value = ucontrol->value.integer.value[0];
+ unsigned int mask_val = amx->byte_mask[reg / 32];
- if (value == bytes_map[reg])
+ if (value >= 0 && value <= 255)
+ mask_val |= (1 << (reg % 32));
+ else
+ mask_val &= ~(1 << (reg % 32));
+
+ if (mask_val == amx->byte_mask[reg / 32])
return 0;
- if (value >= 0 && value <= 255) {
- /* Update byte map and enable slot */
- bytes_map[reg] = value;
- if (reg > 31)
- amx->byte_mask[1] |= (1 << (reg - 32));
- else
- amx->byte_mask[0] |= (1 << reg);
- } else {
- /* Reset byte map and disable slot */
- bytes_map[reg] = 0;
- if (reg > 31)
- amx->byte_mask[1] &= ~(1 << (reg - 32));
- else
- amx->byte_mask[0] &= ~(1 << reg);
- }
+ /* Update byte map and slot */
+ bytes_map[reg] = value % 256;
+ amx->byte_mask[reg / 32] = mask_val;
return 1;
}
--
2.7.4
Before calling add partition or resize partition, there is no check
on whether the length is aligned with the logical block size.
If the logical block size of the disk is larger than 512 bytes,
then the partition size maybe not the multiple of the logical block size,
and when the last sector is read, bio_truncate() will adjust the bio size,
resulting in an IO error if the size of the read command is smaller than
the logical block size.If integrity data is supported, this will also
result in a null pointer dereference when calling bio_integrity_free.
Cc: stable(a)vger.kernel.org
Signed-off-by: Min Li <min15.li(a)samsung.com>
---
Changes from v1:
- Add a space after /* and before */.
- Move length alignment check before the "start = p.start >> SECTOR_SHIFT"
- Move check for p.start being aligned together with this length alignment check.
Changes from v2:
- Add the assignment on the first line and merge the two lines into one.
Changes from v3:
- Change the blksz to unsigned int.
- Add check if p.start and p.length are negative.
---
block/ioctl.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/block/ioctl.c b/block/ioctl.c
index 3be11941fb2d..a8061c2fcae0 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -16,9 +16,10 @@
static int blkpg_do_ioctl(struct block_device *bdev,
struct blkpg_partition __user *upart, int op)
{
+ unsigned int blksz = bdev_logical_block_size(bdev);
struct gendisk *disk = bdev->bd_disk;
struct blkpg_partition p;
- long long start, length;
+ sector_t start, length;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
@@ -33,14 +34,17 @@ static int blkpg_do_ioctl(struct block_device *bdev,
if (op == BLKPG_DEL_PARTITION)
return bdev_del_partition(disk, p.pno);
+ if (p.start < 0 || p.length <= 0 || p.start + p.length < 0)
+ return -EINVAL;
+ /* Check that the partition is aligned to the block size */
+ if (!IS_ALIGNED(p.start | p.length, blksz))
+ return -EINVAL;
+
start = p.start >> SECTOR_SHIFT;
length = p.length >> SECTOR_SHIFT;
switch (op) {
case BLKPG_ADD_PARTITION:
- /* check if partition is aligned to blocksize */
- if (p.start & (bdev_logical_block_size(bdev) - 1))
- return -EINVAL;
return bdev_add_partition(disk, p.pno, start, length);
case BLKPG_RESIZE_PARTITION:
return bdev_resize_partition(disk, p.pno, start, length);
--
2.34.1
From: Xiubo Li <xiubli(a)redhat.com>
If a client sends out a cap-update request with the old 'seq' just
before a pending cap revoke request, then the MDS might miscalculate
the 'seqs' and caps. It's therefore always a good idea to ack the
cap revoke request with the bumped up 'seq'.
Cc: stable(a)vger.kernel.org
Cc: Patrick Donnelly <pdonnell(a)redhat.com>
URL: https://tracker.ceph.com/issues/61782
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
---
V2:
- Rephrased the commit comment for better understanding from Milind
fs/ceph/caps.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 1052885025b3..eee2fbca3430 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3737,6 +3737,15 @@ static void handle_cap_grant(struct inode *inode,
}
BUG_ON(cap->issued & ~cap->implemented);
+ /* don't let check_caps skip sending a response to MDS for revoke msgs */
+ if (le32_to_cpu(grant->op) == CEPH_CAP_OP_REVOKE) {
+ cap->mds_wanted = 0;
+ if (cap == ci->i_auth_cap)
+ check_caps = 1; /* check auth cap only */
+ else
+ check_caps = 2; /* check all caps */
+ }
+
if (extra_info->inline_version > 0 &&
extra_info->inline_version >= ci->i_inline_version) {
ci->i_inline_version = extra_info->inline_version;
--
2.40.1
commit 1c249565426e3a9940102c0ba9f63914f7cda73d upstream.
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.
On this system, stext fell within the vmalloc segment. The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information. In this case, however, there were two: a very large
vmalloc segment, and the text segment. This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments. However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace. The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.
Instead, select the smallest matching segment that contains stext
instead of the first matching segment. This allows us to match the text
segment instead of vmalloc, if one is contained within the other.
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: David Reaver <me(a)davidreaver.com>
Cc: Ian Rogers <irogers(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Michael Petlan <mpetlan(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
---
tools/perf/util/symbol.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a3a165ae933a..98014f937568 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1368,10 +1368,23 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
/* Find the kernel map using the '_stext' symbol */
if (!kallsyms__get_function_start(kallsyms_filename, "_stext", &stext)) {
+ u64 replacement_size = 0;
+
list_for_each_entry(new_map, &md.maps, node) {
- if (stext >= new_map->start && stext < new_map->end) {
+ u64 new_size = new_map->end - new_map->start;
+
+ if (!(stext >= new_map->start && stext < new_map->end))
+ continue;
+
+ /*
+ * On some architectures, ARM64 for example, the kernel
+ * text can get allocated inside of the vmalloc segment.
+ * Select the smallest matching segment, in case stext
+ * falls within more than one in the list.
+ */
+ if (!replacement_map || new_size < replacement_size) {
replacement_map = new_map;
- break;
+ replacement_size = new_size;
}
}
}
--
2.25.1
We have to verify the selected mode as Userspace might request one which
we can't configure the GPU for.
X with the modesetting DDX is adding a bunch of modes, some so far outside
of hardware limits that things simply break.
Sadly we can't fix X this way as on start it sticks to one mode and
ignores any error and there is really nothing we can do about this, but at
least this way we won't let the GPU run into any errors caused by a non
supported display mode.
However this does prevent X from switching to such a mode, which to be
fair is an improvement as well.
Seen on one of my Tesla GPUs with a connected 4K display.
Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/199
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org # v6.1+
Signed-off-by: Karol Herbst <kherbst(a)redhat.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 22c42a5e184d..edf490c1490c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1114,6 +1114,25 @@ nouveau_connector_atomic_check(struct drm_connector *connector, struct drm_atomi
struct drm_connector_state *conn_state =
drm_atomic_get_new_connector_state(state, connector);
+ /* As we can get any random mode from Userspace, we have to make sure the to be set mode
+ * is valid and does not violate hardware constraints as we rely on it being sane.
+ */
+ if (conn_state->crtc) {
+ struct drm_crtc_state *crtc_state =
+ drm_atomic_get_crtc_state(state, conn_state->crtc);
+
+ if (IS_ERR(crtc_state))
+ return PTR_ERR(crtc_state);
+
+ if (crtc_state->enable && (crtc_state->mode_changed ||
+ crtc_state->connectors_changed)) {
+ struct drm_display_mode *mode = &crtc_state->mode;
+
+ if (connector->helper_private->mode_valid(connector, mode) != MODE_OK)
+ return -EINVAL;
+ }
+ }
+
if (!nv_conn->dp_encoder || !nv50_has_mst(nouveau_drm(connector->dev)))
return 0;
--
2.41.0
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x d7893093a7417527c0d73c9832244e65c9d0114f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062812-bloated-equal-cc64@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7893093a7417527c0d73c9832244e65c9d0114f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 15 Jun 2023 22:33:57 +0200
Subject: [PATCH] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead() it can end up in a triple fault during the bootup of the
kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as
those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against
NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ashok Raj <ashok.raj(a)intel.com>
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4e91054c84be..d4ce5cb5c953 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
+void smp_kick_mwait_play_dead(void);
+
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5ac5d74cdd4..483df0427678 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
+#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
unsigned int status;
};
+#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
+
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
{
int cpuid;
+ /* Mop up eventual mwait_play_dead() wreckage */
+ this_cpu_write(mwait_cpu_dead.status, 0);
+ this_cpu_write(mwait_cpu_dead.control, 0);
+
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
+ /* Set up state for the kexec() hack below */
+ md->status = CPUDEAD_MWAIT_WAIT;
+ md->control = CPUDEAD_MWAIT_WAIT;
+
wbinvd();
while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
mb();
__mwait(eax, 0);
+ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+ /*
+ * Kexec is about to happen. Don't go back into mwait() as
+ * the kexec kernel might overwrite text and data including
+ * page tables and stack. So mwait() would resume when the
+ * monitor cache line is written to and then the CPU goes
+ * south due to overwritten text, page tables and stack.
+ *
+ * Note: This does _NOT_ protect against a stray MCE, NMI,
+ * SMI. They will resume execution at the instruction
+ * following the HLT instruction and run into the problem
+ * which this is trying to prevent.
+ */
+ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+ while(1)
+ native_halt();
+ }
+
cond_wakeup_cpu0();
}
}
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+ struct mwait_cpu_dead *md;
+ unsigned int cpu, i;
+
+ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+ /* Does it sit in mwait_play_dead() ? */
+ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+ continue;
+
+ /* Wait up to 5ms */
+ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+ /* Bring it out of mwait */
+ WRITE_ONCE(md->control, newstate);
+ udelay(5);
+ }
+
+ if (READ_ONCE(md->status) != newstate)
+ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+ }
+}
+
void __noreturn hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d7893093a7417527c0d73c9832244e65c9d0114f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062811-ambush-finishing-abd6@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7893093a7417527c0d73c9832244e65c9d0114f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 15 Jun 2023 22:33:57 +0200
Subject: [PATCH] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead() it can end up in a triple fault during the bootup of the
kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as
those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against
NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ashok Raj <ashok.raj(a)intel.com>
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4e91054c84be..d4ce5cb5c953 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
+void smp_kick_mwait_play_dead(void);
+
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5ac5d74cdd4..483df0427678 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
+#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
unsigned int status;
};
+#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
+
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
{
int cpuid;
+ /* Mop up eventual mwait_play_dead() wreckage */
+ this_cpu_write(mwait_cpu_dead.status, 0);
+ this_cpu_write(mwait_cpu_dead.control, 0);
+
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
+ /* Set up state for the kexec() hack below */
+ md->status = CPUDEAD_MWAIT_WAIT;
+ md->control = CPUDEAD_MWAIT_WAIT;
+
wbinvd();
while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
mb();
__mwait(eax, 0);
+ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+ /*
+ * Kexec is about to happen. Don't go back into mwait() as
+ * the kexec kernel might overwrite text and data including
+ * page tables and stack. So mwait() would resume when the
+ * monitor cache line is written to and then the CPU goes
+ * south due to overwritten text, page tables and stack.
+ *
+ * Note: This does _NOT_ protect against a stray MCE, NMI,
+ * SMI. They will resume execution at the instruction
+ * following the HLT instruction and run into the problem
+ * which this is trying to prevent.
+ */
+ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+ while(1)
+ native_halt();
+ }
+
cond_wakeup_cpu0();
}
}
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+ struct mwait_cpu_dead *md;
+ unsigned int cpu, i;
+
+ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+ /* Does it sit in mwait_play_dead() ? */
+ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+ continue;
+
+ /* Wait up to 5ms */
+ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+ /* Bring it out of mwait */
+ WRITE_ONCE(md->control, newstate);
+ udelay(5);
+ }
+
+ if (READ_ONCE(md->status) != newstate)
+ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+ }
+}
+
void __noreturn hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d7893093a7417527c0d73c9832244e65c9d0114f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062809-ultimate-spider-e3a0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7893093a7417527c0d73c9832244e65c9d0114f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 15 Jun 2023 22:33:57 +0200
Subject: [PATCH] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead() it can end up in a triple fault during the bootup of the
kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as
those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against
NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ashok Raj <ashok.raj(a)intel.com>
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4e91054c84be..d4ce5cb5c953 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
+void smp_kick_mwait_play_dead(void);
+
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5ac5d74cdd4..483df0427678 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
+#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
unsigned int status;
};
+#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
+
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
{
int cpuid;
+ /* Mop up eventual mwait_play_dead() wreckage */
+ this_cpu_write(mwait_cpu_dead.status, 0);
+ this_cpu_write(mwait_cpu_dead.control, 0);
+
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
+ /* Set up state for the kexec() hack below */
+ md->status = CPUDEAD_MWAIT_WAIT;
+ md->control = CPUDEAD_MWAIT_WAIT;
+
wbinvd();
while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
mb();
__mwait(eax, 0);
+ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+ /*
+ * Kexec is about to happen. Don't go back into mwait() as
+ * the kexec kernel might overwrite text and data including
+ * page tables and stack. So mwait() would resume when the
+ * monitor cache line is written to and then the CPU goes
+ * south due to overwritten text, page tables and stack.
+ *
+ * Note: This does _NOT_ protect against a stray MCE, NMI,
+ * SMI. They will resume execution at the instruction
+ * following the HLT instruction and run into the problem
+ * which this is trying to prevent.
+ */
+ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+ while(1)
+ native_halt();
+ }
+
cond_wakeup_cpu0();
}
}
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+ struct mwait_cpu_dead *md;
+ unsigned int cpu, i;
+
+ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+ /* Does it sit in mwait_play_dead() ? */
+ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+ continue;
+
+ /* Wait up to 5ms */
+ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+ /* Bring it out of mwait */
+ WRITE_ONCE(md->control, newstate);
+ udelay(5);
+ }
+
+ if (READ_ONCE(md->status) != newstate)
+ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+ }
+}
+
void __noreturn hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d7893093a7417527c0d73c9832244e65c9d0114f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062808-each-phony-f7b3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7893093a7417527c0d73c9832244e65c9d0114f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 15 Jun 2023 22:33:57 +0200
Subject: [PATCH] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead() it can end up in a triple fault during the bootup of the
kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as
those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against
NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ashok Raj <ashok.raj(a)intel.com>
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4e91054c84be..d4ce5cb5c953 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
+void smp_kick_mwait_play_dead(void);
+
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5ac5d74cdd4..483df0427678 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
+#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
unsigned int status;
};
+#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
+
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
{
int cpuid;
+ /* Mop up eventual mwait_play_dead() wreckage */
+ this_cpu_write(mwait_cpu_dead.status, 0);
+ this_cpu_write(mwait_cpu_dead.control, 0);
+
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
+ /* Set up state for the kexec() hack below */
+ md->status = CPUDEAD_MWAIT_WAIT;
+ md->control = CPUDEAD_MWAIT_WAIT;
+
wbinvd();
while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
mb();
__mwait(eax, 0);
+ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+ /*
+ * Kexec is about to happen. Don't go back into mwait() as
+ * the kexec kernel might overwrite text and data including
+ * page tables and stack. So mwait() would resume when the
+ * monitor cache line is written to and then the CPU goes
+ * south due to overwritten text, page tables and stack.
+ *
+ * Note: This does _NOT_ protect against a stray MCE, NMI,
+ * SMI. They will resume execution at the instruction
+ * following the HLT instruction and run into the problem
+ * which this is trying to prevent.
+ */
+ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+ while(1)
+ native_halt();
+ }
+
cond_wakeup_cpu0();
}
}
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+ struct mwait_cpu_dead *md;
+ unsigned int cpu, i;
+
+ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+ /* Does it sit in mwait_play_dead() ? */
+ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+ continue;
+
+ /* Wait up to 5ms */
+ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+ /* Bring it out of mwait */
+ WRITE_ONCE(md->control, newstate);
+ udelay(5);
+ }
+
+ if (READ_ONCE(md->status) != newstate)
+ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+ }
+}
+
void __noreturn hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x d7893093a7417527c0d73c9832244e65c9d0114f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023062806-handcuff-depress-af1d@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d7893093a7417527c0d73c9832244e65c9d0114f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 15 Jun 2023 22:33:57 +0200
Subject: [PATCH] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead() it can end up in a triple fault during the bootup of the
kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as
those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against
NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ashok Raj <ashok.raj(a)intel.com>
Reviewed-by: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4e91054c84be..d4ce5cb5c953 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
+void smp_kick_mwait_play_dead(void);
+
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5ac5d74cdd4..483df0427678 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
+#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
unsigned int status;
};
+#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
+
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
{
int cpuid;
+ /* Mop up eventual mwait_play_dead() wreckage */
+ this_cpu_write(mwait_cpu_dead.status, 0);
+ this_cpu_write(mwait_cpu_dead.control, 0);
+
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
+ /* Set up state for the kexec() hack below */
+ md->status = CPUDEAD_MWAIT_WAIT;
+ md->control = CPUDEAD_MWAIT_WAIT;
+
wbinvd();
while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
mb();
__mwait(eax, 0);
+ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+ /*
+ * Kexec is about to happen. Don't go back into mwait() as
+ * the kexec kernel might overwrite text and data including
+ * page tables and stack. So mwait() would resume when the
+ * monitor cache line is written to and then the CPU goes
+ * south due to overwritten text, page tables and stack.
+ *
+ * Note: This does _NOT_ protect against a stray MCE, NMI,
+ * SMI. They will resume execution at the instruction
+ * following the HLT instruction and run into the problem
+ * which this is trying to prevent.
+ */
+ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+ while(1)
+ native_halt();
+ }
+
cond_wakeup_cpu0();
}
}
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+ struct mwait_cpu_dead *md;
+ unsigned int cpu, i;
+
+ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+ /* Does it sit in mwait_play_dead() ? */
+ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+ continue;
+
+ /* Wait up to 5ms */
+ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+ /* Bring it out of mwait */
+ WRITE_ONCE(md->control, newstate);
+ udelay(5);
+ }
+
+ if (READ_ONCE(md->status) != newstate)
+ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+ }
+}
+
void __noreturn hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)
From: Philip Yang <Philip.Yang(a)amd.com>
Under VRAM usage pression, map to GPU may fail to create pt bo and
vmbo->shadow_list is not initialized, then ttm_bo_release calling
amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below
dmesg and NULL pointer access backtrace:
Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt
bo successfully, otherwise use default callback amdgpu_bo_destroy.
amdgpu: amdgpu_vm_bo_update failed
amdgpu: update_gpuvm_pte() failed
amdgpu: Failed to map bo to gpuvm
amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2
BUG: kernel NULL pointer dereference, address:
RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
Call Trace:
<TASK>
ttm_bo_release+0x207/0x320 [amdttm]
amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
amdgpu_bo_create+0x1ba/0x520 [amdgpu]
amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
update_gpuvm_pte+0x160/0x420 [amdgpu]
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
kfd_ioctl+0x24a/0x5b0 [amdgpu]
Signed-off-by: Philip Yang <Philip.Yang(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 9a3c6067bd2ee2ca2652fbb0679f422f3c9109f9)
This fixes a regression introduced by commit 1cc40dccad76 ("drm/amdgpu:
fix Null pointer dereference error in amdgpu_device_recover_vram") in
5.15.118. It's a hand modified cherry-pick because that commit that
introduced the regression touched nearby code and the context is now
incorrect.
Cc: Linux Regressions <regressions(a)lists.linux.dev>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2650
Fixes: 1cc40dccad76 ("drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram")
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d03a4519f945..8a0b652da4f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -685,7 +685,6 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev,
* num of amdgpu_vm_pt entries.
*/
BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo_vm));
- bp->destroy = &amdgpu_bo_vm_destroy;
r = amdgpu_bo_create(adev, bp, &bo_ptr);
if (r)
return r;
--
2.34.1