From: Rander Wang <rander.wang(a)intel.com>
commit 33c8516841ea4fa12fdb8961711bf95095c607ee upstream
On TGL platforms with the max98373 codec, the trigger start sequence is
FE first, then the codec component, and the sdw link last. A delay was
recently introduced in the max98373 codec driver. As a result, the start
of sdw stream transmission was delayed, the data transmitted by the fw
could not be consumed by the sdw controller, and an xrun happened.
Adding a delay in the trigger function is a bad idea. This patch enables
the spk pin in the prepare function and disables it in hw_free to avoid
the xrun caused by the delay in trigger.
Fixes: 3a27875e91fb ("ASoC: max98373: Added 30ms turn on/off time delay")
BugLink: https://github.com/thesofproject/sof/issues/4066
Reviewed-by: Bard Liao <bard.liao(a)intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Signed-off-by: Rander Wang <rander.wang(a)intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210625205042.65181-2-pierre-louis.bossart@linux…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
backport to stable/linux-5.13.y and stable/linux-5.12.y since upstream
commit does not apply directly due to a rename in 9c5046e4b3e7 which
creates a conflict.
There is no need to apply this patch to earlier versions since the
commit 3a27875e91fb is only present in V5.12+
sound/soc/intel/boards/sof_sdw_max98373.c | 81 +++++++++++++++--------
1 file changed, 53 insertions(+), 28 deletions(-)
diff --git a/sound/soc/intel/boards/sof_sdw_max98373.c b/sound/soc/intel/boards/sof_sdw_max98373.c
index cfdf970c5800..25daef910aee 100644
--- a/sound/soc/intel/boards/sof_sdw_max98373.c
+++ b/sound/soc/intel/boards/sof_sdw_max98373.c
@@ -55,43 +55,68 @@ static int spk_init(struct snd_soc_pcm_runtime *rtd)
return ret;
}
-static int max98373_sdw_trigger(struct snd_pcm_substream *substream, int cmd)
+static int mx8373_enable_spk_pin(struct snd_pcm_substream *substream, bool enable)
{
+ struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream);
+ struct snd_soc_dai *codec_dai;
+ struct snd_soc_dai *cpu_dai;
int ret;
+ int j;
- switch (cmd) {
- case SNDRV_PCM_TRIGGER_START:
- case SNDRV_PCM_TRIGGER_RESUME:
- case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
- /* enable max98373 first */
- ret = max98373_trigger(substream, cmd);
- if (ret < 0)
- break;
-
- ret = sdw_trigger(substream, cmd);
- break;
- case SNDRV_PCM_TRIGGER_STOP:
- case SNDRV_PCM_TRIGGER_SUSPEND:
- case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
- ret = sdw_trigger(substream, cmd);
- if (ret < 0)
- break;
-
- ret = max98373_trigger(substream, cmd);
- break;
- default:
- ret = -EINVAL;
- break;
+ /* set spk pin by playback only */
+ if (substream->stream == SNDRV_PCM_STREAM_CAPTURE)
+ return 0;
+
+ cpu_dai = asoc_rtd_to_cpu(rtd, 0);
+ for_each_rtd_codec_dais(rtd, j, codec_dai) {
+ struct snd_soc_dapm_context *dapm =
+ snd_soc_component_get_dapm(cpu_dai->component);
+ char pin_name[16];
+
+ snprintf(pin_name, ARRAY_SIZE(pin_name), "%s Spk",
+ codec_dai->component->name_prefix);
+
+ if (enable)
+ ret = snd_soc_dapm_enable_pin(dapm, pin_name);
+ else
+ ret = snd_soc_dapm_disable_pin(dapm, pin_name);
+
+ if (!ret)
+ snd_soc_dapm_sync(dapm);
}
- return ret;
+ return 0;
+}
+
+static int mx8373_sdw_prepare(struct snd_pcm_substream *substream)
+{
+ int ret = 0;
+
+ /* according to soc_pcm_prepare dai link prepare is called first */
+ ret = sdw_prepare(substream);
+ if (ret < 0)
+ return ret;
+
+ return mx8373_enable_spk_pin(substream, true);
+}
+
+static int mx8373_sdw_hw_free(struct snd_pcm_substream *substream)
+{
+ int ret = 0;
+
+ /* according to soc_pcm_hw_free dai link free is called first */
+ ret = sdw_hw_free(substream);
+ if (ret < 0)
+ return ret;
+
+ return mx8373_enable_spk_pin(substream, false);
}
static const struct snd_soc_ops max_98373_sdw_ops = {
.startup = sdw_startup,
- .prepare = sdw_prepare,
- .trigger = max98373_sdw_trigger,
- .hw_free = sdw_hw_free,
+ .prepare = mx8373_sdw_prepare,
+ .trigger = sdw_trigger,
+ .hw_free = mx8373_sdw_hw_free,
.shutdown = sdw_shutdown,
};
--
2.25.1
On 8/2/2021 7:43 AM, ci_notify(a)linaro.org wrote:
> Successfully identified regression in *linux* in CI configuration tcwg_kernel/llvm-release-arm-stable-allyesconfig. So far, this commit has regressed CI configurations:
> - tcwg_kernel/llvm-release-arm-stable-allyesconfig
>
> Culprit:
> <cut>
> commit 341db343768bc44f3512facc464021730d64071c
> Author: Linus Walleij <linus.walleij(a)linaro.org>
> Date: Sun May 23 00:50:39 2021 +0200
>
> power: supply: ab8500: Move to componentized binding
>
> [ Upstream commit 1c1f13a006ed0d71bb5664c8b7e3e77a28da3beb ]
>
> The driver has problems with the different components of
> the charging code racing with each other to probe().
>
> This results in all four subdrivers populating battery
> information to ascertain that it is populated for their
> own needs for example.
>
> Fix this by using component probing and thus expressing
> to the kernel that these are dependent components.
> The probes can happen in any order and will only acquire
> resources such as state container, regulators and
> interrupts and initialize the data structures, but no
> execution happens until the .bind() callback is called.
>
> The charging driver is the main component and binds
> first, then bind in order the three subcomponents:
> ab8500-fg, ab8500-btemp and ab8500-chargalg.
>
> Do some housekeeping while we are moving the code around.
> Like use devm_* for IRQs so as to cut down on some
> boilerplate.
>
> Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
> Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> </cut>
>
> Results regressed to (for first_bad == 341db343768bc44f3512facc464021730d64071c)
> # reset_artifacts:
> -10
> # build_abe binutils:
> -9
> # build_llvm:
> -5
> # build_abe qemu:
> -2
> # linux_n_obj:
> 19634
> # First few build errors in logs:
> # 00:03:07 drivers/power/supply/ab8500_fg.c:3061:32: error: use of undeclared identifier 'np'
> # 00:03:08 make[3]: *** [drivers/power/supply/ab8500_fg.o] Error 1
> # 00:03:10 make[2]: *** [drivers/power/supply] Error 2
> # 00:03:10 make[1]: *** [drivers/power] Error 2
> # 00:04:05 make: *** [drivers] Error 2
Greg and Sasha,
Please cherry pick upstream commit 7e2bb83c617f ("power: supply: ab8500:
Call battery population once") to resolve this build error on 5.13.
Cheers,
Nathan
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 240246f6b913b0c23733cfd2def1d283f8cc9bbe Mon Sep 17 00:00:00 2001
From: Goldwyn Rodrigues <rgoldwyn(a)suse.de>
Date: Fri, 9 Jul 2021 11:29:22 -0500
Subject: [PATCH] btrfs: mark compressed range uptodate only if all bio succeed
In the compression write endio sequence, the range which the compressed_bio
writes is marked as uptodate if the last bio of the compressed (sub)bios
completed successfully. An earlier bio may have failed, which is
recorded in cb->errors.
Set the writeback range as uptodate only if cb->errors is zero, as opposed
to checking only the last bio's status.
Backporting notes: in all versions up to 4.4 the last argument is always
replaced by "!cb->errors".
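The difference between the two checks can be sketched in plain userspace C.
This is only an illustration, not the btrfs code: struct toy_cb, toy_end_bio()
and both uptodate_*() helpers are hypothetical names, and a status of 0 is
assumed to mean success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the compressed_bio; only the error count is modeled. */
struct toy_cb {
	int errors;	/* number of sub-bios that completed with an error */
};

/* Record one sub-bio completion; status 0 means success (assumption). */
static void toy_end_bio(struct toy_cb *cb, int status)
{
	assert(cb != NULL);
	if (status)
		cb->errors++;
}

/* Old check: look only at the status of the last sub-bio to complete. */
static bool uptodate_last_only(const int *status, size_t n)
{
	return n > 0 && status[n - 1] == 0;
}

/* New check: the range is uptodate only if no sub-bio recorded an error. */
static bool uptodate_all(const int *status, size_t n)
{
	struct toy_cb cb = { 0 };
	size_t i;

	for (i = 0; i < n; i++)
		toy_end_bio(&cb, status[i]);
	return !cb.errors;
}
```

With the statuses { 0, -5, 0 } the old check reports the range as uptodate
while the new one does not, which is the case the patch is about.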
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Goldwyn Rodrigues <rgoldwyn(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 9a023ae0f98b..30d82cdf128c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -352,7 +352,7 @@ static void end_compressed_bio_write(struct bio *bio)
btrfs_record_physical_zoned(inode, cb->start, bio);
btrfs_writepage_endio_finish_ordered(BTRFS_I(inode), NULL,
cb->start, cb->start + cb->len - 1,
- bio->bi_status == BLK_STS_OK);
+ !cb->errors);
end_compressed_writeback(inode, cb);
/* note, our inode could be gone now */
Commit 1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup
logic") changed the pipe write logic to wake up readers only if the pipe
was empty at the time of the write. However, there are libraries that relied
upon the older behavior for a notification scheme similar to what's
described in [1].
One such library, 'realm-core' [2], is used by numerous Android applications.
The library uses a similar notification mechanism as GNU Make but it
never drains the pipe until it is full. When Android moved to the v5.10
kernel, all applications using this library stopped working.
The C program at the end of this email mimics the library code.
The program works with the 5.4 kernel. It fails with v5.10 and I am fairly
certain it will fail with v5.5 as well. The single patch in this series
restores the old behavior. With the patch, the test and all affected
Android applications start working with v5.10.
After reading through epoll(7), I think the pipe should be drained after
each epoll_wait() comes back, and that a non-empty pipe is
considered to be "ready" for readers. The problem is that prior
to the commit above, any new data written to a non-empty pipe
would wake up threads waiting in epoll(EPOLLIN|EPOLLET), and that's
how this library worked.
I do think the program below is using EPOLLET wrong. However, it
used to work before and now it doesn't. So, I thought it was
worth asking if this counts as a userspace break.
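For comparison, here is a minimal sketch of the usage epoll(7) describes,
where the reader drains the pipe after every wakeup. It is not taken from
realm-core; drain_fd() and count_wakeups_with_drain() are hypothetical
helpers written only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a non-blocking fd until EAGAIN so the pipe is
 * empty again and the next write produces a fresh edge-triggered event.
 * Returns the number of bytes drained, or -1 on a real error.
 */
static long drain_fd(int fd)
{
	char buf[256];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			return total;	/* every writer has closed its end */
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* pipe is empty */
		return -1;
	}
}

/*
 * Hypothetical self-check: send `count` one-byte notifications through a
 * pipe and drain after every wakeup. Returns the number of wakeups seen,
 * or -1 on setup failure.
 */
static int count_wakeups_with_drain(int count)
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
	int fds[2], epfd, i, wakeups = 0;

	assert(count >= 0);
	if (pipe(fds) < 0)
		return -1;
	epfd = epoll_create1(0);
	ev.data.fd = fds[0];
	if (epfd < 0 ||
	    fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
	    epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0) {
		wakeups = -1;
		goto out;
	}

	for (i = 0; i < count; i++) {
		char c = (char)i;

		if (write(fds[1], &c, 1) != 1)
			break;
		if (epoll_wait(epfd, &ev, 1, 1000) != 1)
			break;
		wakeups++;
		if (drain_fd(fds[0]) < 0)
			break;
	}
out:
	if (epfd >= 0)
		close(epfd);
	close(fds[0]);
	close(fds[1]);
	return wakeups;
}
```

Because the pipe is empty before every write, each write is an
empty-to-non-empty transition, so this pattern should not depend on which
of the two wakeup behaviors the kernel implements.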
There was also a symmetrical change made to pipe_read in commit
<f467a6a66419> ("pipe: fix and clarify pipe read wakeup logic")
that I am not sure needs changing.
The library has since been fixed[3] but it will be a while
before all applications incorporate the updated library.
- ssp
1. https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0…
2. https://github.com/realm/realm-core
3. https://github.com/realm/realm-core/issues/4666
====
#include <stdio.h>
#include <error.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>

#define FIFO_NAME "epoll-test-fifo"

pthread_t tid;
int max_delay_ms = 20;
int fifo_fd;
unsigned char written;
unsigned char received;
int epoll_fd;

void *wait_on_fifo(void *unused)
{
	while (1) {
		struct epoll_event ev;
		int ret;
		unsigned char c;

		ret = epoll_wait(epoll_fd, &ev, 1, 5000);
		if (ret == -1) {
			/* If interrupted syscall, continue .. */
			if (errno == EINTR)
				continue;
			/* epoll_wait failed, bail.. */
			error(99, errno, "epoll_wait failed \n");
		}

		/* timeout */
		if (ret == 0)
			break;

		if (ev.data.fd == fifo_fd) {
			/* Assume this is notification where the thread is catching up.
			 * pipe is emptied by the writer when it detects it is full.
			 */
			received = written;
		}
	}

	return NULL;
}

int write_fifo(int fd, unsigned char c)
{
	while (1) {
		int actual;
		char buf[1024];
		ssize_t ret = write(fd, &c, 1);

		if (ret == 1)
			break;
		/*
		 * If the pipe's buffer is full, we need to read some of the old data in
		 * it to make space. We don't read in the code waiting for
		 * notifications so that we can notify multiple waiters with a single
		 * write.
		 */
		if (ret != 0) {
			if (errno != EAGAIN)
				return -EIO;
		}
		actual = read(fd, buf, 1024);
		if (actual == 0)
			return -errno;
	}

	return 0;
}

int create_and_setup_fifo()
{
	int ret;
	char fifo_path[4096];
	struct epoll_event ev;
	char *tmpdir = getenv("TMPDIR");

	if (tmpdir == NULL)
		tmpdir = ".";

	ret = sprintf(fifo_path, "%s/%s", tmpdir, FIFO_NAME);
	if (access(fifo_path, F_OK) == 0)
		unlink(fifo_path);

	ret = mkfifo(fifo_path, 0600);
	if (ret < 0)
		error(1, errno, "Failed to create fifo");

	fifo_fd = open(fifo_path, O_RDWR | O_NONBLOCK);
	if (fifo_fd < 0)
		error(2, errno, "Failed to open Fifo");

	ev.events = EPOLLIN | EPOLLET;
	ev.data.fd = fifo_fd;

	ret = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fifo_fd, &ev);
	if (ret < 0)
		error(4, errno, "Failed to add fifo to epoll instance");

	return 0;
}

int main(int argc, char *argv[])
{
	int ret, random;
	unsigned char c = 1;

	epoll_fd = epoll_create(1);
	if (epoll_fd == -1)
		error(3, errno, "Failed to create epoll instance");

	ret = create_and_setup_fifo();
	if (ret != 0)
		error(45, EINVAL, "Failed to setup fifo");

	ret = pthread_create(&tid, NULL, wait_on_fifo, NULL);
	if (ret != 0)
		error(2, errno, "Failed to create a thread");

	srand(time(NULL));

	/* Write 256 bytes to fifo one byte at a time with random delays up to 20ms */
	do {
		written = c;
		ret = write_fifo(fifo_fd, c);
		if (ret != 0)
			error(55, errno, "Failed to notify fifo, write #%u", (unsigned int)c);
		c++;

		random = rand();
		usleep((random % max_delay_ms) * 1000);
	} while (written <= c); /* stop after c = 255 */

	pthread_join(tid, NULL);

	printf("Test: %s", written == received ? "PASS\n" : "FAIL");
	if (written != received)
		printf(": Written (%d) Received (%d)\n", written, received);

	close(fifo_fd);
	close(epoll_fd);

	return 0;
}
====
Sandeep Patil (1):
fs: pipe: wakeup readers everytime new data written is to pipe
fs/pipe.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
--
2.32.0.554.ge1b32706d8-goog
The HYP rodata section is currently lumped together with the BSS,
which isn't exactly what is expected (it gets registered with
kmemleak, for example).
Move it away so that it is actually marked RO. As an added
benefit, it isn't registered with kmemleak anymore.
Fixes: 380e18ade4a5 ("KVM: arm64: Introduce a BSS section for use at Hyp")
Suggested-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org #5.13
---
arch/arm64/kernel/vmlinux.lds.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 709d2c433c5e..f6b1a88245db 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -181,6 +181,8 @@ SECTIONS
/* everything from this point to __init_begin will be marked RO NX */
RO_DATA(PAGE_SIZE)
+ HYPERVISOR_DATA_SECTIONS
+
idmap_pg_dir = .;
. += IDMAP_DIR_SIZE;
idmap_pg_end = .;
@@ -260,8 +262,6 @@ SECTIONS
_sdata = .;
RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
- HYPERVISOR_DATA_SECTIONS
-
/*
* Data written with the MMU off but read with the MMU on requires
* cache lines to be invalidated, discarding up to a Cache Writeback
--
2.30.2
Booting a KVM host in protected mode with kmemleak quickly results
in a pretty bad crash, as kmemleak doesn't know that the HYP sections
have been taken away. This is especially true for the BSS section,
which is part of the kernel BSS section and registered at boot time
by kmemleak itself.
Unregister the HYP part of the BSS before making that section
HYP-private. The rest of the HYP-specific data is obtained via
the page allocator or lives in other sections, none of which is
subjected to kmemleak.
Fixes: 90134ac9cabb ("KVM: arm64: Protect the .hyp sections from the host")
Reviewed-by: Quentin Perret <qperret(a)google.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: stable(a)vger.kernel.org # 5.13
---
arch/arm64/kvm/arm.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9a2b8f27792..52242f32c4be 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -15,6 +15,7 @@
#include <linux/fs.h>
#include <linux/mman.h>
#include <linux/sched.h>
+#include <linux/kmemleak.h>
#include <linux/kvm.h>
#include <linux/kvm_irqfd.h>
#include <linux/irqbypass.h>
@@ -1982,6 +1983,12 @@ static int finalize_hyp_mode(void)
if (ret)
return ret;
+ /*
+ * Exclude HYP BSS from kmemleak so that it doesn't get peeked
+ * at, which would end badly once the section is inaccessible.
+ * None of other sections should ever be introspected.
+ */
+ kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
ret = pkvm_mark_hyp_section(__hyp_bss);
if (ret)
return ret;
--
2.30.2
A recent change in LLVM causes module_{c,d}tor sections to appear when
CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings
because these are not handled anywhere:
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'
Place them in the TEXT_TEXT section so that these technologies continue
to work with the newer compiler versions. All of the KASAN and KCSAN
KUnit tests continue to pass after this change.
Cc: stable(a)vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1432
Link: https://github.com/llvm/llvm-project/commit/7b789562244ee941b7bf2cefeb3fc08…
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
include/asm-generic/vmlinux.lds.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 17325416e2de..3b79b1e76556 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -586,6 +586,7 @@
NOINSTR_TEXT \
*(.text..refcount) \
*(.ref.text) \
+ *(.text.asan .text.asan.*) \
TEXT_CFI_JT \
MEM_KEEP(init.text*) \
MEM_KEEP(exit.text*) \
base-commit: 4669e13cd67f8532be12815ed3d37e775a9bdc16
--
2.32.0.264.g75ae10bc75
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6be50f5d83adc9541de3d5be26e968182b5ac150 Mon Sep 17 00:00:00 2001
From: Stylon Wang <stylon.wang(a)amd.com>
Date: Wed, 21 Jul 2021 12:25:24 +0800
Subject: [PATCH] drm/amd/display: Fix ASSR regression on embedded panels
[Why]
A regression found in some embedded panels traces back to the earliest
upstreamed ASSR patch. The changed code flow is causing problems
with some panels.
[How]
- Change ASSR enabling code while preserving original code flow
as much as possible
- Simplify the code on guarding with internal display flag
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=213779
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1620
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Stylon Wang <stylon.wang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 12066f5a53fc..9fb8c46dc606 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1820,8 +1820,7 @@ bool perform_link_training_with_retries(
*/
panel_mode = DP_PANEL_MODE_DEFAULT;
}
- } else
- panel_mode = DP_PANEL_MODE_DEFAULT;
+ }
}
#endif
@@ -4650,7 +4649,10 @@ enum dp_panel_mode dp_get_panel_mode(struct dc_link *link)
}
}
- if (link->dpcd_caps.panel_mode_edp) {
+ if (link->dpcd_caps.panel_mode_edp &&
+ (link->connector_signal == SIGNAL_TYPE_EDP ||
+ (link->connector_signal == SIGNAL_TYPE_DISPLAY_PORT &&
+ link->is_internal_display))) {
return DP_PANEL_MODE_EDP;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 240246f6b913b0c23733cfd2def1d283f8cc9bbe Mon Sep 17 00:00:00 2001
From: Goldwyn Rodrigues <rgoldwyn(a)suse.de>
Date: Fri, 9 Jul 2021 11:29:22 -0500
Subject: [PATCH] btrfs: mark compressed range uptodate only if all bio succeed
In the compression write endio sequence, the range which the compressed_bio
writes is marked as uptodate if the last bio of the compressed (sub)bios
completed successfully. An earlier bio may have failed, which is
recorded in cb->errors.
Set the writeback range as uptodate only if cb->errors is zero, as opposed
to checking only the last bio's status.
Backporting notes: in all versions up to 4.4 the last argument is always
replaced by "!cb->errors".
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Goldwyn Rodrigues <rgoldwyn(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 9a023ae0f98b..30d82cdf128c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -352,7 +352,7 @@ static void end_compressed_bio_write(struct bio *bio)
btrfs_record_physical_zoned(inode, cb->start, bio);
btrfs_writepage_endio_finish_ordered(BTRFS_I(inode), NULL,
cb->start, cb->start + cb->len - 1,
- bio->bi_status == BLK_STS_OK);
+ !cb->errors);
end_compressed_writeback(inode, cb);
/* note, our inode could be gone now */
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 240246f6b913b0c23733cfd2def1d283f8cc9bbe Mon Sep 17 00:00:00 2001
From: Goldwyn Rodrigues <rgoldwyn(a)suse.de>
Date: Fri, 9 Jul 2021 11:29:22 -0500
Subject: [PATCH] btrfs: mark compressed range uptodate only if all bio succeed
In the compression write endio sequence, the range which the compressed_bio
writes is marked as uptodate if the last bio of the compressed (sub)bios
completed successfully. An earlier bio may have failed, which is
recorded in cb->errors.
Set the writeback range as uptodate only if cb->errors is zero, as opposed
to checking only the last bio's status.
Backporting notes: in all versions up to 4.4 the last argument is always
replaced by "!cb->errors".
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Goldwyn Rodrigues <rgoldwyn(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 9a023ae0f98b..30d82cdf128c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -352,7 +352,7 @@ static void end_compressed_bio_write(struct bio *bio)
btrfs_record_physical_zoned(inode, cb->start, bio);
btrfs_writepage_endio_finish_ordered(BTRFS_I(inode), NULL,
cb->start, cb->start + cb->len - 1,
- bio->bi_status == BLK_STS_OK);
+ !cb->errors);
end_compressed_writeback(inode, cb);
/* note, our inode could be gone now */
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 240246f6b913b0c23733cfd2def1d283f8cc9bbe Mon Sep 17 00:00:00 2001
From: Goldwyn Rodrigues <rgoldwyn(a)suse.de>
Date: Fri, 9 Jul 2021 11:29:22 -0500
Subject: [PATCH] btrfs: mark compressed range uptodate only if all bio succeed
In the compression write endio sequence, the range which the compressed_bio
writes is marked as uptodate if the last bio of the compressed (sub)bios
completed successfully. An earlier bio may have failed, which is
recorded in cb->errors.
Set the writeback range as uptodate only if cb->errors is zero, as opposed
to checking only the last bio's status.
Backporting notes: in all versions up to 4.4 the last argument is always
replaced by "!cb->errors".
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Goldwyn Rodrigues <rgoldwyn(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 9a023ae0f98b..30d82cdf128c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -352,7 +352,7 @@ static void end_compressed_bio_write(struct bio *bio)
btrfs_record_physical_zoned(inode, cb->start, bio);
btrfs_writepage_endio_finish_ordered(BTRFS_I(inode), NULL,
cb->start, cb->start + cb->len - 1,
- bio->bi_status == BLK_STS_OK);
+ !cb->errors);
end_compressed_writeback(inode, cb);
/* note, our inode could be gone now */
Booting a KVM host in protected mode with kmemleak quickly results
in a pretty bad crash, as kmemleak doesn't know that the HYP sections
have been taken away.
Make the unregistration from kmemleak part of marking the sections
as HYP-private. The rest of the HYP-specific data is obtained via
the page allocator, which is not subjected to kmemleak.
Fixes: 90134ac9cabb ("KVM: arm64: Protect the .hyp sections from the host")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: Quentin Perret <qperret(a)google.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: stable(a)vger.kernel.org # 5.13
---
arch/arm64/kvm/arm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9a2b8f27792..23f12e602878 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -15,6 +15,7 @@
#include <linux/fs.h>
#include <linux/mman.h>
#include <linux/sched.h>
+#include <linux/kmemleak.h>
#include <linux/kvm.h>
#include <linux/kvm_irqfd.h>
#include <linux/irqbypass.h>
@@ -1960,8 +1961,12 @@ static inline int pkvm_mark_hyp(phys_addr_t start, phys_addr_t end)
}
#define pkvm_mark_hyp_section(__section) \
+({ \
+ u64 sz = __section##_end - __section##_start; \
+ kmemleak_free_part(__section##_start, sz); \
pkvm_mark_hyp(__pa_symbol(__section##_start), \
- __pa_symbol(__section##_end))
+ __pa_symbol(__section##_end)); \
+})
static int finalize_hyp_mode(void)
{
--
2.30.2
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 476d98018f32e68e7c5d4e8456940cf2b6d66f10 Mon Sep 17 00:00:00 2001
From: John Fastabend <john.fastabend(a)gmail.com>
Date: Tue, 27 Jul 2021 09:04:59 -0700
Subject: [PATCH] bpf, sockmap: On cleanup we additionally need to remove
cached skb
It's possible that, if a socket is closed while the receive thread is under
memory pressure, the thread may have cached an skb. We need to ensure these
skbs are freed along with the normal ingress_skb queue.
Before 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()"), tear
down and backlog processing both held sock_lock for the common case of
socket close or unhash, so the two could not run in parallel and the
kfree is all that is needed in those kernels.
But the latest kernels include commit 799aa7f98d53, and this requires a
bit more work. Without the ingress_lock guarding reads and writes of
state->skb, the tear down could run before the state update, leaking
memory. Worse, when the backlog reads the state it could run interleaved
with the tear down, and we might end up freeing state->skb from the tear
down side while the backlog side already holds a reference. To resolve
such races we wrap accesses in ingress_lock on both sides, serializing the
tear down and backlog cases. In both cases this only happens after an
EAGAIN error, so having an extra lock in place is likely fine. The normal
path will skip the locks.
Note, we check state->skb before grabbing the lock. This works because
we can only enqueue with the mutex we already hold, which avoids a race
on adding state->skb after the check. If the tear down path is running,
that is also fine: if the tear down path then removes state->skb, we
will simply set skb=NULL and the subsequent goto is skipped. This
slight complication avoids locking in the normal case.
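The handoff described above can be sketched in userspace C with a pthread
mutex standing in for ingress_lock. This is only an illustration of the
pattern, not the skmsg code: struct toy_psock and the toy_*() helpers are
hypothetical names, malloc'd buffers stand in for skbs, and unlike the patch
the sketch always takes the lock in toy_take() to stay free of data races.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the psock; only the cached buffer is modeled. */
struct toy_psock {
	pthread_mutex_t ingress_lock;
	bool tx_enabled;	/* cleared by tear down */
	void *cached;		/* stand-in for work_state.skb */
	int dropped;		/* buffers dropped instead of cached */
};

/* Backlog side, after an EAGAIN: cache the buffer only if still enabled. */
static void toy_stash(struct toy_psock *p, void *buf)
{
	assert(buf != NULL);
	pthread_mutex_lock(&p->ingress_lock);
	if (p->tx_enabled) {
		p->cached = buf;
	} else {
		free(buf);	/* tear down already ran, so drop it */
		p->dropped++;
	}
	pthread_mutex_unlock(&p->ingress_lock);
}

/* Backlog side, on entry: take ownership of the cached buffer, if any. */
static void *toy_take(struct toy_psock *p)
{
	void *buf;

	pthread_mutex_lock(&p->ingress_lock);
	buf = p->cached;
	p->cached = NULL;
	pthread_mutex_unlock(&p->ingress_lock);
	return buf;		/* NULL if tear down got there first */
}

/* Tear down side: disable further caching and free what is cached. */
static void toy_teardown(struct toy_psock *p)
{
	pthread_mutex_lock(&p->ingress_lock);
	p->tx_enabled = false;
	free(p->cached);	/* free(NULL) is a no-op */
	p->cached = NULL;
	pthread_mutex_unlock(&p->ingress_lock);
}
```

Whichever side takes the lock first ends up owning the buffer, so it is
freed exactly once and never handed out after being freed.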
With this fix we no longer see this warning splat from tcp side on
socket close when we hit the above case with redirect to ingress self.
[224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
[224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
[224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G I 5.14.0-rc1alu+ #181
[224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220
[224913.935923] Code: 8b 83 20 02 00 00 85 c0 75 20 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 df e8 2b 11 fe ff eb c3 0f 0b e9 7c ff ff ff 0f 0b eb ce <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 0f 1f 44 00 00 41 57 41
[224913.935932] RSP: 0018:ffff88816271fd38 EFLAGS: 00010206
[224913.935941] RAX: 0000000000000ae8 RBX: ffff88815acd5240 RCX: dffffc0000000000
[224913.935948] RDX: 0000000000000003 RSI: 0000000000000ae8 RDI: ffff88815acd5460
[224913.935954] RBP: ffff88815acd5460 R08: ffffffff955c0ae8 R09: fffffbfff2e6f543
[224913.935961] R10: ffffffff9737aa17 R11: fffffbfff2e6f542 R12: ffff88815acd5390
[224913.935967] R13: ffff88815acd5480 R14: ffffffff98d0c080 R15: ffffffff96267500
[224913.935974] FS: 00007f86e6bd1700(0000) GS:ffff888451cc0000(0000) knlGS:0000000000000000
[224913.935981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[224913.935988] CR2: 000000c0008eb000 CR3: 00000001020e0005 CR4: 00000000003706e0
[224913.935994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[224913.936000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[224913.936007] Call Trace:
[224913.936016] inet_csk_destroy_sock+0xba/0x1f0
[224913.936033] __tcp_close+0x620/0x790
[224913.936047] tcp_close+0x20/0x80
[224913.936056] inet_release+0x8f/0xf0
[224913.936070] __sock_release+0x72/0x120
[224913.936083] sock_close+0x14/0x20
Fixes: a136678c0bdbb ("bpf: sk_msg, zap ingress queue on psock down")
Signed-off-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Acked-by: Jakub Sitnicki <jakub(a)cloudflare.com>
Acked-by: Martin KaFai Lau <kafai(a)fb.com>
Link: https://lore.kernel.org/bpf/20210727160500.1713554-3-john.fastabend@gmail.c…
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 28115ef742e8..036cdb33a94a 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -590,23 +590,42 @@ static void sock_drop(struct sock *sk, struct sk_buff *skb)
kfree_skb(skb);
}
+static void sk_psock_skb_state(struct sk_psock *psock,
+ struct sk_psock_work_state *state,
+ struct sk_buff *skb,
+ int len, int off)
+{
+ spin_lock_bh(&psock->ingress_lock);
+ if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
+ state->skb = skb;
+ state->len = len;
+ state->off = off;
+ } else {
+ sock_drop(psock->sk, skb);
+ }
+ spin_unlock_bh(&psock->ingress_lock);
+}
+
static void sk_psock_backlog(struct work_struct *work)
{
struct sk_psock *psock = container_of(work, struct sk_psock, work);
struct sk_psock_work_state *state = &psock->work_state;
- struct sk_buff *skb;
+ struct sk_buff *skb = NULL;
bool ingress;
u32 len, off;
int ret;
mutex_lock(&psock->work_mutex);
- if (state->skb) {
+ if (unlikely(state->skb)) {
+ spin_lock_bh(&psock->ingress_lock);
skb = state->skb;
len = state->len;
off = state->off;
state->skb = NULL;
- goto start;
+ spin_unlock_bh(&psock->ingress_lock);
}
+ if (skb)
+ goto start;
while ((skb = skb_dequeue(&psock->ingress_skb))) {
len = skb->len;
@@ -621,9 +640,8 @@ static void sk_psock_backlog(struct work_struct *work)
len, ingress);
if (ret <= 0) {
if (ret == -EAGAIN) {
- state->skb = skb;
- state->len = len;
- state->off = off;
+ sk_psock_skb_state(psock, state, skb,
+ len, off);
goto end;
}
/* Hard errors break pipe and stop xmit. */
@@ -722,6 +740,11 @@ static void __sk_psock_zap_ingress(struct sk_psock *psock)
skb_bpf_redirect_clear(skb);
sock_drop(psock->sk, skb);
}
+ kfree_skb(psock->work_state.skb);
+ /* We null the skb here to ensure that calls to sk_psock_backlog
+ * do not pick up the free'd skb.
+ */
+ psock->work_state.skb = NULL;
__sk_psock_purge_ingress_msg(psock);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 476d98018f32e68e7c5d4e8456940cf2b6d66f10 Mon Sep 17 00:00:00 2001
From: John Fastabend <john.fastabend(a)gmail.com>
Date: Tue, 27 Jul 2021 09:04:59 -0700
Subject: [PATCH] bpf, sockmap: On cleanup we additionally need to remove
cached skb
If a socket is closed while the receive thread is under memory pressure,
the thread may have cached an skb. We need to ensure these skbs are
freed along with the normal ingress_skb queue.
Before 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()"),
teardown and backlog processing both held sock_lock for the common case
of socket close or unhash, so the two could not run in parallel and the
kfree alone is enough in those kernels.
The latest kernels, however, include commit 799aa7f98d53, and this
requires a bit more work. Without ingress_lock guarding reads and writes
of state->skb, teardown could run before the state update, leaking
memory. Worse, the backlog could read the state while interleaved with
teardown, so that teardown frees state->skb while the backlog side
already holds a reference to it. To resolve such races we wrap the
accesses in ingress_lock on both sides, serializing teardown against the
backlog. In both cases this only happens after an EAGAIN error, so having
an extra lock in place is likely fine. The normal path skips the locks.
Note that we check state->skb before grabbing the lock. This works
because state->skb can only be set while holding the mutex we already
hold, so nothing can add state->skb after the check. If the teardown
path is running concurrently, that is also fine: if it removes
state->skb, we simply end up with skb = NULL and the subsequent goto is
skipped. This slight complication avoids locking in the normal case.
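The handoff described above can be sketched in user space. This is only a simplified model, not the kernel code: struct psock_model, struct buf and the three helpers are hypothetical names, a pthread mutex stands in for ingress_lock, and the teardown helper clears the "TX enabled" flag itself, which the kernel does elsewhere.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb. */
struct buf { int len; };

/* Hypothetical stand-in for the psock state touched by the patch. */
struct psock_model {
	pthread_mutex_t ingress_lock; /* plays the role of psock->ingress_lock */
	bool tx_enabled;              /* plays the role of SK_PSOCK_TX_ENABLED */
	struct buf *cached;           /* plays the role of work_state.skb */
};

static void psock_model_init(struct psock_model *p)
{
	pthread_mutex_init(&p->ingress_lock, NULL);
	p->tx_enabled = true;
	p->cached = NULL;
}

/*
 * Backlog side, after an EAGAIN: cache the buffer for the next run, or
 * drop it if teardown has already started. Returns true if cached.
 */
static bool stash_buf(struct psock_model *p, struct buf *b)
{
	bool cached = false;

	assert(p != NULL && b != NULL);
	pthread_mutex_lock(&p->ingress_lock);
	if (p->tx_enabled) {
		p->cached = b;
		cached = true;
	} else {
		free(b);
	}
	pthread_mutex_unlock(&p->ingress_lock);
	return cached;
}

/*
 * Backlog side, on the next run: peek without the lock, then take the
 * buffer under the lock. Returns NULL if nothing was cached or if
 * teardown removed it first. (Real concurrent user-space code would
 * need an atomic load for the unlocked peek.)
 */
static struct buf *take_buf(struct psock_model *p)
{
	struct buf *b = NULL;

	if (p->cached) {
		pthread_mutex_lock(&p->ingress_lock);
		b = p->cached;
		p->cached = NULL;
		pthread_mutex_unlock(&p->ingress_lock);
	}
	return b;
}

/* Teardown side: stop further caching and free whatever is cached. */
static void zap_cached(struct psock_model *p)
{
	pthread_mutex_lock(&p->ingress_lock);
	p->tx_enabled = false;
	free(p->cached);
	p->cached = NULL;
	pthread_mutex_unlock(&p->ingress_lock);
}
```

In this model the buffer has exactly one owner at any time: whichever side clears the cached pointer under the lock is the side that frees or processes it.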
With this fix we no longer see this warning splat from tcp side on
socket close when we hit the above case with redirect to ingress self.
[224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
[224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
[224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G I 5.14.0-rc1alu+ #181
[224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220
[224913.935923] Code: 8b 83 20 02 00 00 85 c0 75 20 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 df e8 2b 11 fe ff eb c3 0f 0b e9 7c ff ff ff 0f 0b eb ce <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 0f 1f 44 00 00 41 57 41
[224913.935932] RSP: 0018:ffff88816271fd38 EFLAGS: 00010206
[224913.935941] RAX: 0000000000000ae8 RBX: ffff88815acd5240 RCX: dffffc0000000000
[224913.935948] RDX: 0000000000000003 RSI: 0000000000000ae8 RDI: ffff88815acd5460
[224913.935954] RBP: ffff88815acd5460 R08: ffffffff955c0ae8 R09: fffffbfff2e6f543
[224913.935961] R10: ffffffff9737aa17 R11: fffffbfff2e6f542 R12: ffff88815acd5390
[224913.935967] R13: ffff88815acd5480 R14: ffffffff98d0c080 R15: ffffffff96267500
[224913.935974] FS: 00007f86e6bd1700(0000) GS:ffff888451cc0000(0000) knlGS:0000000000000000
[224913.935981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[224913.935988] CR2: 000000c0008eb000 CR3: 00000001020e0005 CR4: 00000000003706e0
[224913.935994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[224913.936000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[224913.936007] Call Trace:
[224913.936016] inet_csk_destroy_sock+0xba/0x1f0
[224913.936033] __tcp_close+0x620/0x790
[224913.936047] tcp_close+0x20/0x80
[224913.936056] inet_release+0x8f/0xf0
[224913.936070] __sock_release+0x72/0x120
[224913.936083] sock_close+0x14/0x20
Fixes: a136678c0bdbb ("bpf: sk_msg, zap ingress queue on psock down")
Signed-off-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Acked-by: Jakub Sitnicki <jakub(a)cloudflare.com>
Acked-by: Martin KaFai Lau <kafai(a)fb.com>
Link: https://lore.kernel.org/bpf/20210727160500.1713554-3-john.fastabend@gmail.c…
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 28115ef742e8..036cdb33a94a 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -590,23 +590,42 @@ static void sock_drop(struct sock *sk, struct sk_buff *skb)
kfree_skb(skb);
}
+static void sk_psock_skb_state(struct sk_psock *psock,
+ struct sk_psock_work_state *state,
+ struct sk_buff *skb,
+ int len, int off)
+{
+ spin_lock_bh(&psock->ingress_lock);
+ if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
+ state->skb = skb;
+ state->len = len;
+ state->off = off;
+ } else {
+ sock_drop(psock->sk, skb);
+ }
+ spin_unlock_bh(&psock->ingress_lock);
+}
+
static void sk_psock_backlog(struct work_struct *work)
{
struct sk_psock *psock = container_of(work, struct sk_psock, work);
struct sk_psock_work_state *state = &psock->work_state;
- struct sk_buff *skb;
+ struct sk_buff *skb = NULL;
bool ingress;
u32 len, off;
int ret;
mutex_lock(&psock->work_mutex);
- if (state->skb) {
+ if (unlikely(state->skb)) {
+ spin_lock_bh(&psock->ingress_lock);
skb = state->skb;
len = state->len;
off = state->off;
state->skb = NULL;
- goto start;
+ spin_unlock_bh(&psock->ingress_lock);
}
+ if (skb)
+ goto start;
while ((skb = skb_dequeue(&psock->ingress_skb))) {
len = skb->len;
@@ -621,9 +640,8 @@ static void sk_psock_backlog(struct work_struct *work)
len, ingress);
if (ret <= 0) {
if (ret == -EAGAIN) {
- state->skb = skb;
- state->len = len;
- state->off = off;
+ sk_psock_skb_state(psock, state, skb,
+ len, off);
goto end;
}
/* Hard errors break pipe and stop xmit. */
@@ -722,6 +740,11 @@ static void __sk_psock_zap_ingress(struct sk_psock *psock)
skb_bpf_redirect_clear(skb);
sock_drop(psock->sk, skb);
}
+ kfree_skb(psock->work_state.skb);
+ /* We null the skb here to ensure that calls to sk_psock_backlog
+ * do not pick up the free'd skb.
+ */
+ psock->work_state.skb = NULL;
__sk_psock_purge_ingress_msg(psock);
}
The performance reporting driver added CPU hotplug support but did not
add a PMU migration call to its CPU offline function.
This causes a problem when the CPU currently designated to collect FME
PMU data goes offline: the current code does not migrate the FME PMU to
the new target CPU, so perf still tries to fetch data from the offline
CPU and no counter data is returned.
Fix this by adding a perf_pmu_migrate_context() call to
fme_perf_offline_cpu().
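A rough user-space model of the offline handling described above may help. All names here (struct fme_perf_model, pick_other_online(), model_offline_cpu()) are hypothetical, the CPU mask is a plain bool array, and the context_cpu field merely stands in for what perf_pmu_migrate_context() does in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for the driver's private data. */
struct fme_perf_model {
	int cpu;         /* CPU designated to read the counters */
	int context_cpu; /* CPU the perf context currently lives on */
};

/* Pick any online CPU other than 'cpu'; return -1 if there is none. */
static int pick_other_online(const bool online[MODEL_NR_CPUS], int cpu)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (i != cpu && online[i])
			return i;
	return -1;
}

/* Model of the offline callback: hand over to a new CPU if needed. */
static int model_offline_cpu(struct fme_perf_model *priv,
			     const bool online[MODEL_NR_CPUS], int cpu)
{
	int target;

	assert(priv != NULL);
	if (cpu != priv->cpu)
		return 0; /* some other CPU went away, nothing to do */

	target = pick_other_online(online, cpu);
	if (target < 0)
		return 0; /* no CPU left to take over */

	priv->cpu = target;
	/* The step the patch adds: move the perf context as well. */
	priv->context_cpu = target;
	return 0;
}
```

Without the last assignment the model ends up with cpu and context_cpu disagreeing, which corresponds to perf still reading from the CPU that went offline.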
Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
Tested-by: Xu Yilun <yilun.xu(a)intel.com>
Acked-by: Wu Hao <hao.wu(a)intel.com>
Signed-off-by: Kajol Jain <kjain(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org
---
drivers/fpga/dfl-fme-perf.c | 2 ++
1 file changed, 2 insertions(+)
---
Changelog:
v2 -> v3:
- Added Acked-by tag
- Removed comment as suggested by Wu Hao
- Link to patch v2: https://lkml.org/lkml/2021/7/9/143
v1 -> v2:
- Add stable(a)vger.kernel.org in cc list
- Link to patch v1: https://lkml.org/lkml/2021/6/28/275
RFC -> PATCH v1
- Remove RFC tag
- Made nit changes to subject and commit message as suggested by Xu Yilun
- Added Tested-by tag
- Link to rfc patch: https://lkml.org/lkml/2021/6/28/112
---
diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
index 4299145ef347..587c82be12f7 100644
--- a/drivers/fpga/dfl-fme-perf.c
+++ b/drivers/fpga/dfl-fme-perf.c
@@ -953,6 +953,8 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
priv->cpu = target;
+ perf_pmu_migrate_context(&priv->pmu, cpu, target);
+
return 0;
}
--
2.31.1
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b946dbcfa4df80ec81b442964e07ad37000cc059 Mon Sep 17 00:00:00 2001
From: Ronnie Sahlberg <lsahlber(a)redhat.com>
Date: Wed, 28 Jul 2021 16:38:29 +1000
Subject: [PATCH] cifs: add missing parsing of backupuid
We lost parsing of backupuid in the switch to new mount API.
Add it back.
Signed-off-by: Ronnie Sahlberg <lsahlber(a)redhat.com>
Reviewed-by: Shyam Prasad N <sprasad(a)microsoft.com>
Cc: <stable(a)vger.kernel.org> # v5.11+
Reported-by: Xiaoli Feng <xifeng(a)redhat.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/cifs/fs_context.c b/fs/cifs/fs_context.c
index 9a59d7ff9a11..eed59bc1d913 100644
--- a/fs/cifs/fs_context.c
+++ b/fs/cifs/fs_context.c
@@ -925,6 +925,13 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
ctx->cred_uid = uid;
ctx->cruid_specified = true;
break;
+ case Opt_backupuid:
+ uid = make_kuid(current_user_ns(), result.uint_32);
+ if (!uid_valid(uid))
+ goto cifs_parse_mount_err;
+ ctx->backupuid = uid;
+ ctx->backupuid_specified = true;
+ break;
case Opt_backupgid:
gid = make_kgid(current_user_ns(), result.uint_32);
if (!gid_valid(gid))
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 110aa25c3ce417a44e35990cf8ed22383277933a Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Mon, 26 Jul 2021 10:42:56 -0600
Subject: [PATCH] io_uring: fix race in unified task_work running
We use a bit to manage if we need to add the shared task_work, but
a list + lock for the pending work. Before aborting a current run
of the task_work we check if the list is empty, but we do so without
grabbing the lock that protects it. This can lead to races where
we think we have nothing left to run, where in practice we could be
racing with a task adding new work to the list. If we do hit that
race condition, we could be left with work items that need processing,
but the shared task_work is not active.
Ensure that we grab the lock before checking if the list is empty,
so we know if it's safe to exit the run or not.
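The rule "decide whether to stop only while holding the lock that protects the list" can be modelled in user space as follows. This is a simplified sketch, not the io_uring code: struct tctx_model and its helpers are hypothetical, a counter replaces the work list, and the "active" flag is also kept under the mutex, whereas the kernel uses a separate atomic bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the per-task work context. */
struct tctx_model {
	pthread_mutex_t task_lock; /* plays the role of tctx->task_lock */
	int pending;               /* number of queued items (the list) */
	bool active;               /* plays the role of bit 0 of task_state */
};

static void tctx_model_init(struct tctx_model *t)
{
	pthread_mutex_init(&t->task_lock, NULL);
	t->pending = 0;
	t->active = false;
}

/* Producer: queue one item. Returns true if a runner must be started. */
static bool queue_work(struct tctx_model *t)
{
	bool must_start;

	assert(t != NULL);
	pthread_mutex_lock(&t->task_lock);
	t->pending++;
	must_start = !t->active;
	t->active = true;
	pthread_mutex_unlock(&t->task_lock);
	return must_start;
}

/* Runner: take everything queued so far. Returns the number taken. */
static int run_pending(struct tctx_model *t)
{
	int n;

	pthread_mutex_lock(&t->task_lock);
	n = t->pending;
	t->pending = 0;
	pthread_mutex_unlock(&t->task_lock);
	return n;
}

/*
 * Runner: decide whether to stop. The emptiness check and the flag
 * update happen under the same lock the producer takes, so an item
 * queued concurrently is either seen here or restarts a runner.
 */
static bool runner_may_exit(struct tctx_model *t)
{
	bool may_exit = true;

	pthread_mutex_lock(&t->task_lock);
	t->active = false;
	if (t->pending > 0) {
		/* Work arrived after the last drain: stay active, loop again. */
		t->active = true;
		may_exit = false;
	}
	pthread_mutex_unlock(&t->task_lock);
	return may_exit;
}
```

If the emptiness check were done without the lock, an item could be queued between the check and the flag update, leaving work pending with no runner scheduled, which is the situation the commit message describes.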
Link: https://lore.kernel.org/io-uring/c6bd5987-e9ae-cd02-49d0-1b3ac1ef65b1@tnonl…
Cc: stable(a)vger.kernel.org # 5.11+
Reported-by: Forza <forza(a)tnonline.net>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index c4d2b320cdd4..a4331deb0427 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
node = next;
}
if (wq_list_empty(&tctx->task_list)) {
+ spin_lock_irq(&tctx->task_lock);
clear_bit(0, &tctx->task_state);
- if (wq_list_empty(&tctx->task_list))
+ if (wq_list_empty(&tctx->task_list)) {
+ spin_unlock_irq(&tctx->task_lock);
break;
+ }
+ spin_unlock_irq(&tctx->task_lock);
/* another tctx_task_work() is enqueued, yield */
if (test_and_set_bit(0, &tctx->task_state))
break;
Hi -stable kernel team,
Commit 216f8e95aacc8e ("PCI: mvebu: Setup BAR0 in order to fix MSI")
fixes ath10k_pci devices on Armada 385 systems, and probably also others
using similar PCI hosts. Can you consider applying it to v5.4.x? The
commit should apply cleanly to current v5.4.
Technically this bug is not a regression. MSI only worked with the
vendor-provided U-Boot, so I'm not sure it is eligible for -stable. But
still, this bug bites users of the latest OpenWrt release, which is based
on kernel v5.4. Oli (on Cc) verified that this commit fixes the issue on
OpenWrt with kernel v5.4.
I have Cc'ed the PCI/mvebu maintainers so they can provide their input.
Thanks,
baruch
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch(a)tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 3a34b13a88caeb2800ab44a4918f230041b37dd9 upstream.
Since commit 1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup
logic") we have sanitized the pipe write logic, and would only try to
wake up readers if they needed it.
In particular, if the pipe already had data in it before the write,
there was no point in trying to wake up a reader, since any existing
readers must have been aware of the pre-existing data already. Doing
extraneous wakeups will only cause potential thundering herd problems.
However, it turns out that some Android libraries have misused the EPOLL
interface, and expected "edge triggered" to be "any new write will
trigger it". Even if there was no edge in sight.
Quoting Sandeep Patil:
"The commit 1b6b26ae7053 ('pipe: fix and clarify pipe write wakeup
logic') changed pipe write logic to wakeup readers only if the pipe
was empty at the time of write. However, there are libraries that
relied upon the older behavior for notification scheme similar to
what's described in [1]
One such library 'realm-core'[2] is used by numerous Android
applications. The library uses a similar notification mechanism as GNU
Make but it never drains the pipe until it is full. When Android moved
to v5.10 kernel, all applications using this library stopped working.
The library has since been fixed[3] but it will be a while before all
applications incorporate the updated library"
Our regression rule for the kernel is that if applications break from
new behavior, it's a regression, even if it was because the application
did something patently wrong. Also note the original report [4] by
Michael Kerrisk about a test for this epoll behavior - but at that point
we didn't know of any actual broken use case.
So add the extraneous wakeup, to approximate the old behavior.
[ I say "approximate", because the exact old behavior was to do a wakeup
not for each write(), but for each pipe buffer chunk that was filled
in. The behavior introduced by this change is not that - this is just
"every write will cause a wakeup, whether necessary or not", which
seems to be sufficient for the broken library use. ]
It's worth noting that this adds the extraneous wakeup only for the
write side, while the read side still considers the "edge" to be purely
about reading enough from the pipe to allow further writes.
See commit f467a6a66419 ("pipe: fix and clarify pipe read wakeup logic")
for the pipe read case, which remains that "only wake up if the pipe was
full, and we read something from it".
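The user-visible behavior discussed above can be observed with a small user-space probe. This is a sketch under the assumption of a Linux system with epoll; count_et_wakeups() is a hypothetical helper name. It registers the read end of a pipe edge-triggered, writes to the pipe without ever reading, and counts how often epoll_wait() reports the read end.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Register the read end of a fresh pipe with EPOLLIN | EPOLLET, do
 * 'nwrites' one-byte writes without ever draining the pipe, and count
 * how many times epoll_wait() reports the read end as ready.
 * Returns the count, or -1 if setup fails.
 */
static int count_et_wakeups(int nwrites)
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
	struct epoll_event out;
	int fds[2], ep, count = 0;

	assert(nwrites >= 0);
	if (pipe(fds) < 0)
		return -1;

	ep = epoll_create1(0);
	if (ep < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	ev.data.fd = fds[0];
	if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) < 0) {
		close(ep);
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	for (int i = 0; i < nwrites; i++) {
		if (write(fds[1], "x", 1) != 1)
			break;
		/* Zero timeout: only report what is ready right now. */
		if (epoll_wait(ep, &out, 1, 0) == 1)
			count++;
	}

	close(ep);
	close(fds[0]);
	close(fds[1]);
	return count;
}
```

The first write into an empty pipe is a genuine edge, so count_et_wakeups(1) should return 1 on any kernel. What count_et_wakeups(2) returns depends on the kernel: with the extraneous wakeup restored by this patch the second write is reported as well, while on kernels with only the stricter wakeup logic it is not.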
Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0… [1]
Link: https://github.com/realm/realm-core [2]
Link: https://github.com/realm/realm-core/issues/4666 [3]
Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwX… [4]
Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/
Reported-by: Sandeep Patil <sspatil(a)android.com>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/pipe.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/pipe.c b/fs/pipe.c
index bfd946a9ad01..9ef4231cce61 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -429,20 +429,20 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
#endif
/*
- * Only wake up if the pipe started out empty, since
- * otherwise there should be no readers waiting.
+ * Epoll nonsensically wants a wakeup whether the pipe
+ * was already empty or not.
*
* If it wasn't empty we try to merge new data into
* the last buffer.
*
* That naturally merges small writes, but it also
- * page-aligs the rest of the writes for large writes
+ * page-aligns the rest of the writes for large writes
* spanning multiple pages.
*/
head = pipe->head;
- was_empty = pipe_empty(head, pipe->tail);
+ was_empty = true;
chars = total_len & (PAGE_SIZE-1);
- if (chars && !was_empty) {
+ if (chars && !pipe_empty(head, pipe->tail)) {
unsigned int mask = pipe->ring_size - 1;
struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask];
int offset = buf->offset + buf->len;
--
2.30.2
The patch titled
Subject: mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
has been removed from the -mm tree. Its filename was
mm-memcg-fix-null-pointer-dereference-in-memcg_slab_free_hook.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Wang Hai <wanghai38(a)huawei.com>
Subject: mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
When I use kfree_rcu() to free a large chunk of memory allocated by
kmalloc_node(), the following dump occurs.
BUG: kernel NULL pointer dereference, address: 0000000000000020
[...]
Oops: 0000 [#1] SMP
[...]
Workqueue: events kfree_rcu_work
RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
[...]
Call Trace:
kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
kfree_bulk include/linux/slab.h:413 [inline]
kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
process_one_work+0x207/0x530 kernel/workqueue.c:2276
worker_thread+0x320/0x610 kernel/workqueue.c:2422
kthread+0x13d/0x160 kernel/kthread.c:313
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
When kmalloc_node() allocates a large chunk of memory, it is backed by a
page rather than a slab. When such memory is freed via kfree_rcu(), it
should not be processed by memcg_slab_free_hook(), because
memcg_slab_free_hook() is meant for slab objects only.
Use page_objcgs_check() instead of page_objcgs() in
memcg_slab_free_hook() to fix this bug.
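The difference between an unchecked and a checked accessor can be illustrated with a tagged-pointer model. This is a user-space sketch only: struct page_model, the flag values and both helper names are assumptions chosen for illustration and are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag layout: low bits of the word tag what it points to. */
#define MODEL_DATA_OBJCGS     ((uintptr_t)1 << 0)
#define MODEL_DATA_FLAGS_MASK ((uintptr_t)3)

/* Hypothetical stand-in for the relevant part of struct page. */
struct page_model {
	uintptr_t memcg_data;
};

/* Unchecked: the caller promises the page carries an object vector. */
static void **objcgs_unchecked(const struct page_model *pg)
{
	return (void **)(pg->memcg_data & ~MODEL_DATA_FLAGS_MASK);
}

/* Checked: return NULL unless the word is tagged as an object vector. */
static void **objcgs_checked(const struct page_model *pg)
{
	assert(pg != NULL);
	if (!pg->memcg_data || !(pg->memcg_data & MODEL_DATA_OBJCGS))
		return NULL;
	return (void **)(pg->memcg_data & ~MODEL_DATA_FLAGS_MASK);
}
```

A caller that cannot tell in advance what kind of page it was handed, as in the free hook here, wants the checked form so that it can skip pages without an object vector.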
Link: https://lkml.kernel.org/r/20210728145655.274476-1-wanghai38@huawei.com
Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slab.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slab.h~mm-memcg-fix-null-pointer-dereference-in-memcg_slab_free_hook
+++ a/mm/slab.h
@@ -346,7 +346,7 @@ static inline void memcg_slab_free_hook(
continue;
page = virt_to_head_page(p[i]);
- objcgs = page_objcgs(page);
+ objcgs = page_objcgs_check(page);
if (!objcgs)
continue;
_
Patches currently in -mm which might be from wanghai38(a)huawei.com are
The patch titled
Subject: mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
has been removed from the -mm tree. Its filename was
mm-memcontrol-fix-blocking-rstat-function-called-from-atomic-cgroup1-thresholding-code.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
Dan Carpenter reports:
The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
29, 2021, leads to the following static checker warning:
kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
warn: sleeping in atomic context
mm/memcontrol.c
3572 static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
3573 {
3574 unsigned long val;
3575
3576 if (mem_cgroup_is_root(memcg)) {
3577 cgroup_rstat_flush(memcg->css.cgroup);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is from static analysis and potentially a false positive. The
problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
which holds an rcu_read_lock(). And the cgroup_rstat_flush() function
can sleep.
3578 val = memcg_page_state(memcg, NR_FILE_PAGES) +
3579 memcg_page_state(memcg, NR_ANON_MAPPED);
3580 if (swap)
3581 val += memcg_page_state(memcg, MEMCG_SWAP);
3582 } else {
3583 if (!swap)
3584 val = page_counter_read(&memcg->memory);
3585 else
3586 val = page_counter_read(&memcg->memsw);
3587 }
3588 return val;
3589 }
__mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
thresholding code is invoked during stat changes, and those contexts have
irqs disabled as well. If the lock breaking occurs inside the flush
function, it will result in a sleep from an atomic context.
Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
Link: https://lkml.kernel.org/r/20210726150019.251820-1-hannes@cmpxchg.org
Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Acked-by: Chris Down <chris(a)chrisdown.name>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c~mm-memcontrol-fix-blocking-rstat-function-called-from-atomic-cgroup1-thresholding-code
+++ a/mm/memcontrol.c
@@ -3574,7 +3574,8 @@ static unsigned long mem_cgroup_usage(st
unsigned long val;
if (mem_cgroup_is_root(memcg)) {
- cgroup_rstat_flush(memcg->css.cgroup);
+ /* mem_cgroup_threshold() calls here from irqsafe context */
+ cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
val = memcg_page_state(memcg, NR_FILE_PAGES) +
memcg_page_state(memcg, NR_ANON_MAPPED);
if (swap)
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-remove-irqsave-restore-locking-from-contexts-with-irqs-enabled.patch
fs-drop_caches-fix-skipping-over-shadow-cache-inodes.patch
fs-inode-count-invalidated-shadow-pages-in-pginodesteal.patch
vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru.patch
The patch titled
Subject: ocfs2: issue zeroout to EOF blocks
has been removed from the -mm tree. Its filename was
ocfs2-issue-zeroout-to-eof-blocks.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: issue zeroout to EOF blocks
When punching holes in EOF blocks, fallocate used a buffered write to
zero the EOF blocks in the last cluster. But since ->writepage ignores
EOF pages, those zeros are never flushed.
This "looks" ok, since commit 6bba4471f0cc ("ocfs2: fix data corruption
by fallocate") zeroes the EOF blocks when extending the file size, but it
isn't. The problem lies in those EOF pages: before writeback, the pages
had the DIRTY flag set, and so did all the buffer_heads in them. When
writeback was run by write_cache_pages(), the DIRTY flag on the page was
cleared, but the DIRTY flag on the buffer_heads was not.
When the next write hit those EOF pages, the buffer_heads already had
the DIRTY flag set, so the page was not marked DIRTY again, and writeback
ignored it forever. That causes data corruption. Even a direct I/O write
cannot help: it fails when trying to drop the page cache before direct
I/O, because it finds that the buffer_heads for those pages still have
the DIRTY flag set, and then falls back to buffered I/O mode.
To summarize the issue: because writeback ignores EOF pages, once an EOF
page is generated, any write to it only reaches the page cache. It is
never flushed to disk, even after the file size is extended and the page
is no longer an EOF page. The fix is to avoid zeroing EOF blocks with a
buffered write.
The following syscall sequence from qemu-img could trigger the corruption.
656 open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
...
660 fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...>
660 fallocate(11, 0, 2275868672, 327680) = 0
658 pwrite64(11, "
Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 99 +++++++++++++++++++++++++++-------------------
1 file changed, 60 insertions(+), 39 deletions(-)
--- a/fs/ocfs2/file.c~ocfs2-issue-zeroout-to-eof-blocks
+++ a/fs/ocfs2/file.c
@@ -1529,6 +1529,45 @@ static void ocfs2_truncate_cluster_pages
}
}
+/*
+ * zero out partial blocks of one cluster.
+ *
+ * start: file offset where zero starts, will be made upper block aligned.
+ * len: it will be trimmed to the end of current cluster if "start + len"
+ * is bigger than it.
+ */
+static int ocfs2_zeroout_partial_cluster(struct inode *inode,
+ u64 start, u64 len)
+{
+ int ret;
+ u64 start_block, end_block, nr_blocks;
+ u64 p_block, offset;
+ u32 cluster, p_cluster, nr_clusters;
+ struct super_block *sb = inode->i_sb;
+ u64 end = ocfs2_align_bytes_to_clusters(sb, start);
+
+ if (start + len < end)
+ end = start + len;
+
+ start_block = ocfs2_blocks_for_bytes(sb, start);
+ end_block = ocfs2_blocks_for_bytes(sb, end);
+ nr_blocks = end_block - start_block;
+ if (!nr_blocks)
+ return 0;
+
+ cluster = ocfs2_bytes_to_clusters(sb, start);
+ ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
+ &nr_clusters, NULL);
+ if (ret)
+ return ret;
+ if (!p_cluster)
+ return 0;
+
+ offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
+ p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
+ return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
+}
+
static int ocfs2_zero_partial_clusters(struct inode *inode,
u64 start, u64 len)
{
@@ -1538,6 +1577,7 @@ static int ocfs2_zero_partial_clusters(s
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
unsigned int csize = osb->s_clustersize;
handle_t *handle;
+ loff_t isize = i_size_read(inode);
/*
* The "start" and "end" values are NOT necessarily part of
@@ -1558,6 +1598,26 @@ static int ocfs2_zero_partial_clusters(s
if ((start & (csize - 1)) == 0 && (end & (csize - 1)) == 0)
goto out;
+ /* No page cache for EOF blocks, issue zero out to disk. */
+ if (end > isize) {
+ /*
+ * zeroout eof blocks in last cluster starting from
+ * "isize" even "start" > "isize" because it is
+ * complicated to zeroout just at "start" as "start"
+ * may be not aligned with block size, buffer write
+ * would be required to do that, but out of eof buffer
+ * write is not supported.
+ */
+ ret = ocfs2_zeroout_partial_cluster(inode, isize,
+ end - isize);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+ if (start >= isize)
+ goto out;
+ end = isize;
+ }
handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
@@ -1856,45 +1916,6 @@ out:
}
/*
- * zero out partial blocks of one cluster.
- *
- * start: file offset where zero starts, will be made upper block aligned.
- * len: it will be trimmed to the end of current cluster if "start + len"
- * is bigger than it.
- */
-static int ocfs2_zeroout_partial_cluster(struct inode *inode,
- u64 start, u64 len)
-{
- int ret;
- u64 start_block, end_block, nr_blocks;
- u64 p_block, offset;
- u32 cluster, p_cluster, nr_clusters;
- struct super_block *sb = inode->i_sb;
- u64 end = ocfs2_align_bytes_to_clusters(sb, start);
-
- if (start + len < end)
- end = start + len;
-
- start_block = ocfs2_blocks_for_bytes(sb, start);
- end_block = ocfs2_blocks_for_bytes(sb, end);
- nr_blocks = end_block - start_block;
- if (!nr_blocks)
- return 0;
-
- cluster = ocfs2_bytes_to_clusters(sb, start);
- ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
- &nr_clusters, NULL);
- if (ret)
- return ret;
- if (!p_cluster)
- return 0;
-
- offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
- p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
- return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
-}
-
-/*
* Parts of this function taken from xfs_change_file_space()
*/
static int __ocfs2_change_file_space(struct file *file, struct inode *inode,
_
Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
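The range arithmetic of ocfs2_zeroout_partial_cluster() in the patch above can be followed with a small stand-alone sketch. The helper names and struct geom are hypothetical, and the rounding (bytes rounded up to blocks, start rounded up to the next cluster boundary) is assumed from the comment in the patch; block and cluster sizes are given as bit shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filesystem geometry, sizes expressed as bit shifts. */
struct geom {
	unsigned int block_bits;   /* e.g. 12 for 4 KiB blocks */
	unsigned int cluster_bits; /* e.g. 20 for 1 MiB clusters */
};

/* Number of blocks needed to cover 'bytes' (rounds up). */
static uint64_t blocks_for_bytes(const struct geom *g, uint64_t bytes)
{
	uint64_t bsize = UINT64_C(1) << g->block_bits;

	return (bytes + bsize - 1) >> g->block_bits;
}

/* Round 'bytes' up to the next cluster boundary. */
static uint64_t align_bytes_to_clusters(const struct geom *g, uint64_t bytes)
{
	uint64_t csize = UINT64_C(1) << g->cluster_bits;

	return (bytes + csize - 1) & ~(csize - 1);
}

/*
 * Compute which blocks would be zeroed for [start, start + len),
 * trimmed to the end of the cluster containing 'start'. Stores the
 * first block in *first_block and returns the number of blocks.
 */
static uint64_t zeroout_range(const struct geom *g, uint64_t start,
			      uint64_t len, uint64_t *first_block)
{
	uint64_t end = align_bytes_to_clusters(g, start);
	uint64_t start_block, end_block;

	assert(g != NULL && first_block != NULL);
	if (start + len < end)
		end = start + len;

	start_block = blocks_for_bytes(g, start);
	end_block = blocks_for_bytes(g, end);
	*first_block = start_block;
	return end_block - start_block;
}
```

With 4 KiB blocks and 1 MiB clusters, start = 5000 and a large len give first block 2 and 254 blocks, that is, up to the end of the first cluster. A start that is already cluster aligned yields zero blocks, because rounding it up to a cluster boundary leaves it unchanged.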
The patch titled
Subject: ocfs2: fix zero out valid data
has been removed from the -mm tree. Its filename was
ocfs2-fix-zero-out-valid-data.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: fix zero out valid data
If the append-dio feature is enabled, a direct I/O write and fallocate
can run in parallel to extend the file size. fallocate used "orig_isize"
to record i_size before taking "ip_alloc_sem", so by the time
ocfs2_zeroout_partial_cluster() zeroes out the EOF blocks, i_size may
already have been extended by ocfs2_dio_end_io_write(), causing valid
data to be zeroed out.
Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com
Fixes: 6bba4471f0cc ("ocfs2: fix data corruption by fallocate")
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ocfs2/file.c~ocfs2-fix-zero-out-valid-data
+++ a/fs/ocfs2/file.c
@@ -1935,7 +1935,6 @@ static int __ocfs2_change_file_space(str
goto out_inode_unlock;
}
- orig_isize = i_size_read(inode);
switch (sr->l_whence) {
case 0: /*SEEK_SET*/
break;
@@ -1943,7 +1942,7 @@ static int __ocfs2_change_file_space(str
sr->l_start += f_pos;
break;
case 2: /*SEEK_END*/
- sr->l_start += orig_isize;
+ sr->l_start += i_size_read(inode);
break;
default:
ret = -EINVAL;
@@ -1998,6 +1997,7 @@ static int __ocfs2_change_file_space(str
ret = -EINVAL;
}
+ orig_isize = i_size_read(inode);
/* zeroout eof blocks in the cluster. */
if (!ret && change_size && orig_isize < size) {
ret = ocfs2_zeroout_partial_cluster(inode, orig_isize,
_
Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
This is the start of the stable review cycle for the 5.10.55 release.
There are 24 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 31 Jul 2021 13:51:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.55-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.55-rc1
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Sudeep Holla <sudeep.holla(a)arm.com>
ARM: dts: versatile: Fix up interrupt controller node names
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_hole
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_data
Hyunchul Lee <hyc.lee(a)gmail.com>
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Sudeep Holla <sudeep.holla(a)arm.com>
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add lock nesting notation to hfs_find_init
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: fix high memory mapping in hfs_bnode_read
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add missing clean-up in hfs_fill_super
Zheyu Ma <zheyuma97(a)gmail.com>
drm/ttm: add a check against null pointer dereference
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: allocate enough headroom in ip6_finish_output2()
Paul E. McKenney <paulmck(a)kernel.org>
rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
Paul E. McKenney <paulmck(a)kernel.org>
rcu-tasks: Don't delete holdouts within trc_inspect_reader()
Xin Long <lucien.xin(a)gmail.com>
sctp: move 198 addresses from unusable to private scope
Eric Dumazet <edumazet(a)google.com>
net: annotate data race around sk_ll_usec
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/garp: fix memleak in garp_request_join()
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/mrp: fix memleak in mrp_request_join()
Paul Gortmaker <paul.gortmaker(a)windriver.com>
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Yang Yingliang <yangyingliang(a)huawei.com>
workqueue: fix UAF in pwq_unbound_release_workfn()
Miklos Szeredi <mszeredi(a)redhat.com>
af_unix: fix garbage collect vs MSG_PEEK
Maxim Levitsky <mlevitsk(a)redhat.com>
KVM: x86: determine if an exception has an error code only when injecting it.
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring: fix link timeout refs
Yonghong Song <yhs(a)fb.com>
tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include
-------------
Diffstat:
Makefile | 4 +--
arch/arm/boot/dts/versatile-ab.dts | 5 ++--
arch/arm/boot/dts/versatile-pb.dts | 2 +-
arch/x86/kvm/x86.c | 13 ++++++---
drivers/firmware/arm_scmi/driver.c | 12 ++++----
drivers/gpu/drm/ttm/ttm_range_manager.c | 3 ++
fs/cifs/smb2ops.c | 4 +--
fs/hfs/bfind.c | 14 ++++++++-
fs/hfs/bnode.c | 25 ++++++++++++----
fs/hfs/btree.h | 7 +++++
fs/hfs/super.c | 10 +++----
fs/internal.h | 1 -
fs/io_uring.c | 1 -
fs/iomap/seek.c | 25 ++++++----------
include/linux/fs_context.h | 1 +
include/net/busy_poll.h | 2 +-
include/net/sctp/constants.h | 4 +--
kernel/cgroup/cgroup-v1.c | 4 +--
kernel/rcu/tasks.h | 6 ++--
kernel/workqueue.c | 20 ++++++++-----
net/802/garp.c | 14 +++++++++
net/802/mrp.c | 14 +++++++++
net/core/sock.c | 2 +-
net/ipv6/ip6_output.c | 28 ++++++++++++++++++
net/sctp/protocol.c | 3 +-
net/unix/af_unix.c | 51 +++++++++++++++++++++++++++++++--
tools/scripts/Makefile.include | 12 ++++++--
27 files changed, 217 insertions(+), 70 deletions(-)
I'm announcing the release of the 5.13.7 kernel.
All users of the 5.13 kernel series must upgrade.
The updated 5.13.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.13.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/arm/boot/dts/versatile-ab.dts | 5 --
arch/arm/boot/dts/versatile-pb.dts | 2
drivers/firmware/arm_scmi/driver.c | 12 +++--
drivers/gpu/drm/ttm/ttm_range_manager.c | 3 +
drivers/nvme/host/pci.c | 66 ++++++++++++++++++++++++++++----
fs/cifs/smb2ops.c | 4 -
fs/hfs/bfind.c | 14 ++++++
fs/hfs/bnode.c | 25 +++++++++---
fs/hfs/btree.h | 7 +++
fs/hfs/super.c | 10 ++--
fs/internal.h | 1
fs/iomap/seek.c | 25 ++++--------
include/linux/fs_context.h | 1
include/net/busy_poll.h | 2
include/net/sctp/constants.h | 4 -
kernel/cgroup/cgroup-v1.c | 4 -
kernel/rcu/tasks.h | 6 --
kernel/workqueue.c | 20 ++++++---
net/802/garp.c | 14 ++++++
net/802/mrp.c | 14 ++++++
net/core/sock.c | 2
net/ipv6/ip6_output.c | 28 +++++++++++++
net/sctp/protocol.c | 3 -
net/unix/af_unix.c | 51 +++++++++++++++++++++++-
25 files changed, 255 insertions(+), 70 deletions(-)
Casey Chen (1):
nvme-pci: fix multiple races in nvme_setup_io_queues
Christoph Hellwig (2):
iomap: remove the length variable in iomap_seek_data
iomap: remove the length variable in iomap_seek_hole
Cristian Marussi (1):
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Desmond Cheong Zhi Xi (3):
hfs: add missing clean-up in hfs_fill_super
hfs: fix high memory mapping in hfs_bnode_read
hfs: add lock nesting notation to hfs_find_init
Eric Dumazet (1):
net: annotate data race around sk_ll_usec
Greg Kroah-Hartman (1):
Linux 5.13.7
Hyunchul Lee (1):
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Miklos Szeredi (1):
af_unix: fix garbage collect vs MSG_PEEK
Paul E. McKenney (2):
rcu-tasks: Don't delete holdouts within trc_inspect_reader()
rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
Paul Gortmaker (1):
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Sudeep Holla (2):
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
ARM: dts: versatile: Fix up interrupt controller node names
Vasily Averin (2):
ipv6: allocate enough headroom in ip6_finish_output2()
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Xin Long (1):
sctp: move 198 addresses from unusable to private scope
Yang Yingliang (3):
workqueue: fix UAF in pwq_unbound_release_workfn()
net/802/mrp: fix memleak in mrp_request_join()
net/802/garp: fix memleak in garp_request_join()
Zheyu Ma (1):
drm/ttm: add a check against null pointer dereference
I'm announcing the release of the 5.10.55 kernel.
All users of the 5.10 kernel series must upgrade.
The updated 5.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.10.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/arm/boot/dts/versatile-ab.dts | 5 +--
arch/arm/boot/dts/versatile-pb.dts | 2 -
arch/x86/kvm/x86.c | 13 +++++---
drivers/firmware/arm_scmi/driver.c | 12 ++++---
drivers/gpu/drm/ttm/ttm_range_manager.c | 3 +
fs/cifs/smb2ops.c | 4 +-
fs/hfs/bfind.c | 14 ++++++++
fs/hfs/bnode.c | 25 ++++++++++++---
fs/hfs/btree.h | 7 ++++
fs/hfs/super.c | 10 +++---
fs/internal.h | 1
fs/io_uring.c | 1
fs/iomap/seek.c | 25 +++++----------
include/linux/fs_context.h | 1
include/net/busy_poll.h | 2 -
include/net/sctp/constants.h | 4 --
kernel/cgroup/cgroup-v1.c | 4 --
kernel/rcu/tasks.h | 6 +--
kernel/workqueue.c | 20 ++++++++----
net/802/garp.c | 14 ++++++++
net/802/mrp.c | 14 ++++++++
net/core/sock.c | 2 -
net/ipv6/ip6_output.c | 28 +++++++++++++++++
net/sctp/protocol.c | 3 +
net/unix/af_unix.c | 51 ++++++++++++++++++++++++++++++--
tools/scripts/Makefile.include | 12 ++++++-
27 files changed, 216 insertions(+), 69 deletions(-)
Christoph Hellwig (2):
iomap: remove the length variable in iomap_seek_data
iomap: remove the length variable in iomap_seek_hole
Cristian Marussi (1):
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Desmond Cheong Zhi Xi (3):
hfs: add missing clean-up in hfs_fill_super
hfs: fix high memory mapping in hfs_bnode_read
hfs: add lock nesting notation to hfs_find_init
Eric Dumazet (1):
net: annotate data race around sk_ll_usec
Greg Kroah-Hartman (1):
Linux 5.10.55
Hyunchul Lee (1):
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Maxim Levitsky (1):
KVM: x86: determine if an exception has an error code only when injecting it.
Miklos Szeredi (1):
af_unix: fix garbage collect vs MSG_PEEK
Paul E. McKenney (2):
rcu-tasks: Don't delete holdouts within trc_inspect_reader()
rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
Paul Gortmaker (1):
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Pavel Begunkov (1):
io_uring: fix link timeout refs
Sudeep Holla (2):
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
ARM: dts: versatile: Fix up interrupt controller node names
Vasily Averin (2):
ipv6: allocate enough headroom in ip6_finish_output2()
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Xin Long (1):
sctp: move 198 addresses from unusable to private scope
Yang Yingliang (3):
workqueue: fix UAF in pwq_unbound_release_workfn()
net/802/mrp: fix memleak in mrp_request_join()
net/802/garp: fix memleak in garp_request_join()
Yonghong Song (1):
tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include
Zheyu Ma (1):
drm/ttm: add a check against null pointer dereference
I'm announcing the release of the 5.4.137 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/arm/boot/dts/versatile-ab.dts | 5 +--
arch/arm/boot/dts/versatile-pb.dts | 2 -
arch/x86/kvm/x86.c | 13 +++++--
drivers/firmware/arm_scmi/driver.c | 12 ++++---
fs/cifs/smb2ops.c | 4 +-
fs/hfs/bfind.c | 14 +++++++-
fs/hfs/bnode.c | 25 ++++++++++++---
fs/hfs/btree.h | 7 ++++
fs/hfs/super.c | 10 +++---
fs/internal.h | 1
fs/iomap/seek.c | 25 +++++----------
include/linux/fs_context.h | 1
include/net/busy_poll.h | 2 -
include/net/sctp/constants.h | 4 --
kernel/cgroup/cgroup-v1.c | 4 --
kernel/workqueue.c | 20 +++++++-----
net/802/garp.c | 14 ++++++++
net/802/mrp.c | 14 ++++++++
net/core/sock.c | 2 -
net/ipv6/ip6_output.c | 28 +++++++++++++++++
net/sctp/protocol.c | 3 +
net/unix/af_unix.c | 51 +++++++++++++++++++++++++++++--
tools/scripts/Makefile.include | 12 ++++++-
tools/testing/selftests/vm/userfaultfd.c | 2 -
25 files changed, 212 insertions(+), 65 deletions(-)
Christoph Hellwig (2):
iomap: remove the length variable in iomap_seek_data
iomap: remove the length variable in iomap_seek_hole
Cristian Marussi (1):
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Desmond Cheong Zhi Xi (3):
hfs: add missing clean-up in hfs_fill_super
hfs: fix high memory mapping in hfs_bnode_read
hfs: add lock nesting notation to hfs_find_init
Eric Dumazet (1):
net: annotate data race around sk_ll_usec
Greg Kroah-Hartman (2):
selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
Linux 5.4.137
Hyunchul Lee (1):
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Maxim Levitsky (1):
KVM: x86: determine if an exception has an error code only when injecting it.
Miklos Szeredi (1):
af_unix: fix garbage collect vs MSG_PEEK
Paul Gortmaker (1):
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Sudeep Holla (2):
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
ARM: dts: versatile: Fix up interrupt controller node names
Vasily Averin (2):
ipv6: allocate enough headroom in ip6_finish_output2()
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Xin Long (1):
sctp: move 198 addresses from unusable to private scope
Yang Yingliang (3):
workqueue: fix UAF in pwq_unbound_release_workfn()
net/802/mrp: fix memleak in mrp_request_join()
net/802/garp: fix memleak in garp_request_join()
Yonghong Song (1):
tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include
I'm announcing the release of the 4.19.200 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/arm/boot/dts/versatile-ab.dts | 5 -
arch/arm/boot/dts/versatile-pb.dts | 2
arch/x86/kvm/x86.c | 13 +-
drivers/firmware/arm_scmi/driver.c | 12 +-
drivers/iio/dac/ds4424.c | 6 -
fs/cifs/smb2ops.c | 4
fs/hfs/bfind.c | 14 ++
fs/hfs/bnode.c | 25 ++++-
fs/hfs/btree.h | 7 +
fs/hfs/super.c | 10 +-
include/net/af_unix.h | 1
include/net/busy_poll.h | 2
include/net/sctp/constants.h | 4
kernel/workqueue.c | 20 ++--
net/802/garp.c | 14 ++
net/802/mrp.c | 14 ++
net/Makefile | 2
net/core/sock.c | 2
net/sctp/protocol.c | 3
net/unix/Kconfig | 5 +
net/unix/Makefile | 2
net/unix/af_unix.c | 102 +++++++++------------
net/unix/garbage.c | 68 --------------
net/unix/scm.c | 148 +++++++++++++++++++++++++++++++
net/unix/scm.h | 10 ++
tools/testing/selftests/vm/userfaultfd.c | 2
27 files changed, 328 insertions(+), 171 deletions(-)
Cristian Marussi (1):
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Desmond Cheong Zhi Xi (3):
hfs: add missing clean-up in hfs_fill_super
hfs: fix high memory mapping in hfs_bnode_read
hfs: add lock nesting notation to hfs_find_init
Eric Dumazet (1):
net: annotate data race around sk_ll_usec
Greg Kroah-Hartman (2):
selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
Linux 4.19.200
Hyunchul Lee (1):
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Jens Axboe (1):
net: split out functions related to registering inflight socket files
Maxim Levitsky (1):
KVM: x86: determine if an exception has an error code only when injecting it.
Miklos Szeredi (1):
af_unix: fix garbage collect vs MSG_PEEK
Ruslan Babayev (1):
iio: dac: ds4422/ds4424 drop of_node check
Sudeep Holla (2):
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
ARM: dts: versatile: Fix up interrupt controller node names
Xin Long (1):
sctp: move 198 addresses from unusable to private scope
Yang Yingliang (3):
workqueue: fix UAF in pwq_unbound_release_workfn()
net/802/mrp: fix memleak in mrp_request_join()
net/802/garp: fix memleak in garp_request_join()
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 44eff40a32e8f5228ae041006352e32638ad2368 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Mon, 26 Jul 2021 14:14:31 +0100
Subject: [PATCH] io_uring: fix io_prep_async_link locking
io_prep_async_link() may be called after arming a linked timeout,
automatically making it unsafe to traverse the linked list. Guard
with completion_lock if there was a linked timeout.
Cc: stable(a)vger.kernel.org # 5.9+
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/93f7c617e2b4f012a2a175b3dab6bc2f27cebc48.16273044…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5a0fd6bcd318..c4d2b320cdd4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1279,8 +1279,17 @@ static void io_prep_async_link(struct io_kiocb *req)
{
struct io_kiocb *cur;
- io_for_each_link(cur, req)
- io_prep_async_work(cur);
+ if (req->flags & REQ_F_LINK_TIMEOUT) {
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock_irq(&ctx->completion_lock);
+ io_for_each_link(cur, req)
+ io_prep_async_work(cur);
+ spin_unlock_irq(&ctx->completion_lock);
+ } else {
+ io_for_each_link(cur, req)
+ io_prep_async_work(cur);
+ }
}
static void io_queue_async_work(struct io_kiocb *req)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5ab189cf3abbc9994bae3be524c5b88589ed56e2 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Tue, 27 Jul 2021 14:38:09 -1000
Subject: [PATCH] blk-iocost: fix operation ordering in iocg_wake_fn()
iocg_wake_fn() open-codes wait_queue_entry removal and wakeup because it
wants the wq_entry to be always removed whether it ended up waking the
task or not. finish_wait() tests whether wq_entry needs removal without
grabbing the wait_queue lock and expects the waker to use
list_del_init_careful() after all waking operations are complete, which
iocg_wake_fn() didn't do. The operation order was wrong and the regular
list_del_init() was used.
The result is that if a waiter wakes up racing the waker, it can pop the
wq_entry off the stack while the waker is still looking at it, which can
lead to a backtrace like the following.
[7312084.588951] general protection fault, probably for non-canonical address 0x586bf4005b2b88: 0000 [#1] SMP
...
[7312084.647079] RIP: 0010:queued_spin_lock_slowpath+0x171/0x1b0
...
[7312084.858314] Call Trace:
[7312084.863548] _raw_spin_lock_irqsave+0x22/0x30
[7312084.872605] try_to_wake_up+0x4c/0x4f0
[7312084.880444] iocg_wake_fn+0x71/0x80
[7312084.887763] __wake_up_common+0x71/0x140
[7312084.895951] iocg_kick_waitq+0xe8/0x2b0
[7312084.903964] ioc_rqos_throttle+0x275/0x650
[7312084.922423] __rq_qos_throttle+0x20/0x30
[7312084.930608] blk_mq_make_request+0x120/0x650
[7312084.939490] generic_make_request+0xca/0x310
[7312084.957600] submit_bio+0x173/0x200
[7312084.981806] swap_readpage+0x15c/0x240
[7312084.989646] read_swap_cache_async+0x58/0x60
[7312084.998527] swap_cluster_readahead+0x201/0x320
[7312085.023432] swapin_readahead+0x2df/0x450
[7312085.040672] do_swap_page+0x52f/0x820
[7312085.058259] handle_mm_fault+0xa16/0x1420
[7312085.066620] do_page_fault+0x2c6/0x5c0
[7312085.074459] page_fault+0x2f/0x40
Fix it by switching to list_del_init_careful() and putting it at the end.
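The ordering requirement can be sketched in userspace C11 as follows. This
is a minimal model, not the kernel code: struct model_wait,
model_wait_init(), model_wake() and model_waiter_may_return() are
hypothetical names, and the release/acquire pair only stands in for
list_del_init_careful() on the waker side and the lockless check done by
finish_wait() on the waiter side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the on-stack wait entry; not a kernel type. */
struct model_wait {
	atomic_bool on_list;	/* models "wq_entry is still queued" */
	bool committed;		/* models wait->committed */
	int wakeups;		/* counts modelled wakeup calls */
};

static void model_wait_init(struct model_wait *w)
{
	atomic_init(&w->on_list, true);
	w->committed = false;
	w->wakeups = 0;
}

/*
 * Waker side, in the fixed order: every access to the entry happens
 * before the final release store that marks it as removed.
 */
static void model_wake(struct model_wait *w)
{
	w->committed = true;	/* 1. publish the result */
	w->wakeups++;		/* 2. stands in for default_wake_function() */
	/* 3. last touch of the entry, modelled as a release store */
	atomic_store_explicit(&w->on_list, false, memory_order_release);
}

/*
 * Waiter side: lockless test of whether the entry was removed. Once this
 * returns true the waiter may leave the function that owns the entry, so
 * the waker must not touch the entry after its release store.
 */
static bool model_waiter_may_return(struct model_wait *w)
{
	return !atomic_load_explicit(&w->on_list, memory_order_acquire);
}
```

In this model, moving the release store ahead of steps 1 and 2 would
reproduce the bug: the waiter could see the entry as removed and reuse the
stack slot while the waker still writes to it.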
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Rik van Riel <riel(a)surriel.com>
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Cc: stable(a)vger.kernel.org # v5.4+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index c2d6bc88d3f1..5fac3757e6e0 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -1440,16 +1440,17 @@ static int iocg_wake_fn(struct wait_queue_entry *wq_entry, unsigned mode,
return -1;
iocg_commit_bio(ctx->iocg, wait->bio, wait->abs_cost, cost);
+ wait->committed = true;
/*
* autoremove_wake_function() removes the wait entry only when it
- * actually changed the task state. We want the wait always
- * removed. Remove explicitly and use default_wake_function().
+ * actually changed the task state. We want the wait always removed.
+ * Remove explicitly and use default_wake_function(). Note that the
+ * order of operations is important as finish_wait() tests whether
+ * @wq_entry is removed without grabbing the lock.
*/
- list_del_init(&wq_entry->entry);
- wait->committed = true;
-
default_wake_function(wq_entry, mode, flags, key);
+ list_del_init_careful(&wq_entry->entry);
return 0;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ecc64fab7d49c678e70bd4c35fe64d2ab3e3d212 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 27 Jul 2021 11:24:43 +0100
Subject: [PATCH] btrfs: fix lost inode on log replay after mix of fsync,
rename and inode eviction
When checking if we need to log the new name of a renamed inode, we are
checking if the inode and its parent inode have been logged before, and if
not we don't log the new name. The check however is buggy, as it directly
compares the logged_trans field of the inodes versus the ID of the current
transaction. The problem is that logged_trans is a transient field, only
stored in memory and never persisted in the inode item, so if an inode
was logged before, evicted and reloaded, its logged_trans field is set to
a value of 0, meaning the check will return false and the new name of the
renamed inode is not logged. If the old parent directory was previously
fsynced and we deleted the logged directory entries corresponding to the
old name, we end up with a log that when replayed will delete the renamed
inode.
The following example triggers the problem:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ mkdir /mnt/A
$ mkdir /mnt/B
$ echo -n "hello world" > /mnt/A/foo
$ sync
# Add some new file to A and fsync directory A.
$ touch /mnt/A/bar
$ xfs_io -c "fsync" /mnt/A
# Now trigger inode eviction. We are only interested in triggering
# eviction for the inode of directory A.
$ echo 2 > /proc/sys/vm/drop_caches
# Move foo from directory A to directory B.
# This deletes the directory entries for foo in A from the log, and
# does not add the new name for foo in directory B to the log, because
# logged_trans of A is 0, which is less than the current transaction ID.
$ mv /mnt/A/foo /mnt/B/foo
# Now make an fsync to anything except A, B or any file inside them,
# like for example create a file at the root directory and fsync this
# new file. This syncs the log that contains all the changes done by
# previous rename operation.
$ touch /mnt/baz
$ xfs_io -c "fsync" /mnt/baz
<power fail>
# Mount the filesystem and replay the log.
$ mount /dev/sdc /mnt
# Check the filesystem content.
$ ls -1R /mnt
/mnt/:
A
B
baz
/mnt/A:
bar
/mnt/B:
$
# File foo is gone, it's neither in A/ nor in B/.
Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
safely checks if an inode was logged before in the current transaction.
A test case for fstests will follow soon.
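The difference between the two checks can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: struct
model_inode and all model_*/old_*/new_* functions are hypothetical names,
and the conservative rule in model_inode_logged() (treat a zero
logged_trans as "possibly logged" when the inode was last modified in the
current transaction) is an assumption about what a safe helper has to do,
not a copy of inode_logged().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two in-memory fields involved. */
struct model_inode {
	uint64_t logged_trans;	/* transient: 0 again after a reload */
	uint64_t last_trans;	/* last transaction that modified the inode */
};

/* Eviction followed by a reload loses the transient field. */
static void model_evict_and_reload(struct model_inode *inode)
{
	inode->logged_trans = 0;
}

/* Assumed conservative check: when in doubt, report "logged". */
static bool model_inode_logged(const struct model_inode *inode,
			       uint64_t transid)
{
	if (inode->logged_trans == transid)
		return true;
	if (inode->logged_trans == 0 && inode->last_trans == transid)
		return true;
	return false;
}

/* Old condition from the patch: direct comparison of logged_trans. */
static bool old_skips_logging(const struct model_inode *inode,
			      const struct model_inode *old_dir,
			      uint64_t transid)
{
	return inode->logged_trans < transid &&
	       (!old_dir || old_dir->logged_trans < transid);
}

/* New condition from the patch, using the modelled helper. */
static bool new_skips_logging(const struct model_inode *inode,
			      const struct model_inode *old_dir,
			      uint64_t transid)
{
	return !model_inode_logged(inode, transid) &&
	       (!old_dir || !model_inode_logged(old_dir, transid));
}
```

With directory A logged in transaction 7 and then evicted, the old
condition decides to skip logging the new name, while the modelled new
condition still logs it.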
CC: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9fd0348be7f5..e6430ac9bbe8 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6503,8 +6503,8 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
* if this inode hasn't been logged and directory we're renaming it
* from hasn't been logged, we don't need to log it
*/
- if (inode->logged_trans < trans->transid &&
- (!old_dir || old_dir->logged_trans < trans->transid))
+ if (!inode_logged(trans, inode) &&
+ (!old_dir || !inode_logged(trans, old_dir)))
return;
/*
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ecc64fab7d49c678e70bd4c35fe64d2ab3e3d212 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 27 Jul 2021 11:24:43 +0100
Subject: [PATCH] btrfs: fix lost inode on log replay after mix of fsync,
rename and inode eviction
When checking if we need to log the new name of a renamed inode, we are
checking if the inode and its parent inode have been logged before, and if
not we don't log the new name. The check however is buggy, as it directly
compares the logged_trans field of the inodes versus the ID of the current
transaction. The problem is that logged_trans is a transient field, only
stored in memory and never persisted in the inode item, so if an inode
was logged before, evicted and reloaded, its logged_trans field is set to
a value of 0, meaning the check will return false and the new name of the
renamed inode is not logged. If the old parent directory was previously
fsynced and we deleted the logged directory entries corresponding to the
old name, we end up with a log that when replayed will delete the renamed
inode.
The following example triggers the problem:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ mkdir /mnt/A
$ mkdir /mnt/B
$ echo -n "hello world" > /mnt/A/foo
$ sync
# Add some new file to A and fsync directory A.
$ touch /mnt/A/bar
$ xfs_io -c "fsync" /mnt/A
# Now trigger inode eviction. We are only interested in triggering
# eviction for the inode of directory A.
$ echo 2 > /proc/sys/vm/drop_caches
# Move foo from directory A to directory B.
# This deletes the directory entries for foo in A from the log, and
# does not add the new name for foo in directory B to the log, because
# logged_trans of A is 0, which is less than the current transaction ID.
$ mv /mnt/A/foo /mnt/B/foo
# Now make an fsync to anything except A, B or any file inside them,
# like for example create a file at the root directory and fsync this
# new file. This syncs the log that contains all the changes done by
# previous rename operation.
$ touch /mnt/baz
$ xfs_io -c "fsync" /mnt/baz
<power fail>
# Mount the filesystem and replay the log.
$ mount /dev/sdc /mnt
# Check the filesystem content.
$ ls -1R /mnt
/mnt/:
A
B
baz
/mnt/A:
bar
/mnt/B:
$
# File foo is gone, it's neither in A/ nor in B/.
Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
safely checks if an inode was logged before in the current transaction.
A test case for fstests will follow soon.
CC: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9fd0348be7f5..e6430ac9bbe8 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6503,8 +6503,8 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
* if this inode hasn't been logged and directory we're renaming it
* from hasn't been logged, we don't need to log it
*/
- if (inode->logged_trans < trans->transid &&
- (!old_dir || old_dir->logged_trans < trans->transid))
+ if (!inode_logged(trans, inode) &&
+ (!old_dir || !inode_logged(trans, old_dir)))
return;
/*
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ecc64fab7d49c678e70bd4c35fe64d2ab3e3d212 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 27 Jul 2021 11:24:43 +0100
Subject: [PATCH] btrfs: fix lost inode on log replay after mix of fsync,
rename and inode eviction
When checking if we need to log the new name of a renamed inode, we are
checking if the inode and its parent inode have been logged before, and if
not we don't log the new name. The check however is buggy, as it directly
compares the logged_trans field of the inodes versus the ID of the current
transaction. The problem is that logged_trans is a transient field, only
stored in memory and never persisted in the inode item, so if an inode
was logged before, evicted and reloaded, its logged_trans field is set to
a value of 0, meaning the check will return false and the new name of the
renamed inode is not logged. If the old parent directory was previously
fsynced and we deleted the logged directory entries corresponding to the
old name, we end up with a log that when replayed will delete the renamed
inode.
The following example triggers the problem:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ mkdir /mnt/A
$ mkdir /mnt/B
$ echo -n "hello world" > /mnt/A/foo
$ sync
# Add some new file to A and fsync directory A.
$ touch /mnt/A/bar
$ xfs_io -c "fsync" /mnt/A
# Now trigger inode eviction. We are only interested in triggering
# eviction for the inode of directory A.
$ echo 2 > /proc/sys/vm/drop_caches
# Move foo from directory A to directory B.
# This deletes the directory entries for foo in A from the log, and
# does not add the new name for foo in directory B to the log, because
# logged_trans of A is 0, which is less than the current transaction ID.
$ mv /mnt/A/foo /mnt/B/foo
# Now make an fsync to anything except A, B or any file inside them,
# like for example create a file at the root directory and fsync this
# new file. This syncs the log that contains all the changes done by
# previous rename operation.
$ touch /mnt/baz
$ xfs_io -c "fsync" /mnt/baz
<power fail>
# Mount the filesystem and replay the log.
$ mount /dev/sdc /mnt
# Check the filesystem content.
$ ls -1R /mnt
/mnt/:
A
B
baz
/mnt/A:
bar
/mnt/B:
$
# File foo is gone, it's neither in A/ nor in B/.
Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
safely checks if an inode was logged before in the current transaction.
A test case for fstests will follow soon.
CC: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9fd0348be7f5..e6430ac9bbe8 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6503,8 +6503,8 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
* if this inode hasn't been logged and directory we're renaming it
* from hasn't been logged, we don't need to log it
*/
- if (inode->logged_trans < trans->transid &&
- (!old_dir || old_dir->logged_trans < trans->transid))
+ if (!inode_logged(trans, inode) &&
+ (!old_dir || !inode_logged(trans, old_dir)))
return;
/*
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ecc64fab7d49c678e70bd4c35fe64d2ab3e3d212 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 27 Jul 2021 11:24:43 +0100
Subject: [PATCH] btrfs: fix lost inode on log replay after mix of fsync,
rename and inode eviction
When checking if we need to log the new name of a renamed inode, we are
checking if the inode and its parent inode have been logged before, and if
not we don't log the new name. The check however is buggy, as it directly
compares the logged_trans field of the inodes versus the ID of the current
transaction. The problem is that logged_trans is a transient field, only
stored in memory and never persisted in the inode item, so if an inode
was logged before, evicted and reloaded, its logged_trans field is set to
a value of 0, meaning the check will return false and the new name of the
renamed inode is not logged. If the old parent directory was previously
fsynced and we deleted the logged directory entries corresponding to the
old name, we end up with a log that when replayed will delete the renamed
inode.
The following example triggers the problem:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ mkdir /mnt/A
$ mkdir /mnt/B
$ echo -n "hello world" > /mnt/A/foo
$ sync
# Add some new file to A and fsync directory A.
$ touch /mnt/A/bar
$ xfs_io -c "fsync" /mnt/A
# Now trigger inode eviction. We are only interested in triggering
# eviction for the inode of directory A.
$ echo 2 > /proc/sys/vm/drop_caches
# Move foo from directory A to directory B.
# This deletes the directory entries for foo in A from the log, and
# does not add the new name for foo in directory B to the log, because
# logged_trans of A is 0, which is less than the current transaction ID.
$ mv /mnt/A/foo /mnt/B/foo
# Now make an fsync to anything except A, B or any file inside them,
# like for example create a file at the root directory and fsync this
# new file. This syncs the log that contains all the changes done by
# previous rename operation.
$ touch /mnt/baz
$ xfs_io -c "fsync" /mnt/baz
<power fail>
# Mount the filesystem and replay the log.
$ mount /dev/sdc /mnt
# Check the filesystem content.
$ ls -1R /mnt
/mnt/:
A
B
baz
/mnt/A:
bar
/mnt/B:
$
# File foo is gone, it's neither in A/ nor in B/.
Fix this by using the inode_logged() helper at btrfs_log_new_name(), which
safely checks if an inode was logged before in the current transaction.
A test case for fstests will follow soon.
CC: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9fd0348be7f5..e6430ac9bbe8 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6503,8 +6503,8 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
* if this inode hasn't been logged and directory we're renaming it
* from hasn't been logged, we don't need to log it
*/
- if (inode->logged_trans < trans->transid &&
- (!old_dir || old_dir->logged_trans < trans->transid))
+ if (!inode_logged(trans, inode) &&
+ (!old_dir || !inode_logged(trans, old_dir)))
return;
/*
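The failure mode described in the commit message above can be modelled outside the kernel. What follows is a simplified user-space sketch, not the btrfs code: struct model_inode, evict_and_reload(), old_check_logged() and model_inode_logged() are hypothetical names. The rule that a zeroed logged_trans counts as "may have been logged" when the inode was last modified in the current transaction (tracked here through a last_trans field) is an assumption about how a conservative check could behave, not something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the in-memory inode state discussed above. */
struct model_inode {
	uint64_t logged_trans; /* transient: reads as 0 after a reload */
	uint64_t last_trans;   /* assumed: transaction that last modified the inode */
};

/* Eviction followed by a reload loses the transient field. */
static void evict_and_reload(struct model_inode *inode)
{
	inode->logged_trans = 0;
}

/* Pre-patch style check: direct comparison against the transaction ID. */
static bool old_check_logged(const struct model_inode *inode, uint64_t transid)
{
	return !(inode->logged_trans < transid);
}

/*
 * Conservative check (assumption): a zeroed field on an inode modified in
 * the current transaction is treated as "may have been logged".
 */
static bool model_inode_logged(const struct model_inode *inode, uint64_t transid)
{
	if (inode->logged_trans == transid)
		return true;
	return inode->logged_trans == 0 && inode->last_trans == transid;
}

/* Usage: directory A is logged in transaction 7, then evicted and reloaded. */
static void logged_trans_demo(void)
{
	struct model_inode dir_a = { .logged_trans = 7, .last_trans = 7 };

	evict_and_reload(&dir_a);
	assert(!old_check_logged(&dir_a, 7));  /* old check forgets the logging */
	assert(model_inode_logged(&dir_a, 7)); /* conservative check does not */
}
```

In this model the direct comparison loses the fact that A was logged as soon as the inode is reloaded, which is the condition under which the new name is skipped in the reproducer above.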
Dear list,
The current 5.13 stable kernel branch oopses when handling ext2
filesystems, and the filesystem is not usable, sometimes leading
to a panic.
The bug was introduced during the 5.13 development cycle.
A complete analysis can be found here:
https://lore.kernel.org/linux-ext4/20210713165821.8a268e2c1db4fd5cf452acd2@…
A fix for this bug has been recently merged into the mainline
kernel, with commit id 728d392f8a799f037812d0f2b254fb3b5e115fcf.
The 5.13 branch is the only one affected by this bug.
Best regards,
Javier
This is the start of the stable review cycle for the 5.13.7 release.
There are 22 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 31 Jul 2021 13:51:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.7-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.13.7-rc1
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Sudeep Holla <sudeep.holla(a)arm.com>
ARM: dts: versatile: Fix up interrupt controller node names
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_hole
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_data
Hyunchul Lee <hyc.lee(a)gmail.com>
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Sudeep Holla <sudeep.holla(a)arm.com>
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add lock nesting notation to hfs_find_init
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: fix high memory mapping in hfs_bnode_read
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add missing clean-up in hfs_fill_super
Zheyu Ma <zheyuma97(a)gmail.com>
drm/ttm: add a check against null pointer dereference
Casey Chen <cachen(a)purestorage.com>
nvme-pci: fix multiple races in nvme_setup_io_queues
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: allocate enough headroom in ip6_finish_output2()
Paul E. McKenney <paulmck(a)kernel.org>
rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader()
Paul E. McKenney <paulmck(a)kernel.org>
rcu-tasks: Don't delete holdouts within trc_inspect_reader()
Xin Long <lucien.xin(a)gmail.com>
sctp: move 198 addresses from unusable to private scope
Eric Dumazet <edumazet(a)google.com>
net: annotate data race around sk_ll_usec
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/garp: fix memleak in garp_request_join()
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/mrp: fix memleak in mrp_request_join()
Paul Gortmaker <paul.gortmaker(a)windriver.com>
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Yang Yingliang <yangyingliang(a)huawei.com>
workqueue: fix UAF in pwq_unbound_release_workfn()
Miklos Szeredi <mszeredi(a)redhat.com>
af_unix: fix garbage collect vs MSG_PEEK
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/versatile-ab.dts | 5 +--
arch/arm/boot/dts/versatile-pb.dts | 2 +-
drivers/firmware/arm_scmi/driver.c | 12 +++---
drivers/gpu/drm/ttm/ttm_range_manager.c | 3 ++
drivers/nvme/host/pci.c | 66 +++++++++++++++++++++++++++++----
fs/cifs/smb2ops.c | 4 +-
fs/hfs/bfind.c | 14 ++++++-
fs/hfs/bnode.c | 25 ++++++++++---
fs/hfs/btree.h | 7 ++++
fs/hfs/super.c | 10 ++---
fs/internal.h | 1 -
fs/iomap/seek.c | 25 +++++--------
include/linux/fs_context.h | 1 +
include/net/busy_poll.h | 2 +-
include/net/sctp/constants.h | 4 +-
kernel/cgroup/cgroup-v1.c | 4 +-
kernel/rcu/tasks.h | 6 +--
kernel/workqueue.c | 20 ++++++----
net/802/garp.c | 14 +++++++
net/802/mrp.c | 14 +++++++
net/core/sock.c | 2 +-
net/ipv6/ip6_output.c | 28 ++++++++++++++
net/sctp/protocol.c | 3 +-
net/unix/af_unix.c | 51 ++++++++++++++++++++++++-
25 files changed, 256 insertions(+), 71 deletions(-)
This is the start of the stable review cycle for the 5.4.137 release.
There are 21 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 31 Jul 2021 13:51:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.137-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.137-rc1
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: ip6_finish_output2: set sk into newly allocated nskb
Sudeep Holla <sudeep.holla(a)arm.com>
ARM: dts: versatile: Fix up interrupt controller node names
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_hole
Christoph Hellwig <hch(a)lst.de>
iomap: remove the length variable in iomap_seek_data
Hyunchul Lee <hyc.lee(a)gmail.com>
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Sudeep Holla <sudeep.holla(a)arm.com>
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add lock nesting notation to hfs_find_init
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: fix high memory mapping in hfs_bnode_read
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add missing clean-up in hfs_fill_super
Vasily Averin <vvs(a)virtuozzo.com>
ipv6: allocate enough headroom in ip6_finish_output2()
Xin Long <lucien.xin(a)gmail.com>
sctp: move 198 addresses from unusable to private scope
Eric Dumazet <edumazet(a)google.com>
net: annotate data race around sk_ll_usec
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/garp: fix memleak in garp_request_join()
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/mrp: fix memleak in mrp_request_join()
Paul Gortmaker <paul.gortmaker(a)windriver.com>
cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Yang Yingliang <yangyingliang(a)huawei.com>
workqueue: fix UAF in pwq_unbound_release_workfn()
Miklos Szeredi <mszeredi(a)redhat.com>
af_unix: fix garbage collect vs MSG_PEEK
Maxim Levitsky <mlevitsk(a)redhat.com>
KVM: x86: determine if an exception has an error code only when injecting it.
Yonghong Song <yhs(a)fb.com>
tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
-------------
Diffstat:
Makefile | 4 +--
arch/arm/boot/dts/versatile-ab.dts | 5 ++--
arch/arm/boot/dts/versatile-pb.dts | 2 +-
arch/x86/kvm/x86.c | 13 +++++---
drivers/firmware/arm_scmi/driver.c | 12 ++++----
fs/cifs/smb2ops.c | 4 +--
fs/hfs/bfind.c | 14 ++++++++-
fs/hfs/bnode.c | 25 ++++++++++++----
fs/hfs/btree.h | 7 +++++
fs/hfs/super.c | 10 +++----
fs/internal.h | 1 -
fs/iomap/seek.c | 25 ++++++----------
include/linux/fs_context.h | 1 +
include/net/busy_poll.h | 2 +-
include/net/sctp/constants.h | 4 +--
kernel/cgroup/cgroup-v1.c | 4 +--
kernel/workqueue.c | 20 ++++++++-----
net/802/garp.c | 14 +++++++++
net/802/mrp.c | 14 +++++++++
net/core/sock.c | 2 +-
net/ipv6/ip6_output.c | 28 ++++++++++++++++++
net/sctp/protocol.c | 3 +-
net/unix/af_unix.c | 51 ++++++++++++++++++++++++++++++--
tools/scripts/Makefile.include | 12 ++++++--
tools/testing/selftests/vm/userfaultfd.c | 2 +-
25 files changed, 213 insertions(+), 66 deletions(-)
This is the start of the stable review cycle for the 4.19.200 release.
There are 17 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 31 Jul 2021 13:51:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.200-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.200-rc1
Sudeep Holla <sudeep.holla(a)arm.com>
ARM: dts: versatile: Fix up interrupt controller node names
Hyunchul Lee <hyc.lee(a)gmail.com>
cifs: fix the out of range assignment to bit fields in parse_server_interfaces
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Fix range check for the maximum number of pending messages
Sudeep Holla <sudeep.holla(a)arm.com>
firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add lock nesting notation to hfs_find_init
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: fix high memory mapping in hfs_bnode_read
Desmond Cheong Zhi Xi <desmondcheongzx(a)gmail.com>
hfs: add missing clean-up in hfs_fill_super
Xin Long <lucien.xin(a)gmail.com>
sctp: move 198 addresses from unusable to private scope
Eric Dumazet <edumazet(a)google.com>
net: annotate data race around sk_ll_usec
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/garp: fix memleak in garp_request_join()
Yang Yingliang <yangyingliang(a)huawei.com>
net/802/mrp: fix memleak in mrp_request_join()
Yang Yingliang <yangyingliang(a)huawei.com>
workqueue: fix UAF in pwq_unbound_release_workfn()
Miklos Szeredi <mszeredi(a)redhat.com>
af_unix: fix garbage collect vs MSG_PEEK
Jens Axboe <axboe(a)kernel.dk>
net: split out functions related to registering inflight socket files
Maxim Levitsky <mlevitsk(a)redhat.com>
KVM: x86: determine if an exception has an error code only when injecting it.
Ruslan Babayev <ruslan(a)babayev.com>
iio: dac: ds4422/ds4424 drop of_node check
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/versatile-ab.dts | 5 +-
arch/arm/boot/dts/versatile-pb.dts | 2 +-
arch/x86/kvm/x86.c | 13 ++-
drivers/firmware/arm_scmi/driver.c | 12 +--
drivers/iio/dac/ds4424.c | 6 --
fs/cifs/smb2ops.c | 4 +-
fs/hfs/bfind.c | 14 ++-
fs/hfs/bnode.c | 25 ++++--
fs/hfs/btree.h | 7 ++
fs/hfs/super.c | 10 +--
include/net/af_unix.h | 1 +
include/net/busy_poll.h | 2 +-
include/net/sctp/constants.h | 4 +-
kernel/workqueue.c | 20 +++--
net/802/garp.c | 14 +++
net/802/mrp.c | 14 +++
net/Makefile | 2 +-
net/core/sock.c | 2 +-
net/sctp/protocol.c | 3 +-
net/unix/Kconfig | 5 ++
net/unix/Makefile | 2 +
net/unix/af_unix.c | 102 ++++++++++-----------
net/unix/garbage.c | 68 +-------------
net/unix/scm.c | 148 +++++++++++++++++++++++++++++++
net/unix/scm.h | 10 +++
tools/testing/selftests/vm/userfaultfd.c | 2 +-
27 files changed, 329 insertions(+), 172 deletions(-)
The physical address may exceed 32 bits on ARM (when ARM_LPAE is enabled).
Use PFN_PHYS() in devmem_is_allowed(); otherwise the physical address may
overflow and be truncated.
This bug was introduced in v2.6.37, and the function was moved to lib in
v5.11.
Fixes: 087aaffcdf9c ("ARM: implement CONFIG_STRICT_DEVMEM by disabling access to RAM via /dev/mem")
Fixes: 527701eda5f1 ("lib: Add a generic version of devmem_is_allowed()")
Cc: stable(a)vger.kernel.org # v2.6.37
Signed-off-by: Liang Wang <wangliang101(a)huawei.com>
---
v2: update subject and changelog
lib/devmem_is_allowed.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/devmem_is_allowed.c b/lib/devmem_is_allowed.c
index c0d67c541849..60be9e24bd57 100644
--- a/lib/devmem_is_allowed.c
+++ b/lib/devmem_is_allowed.c
@@ -19,7 +19,7 @@
*/
int devmem_is_allowed(unsigned long pfn)
{
- if (iomem_is_exclusive(pfn << PAGE_SHIFT))
+ if (iomem_is_exclusive(PFN_PHYS(pfn)))
return 0;
if (!page_is_ram(pfn))
return 1;
--
2.32.0
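The truncation fixed by the patch above can be reproduced in user space. This is a minimal sketch under stated assumptions: uint32_t stands in for a 32-bit unsigned long, MODEL_PAGE_SHIFT assumes 4 KiB pages, and the function names are hypothetical. The second helper mirrors what PFN_PHYS() does, which is to widen to the physical address type before shifting.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Models "pfn << PAGE_SHIFT" evaluated in a 32-bit unsigned long. */
static uint64_t shift_in_32_bits(uint32_t pfn)
{
	return (uint32_t)(pfn << MODEL_PAGE_SHIFT);
}

/* Models PFN_PHYS(): widen to a 64-bit address type, then shift. */
static uint64_t shift_in_64_bits(uint32_t pfn)
{
	return (uint64_t)pfn << MODEL_PAGE_SHIFT;
}

/* Usage: the first page frame above 4 GiB. */
static void pfn_demo(void)
{
	uint32_t pfn = 0x100000;

	assert(shift_in_32_bits(pfn) == 0);              /* high bits lost */
	assert(shift_in_64_bits(pfn) == 0x100000000ULL); /* address preserved */
}
```

Below 4 GiB both helpers agree, which is why the problem only shows up once physical addresses can exceed 32 bits.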
On Fri, Jul 30, 2021 at 09:38:52AM +0100, Alan Young wrote:
> This commit is not applicable before the 64-bit time_t in user space with
> 32-bit compatibility changes introduces by
> 80fe7430c7085951d1246d83f638cc17e6c0be36 in 5.6.
That is odd, as that is not what you wrote in the patch itself:
> Fixes: 9027c4639ef1 ("ALSA: pcm: Call ack() whenever appl_ptr is updated")
So is the Fixes: tag here incorrect?
thanks,
greg k-h
From: Pavel Skripkin <paskripkin(a)gmail.com>
In usb_8dev_start() MAX_RX_URBS coherent buffers are allocated and
nothing frees them:
1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
is not set (see usb_8dev_start) and this flag cannot be used with
coherent buffers.
So, all allocated buffers should be freed with usb_free_coherent()
explicitly.
Side note: This code looks like a copy-paste of other CAN drivers. The
same patch was applied to the mcba_usb driver and it works well with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices")
Link: https://lore.kernel.org/r/d39b458cd425a1cf7f512f340224e6e9563b07bd.16274044…
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/usb_8dev.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/usb/usb_8dev.c b/drivers/net/can/usb/usb_8dev.c
index b6e7ef0d5bc6..d1b83bd1b3cb 100644
--- a/drivers/net/can/usb/usb_8dev.c
+++ b/drivers/net/can/usb/usb_8dev.c
@@ -137,7 +137,8 @@ struct usb_8dev_priv {
u8 *cmd_msg_buffer;
struct mutex usb_8dev_cmd_lock;
-
+ void *rxbuf[MAX_RX_URBS];
+ dma_addr_t rxbuf_dma[MAX_RX_URBS];
};
/* tx frame */
@@ -733,6 +734,7 @@ static int usb_8dev_start(struct usb_8dev_priv *priv)
for (i = 0; i < MAX_RX_URBS; i++) {
struct urb *urb = NULL;
u8 *buf;
+ dma_addr_t buf_dma;
/* create a URB, and a buffer for it */
urb = usb_alloc_urb(0, GFP_KERNEL);
@@ -742,7 +744,7 @@ static int usb_8dev_start(struct usb_8dev_priv *priv)
}
buf = usb_alloc_coherent(priv->udev, RX_BUFFER_SIZE, GFP_KERNEL,
- &urb->transfer_dma);
+ &buf_dma);
if (!buf) {
netdev_err(netdev, "No memory left for USB buffer\n");
usb_free_urb(urb);
@@ -750,6 +752,8 @@ static int usb_8dev_start(struct usb_8dev_priv *priv)
break;
}
+ urb->transfer_dma = buf_dma;
+
usb_fill_bulk_urb(urb, priv->udev,
usb_rcvbulkpipe(priv->udev,
USB_8DEV_ENDP_DATA_RX),
@@ -767,6 +771,9 @@ static int usb_8dev_start(struct usb_8dev_priv *priv)
break;
}
+ priv->rxbuf[i] = buf;
+ priv->rxbuf_dma[i] = buf_dma;
+
/* Drop reference, USB core will take care of freeing it */
usb_free_urb(urb);
}
@@ -836,6 +843,10 @@ static void unlink_all_urbs(struct usb_8dev_priv *priv)
usb_kill_anchored_urbs(&priv->rx_submitted);
+ for (i = 0; i < MAX_RX_URBS; ++i)
+ usb_free_coherent(priv->udev, RX_BUFFER_SIZE,
+ priv->rxbuf[i], priv->rxbuf_dma[i]);
+
usb_kill_anchored_urbs(&priv->tx_submitted);
atomic_set(&priv->active_tx_urbs, 0);
--
2.30.2
The device release number for HX-type devices is configurable in
EEPROM/OTPROM and cannot be used reliably for type detection.
Assume all (non-H) devices with bcdUSB 1.1 and unknown bcdDevice to be
of HX type while adding a bcdDevice check for HXD and TB (1.1 and 2.0,
respectively).
Reported-by: Chris <chris(a)cyber-anlage.de>
Fixes: 8a7bf7510d1f ("USB: serial: pl2303: amend and tighten type detection")
Cc: stable(a)vger.kernel.org # 5.13
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/pl2303.c | 41 ++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 16 deletions(-)
diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
index 2f2f5047452b..17601e32083e 100644
--- a/drivers/usb/serial/pl2303.c
+++ b/drivers/usb/serial/pl2303.c
@@ -418,24 +418,33 @@ static int pl2303_detect_type(struct usb_serial *serial)
bcdDevice = le16_to_cpu(desc->bcdDevice);
bcdUSB = le16_to_cpu(desc->bcdUSB);
- switch (bcdDevice) {
- case 0x100:
- /*
- * Assume it's an HXN-type if the device doesn't support the old read
- * request value.
- */
- if (bcdUSB == 0x200 && !pl2303_supports_hx_status(serial))
- return TYPE_HXN;
+ switch (bcdUSB) {
+ case 0x110:
+ switch (bcdDevice) {
+ case 0x300:
+ return TYPE_HX;
+ case 0x400:
+ return TYPE_HXD;
+ default:
+ return TYPE_HX;
+ }
break;
- case 0x300:
- if (bcdUSB == 0x200)
+ case 0x200:
+ switch (bcdDevice) {
+ case 0x100:
+ /*
+ * Assume it's an HXN-type if the device doesn't
+ * support the old read request value.
+ */
+ if (!pl2303_supports_hx_status(serial))
+ return TYPE_HXN;
+ break;
+ case 0x300:
return TYPE_TA;
-
- return TYPE_HX;
- case 0x400:
- return TYPE_HXD;
- case 0x500:
- return TYPE_TB;
+ case 0x500:
+ return TYPE_TB;
+ }
+ break;
}
dev_err(&serial->interface->dev,
--
2.31.1
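For reference, the post-patch switch in the hunk above can be restated as a standalone decision function. This is only a sketch of that one switch: enum model_type and model_detect() are hypothetical names, the supports_hx_status flag stands in for the result of pl2303_supports_hx_status(), and any detection steps outside the hunk are not modelled. MODEL_UNKNOWN represents falling through to the dev_err() path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_type {
	MODEL_UNKNOWN, MODEL_HX, MODEL_HXD, MODEL_TA, MODEL_TB, MODEL_HXN
};

static enum model_type model_detect(uint16_t bcd_usb, uint16_t bcd_device,
				    bool supports_hx_status)
{
	switch (bcd_usb) {
	case 0x110:
		/* USB 1.1: only 0x400 is singled out, all else is HX. */
		return bcd_device == 0x400 ? MODEL_HXD : MODEL_HX;
	case 0x200:
		switch (bcd_device) {
		case 0x100:
			/* HXN if the old status read is unsupported. */
			if (!supports_hx_status)
				return MODEL_HXN;
			break;
		case 0x300:
			return MODEL_TA;
		case 0x500:
			return MODEL_TB;
		}
		break;
	}
	return MODEL_UNKNOWN;
}

/* Usage: an HX device whose release number was reprogrammed in EEPROM. */
static void detect_demo(void)
{
	assert(model_detect(0x110, 0x1234, true) == MODEL_HX);
}
```

Keying on bcdUSB first means an arbitrary bcdDevice value on a USB 1.1 device still resolves to HX rather than to an error.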
This is a note to let you know that I've just added the patch titled
serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 341abd693d10e5f337a51f140ae3e7a1ae0febf6 Mon Sep 17 00:00:00 2001
From: Mario Kleiner <mario.kleiner.de(a)gmail.com>
Date: Thu, 29 Jul 2021 06:33:06 +0200
Subject: serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
This attempts to fix a bug found with a serial port card which uses
an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
currently supported. I don't possess such a card, and I'm not
experienced with the serial subsystem, so this patch is based on what
I think is a likely reason for the failure, found by walking the
user who actually owns the card through some diagnostics.
The user who reported the problem finds the following in his dmesg
output for the relevant ttyS4 and ttyS5:
[ 0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
[ 0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
[ 0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
[ 0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
...
[ 6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
[ 6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
...
Output of setserial -a:
/dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
Baud_base: 115200, close_delay: 50, divisor: 0
closing_wait: 3000
Flags: spd_normal skip_test
This suggests to me that the serial driver wants to register and share
MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
not want to share the irq: flags 0x00000080 (== IRQF_SHARED) from the
serial port driver mean the irq is to be shared. Does this mismatch end
in a failed irq init?
With this setup, data reception works very unreliably, with dropped data,
already at a transmission rate of only one 16-byte chunk every 1/120th of
a second, i.e. 1920 bytes/sec, presumably due to rx fifo overflow caused
by mishandled or unused rx irqs?
See full discussion thread with attempted diagnosis at:
https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-record…
Disabling the use of MSI interrupts for the serial port pci card did
fix the reliability problems. The user executed the following sequence
of commands to achieve this:
echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
This resulted in the following log output:
[ 82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
[ 87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
[ 98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
[ 103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
This patch attempts to fix the problem by disabling irq sharing when
using MSI irqs. Note that all I know for sure is that disabling MSI
irqs fixed the problem for the user, so this patch could be wrong and
is untested. Please review with caution, keeping this in mind.
Fixes: 8428413b1d14 ("serial: 8250_pci: Implement MSI(-X) support")
Cc: Ralf Ramsauer <ralf.ramsauer(a)oth-regensburg.de>
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de(a)gmail.com>
Link: https://lore.kernel.org/r/20210729043306.18528-1-mario.kleiner.de@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 02985cf90ef2..a808c283883e 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -4002,6 +4002,7 @@ pciserial_init_ports(struct pci_dev *dev, const struct pciserial_board *board)
if (pci_match_id(pci_use_msi, dev)) {
dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
pci_set_master(dev);
+ uart.port.flags &= ~UPF_SHARE_IRQ;
rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
} else {
dev_dbg(&dev->dev, "Using legacy interrupts\n");
--
2.32.0
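The "Flags mismatch" lines quoted in the message above are consistent with a rule that an interrupt line can only be shared when every requester asks for sharing. The sketch below models just that one rule in user space. MODEL_IRQF_SHARED takes its value from the message (0x00000080 == IRQF_SHARED); model_can_share() is a hypothetical name, and any further conditions the real genirq code checks are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_IRQF_SHARED 0x00000080u /* value quoted in the message above */

/* Simplified: sharing works only if both requesters set the shared flag. */
static bool model_can_share(unsigned int existing_flags, unsigned int new_flags)
{
	return ((existing_flags & new_flags) & MODEL_IRQF_SHARED) != 0;
}

/* Usage: the flags reported in the log, 00000080 (ttyS5) vs. 00000000 (xhci_hcd). */
static void irq_flags_demo(void)
{
	assert(!model_can_share(0x00000000u, 0x00000080u));
}
```

Under this model the request with 0x00000080 against an existing 0x00000000 handler is refused, which matches the diagnosis that motivated clearing UPF_SHARE_IRQ for MSI(-X).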
This is an automatically generated email to let you know that the following patch was queued:
Subject: media: rtl28xxu: fix zero-length control request
Author: Johan Hovold <johan(a)kernel.org>
Date: Wed Jun 23 10:45:21 2021 +0200
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Control transfers without a data stage are treated as OUT requests by
the USB stack and should be using usb_sndctrlpipe(). Failing to do so
will now trigger a warning.
The driver uses a zero-length i2c-read request for type detection so
update the control-request code to use usb_sndctrlpipe() in this case.
Note that actually trying to read the i2c register in question does not
work as the register might not exist (e.g. depending on the demodulator)
as reported by Eero Lehtinen <debiangamer2(a)gmail.com>.
Reported-by: syzbot+faf11bbadc5a372564da(a)syzkaller.appspotmail.com
Reported-by: Eero Lehtinen <debiangamer2(a)gmail.com>
Tested-by: Eero Lehtinen <debiangamer2(a)gmail.com>
Fixes: d0f232e823af ("[media] rtl28xxu: add heuristic to detect chip type")
Cc: stable(a)vger.kernel.org # 4.0
Cc: Antti Palosaari <crope(a)iki.fi>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Sean Young <sean(a)mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/usb/dvb-usb-v2/rtl28xxu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
---
diff --git a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
index 0cbdb95f8d35..795a012d4020 100644
--- a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
+++ b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
@@ -37,7 +37,16 @@ static int rtl28xxu_ctrl_msg(struct dvb_usb_device *d, struct rtl28xxu_req *req)
} else {
/* read */
requesttype = (USB_TYPE_VENDOR | USB_DIR_IN);
- pipe = usb_rcvctrlpipe(d->udev, 0);
+
+ /*
+ * Zero-length transfers must use usb_sndctrlpipe() and
+ * rtl28xxu_identify_state() uses a zero-length i2c read
+ * command to determine the chip type.
+ */
+ if (req->size)
+ pipe = usb_rcvctrlpipe(d->udev, 0);
+ else
+ pipe = usb_sndctrlpipe(d->udev, 0);
}
ret = usb_control_msg(d->udev, pipe, 0, requesttype, req->value,
Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
introduces a ~10% performance drop when using virtio-net drivers.
This commit has been backported to v4.19 in commit 669c0b5782fb and this
performance drop is also visible there.
Here at Tessares, we can also notice this drop with the MPTCP fork [1]
on top of the v4.19 kernel.
Eric Dumazet already fixed this issue a few months ago, see
commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head").
Unfortunately, this patch has not been backported to versions before
v5.4 because it caused issues [2]: once backported, the kernel failed
to compile because one commit was missing, see
commit 503d539a6e41 ("virtio_net: Add XDP meta data support"). That
missing commit was later added in 4.19.186, but, probably because
there were still some open discussions [3] around
commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head"),
the latter has not been backported to v4.19 at all.
A cherry-pick of this patch without any modification is proposed here.
It has been validated: it fixes the original issue on v4.19 as well.
Please note that there is also a fix for the fix, see
commit 38ec4944b593 ("gro: ensure frag0 meets IP header alignment").
This second fix has also not been backported because it caused issues as
well [4]. Here, the cause was a conflict and also a compilation error
once the conflict had been resolved. Please refer to patch 2/2 for more
details.
One last note: these two patches have also been backported and validated
on a v4.14 release. A second series is going to be sent.
It looks like it could be interesting to backport these two patches to
v4.9 and v4.4 as well but unfortunately, the backport of these two
patches fails with conflicts and I don't have any setup to validate the
performance drop and fix with v4.9 and v4.4 kernels.
[1] https://github.com/multipath-tcp/mptcp
[2] https://lore.kernel.org/stable/161806389686151@kroah.com/
[3] https://lore.kernel.org/stable/20210412051204-mutt-send-email-mst@kernel.or…
[4] https://lore.kernel.org/stable/1618749018155126@kroah.com/
Eric Dumazet (2):
virtio_net: Do not pull payload in skb->head
gro: ensure frag0 meets IP header alignment
drivers/net/virtio_net.c | 10 +++++++---
include/linux/skbuff.h | 9 +++++++++
include/linux/virtio_net.h | 14 +++++++++-----
net/core/dev.c | 3 ++-
4 files changed, 27 insertions(+), 9 deletions(-)
--
2.31.1
Bus bandwidth array access is based on esit, so increasing the boundary
by one causes an out-of-bounds access; for example, when esit is
XHCI_MTK_MAX_ESIT, the access oversteps the array boundary.
Fixes: 7c986fbc16ae ("usb: xhci-mtk: get the microframe boundary for ESIT")
Cc: <stable(a)vger.kernel.org>
Reported-by: Stan Lu <stan.lu(a)mediatek.com>
Signed-off-by: Chunfeng Yun <chunfeng.yun(a)mediatek.com>
---
drivers/usb/host/xhci-mtk-sch.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
index cffcaf4dfa9f..0bb1a6295d64 100644
--- a/drivers/usb/host/xhci-mtk-sch.c
+++ b/drivers/usb/host/xhci-mtk-sch.c
@@ -575,10 +575,12 @@ static u32 get_esit_boundary(struct mu3h_sch_ep_info *sch_ep)
u32 boundary = sch_ep->esit;
if (sch_ep->sch_tt) { /* LS/FS with TT */
- /* tune for CS */
- if (sch_ep->ep_type != ISOC_OUT_EP)
- boundary++;
- else if (boundary > 1) /* normally esit >= 8 for FS/LS */
+ /*
+ * tune for CS, normally esit >= 8 for FS/LS,
+ * not add one for other types to avoid access array
+ * out of boundary
+ */
+ if (sch_ep->ep_type == ISOC_OUT_EP && boundary > 1)
boundary--;
}
--
2.18.0
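The effect of the hunk above on the returned boundary can be checked with a small user-space model. This is a sketch only: boundary_before() and boundary_after() restate the two versions of the tuning logic with plain booleans in place of sch_ep->sch_tt and the ep_type comparison, and MODEL_MAX_ESIT is a hypothetical table size chosen for illustration, not a value taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ESIT 64u /* hypothetical table size, for illustration only */

/* Pre-patch tuning: non-ISOC-OUT endpoints behind a TT get boundary + 1. */
static uint32_t boundary_before(uint32_t esit, bool has_tt, bool isoc_out)
{
	uint32_t boundary = esit;

	if (has_tt) {
		if (!isoc_out)
			boundary++;
		else if (boundary > 1)
			boundary--;
	}
	return boundary;
}

/* Post-patch tuning: the boundary is never raised above esit. */
static uint32_t boundary_after(uint32_t esit, bool has_tt, bool isoc_out)
{
	uint32_t boundary = esit;

	if (has_tt && isoc_out && boundary > 1)
		boundary--;
	return boundary;
}

/* Usage: an endpoint at the maximum esit behind a TT, not ISOC OUT. */
static void boundary_demo(void)
{
	assert(boundary_before(MODEL_MAX_ESIT, true, false) > MODEL_MAX_ESIT);
	assert(boundary_after(MODEL_MAX_ESIT, true, false) <= MODEL_MAX_ESIT);
}
```

With the old logic the maximum esit yields an index one past a table of that size; the new logic keeps the result within it.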
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2e2832562c877e6530b8480982d99a4ff90c6777 Mon Sep 17 00:00:00 2001
From: Alan Young <consult.awy(a)gmail.com>
Date: Fri, 9 Jul 2021 09:48:54 +0100
Subject: [PATCH] ALSA: pcm: Call substream ack() method upon compat mmap
commit
If a 32-bit application is being used with a 64-bit kernel and is using
the mmap mechanism to write data, then the SNDRV_PCM_IOCTL_SYNC_PTR
ioctl results in calling snd_pcm_ioctl_sync_ptr_compat(). Make this use
pcm_lib_apply_appl_ptr() so that the substream's ack() method, if
defined, is called.
The snd_pcm_sync_ptr() function, used in the 64-bit ioctl case, already
uses pcm_lib_apply_appl_ptr().
Fixes: 9027c4639ef1 ("ALSA: pcm: Call ack() whenever appl_ptr is updated")
Signed-off-by: Alan Young <consult.awy(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/c441f18c-eb2a-3bdd-299a-696ccca2de9c@gmail.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 14e32825c339..c88c4316c417 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3063,9 +3063,14 @@ static int snd_pcm_ioctl_sync_ptr_compat(struct snd_pcm_substream *substream,
boundary = 0x7fffffff;
snd_pcm_stream_lock_irq(substream);
/* FIXME: we should consider the boundary for the sync from app */
- if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL))
- control->appl_ptr = scontrol.appl_ptr;
- else
+ if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL)) {
+ err = pcm_lib_apply_appl_ptr(substream,
+ scontrol.appl_ptr);
+ if (err < 0) {
+ snd_pcm_stream_unlock_irq(substream);
+ return err;
+ }
+ } else
scontrol.appl_ptr = control->appl_ptr % boundary;
if (!(sflags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
control->avail_min = scontrol.avail_min;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2e2832562c877e6530b8480982d99a4ff90c6777 Mon Sep 17 00:00:00 2001
From: Alan Young <consult.awy(a)gmail.com>
Date: Fri, 9 Jul 2021 09:48:54 +0100
Subject: [PATCH] ALSA: pcm: Call substream ack() method upon compat mmap
commit
If a 32-bit application is being used with a 64-bit kernel and is using
the mmap mechanism to write data, then the SNDRV_PCM_IOCTL_SYNC_PTR
ioctl results in calling snd_pcm_ioctl_sync_ptr_compat(). Make this use
pcm_lib_apply_appl_ptr() so that the substream's ack() method, if
defined, is called.
The snd_pcm_sync_ptr() function, used in the 64-bit ioctl case, already
uses pcm_lib_apply_appl_ptr().
Fixes: 9027c4639ef1 ("ALSA: pcm: Call ack() whenever appl_ptr is updated")
Signed-off-by: Alan Young <consult.awy(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/c441f18c-eb2a-3bdd-299a-696ccca2de9c@gmail.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 14e32825c339..c88c4316c417 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3063,9 +3063,14 @@ static int snd_pcm_ioctl_sync_ptr_compat(struct snd_pcm_substream *substream,
boundary = 0x7fffffff;
snd_pcm_stream_lock_irq(substream);
/* FIXME: we should consider the boundary for the sync from app */
- if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL))
- control->appl_ptr = scontrol.appl_ptr;
- else
+ if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL)) {
+ err = pcm_lib_apply_appl_ptr(substream,
+ scontrol.appl_ptr);
+ if (err < 0) {
+ snd_pcm_stream_unlock_irq(substream);
+ return err;
+ }
+ } else
scontrol.appl_ptr = control->appl_ptr % boundary;
if (!(sflags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
control->avail_min = scontrol.avail_min;
From: Pavel Skripkin <paskripkin(a)gmail.com>
In esd_usb2_setup_rx_urbs(), MAX_RX_URBS coherent buffers are allocated
and nothing frees them:
1) In the callback function the URB is resubmitted, and that is all.
2) In the disconnect function the URBs are simply killed, but
URB_FREE_BUFFER is not set (see esd_usb2_setup_rx_urbs()), and this flag
cannot be used with coherent buffers.
So all allocated buffers should be freed explicitly with
usb_free_coherent().
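The bookkeeping this calls for can be sketched in plain userspace C. Here
malloc() and free() stand in for usb_alloc_coherent() and
usb_free_coherent(), and the model_* names and sizes are hypothetical; the
point is only that each buffer is remembered at setup so that teardown can
release it.

```c
#include <stdlib.h>

#define MODEL_MAX_RX_URBS    4   /* hypothetical count for this sketch */
#define MODEL_RX_BUFFER_SIZE 64  /* hypothetical size for this sketch */

struct model_dev {
	void *rxbuf[MODEL_MAX_RX_URBS];  /* one remembered buffer per RX URB */
};

/* Allocate the RX buffers and record each one; returns how many succeeded. */
static int model_setup_rx(struct model_dev *dev)
{
	int i;

	for (i = 0; i < MODEL_MAX_RX_URBS; i++) {
		/* stand-in for usb_alloc_coherent() */
		void *buf = malloc(MODEL_RX_BUFFER_SIZE);

		if (!buf)
			break;
		dev->rxbuf[i] = buf;
	}
	return i;
}

/* Free every recorded buffer; returns how many were released. */
static int model_teardown_rx(struct model_dev *dev)
{
	int i, freed = 0;

	for (i = 0; i < MODEL_MAX_RX_URBS; i++) {
		if (!dev->rxbuf[i])
			continue;
		free(dev->rxbuf[i]);     /* stand-in for usb_free_coherent() */
		dev->rxbuf[i] = NULL;    /* makes a second teardown harmless */
		freed++;
	}
	return freed;
}
```

Without the rxbuf[] array there would be no pointer left to pass to the
free routine once the URBs are killed, which is the leak described above.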
Side note: this code looks like a copy-paste of other CAN drivers. The
same patch was applied to the mcba_usb driver, and it works well with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
Link: https://lore.kernel.org/r/b31b096926dcb35998ad0271aac4b51770ca7cc8.16274044…
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/esd_usb2.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/usb/esd_usb2.c b/drivers/net/can/usb/esd_usb2.c
index 65b58f8fc328..66fa8b07c2e6 100644
--- a/drivers/net/can/usb/esd_usb2.c
+++ b/drivers/net/can/usb/esd_usb2.c
@@ -195,6 +195,8 @@ struct esd_usb2 {
int net_count;
u32 version;
int rxinitdone;
+ void *rxbuf[MAX_RX_URBS];
+ dma_addr_t rxbuf_dma[MAX_RX_URBS];
};
struct esd_usb2_net_priv {
@@ -545,6 +547,7 @@ static int esd_usb2_setup_rx_urbs(struct esd_usb2 *dev)
for (i = 0; i < MAX_RX_URBS; i++) {
struct urb *urb = NULL;
u8 *buf = NULL;
+ dma_addr_t buf_dma;
/* create a URB, and a buffer for it */
urb = usb_alloc_urb(0, GFP_KERNEL);
@@ -554,7 +557,7 @@ static int esd_usb2_setup_rx_urbs(struct esd_usb2 *dev)
}
buf = usb_alloc_coherent(dev->udev, RX_BUFFER_SIZE, GFP_KERNEL,
- &urb->transfer_dma);
+ &buf_dma);
if (!buf) {
dev_warn(dev->udev->dev.parent,
"No memory left for USB buffer\n");
@@ -562,6 +565,8 @@ static int esd_usb2_setup_rx_urbs(struct esd_usb2 *dev)
goto freeurb;
}
+ urb->transfer_dma = buf_dma;
+
usb_fill_bulk_urb(urb, dev->udev,
usb_rcvbulkpipe(dev->udev, 1),
buf, RX_BUFFER_SIZE,
@@ -574,8 +579,12 @@ static int esd_usb2_setup_rx_urbs(struct esd_usb2 *dev)
usb_unanchor_urb(urb);
usb_free_coherent(dev->udev, RX_BUFFER_SIZE, buf,
urb->transfer_dma);
+ goto freeurb;
}
+ dev->rxbuf[i] = buf;
+ dev->rxbuf_dma[i] = buf_dma;
+
freeurb:
/* Drop reference, USB core will take care of freeing it */
usb_free_urb(urb);
@@ -663,6 +672,11 @@ static void unlink_all_urbs(struct esd_usb2 *dev)
int i, j;
usb_kill_anchored_urbs(&dev->rx_submitted);
+
+ for (i = 0; i < MAX_RX_URBS; ++i)
+ usb_free_coherent(dev->udev, RX_BUFFER_SIZE,
+ dev->rxbuf[i], dev->rxbuf_dma[i]);
+
for (i = 0; i < dev->net_count; i++) {
priv = dev->nets[i];
if (priv) {
--
2.30.2
From: Pavel Skripkin <paskripkin(a)gmail.com>
In ems_usb_start(), MAX_RX_URBS coherent buffers are allocated and
nothing frees them:
1) In the callback function the URB is resubmitted, and that is all.
2) In the disconnect function the URBs are simply killed, but
URB_FREE_BUFFER is not set (see ems_usb_start()), and this flag cannot be
used with coherent buffers.
So all allocated buffers should be freed explicitly with
usb_free_coherent().
Side note: this code looks like a copy-paste of other CAN drivers. The
same patch was applied to the mcba_usb driver, and it works well with real
hardware. There is no change in functionality, only clean-up code for
coherent buffers.
Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Link: https://lore.kernel.org/r/59aa9fbc9a8cbf9af2bbd2f61a659c480b415800.16274044…
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/ems_usb.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/usb/ems_usb.c b/drivers/net/can/usb/ems_usb.c
index 0a37af4a3fa4..2b5302e72435 100644
--- a/drivers/net/can/usb/ems_usb.c
+++ b/drivers/net/can/usb/ems_usb.c
@@ -255,6 +255,8 @@ struct ems_usb {
unsigned int free_slots; /* remember number of available slots */
struct ems_cpc_msg active_params; /* active controller parameters */
+ void *rxbuf[MAX_RX_URBS];
+ dma_addr_t rxbuf_dma[MAX_RX_URBS];
};
static void ems_usb_read_interrupt_callback(struct urb *urb)
@@ -587,6 +589,7 @@ static int ems_usb_start(struct ems_usb *dev)
for (i = 0; i < MAX_RX_URBS; i++) {
struct urb *urb = NULL;
u8 *buf = NULL;
+ dma_addr_t buf_dma;
/* create a URB, and a buffer for it */
urb = usb_alloc_urb(0, GFP_KERNEL);
@@ -596,7 +599,7 @@ static int ems_usb_start(struct ems_usb *dev)
}
buf = usb_alloc_coherent(dev->udev, RX_BUFFER_SIZE, GFP_KERNEL,
- &urb->transfer_dma);
+ &buf_dma);
if (!buf) {
netdev_err(netdev, "No memory left for USB buffer\n");
usb_free_urb(urb);
@@ -604,6 +607,8 @@ static int ems_usb_start(struct ems_usb *dev)
break;
}
+ urb->transfer_dma = buf_dma;
+
usb_fill_bulk_urb(urb, dev->udev, usb_rcvbulkpipe(dev->udev, 2),
buf, RX_BUFFER_SIZE,
ems_usb_read_bulk_callback, dev);
@@ -619,6 +624,9 @@ static int ems_usb_start(struct ems_usb *dev)
break;
}
+ dev->rxbuf[i] = buf;
+ dev->rxbuf_dma[i] = buf_dma;
+
/* Drop reference, USB core will take care of freeing it */
usb_free_urb(urb);
}
@@ -684,6 +692,10 @@ static void unlink_all_urbs(struct ems_usb *dev)
usb_kill_anchored_urbs(&dev->rx_submitted);
+ for (i = 0; i < MAX_RX_URBS; ++i)
+ usb_free_coherent(dev->udev, RX_BUFFER_SIZE,
+ dev->rxbuf[i], dev->rxbuf_dma[i]);
+
usb_kill_anchored_urbs(&dev->tx_submitted);
atomic_set(&dev->active_tx_urbs, 0);
--
2.30.2
From: Pavel Skripkin <paskripkin(a)gmail.com>
Yasushi reported that his Microchip CAN Analyzer stopped working
since commit 91c02557174b ("can: mcba_usb: fix memory leak in
mcba_usb"). The problem was a missing urb->transfer_dma
initialization.
In my previous patch to this driver I refactored the mcba_usb_start()
code to avoid leaking USB coherent buffers. To achieve this, I passed a
local stack variable to usb_alloc_coherent() and then saved it to a
private array so that all coherent buffers are freed correctly on the
->close() call. But I forgot to initialize urb->transfer_dma with the
variable passed to usb_alloc_coherent().
This caused the device to stop working, since DMA address 0 is not
valid. The following log, found on the bug report page, points exactly
to the problem described above:
| DMAR: [DMA Write] Request device [00:14.0] PASID ffffffff fault addr 0 [fault reason 05] PTE Write access is not set
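The mistake can be reproduced in a small userspace model. The model_*
types and functions below are hypothetical stand-ins (malloc() replaces
usb_alloc_coherent(), and the buffer's own address is reused as a fake DMA
address); the sketch only shows how the URB field stays 0 when the local
variable is never copied into it.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t model_dma_addr_t;  /* hypothetical stand-in for dma_addr_t */

struct model_urb {
	void *transfer_buffer;
	model_dma_addr_t transfer_dma;   /* 0 means "never set" in this model */
};

/* Stand-in for usb_alloc_coherent(): returns the buffer, reports *dma. */
static void *model_alloc_coherent(size_t size, model_dma_addr_t *dma)
{
	void *buf = malloc(size);

	*dma = (model_dma_addr_t)buf;    /* fake DMA address for the sketch */
	return buf;
}

/*
 * Prepare one URB. With copy_dma == 0 this reproduces the forgotten
 * assignment; with copy_dma != 0 it performs the step the fix adds.
 * The address handed back by the allocator is returned via *saved_dma,
 * mirroring the private array used for freeing later.
 */
static void *model_prepare_urb(struct model_urb *urb, size_t size,
			       int copy_dma, model_dma_addr_t *saved_dma)
{
	model_dma_addr_t buf_dma = 0;
	void *buf = model_alloc_coherent(size, &buf_dma);

	if (!buf)
		return NULL;

	if (copy_dma)
		urb->transfer_dma = buf_dma;  /* the assignment that was missing */

	urb->transfer_buffer = buf;
	*saved_dma = buf_dma;
	return buf;
}
```

In the copy_dma == 0 case the saved address is valid but
urb->transfer_dma is still 0, which matches the "fault addr 0" in the log.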
Fixes: 91c02557174b ("can: mcba_usb: fix memory leak in mcba_usb")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990850
Link: https://lore.kernel.org/r/20210725103630.23864-1-paskripkin@gmail.com
Cc: linux-stable <stable(a)vger.kernel.org>
Reported-by: Yasushi SHOJI <yasushi.shoji(a)gmail.com>
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Tested-by: Yasushi SHOJI <yashi(a)spacecubics.com>
[mkl: fixed typos in commit message - thanks Yasushi SHOJI]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/mcba_usb.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/can/usb/mcba_usb.c b/drivers/net/can/usb/mcba_usb.c
index a45865bd7254..a1a154c08b7f 100644
--- a/drivers/net/can/usb/mcba_usb.c
+++ b/drivers/net/can/usb/mcba_usb.c
@@ -653,6 +653,8 @@ static int mcba_usb_start(struct mcba_priv *priv)
break;
}
+ urb->transfer_dma = buf_dma;
+
usb_fill_bulk_urb(urb, priv->udev,
usb_rcvbulkpipe(priv->udev, MCBA_USB_EP_IN),
buf, MCBA_USB_RX_BUFF_SIZE,
--
2.30.2
Hi,
This is a patch series for CVE-2021-21781.
The patch 9c698bff66ab ("ARM: ensure the signal page contains defined
contents") depends on memset32. However, this function is not provided
in 4.4 and 4.9. Therefore, we need the patch 3b3c4babd898 ("lib/string.c:
add multibyte memset functions") to apply this fix.
Another option is to implement the memset32 function only in
arch/arm/kernel/signal.c, or to use a memset loop, but for simplicity
we have taken the way of applying the original patch 3b3c4babd898
("lib/string.c: add multibyte memset functions"), which provides memset32
in the mainline kernel.
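For reference, a file-local alternative of the kind mentioned above might
look roughly like the sketch below. The name sketch_memset32 is
hypothetical, and this is a plain userspace illustration of a 32-bit fill
loop, not the code from patch 3b3c4babd898.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical local helper: store the 32-bit value v into count
 * consecutive words starting at s, and return s.
 */
static uint32_t *sketch_memset32(uint32_t *s, uint32_t v, size_t count)
{
	uint32_t *p = s;

	while (count--)
		*p++ = v;
	return s;
}
```

A caller would pass the number of 32-bit words rather than the number of
bytes, for example sizeof(buf) / sizeof(buf[0]) for an array.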
Best regards,
Nobuhiro
Matthew Wilcox (1):
lib/string.c: add multibyte memset functions
Russell King (1):
ARM: ensure the signal page contains defined contents
arch/arm/kernel/signal.c | 14 +++++----
include/linux/string.h | 30 ++++++++++++++++++
lib/string.c | 66 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 104 insertions(+), 6 deletions(-)
--
2.32.0
From 97cc3a4817c982954ff69355d2577a92bddbad4a Mon Sep 17 00:00:00 2001
From: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu(a)toshiba.co.jp>
Date: Fri, 30 Jul 2021 14:40:55 +0900
Subject: [RFC/PATCH 0/2] Backports CVE-2021-21781 for 4.4 and 4.9
Hi,
This is a patch series for CVE-2021-21781.
The patch 9c698bff66ab ("ARM: ensure the signal page contains defined
contents") depends on memset32. However, this function is not provided
in 4.4 and 4.9. Therefore, we need the patch 3b3c4babd898 ("lib/string.c:
add multibyte memset functions") to apply this fix.
Another option is to implement the memset32 function only in
arch/arm/kernel/signal.c, but for simplicity we have taken the
way of applying the original patch 3b3c4babd898 ("lib/string.c: add
multibyte memset functions"), which provides memset32 in the mainline kernel.
Best regards,
Nobuhiro
Matthew Wilcox (1):
lib/string.c: add multibyte memset functions
Russell King (1):
ARM: ensure the signal page contains defined contents
arch/arm/kernel/signal.c | 23 ++++++++++----
include/linux/string.h | 30 ++++++++++++++++++
lib/string.c | 66 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 113 insertions(+), 6 deletions(-)
--
2.32.0