The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64620e0a1e712a778095bd35cbb277dc2259281f Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Tue, 11 Jan 2022 14:43:41 +0000
Subject: [PATCH] bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang(a)gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0b3f4d683eb..c72c57a6684f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5318,9 +5318,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+ if (arg_type == ARG_PTR_TO_ALLOC_MEM)
+ goto force_off_check;
break;
/* All the rest must be rejected: */
default:
+force_off_check:
err = __check_ptr_off_reg(env, reg, regno,
type == PTR_TO_BTF_ID);
if (err < 0)
When introducing support for processed channels I needed
to invert the expression:
if (!iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
!iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE))
dev_err(dev, "source channel does not support raw/scale\n");
To the inverse, meaning detect when we can usse raw+scale
rather than when we can not. This was the result:
if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE))
dev_info(dev, "using raw+scale source channel\n");
Ooops. Spot the error. Yep old George Boole came up and bit me.
That should be an &&.
The current code "mostly works" because we have not run into
systems supporting only raw but not scale or only scale but not
raw, and I doubt there are few using the rescaler on anything
such, but let's fix the logic.
Cc: Liam Beguin <liambeguin(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: 53ebee949980 ("iio: afe: iio-rescale: Support processed channels")
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
drivers/iio/afe/iio-rescale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/afe/iio-rescale.c b/drivers/iio/afe/iio-rescale.c
index 7e511293d6d1..dc426e1484f0 100644
--- a/drivers/iio/afe/iio-rescale.c
+++ b/drivers/iio/afe/iio-rescale.c
@@ -278,7 +278,7 @@ static int rescale_configure_channel(struct device *dev,
chan->ext_info = rescale->ext_info;
chan->type = rescale->cfg->type;
- if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
+ if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) &&
iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE)) {
dev_info(dev, "using raw+scale source channel\n");
} else if (iio_channel_has_info(schan, IIO_CHAN_INFO_PROCESSED)) {
--
2.35.3
Currently, due to the sequential use of min_t() and clamp_t() macros,
in cdc_ncm_check_tx_max(), if dwNtbOutMaxSize is not set, the logic
sets tx_max to 0. This is then used to allocate the data area of the
SKB requested later in cdc_ncm_fill_tx_frame().
This does not cause an issue presently because when memory is
allocated during initialisation phase of SKB creation, more memory
(512b) is allocated than is required for the SKB headers alone (320b),
leaving some space (512b - 320b = 192b) for CDC data (172b).
However, if more elements (for example 3 x u64 = [24b]) were added to
one of the SKB header structs, say 'struct skb_shared_info',
increasing its original size (320b [320b aligned]) to something larger
(344b [384b aligned]), then suddenly the CDC data (172b) no longer
fits in the spare SKB data area (512b - 384b = 128b).
Consequently the SKB bounds checking semantics fails and panics:
skbuff: skb_over_panic: text:ffffffff830a5b5f len:184 put:172 \
head:ffff888119227c00 data:ffff888119227c00 tail:0xb8 end:0x80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:110!
RIP: 0010:skb_panic+0x14f/0x160 net/core/skbuff.c:106
<snip>
Call Trace:
<IRQ>
skb_over_panic+0x2c/0x30 net/core/skbuff.c:115
skb_put+0x205/0x210 net/core/skbuff.c:1877
skb_put_zero include/linux/skbuff.h:2270 [inline]
cdc_ncm_ndp16 drivers/net/usb/cdc_ncm.c:1116 [inline]
cdc_ncm_fill_tx_frame+0x127f/0x3d50 drivers/net/usb/cdc_ncm.c:1293
cdc_ncm_tx_fixup+0x98/0xf0 drivers/net/usb/cdc_ncm.c:1514
By overriding the max value with the default CDC_NCM_NTB_MAX_SIZE_TX
when not offered through the system provided params, we ensure enough
data space is allocated to handle the CDC data, meaning no crash will
occur.
Cc: stable(a)vger.kernel.org
Cc: Oliver Neukum <oliver(a)neukum.org>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: linux-usb(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Fixes: 289507d3364f9 ("net: cdc_ncm: use sysfs for rx/tx aggregation tuning")
Signed-off-by: Lee Jones <lee.jones(a)linaro.org>
---
drivers/net/usb/cdc_ncm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 24753a4da7e60..e303b522efb50 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -181,6 +181,8 @@ static u32 cdc_ncm_check_tx_max(struct usbnet *dev, u32 new_tx)
min = ctx->max_datagram_size + ctx->max_ndp_size + sizeof(struct usb_cdc_ncm_nth32);
max = min_t(u32, CDC_NCM_NTB_MAX_SIZE_TX, le32_to_cpu(ctx->ncm_parm.dwNtbOutMaxSize));
+ if (max == 0)
+ max = CDC_NCM_NTB_MAX_SIZE_TX; /* dwNtbOutMaxSize not set */
/* some devices set dwNtbOutMaxSize too low for the above default */
min = min(min, max);
--
2.34.0.384.gca35af8252-goog
When bfqq is shared by multiple processes it can happen that one of the
processes gets moved to a different cgroup (or just starts submitting IO
for different cgroup). In case that happens we need to split the merged
bfqq as otherwise we will have IO for multiple cgroups in one bfqq and
we will just account IO time to wrong entities etc.
Similarly if the bfqq is scheduled to merge with another bfqq but the
merge didn't happen yet, cancel the merge as it need not be valid
anymore.
CC: stable(a)vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-cgroup.c | 36 +++++++++++++++++++++++++++++++++---
block/bfq-iosched.c | 2 +-
block/bfq-iosched.h | 1 +
3 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 420eda2589c0..9352f3cc2377 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -743,9 +743,39 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
}
if (sync_bfqq) {
- entity = &sync_bfqq->entity;
- if (entity->sched_data != &bfqg->sched_data)
- bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+ if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
+ /* We are the only user of this bfqq, just move it */
+ if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
+ bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+ } else {
+ struct bfq_queue *bfqq;
+
+ /*
+ * The queue was merged to a different queue. Check
+ * that the merge chain still belongs to the same
+ * cgroup.
+ */
+ for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
+ if (bfqq->entity.sched_data !=
+ &bfqg->sched_data)
+ break;
+ if (bfqq) {
+ /*
+ * Some queue changed cgroup so the merge is
+ * not valid anymore. We cannot easily just
+ * cancel the merge (by clearing new_bfqq) as
+ * there may be other processes using this
+ * queue and holding refs to all queues below
+ * sync_bfqq->new_bfqq. Similarly if the merge
+ * already happened, we need to detach from
+ * bfqq now so that we cannot merge bio to a
+ * request from the old cgroup.
+ */
+ bfq_put_cooperator(sync_bfqq);
+ bfq_release_process_ref(bfqd, sync_bfqq);
+ bic_set_bfqq(bic, NULL, 1);
+ }
+ }
}
return bfqg;
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7d00b21ebe5d..89fe3f85eb3c 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5315,7 +5315,7 @@ static void bfq_put_stable_ref(struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void bfq_put_cooperator(struct bfq_queue *bfqq)
+void bfq_put_cooperator(struct bfq_queue *bfqq)
{
struct bfq_queue *__bfqq, *next;
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 3b83e3d1c2e5..a56763045d19 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -979,6 +979,7 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
void bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq,
bool compensate, enum bfqq_expiration reason);
void bfq_put_queue(struct bfq_queue *bfqq);
+void bfq_put_cooperator(struct bfq_queue *bfqq);
void bfq_end_wr_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg);
void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq);
void bfq_schedule_dispatch(struct bfq_data *bfqd);
--
2.34.1
From: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Erase can be zeroed in spi_nor_parse_4bait() or
spi_nor_init_non_uniform_erase_map(). In practice it happened with
mt25qu256a, which supports 4K, 32K, 64K erases with 3b address commands,
but only 4K and 64K erase with 4b address commands.
Fixes: dc92843159a7 ("mtd: spi-nor: fix erase_type array to indicate current map conf")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
---
Changes in v2:
erase->opcode -> erase->size
drivers/mtd/spi-nor/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 88dd090..183ea9d 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -1400,6 +1400,8 @@ spi_nor_find_best_erase_type(const struct spi_nor_erase_map *map,
continue;
erase = &map->erase_type[i];
+ if (!erase->size)
+ continue;
/* Alignment is not mandatory for overlaid regions */
if (region->offset & SNOR_OVERLAID_REGION &&
--
2.10.2
This reverts commit 2dc016599cfa9672a147528ca26d70c3654a5423.
Users are reporting regressions in regulatory domain detection and
channel availability.
The problem this was trying to resolve was fixed in firmware anyway:
QCA6174 hw3.0: sdio-4.4.1: add firmware.bin_WLAN.RMH.4.4.1-00042
https://github.com/kvalo/ath10k-firmware/commit/4d382787f0efa77dba40394e0bc…
Link: https://bbs.archlinux.org/viewtopic.php?id=254535
Link: http://lists.infradead.org/pipermail/ath10k/2020-April/014871.html
Link: http://lists.infradead.org/pipermail/ath10k/2020-May/015152.html
Fixes: 2dc016599cfa ("ath: add support for special 0x0 regulatory domain")
Cc: <stable(a)vger.kernel.org>
Cc: Wen Gong <wgong(a)codeaurora.org>
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
---
drivers/net/wireless/ath/regd.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/ath/regd.c b/drivers/net/wireless/ath/regd.c
index bee9110b91f3..20f4f8ea9f89 100644
--- a/drivers/net/wireless/ath/regd.c
+++ b/drivers/net/wireless/ath/regd.c
@@ -666,14 +666,14 @@ ath_regd_init_wiphy(struct ath_regulatory *reg,
/*
* Some users have reported their EEPROM programmed with
- * 0x8000 or 0x0 set, this is not a supported regulatory
- * domain but since we have more than one user with it we
- * need a solution for them. We default to 0x64, which is
- * the default Atheros world regulatory domain.
+ * 0x8000 set, this is not a supported regulatory domain
+ * but since we have more than one user with it we need
+ * a solution for them. We default to 0x64, which is the
+ * default Atheros world regulatory domain.
*/
static void ath_regd_sanitize(struct ath_regulatory *reg)
{
- if (reg->current_rd != COUNTRY_ERD_FLAG && reg->current_rd != 0)
+ if (reg->current_rd != COUNTRY_ERD_FLAG)
return;
printk(KERN_DEBUG "ath: EEPROM regdomain sanitized\n");
reg->current_rd = 0x64;
--
2.27.0.rc0.183.gde8f92d652-goog
Ampere Altra defines CPU clusters in the ACPI PPTT. They share a Snoop
Control Unit, but have no shared CPU-side last level cache.
cpu_coregroup_mask() will return a cpumask with weight 1, while
cpu_clustergroup_mask() will return a cpumask with weight 2.
As a result, build_sched_domain() will BUG() once per CPU with:
BUG: arch topology borken
the CLS domain not a subset of the MC domain
The MC level cpumask is then extended to that of the CLS child, and is
later removed entirely as redundant. This sched domain topology is an
improvement over previous topologies, or those built without
SCHED_CLUSTER, particularly for certain latency sensitive workloads.
With the current scheduler model and heuristics, this is a desirable
default topology for Ampere Altra and Altra Max system.
Rather than create a custom sched domains topology structure and
introduce new logic in arch/arm64 to detect these systems, update the
core_mask so coregroup is never a subset of clustergroup, extending it
to cluster_siblings if necessary. Only do this if CONFIG_SCHED_CLUSTER
is enabled to avoid also changing the topology (MC) when
CONFIG_SCHED_CLUSTER is disabled.
This has the added benefit over a custom topology of working for both
symmetric and asymmetric topologies. It does not address systems where
the CLUSTER topology is above a populated MC topology, but these are not
considered today and can be addressed separately if and when they
appear.
The final sched domain topology for a 2 socket Ampere Altra system is
unchanged with or without CONFIG_SCHED_CLUSTER, and the BUG is avoided:
For CPU0:
CONFIG_SCHED_CLUSTER=y
CLS [0-1]
DIE [0-79]
NUMA [0-159]
CONFIG_SCHED_CLUSTER is not set
DIE [0-79]
NUMA [0-159]
Signed-off-by: Darren Hart <darren(a)os.amperecomputing.com>
Suggested-by: Barry Song <song.bao.hua(a)hisilicon.com>
Reviewed-by: Barry Song <song.bao.hua(a)hisilicon.com>
Acked-by: Sudeep Holla <sudeep.holla(a)arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: D. Scott Phillips <scott(a)os.amperecomputing.com>
Cc: Ilkka Koskinen <ilkka(a)os.amperecomputing.com>
Cc: <stable(a)vger.kernel.org> # 5.16.x
---
v1: Drop MC level if coregroup weight == 1
v2: New sd topo in arch/arm64/kernel/smp.c
v3: No new topo, extend core_mask to cluster_siblings
v4: Rebase on 5.18-rc1 for GregKH to pull. Add IS_ENABLED(CONFIG_SCHED_CLUSTER).
v5: Rebase on 5.18-rc2 for GregKH to pull. Add collected tags. No other changes.
drivers/base/arch_topology.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..5497c5ab7318 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -667,6 +667,15 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
core_mask = &cpu_topology[cpu].llc_sibling;
}
+ /*
+ * For systems with no shared cpu-side LLC but with clusters defined,
+ * extend core_mask to cluster_siblings. The sched domain builder will
+ * then remove MC as redundant with CLS if SCHED_CLUSTER is enabled.
+ */
+ if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
+ cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
+ core_mask = &cpu_topology[cpu].cluster_sibling;
+
return core_mask;
}
--
2.34.1
Fix an issue with the Tyan Tomcat IV S1564D system, the BIOS of which
does not assign PCI buses beyond #2, where our resource reallocation
code preserves the reset default of an I/O BAR assignment outside its
upstream PCI-to-PCI bridge's I/O forwarding range for device 06:08.0 in
this log:
pci_bus 0000:00: max bus depth: 4 pci_try_num: 5
[...]
pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] conflicts with 0000:06:08.0 [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: failed to assign [io size 0x0020]
pci 0000:05:00.0: PCI bridge to [bus 06]
pci 0000:05:00.0: bridge window [mem 0xd8000000-0xd85fffff]
[...]
pci 0000:00:11.0: PCI bridge to [bus 01-06]
pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:00:11.0: bridge window [mem 0xd8000000-0xdfffffff]
pci 0000:00:11.0: bridge window [mem 0xa8000000-0xafffffff 64bit pref]
pci_bus 0000:00: No. 2 try to assign unassigned res
[...]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] conflicts with 0000:06:08.0 [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: failed to assign [io size 0x0020]
pci 0000:05:00.0: PCI bridge to [bus 06]
pci 0000:05:00.0: bridge window [mem 0xd8000000-0xd85fffff]
[...]
pci 0000:00:11.0: PCI bridge to [bus 01-06]
pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:00:11.0: bridge window [mem 0xd8000000-0xdfffffff]
pci 0000:00:11.0: bridge window [mem 0xa8000000-0xafffffff 64bit pref]
pci_bus 0000:00: No. 3 try to assign unassigned res
pci 0000:00:11.0: resource 7 [io 0xe000-0xefff] released
[...]
pci 0000:06:08.1: BAR 4: assigned [io 0x2000-0x201f]
pci 0000:05:00.0: PCI bridge to [bus 06]
pci 0000:05:00.0: bridge window [io 0x2000-0x2fff]
pci 0000:05:00.0: bridge window [mem 0xd8000000-0xd85fffff]
[...]
pci 0000:00:11.0: PCI bridge to [bus 01-06]
pci 0000:00:11.0: bridge window [io 0x1000-0x2fff]
pci 0000:00:11.0: bridge window [mem 0xd8000000-0xdfffffff]
pci 0000:00:11.0: bridge window [mem 0xa8000000-0xafffffff 64bit pref]
pci_bus 0000:00: resource 4 [io 0x0000-0xffff]
pci_bus 0000:00: resource 5 [mem 0x00000000-0xffffffff]
pci_bus 0000:01: resource 0 [io 0x1000-0x2fff]
pci_bus 0000:01: resource 1 [mem 0xd8000000-0xdfffffff]
pci_bus 0000:01: resource 2 [mem 0xa8000000-0xafffffff 64bit pref]
pci_bus 0000:02: resource 0 [io 0x1000-0x2fff]
pci_bus 0000:02: resource 1 [mem 0xd8000000-0xd8bfffff]
pci_bus 0000:04: resource 0 [io 0x1000-0x1fff]
pci_bus 0000:04: resource 1 [mem 0xd8600000-0xd8afffff]
pci_bus 0000:05: resource 0 [io 0x2000-0x2fff]
pci_bus 0000:05: resource 1 [mem 0xd8000000-0xd85fffff]
pci_bus 0000:06: resource 0 [io 0x2000-0x2fff]
pci_bus 0000:06: resource 1 [mem 0xd8000000-0xd85fffff]
-- note that the assignment of 0xfce0-0xfcff is outside the range of
0x2000-0x2fff assigned to bus #6:
05:00.0 PCI bridge: Texas Instruments XIO2000(A)/XIO2200A PCI Express-to-PCI Bridge (rev 03) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=05, secondary=06, subordinate=06, sec-latency=0
I/O behind bridge: 00002000-00002fff
Memory behind bridge: d8000000-d85fffff
Capabilities: [50] Power Management version 2
Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/4 Enable-
Capabilities: [80] #0d [0000]
Capabilities: [90] Express PCI/PCI-X Bridge IRQ 0
06:08.0 USB controller: VIA Technologies, Inc. VT82xx/62xx/VX700/8x0/900 UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. VT82xx/62xx/VX700/8x0/900 UHCI USB 1.1 Controller
Flags: bus master, medium devsel, latency 22, IRQ 5
I/O ports at fce0 [size=32]
Capabilities: [80] Power Management version 2
06:08.1 USB controller: VIA Technologies, Inc. VT82xx/62xx/VX700/8x0/900 UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. VT82xx/62xx/VX700/8x0/900 UHCI USB 1.1 Controller
Flags: bus master, medium devsel, latency 22, IRQ 5
I/O ports at 2000 [size=32]
Capabilities: [80] Power Management version 2
Since both 06:08.0 and 06:08.1 have the same reset defaults the latter
device escapes its fate and gets a good assignment owing to an address
conflict with the former device.
Consequently when the device driver tries to access 06:08.0 according to
its designated address range it pokes at an unassigned I/O location,
likely subtractively decoded by the southbridge and forwarded to ISA,
causing the driver to become confused and bail out:
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
uhci_hcd 0000:06:08.0: host controller halted, very bad!
uhci_hcd 0000:06:08.0: HCRESET not completed yet!
uhci_hcd 0000:06:08.0: HC died; cleaning up
if good luck happens or if bad luck does, an infinite flood of messages:
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
[...]
making the system virtually unusuable.
This is because we have code to deal with a situation from PR #16263,
where broken ACPI firmware reports the wrong address range for the host
bridge's decoding window and trying to adjust to the window causes more
breakage than leaving the BIOS assignments intact.
This may work for a device directly on the root bus decoded by the host
bridge only, but for a device behind one or more PCI-to-PCI (or CardBus)
bridges those bridges' forwarding windows have been standardised and
need to be respected, or leaving whatever has been there in a downstream
device's BAR will have no effect as cycles for the addresses recorded
there will have no chance to appear on the bus the device has been
immediately attached to.
Make sure then for a device behind a PCI-to-PCI bridge that any firmware
assignment is within the bridge's relevant forwarding window or do not
restore the assignment, fixing the system concerned as follows:
pci_bus 0000:00: max bus depth: 4 pci_try_num: 5
[...]
pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.0: BAR 4: failed to assign [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: failed to assign [io 0xfce0-0xfcff]
[...]
pci_bus 0000:00: No. 2 try to assign unassigned res
[...]
pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.0: BAR 4: failed to assign [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: failed to assign [io 0xfce0-0xfcff]
[...]
pci_bus 0000:00: No. 3 try to assign unassigned res
[...]
pci 0000:06:08.0: BAR 4: assigned [io 0x2000-0x201f]
pci 0000:06:08.1: BAR 4: assigned [io 0x2020-0x203f]
and making device 06:08.0 work correctly.
Cf. <https://bugzilla.kernel.org/show_bug.cgi?id=16263>
Signed-off-by: Maciej W. Rozycki <macro(a)orcam.me.uk>
Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses")
Cc: stable(a)vger.kernel.org # v2.6.35+
---
Hi,
Resending this patch as it has gone into void. Patch re-verified against
5.17-rc2.
For the record the system's bus topology is as follows:
-[0000:00]-+-00.0
+-07.0
+-07.1
+-07.2
+-11.0-[0000:01-06]----00.0-[0000:02-06]--+-00.0-[0000:03]--
| +-01.0-[0000:04]--+-00.0
| | \-00.3
| \-02.0-[0000:05-06]----00.0-[0000:06]--+-05.0
| +-08.0
| +-08.1
| \-08.2
+-12.0
+-13.0
\-14.0
Maciej
Changes from v1:
- Do restore firmware BAR assignments behind a PCI-PCI bridge, but only if
within the bridge's forwarding window.
- Update the change description and heading accordingly (was: PCI: Do not
restore firmware BAR assignments behind a PCI-PCI bridge).
---
drivers/pci/setup-res.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
linux-pci-setup-res-fw-address-nobridge.diff
Index: linux-macro/drivers/pci/setup-res.c
===================================================================
--- linux-macro.orig/drivers/pci/setup-res.c
+++ linux-macro/drivers/pci/setup-res.c
@@ -212,9 +212,19 @@ static int pci_revert_fw_address(struct
res->end = res->start + size - 1;
res->flags &= ~IORESOURCE_UNSET;
+ /*
+ * If we're behind a P2P or CardBus bridge, make sure we're
+ * inside the relevant forwarding window, or otherwise the
+ * assignment must have been bogus and accesses intended for
+ * the range assigned would not reach the device anyway.
+ * On the root bus accept anything under the assumption the
+ * host bridge will let it through.
+ */
root = pci_find_parent_resource(dev, res);
if (!root) {
- if (res->flags & IORESOURCE_IO)
+ if (dev->bus->parent)
+ return -ENXIO;
+ else if (res->flags & IORESOURCE_IO)
root = &ioport_resource;
else
root = &iomem_resource;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1dd498e5e26ad71e3e9130daf72cfb6a693fee03 Mon Sep 17 00:00:00 2001
From: James Morse <james.morse(a)arm.com>
Date: Thu, 27 Jan 2022 12:20:52 +0000
Subject: [PATCH] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap
errata
Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when
single-stepping authenticated ERET instructions. A single step is
expected, but a pointer authentication trap is taken instead. The
erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow
EL1 to cause a return to EL2 with a guest controlled ELR_EL2.
Because the conditions require an ERET into active-not-pending state,
this is only a problem for the EL2 when EL2 is stepping EL1. In this case
the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be
restored.
Cc: stable(a)vger.kernel.org # 53960faf2b73: arm64: Add Cortex-A510 CPU part definition
Cc: stable(a)vger.kernel.org
Signed-off-by: James Morse <james.morse(a)arm.com>
[maz: fixup cpucaps ordering]
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20220127122052.1584324-5-james.morse@arm.com
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 0ec7b7f1524b..ea281dd75517 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -100,6 +100,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2051678 | ARM64_ERRATUM_2051678 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A510 | #2077057 | ARM64_ERRATUM_2077057 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2119858 | ARM64_ERRATUM_2119858 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f2b5a4abef21..cbcd42decb2a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -680,6 +680,22 @@ config ARM64_ERRATUM_2051678
If unsure, say Y.
+config ARM64_ERRATUM_2077057
+ bool "Cortex-A510: 2077057: workaround software-step corrupting SPSR_EL2"
+ help
+ This option adds the workaround for ARM Cortex-A510 erratum 2077057.
+ Affected Cortex-A510 may corrupt SPSR_EL2 when the a step exception is
+ expected, but a Pointer Authentication trap is taken instead. The
+ erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow
+ EL1 to cause a return to EL2 with a guest controlled ELR_EL2.
+
+ This can only happen when EL2 is stepping EL1.
+
+ When these conditions occur, the SPSR_EL2 value is unchanged from the
+ previous guest entry, and can be restored from the in-memory copy.
+
+ If unsure, say Y.
+
config ARM64_ERRATUM_2119858
bool "Cortex-A710/X2: 2119858: workaround TRBE overwriting trace data in FILL mode"
default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 066098198c24..b217941713a8 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -600,6 +600,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_2077057
+ {
+ .desc = "ARM erratum 2077057",
+ .capability = ARM64_WORKAROUND_2077057,
+ .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+ ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
+ },
+#endif
#ifdef CONFIG_ARM64_ERRATUM_2064142
{
.desc = "ARM erratum 2064142",
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 331dd10821df..701cfb964905 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -402,6 +402,24 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
return false;
}
+static inline void synchronize_vcpu_pstate(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+ /*
+ * Check for the conditions of Cortex-A510's #2077057. When these occur
+ * SPSR_EL2 can't be trusted, but isn't needed either as it is
+ * unchanged from the value in vcpu_gp_regs(vcpu)->pstate.
+ * Are we single-stepping the guest, and took a PAC exception from the
+ * active-not-pending state?
+ */
+ if (cpus_have_final_cap(ARM64_WORKAROUND_2077057) &&
+ vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP &&
+ *vcpu_cpsr(vcpu) & DBG_SPSR_SS &&
+ ESR_ELx_EC(read_sysreg_el2(SYS_ESR)) == ESR_ELx_EC_PAC)
+ write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
+
+ vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR);
+}
+
/*
* Return true when we were able to fixup the guest exit and should return to
* the guest, false when we should restore the host state and return to the
@@ -413,7 +431,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
* Save PSTATE early so that we can evaluate the vcpu mode
* early on.
*/
- vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR);
+ synchronize_vcpu_pstate(vcpu, exit_code);
/*
* Check whether we want to repaint the state one way or
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index e7719e8f18de..9c65b1e25a96 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -55,9 +55,10 @@ WORKAROUND_1418040
WORKAROUND_1463225
WORKAROUND_1508412
WORKAROUND_1542419
-WORKAROUND_2064142
-WORKAROUND_2038923
WORKAROUND_1902691
+WORKAROUND_2038923
+WORKAROUND_2064142
+WORKAROUND_2077057
WORKAROUND_TRBE_OVERWRITE_FILL_MODE
WORKAROUND_TSB_FLUSH_FAILURE
WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
From: Maxim Levitsky <mlevitsk(a)redhat.com>
[ Upstream commit 755c2bf878607dbddb1423df9abf16b82205896f ]
kvm_apic_update_apicv is called when AVIC is still active, thus IRR bits
can be set by the CPU after it is called, and don't cause the irr_pending
to be set to true.
Also logic in avic_kick_target_vcpu doesn't expect a race with this
function so to make it simple, just keep irr_pending set to true and
let the next interrupt injection to the guest clear it.
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Message-Id: <20220207155447.840194-9-mlevitsk(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/x86/kvm/lapic.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 677d21082454f..d484269a390bc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2292,7 +2292,12 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
apic->irr_pending = true;
apic->isr_count = 1;
} else {
- apic->irr_pending = (apic_search_irr(apic) != -1);
+ /*
+ * Don't clear irr_pending, searching the IRR can race with
+ * updates from the CPU as APICv is still active from hardware's
+ * perspective. The flag will be cleared as appropriate when
+ * KVM injects the interrupt.
+ */
apic->isr_count = count_vectors(apic->regs + APIC_ISR);
}
}
--
2.34.1
The i801 controller provides a locking mechanism that the OS is supposed
to use to safely share the SMBus with ACPI AML or other firmware.
Previously, Linux attempted to get out of the way of ACPI AML entirely,
but left the bus locked if it used it before the first AML access. This
causes AML implementations that *do* attempt to safely share the bus
to time out if Linux uses it first; notably, this regressed ACPI video
backlight controls on 2015 iMacs after 01590f361e started instantiating
SPD EEPROMs on boot.
Commit 065b6211a8 fixed the immediate problem of leaving the bus locked,
but we can do better. The controller does have a proper locking mechanism,
so let's use it as intended. Since we can't rely on the BIOS doing this
properly, we implement the following logic:
- If ACPI AML uses the bus at all, we make a note and disable power
management. The latter matches already existing behavior.
- When we want to use the bus, we attempt to lock it first. If the
locking attempt times out, *and* ACPI hasn't tried to use the bus at
all yet, we cautiously go ahead and assume the BIOS forgot to unlock
the bus after boot. This preserves existing behavior.
- We always unlock the bus after a transfer.
- If ACPI AML tries to use the bus (except trying to lock it) while
we're in the middle of a transfer, or after we've determined
locking is broken, we know we cannot safely share the bus and give up.
Upon first usage of SMBus by ACPI AML, if nothing has gone horribly
wrong so far, users will see:
i801_smbus 0000:00:1f.4: SMBus controller is shared with ACPI AML. This seems safe so far.
If locking the SMBus times out, users will see:
i801_smbus 0000:00:1f.4: BIOS left SMBus locked
And if ACPI AML tries to use the bus concurrently with Linux, or it
previously used the bus and we failed to subsequently lock it as
above, the driver will give up and users will get:
i801_smbus 0000:00:1f.4: BIOS uses SMBus unsafely
i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
This fixes the regression introduced by 01590f361e, and further allows
safely sharing the SMBus on 2015 iMacs. Tested by running `i2cdump` in a
loop while changing backlight levels via the ACPI video device.
Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Hector Martin <marcan(a)marcan.st>
---
drivers/i2c/busses/i2c-i801.c | 96 ++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 17 deletions(-)
diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index 04a1e38f2a6f..03be6310d6d7 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -287,11 +287,18 @@ struct i801_priv {
#endif
struct platform_device *tco_pdev;
+ /* BIOS left the controller marked busy. */
+ bool inuse_stuck;
/*
- * If set to true the host controller registers are reserved for
- * ACPI AML use. Protected by acpi_lock.
+ * If set to true, ACPI AML uses the host controller registers.
+ * Protected by acpi_lock.
*/
- bool acpi_reserved;
+ bool acpi_usage;
+ /*
+ * If set to true, ACPI AML uses the host controller registers in an
+ * unsafe way. Protected by acpi_lock.
+ */
+ bool acpi_unsafe;
struct mutex acpi_lock;
};
@@ -854,10 +861,37 @@ static s32 i801_access(struct i2c_adapter *adap, u16 addr,
int hwpec;
int block = 0;
int ret = 0, xact = 0;
+ int timeout = 0;
struct i801_priv *priv = i2c_get_adapdata(adap);
+ /*
+ * The controller provides a bit that implements a mutex mechanism
+ * between users of the bus. First, try to lock the hardware mutex.
+ * If this doesn't work, we give up trying to do this, but then
+ * bail if ACPI uses SMBus at all.
+ */
+ if (!priv->inuse_stuck) {
+ while (inb_p(SMBHSTSTS(priv)) & SMBHSTSTS_INUSE_STS) {
+ if (++timeout >= MAX_RETRIES) {
+ dev_warn(&priv->pci_dev->dev,
+ "BIOS left SMBus locked\n");
+ priv->inuse_stuck = true;
+ break;
+ }
+ usleep_range(250, 500);
+ }
+ }
+
mutex_lock(&priv->acpi_lock);
- if (priv->acpi_reserved) {
+ if (priv->acpi_usage && priv->inuse_stuck && !priv->acpi_unsafe) {
+ priv->acpi_unsafe = true;
+
+ dev_warn(&priv->pci_dev->dev, "BIOS uses SMBus unsafely\n");
+ dev_warn(&priv->pci_dev->dev,
+ "Driver SMBus register access inhibited\n");
+ }
+
+ if (priv->acpi_unsafe) {
mutex_unlock(&priv->acpi_lock);
return -EBUSY;
}
@@ -1639,6 +1673,16 @@ static bool i801_acpi_is_smbus_ioport(const struct i801_priv *priv,
address <= pci_resource_end(priv->pci_dev, SMBBAR);
}
+static acpi_status
+i801_acpi_do_access(u32 function, acpi_physical_address address,
+ u32 bits, u64 *value)
+{
+ if ((function & ACPI_IO_MASK) == ACPI_READ)
+ return acpi_os_read_port(address, (u32 *)value, bits);
+ else
+ return acpi_os_write_port(address, (u32)*value, bits);
+}
+
static acpi_status
i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
u64 *value, void *handler_context, void *region_context)
@@ -1648,17 +1692,38 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
acpi_status status;
/*
- * Once BIOS AML code touches the OpRegion we warn and inhibit any
- * further access from the driver itself. This device is now owned
- * by the system firmware.
+ * Non-i801 accesses pass through.
*/
- mutex_lock(&priv->acpi_lock);
+ if (!i801_acpi_is_smbus_ioport(priv, address))
+ return i801_acpi_do_access(function, address, bits, value);
- if (!priv->acpi_reserved && i801_acpi_is_smbus_ioport(priv, address)) {
- priv->acpi_reserved = true;
+ if (!mutex_trylock(&priv->acpi_lock)) {
+ mutex_lock(&priv->acpi_lock);
+ /*
+ * This better be a read of the status register to acquire
+ * the lock...
+ */
+ if (!priv->acpi_unsafe &&
+ !(address == SMBHSTSTS(priv) &&
+ (function & ACPI_IO_MASK) == ACPI_READ)) {
+ /*
+ * Uh-oh, ACPI AML is trying to do something with the
+ * controller without locking it properly.
+ */
+ priv->acpi_unsafe = true;
+
+ dev_warn(&pdev->dev, "BIOS uses SMBus unsafely\n");
+ dev_warn(&pdev->dev,
+ "Driver SMBus register access inhibited\n");
+ }
+ }
- dev_warn(&pdev->dev, "BIOS is accessing SMBus registers\n");
- dev_warn(&pdev->dev, "Driver SMBus register access inhibited\n");
+ if (!priv->acpi_usage) {
+ priv->acpi_usage = true;
+
+ if (!priv->acpi_unsafe)
+ dev_info(&pdev->dev,
+ "SMBus controller is shared with ACPI AML. This seems safe so far.\n");
/*
* BIOS is accessing the host controller so prevent it from
@@ -1667,10 +1732,7 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
pm_runtime_get_sync(&pdev->dev);
}
- if ((function & ACPI_IO_MASK) == ACPI_READ)
- status = acpi_os_read_port(address, (u32 *)value, bits);
- else
- status = acpi_os_write_port(address, (u32)*value, bits);
+ status = i801_acpi_do_access(function, address, bits, value);
mutex_unlock(&priv->acpi_lock);
@@ -1706,7 +1768,7 @@ static void i801_acpi_remove(struct i801_priv *priv)
ACPI_ADR_SPACE_SYSTEM_IO, i801_acpi_io_handler);
mutex_lock(&priv->acpi_lock);
- if (priv->acpi_reserved)
+ if (priv->acpi_usage)
pm_runtime_put(&priv->pci_dev->dev);
mutex_unlock(&priv->acpi_lock);
}
--
2.32.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e8e79c416aae1de224c0f1860f2e3350fa171f8 Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
Date: Thu, 17 Mar 2022 08:57:35 +0100
Subject: [PATCH] can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable(a)vger.kernel.org
Reported-by: Hangyu Hua <hbh25y(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 1a4b56f6fa8c..b3b5bc1c803b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1637,8 +1637,6 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
if (err)
goto out_fail;
- can_put_echo_skb(skb, dev, 0, 0);
-
if (cdev->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(cdev, M_CAN_CCCR);
cccr &= ~CCCR_CMR_MASK;
@@ -1655,6 +1653,9 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
m_can_write(cdev, M_CAN_CCCR, cccr);
}
m_can_write(cdev, M_CAN_TXBTIE, 0x1);
+
+ can_put_echo_skb(skb, dev, 0, 0);
+
m_can_write(cdev, M_CAN_TXBAR, 0x1);
/* End of xmit function for version 3.0.x */
} else {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e53ac7374e64dede04d745ff0e70ff5048378d1f Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel(a)surriel.com>
Date: Tue, 22 Mar 2022 14:44:09 -0700
Subject: [PATCH] mm: invalidate hwpoison page cache page in fault path
Sometimes the page offlining code can leave behind a hwpoisoned clean
page cache page. This can lead to programs being killed over and over
and over again as they fault in the hwpoisoned page, get killed, and
then get re-spawned by whatever wanted to run them.
This is particularly embarrassing when the page was offlined due to
having too many corrected memory errors. Now we are killing tasks due
to them trying to access memory that probably isn't even corrupted.
This problem can be avoided by invalidating the page from the page fault
handler, which already has a branch for dealing with these kinds of
pages. With this patch we simply pretend the page fault was successful
if the page was invalidated, return to userspace, incur another page
fault, read in the file from disk (to a new memory page), and then
everything works again.
Link: https://lkml.kernel.org/r/20220212213740.423efcea@imladris.surriel.com
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c96281458c83..1a55b4c5b5db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3877,11 +3877,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
return ret;
if (unlikely(PageHWPoison(vmf->page))) {
- if (ret & VM_FAULT_LOCKED)
+ vm_fault_t poisonret = VM_FAULT_HWPOISON;
+ if (ret & VM_FAULT_LOCKED) {
+ /* Retry if a clean page was removed from the cache. */
+ if (invalidate_inode_page(vmf->page))
+ poisonret = 0;
unlock_page(vmf->page);
+ }
put_page(vmf->page);
vmf->page = NULL;
- return VM_FAULT_HWPOISON;
+ return poisonret;
}
if (unlikely(!(ret & VM_FAULT_LOCKED)))
This is the start of the stable review cycle for the 5.15.39 release.
There are 135 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 12 May 2022 13:07:16 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.39-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.39-rc1
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Update comment about link going down after link-up
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Drop __maybe_unused from advk_pcie_disable_phy()
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Don't mask irq when mapping
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Remove irq_mask_ack() callback for INTx interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Use separate INTA interrupt for emulated root bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix support for PME requester on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for PME interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Optimize writing PCI_EXP_RTCTL_PMEIE and PCI_EXP_RTSTA_PME on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for ERR interrupt on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Enable MSI-X support
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix setting MSI address
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for masking MSI interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Refactor unmasking summary MSI interrupt
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Use dev_fwnode() instead of of_node_to_fwnode(dev->of_node)
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Make msi_domain_info structure a static driver structure
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Make MSI irq_chip structures static driver structures
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Check return value of generic_handle_domain_irq() when processing INTx IRQ
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Rewrite IRQ code to chained IRQ handler
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Replace custom PCIE_CORE_INT_* macros with PCI_INTERRUPT_*
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable common PHY when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable link training when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Assert PERST# when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix memory leak in driver unbind
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Mask all interrupts when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable bus mastering when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Comment actions in driver remove method
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Clear all MSIs at setup
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: pci-bridge-emul: Add definitions for missing capabilities registers
Pali Rohár <pali(a)kernel.org>
PCI: pci-bridge-emul: Add description for class_revision field
Frederic Weisbecker <frederic(a)kernel.org>
rcu: Apply callbacks processing time limit only on softirq
Frederic Weisbecker <frederic(a)kernel.org>
rcu: Fix callbacks processing time limit retaining cond_resched()
Helge Deller <deller(a)gmx.de>
Revert "parisc: Mark sched_clock unstable only if clocks are not syncronized"
Ricky WU <ricky_wu(a)realtek.com>
mmc: rtsx: add 74 Clocks in power on flow
Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
selftest/vm: verify remap destination address in mremap_test
Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
selftest/vm: verify mmap addr in mremap_test
Wanpeng Li <wanpengli(a)tencent.com>
KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertised
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86/mmu: avoid NULL-pointer dereference on page freeing bugs
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: Do not change ICR on write to APIC_SELF_IPI
Wanpeng Li <wanpengli(a)tencent.com>
x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
Thomas Huth <thuth(a)redhat.com>
KVM: selftests: Silence compiler warning in the kvm_page_table_test
Paolo Bonzini <pbonzini(a)redhat.com>
kvm: selftests: do not use bitfields larger than 32-bits for PTEs
Hector Martin <marcan(a)marcan.st>
iommu/dart: Add missing module owner to ops structure
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Don't skip fib events on current dst
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Fix fib_info pointer assignment
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Fix use-after-free in fib event handler
Aya Levin <ayal(a)nvidia.com>
net/mlx5: Fix slab-out-of-bounds while reading resource dump menu
Javier Martinez Canillas <javierm(a)redhat.com>
fbdev: Make fb_release() return -ENODEV if fbdev was unregistered
Sandipan Das <sandipan.das(a)amd.com>
kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU
Baruch Siach <baruch(a)tkos.co.il>
gpio: mvebu: drop pwm base assignment
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
drm/amdgpu: Ensure HDA function is suspended before ASIC reset
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amdgpu: don't set s3 and s0ix at the same time
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amdgpu: explicitly check for s0ix when evicting resources
Nirmoy Das <nirmoy.das(a)amd.com>
drm/amdgpu: unify BO evicting method in amdgpu_ttm
Filipe Manana <fdmanana(a)suse.com>
btrfs: always log symlinks in full mode
Qu Wenruo <wqu(a)suse.com>
btrfs: force v2 space cache usage for subpage mount
Sergey Shtylyov <s.shtylyov(a)omp.ru>
smsc911x: allow using IRQ0
Vladimir Oltean <vladimir.oltean(a)nxp.com>
selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Fix unnecessary dropping of RX packets
Somnath Kotur <somnath.kotur(a)broadcom.com>
bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flag
Ido Schimmel <idosch(a)nvidia.com>
selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operational
David Howells <dhowells(a)redhat.com>
rxrpc: Enable IPv6 checksums on transport socket
Eric Dumazet <edumazet(a)google.com>
mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()
Qiao Ma <mqaio(a)linux.alibaba.com>
hinic: fix bug of wq out of bound access
Filipe Manana <fdmanana(a)suse.com>
btrfs: do not BUG_ON() on failure to update inode when setting xattr
Kuogee Hsieh <quic_khsieh(a)quicinc.com>
drm/msm/dp: remove fail safe mode related code
Marc Kleine-Budde <mkl(a)pengutronix.de>
selftests/net: so_txtime: usage(): fix documentation of default clock
Marc Kleine-Budde <mkl(a)pengutronix.de>
selftests/net: so_txtime: fix parsing of start time stamp on 32 bit systems
Shravya Kumbham <shravya.kumbham(a)xilinx.com>
net: emaclite: Add error handling for of_address_to_resource()
Eric Dumazet <edumazet(a)google.com>
net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()
Yang Yingliang <yangyingliang(a)huawei.com>
net: cpsw: add missing of_node_put() in cpsw_probe_dt()
Niels Dossche <dossche.niels(a)gmail.com>
net: mdio: Fix ENOMEM return value in BCM6368 mux bus controller
Yang Yingliang <yangyingliang(a)huawei.com>
net: stmmac: dwmac-sun8i: add missing of_node_put() in sun8i_dwmac_register_mdio_mux()
Yang Yingliang <yangyingliang(a)huawei.com>
net: dsa: mt7530: add missing of_node_put() in mt7530_setup()
Yang Yingliang <yangyingliang(a)huawei.com>
net: ethernet: mediatek: add missing of_node_put() in mtk_sgmii_init()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Don't invalidate inode attributes on delegation return
Mustafa Ismail <mustafa.ismail(a)intel.com>
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
Shiraz Saleem <shiraz.saleem(a)intel.com>
RDMA/irdma: Reduce iWARP QP destroy time
Tatyana Nikolova <tatyana.e.nikolova(a)intel.com>
RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
Cheng Xu <chengyou(a)linux.alibaba.com>
RDMA/siw: Fix a condition race issue in MPA request processing
Olga Kornievskaia <kolga(a)netapp.com>
SUNRPC release the transport of a relocated task with an assigned transport
Jann Horn <jannh(a)google.com>
selftests/seccomp: Don't call read() on TTY from background pgrp
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Fix deadlock in sync reset flow
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Avoid double clear or set of sync reset requested
Mark Zhang <markzhang(a)nvidia.com>
net/mlx5e: Fix the calling of update_buffer_lossy() API
Paul Blakey <paulb(a)nvidia.com>
net/mlx5e: CT: Fix queued up restore put() executing after relevant ft release
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Don't match double-vlan packets if cvlan is not set
Moshe Tal <moshet(a)nvidia.com>
net/mlx5e: Fix trust state reset in reload
Yang Yingliang <yangyingliang(a)huawei.com>
iommu/dart: check return value after calling platform_get_resource()
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Drop stop marker messages
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: soc-ops: fix error handling
Codrin Ciubotariu <codrin.ciubotariu(a)microchip.com>
ASoC: dmaengine: Restore NULL prepare_slave_config() callback
Adam Wujek <dev_public(a)wujek.eu>
hwmon: (pmbus) disable PEC if not enabled
Armin Wolf <W_Armin(a)gmx.de>
hwmon: (adt7470) Fix warning on module removal
Puyou Lu <puyou.lu(a)gmail.com>
gpio: pca953x: fix irq_stat not updated when irq is disabled (irq_mask not set)
Nobuhiro Iwamatsu <nobuhiro1.iwamatsu(a)toshiba.co.jp>
gpio: visconti: Fix fwnode of GPIO IRQ
Duoming Zhou <duoming(a)zju.edu.cn>
NFC: netlink: fix sleep in atomic bug when firmware download timeout
Duoming Zhou <duoming(a)zju.edu.cn>
nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs
Duoming Zhou <duoming(a)zju.edu.cn>
nfc: replace improper check device_is_registered() in netlink related functions
Andreas Larsson <andreas(a)gaisler.com>
can: grcan: only use the NAPI poll budget for RX
Andreas Larsson <andreas(a)gaisler.com>
can: grcan: grcan_probe(): fix broken system id check for errata workaround needs
Daniel Hellstrom <daniel(a)gaisler.com>
can: grcan: use ofdev->dev when allocating DMA memory
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: remove re-binding of bound socket
Duoming Zhou <duoming(a)zju.edu.cn>
can: grcan: grcan_close(): fix deadlock
Jan Höppner <hoeppner(a)linux.ibm.com>
s390/dasd: Fix read inconsistency for ESE DASD devices
Jan Höppner <hoeppner(a)linux.ibm.com>
s390/dasd: Fix read for ESE with blksize < 4k
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: prevent double format of tracks for ESE devices
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: fix data corruption for ESE devices
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for AUI CODEC mux
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for G12A tohdmi mux
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for AUI ACODEC mux
Mark Brown <broonie(a)kernel.org>
ASoC: wm8958: Fix change notifications for DSP controls
Mark Brown <broonie(a)kernel.org>
ASoC: da7219: Fix change notifications for tone generator frequency
Thomas Pfaff <tpfaff(a)pcs.com>
genirq: Synchronize interrupt thread startup
Tan Tee Min <tee.min.tan(a)linux.intel.com>
net: stmmac: disable Split Header (SPH) for Intel platforms
Niels Dossche <dossche.niels(a)gmail.com>
firewire: core: extend card->lock in fw_core_handle_bus_reset
Jakob Koschel <jakobkoschel(a)gmail.com>
firewire: remove check of list iterator against head past the loop body
Chengfeng Ye <cyeaa(a)connect.ust.hk>
firewire: fix potential uaf in outbound_phy_packet_callback()
Kurt Kanzenbach <kurt(a)linutronix.de>
timekeeping: Mark NMI safe time accessors as notrace
Trond Myklebust <trond.myklebust(a)hammerspace.com>
Revert "SUNRPC: attempt AF_LOCAL connect on setup"
Nick Kossifidis <mick(a)ics.forth.gr>
RISC-V: relocate DTB if it's outside memory region
Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
drm/amdgpu: do not use passthrough mode in Xen dom0
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
Nicolin Chen <nicolinc(a)nvidia.com>
iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
David Stevens <stevensd(a)chromium.org>
iommu/vt-d: Calculate mask for non-aligned flushes
Kyle Huey <me(a)kylehuey.com>
KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id
Thomas Gleixner <tglx(a)linutronix.de>
x86/fpu: Prevent FPU state corruption
Andrei Lalaev <andrei.lalaev(a)emlid.com>
gpiolib: of: fix bounds check for 'gpio-reserved-ranges'
Brian Norris <briannorris(a)chromium.org>
mmc: core: Set HS clock speed before sending HS CMD13
Samuel Holland <samuel(a)sholland.org>
mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits
Shaik Sajida Bhanu <quic_c_sbhanu(a)quicinc.com>
mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
ALSA: fireworks: fix wrong return count shorter than expected by 4 bytes
Zihao Wang <wzhd(a)ustc.edu>
ALSA: hda/realtek: Add quirk for Yoga Duet 7 13ITL6 speakers
Helge Deller <deller(a)gmx.de>
parisc: Merge model and model name into one line in /proc/cpuinfo
Maciej W. Rozycki <macro(a)orcam.me.uk>
MIPS: Fix CP0 counter erratum detection for R4k CPUs
-------------
Diffstat:
Makefile | 4 +-
arch/mips/include/asm/timex.h | 8 +-
arch/mips/kernel/time.c | 11 +-
arch/parisc/kernel/processor.c | 3 +-
arch/parisc/kernel/setup.c | 2 +
arch/parisc/kernel/time.c | 6 +-
arch/riscv/mm/init.c | 21 +-
arch/x86/kernel/fpu/core.c | 67 ++--
arch/x86/kernel/kvm.c | 13 +
arch/x86/kvm/cpuid.c | 5 +
arch/x86/kvm/lapic.c | 10 +-
arch/x86/kvm/mmu/mmu.c | 2 +
arch/x86/kvm/svm/pmu.c | 28 +-
drivers/firewire/core-card.c | 3 +
drivers/firewire/core-cdev.c | 4 +-
drivers/firewire/core-topology.c | 9 +-
drivers/firewire/core-transaction.c | 30 +-
drivers/firewire/sbp2.c | 13 +-
drivers/gpio/gpio-mvebu.c | 7 -
drivers/gpio/gpio-pca953x.c | 4 +-
drivers/gpio/gpio-visconti.c | 7 +-
drivers/gpio/gpiolib-of.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 30 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 24 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 23 --
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 30 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +-
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
drivers/gpu/drm/msm/dp/dp_display.c | 6 -
drivers/gpu/drm/msm/dp/dp_panel.c | 11 -
drivers/gpu/drm/msm/dp/dp_panel.h | 1 -
drivers/hwmon/adt7470.c | 4 +-
drivers/hwmon/pmbus/pmbus_core.c | 3 +
drivers/infiniband/hw/irdma/cm.c | 26 +-
drivers/infiniband/hw/irdma/utils.c | 21 +-
drivers/infiniband/hw/irdma/verbs.c | 4 +-
drivers/infiniband/sw/siw/siw_cm.c | 7 +-
drivers/iommu/apple-dart.c | 10 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 9 +-
drivers/iommu/intel/iommu.c | 27 +-
drivers/iommu/intel/svm.c | 4 +
drivers/mmc/core/mmc.c | 23 +-
drivers/mmc/host/rtsx_pci_sdmmc.c | 29 +-
drivers/mmc/host/sdhci-msm.c | 42 ++
drivers/mmc/host/sunxi-mmc.c | 5 +-
drivers/net/can/grcan.c | 46 +--
drivers/net/dsa/mt7530.c | 1 +
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 +-
drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c | 7 +-
drivers/net/ethernet/mediatek/mtk_sgmii.c | 1 +
.../ethernet/mellanox/mlx5/core/diag/rsc_dump.c | 31 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 4 +
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 10 +
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 11 +
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 60 +--
drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c | 38 +-
drivers/net/ethernet/mellanox/mlx5/core/lag_mp.h | 7 +-
drivers/net/ethernet/smsc/smsc911x.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c | 1 +
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/ethernet/ti/cpsw_new.c | 5 +-
drivers/net/ethernet/xilinx/xilinx_emaclite.c | 15 +-
drivers/net/mdio/mdio-mux-bcm6368.c | 2 +-
drivers/nfc/nfcmrvl/main.c | 2 +-
drivers/pci/controller/pci-aardvark.c | 428 ++++++++++++++++-----
drivers/pci/pci-bridge-emul.c | 49 ++-
drivers/s390/block/dasd.c | 18 +-
drivers/s390/block/dasd_eckd.c | 28 +-
drivers/s390/block/dasd_int.h | 14 +
drivers/video/fbdev/core/fbmem.c | 5 +-
fs/btrfs/disk-io.c | 11 +
fs/btrfs/tree-log.c | 14 +-
fs/btrfs/xattr.c | 6 +-
fs/nfs/nfs4proc.c | 12 +-
include/linux/stmmac.h | 1 +
kernel/irq/internals.h | 2 +
kernel/irq/irqdesc.c | 2 +
kernel/irq/manage.c | 39 +-
kernel/rcu/tree.c | 31 +-
kernel/time/timekeeping.c | 4 +-
net/can/isotp.c | 22 +-
net/ipv4/igmp.c | 9 +-
net/ipv6/mcast.c | 8 +-
net/nfc/core.c | 29 +-
net/nfc/netlink.c | 4 +-
net/rxrpc/local_object.c | 3 +
net/sunrpc/clnt.c | 11 +-
net/sunrpc/xprtsock.c | 3 -
sound/firewire/fireworks/fireworks_hwdep.c | 1 +
sound/pci/hda/patch_realtek.c | 1 +
sound/soc/codecs/da7219.c | 14 +-
sound/soc/codecs/wm8958-dsp2.c | 8 +-
sound/soc/meson/aiu-acodec-ctrl.c | 2 +-
sound/soc/meson/aiu-codec-ctrl.c | 2 +-
sound/soc/meson/g12a-tohdmitx.c | 2 +-
sound/soc/soc-generic-dmaengine-pcm.c | 6 +-
sound/soc/soc-ops.c | 2 +-
.../drivers/net/ocelot/tc_flower_chains.sh | 2 +-
.../selftests/kvm/include/x86_64/processor.h | 15 +
tools/testing/selftests/kvm/kvm_page_table_test.c | 2 +-
tools/testing/selftests/kvm/lib/x86_64/processor.c | 192 ++++-----
.../net/forwarding/mirror_gre_bridge_1q.sh | 3 +
tools/testing/selftests/net/so_txtime.c | 4 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 10 +-
tools/testing/selftests/vm/mremap_test.c | 53 +++
110 files changed, 1293 insertions(+), 656 deletions(-)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e8e79c416aae1de224c0f1860f2e3350fa171f8 Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
Date: Thu, 17 Mar 2022 08:57:35 +0100
Subject: [PATCH] can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable(a)vger.kernel.org
Reported-by: Hangyu Hua <hbh25y(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 1a4b56f6fa8c..b3b5bc1c803b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1637,8 +1637,6 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
if (err)
goto out_fail;
- can_put_echo_skb(skb, dev, 0, 0);
-
if (cdev->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(cdev, M_CAN_CCCR);
cccr &= ~CCCR_CMR_MASK;
@@ -1655,6 +1653,9 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
m_can_write(cdev, M_CAN_CCCR, cccr);
}
m_can_write(cdev, M_CAN_TXBTIE, 0x1);
+
+ can_put_echo_skb(skb, dev, 0, 0);
+
m_can_write(cdev, M_CAN_TXBAR, 0x1);
/* End of xmit function for version 3.0.x */
} else {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e8e79c416aae1de224c0f1860f2e3350fa171f8 Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
Date: Thu, 17 Mar 2022 08:57:35 +0100
Subject: [PATCH] can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable(a)vger.kernel.org
Reported-by: Hangyu Hua <hbh25y(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 1a4b56f6fa8c..b3b5bc1c803b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1637,8 +1637,6 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
if (err)
goto out_fail;
- can_put_echo_skb(skb, dev, 0, 0);
-
if (cdev->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(cdev, M_CAN_CCCR);
cccr &= ~CCCR_CMR_MASK;
@@ -1655,6 +1653,9 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
m_can_write(cdev, M_CAN_CCCR, cccr);
}
m_can_write(cdev, M_CAN_TXBTIE, 0x1);
+
+ can_put_echo_skb(skb, dev, 0, 0);
+
m_can_write(cdev, M_CAN_TXBAR, 0x1);
/* End of xmit function for version 3.0.x */
} else {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e53ac7374e64dede04d745ff0e70ff5048378d1f Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel(a)surriel.com>
Date: Tue, 22 Mar 2022 14:44:09 -0700
Subject: [PATCH] mm: invalidate hwpoison page cache page in fault path
Sometimes the page offlining code can leave behind a hwpoisoned clean
page cache page. This can lead to programs being killed over and over
and over again as they fault in the hwpoisoned page, get killed, and
then get re-spawned by whatever wanted to run them.
This is particularly embarrassing when the page was offlined due to
having too many corrected memory errors. Now we are killing tasks due
to them trying to access memory that probably isn't even corrupted.
This problem can be avoided by invalidating the page from the page fault
handler, which already has a branch for dealing with these kinds of
pages. With this patch we simply pretend the page fault was successful
if the page was invalidated, return to userspace, incur another page
fault, read in the file from disk (to a new memory page), and then
everything works again.
Link: https://lkml.kernel.org/r/20220212213740.423efcea@imladris.surriel.com
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c96281458c83..1a55b4c5b5db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3877,11 +3877,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
return ret;
if (unlikely(PageHWPoison(vmf->page))) {
- if (ret & VM_FAULT_LOCKED)
+ vm_fault_t poisonret = VM_FAULT_HWPOISON;
+ if (ret & VM_FAULT_LOCKED) {
+ /* Retry if a clean page was removed from the cache. */
+ if (invalidate_inode_page(vmf->page))
+ poisonret = 0;
unlock_page(vmf->page);
+ }
put_page(vmf->page);
vmf->page = NULL;
- return VM_FAULT_HWPOISON;
+ return poisonret;
}
if (unlikely(!(ret & VM_FAULT_LOCKED)))
From: Casey Schaufler <casey(a)schaufler-ca.com>
[ Upstream commit ecff30575b5ad0eda149aadad247b7f75411fd47 ]
The usual LSM hook "bail on fail" scheme doesn't work for cases where
a security module may return an error code indicating that it does not
recognize an input. In this particular case Smack sees a mount option
that it recognizes, and returns 0. A call to a BPF hook follows, which
returns -ENOPARAM, which confuses the caller because Smack has processed
its data.
The SELinux hook incorrectly returns 1 on success. There was a time
when this was correct, however the current expectation is that it
return 0 on success. This is repaired.
Reported-by: syzbot+d1e3b1d92d25abf97943(a)syzkaller.appspotmail.com
Signed-off-by: Casey Schaufler <casey(a)schaufler-ca.com>
Acked-by: James Morris <jamorris(a)linux.microsoft.com>
Signed-off-by: Paul Moore <paul(a)paul-moore.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
security/security.c | 17 +++++++++++++++--
security/selinux/hooks.c | 5 ++---
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/security/security.c b/security/security.c
index 22261d79f333..f101a53a63ed 100644
--- a/security/security.c
+++ b/security/security.c
@@ -884,9 +884,22 @@ int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc)
return call_int_hook(fs_context_dup, 0, fc, src_fc);
}
-int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param)
+int security_fs_context_parse_param(struct fs_context *fc,
+ struct fs_parameter *param)
{
- return call_int_hook(fs_context_parse_param, -ENOPARAM, fc, param);
+ struct security_hook_list *hp;
+ int trc;
+ int rc = -ENOPARAM;
+
+ hlist_for_each_entry(hp, &security_hook_heads.fs_context_parse_param,
+ list) {
+ trc = hp->hook.fs_context_parse_param(fc, param);
+ if (trc == 0)
+ rc = 0;
+ else if (trc != -ENOPARAM)
+ return trc;
+ }
+ return rc;
}
int security_sb_alloc(struct super_block *sb)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 5b6895e4fc29..371f67a37f9a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2860,10 +2860,9 @@ static int selinux_fs_context_parse_param(struct fs_context *fc,
return opt;
rc = selinux_add_opt(opt, param->string, &fc->security);
- if (!rc) {
+ if (!rc)
param->string = NULL;
- rc = 1;
- }
+
return rc;
}
--
2.34.1
The quilt patch titled
Subject: fs: sendfile handles O_NONBLOCK of out_fd
has been removed from the -mm tree. Its filename was
fs-sendfile-handles-o_nonblock-of-out_fd.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch\nof git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Andrei Vagin <avagin(a)gmail.com>
Subject: fs: sendfile handles O_NONBLOCK of out_fd
sendfile has to return EAGAIN if out_fd is nonblocking and the write into
it would block.
Here is a small reproducer for the problem:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/sendfile.h>
#define FILE_SIZE (1UL << 30)
int main(int argc, char **argv) {
int p[2], fd;
if (pipe2(p, O_NONBLOCK))
return 1;
fd = open(argv[1], O_RDWR | O_TMPFILE, 0666);
if (fd < 0)
return 1;
ftruncate(fd, FILE_SIZE);
if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) {
fprintf(stderr, "FAIL\n");
}
if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) {
fprintf(stderr, "FAIL\n");
}
return 0;
}
It worked before b964bf53e540, it is stuck after b964bf53e540, and it
works again with this fix.
This regression occurred because do_splice_direct() calls pipe_write
that handles O_NONBLOCK. Here is a trace log from the reproducer:
1) | __x64_sys_sendfile64() {
1) | do_sendfile() {
1) | __fdget()
1) | rw_verify_area()
1) | __fdget()
1) | rw_verify_area()
1) | do_splice_direct() {
1) | rw_verify_area()
1) | splice_direct_to_actor() {
1) | do_splice_to() {
1) | rw_verify_area()
1) | generic_file_splice_read()
1) + 74.153 us | }
1) | direct_splice_actor() {
1) | iter_file_splice_write() {
1) | __kmalloc()
1) 0.148 us | pipe_lock();
1) 0.153 us | splice_from_pipe_next.part.0();
1) 0.162 us | page_cache_pipe_buf_confirm();
... 16 times
1) 0.159 us | page_cache_pipe_buf_confirm();
1) | vfs_iter_write() {
1) | do_iter_write() {
1) | rw_verify_area()
1) | do_iter_readv_writev() {
1) | pipe_write() {
1) | mutex_lock()
1) 0.153 us | mutex_unlock();
1) 1.368 us | }
1) 1.686 us | }
1) 5.798 us | }
1) 6.084 us | }
1) 0.174 us | kfree();
1) 0.152 us | pipe_unlock();
1) + 14.461 us | }
1) + 14.783 us | }
1) 0.164 us | page_cache_pipe_buf_release();
... 16 times
1) 0.161 us | page_cache_pipe_buf_release();
1) | touch_atime()
1) + 95.854 us | }
1) + 99.784 us | }
1) ! 107.393 us | }
1) ! 107.699 us | }
Link: https://lkml.kernel.org/r/20220415005015.525191-1-avagin@gmail.com
Fixes: b964bf53e540 ("teach sendfile(2) to handle send-to-pipe directly")
Signed-off-by: Andrei Vagin <avagin(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/read_write.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/read_write.c~fs-sendfile-handles-o_nonblock-of-out_fd
+++ a/fs/read_write.c
@@ -1247,6 +1247,9 @@ static ssize_t do_sendfile(int out_fd, i
count, fl);
file_end_write(out.file);
} else {
+ if (out.file->f_flags & O_NONBLOCK)
+ fl |= SPLICE_F_NONBLOCK;
+
retval = splice_file_to_pipe(in.file, opipe, &pos, count, fl);
}
_
Patches currently in -mm which might be from avagin(a)gmail.com are
This is the start of the stable review cycle for the 5.15.37 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 01 May 2022 10:40:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.37-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.37-rc1
Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
selftests/bpf: Add test for reg2btf_ids out of bounds access
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: gup: make fault_in_safe_writeable() use fixup_user_fault()
Filipe Manana <fdmanana(a)suse.com>
btrfs: fallback to blocking mode when doing async dio over multiple extents
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix deadlock due to page faults during direct IO reads and writes
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Fix mmap + page fault deadlocks for direct I/O
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Introduce nofault flag to disable page faults
Andreas Gruenbacher <agruenba(a)redhat.com>
gup: Introduce FOLL_NOFAULT flag to disable page faults
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Add done_before argument to iomap_dio_rw
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Support partial direct I/O on user copy failures
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Fix iomap_dio_rw return value for user copies
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Fix mmap + page fault deadlocks for buffered I/O
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Eliminate ip->i_gh
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Move the inode glock locking to gfs2_file_buffered_write
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Introduce flag for glock holder auto-demotion
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Clean up function may_grant
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Add wrapper for iomap_file_buffered_write
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Introduce fault_in_iov_iter_writeable
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable
Andreas Gruenbacher <agruenba(a)redhat.com>
gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable}
Muchun Song <songmuchun(a)bytedance.com>
mm: kfence: fix objcgs vector allocation
Dinh Nguyen <dinguyen(a)kernel.org>
ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
Dinh Nguyen <dinguyen(a)kernel.org>
spi: cadence-quadspi: fix write completion support
Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
bpf: Fix crash due to out of bounds access into reg2btf_ids.
Hao Luo <haoluo(a)google.com>
bpf/selftests: Test PTR_TO_RDONLY_MEM
Hao Luo <haoluo(a)google.com>
bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
Hao Luo <haoluo(a)google.com>
bpf: Make per_cpu_ptr return rdonly PTR_TO_MEM.
Hao Luo <haoluo(a)google.com>
bpf: Convert PTR_TO_MEM_OR_NULL to composable types.
Hao Luo <haoluo(a)google.com>
bpf: Introduce MEM_RDONLY flag
Hao Luo <haoluo(a)google.com>
bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Replace RET_XXX_OR_NULL with RET_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Replace ARG_XXX_OR_NULL with ARG_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Introduce composable reg, ret and arg types.
Willy Tarreau <w(a)1wt.eu>
floppy: disable FDRAWCMD by default
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/socfpga.dtsi | 2 +-
arch/arm/boot/dts/socfpga_arria10.dtsi | 2 +-
arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi | 2 +-
arch/arm64/boot/dts/intel/socfpga_agilex.dtsi | 2 +-
arch/powerpc/kernel/kvm.c | 3 +-
arch/powerpc/kernel/signal_32.c | 4 +-
arch/powerpc/kernel/signal_64.c | 2 +-
arch/x86/kernel/fpu/signal.c | 7 +-
drivers/block/Kconfig | 16 +
drivers/block/floppy.c | 43 +-
drivers/gpu/drm/armada/armada_gem.c | 7 +-
drivers/spi/spi-cadence-quadspi.c | 24 +-
fs/btrfs/file.c | 142 +++++-
fs/btrfs/inode.c | 28 ++
fs/btrfs/ioctl.c | 5 +-
fs/erofs/data.c | 2 +-
fs/ext4/file.c | 5 +-
fs/f2fs/file.c | 2 +-
fs/fuse/file.c | 2 +-
fs/gfs2/bmap.c | 60 +--
fs/gfs2/file.c | 252 ++++++++++-
fs/gfs2/glock.c | 330 ++++++++++----
fs/gfs2/glock.h | 20 +
fs/gfs2/incore.h | 4 +-
fs/iomap/buffered-io.c | 2 +-
fs/iomap/direct-io.c | 29 +-
fs/ntfs/file.c | 2 +-
fs/ntfs3/file.c | 2 +-
fs/xfs/xfs_file.c | 6 +-
fs/zonefs/super.c | 4 +-
include/linux/bpf.h | 101 ++++-
include/linux/bpf_verifier.h | 18 +
include/linux/iomap.h | 11 +-
include/linux/mm.h | 3 +-
include/linux/pagemap.h | 58 +--
include/linux/uio.h | 4 +-
kernel/bpf/btf.c | 16 +-
kernel/bpf/cgroup.c | 2 +-
kernel/bpf/helpers.c | 12 +-
kernel/bpf/map_iter.c | 4 +-
kernel/bpf/ringbuf.c | 2 +-
kernel/bpf/syscall.c | 2 +-
kernel/bpf/verifier.c | 488 ++++++++++-----------
kernel/trace/bpf_trace.c | 22 +-
lib/iov_iter.c | 98 ++++-
mm/filemap.c | 4 +-
mm/gup.c | 120 ++++-
mm/kfence/core.c | 11 +-
mm/kfence/kfence.h | 3 +
net/core/bpf_sk_storage.c | 2 +-
net/core/filter.c | 64 +--
net/core/sock_map.c | 2 +-
tools/testing/selftests/bpf/prog_tests/ksyms_btf.c | 14 +
.../bpf/progs/test_ksyms_btf_write_check.c | 29 ++
tools/testing/selftests/bpf/verifier/calls.c | 19 +
56 files changed, 1472 insertions(+), 652 deletions(-)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 024a7ad9eb4df626ca8c77fef4f67fd0ebd559d2 Mon Sep 17 00:00:00 2001
From: Andy Chi <andy.chi(a)canonical.com>
Date: Fri, 13 May 2022 20:16:45 +0800
Subject: [PATCH] ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for
HP machine
The HP EliteBook 630 is using ALC236 codec which used 0x02 to control mute LED
and 0x01 to control micmute LED. Therefore, add a quirk to make it works.
Signed-off-by: Andy Chi <andy.chi(a)canonical.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20220513121648.28584-1-andy.chi@canonical.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index ce2cb1986677..ad292df7d805 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9091,6 +9091,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x8995, "HP EliteBook 855 G9", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x103c, 0x89a4, "HP ProBook 440 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89a6, "HP ProBook 450 G9", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89aa, "HP EliteBook 630 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ac, "HP EliteBook 640 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ae, "HP EliteBook 650 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89c3, "Zbook Studio G9", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 024a7ad9eb4df626ca8c77fef4f67fd0ebd559d2 Mon Sep 17 00:00:00 2001
From: Andy Chi <andy.chi(a)canonical.com>
Date: Fri, 13 May 2022 20:16:45 +0800
Subject: [PATCH] ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for
HP machine
The HP EliteBook 630 is using ALC236 codec which used 0x02 to control mute LED
and 0x01 to control micmute LED. Therefore, add a quirk to make it works.
Signed-off-by: Andy Chi <andy.chi(a)canonical.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20220513121648.28584-1-andy.chi@canonical.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index ce2cb1986677..ad292df7d805 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9091,6 +9091,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x8995, "HP EliteBook 855 G9", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x103c, 0x89a4, "HP ProBook 440 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89a6, "HP ProBook 450 G9", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89aa, "HP EliteBook 630 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ac, "HP EliteBook 640 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ae, "HP EliteBook 650 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89c3, "Zbook Studio G9", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 024a7ad9eb4df626ca8c77fef4f67fd0ebd559d2 Mon Sep 17 00:00:00 2001
From: Andy Chi <andy.chi(a)canonical.com>
Date: Fri, 13 May 2022 20:16:45 +0800
Subject: [PATCH] ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for
HP machine
The HP EliteBook 630 is using ALC236 codec which used 0x02 to control mute LED
and 0x01 to control micmute LED. Therefore, add a quirk to make it works.
Signed-off-by: Andy Chi <andy.chi(a)canonical.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20220513121648.28584-1-andy.chi@canonical.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index ce2cb1986677..ad292df7d805 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9091,6 +9091,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x8995, "HP EliteBook 855 G9", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x103c, 0x89a4, "HP ProBook 440 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89a6, "HP ProBook 450 G9", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89aa, "HP EliteBook 630 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ac, "HP EliteBook 640 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89ae, "HP EliteBook 650 G9", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x89c3, "Zbook Studio G9", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
From: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
There is a small chance that get_kretprobe(ri) returns NULL in
kretprobe_dispatcher() when another CPU unregisters the kretprobe
right after __kretprobe_trampoline_handler().
To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
return value again. And if it is NULL, it returns soon because that
kretprobe is under unregistering process.
This issue has been introduced when the kretprobe is decoupled
from the struct kretprobe_instance by commit d741bf41d7c7
("kprobes: Remove kretprobe hash"). Before that commit, the
struct kretprob_instance::rp directly points the kretprobe
and it is never be NULL.
Reported-by: Yonghong Song <yhs(a)fb.com>
Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
Cc: stable(a)vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
---
kernel/trace/trace_kprobe.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 93507330462c..a245ea673715 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1718,8 +1718,17 @@ static int
kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
struct kretprobe *rp = get_kretprobe(ri);
- struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
+ struct trace_kprobe *tk;
+
+ /*
+ * There is a small chance that get_kretprobe(ri) returns NULL when
+ * the kretprobe is unregister on another CPU between kretprobe's
+ * trampoline_handler and this function.
+ */
+ if (unlikely(!rp))
+ return 0;
+ tk = container_of(rp, struct trace_kprobe, rp);
raw_cpu_inc(*tk->nhit);
if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
Currently, a problem faced by arm64 is if a kernel image is signed by a
MOK key, loading it via the kexec_file_load() system call would be
rejected with the error "Lockdown: kexec: kexec of unsigned images is
restricted; see man kernel_lockdown.7".
This happens because image_verify_sig uses only the primary keyring that
contains only kernel built-in keys to verify the kexec image.
This patch allows to verify arm64 kernel image signature using not only
.builtin_trusted_keys but also .platform and .secondary_trusted_keys
keyring.
Fixes: 732b7b93d849 ("arm64: kexec_file: add kernel signature verification support")
Cc: stable(a)vger.kernel.org # 34d5960af253: kexec: clean up arch_kexec_kernel_verify_sig
Cc: stable(a)vger.kernel.org # 83b7bb2d49ae: kexec, KEYS: make the code in bzImage64_verify_sig generic
Acked-by: Baoquan He <bhe(a)redhat.com>
Cc: kexec(a)lists.infradead.org
Cc: keyrings(a)vger.kernel.org
Cc: linux-security-module(a)vger.kernel.org
Co-developed-by: Michal Suchanek <msuchanek(a)suse.de>
Signed-off-by: Michal Suchanek <msuchanek(a)suse.de>
Acked-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Coiby Xu <coxu(a)redhat.com>
---
arch/arm64/kernel/kexec_image.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index 9ec34690e255..5ed6a585f21f 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -14,7 +14,6 @@
#include <linux/kexec.h>
#include <linux/pe.h>
#include <linux/string.h>
-#include <linux/verification.h>
#include <asm/byteorder.h>
#include <asm/cpufeature.h>
#include <asm/image.h>
@@ -130,18 +129,10 @@ static void *image_load(struct kimage *image,
return NULL;
}
-#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
-static int image_verify_sig(const char *kernel, unsigned long kernel_len)
-{
- return verify_pefile_signature(kernel, kernel_len, NULL,
- VERIFYING_KEXEC_PE_SIGNATURE);
-}
-#endif
-
const struct kexec_file_ops kexec_image_ops = {
.probe = image_probe,
.load = image_load,
#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
- .verify_sig = image_verify_sig,
+ .verify_sig = kexec_kernel_verify_pe_sig,
#endif
};
--
2.35.3
Hi Greg, Sasha,
I think we're finally at a good point to begin backporting the work I've
done on random.c during the last 6 months. I've been maintaining
branches for this incrementally as code has been merged into mainline,
in order to make this moment easier than otherwise.
Assuming that Linus merges my PR for 5.19 [1] today, all of these
patches are (or will be in a few hours) in Linus' tree. I've tried to
backport most of the general scaffolding and design of the current state
of random.c, while not backporting any new features or unusual
functionality changes that might invite trouble. So, for example, the
backports switch to using a cryptographic hash function, but they don't
have changes like warning when the cycle counter is zero, attempting to
use jitter on early uses of /dev/urandom, reseeding on suspend/hibernate
notifications, or the vmgenid driver. Hopefully that strikes an okay
balance between getting the core backported so that fixes are
backportable, but not going too far by backporting new "nice to have"
but unessential features.
In this git repo [2], there are three branches: linux-5.15.y,
linux-5.17.y, and linux-5.18.y, which contain backports for everything
up to and including [1].
You'll probably want to backport this to earlier kernels as well. Given
that there hasn't been overly much work on the rng in the last few
years, it shouldn't be too hard to take my 5.15.y branch and fill in the
missing pieces there to bring it back. Given how much changes, you could
probably just take every random.c change for backporting to before 5.15.
There is one snag, which is that some of the work I did during the 5.17
cycle depends on crypto I added for WireGuard, which landed in 5.6. So
for 5.4 and prior, that will need backports. Fortunately, I've already
done this in [3], in the branch backport-5.4.y, which I've kept up to
date for a few years now. This occasion might mark the perfect excuse
we've been waiting for to just backport WireGuard too to 5.4 (which
might make the Android work a bit easier also) :-D.
Let me know if you have any questions, and feel free to poke me on IRC.
And if all of the above sounds terrible to you, and you'd rather just
not take any of this into stable, I guess that's a valid path to take
too.
Regards,
Jason
[1] https://lore.kernel.org/lkml/20220522214457.37108-1-Jason@zx2c4.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/wireguard-linux.git/
This is used by code that doesn't need CONFIG_CRYPTO, so move this into
lib/ with a Kconfig option so that it can be selected by whatever needs
it.
This fixes a linker error Zheng pointed out when
CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m:
lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq'
Reported-by: Zheng Bin <zhengbin13(a)huawei.com>
Cc: Eric Biggers <ebiggers(a)kernel.org>
Cc: stable(a)vger.kernel.org
Fixes: aa127963f1ca ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
I'm traveling over the next week, and there are a few ways to skin this
cat, so if somebody here sees issue, feel free to pick this v1 up and
fashion a v2 out of it.
crypto/Kconfig | 1 +
crypto/Makefile | 2 +-
lib/Kconfig | 3 +++
lib/Makefile | 1 +
lib/crypto/Kconfig | 1 +
{crypto => lib}/memneq.c | 0
6 files changed, 7 insertions(+), 1 deletion(-)
rename {crypto => lib}/memneq.c (100%)
diff --git a/crypto/Kconfig b/crypto/Kconfig
index f567271ed10d..38601a072b99 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -15,6 +15,7 @@ source "crypto/async_tx/Kconfig"
#
menuconfig CRYPTO
tristate "Cryptographic API"
+ select LIB_MEMNEQ
help
This option provides the core Cryptographic API.
diff --git a/crypto/Makefile b/crypto/Makefile
index 40d4c2690a49..dbfa53567c92 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -4,7 +4,7 @@
#
obj-$(CONFIG_CRYPTO) += crypto.o
-crypto-y := api.o cipher.o compress.o memneq.o
+crypto-y := api.o cipher.o compress.o
obj-$(CONFIG_CRYPTO_ENGINE) += crypto_engine.o
obj-$(CONFIG_CRYPTO_FIPS) += fips.o
diff --git a/lib/Kconfig b/lib/Kconfig
index 6a843639814f..eaaad4d85bf2 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -120,6 +120,9 @@ config INDIRECT_IOMEM_FALLBACK
source "lib/crypto/Kconfig"
+config LIB_MEMNEQ
+ bool
+
config CRC_CCITT
tristate "CRC-CCITT functions"
help
diff --git a/lib/Makefile b/lib/Makefile
index 89fcae891361..f01023cda508 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -251,6 +251,7 @@ obj-$(CONFIG_DIMLIB) += dim/
obj-$(CONFIG_SIGNATURE) += digsig.o
lib-$(CONFIG_CLZ_TAB) += clz_tab.o
+lib-$(CONFIG_LIB_MEMNEQ) += memneq.o
obj-$(CONFIG_GENERIC_STRNCPY_FROM_USER) += strncpy_from_user.o
obj-$(CONFIG_GENERIC_STRNLEN_USER) += strnlen_user.o
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 7ee13c08c970..337d6852643a 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -71,6 +71,7 @@ config CRYPTO_LIB_CURVE25519
tristate "Curve25519 scalar multiplication library"
depends on CRYPTO_ARCH_HAVE_LIB_CURVE25519 || !CRYPTO_ARCH_HAVE_LIB_CURVE25519
select CRYPTO_LIB_CURVE25519_GENERIC if CRYPTO_ARCH_HAVE_LIB_CURVE25519=n
+ select LIB_MEMNEQ
help
Enable the Curve25519 library interface. This interface may be
fulfilled by either the generic implementation or an arch-specific
diff --git a/crypto/memneq.c b/lib/memneq.c
similarity index 100%
rename from crypto/memneq.c
rename to lib/memneq.c
--
2.35.1
For some sev ioctl interfaces, input may be passed that is less than or
equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data that PSP
firmware returns. In this case, kmalloc will allocate memory that is the
size of the input rather than the size of the data. Since PSP firmware
doesn't fully overwrite the buffer, the sev ioctl interfaces with the
issue may return uninitialized slab memory.
Currently, all of the ioctl interfaces in the ccp driver are safe, but
to prevent future problems, change all ioctl interfaces that allocate
memory with kmalloc to use kzalloc and memset the data buffer to zero
in sev_ioctl_do_platform_status.
Fixes: 38103671aad3 ("crypto: ccp: Use the stack and common buffer for status commands")
Fixes: e799035609e15 ("crypto: ccp: Implement SEV_PEK_CSR ioctl command")
Fixes: 76a2b524a4b1d ("crypto: ccp: Implement SEV_PDH_CERT_EXPORT ioctl command")
Fixes: d6112ea0cb344 ("crypto: ccp - introduce SEV_GET_ID2 command")
Cc: stable(a)vger.kernel.org
Reported-by: Andy Nguyen <theflow(a)google.com>
Suggested-by: David Rientjes <rientjes(a)google.com>
Suggested-by: Peter Gonda <pgonda(a)google.com>
Signed-off-by: John Allen <john.allen(a)amd.com>
---
v2:
- Add fixes tags and CC stable(a)vger.kernel.org
v3:
- memset data buffer to zero in sev_ioctl_do_platform_status
v4:
- Add fixes tag for sev_ioctl_do_platform_status change
---
drivers/crypto/ccp/sev-dev.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 6ab93dfd478a..da143cc3a8f5 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -551,6 +551,8 @@ static int sev_ioctl_do_platform_status(struct sev_issue_cmd *argp)
struct sev_user_data_status data;
int ret;
+ memset(&data, 0, sizeof(data));
+
ret = __sev_do_cmd_locked(SEV_CMD_PLATFORM_STATUS, &data, &argp->error);
if (ret)
return ret;
@@ -604,7 +606,7 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, bool writable)
if (input.length > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- blob = kmalloc(input.length, GFP_KERNEL);
+ blob = kzalloc(input.length, GFP_KERNEL);
if (!blob)
return -ENOMEM;
@@ -828,7 +830,7 @@ static int sev_ioctl_do_get_id2(struct sev_issue_cmd *argp)
input_address = (void __user *)input.address;
if (input.address && input.length) {
- id_blob = kmalloc(input.length, GFP_KERNEL);
+ id_blob = kzalloc(input.length, GFP_KERNEL);
if (!id_blob)
return -ENOMEM;
@@ -947,14 +949,14 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
if (input.cert_chain_len > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- pdh_blob = kmalloc(input.pdh_cert_len, GFP_KERNEL);
+ pdh_blob = kzalloc(input.pdh_cert_len, GFP_KERNEL);
if (!pdh_blob)
return -ENOMEM;
data.pdh_cert_address = __psp_pa(pdh_blob);
data.pdh_cert_len = input.pdh_cert_len;
- cert_blob = kmalloc(input.cert_chain_len, GFP_KERNEL);
+ cert_blob = kzalloc(input.cert_chain_len, GFP_KERNEL);
if (!cert_blob) {
ret = -ENOMEM;
goto e_free_pdh;
--
2.34.1
TEST_GEN_FILES contains files that are generated during compilation and are
required to be included together with the test binaries, e.g. when
performing:
make -C tools/testing/selftests install INSTALL_PATH=/some/other/path [*]
Add test_encl.elf to TEST_GEN_FILES because otherwise the installed test
binary will fail to run.
[*] https://docs.kernel.org/dev-tools/kselftest.html
Cc: stable(a)vger.kernel.org
Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v2:
Use TEST_GEN_FILES in the "all" target, instead of duplicating the path for
test_encl.elf.
---
tools/testing/selftests/sgx/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile
index 75af864e07b6..7f60811b5b20 100644
--- a/tools/testing/selftests/sgx/Makefile
+++ b/tools/testing/selftests/sgx/Makefile
@@ -17,9 +17,10 @@ ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \
-fno-stack-protector -mrdrnd $(INCLUDES)
TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx
+TEST_GEN_FILES := $(OUTPUT)/test_encl.elf
ifeq ($(CAN_BUILD_X86_64), 1)
-all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf
+all: $(TEST_CUSTOM_PROGS) $(TEST_GEN_FILES)
endif
$(OUTPUT)/test_sgx: $(OUTPUT)/main.o \
--
2.36.1
Commit 8e7102273f59 ("bcache: make bch_btree_check() to be
multithreaded") makes bch_btree_check() to be much faster when checking
all btree nodes during cache device registration. But it isn't in ideal
shap yet, still can be improved.
This patch does the following thing to improve current parallel btree
nodes check by multiple threads in bch_btree_check(),
- Add read lock to root node while checking all the btree nodes with
multiple threads. Although currently it is not mandatory but it is
good to have a read lock in code logic.
- Remove local variable 'char name[32]', and generate kernel thread name
string directly when calling kthread_run().
- Allocate local variable "struct btree_check_state check_state" on the
stack and avoid unnecessary dynamic memory allocation for it.
- Increase check_state->started to count created kernel thread after it
succeeds to create.
- When wait for all checking kernel threads to finish, use wait_event()
to replace wait_event_interruptible().
With this change, the code is more clear, and some potential error
conditions are avoided.
Fixes: 8e7102273f59 ("bcache: make bch_btree_check() to be multithreaded")
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: stable(a)vger.kernel.org
---
drivers/md/bcache/btree.c | 58 ++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 32 deletions(-)
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index ad9f16689419..2362bb8ef6d1 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2006,8 +2006,7 @@ int bch_btree_check(struct cache_set *c)
int i;
struct bkey *k = NULL;
struct btree_iter iter;
- struct btree_check_state *check_state;
- char name[32];
+ struct btree_check_state check_state;
/* check and mark root node keys */
for_each_key_filter(&c->root->keys, k, &iter, bch_ptr_invalid)
@@ -2018,63 +2017,58 @@ int bch_btree_check(struct cache_set *c)
if (c->root->level == 0)
return 0;
- check_state = kzalloc(sizeof(struct btree_check_state), GFP_KERNEL);
- if (!check_state)
- return -ENOMEM;
-
- check_state->c = c;
- check_state->total_threads = bch_btree_chkthread_nr();
- check_state->key_idx = 0;
- spin_lock_init(&check_state->idx_lock);
- atomic_set(&check_state->started, 0);
- atomic_set(&check_state->enough, 0);
- init_waitqueue_head(&check_state->wait);
+ check_state.c = c;
+ check_state.total_threads = bch_btree_chkthread_nr();
+ check_state.key_idx = 0;
+ spin_lock_init(&check_state.idx_lock);
+ atomic_set(&check_state.started, 0);
+ atomic_set(&check_state.enough, 0);
+ init_waitqueue_head(&check_state.wait);
+ rw_lock(0, c->root, c->root->level);
/*
* Run multiple threads to check btree nodes in parallel,
- * if check_state->enough is non-zero, it means current
+ * if check_state.enough is non-zero, it means current
* running check threads are enough, unncessary to create
* more.
*/
- for (i = 0; i < check_state->total_threads; i++) {
- /* fetch latest check_state->enough earlier */
+ for (i = 0; i < check_state.total_threads; i++) {
+ /* fetch latest check_state.enough earlier */
smp_mb__before_atomic();
- if (atomic_read(&check_state->enough))
+ if (atomic_read(&check_state.enough))
break;
- check_state->infos[i].result = 0;
- check_state->infos[i].state = check_state;
- snprintf(name, sizeof(name), "bch_btrchk[%u]", i);
- atomic_inc(&check_state->started);
+ check_state.infos[i].result = 0;
+ check_state.infos[i].state = &check_state;
- check_state->infos[i].thread =
+ check_state.infos[i].thread =
kthread_run(bch_btree_check_thread,
- &check_state->infos[i],
- name);
- if (IS_ERR(check_state->infos[i].thread)) {
+ &check_state.infos[i],
+ "bch_btrchk[%d]", i);
+ if (IS_ERR(check_state.infos[i].thread)) {
pr_err("fails to run thread bch_btrchk[%d]\n", i);
for (--i; i >= 0; i--)
- kthread_stop(check_state->infos[i].thread);
+ kthread_stop(check_state.infos[i].thread);
ret = -ENOMEM;
goto out;
}
+ atomic_inc(&check_state.started);
}
/*
* Must wait for all threads to stop.
*/
- wait_event_interruptible(check_state->wait,
- atomic_read(&check_state->started) == 0);
+ wait_event(check_state.wait, atomic_read(&check_state.started) == 0);
- for (i = 0; i < check_state->total_threads; i++) {
- if (check_state->infos[i].result) {
- ret = check_state->infos[i].result;
+ for (i = 0; i < check_state.total_threads; i++) {
+ if (check_state.infos[i].result) {
+ ret = check_state.infos[i].result;
goto out;
}
}
out:
- kfree(check_state);
+ rw_unlock(0, c->root);
return ret;
}
--
2.35.3
From: Fabio Estevam <festevam(a)denx.de>
Since commit 358ba762d9f1 ("crypto: caam - enable prediction resistance
in HRWNG") the following CAAM errors can be seen on i.MX6SX:
caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
hwrng: no data available
This error is due to an incorrect entropy delay for i.MX6SX.
Fix it by increasing the minimum entropy delay for i.MX6SX
as done in U-Boot:
https://patchwork.ozlabs.org/project/uboot/patch/20220415111049.2565744-1-g…
As explained in the U-Boot patch:
"RNG self tests are run to determine the correct entropy delay.
Such tests are executed with different voltages and temperatures to identify
the worst case value for the entropy delay. For i.MX6SX, it was determined
that after adding a margin value of 1000 the minimum entropy delay should be
at least 12000."
Cc: <stable(a)vger.kernel.org>
Fixes: 358ba762d9f1 ("crypto: caam - enable prediction resistance in HRWNG")
Signed-off-by: Fabio Estevam <festevam(a)denx.de>
Reviewed-by: Horia Geantă <horia.geanta(a)nxp.com>
---
Changes since v4:
- Change the function name to needs_entropy_delay_adjustment() - Vabhav
- Improve the commit log by adding the explanation from the U-Boot
patch - Vabhav
drivers/crypto/caam/ctrl.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index ca0361b2dbb0..f87aa2169e5f 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -609,6 +609,13 @@ static bool check_version(struct fsl_mc_version *mc_version, u32 major,
}
#endif
+static bool needs_entropy_delay_adjustment(void)
+{
+ if (of_machine_is_compatible("fsl,imx6sx"))
+ return true;
+ return false;
+}
+
/* Probe routine for CAAM top (controller) level */
static int caam_probe(struct platform_device *pdev)
{
@@ -855,6 +862,8 @@ static int caam_probe(struct platform_device *pdev)
* Also, if a handle was instantiated, do not change
* the TRNG parameters.
*/
+ if (needs_entropy_delay_adjustment())
+ ent_delay = 12000;
if (!(ctrlpriv->rng4_sh_init || inst_handles)) {
dev_info(dev,
"Entropy delay = %u\n",
@@ -871,6 +880,15 @@ static int caam_probe(struct platform_device *pdev)
*/
ret = instantiate_rng(dev, inst_handles,
gen_sk);
+ /*
+ * Entropy delay is determined via TRNG characterization.
+ * TRNG characterization is run across different voltages
+ * and temperatures.
+ * If worst case value for ent_dly is identified,
+ * the loop can be skipped for that platform.
+ */
+ if (needs_entropy_delay_adjustment())
+ break;
if (ret == -EAGAIN)
/*
* if here, the loop will rerun,
--
2.25.1
There is a limitation in TI DP83867 PHY device where SGMII AN is only
triggered once after the device is booted up. Even after the PHY TPI is
down and up again, SGMII AN is not triggered and hence no new in-band
message from PHY to MAC side SGMII.
This could cause an issue during power up, when PHY is up prior to MAC.
At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t
receive new in-band message from TI PHY with correct link status, speed
and duplex info.
As suggested by TI, implemented a SW solution here to retrigger SGMII
Auto-Neg whenever there is a link change.
v2: Add Fixes tag in commit message.
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit(a)intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon(a)intel.com>
Signed-off-by: Tan Tee Min <tee.min.tan(a)linux.intel.com>
---
drivers/net/phy/dp83867.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index 8561f2d4443b..13dafe7a29bd 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -137,6 +137,7 @@
#define DP83867_DOWNSHIFT_2_COUNT 2
#define DP83867_DOWNSHIFT_4_COUNT 4
#define DP83867_DOWNSHIFT_8_COUNT 8
+#define DP83867_SGMII_AUTONEG_EN BIT(7)
/* CFG3 bits */
#define DP83867_CFG3_INT_OE BIT(7)
@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
DP83867_PHYCR_FORCE_LINK_GOOD, 0);
}
+static void dp83867_link_change_notify(struct phy_device *phydev)
+{
+ /* There is a limitation in DP83867 PHY device where SGMII AN is
+ * only triggered once after the device is booted up. Even after the
+ * PHY TPI is down and up again, SGMII AN is not triggered and
+ * hence no new in-band message from PHY to MAC side SGMII.
+ * This could cause an issue during power up, when PHY is up prior
+ * to MAC. At this condition, once MAC side SGMII is up, MAC side
+ * SGMII wouldn`t receive new in-band message from TI PHY with
+ * correct link status, speed and duplex info.
+ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
+ * whenever there is a link change.
+ */
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+ int val = 0;
+
+ val = phy_clear_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ if (val < 0)
+ return;
+
+ phy_set_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ }
+}
+
static struct phy_driver dp83867_driver[] = {
{
.phy_id = DP83867_PHY_ID,
@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
.suspend = genphy_suspend,
.resume = genphy_resume,
+
+ .link_change_notify = dp83867_link_change_notify,
},
};
module_phy_driver(dp83867_driver);
--
2.25.1
From: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]
The VT-d spec requires (10.4.4 Global Command Register, TE
field) that:
Hardware implementations supporting DMA draining must drain
any in-flight DMA read/write requests queued within the
Root-Complex before completing the translation enable
command and reflecting the status of the command through
the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do
so after some kind of power state transition. As the
result, the system might stuck in iommu_disable_translati
on(), waiting for the completion of TE transition.
This adds RPLS to a quirk list for those devices and skips
TE disabling if the qurik hits.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla <ravitejax.goud.talla(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Acked-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejas…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/iommu/intel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index ab2273300346..e3f15e0cae34 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5764,7 +5764,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
ver = (dev->device >> 8) & 0xff;
if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
ver != 0x4e && ver != 0x8a && ver != 0x98 &&
- ver != 0x9a)
+ ver != 0x9a && ver != 0xa7)
return;
if (risky_device(dev))
--
2.35.1
On each vcpu load, we set the KVM_ARM64_HOST_SVE_ENABLED
flag if SVE is enabled for EL0 on the host. This is used to restore
the correct state on vpcu put.
However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SVE for userspace.
We probably never saw the issue because no VMM uses SVE, but
that's still pretty bad. Unconditionally clearing the flag
on vcpu load addresses the issue.
Fixes: 8383741ab2e7 ("KVM: arm64: Get rid of host SVE tracking/saving")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kvm/fpsimd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 441edb9c398c..3c2cfc3adc51 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -80,6 +80,7 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
vcpu->arch.flags |= KVM_ARM64_FP_HOST;
+ vcpu->arch.flags &= ~KVM_ARM64_HOST_SVE_ENABLED;
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
--
2.34.1
These 2 patches fix issues related to the promiscuous mode on VF.
Comments are welcome,
Olivier
Cc: stable(a)vger.kernel.org
Cc: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Changes since v1:
- resend with CC intel-wired-lan
- remove CC Hiroshi Shimamoto (address does not exist anymore)
Olivier Matz (2):
ixgbe: fix bcast packets Rx on VF after promisc removal
ixgbe: fix unexpected VLAN Rx in promisc mode on VF
drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--
2.30.2
From: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]
The VT-d spec requires (10.4.4 Global Command Register, TE
field) that:
Hardware implementations supporting DMA draining must drain
any in-flight DMA read/write requests queued within the
Root-Complex before completing the translation enable
command and reflecting the status of the command through
the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do
so after some kind of power state transition. As the
result, the system might stuck in iommu_disable_translati
on(), waiting for the completion of TE transition.
This adds RPLS to a quirk list for those devices and skips
TE disabling if the qurik hits.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla <ravitejax.goud.talla(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Acked-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejas…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/iommu/intel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0ea47e17b379..ba9a63cac47c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5031,7 +5031,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
ver = (dev->device >> 8) & 0xff;
if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
ver != 0x4e && ver != 0x8a && ver != 0x98 &&
- ver != 0x9a)
+ ver != 0x9a && ver != 0xa7)
return;
if (risky_device(dev))
--
2.35.1
This is the start of the stable review cycle for the 4.9.300 release.
There are 48 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 09 Feb 2022 10:37:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.300-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.300-rc1
Ritesh Harjani <riteshh(a)linux.ibm.com>
ext4: fix error handling in ext4_restore_inline_data()
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/xgene: Fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/altera: Fix deferred probing
Riwen Lu <luriwen(a)kylinos.cn>
rtc: cmos: Evaluate century appropriate
Dai Ngo <dai.ngo(a)oracle.com>
nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.
John Meneghini <jmeneghi(a)redhat.com>
scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe
Miaoqian Lin <linmq006(a)gmail.com>
ASoC: fsl: Add missing error handling in pcm030_fabric_probe
Lior Nahmanson <liorna(a)nvidia.com>
net: macsec: Verify that send_sci is on when setting Tx sci explicitly
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: Return meaningful error codes from the netlink helpers
Benjamin Gaignard <benjamin.gaignard(a)collabora.com>
spi: mediatek: Avoid NULL pointer crash in interrupt
Kamal Dasu <kdasu.kdev(a)gmail.com>
spi: bcm-qspi: check for valid cs before applying chip select
Joerg Roedel <jroedel(a)suse.de>
iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
Nick Lopez <github(a)glowingmonkey.org>
drm/nouveau: fix off by one in BIOS boundary checking
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
Eric Dumazet <edumazet(a)google.com>
af_packet: fix data-race in packet_setsockopt / packet_setsockopt
Eric Dumazet <edumazet(a)google.com>
rtnetlink: make sure to refresh master_dev/m_ops in __rtnl_newlink()
Shyam Sundar S K <Shyam-sundar.S-k(a)amd.com>
net: amd-xgbe: Fix skb data length underflow
Raju Rangoju <Raju.Rangoju(a)amd.com>
net: amd-xgbe: ensure to reset the tx_timer_active flag
Georgi Valkov <gvalkov(a)abv.bg>
ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
Florian Westphal <fw(a)strlen.de>
netfilter: nat: limit port clash resolution attempts
Florian Westphal <fw(a)strlen.de>
netfilter: nat: remove l4 protocol port rovers
Eric Dumazet <edumazet(a)google.com>
ipv4: tcp: send zero IPID in SYNACK messages
Eric Dumazet <edumazet(a)google.com>
ipv4: raw: lock the socket in raw_bind()
Guenter Roeck <linux(a)roeck-us.net>
hwmon: (lm90) Reduce maximum conversion rate for G781
Xianting Tian <xianting.tian(a)linux.alibaba.com>
drm/msm: Fix wrong size calculation
Jianguo Wu <wujianguo(a)chinatelecom.cn>
net-procfs: show net devices bound packet types
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: nfs_atomic_open() can race when looking up a non-regular file
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Handle case where the lookup of a directory fails
Eric Dumazet <edumazet(a)google.com>
ipv4: avoid using shared IP generator for connected sockets
Congyu Liu <liu3101(a)purdue.edu>
net: fix information leakage in /proc/net/ptype
Ido Schimmel <idosch(a)nvidia.com>
ipv6_tunnel: Rate limit warning messages
John Meneghini <jmeneghi(a)redhat.com>
scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put()
Christophe Leroy <christophe.leroy(a)csgroup.eu>
powerpc/32: Fix boot failure with GCC latent entropy plugin
Alan Stern <stern(a)rowland.harvard.edu>
USB: core: Fix hang in usb_kill_urb by adding memory barriers
Pavankumar Kondeti <quic_pkondeti(a)quicinc.com>
usb: gadget: f_sourcesink: Fix isoc transfer for USB_SPEED_SUPER_PLUS
Alan Stern <stern(a)rowland.harvard.edu>
usb-storage: Add unusual-devs entry for VL817 USB-SATA bridge
Cameron Williams <cang1(a)live.co.uk>
tty: Add support for Brainboxes UC cards.
daniel.starke(a)siemens.com <daniel.starke(a)siemens.com>
tty: n_gsm: fix SW flow control encoding/handling
Valentin Caron <valentin.caron(a)foss.st.com>
serial: stm32: fix software flow control transfer
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
PM: wakeup: simplify the output logic of pm_show_wakelocks()
Jan Kara <jack(a)suse.cz>
udf: Fix NULL ptr deref when converting from inline format
Jan Kara <jack(a)suse.cz>
udf: Restore i_lenAlloc when inode expansion fails
Steffen Maier <maier(a)linux.ibm.com>
scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices
Vasily Gorbik <gor(a)linux.ibm.com>
s390/hypfs: include z/VM guests with access control group set
Brian Gix <brian.gix(a)intel.com>
Bluetooth: refactor malicious adv data check
Ziyang Xuan <william.xuanziyang(a)huawei.com>
can: bcm: fix UAF of bcm op
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/Makefile | 1 +
arch/powerpc/lib/Makefile | 3 +
arch/s390/hypfs/hypfs_vm.c | 6 +-
drivers/edac/altera_edac.c | 2 +-
drivers/edac/xgene_edac.c | 2 +-
drivers/gpu/drm/msm/msm_drv.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c | 2 +-
drivers/hwmon/lm90.c | 2 +-
drivers/iommu/amd_iommu_init.c | 2 +
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 14 +++-
drivers/net/macsec.c | 9 +++
drivers/net/usb/ipheth.c | 6 +-
drivers/rtc/rtc-mc146818-lib.c | 2 +-
drivers/s390/scsi/zfcp_fc.c | 13 ++-
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 41 +++++-----
drivers/spi/spi-bcm-qspi.c | 2 +-
drivers/spi/spi-mt65xx.c | 2 +-
drivers/tty/n_gsm.c | 4 +-
drivers/tty/serial/8250/8250_pci.c | 100 +++++++++++++++++++++++-
drivers/tty/serial/stm32-usart.c | 2 +-
drivers/usb/core/hcd.c | 14 ++++
drivers/usb/core/urb.c | 12 +++
drivers/usb/gadget/function/f_sourcesink.c | 1 +
drivers/usb/storage/unusual_devs.h | 10 +++
fs/ext4/inline.c | 10 ++-
fs/nfs/dir.c | 18 +++++
fs/nfsd/nfs4state.c | 4 +-
fs/udf/inode.c | 9 +--
include/linux/netdevice.h | 1 +
include/net/ip.h | 21 +++--
include/net/netfilter/nf_nat_l4proto.h | 2 +-
kernel/power/wakelock.c | 12 +--
net/bluetooth/hci_event.c | 10 +--
net/can/bcm.c | 20 ++---
net/core/net-procfs.c | 38 ++++++++-
net/core/rtnetlink.c | 6 +-
net/ieee802154/nl802154.c | 8 +-
net/ipv4/ip_output.c | 11 ++-
net/ipv4/raw.c | 5 +-
net/ipv6/ip6_tunnel.c | 8 +-
net/netfilter/nf_nat_proto_common.c | 36 ++++++---
net/netfilter/nf_nat_proto_dccp.c | 5 +-
net/netfilter/nf_nat_proto_sctp.c | 5 +-
net/netfilter/nf_nat_proto_tcp.c | 5 +-
net/netfilter/nf_nat_proto_udp.c | 5 +-
net/netfilter/nf_nat_proto_udplite.c | 5 +-
net/packet/af_packet.c | 10 ++-
sound/soc/fsl/pcm030-audio-fabric.c | 11 ++-
sound/soc/soc-ops.c | 29 ++++++-
50 files changed, 410 insertions(+), 142 deletions(-)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcd46d897adb70d63e025f175a00a89797d31a43 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Mon, 31 Jan 2022 16:09:47 -0800
Subject: [PATCH] exec: Force single empty string when argv is empty
Quoting[1] Ariadne Conill:
"In several other operating systems, it is a hard requirement that the
second argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[2]:
The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[3],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[4]
of this bug in a shellcode, we can reconsider.
This issue is being tracked in the KSPP issue tracker[5]."
While the initial code searches[6][7] turned up what appeared to be
mostly corner case tests, trying to that just reject argv == NULL
(or an immediately terminated pointer list) quickly started tripping[8]
existing userspace programs.
The next best approach is forcing a single empty string into argv and
adjusting argc to match. The number of programs depending on argc == 0
seems a smaller set than those calling execve with a NULL argv.
Account for the additional stack space in bprm_stack_limits(). Inject an
empty string when argc == 0 (and set argc = 1). Warn about the case so
userspace has some notice about the change:
process './argc0' launched './argc0' with NULL argv: empty string added
Additionally WARN() and reject NULL argv usage for kernel threads.
[1] https://lore.kernel.org/lkml/20220127000724.15106-1-ariadne@dereferenced.or…
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[4] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[5] https://github.com/KSPP/linux/issues/176
[6] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+…
[7] https://codesearch.debian.net/search?q=execlp%3F%5Cs*%5C%28%5B%5E%2C%5D%2B%…
[8] https://lore.kernel.org/lkml/20220131144352.GE16385@xsang-OptiPlex-9020/
Reported-by: Ariadne Conill <ariadne(a)dereferenced.org>
Reported-by: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Christian Brauner <brauner(a)kernel.org>
Acked-by: Ariadne Conill <ariadne(a)dereferenced.org>
Acked-by: Andy Lutomirski <luto(a)kernel.org>
Link: https://lore.kernel.org/r/20220201000947.2453721-1-keescook@chromium.org
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..40b1008fb0f7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,14 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc = 0, make sure there is space for adding a
+ * empty string (which will bump argc to 1), to ensure confused
+ * userspace programs don't start processing from argv[1], thinking
+ * argc can never be 0, to keep them from walking envp by accident.
+ * See do_execveat_common().
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
@@ -1897,6 +1903,9 @@ static int do_execveat_common(int fd, struct filename *filename,
}
retval = count(argv, MAX_ARG_STRINGS);
+ if (retval == 0)
+ pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
+ current->comm, bprm->filename);
if (retval < 0)
goto out_free;
bprm->argc = retval;
@@ -1923,6 +1932,19 @@ static int do_execveat_common(int fd, struct filename *filename,
if (retval < 0)
goto out_free;
+ /*
+ * When argv is empty, add an empty string ("") as argv[0] to
+ * ensure confused userspace programs that start processing
+ * from argv[1] won't end up walking envp. See also
+ * bprm_stack_limits().
+ */
+ if (bprm->argc == 0) {
+ retval = copy_string_kernel("", bprm);
+ if (retval < 0)
+ goto out_free;
+ bprm->argc = 1;
+ }
+
retval = bprm_execve(bprm, fd, filename, flags);
out_free:
free_bprm(bprm);
@@ -1951,6 +1973,8 @@ int kernel_execve(const char *kernel_filename,
}
retval = count_strings_kernel(argv);
+ if (WARN_ON_ONCE(retval == 0))
+ retval = -EINVAL;
if (retval < 0)
goto out_free;
bprm->argc = retval;
Hi,
Please apply upstream commit: 64ba4b15e5c0 ("exfat: check if cluster num is valid")
to stable 5.18.y and 5.17.y
Backports for 5.15.y and 5.10.y will follow soon.
--
Thanks,
Tadeusz
Hi Greg and Shasha!
It has been a while since you heard from xfs team.
We are trying to change things and get xfs fixes flowing to stable
again. Crossing my fingers that we will make this last this time :)
Please see this message from Darrick [4] about xfs stable plans.
My team will be focusing on 5.10.y and Ted and Leah's team will be
focusing on 5.15.y at this time.
This v2 is being sent to stable after testing and after v1 was sent
for review of the xfs list [5].
v2 includes an extra patch that Christoph has backported and tested
and was going to send to stable.
Please see my cover letter to xfs with more details about my plans
for 5.10.y below:
Hi all!
During LSFMM 2022, I have had an opportunity to speak with developers
from several different companies that showed interest in collaborating
on the effort of improving the state of xfs code in LTS kernels.
I would like to kick-off this effort for the 5.10 LTS kernel, in the
hope that others will join me in the future to produce a better common
baseline for everyone to build on.
This is the first of 6 series of stable patch candidates that
I collected from xfs releases v5.11..v5.18 [1].
My intention is to post the parts for review on the xfs list on
a ~weekly basis and forward them to stable only after xfs developers
have had the chance to review the selection.
I used a gadget that I developed "b4 rn" that produces high level
"release notes" with references to the posted patch series and also
looks for mentions of fstest names in the discussions on lore.
I then used an elimination process to select the stable tree candidate
patches. The selection process is documented in the git log of [1].
After I had candidates, Luis has helped me to set up a kdevops testing
environment on a server that Samsung has contributed to the effort.
Luis and I have spent a considerable amount of time to establish the
expunge lists that produce stable baseline results for v5.10.y [2].
Eventually, we ran the auto group test over 100 times to sanitize the
baseline, on the following configurations:
reflink_normapbt (default), reflink, reflink_1024, nocrc, nocrc_512.
The patches in this part are from circa v5.11 release.
They have been through 36 auto group runs with the configs listed above
and no regressions from baseline were observed.
At least two of the fixes have regression tests in fstests that were used
to verify the fix. I also annotated [3] the fix commits in the tests.
I would like to thank Luis for his huge part in this still ongoing effort
and I would like to thank Samsung for contributing the hardware resources
to drive this effort.
Your inputs on the selection in this part and in upcoming parts [1]
are most welcome!
Thanks,
Amir.
[1] https://github.com/amir73il/b4/blob/xfs-5.10.y/xfs-5.10..5.17-fixes.rst
[2] https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expu…
[3] https://lore.kernel.org/fstests/20220520143249.2103631-1-amir73il@gmail.com/
[4] https://lore.kernel.org/linux-xfs/Yo6ePjvpC7nhgek+@magnolia/
[5] https://lore.kernel.org/linux-xfs/20220525111715.2769700-1-amir73il@gmail.c…
Changes since v1:
- Send to stable
- Add patch from Christoph
Darrick J. Wong (3):
xfs: detect overflows in bmbt records
xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks
xfs: fix an ABBA deadlock in xfs_rename
Dave Chinner (1):
xfs: Fix CIL throttle hang when CIL space used going backwards
Kaixu Xia (1):
xfs: show the proper user quota options
fs/xfs/libxfs/xfs_bmap.c | 5 +++++
fs/xfs/libxfs/xfs_dir2.h | 2 --
fs/xfs/libxfs/xfs_dir2_sf.c | 2 +-
fs/xfs/xfs_buf_item.c | 37 ++++++++++++++++----------------
fs/xfs/xfs_inode.c | 42 ++++++++++++++++++++++---------------
fs/xfs/xfs_inode_item.c | 14 +++++++++++++
fs/xfs/xfs_iwalk.c | 2 +-
fs/xfs/xfs_log_cil.c | 22 ++++++++++++++-----
fs/xfs/xfs_super.c | 10 +++++----
9 files changed, 87 insertions(+), 49 deletions(-)
--
2.25.1
We recently started building with Poky Kirkstone (quite a leap
from our ancient and venerable branch of Sumo) which includes
a newer set of tools in the toolchain:
binutils 2.30 -> 2.38
gcc 7.3.3 -> 11.2.0
glibc 2.27 -> 2.35
This uncovered some issues while cross-compiling on the 4.x
kernels. The following patches help in building the 4.19
branch again.
These backports are already applied all the way down to 5.4.
Arnaldo Carvalho de Melo (2):
perf bench: Share some global variables to fix build with gcc 10
perf tests bp_account: Make global variable static
Ben Hutchings (1):
libtraceevent: Fix build with binutils 2.35
tools/lib/traceevent/Makefile | 2 +-
tools/perf/bench/bench.h | 4 ++++
tools/perf/bench/futex-hash.c | 12 ++++++------
tools/perf/bench/futex-lock-pi.c | 11 +++++------
tools/perf/tests/bp_account.c | 2 +-
5 files changed, 17 insertions(+), 14 deletions(-)
--
2.32.0
From: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
commit 1b7b3ac8ff3317cdcf07a1c413de9bdb68019c2b upstream.
We used to set regulatory info before the registration of
the device and then the regulatory info didn't get set, because
the device isn't registered so there isn't a device to set the
regulatory info for. So set the regulatory info after the device
registration.
Call reg_process_self_managed_hints() once again after the device
registration because it does nothing before it.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Signed-off-by: Luca Coelho <luciano.coelho(a)intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210618133832.c96eadcffe80.I86799c2c866b…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
net/wireless/core.c | 7 ++++---
net/wireless/reg.c | 1 +
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 68660781aa51..7c66f99046ac 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -4,6 +4,7 @@
* Copyright 2006-2010 Johannes Berg <johannes(a)sipsolutions.net>
* Copyright 2013-2014 Intel Mobile Communications GmbH
* Copyright 2015-2017 Intel Deutschland GmbH
+ * Copyright (C) 2018-2021 Intel Corporation
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
@@ -835,9 +836,6 @@ int wiphy_register(struct wiphy *wiphy)
return res;
}
- /* set up regulatory info */
- wiphy_regulatory_register(wiphy);
-
list_add_rcu(&rdev->list, &cfg80211_rdev_list);
cfg80211_rdev_list_generation++;
@@ -851,6 +849,9 @@ int wiphy_register(struct wiphy *wiphy)
cfg80211_debugfs_rdev_add(rdev);
nl80211_notify_wiphy(rdev, NL80211_CMD_NEW_WIPHY);
+ /* set up regulatory info */
+ wiphy_regulatory_register(wiphy);
+
if (wiphy->regulatory_flags & REGULATORY_CUSTOM_REG) {
struct regulatory_request request;
diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index c7825b951f72..dd8503a3ef1e 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -3756,6 +3756,7 @@ void wiphy_regulatory_register(struct wiphy *wiphy)
wiphy_update_regulatory(wiphy, lr->initiator);
wiphy_all_share_dfs_chan_state(wiphy);
+ reg_process_self_managed_hints();
}
void wiphy_regulatory_deregister(struct wiphy *wiphy)
--
2.36.1
This is the start of the stable review cycle for the 4.19.237 release.
There are 20 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Mar 2022 15:04:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.237-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.237-rc1
Arnd Bergmann <arnd(a)arndb.de>
nds32: fix access_ok() checks in get/put_user
Linus Lüssing <ll(a)simonwunderlich.de>
mac80211: fix potential double free on mesh join
Giovanni Cabiddu <giovanni.cabiddu(a)intel.com>
crypto: qat - disable registration of algorithms
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU
Maximilian Luz <luzmaximilian(a)gmail.com>
ACPI: battery: Add device HID and quirk for Microsoft Surface Go 3
Mark Cilissen <mark(a)yotsuba.nl>
ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nf_tables: initialize registers in nft_do_chain()
Stephane Graber <stgraber(a)ubuntu.com>
drivers: net: xgene: Fix regression in CRC stripping
Giacomo Guiduzzi <guiduzzi.giacomo(a)gmail.com>
ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec
Jonathan Teh <jonathan.teh(a)outlook.com>
ALSA: cmipci: Restore aux vol on suspend/resume
Lars-Peter Clausen <lars(a)metafoo.de>
ALSA: usb-audio: Add mute TLV for playback volumes on RODE NT-USB
Takashi Iwai <tiwai(a)suse.de>
ALSA: pcm: Add stream lock during PCM reset ioctl operations
Takashi Iwai <tiwai(a)suse.de>
ALSA: oss: Fix PCM OSS buffer allocation overflow
Takashi Iwai <tiwai(a)suse.de>
ASoC: sti: Fix deadlock via snd_pcm_stop_xrun() call
Eric Dumazet <edumazet(a)google.com>
llc: fix netdevice reference leaks in llc_ui_bind()
Chuansheng Liu <chuansheng.liu(a)intel.com>
thermal: int340x: fix memory leak in int3400_notify()
Oliver Graute <oliver.graute(a)kococonnector.com>
staging: fbtft: fb_st7789v: reset display before initialization
Steffen Klassert <steffen.klassert(a)secunet.com>
esp: Fix possible buffer overflow in ESP transformation
Tadeusz Struk <tadeusz.struk(a)linaro.org>
net: ipv6: fix skb_over_panic in __ip6_append_data
Jordy Zomer <jordy(a)pwning.systems>
nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION
-------------
Diffstat:
Makefile | 4 +-
arch/nds32/include/asm/uaccess.h | 22 +++++--
arch/x86/kernel/acpi/boot.c | 24 ++++++++
drivers/acpi/battery.c | 12 ++++
drivers/acpi/video_detect.c | 75 +++++++++++++++++++++++
drivers/crypto/qat/qat_common/qat_crypto.c | 8 +++
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 12 ++--
drivers/nfc/st21nfca/se.c | 10 +++
drivers/staging/fbtft/fb_st7789v.c | 2 +
drivers/thermal/int340x_thermal/int3400_thermal.c | 4 ++
include/net/esp.h | 2 +
include/net/sock.h | 3 +
net/core/sock.c | 3 -
net/ipv4/esp4.c | 5 ++
net/ipv6/esp6.c | 5 ++
net/ipv6/ip6_output.c | 4 +-
net/llc/af_llc.c | 8 +++
net/mac80211/cfg.c | 3 -
net/netfilter/nf_tables_core.c | 2 +-
sound/core/oss/pcm_oss.c | 12 ++--
sound/core/oss/pcm_plugin.c | 5 +-
sound/core/pcm_native.c | 4 ++
sound/pci/ac97/ac97_codec.c | 4 +-
sound/pci/cmipci.c | 3 +-
sound/soc/sti/uniperif_player.c | 6 +-
sound/soc/sti/uniperif_reader.c | 2 +-
sound/usb/mixer_quirks.c | 7 ++-
27 files changed, 214 insertions(+), 37 deletions(-)
When the system runs out of enclave memory, SGX can reclaim EPC pages
by swapping to normal RAM. These backing pages are allocated via a
per-enclave shared memory area. Since SGX allows unlimited over
commit on EPC memory, the reclaimer thread can allocate a large
number of backing RAM pages in response to EPC memory pressure.
When the shared memory backing RAM allocation occurs during
the reclaimer thread context, the shared memory is charged to
the root memory control group, and the shmem usage of the enclave
is not properly accounted for, making cgroups ineffective at
limiting the amount of RAM an enclave can consume.
For example, when using a cgroup to launch a set of test
enclaves, the kernel does not properly account for 50% - 75% of
shmem page allocations on average. In the worst case, when
nearly all allocations occur during the reclaimer thread, the
kernel accounts less than a percent of the amount of shmem used
by the enclave's cgroup to the correct cgroup.
SGX stores a list of mm_structs that are associated with
an enclave. Pick one of them during reclaim and charge that
mm's memcg with the shmem allocation. The one that gets picked
is arbitrary, but this list almost always only has one mm. The
cases where there is more than one mm with different memcg's
are not worth considering.
Create a new function - sgx_encl_alloc_backing(). This function
is used whenever a new backing storage page needs to be
allocated. Previously the same function was used for page
allocation as well as retrieving a previously allocated page.
Prior to backing page allocation, if there is a mm_struct associated
with the enclave that is requesting the allocation, it is set
as the active memory control group.
Signed-off-by: Kristen Carlson Accardi <kristen(a)linux.intel.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: stable(a)vger.kernel.org
---
V2 -> V3:
Changed memcg variable names in sgx_encl_alloc_backing()
and removed some whitespace.
V1 -> V2:
Changed sgx_encl_set_active_memcg() to simply return the correct
memcg for the enclave and renamed to sgx_encl_get_mem_cgroup().
Created helper function current_is_ksgxd() to improve readability.
Use mmget_not_zero()/mmput_async() when searching mm_list.
Move call to set_active_memcg() to sgx_encl_alloc_backing() and
use mem_cgroup_put() to avoid leaking a memcg reference.
Address review feedback regarding comments and commit log.
---
arch/x86/kernel/cpu/sgx/encl.c | 105 ++++++++++++++++++++++++++++++++-
arch/x86/kernel/cpu/sgx/encl.h | 11 +++-
arch/x86/kernel/cpu/sgx/main.c | 4 +-
3 files changed, 114 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..6f05e3d919f7 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -32,7 +32,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
else
page_index = PFN_DOWN(encl->size);
- ret = sgx_encl_get_backing(encl, page_index, &b);
+ ret = sgx_encl_lookup_backing(encl, page_index, &b);
if (ret)
return ret;
@@ -574,7 +574,7 @@ static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
* 0 on success,
* -errno otherwise.
*/
-int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+static int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
struct sgx_backing *backing)
{
pgoff_t pcmd_index = PFN_DOWN(encl->size) + 1 + (page_index >> 5);
@@ -601,6 +601,107 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
return 0;
}
+/*
+ * When called from ksgxd, returns the mem_cgroup of a struct mm stored
+ * in the enclave's mm_list. When not called from ksgxd, just returns
+ * the mem_cgroup of the current task.
+ */
+static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
+{
+ struct mem_cgroup *memcg = NULL;
+ struct sgx_encl_mm *encl_mm;
+ int idx;
+
+ /*
+ * If called from normal task context, return the mem_cgroup
+ * of the current task's mm. The remainder of the handling is for
+ * ksgxd.
+ */
+ if (!current_is_ksgxd())
+ return get_mem_cgroup_from_mm(current->mm);
+
+ /*
+ * Search the enclave's mm_list to find an mm associated with
+ * this enclave to charge the allocation to.
+ */
+ idx = srcu_read_lock(&encl->srcu);
+
+ list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+ if (!mmget_not_zero(encl_mm->mm))
+ continue;
+
+ memcg = get_mem_cgroup_from_mm(encl_mm->mm);
+
+ mmput_async(encl_mm->mm);
+
+ break;
+ }
+
+ srcu_read_unlock(&encl->srcu, idx);
+
+ /*
+ * In the rare case that there isn't an mm associated with
+ * the enclave, set memcg to the current active mem_cgroup.
+ * This will be the root mem_cgroup if there is no active
+ * mem_cgroup.
+ */
+ if (!memcg)
+ return get_mem_cgroup_from_mm(NULL);
+
+ return memcg;
+}
+
+/**
+ * sgx_encl_alloc_backing() - allocate a new backing storage page
+ * @encl: an enclave pointer
+ * @page_index: enclave page index
+ * @backing: data for accessing backing storage for the page
+ *
+ * When called from ksgxd, sets the active memcg from one of the
+ * mms in the enclave's mm_list prior to any backing page allocation,
+ * in order to ensure that shmem page allocations are charged to the
+ * enclave.
+ *
+ * Return:
+ * 0 on success,
+ * -errno otherwise.
+ */
+int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing)
+{
+ struct mem_cgroup *encl_memcg = sgx_encl_get_mem_cgroup(encl);
+ struct mem_cgroup *memcg = set_active_memcg(encl_memcg);
+ int ret;
+
+ ret = sgx_encl_get_backing(encl, page_index, backing);
+
+ set_active_memcg(memcg);
+ mem_cgroup_put(encl_memcg);
+
+ return ret;
+}
+
+/**
+ * sgx_encl_lookup_backing() - retrieve an existing backing storage page
+ * @encl: an enclave pointer
+ * @page_index: enclave page index
+ * @backing: data for accessing backing storage for the page
+ *
+ * Retrieve a backing page for loading data back into an EPC page with ELDU.
+ * It is the caller's responsibility to ensure that it is appropriate to use
+ * sgx_encl_lookup_backing() rather than sgx_encl_alloc_backing(). If lookup is
+ * not used correctly, this will cause an allocation which is not accounted for.
+ *
+ * Return:
+ * 0 on success,
+ * -errno otherwise.
+ */
+int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing)
+{
+ return sgx_encl_get_backing(encl, page_index, backing);
+}
+
/**
* sgx_encl_put_backing() - Unpin the backing storage
* @backing: data for accessing backing storage for the page
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index fec43ca65065..2de3b150ab00 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -100,13 +100,20 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
return 0;
}
+static inline bool current_is_ksgxd(void)
+{
+ return current->mm ? false : true;
+}
+
int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
unsigned long end, unsigned long vm_flags);
void sgx_encl_release(struct kref *ref);
int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
-int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
- struct sgx_backing *backing);
+int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing);
+int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing);
void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
int sgx_encl_test_and_clear_young(struct mm_struct *mm,
struct sgx_encl_page *page);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 4b41efc9e367..7d41c8538795 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -310,7 +310,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
encl->secs_child_cnt--;
if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) {
- ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
+ ret = sgx_encl_alloc_backing(encl, PFN_DOWN(encl->size),
&secs_backing);
if (ret)
goto out;
@@ -381,7 +381,7 @@ static void sgx_reclaim_pages(void)
goto skip;
page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
- ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]);
+ ret = sgx_encl_alloc_backing(encl_page->encl, page_index, &backing[i]);
if (ret)
goto skip;
--
2.20.1
From: Stephen Brennan <stephen.s.brennan(a)oracle.com>
A rare BUG_ON triggered in assoc_array_gc:
[3430308.818153] kernel BUG at lib/assoc_array.c:1609!
Which corresponded to the statement currently at line 1593 upstream:
BUG_ON(assoc_array_ptr_is_meta(p));
Using the data from the core dump, I was able to generate a userspace
reproducer[1] and determine the cause of the bug.
[1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc
After running the iterator on the entire branch, an internal tree node
looked like the following:
NODE (nr_leaves_on_branch: 3)
SLOT [0] NODE (2 leaves)
SLOT [1] NODE (1 leaf)
SLOT [2..f] NODE (empty)
In the userspace reproducer, the pr_devel output when compressing this
node was:
-- compress node 0x5607cc089380 --
free=0, leaves=0
[0] retain node 2/1 [nx 0]
[1] fold node 1/1 [nx 0]
[2] fold node 0/1 [nx 2]
[3] fold node 0/2 [nx 2]
[4] fold node 0/3 [nx 2]
[5] fold node 0/4 [nx 2]
[6] fold node 0/5 [nx 2]
[7] fold node 0/6 [nx 2]
[8] fold node 0/7 [nx 2]
[9] fold node 0/8 [nx 2]
[10] fold node 0/9 [nx 2]
[11] fold node 0/10 [nx 2]
[12] fold node 0/11 [nx 2]
[13] fold node 0/12 [nx 2]
[14] fold node 0/13 [nx 2]
[15] fold node 0/14 [nx 2]
after: 3
At slot 0, an internal node with 2 leaves could not be folded into the
node, because there was only one available slot (slot 0). Thus, the
internal node was retained. At slot 1, the node had one leaf, and was
able to be folded in successfully. The remaining nodes had no leaves,
and so were removed. By the end of the compression stage, there were 14
free slots, and only 3 leaf nodes. The tree was ascended and then its
parent node was compressed. When this node was seen, it could not be
folded, due to the internal node it contained.
The invariant for compression in this function is: whenever
nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all
leaf nodes. The compression step currently cannot guarantee this, given
the corner case shown above.
To fix this issue, retry compression whenever we have retained a node,
and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second
compression will then allow the node in slot 1 to be folded in,
satisfying the invariant. Below is the output of the reproducer once the
fix is applied:
-- compress node 0x560e9c562380 --
free=0, leaves=0
[0] retain node 2/1 [nx 0]
[1] fold node 1/1 [nx 0]
[2] fold node 0/1 [nx 2]
[3] fold node 0/2 [nx 2]
[4] fold node 0/3 [nx 2]
[5] fold node 0/4 [nx 2]
[6] fold node 0/5 [nx 2]
[7] fold node 0/6 [nx 2]
[8] fold node 0/7 [nx 2]
[9] fold node 0/8 [nx 2]
[10] fold node 0/9 [nx 2]
[11] fold node 0/10 [nx 2]
[12] fold node 0/11 [nx 2]
[13] fold node 0/12 [nx 2]
[14] fold node 0/13 [nx 2]
[15] fold node 0/14 [nx 2]
internal nodes remain despite enough space, retrying
-- compress node 0x560e9c562380 --
free=14, leaves=1
[0] fold node 2/15 [nx 0]
after: 3
Changes
=======
DH:
- Use false instead of 0.
- Reorder the inserted lines in a couple of places to put retained before
next_slot.
ver #2)
- Fix typo in pr_devel, correct comparison to "<="
Fixes: 3cb989501c26 ("Add a generic associative array implementation.")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Stephen Brennan <stephen.s.brennan(a)oracle.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Jarkko Sakkinen <jarkko(a)kernel.org>
cc: Andrew Morton <akpm(a)linux-foundation.org>
cc: keyrings(a)vger.kernel.org
Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.… # v1
Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.… # v2
---
lib/assoc_array.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/lib/assoc_array.c b/lib/assoc_array.c
index 079c72e26493..ca0b4f360c1a 100644
--- a/lib/assoc_array.c
+++ b/lib/assoc_array.c
@@ -1461,6 +1461,7 @@ int assoc_array_gc(struct assoc_array *array,
struct assoc_array_ptr *cursor, *ptr;
struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp;
unsigned long nr_leaves_on_tree;
+ bool retained;
int keylen, slot, nr_free, next_slot, i;
pr_devel("-->%s()\n", __func__);
@@ -1536,6 +1537,7 @@ int assoc_array_gc(struct assoc_array *array,
goto descend;
}
+retry_compress:
pr_devel("-- compress node %p --\n", new_n);
/* Count up the number of empty slots in this node and work out the
@@ -1553,6 +1555,7 @@ int assoc_array_gc(struct assoc_array *array,
pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch);
/* See what we can fold in */
+ retained = false;
next_slot = 0;
for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
struct assoc_array_shortcut *s;
@@ -1602,9 +1605,14 @@ int assoc_array_gc(struct assoc_array *array,
pr_devel("[%d] retain node %lu/%d [nx %d]\n",
slot, child->nr_leaves_on_branch, nr_free + 1,
next_slot);
+ retained = true;
}
}
+ if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) {
+ pr_devel("internal nodes remain despite enough space, retrying\n");
+ goto retry_compress;
+ }
pr_devel("after: %lu\n", new_n->nr_leaves_on_branch);
nr_leaves_on_tree = new_n->nr_leaves_on_branch;
Syzbot found a Use After Free bug in compute_effective_progs().
The reproducer creates a number of BPF links, and causes a fault
injected alloc to fail, while calling bpf_link_detach on them.
Link detach triggers the link to be freed by bpf_link_free(),
which calls __cgroup_bpf_detach() and update_effective_progs().
If the memory allocation in this function fails, the function restores
the pointer to the bpf_cgroup_link on the cgroup list, but the memory
gets freed just after it returns. After this, every subsequent call to
update_effective_progs() causes this already deallocated pointer to be
dereferenced in prog_list_length(), and triggers KASAN UAF error.
To fix this don't preserve the pointer to the link on the cgroup list
in __cgroup_bpf_detach(), but proceed with the cleanup and retry calling
update_effective_progs() again afterwards.
Cc: "Alexei Starovoitov" <ast(a)kernel.org>
Cc: "Daniel Borkmann" <daniel(a)iogearbox.net>
Cc: "Andrii Nakryiko" <andrii(a)kernel.org>
Cc: "Martin KaFai Lau" <kafai(a)fb.com>
Cc: "Song Liu" <songliubraving(a)fb.com>
Cc: "Yonghong Song" <yhs(a)fb.com>
Cc: "John Fastabend" <john.fastabend(a)gmail.com>
Cc: "KP Singh" <kpsingh(a)kernel.org>
Cc: <netdev(a)vger.kernel.org>
Cc: <bpf(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Link: https://syzkaller.appspot.com/bug?id=8ebf179a95c2a2670f7cf1ba62429ec044369d…
Fixes: af6eea57437a ("bpf: Implement bpf_link-based cgroup BPF program attachment")
Reported-by: <syzbot+f264bffdfbd5614f3bb2(a)syzkaller.appspotmail.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk(a)linaro.org>
---
kernel/bpf/cgroup.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 128028efda64..b6307337a3c7 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -723,10 +723,11 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
pl->link = NULL;
err = update_effective_progs(cgrp, atype);
- if (err)
- goto cleanup;
-
- /* now can actually delete it from this cgroup list */
+ /*
+ * Proceed regardless of error. The link and/or prog will be freed
+ * just after this function returns so just delete it from this
+ * cgroup list and retry calling update_effective_progs again later.
+ */
list_del(&pl->node);
kfree(pl);
if (list_empty(progs))
@@ -735,12 +736,11 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
if (old_prog)
bpf_prog_put(old_prog);
static_branch_dec(&cgroup_bpf_enabled_key[atype]);
- return 0;
-cleanup:
- /* restore back prog or link */
- pl->prog = old_prog;
- pl->link = link;
+ /* In case of error call update_effective_progs again */
+ if (err)
+ err = update_effective_progs(cgrp, atype);
+
return err;
}
@@ -881,6 +881,7 @@ static void bpf_cgroup_link_release(struct bpf_link *link)
struct bpf_cgroup_link *cg_link =
container_of(link, struct bpf_cgroup_link, link);
struct cgroup *cg;
+ int err;
/* link might have been auto-detached by dying cgroup already,
* in that case our work is done here
@@ -896,8 +897,10 @@ static void bpf_cgroup_link_release(struct bpf_link *link)
return;
}
- WARN_ON(__cgroup_bpf_detach(cg_link->cgroup, NULL, cg_link,
- cg_link->type));
+ err = __cgroup_bpf_detach(cg_link->cgroup, NULL, cg_link,
+ cg_link->type);
+ if (err)
+ pr_warn("cgroup_bpf_detach() failed, err %d\n", err);
cg = cg_link->cgroup;
cg_link->cgroup = NULL;
--
2.35.1
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
---
drivers/xen/gntdev-common.h | 5 ++
drivers/xen/gntdev.c | 100 +++++++++++++++++++-----------------
2 files changed, 59 insertions(+), 46 deletions(-)
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..a268cdb1f7bf 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,9 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..90bd2b5ef7dd 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -62,8 +63,8 @@ MODULE_PARM_DESC(limit,
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +121,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +142,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -349,79 +354,84 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
+}
+
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +483,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +503,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Running kernel-doc script on drivers/hid/hid-uclogic-params.c, it found
6 warnings for hid_dbg() wrapper functions below:
drivers/hid/hid-uclogic-params.c:48: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Dump tablet interface pen parameters with hid_dbg(), indented with one tab.
drivers/hid/hid-uclogic-params.c:48: warning: missing initial short description on line:
* Dump tablet interface pen parameters with hid_dbg(), indented with one tab.
drivers/hid/hid-uclogic-params.c:48: info: Scanning doc for function Dump
drivers/hid/hid-uclogic-params.c:80: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Dump tablet interface frame parameters with hid_dbg(), indented with two
drivers/hid/hid-uclogic-params.c:80: warning: missing initial short description on line:
* Dump tablet interface frame parameters with hid_dbg(), indented with two
drivers/hid/hid-uclogic-params.c:80: info: Scanning doc for function Dump
drivers/hid/hid-uclogic-params.c:105: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Dump tablet interface parameters with hid_dbg().
drivers/hid/hid-uclogic-params.c:105: warning: missing initial short description on line:
* Dump tablet interface parameters with hid_dbg().
One of them is reported by kernel test robot.
Fix these warnings by properly format kernel-doc comment for these
functions.
Link: https://lore.kernel.org/linux-doc/202205272033.XFYlYj8k-lkp@intel.com/
Fixes: a228809fa6f39c ("HID: uclogic: Move param printing to a function")
Reported-by: kernel test robot <lkp(a)intel.com>
Cc: Nikolai Kondrashov <spbnick(a)gmail.com>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
Cc: "José Expósito" <jose.exposito89(a)gmail.com>
Cc: llvm(a)lists.linux.dev
Cc: stable(a)vger.kernel.org # v5.18
Cc: linux-input(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
---
Changes since v1 [1]:
- Approach the warning by fixing kernel-doc comments formatting
(suggested by Jonathan Corbet)
[1]: https://lore.kernel.org/linux-doc/20220528091403.160169-1-bagasdotme@gmail.…
drivers/hid/hid-uclogic-params.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/hid/hid-uclogic-params.c b/drivers/hid/hid-uclogic-params.c
index db838f16282d64..647bbd3e000e2f 100644
--- a/drivers/hid/hid-uclogic-params.c
+++ b/drivers/hid/hid-uclogic-params.c
@@ -23,11 +23,11 @@
/**
* uclogic_params_pen_inrange_to_str() - Convert a pen in-range reporting type
* to a string.
- *
* @inrange: The in-range reporting type to convert.
*
- * Returns:
- * The string representing the type, or NULL if the type is unknown.
+ * Return:
+ * * The string representing the type, or
+ * * NULL if the type is unknown.
*/
static const char *uclogic_params_pen_inrange_to_str(
enum uclogic_params_pen_inrange inrange)
@@ -45,10 +45,12 @@ static const char *uclogic_params_pen_inrange_to_str(
}
/**
- * Dump tablet interface pen parameters with hid_dbg(), indented with one tab.
- *
+ * uclogic_params_pen_hid_dbg() - Dump tablet interface pen parameters
* @hdev: The HID device the pen parameters describe.
* @pen: The pen parameters to dump.
+ *
+ * Dump tablet interface pen parameters with hid_dbg(). The dump is indented
+ * with a tab.
*/
static void uclogic_params_pen_hid_dbg(const struct hid_device *hdev,
const struct uclogic_params_pen *pen)
@@ -77,11 +79,12 @@ static void uclogic_params_pen_hid_dbg(const struct hid_device *hdev,
}
/**
- * Dump tablet interface frame parameters with hid_dbg(), indented with two
- * tabs.
- *
+ * uclogic_params_frame_hid_dbg() - Dump tablet interface frame parameters
* @hdev: The HID device the pen parameters describe.
* @frame: The frame parameters to dump.
+ *
+ * Dump tablet interface frame parameters with hid_dbg(). The dump is
+ * indented with two tabs.
*/
static void uclogic_params_frame_hid_dbg(
const struct hid_device *hdev,
@@ -102,10 +105,11 @@ static void uclogic_params_frame_hid_dbg(
}
/**
- * Dump tablet interface parameters with hid_dbg().
- *
+ * uclogic_params_hid_dbg() - Dump tablet interface parameters
* @hdev: The HID device the parameters describe.
* @params: The parameters to dump.
+ *
+ * Dump tablet interface parameters with hid_dbg().
*/
void uclogic_params_hid_dbg(const struct hid_device *hdev,
const struct uclogic_params *params)
base-commit: 8ab2afa23bd197df47819a87f0265c0ac95c5b6a
--
An old man doll... just what I always wanted! - Clara
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcd46d897adb70d63e025f175a00a89797d31a43 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Mon, 31 Jan 2022 16:09:47 -0800
Subject: [PATCH] exec: Force single empty string when argv is empty
Quoting[1] Ariadne Conill:
"In several other operating systems, it is a hard requirement that the
second argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[2]:
The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[3],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[4]
of this bug in a shellcode, we can reconsider.
This issue is being tracked in the KSPP issue tracker[5]."
While the initial code searches[6][7] turned up what appeared to be
mostly corner case tests, trying to that just reject argv == NULL
(or an immediately terminated pointer list) quickly started tripping[8]
existing userspace programs.
The next best approach is forcing a single empty string into argv and
adjusting argc to match. The number of programs depending on argc == 0
seems a smaller set than those calling execve with a NULL argv.
Account for the additional stack space in bprm_stack_limits(). Inject an
empty string when argc == 0 (and set argc = 1). Warn about the case so
userspace has some notice about the change:
process './argc0' launched './argc0' with NULL argv: empty string added
Additionally WARN() and reject NULL argv usage for kernel threads.
[1] https://lore.kernel.org/lkml/20220127000724.15106-1-ariadne@dereferenced.or…
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[4] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[5] https://github.com/KSPP/linux/issues/176
[6] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+…
[7] https://codesearch.debian.net/search?q=execlp%3F%5Cs*%5C%28%5B%5E%2C%5D%2B%…
[8] https://lore.kernel.org/lkml/20220131144352.GE16385@xsang-OptiPlex-9020/
Reported-by: Ariadne Conill <ariadne(a)dereferenced.org>
Reported-by: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Christian Brauner <brauner(a)kernel.org>
Acked-by: Ariadne Conill <ariadne(a)dereferenced.org>
Acked-by: Andy Lutomirski <luto(a)kernel.org>
Link: https://lore.kernel.org/r/20220201000947.2453721-1-keescook@chromium.org
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..40b1008fb0f7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,14 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc = 0, make sure there is space for adding a
+ * empty string (which will bump argc to 1), to ensure confused
+ * userspace programs don't start processing from argv[1], thinking
+ * argc can never be 0, to keep them from walking envp by accident.
+ * See do_execveat_common().
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
@@ -1897,6 +1903,9 @@ static int do_execveat_common(int fd, struct filename *filename,
}
retval = count(argv, MAX_ARG_STRINGS);
+ if (retval == 0)
+ pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
+ current->comm, bprm->filename);
if (retval < 0)
goto out_free;
bprm->argc = retval;
@@ -1923,6 +1932,19 @@ static int do_execveat_common(int fd, struct filename *filename,
if (retval < 0)
goto out_free;
+ /*
+ * When argv is empty, add an empty string ("") as argv[0] to
+ * ensure confused userspace programs that start processing
+ * from argv[1] won't end up walking envp. See also
+ * bprm_stack_limits().
+ */
+ if (bprm->argc == 0) {
+ retval = copy_string_kernel("", bprm);
+ if (retval < 0)
+ goto out_free;
+ bprm->argc = 1;
+ }
+
retval = bprm_execve(bprm, fd, filename, flags);
out_free:
free_bprm(bprm);
@@ -1951,6 +1973,8 @@ int kernel_execve(const char *kernel_filename,
}
retval = count_strings_kernel(argv);
+ if (WARN_ON_ONCE(retval == 0))
+ retval = -EINVAL;
if (retval < 0)
goto out_free;
bprm->argc = retval;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcd46d897adb70d63e025f175a00a89797d31a43 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Mon, 31 Jan 2022 16:09:47 -0800
Subject: [PATCH] exec: Force single empty string when argv is empty
Quoting[1] Ariadne Conill:
"In several other operating systems, it is a hard requirement that the
second argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[2]:
The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[3],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[4]
of this bug in a shellcode, we can reconsider.
This issue is being tracked in the KSPP issue tracker[5]."
While the initial code searches[6][7] turned up what appeared to be
mostly corner case tests, trying to that just reject argv == NULL
(or an immediately terminated pointer list) quickly started tripping[8]
existing userspace programs.
The next best approach is forcing a single empty string into argv and
adjusting argc to match. The number of programs depending on argc == 0
seems a smaller set than those calling execve with a NULL argv.
Account for the additional stack space in bprm_stack_limits(). Inject an
empty string when argc == 0 (and set argc = 1). Warn about the case so
userspace has some notice about the change:
process './argc0' launched './argc0' with NULL argv: empty string added
Additionally WARN() and reject NULL argv usage for kernel threads.
[1] https://lore.kernel.org/lkml/20220127000724.15106-1-ariadne@dereferenced.or…
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[4] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[5] https://github.com/KSPP/linux/issues/176
[6] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+…
[7] https://codesearch.debian.net/search?q=execlp%3F%5Cs*%5C%28%5B%5E%2C%5D%2B%…
[8] https://lore.kernel.org/lkml/20220131144352.GE16385@xsang-OptiPlex-9020/
Reported-by: Ariadne Conill <ariadne(a)dereferenced.org>
Reported-by: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Christian Brauner <brauner(a)kernel.org>
Acked-by: Ariadne Conill <ariadne(a)dereferenced.org>
Acked-by: Andy Lutomirski <luto(a)kernel.org>
Link: https://lore.kernel.org/r/20220201000947.2453721-1-keescook@chromium.org
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..40b1008fb0f7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,14 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc = 0, make sure there is space for adding a
+ * empty string (which will bump argc to 1), to ensure confused
+ * userspace programs don't start processing from argv[1], thinking
+ * argc can never be 0, to keep them from walking envp by accident.
+ * See do_execveat_common().
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
@@ -1897,6 +1903,9 @@ static int do_execveat_common(int fd, struct filename *filename,
}
retval = count(argv, MAX_ARG_STRINGS);
+ if (retval == 0)
+ pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
+ current->comm, bprm->filename);
if (retval < 0)
goto out_free;
bprm->argc = retval;
@@ -1923,6 +1932,19 @@ static int do_execveat_common(int fd, struct filename *filename,
if (retval < 0)
goto out_free;
+ /*
+ * When argv is empty, add an empty string ("") as argv[0] to
+ * ensure confused userspace programs that start processing
+ * from argv[1] won't end up walking envp. See also
+ * bprm_stack_limits().
+ */
+ if (bprm->argc == 0) {
+ retval = copy_string_kernel("", bprm);
+ if (retval < 0)
+ goto out_free;
+ bprm->argc = 1;
+ }
+
retval = bprm_execve(bprm, fd, filename, flags);
out_free:
free_bprm(bprm);
@@ -1951,6 +1973,8 @@ int kernel_execve(const char *kernel_filename,
}
retval = count_strings_kernel(argv);
+ if (WARN_ON_ONCE(retval == 0))
+ retval = -EINVAL;
if (retval < 0)
goto out_free;
bprm->argc = retval;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcd46d897adb70d63e025f175a00a89797d31a43 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Mon, 31 Jan 2022 16:09:47 -0800
Subject: [PATCH] exec: Force single empty string when argv is empty
Quoting[1] Ariadne Conill:
"In several other operating systems, it is a hard requirement that the
second argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[2]:
The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[3],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[4]
of this bug in a shellcode, we can reconsider.
This issue is being tracked in the KSPP issue tracker[5]."
While the initial code searches[6][7] turned up what appeared to be
mostly corner case tests, trying to that just reject argv == NULL
(or an immediately terminated pointer list) quickly started tripping[8]
existing userspace programs.
The next best approach is forcing a single empty string into argv and
adjusting argc to match. The number of programs depending on argc == 0
seems a smaller set than those calling execve with a NULL argv.
Account for the additional stack space in bprm_stack_limits(). Inject an
empty string when argc == 0 (and set argc = 1). Warn about the case so
userspace has some notice about the change:
process './argc0' launched './argc0' with NULL argv: empty string added
Additionally WARN() and reject NULL argv usage for kernel threads.
[1] https://lore.kernel.org/lkml/20220127000724.15106-1-ariadne@dereferenced.or…
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[4] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[5] https://github.com/KSPP/linux/issues/176
[6] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+…
[7] https://codesearch.debian.net/search?q=execlp%3F%5Cs*%5C%28%5B%5E%2C%5D%2B%…
[8] https://lore.kernel.org/lkml/20220131144352.GE16385@xsang-OptiPlex-9020/
Reported-by: Ariadne Conill <ariadne(a)dereferenced.org>
Reported-by: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Christian Brauner <brauner(a)kernel.org>
Acked-by: Ariadne Conill <ariadne(a)dereferenced.org>
Acked-by: Andy Lutomirski <luto(a)kernel.org>
Link: https://lore.kernel.org/r/20220201000947.2453721-1-keescook@chromium.org
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..40b1008fb0f7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,14 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc = 0, make sure there is space for adding a
+ * empty string (which will bump argc to 1), to ensure confused
+ * userspace programs don't start processing from argv[1], thinking
+ * argc can never be 0, to keep them from walking envp by accident.
+ * See do_execveat_common().
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
@@ -1897,6 +1903,9 @@ static int do_execveat_common(int fd, struct filename *filename,
}
retval = count(argv, MAX_ARG_STRINGS);
+ if (retval == 0)
+ pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
+ current->comm, bprm->filename);
if (retval < 0)
goto out_free;
bprm->argc = retval;
@@ -1923,6 +1932,19 @@ static int do_execveat_common(int fd, struct filename *filename,
if (retval < 0)
goto out_free;
+ /*
+ * When argv is empty, add an empty string ("") as argv[0] to
+ * ensure confused userspace programs that start processing
+ * from argv[1] won't end up walking envp. See also
+ * bprm_stack_limits().
+ */
+ if (bprm->argc == 0) {
+ retval = copy_string_kernel("", bprm);
+ if (retval < 0)
+ goto out_free;
+ bprm->argc = 1;
+ }
+
retval = bprm_execve(bprm, fd, filename, flags);
out_free:
free_bprm(bprm);
@@ -1951,6 +1973,8 @@ int kernel_execve(const char *kernel_filename,
}
retval = count_strings_kernel(argv);
+ if (WARN_ON_ONCE(retval == 0))
+ retval = -EINVAL;
if (retval < 0)
goto out_free;
bprm->argc = retval;
This is the start of the stable review cycle for the 5.10.119 release.
There are 163 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 29 May 2022 08:46:26 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.119-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.119-rc1
Edward Matijevic <motolav(a)gmail.com>
ALSA: ctxfi: Add SB046x PCI ID
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: check for signals after page of pool writes
Jens Axboe <axboe(a)kernel.dk>
random: wire up fops->splice_{read,write}_iter()
Jens Axboe <axboe(a)kernel.dk>
random: convert to using fops->write_iter()
Jens Axboe <axboe(a)kernel.dk>
random: convert to using fops->read_iter()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: unify batched entropy implementations
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: move randomize_page() into mm where it belongs
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: move initialization functions out of hot pages
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: make consistent use of buf and len
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use proper return types on get_random_{int,long}_wait()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove extern from functions in header
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use static branch for crng_ready()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: credit architectural init the exact amount
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: handle latent entropy and command line from random_init()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use proper jiffies comparison macro
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove ratelimiting for in-kernel unseeded randomness
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: move initialization out of reseeding hot path
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: avoid initializing twice in credit race
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use symbolic constants for crng_init states
Jason A. Donenfeld <Jason(a)zx2c4.com>
siphash: use one source of truth for siphash permutations
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: help compiler out with fast_mix() by using simpler arguments
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not use input pool from hard IRQs
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: order timer entropy functions below interrupt functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not pretend to handle premature next security model
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use first 128 bits of input as fast init
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not use batches when !crng_ready()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: insist on random_get_entropy() existing in order to simplify
Jason A. Donenfeld <Jason(a)zx2c4.com>
xtensa: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
sparc: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
um: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
x86/tsc: Use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
nios2: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
arm: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
mips: use fallback for random_get_entropy() instead of just c0 random
Jason A. Donenfeld <Jason(a)zx2c4.com>
riscv: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
m68k: use fallback for random_get_entropy() instead of zero
Jason A. Donenfeld <Jason(a)zx2c4.com>
timekeeping: Add raw clock fallback for random_get_entropy()
Jason A. Donenfeld <Jason(a)zx2c4.com>
powerpc: define get_cycles macro for arch-override
Jason A. Donenfeld <Jason(a)zx2c4.com>
alpha: define get_cycles macro for arch-override
Jason A. Donenfeld <Jason(a)zx2c4.com>
parisc: define get_cycles macro for arch-override
Jason A. Donenfeld <Jason(a)zx2c4.com>
s390: define get_cycles macro for arch-override
Jason A. Donenfeld <Jason(a)zx2c4.com>
ia64: define get_cycles macro for arch-override
Jason A. Donenfeld <Jason(a)zx2c4.com>
init: call time_init() before rand_initialize()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: fix sysctl documentation nits
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: document crng_fast_key_erasure() destination possibility
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: make random_get_entropy() return an unsigned long
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: allow partial reads if later user copies fail
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: check for signals every PAGE_SIZE chunk of /dev/[u]random
Jann Horn <jannh(a)google.com>
random: check for signal_pending() outside of need_resched() check
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not allow user to keep crng key around on stack
Jan Varho <jan.varho(a)gmail.com>
random: do not split fast init input in add_hwgenerator_randomness()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: mix build-time latent entropy into pool at init
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: re-add removed comment about get_random_{u32,u64} reseeding
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: treat bootloader trust toggle the same way as cpu trust toggle
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: skip fast_init if hwrng provides large chunk of entropy
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: check for signal and try earlier when generating entropy
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: reseed more often immediately after booting
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: make consistent usage of crng_ready()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use SipHash as interrupt entropy accumulator
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: replace custom notifier chain with standard one
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: don't let 644 read-only sysctls be written to
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: give sysctl_random_min_urandom_seed a more sensible value
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do crng pre-init loading in worker rather than irq
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: unify cycles_t and jiffies usage and types
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: cleanup UUID handling
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: only wake up writers after zap if threshold was passed
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: round-robin registers as ulong, not u32
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: clear fast pool, crng, and batches in cpuhp bring up
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: pull add_hwgenerator_randomness() declaration into random.h
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: check for crng_init == 0 in add_device_randomness()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: unify early init crng load accounting
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not take pool spinlock at boot
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: defer fast pool mixing to worker
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: rewrite header introductory comment
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group sysctl functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group userspace read/write functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group entropy collection functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group entropy extraction functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group crng functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: group initialization wait functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove whitespace and reorder includes
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove useless header comment
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: introduce drain_entropy() helper to declutter crng_reseed()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: deobfuscate irq u32/u64 contributions
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: add proper SPDX header
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove unused tracepoints
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove ifdef'd out interrupt bench
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: tie batched entropy generation to base_crng generation
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: fix locking for crng_init in crng_reseed()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: zero buffer after reading entropy from userspace
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove outdated INT_MAX >> 6 check in urandom_read()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: make more consistent use of integer types
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use hash function for crng_slow_load()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use simpler fast key erasure flow on per-cpu keys
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: absorb fast pool into input pool after fast load
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not xor RDRAND when writing into /dev/random
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: ensure early RDSEED goes through mixer on init
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: inline leaves of rand_initialize()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: get rid of secondary crngs
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use RDSEED instead of RDRAND in entropy extraction
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: fix locking in crng_fast_load()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove batched entropy locking
Eric Biggers <ebiggers(a)google.com>
random: remove use_input_pool parameter from crng_reseed()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: make credit_entropy_bits() always safe
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: always wake up entropy writers after extraction
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use linear min-entropy accumulation crediting
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: simplify entropy debiting
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use computational hash for entropy extraction
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: only call crng_finalize_init() for primary_crng
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: access primary_pool directly rather than through pointer
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: continually use hwgenerator randomness
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: simplify arithmetic function flow in account()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: selectively clang-format where it makes sense
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: access input_pool_data directly rather than through pointer
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: cleanup fractional entropy shift constants
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: prepend remaining pool constants with POOL_
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: de-duplicate INPUT_POOL constants
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove unused OUTPUT_POOL constants
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: rather than entropy_store abstraction, use global
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove unused extract_entropy() reserved argument
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: remove incomplete last_data logic
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: cleanup integer types
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: cleanup poolinfo abstraction
Schspa Shi <schspa(a)gmail.com>
random: fix typo in comments
Jann Horn <jannh(a)google.com>
random: don't reset crng_init_cnt on urandom_read()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: avoid superfluous call to RDRAND in CRNG extraction
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: early initialization of ChaCha constants
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs
Dominik Brodowski <linux(a)dominikbrodowski.net>
random: harmonize "crng init done" messages
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: mix bootloader randomness into pool
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not re-init if crng_reseed completes before primary init
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: do not sign extend bytes for rotation when mixing
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use BLAKE2s instead of SHA1 in extraction
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
random: remove unused irq_flags argument from add_interrupt_randomness()
Mark Brown <broonie(a)kernel.org>
random: document add_hwgenerator_randomness() with other input functions
Jason A. Donenfeld <Jason(a)zx2c4.com>
lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI
Jason A. Donenfeld <Jason(a)zx2c4.com>
lib/crypto: sha1: re-roll loops to reduce code size
Jason A. Donenfeld <Jason(a)zx2c4.com>
lib/crypto: blake2s: move hmac construction into wireguard
Jason A. Donenfeld <Jason(a)zx2c4.com>
lib/crypto: blake2s: include as built-in
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - include <linux/bug.h> instead of <asm/bug.h>
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - adjust include guard naming
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - add comment for blake2s_state fields
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - optimize blake2s initialization
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - share the "shash" API boilerplate code
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - move update and final logic to internal/blake2s.h
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - remove unneeded includes
Eric Biggers <ebiggers(a)google.com>
crypto: x86/blake2s - define shash_alg structs using macros
Eric Biggers <ebiggers(a)google.com>
crypto: blake2s - define shash_alg structs using macros
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: lib/blake2s - Move selftest prototype into header file
Jason A. Donenfeld <Jason(a)zx2c4.com>
MAINTAINERS: add git tree for random.c
Jason A. Donenfeld <Jason(a)zx2c4.com>
MAINTAINERS: co-maintain random.c
Eric Biggers <ebiggers(a)google.com>
random: remove dead code left over from blocking pool
Ard Biesheuvel <ardb(a)kernel.org>
random: avoid arch_get_random_seed_long() when collecting IRQ randomness
Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
ACPI: sysfs: Fix BERT error region memory mapping
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
ACPI: sysfs: Make sparse happy about address space in use
Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
media: vim2m: initialize the media device earlier
Sakari Ailus <sakari.ailus(a)linux.intel.com>
media: vim2m: Register video device after setting up internals
Willy Tarreau <w(a)1wt.eu>
secure_seq: use the 64 bits of the siphash for port offset calculation
Eric Dumazet <edumazet(a)google.com>
tcp: change source port randomizarion at connect() time
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
Vitaly Kuznetsov <vkuznets(a)redhat.com>
KVM: x86: Properly handle APF vs disabled LAPIC situation
Denis Efremov (Oracle) <efremov(a)linux.com>
staging: rtl8723bs: prevent ->Ssid overflow in rtw_wx_set_scan()
Daniel Thompson <daniel.thompson(a)linaro.org>
lockdown: also lock down previous kgdb use
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 6 +
Documentation/admin-guide/sysctl/kernel.rst | 22 +-
MAINTAINERS | 2 +
Makefile | 4 +-
arch/alpha/include/asm/timex.h | 1 +
arch/arm/include/asm/timex.h | 1 +
arch/ia64/include/asm/timex.h | 1 +
arch/m68k/include/asm/timex.h | 2 +-
arch/mips/include/asm/timex.h | 17 +-
arch/nios2/include/asm/timex.h | 3 +
arch/parisc/include/asm/timex.h | 3 +-
arch/powerpc/include/asm/timex.h | 1 +
arch/riscv/include/asm/timex.h | 2 +-
arch/s390/include/asm/timex.h | 1 +
arch/sparc/include/asm/timex_32.h | 4 +-
arch/um/include/asm/timex.h | 9 +-
arch/x86/crypto/Makefile | 4 +-
arch/x86/crypto/blake2s-glue.c | 166 +-
arch/x86/crypto/blake2s-shash.c | 77 +
arch/x86/include/asm/timex.h | 9 +
arch/x86/include/asm/tsc.h | 7 +-
arch/x86/kernel/cpu/mshyperv.c | 2 +-
arch/x86/kvm/lapic.c | 6 +
arch/x86/kvm/mmu/mmu.c | 6 +-
arch/x86/kvm/x86.c | 2 +-
arch/xtensa/include/asm/timex.h | 6 +-
crypto/Kconfig | 3 +-
crypto/blake2s_generic.c | 158 +-
crypto/drbg.c | 17 +-
drivers/acpi/sysfs.c | 23 +-
drivers/char/Kconfig | 3 +-
drivers/char/hw_random/core.c | 1 +
drivers/char/random.c | 3035 +++++++++--------------
drivers/hv/vmbus_drv.c | 2 +-
drivers/media/test-drivers/vim2m.c | 22 +-
drivers/net/Kconfig | 1 -
drivers/net/wireguard/noise.c | 45 +-
drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 6 +-
include/crypto/blake2s.h | 66 +-
include/crypto/chacha.h | 15 +-
include/crypto/drbg.h | 2 +-
include/crypto/internal/blake2s.h | 123 +-
include/linux/cpuhotplug.h | 2 +
include/linux/hw_random.h | 2 -
include/linux/mm.h | 1 +
include/linux/prandom.h | 23 +-
include/linux/random.h | 100 +-
include/linux/security.h | 2 +
include/linux/siphash.h | 28 +
include/linux/timex.h | 10 +-
include/net/inet_hashtables.h | 2 +-
include/net/secure_seq.h | 4 +-
include/trace/events/random.h | 330 ---
init/main.c | 13 +-
kernel/cpu.c | 11 +
kernel/debug/debug_core.c | 24 +
kernel/debug/kdb/kdb_main.c | 62 +-
kernel/irq/handle.c | 2 +-
kernel/time/timekeeping.c | 15 +
lib/Kconfig.debug | 3 +-
lib/crypto/Kconfig | 23 +-
lib/crypto/Makefile | 9 +-
lib/crypto/blake2s-generic.c | 6 +-
lib/crypto/blake2s-selftest.c | 33 +-
lib/crypto/blake2s.c | 81 +-
lib/random32.c | 16 +-
lib/sha1.c | 95 +-
lib/siphash.c | 32 +-
lib/vsprintf.c | 10 +-
mm/util.c | 32 +
net/core/secure_seq.c | 4 +-
net/ipv4/inet_hashtables.c | 28 +-
net/ipv6/inet6_hashtables.c | 4 +-
security/security.c | 2 +
sound/pci/ctxfi/ctatc.c | 2 +
sound/pci/ctxfi/cthardware.h | 3 +-
76 files changed, 1865 insertions(+), 3035 deletions(-)
The patch titled
Subject: x86/kexec: fix memory leak of elf header buffer
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
x86-kexec-fix-memory-leak-of-elf-header-buffer.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Baoquan He <bhe(a)redhat.com>
Subject: x86/kexec: fix memory leak of elf header buffer
Date: Wed, 23 Feb 2022 19:32:24 +0800
This is reported by kmemleak detector:
unreferenced object 0xffffc900002a9000 (size 4096):
comm "kexec", pid 14950, jiffies 4295110793 (age 373.951s)
hex dump (first 32 bytes):
7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............
04 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 ..>.............
backtrace:
[<0000000016a8ef9f>] __vmalloc_node_range+0x101/0x170
[<000000002b66b6c0>] __vmalloc_node+0xb4/0x160
[<00000000ad40107d>] crash_prepare_elf64_headers+0x8e/0xcd0
[<0000000019afff23>] crash_load_segments+0x260/0x470
[<0000000019ebe95c>] bzImage64_load+0x814/0xad0
[<0000000093e16b05>] arch_kexec_kernel_image_load+0x1be/0x2a0
[<000000009ef2fc88>] kimage_file_alloc_init+0x2ec/0x5a0
[<0000000038f5a97a>] __do_sys_kexec_file_load+0x28d/0x530
[<0000000087c19992>] do_syscall_64+0x3b/0x90
[<0000000066e063a4>] entry_SYSCALL_64_after_hwframe+0x44/0xae
In crash_prepare_elf64_headers(), a buffer is allocated via vmalloc() to
store elf headers. While it's not freed back to system correctly when
kdump kernel is reloaded or unloaded. Then memory leak is caused. Fix it
by introducing x86 specific function arch_kimage_file_post_load_cleanup(),
and freeing the buffer there.
And also remove the incorrect elf header buffer freeing code. Before
calling arch specific kexec_file loading function, the image instance has
been initialized. So 'image->elf_headers' must be NULL. It doesn't make
sense to free the elf header buffer in the place.
Three different people have reported three bugs about the memory leak on
x86_64 inside Redhat.
Link: https://lkml.kernel.org/r/20220223113225.63106-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe(a)redhat.com>
Acked-by: Dave Young <dyoung(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/kernel/machine_kexec_64.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/machine_kexec_64.c~x86-kexec-fix-memory-leak-of-elf-header-buffer
+++ a/arch/x86/kernel/machine_kexec_64.c
@@ -376,9 +376,6 @@ void machine_kexec(struct kimage *image)
#ifdef CONFIG_KEXEC_FILE
void *arch_kexec_kernel_image_load(struct kimage *image)
{
- vfree(image->elf_headers);
- image->elf_headers = NULL;
-
if (!image->fops || !image->fops->load)
return ERR_PTR(-ENOEXEC);
@@ -514,6 +511,15 @@ overflow:
(int)ELF64_R_TYPE(rel[i].r_info), value);
return -ENOEXEC;
}
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+ vfree(image->elf_headers);
+ image->elf_headers = NULL;
+ image->elf_headers_sz = 0;
+
+ return kexec_image_post_load_cleanup_default(image);
+}
#endif /* CONFIG_KEXEC_FILE */
static int
_
Patches currently in -mm which might be from bhe(a)redhat.com are
x86-kexec-fix-memory-leak-of-elf-header-buffer.patch
The patch titled
Subject: mm/memremap: fix missing call to untrack_pfn() in pagemap_range()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memremap-fix-missing-call-to-untrack_pfn-in-pagemap_range.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memremap: fix missing call to untrack_pfn() in pagemap_range()
Date: Tue, 31 May 2022 20:26:43 +0800
We forget to call untrack_pfn() to pair with track_pfn_remap() when range
is not allowed to hotplug. Fix it by jump err_kasan.
Link: https://lkml.kernel.org/r/20220531122643.25249-1-linmiaohe@huawei.com
Fixes: bca3feaa0764 ("mm/memory_hotplug: prevalidate the address range being added with platform")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memremap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memremap.c~mm-memremap-fix-missing-call-to-untrack_pfn-in-pagemap_range
+++ a/mm/memremap.c
@@ -214,7 +214,7 @@ static int pagemap_range(struct dev_page
if (!mhp_range_allowed(range->start, range_len(range), !is_private)) {
error = -EINVAL;
- goto err_pfn_remap;
+ goto err_kasan;
}
mem_hotplug_begin();
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
maintainers-add-maintainer-information-for-z3fold.patch
mm-memremap-fix-missing-call-to-untrack_pfn-in-pagemap_range.patch
mm-shmemc-clean-up-comment-of-shmem_swapin_folio.patch
mm-reduce-the-rcu-lock-duration.patch
mm-migration-remove-unneeded-lock-page-and-pagemovable-check.patch
mm-migration-return-errno-when-isolate_huge_page-failed.patch
mm-migration-fix-potential-pte_unmap-on-an-not-mapped-pte.patch
Dzień dobry,
czy interesują Państwo regały magazynowe, które pozwolą odpowiednio zagospodarować i całościowo wykorzystać przestrzeń hali?
Kontaktuję się ponieważ mogę zaproponować Państwu wytrzymałe i stabilne regały, szafy oraz pojemniki, a także skrzyniopalety i kontenery samowyładowcze.
Jeżeli zależy Państwu na bezpiecznym i wygodnym składowaniu towarów, produktów i półfabrykatów, nasze rozwiązania zagwarantują firmie efektywne wykorzystanie dostępnej przestrzeni.
Ze swojej strony zapewniamy transport oraz długoletnią gwarancję.
Czy byliby Państwo zainteresowani wstępną wyceną?
Pozdrawiam
Marek Pozyrewski
Dear stable
The world famous brand John Lewis & Partners, is UK's largest
multi-channel retailer with over 126 shops and multiple expansion
in Africa furnished by European/Asian/American products. We are
sourcing new products to attract new customers and also retain
our existing ones, create new partnerships with companies dealing
with different kinds of goods globally.
Your company's products are of interest to our market as we have
an amazing market for your products.Provide us your current
catalog through email to review more. We hope to be able to order
with you and start a long-term friendly, respectable and solid
business partnership. Please we would appreciate it if you could
send us your stock availability via email if any.
Our payment terms are 15 days net in Europe, 30 days Net in UK
and 30 days net in Asia/USA as we have operated with over 5297
suppliers around the globe for the past 50 years now. For
immediate response Send your reply to "robert_turner@johnlewis-
trades.com" for us to be able to treat with care and urgency.
Best Regards
Rob Turner
Head Of Procurement Operations
John Lewis & Partners.
robert_turner(a)johnlewis-trades.com
Tel: +44-7451-274090
WhatsApp: +447497483925
www.johnlewis.com
REGISTERED OFFICE: 171 VICTORIA STREET, LONDON SW1E 5NN
The patch titled
Subject: mm: lru_cache_disable: use synchronize_rcu_expedited
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-lru_cache_disable-use-synchronize_rcu_expedited.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marcelo Tosatti <mtosatti(a)redhat.com>
Subject: mm: lru_cache_disable: use synchronize_rcu_expedited
Date: Mon, 30 May 2022 12:51:56 -0300
commit ff042f4a9b050 ("mm: lru_cache_disable: replace work queue
synchronization with synchronize_rcu") replaced lru_cache_disable's usage
of work queues with synchronize_rcu.
Some users reported large performance regressions due to this commit, for
example:
https://lore.kernel.org/all/20220521234616.GO1790663@paulmck-ThinkPad-P17-G…
Switching to synchronize_rcu_expedited fixes the problem.
Link: https://lkml.kernel.org/r/YpToHCmnx/HEcVyR@fuller.cnet
Fixes: ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu")
Signed-off-by: Marcelo Tosatti <mtosatti(a)redhat.com>
Tested-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Tested-by: Michael Larabel <Michael(a)MichaelLarabel.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Nicolas Saenz Julienne <nsaenzju(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Paul E. McKenney <paulmck(a)kernel.org>
Cc: Phil Elwell <phil(a)raspberrypi.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/swap.c~mm-lru_cache_disable-use-synchronize_rcu_expedited
+++ a/mm/swap.c
@@ -881,7 +881,7 @@ void lru_cache_disable(void)
* lru_disable_count = 0 will have exited the critical
* section when synchronize_rcu() returns.
*/
- synchronize_rcu();
+ synchronize_rcu_expedited();
#ifdef CONFIG_SMP
__lru_add_drain_all(true);
#else
_
Patches currently in -mm which might be from mtosatti(a)redhat.com are
mm-lru_cache_disable-use-synchronize_rcu_expedited.patch
From: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]
The VT-d spec requires (10.4.4 Global Command Register, TE
field) that:
Hardware implementations supporting DMA draining must drain
any in-flight DMA read/write requests queued within the
Root-Complex before completing the translation enable
command and reflecting the status of the command through
the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do
so after some kind of power state transition. As the
result, the system might stuck in iommu_disable_translati
on(), waiting for the completion of TE transition.
This adds RPLS to a quirk list for those devices and skips
TE disabling if the qurik hits.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla <ravitejax.goud.talla(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Acked-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejas…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/iommu/intel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 91a5c75966f3..a1ffb3d6d901 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5728,7 +5728,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
ver = (dev->device >> 8) & 0xff;
if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
ver != 0x4e && ver != 0x8a && ver != 0x98 &&
- ver != 0x9a)
+ ver != 0x9a && ver != 0xa7)
return;
if (risky_device(dev))
--
2.35.1
From: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]
The VT-d spec requires (10.4.4 Global Command Register, TE
field) that:
Hardware implementations supporting DMA draining must drain
any in-flight DMA read/write requests queued within the
Root-Complex before completing the translation enable
command and reflecting the status of the command through
the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do
so after some kind of power state transition. As the
result, the system might stuck in iommu_disable_translati
on(), waiting for the completion of TE transition.
This adds RPLS to a quirk list for those devices and skips
TE disabling if the qurik hits.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla <ravitejax.goud.talla(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Acked-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay(a)intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejas…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/iommu/intel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 21749859ad45..477dde39823c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6296,7 +6296,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
ver = (dev->device >> 8) & 0xff;
if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
ver != 0x4e && ver != 0x8a && ver != 0x98 &&
- ver != 0x9a)
+ ver != 0x9a && ver != 0xa7)
return;
if (risky_device(dev))
--
2.35.1