This is a note to let you know that I've just added the patch titled
iio: afe: rescale: Accept only offset channels
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week).
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From bee448390e5166d019e9e037194d487ee94399d9 Mon Sep 17 00:00:00 2001
From: Linus Walleij <linus.walleij(a)linaro.org>
Date: Sat, 2 Sep 2023 21:46:20 +0200
Subject: iio: afe: rescale: Accept only offset channels
As noted by Jonathan Cameron: it is perfectly legal for a channel
to have an offset but no scale in addition to the raw interface.
In that case the conversion implies a 1:1 scale.
Make rescale_configure_channel() accept a channel that provides
just scale or just offset.
When a user asks for IIO_CHAN_INFO_OFFSET in rescale_read_raw(),
we now have to deal with the fact that OFFSET can be present
while SCALE is missing. Add code to simply scale 1:1 in this case.
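For context, a minimal standalone sketch (not kernel code; the names and
values below are illustrative) of the conversion convention this relies
on: a consumer computes (raw + offset) * scale, and a channel exposing
OFFSET without SCALE must be read as if scale were exactly 1:1, which is
what the new fallback path passes to rescale_process_offset() as
IIO_VAL_FRACTIONAL 1/1.

#include <stdio.h>

static double channel_value(long raw, long offset, int has_scale,
                            long scale_num, long scale_den)
{
        if (!has_scale) {
                scale_num = 1;  /* 1:1 fallback, as in the hunk below */
                scale_den = 1;
        }
        return ((double)raw + offset) * scale_num / scale_den;
}

int main(void)
{
        /* offset-only channel: value is simply raw + offset */
        printf("%.1f\n", channel_value(1000, -250, 0, 0, 0));
        return 0;
}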
Link: https://lore.kernel.org/linux-iio/CACRpkdZXBjHU4t-GVOCFxRO-AHGxKnxMeHD2s4Y4…
Fixes: 53ebee949980 ("iio: afe: iio-rescale: Support processed channels")
Fixes: 9decacd8b3a4 ("iio: afe: rescale: Fix boolean logic bug")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Peter Rosin <peda(a)axentia.se>
Link: https://lore.kernel.org/r/20230902-iio-rescale-only-offset-v2-1-988b807754c…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/afe/iio-rescale.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/afe/iio-rescale.c b/drivers/iio/afe/iio-rescale.c
index 1f280c360701..56e5913ab82d 100644
--- a/drivers/iio/afe/iio-rescale.c
+++ b/drivers/iio/afe/iio-rescale.c
@@ -214,8 +214,18 @@ static int rescale_read_raw(struct iio_dev *indio_dev,
return ret < 0 ? ret : -EOPNOTSUPP;
}
- ret = iio_read_channel_scale(rescale->source, &scale, &scale2);
- return rescale_process_offset(rescale, ret, scale, scale2,
+ if (iio_channel_has_info(rescale->source->channel,
+ IIO_CHAN_INFO_SCALE)) {
+ ret = iio_read_channel_scale(rescale->source, &scale, &scale2);
+ return rescale_process_offset(rescale, ret, scale, scale2,
+ schan_off, val, val2);
+ }
+
+ /*
+ * If we get here we have no scale so scale 1:1 but apply
+ * rescaler and offset, if any.
+ */
+ return rescale_process_offset(rescale, IIO_VAL_FRACTIONAL, 1, 1,
schan_off, val, val2);
default:
return -EINVAL;
@@ -280,8 +290,9 @@ static int rescale_configure_channel(struct device *dev,
chan->type = rescale->cfg->type;
if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) &&
- iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE)) {
- dev_info(dev, "using raw+scale source channel\n");
+ (iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE) ||
+ iio_channel_has_info(schan, IIO_CHAN_INFO_OFFSET))) {
+ dev_info(dev, "using raw+scale/offset source channel\n");
} else if (iio_channel_has_info(schan, IIO_CHAN_INFO_PROCESSED)) {
dev_info(dev, "using processed channel\n");
rescale->chan_processed = true;
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: exynos-adc: request second interrupt only when touchscreen mode is used
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week).
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 865b080e3229102f160889328ce2e8e97aa65ea0 Mon Sep 17 00:00:00 2001
From: Marek Szyprowski <m.szyprowski(a)samsung.com>
Date: Mon, 9 Oct 2023 12:14:12 +0200
Subject: iio: exynos-adc: request second interrupt only when touchscreen mode
is used
The second interrupt is needed only when touchscreen mode is used, so don't
request it unconditionally. This removes the following annoying warning
during boot:
exynos-adc 14d10000.adc: error -ENXIO: IRQ index 1 not found
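A condensed, standalone sketch of the gating the patch introduces (stub
functions stand in for the kernel APIs; the error handling for irq < 0
and -EPROBE_DEFER is trimmed). The key point is ordering: has_ts must be
known before the second IRQ lookup, which is why the
IS_REACHABLE(CONFIG_INPUT) block moves earlier in probe.

#include <stdbool.h>

/* illustrative stand-ins for the kernel APIs used in the hunks below */
static bool of_property_read_bool_stub(const char *prop)
{
        (void)prop;
        return true;            /* pretend "has-touchscreen" is present */
}

static int platform_get_irq_stub(int index)
{
        return 100 + index;     /* pretend both IRQs exist */
}

struct adc_info {
        int irq;
        int tsirq;
};

static void setup_irqs(struct adc_info *info, bool input_reachable,
                       bool have_pdata)
{
        bool has_ts = false;

        /* IS_REACHABLE(CONFIG_INPUT) in the real code */
        if (input_reachable)
                has_ts = of_property_read_bool_stub("has-touchscreen") ||
                         have_pdata;

        info->irq = platform_get_irq_stub(0);   /* always required */
        info->tsirq = has_ts ? platform_get_irq_stub(1) : -1;
}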
Fixes: 2bb8ad9b44c5 ("iio: exynos-adc: add experimental touchscreen support")
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Link: https://lore.kernel.org/r/20231009101412.916922-1-m.szyprowski@samsung.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/exynos_adc.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/iio/adc/exynos_adc.c b/drivers/iio/adc/exynos_adc.c
index cff1ba57fb16..43c8af41b4a9 100644
--- a/drivers/iio/adc/exynos_adc.c
+++ b/drivers/iio/adc/exynos_adc.c
@@ -826,16 +826,26 @@ static int exynos_adc_probe(struct platform_device *pdev)
}
}
+ /* leave out any TS related code if unreachable */
+ if (IS_REACHABLE(CONFIG_INPUT)) {
+ has_ts = of_property_read_bool(pdev->dev.of_node,
+ "has-touchscreen") || pdata;
+ }
+
irq = platform_get_irq(pdev, 0);
if (irq < 0)
return irq;
info->irq = irq;
- irq = platform_get_irq(pdev, 1);
- if (irq == -EPROBE_DEFER)
- return irq;
+ if (has_ts) {
+ irq = platform_get_irq(pdev, 1);
+ if (irq == -EPROBE_DEFER)
+ return irq;
- info->tsirq = irq;
+ info->tsirq = irq;
+ } else {
+ info->tsirq = -1;
+ }
info->dev = &pdev->dev;
@@ -900,12 +910,6 @@ static int exynos_adc_probe(struct platform_device *pdev)
if (info->data->init_hw)
info->data->init_hw(info);
- /* leave out any TS related code if unreachable */
- if (IS_REACHABLE(CONFIG_INPUT)) {
- has_ts = of_property_read_bool(pdev->dev.of_node,
- "has-touchscreen") || pdata;
- }
-
if (pdata)
info->delay = pdata->delay;
else
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: adc: xilinx-xadc: Correct temperature offset/scale for UltraScale
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week).
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From e2bd8c28b9bd835077eb65715d416d667694a80d Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 14 Sep 2023 18:10:19 -0600
Subject: iio: adc: xilinx-xadc: Correct temperature offset/scale for
UltraScale
The driver was previously using offset and scale values for the
temperature sensor readings that were only valid for 7-series devices.
Add per-device-type values for offset and scale and set them
appropriately for each device type.
Note that the values used for the UltraScale family are those for
UltraScale+ (i.e. the SYSMONE4 primitive) using the internal reference,
as that seems to be the most common configuration, and the device tree
values produced by Xilinx's device tree generator don't seem to tell us
which configuration is actually in use. However, the differences within
the UltraScale family seem fairly minor, and these values are in any
case closer than the 7-series ones.
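As a quick sanity check on the constants, a standalone sketch of the two
transfer functions they encode (a 16-bit raw width and the sample value
are illustrative assumptions):

#include <stdio.h>

/* T(C) = raw * (temp_scale / 1000) / 2^bits - temp_offset / 1000 */
static double xadc_temp(unsigned int raw, int bits,
                        long temp_scale, long temp_offset)
{
        return (double)raw * temp_scale / (1UL << bits) / 1000.0 -
               temp_offset / 1000.0;
}

int main(void)
{
        unsigned int raw = 0x9000;      /* arbitrary 16-bit sample */

        printf("7-series:    %.2f C\n", xadc_temp(raw, 16, 503975, 273150));
        printf("UltraScale+: %.2f C\n", xadc_temp(raw, 16, 509314, 280231));
        return 0;
}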
Fixes: c2b7720a7905 ("iio: xilinx-xadc: Add basic support for Ultrascale System Monitor")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Acked-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Tested-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Link: https://lore.kernel.org/r/20230915001019.2862964-3-robert.hancock@calian.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/xilinx-xadc-core.c | 17 ++++++++++++++---
drivers/iio/adc/xilinx-xadc.h | 2 ++
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c
index d4d0d184a172..564c0cad0fc7 100644
--- a/drivers/iio/adc/xilinx-xadc-core.c
+++ b/drivers/iio/adc/xilinx-xadc-core.c
@@ -456,6 +456,9 @@ static const struct xadc_ops xadc_zynq_ops = {
.interrupt_handler = xadc_zynq_interrupt_handler,
.update_alarm = xadc_zynq_update_alarm,
.type = XADC_TYPE_S7,
+ /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
+ .temp_scale = 503975,
+ .temp_offset = 273150,
};
static const unsigned int xadc_axi_reg_offsets[] = {
@@ -566,6 +569,9 @@ static const struct xadc_ops xadc_7s_axi_ops = {
.interrupt_handler = xadc_axi_interrupt_handler,
.flags = XADC_FLAGS_BUFFERED | XADC_FLAGS_IRQ_OPTIONAL,
.type = XADC_TYPE_S7,
+ /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
+ .temp_scale = 503975,
+ .temp_offset = 273150,
};
static const struct xadc_ops xadc_us_axi_ops = {
@@ -577,6 +583,12 @@ static const struct xadc_ops xadc_us_axi_ops = {
.interrupt_handler = xadc_axi_interrupt_handler,
.flags = XADC_FLAGS_BUFFERED | XADC_FLAGS_IRQ_OPTIONAL,
.type = XADC_TYPE_US,
+ /**
+ * Values below are for UltraScale+ (SYSMONE4) using internal reference.
+ * See https://docs.xilinx.com/v/u/en-US/ug580-ultrascale-sysmon
+ */
+ .temp_scale = 509314,
+ .temp_offset = 280231,
};
static int _xadc_update_adc_reg(struct xadc *xadc, unsigned int reg,
@@ -945,8 +957,7 @@ static int xadc_read_raw(struct iio_dev *indio_dev,
*val2 = bits;
return IIO_VAL_FRACTIONAL_LOG2;
case IIO_TEMP:
- /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
- *val = 503975;
+ *val = xadc->ops->temp_scale;
*val2 = bits;
return IIO_VAL_FRACTIONAL_LOG2;
default:
@@ -954,7 +965,7 @@ static int xadc_read_raw(struct iio_dev *indio_dev,
}
case IIO_CHAN_INFO_OFFSET:
/* Only the temperature channel has an offset */
- *val = -((273150 << bits) / 503975);
+ *val = -((xadc->ops->temp_offset << bits) / xadc->ops->temp_scale);
return IIO_VAL_INT;
case IIO_CHAN_INFO_SAMP_FREQ:
ret = xadc_read_samplerate(xadc);
diff --git a/drivers/iio/adc/xilinx-xadc.h b/drivers/iio/adc/xilinx-xadc.h
index 7d78ce698967..3036f4d613ff 100644
--- a/drivers/iio/adc/xilinx-xadc.h
+++ b/drivers/iio/adc/xilinx-xadc.h
@@ -85,6 +85,8 @@ struct xadc_ops {
unsigned int flags;
enum xadc_type type;
+ int temp_scale;
+ int temp_offset;
};
static inline int _xadc_read_adc_reg(struct xadc *xadc, unsigned int reg,
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week).
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 8d6b3ea4d9eaca80982442b68a292ce50ce0a135 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 14 Sep 2023 18:10:18 -0600
Subject: iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature
thresholds
In the probe function, the driver was reading out the thresholds already
set in the core, which can be configured by the user in the Vivado tools
when the FPGA image is built. However, it later clobbered those values
with zero or maximum values. In particular, the overtemperature shutdown
threshold register was overwritten with the max value, which effectively
prevents the FPGA from shutting down when the desired threshold is
reached, potentially risking hardware damage.
Remove this code to leave the preconfigured default threshold values
intact.
The code was also disabling all alarms regardless of what enable state
they were left in by the FPGA image, including the overtemperature
shutdown feature. Leave these bits in their original state so they are
not unconditionally disabled.
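For reference, a small standalone sketch reproducing the threshold
pattern the deleted loop imposed (predicate copied from the removed
hunk; the printing is purely illustrative):

#include <stdio.h>

int main(void)
{
        int i;

        /* same predicate as the removed loop: indices 0-3, 7 and 8-11
         * (max voltage and both temperature thresholds) were forced to
         * 0xffff; the rest (min voltage thresholds) to 0 */
        for (i = 0; i < 16; i++)
                printf("threshold[%2d] = %s\n", i,
                       (i % 8 < 4 || i == 7) ? "0xffff" : "0x0000");
        return 0;
}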
Fixes: bdc8cda1d010 ("iio:adc: Add Xilinx XADC driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Acked-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Tested-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Link: https://lore.kernel.org/r/20230915001019.2862964-2-robert.hancock@calian.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/xilinx-xadc-core.c | 22 ----------------------
1 file changed, 22 deletions(-)
diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c
index dba73300f894..d4d0d184a172 100644
--- a/drivers/iio/adc/xilinx-xadc-core.c
+++ b/drivers/iio/adc/xilinx-xadc-core.c
@@ -1423,28 +1423,6 @@ static int xadc_probe(struct platform_device *pdev)
if (ret)
return ret;
- /* Disable all alarms */
- ret = xadc_update_adc_reg(xadc, XADC_REG_CONF1, XADC_CONF1_ALARM_MASK,
- XADC_CONF1_ALARM_MASK);
- if (ret)
- return ret;
-
- /* Set thresholds to min/max */
- for (i = 0; i < 16; i++) {
- /*
- * Set max voltage threshold and both temperature thresholds to
- * 0xffff, min voltage threshold to 0.
- */
- if (i % 8 < 4 || i == 7)
- xadc->threshold[i] = 0xffff;
- else
- xadc->threshold[i] = 0;
- ret = xadc_write_adc_reg(xadc, XADC_REG_THRESHOLD(i),
- xadc->threshold[i]);
- if (ret)
- return ret;
- }
-
/* Go to non-buffered mode */
xadc_postdisable(indio_dev);
--
2.42.0
DisplayPort Alt Mode CTS test 10.3.8 states that both sides of the
connection shall be compatible with one another such that the connection
is not Source to Source or Sink to Sink.
The DisplayPort driver currently only checks for a compatible pin
configuration that resolves into a source and sink combination. The CTS
test sends a Discover Modes message that has a compatible pin
configuration but advertises the same port capability as the device;
the current check accepts this combination and therefore fails the test.
Verify that the port and port partner resolve into a valid source and sink
combination before checking for a compatible pin configuration.
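A standalone sketch of the resolution rule being added (the IS_UFP_D and
IS_DFP_D macros are those from the typec_dp.h hunk below; GENMASK(1, 0)
is written as a literal 3 here):

#include <stdio.h>

#define DP_CAP_UFP_D                    1
#define DP_CAP_DFP_D                    2
#define DP_CAP_CAPABILITY(_cap_)        ((_cap_) & 3)   /* bits 1:0 */
#define DP_CAP_IS_UFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_UFP_D))
#define DP_CAP_IS_DFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_DFP_D))

/* valid only if one end can be DFP_D while the other can be UFP_D */
static int resolves(unsigned int port, unsigned int partner)
{
        return (DP_CAP_IS_DFP_D(port) && DP_CAP_IS_UFP_D(partner)) ||
               (DP_CAP_IS_UFP_D(port) && DP_CAP_IS_DFP_D(partner));
}

int main(void)
{
        printf("DFP_D vs DFP_D: %d\n", resolves(2, 2)); /* 0 -> -ENODEV */
        printf("DFP_D vs UFP_D: %d\n", resolves(2, 1)); /* 1 */
        printf("both  vs both : %d\n", resolves(3, 3)); /* 1 */
        return 0;
}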
---
Changes since v1:
* Fixed styling errors
* Added DP_CAP_IS_UFP_D and DP_CAP_IS_DFP_D as macros to typec_dp.h
---
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
---
drivers/usb/typec/altmodes/displayport.c | 5 +++++
include/linux/usb/typec_dp.h | 2 ++
2 files changed, 7 insertions(+)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 718da02036d8..9c17955da570 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -578,6 +578,11 @@ int dp_altmode_probe(struct typec_altmode *alt)
/* FIXME: Port can only be DFP_U. */
+ /* Make sure that the port and partner can resolve into source and sink */
+ if (!(DP_CAP_IS_DFP_D(port->vdo) && DP_CAP_IS_UFP_D(alt->vdo)) &&
+ !(DP_CAP_IS_UFP_D(port->vdo) && DP_CAP_IS_DFP_D(alt->vdo)))
+ return -ENODEV;
+
/* Make sure we have compatiple pin configurations */
if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) &
DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) &&
diff --git a/include/linux/usb/typec_dp.h b/include/linux/usb/typec_dp.h
index 1f358098522d..4e6c0479307f 100644
--- a/include/linux/usb/typec_dp.h
+++ b/include/linux/usb/typec_dp.h
@@ -67,6 +67,8 @@ enum {
#define DP_CAP_UFP_D 1
#define DP_CAP_DFP_D 2
#define DP_CAP_DFP_D_AND_UFP_D 3
+#define DP_CAP_IS_UFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_UFP_D))
+#define DP_CAP_IS_DFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_DFP_D))
#define DP_CAP_DP_SIGNALLING(_cap_) (((_cap_) & GENMASK(5, 2)) >> 2)
#define DP_CAP_SIGNALLING_HBR3 1
#define DP_CAP_SIGNALLING_UHBR10 2
base-commit: 5220d8b04a840fa09434072c866d032b163419e3
--
2.42.0.655.g421f12c284-goog
On 21/10/2023 02:23, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> ice: Remove useless DMA-32 fallback configuration
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> ice-remove-useless-dma-32-fallback-configuration.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Why is this needed as a backport? It only removes dead code.
Does another patch depend on it?
Looking *quickly* at the other patches in [1], I haven't seen anything
that conflicts.
CJ
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/com…
>
>
> commit 6f77725babf0559f90f19df76ff71f7807dff67f
> Author: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
> Date: Sun Jan 9 19:25:05 2022 +0100
>
> ice: Remove useless DMA-32 fallback configuration
>
> [ Upstream commit 9c3e54a632637f27d98fb0ec0c44f7039925809d ]
>
> As stated in [1], dma_set_mask() with a 64-bit mask never fails if
> dev->dma_mask is non-NULL.
> So, if it fails, the 32 bits case will also fail for the same reason.
>
> Simplify code and remove some dead code accordingly.
>
> [1]: https://lkml.org/lkml/2021/6/7/398
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
> Reviewed-by: Christoph Hellwig <hch(a)lst.de>
> Reviewed-by: Alexander Lobakin <alexandr.lobakin(a)intel.com>
> Tested-by: Gurucharan G <gurucharanx.g(a)intel.com>
> Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
> Stable-dep-of: 0288c3e709e5 ("ice: reset first in crash dump kernels")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 691c4320b6b1d..4aad089ea1f5d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -4292,8 +4292,6 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>
> /* set up for high or low DMA */
> err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
> - if (err)
> - err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> if (err) {
> dev_err(dev, "DMA configuration failed: 0x%x\n", err);
> return err;
[ Upstream commit 0df9dab891ff0d9b646d82e4fe038229e4c02451 ]
Stop zapping invalidated TDP MMU roots via work queue now that KVM
preserves TDP MMU roots until they are explicitly invalidated. Zapping
roots asynchronously was effectively a workaround to avoid stalling a vCPU
for an extended duration if a vCPU unloaded a root, which at the time
happened whenever the guest toggled CR0.WP (a frequent operation for some
guest kernels).
While a clever hack, zapping roots via an unbound worker had subtle,
unintended consequences on host scheduling, especially when zapping
multiple roots, e.g. as part of a memslot deletion. Because the work of zapping a
root is no longer bound to the task that initiated the zap, things like
the CPU affinity and priority of the original task get lost. Losing the
affinity and priority can be especially problematic if unbound workqueues
aren't affined to a small number of CPUs, as zapping multiple roots can
cause KVM to heavily utilize the majority of CPUs in the system, *beyond*
the CPUs KVM is already using to run vCPUs.
When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
zap can result in KVM occupying all logical CPUs for ~8ms, and result in
high-priority tasks not being scheduled in a timely manner. In v5.15,
which doesn't preserve unloaded roots, the issues were even more noticeable
as KVM would zap roots more frequently and could occupy all CPUs for 50ms+.
Consuming all CPUs for an extended duration can lead to significant jitter
throughout the system, e.g. on ChromeOS with virtio-gpu, deleting memslots
is a semi-frequent operation as memslots are deleted and recreated with
different host virtual addresses to react to host GPU drivers allocating
and freeing GPU blobs. On ChromeOS, the jitter manifests as audio blips
during games due to the audio server's tasks not getting scheduled in
promptly, despite the tasks having a high realtime priority.
Deleting memslots isn't exactly a fast path and should be avoided when
possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid the
memslot shenanigans, but KVM is squarely in the wrong. Not to mention
that removing the async zapping eliminates a non-trivial amount of
complexity.
Note, one of the subtle behaviors hidden behind the async zapping is that
KVM would zap invalidated roots only once (ignoring partial zaps from
things like mmu_notifier events). Preserve this behavior by adding a flag
to identify roots that are scheduled to be zapped versus roots that have
already been zapped but not yet freed.
Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
encounter invalid roots, as it's not at all obvious why zapping
invalidated roots shouldn't simply zap all invalid roots.
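A stripped-down, standalone sketch of the once-only zap pattern the new
flag provides (field name from the patch; locking, list walking and the
real zap/put calls are elided):

#include <stdbool.h>
#include <stdio.h>

struct root {
        bool invalid;                           /* role.invalid */
        bool tdp_mmu_scheduled_root_to_zap;     /* new flag from the patch */
};

/* invalidation pass: flags each still-valid root exactly once */
static void invalidate_root(struct root *r)
{
        if (!r->invalid) {
                r->tdp_mmu_scheduled_root_to_zap = true;
                r->invalid = true;
        }
}

/* zap pass: only flagged roots get zapped */
static void zap_invalidated_root(struct root *r)
{
        if (!r->tdp_mmu_scheduled_root_to_zap)
                return;
        r->tdp_mmu_scheduled_root_to_zap = false;
        printf("zap root, then put the TDP MMU's reference\n");
}

int main(void)
{
        struct root r = { 0 };

        invalidate_root(&r);
        zap_invalidated_root(&r);       /* zaps once */
        zap_invalidated_root(&r);       /* no-op: already zapped */
        return 0;
}

Roots that are invalid but no longer flagged were zapped by an earlier
pass and are only awaiting their final reference drop, which is exactly
the case the new comment in kvm_tdp_mmu_invalidate_all_roots() calls out.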
Reported-by: Pattara Teerapong <pteerapong(a)google.com>
Cc: David Stevens <stevensd(a)google.com>
Cc: Yiwei Zhang <zzyiwei(a)google.com>
Cc: Paul Hsia <paulhsia(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230916003916.2545000-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: David Matlack <dmatlack(a)google.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
---
Folks on Cc, it would be nice to get extra testing and/or reviews for this one
before it's picked up for 6.1, there were quite a few conflicts to resolve.
All of the conflicts were pretty straightforward, but I'd still appreciate an
extra set of eyeballs or three. Thanks!
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/mmu/mmu.c | 9 +--
arch/x86/kvm/mmu/mmu_internal.h | 15 ++--
arch/x86/kvm/mmu/tdp_mmu.c | 135 +++++++++++++-------------------
arch/x86/kvm/mmu/tdp_mmu.h | 4 +-
arch/x86/kvm/x86.c | 5 +-
6 files changed, 69 insertions(+), 102 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 08a84f801bfe..c1dcaa3d2d6e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1324,7 +1324,6 @@ struct kvm_arch {
* the thread holds the MMU lock in write mode.
*/
spinlock_t tdp_mmu_pages_lock;
- struct workqueue_struct *tdp_mmu_zap_wq;
#endif /* CONFIG_X86_64 */
/*
@@ -1727,7 +1726,7 @@ void kvm_mmu_vendor_module_exit(void);
void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
int kvm_mmu_create(struct kvm_vcpu *vcpu);
-int kvm_mmu_init_vm(struct kvm *kvm);
+void kvm_mmu_init_vm(struct kvm *kvm);
void kvm_mmu_uninit_vm(struct kvm *kvm);
void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2a6fec4e2d19..d30325e297a0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5994,19 +5994,16 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
kvm_mmu_zap_all_fast(kvm);
}
-int kvm_mmu_init_vm(struct kvm *kvm)
+void kvm_mmu_init_vm(struct kvm *kvm)
{
struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
- int r;
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
- r = kvm_mmu_init_tdp_mmu(kvm);
- if (r < 0)
- return r;
+ kvm_mmu_init_tdp_mmu(kvm);
node->track_write = kvm_mmu_pte_write;
node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
@@ -6019,8 +6016,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache;
kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO;
-
- return 0;
}
static void mmu_free_vm_memory_caches(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..0a9d5f2925c3 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -56,7 +56,12 @@ struct kvm_mmu_page {
bool tdp_mmu_page;
bool unsync;
- u8 mmu_valid_gen;
+ union {
+ u8 mmu_valid_gen;
+
+ /* Only accessed under slots_lock. */
+ bool tdp_mmu_scheduled_root_to_zap;
+ };
bool lpage_disallowed; /* Can't be replaced by an equiv large page */
/*
@@ -92,13 +97,7 @@ struct kvm_mmu_page {
struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
tdp_ptep_t ptep;
};
- union {
- DECLARE_BITMAP(unsync_child_bitmap, 512);
- struct {
- struct work_struct tdp_mmu_async_work;
- void *tdp_mmu_async_data;
- };
- };
+ DECLARE_BITMAP(unsync_child_bitmap, 512);
struct list_head lpage_disallowed_link;
#ifdef CONFIG_X86_32
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9b9fc4e834d0..c3b0f973375b 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -14,24 +14,16 @@ static bool __read_mostly tdp_mmu_enabled = true;
module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
/* Initializes the TDP MMU for the VM, if enabled. */
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm)
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
{
- struct workqueue_struct *wq;
-
if (!tdp_enabled || !READ_ONCE(tdp_mmu_enabled))
- return 0;
-
- wq = alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0);
- if (!wq)
- return -ENOMEM;
+ return;
/* This should not be changed for the lifetime of the VM. */
kvm->arch.tdp_mmu_enabled = true;
INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
- kvm->arch.tdp_mmu_zap_wq = wq;
- return 1;
}
/* Arbitrarily returns true so that this may be used in if statements. */
@@ -57,20 +49,15 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
* ultimately frees all roots.
*/
kvm_tdp_mmu_invalidate_all_roots(kvm);
-
- /*
- * Destroying a workqueue also first flushes the workqueue, i.e. no
- * need to invoke kvm_tdp_mmu_zap_invalidated_roots().
- */
- destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ kvm_tdp_mmu_zap_invalidated_roots(kvm);
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
/*
* Ensure that all the outstanding RCU callbacks to free shadow pages
- * can run before the VM is torn down. Work items on tdp_mmu_zap_wq
- * can call kvm_tdp_mmu_put_root and create new callbacks.
+ * can run before the VM is torn down. Putting the last reference to
+ * zapped roots will create new callbacks.
*/
rcu_barrier();
}
@@ -97,46 +84,6 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
tdp_mmu_free_sp(sp);
}
-static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
- bool shared);
-
-static void tdp_mmu_zap_root_work(struct work_struct *work)
-{
- struct kvm_mmu_page *root = container_of(work, struct kvm_mmu_page,
- tdp_mmu_async_work);
- struct kvm *kvm = root->tdp_mmu_async_data;
-
- read_lock(&kvm->mmu_lock);
-
- /*
- * A TLB flush is not necessary as KVM performs a local TLB flush when
- * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
- * to a different pCPU. Note, the local TLB flush on reuse also
- * invalidates any paging-structure-cache entries, i.e. TLB entries for
- * intermediate paging structures, that may be zapped, as such entries
- * are associated with the ASID on both VMX and SVM.
- */
- tdp_mmu_zap_root(kvm, root, true);
-
- /*
- * Drop the refcount using kvm_tdp_mmu_put_root() to test its logic for
- * avoiding an infinite loop. By design, the root is reachable while
- * it's being asynchronously zapped, thus a different task can put its
- * last reference, i.e. flowing through kvm_tdp_mmu_put_root() for an
- * asynchronously zapped root is unavoidable.
- */
- kvm_tdp_mmu_put_root(kvm, root, true);
-
- read_unlock(&kvm->mmu_lock);
-}
-
-static void tdp_mmu_schedule_zap_root(struct kvm *kvm, struct kvm_mmu_page *root)
-{
- root->tdp_mmu_async_data = kvm;
- INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
- queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
-}
-
void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
bool shared)
{
@@ -222,11 +169,11 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
#define for_each_valid_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared) \
__for_each_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared, true)
-#define for_each_tdp_mmu_root_yield_safe(_kvm, _root) \
- for (_root = tdp_mmu_next_root(_kvm, NULL, false, false); \
+#define for_each_tdp_mmu_root_yield_safe(_kvm, _root, _shared) \
+ for (_root = tdp_mmu_next_root(_kvm, NULL, _shared, false); \
_root; \
- _root = tdp_mmu_next_root(_kvm, _root, false, false)) \
- if (!kvm_lockdep_assert_mmu_lock_held(_kvm, false)) { \
+ _root = tdp_mmu_next_root(_kvm, _root, _shared, false)) \
+ if (!kvm_lockdep_assert_mmu_lock_held(_kvm, _shared)) { \
} else
/*
@@ -305,7 +252,7 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
* by a memslot update or by the destruction of the VM. Initialize the
* refcount to two; one reference for the vCPU, and one reference for
* the TDP MMU itself, which is held until the root is invalidated and
- * is ultimately put by tdp_mmu_zap_root_work().
+ * is ultimately put by kvm_tdp_mmu_zap_invalidated_roots().
*/
refcount_set(&root->tdp_mmu_root_count, 2);
@@ -963,7 +910,7 @@ bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, gfn_t start, gfn_t end, bool flush)
{
struct kvm_mmu_page *root;
- for_each_tdp_mmu_root_yield_safe(kvm, root)
+ for_each_tdp_mmu_root_yield_safe(kvm, root, false)
flush = tdp_mmu_zap_leafs(kvm, root, start, end, true, flush);
return flush;
@@ -985,7 +932,7 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
* is being destroyed or the userspace VMM has exited. In both cases,
* KVM_RUN is unreachable, i.e. no vCPUs will ever service the request.
*/
- for_each_tdp_mmu_root_yield_safe(kvm, root)
+ for_each_tdp_mmu_root_yield_safe(kvm, root, false)
tdp_mmu_zap_root(kvm, root, false);
}
@@ -995,18 +942,47 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
*/
void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
{
- flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ struct kvm_mmu_page *root;
+
+ read_lock(&kvm->mmu_lock);
+
+ for_each_tdp_mmu_root_yield_safe(kvm, root, true) {
+ if (!root->tdp_mmu_scheduled_root_to_zap)
+ continue;
+
+ root->tdp_mmu_scheduled_root_to_zap = false;
+ KVM_BUG_ON(!root->role.invalid, kvm);
+
+ /*
+ * A TLB flush is not necessary as KVM performs a local TLB
+ * flush when allocating a new root (see kvm_mmu_load()), and
+ * when migrating a vCPU to a different pCPU. Note, the local
+ * TLB flush on reuse also invalidates paging-structure-cache
+ * entries, i.e. TLB entries for intermediate paging structures,
+ * that may be zapped, as such entries are associated with the
+ * ASID on both VMX and SVM.
+ */
+ tdp_mmu_zap_root(kvm, root, true);
+
+ /*
+ * The referenced needs to be put *after* zapping the root, as
+ * the root must be reachable by mmu_notifiers while it's being
+ * zapped
+ */
+ kvm_tdp_mmu_put_root(kvm, root, true);
+ }
+
+ read_unlock(&kvm->mmu_lock);
}
/*
* Mark each TDP MMU root as invalid to prevent vCPUs from reusing a root that
* is about to be zapped, e.g. in response to a memslots update. The actual
- * zapping is performed asynchronously. Using a separate workqueue makes it
- * easy to ensure that the destruction is performed before the "fast zap"
- * completes, without keeping a separate list of invalidated roots; the list is
- * effectively the list of work items in the workqueue.
+ * zapping is done separately so that it happens with mmu_lock with read,
+ * whereas invalidating roots must be done with mmu_lock held for write (unless
+ * the VM is being destroyed).
*
- * Note, the asynchronous worker is gifted the TDP MMU's reference.
+ * Note, kvm_tdp_mmu_zap_invalidated_roots() is gifted the TDP MMU's reference.
* See kvm_tdp_mmu_get_vcpu_root_hpa().
*/
void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
@@ -1031,19 +1007,20 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
/*
* As above, mmu_lock isn't held when destroying the VM! There can't
* be other references to @kvm, i.e. nothing else can invalidate roots
- * or be consuming roots, but walking the list of roots does need to be
- * guarded against roots being deleted by the asynchronous zap worker.
+ * or get/put references to roots.
*/
- rcu_read_lock();
-
- list_for_each_entry_rcu(root, &kvm->arch.tdp_mmu_roots, link) {
+ list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
+ /*
+ * Note, invalid roots can outlive a memslot update! Invalid
+ * roots must be *zapped* before the memslot update completes,
+ * but a different task can acquire a reference and keep the
+ * root alive after its been zapped.
+ */
if (!root->role.invalid) {
+ root->tdp_mmu_scheduled_root_to_zap = true;
root->role.invalid = true;
- tdp_mmu_schedule_zap_root(kvm, root);
}
}
-
- rcu_read_unlock();
}
/*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index d0a9fe0770fd..c82a8bb321bb 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -65,7 +65,7 @@ u64 *kvm_tdp_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, u64 addr,
u64 *spte);
#ifdef CONFIG_X86_64
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm);
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return sp->tdp_mmu_page; }
@@ -86,7 +86,7 @@ static inline bool is_tdp_mmu(struct kvm_mmu *mmu)
return sp && is_tdp_mmu_page(sp) && sp->root_count;
}
#else
-static inline int kvm_mmu_init_tdp_mmu(struct kvm *kvm) { return 0; }
+static inline void kvm_mmu_init_tdp_mmu(struct kvm *kvm) {}
static inline void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) {}
static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return false; }
static inline bool is_tdp_mmu(struct kvm_mmu *mmu) { return false; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1931d3fcbbe0..b929254c7876 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12442,9 +12442,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (ret)
goto out;
- ret = kvm_mmu_init_vm(kvm);
- if (ret)
- goto out_page_track;
+ kvm_mmu_init_vm(kvm);
ret = static_call(kvm_x86_vm_init)(kvm);
if (ret)
@@ -12489,7 +12487,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
out_uninit_mmu:
kvm_mmu_uninit_vm(kvm);
-out_page_track:
kvm_page_track_cleanup(kvm);
out:
return ret;
base-commit: adc4d740ad9ec780657327c69ab966fa4fdf0e8e
--
2.42.0.655.g421f12c284-goog
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102147-granola-preschool-1418@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
A user can request more metadata bytes than the device will write. Ensure
the kernel buffer is initialized so we're not leaking unsanitized memory
on the copy-out.
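A userspace-flavoured sketch of the rule being enforced (malloc/memcpy
stand in for the kernel allocation and copy_from_user(); the length is
illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* to_device != 0 models REQ_OP_DRV_OUT (user -> device) */
static void *bounce_meta(const void *ubuf, size_t len, int to_device)
{
        void *buf = malloc(len);

        if (!buf)
                return NULL;
        if (to_device)
                memcpy(buf, ubuf, len); /* stand-in for copy_from_user() */
        else
                memset(buf, 0, len);    /* reads: never expose stale memory */
        return buf;
}

int main(void)
{
        char umeta[16] = { 0 };
        void *buf = bounce_meta(umeta, sizeof(umeta), 0);

        printf("read buffer zeroed: %d\n", buf && ((char *)buf)[0] == 0);
        free(buf);
        return 0;
}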
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102146-postwar-smugly-830b@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
A user can request more metadata bytes than the device will write. Ensure
the kernel buffer is initialized so we're not leaking unsanitized memory
on the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {