Fix the PCI comment for the IS-200 card. The PCI ID for the IS-200
is 0x0d80, and the definition used (PCI_DEVICE_ID_INTASHIELD_IS200)
is indeed 0x0d80, clarify that by fixing the comment as its
neighbouring cards are all at 0x0020 offsets.
Fixes: 737c17561fb2 ("[SERIAL] Support for Intashield 2 port PCI serial card")
Cc: stable(a)vger.kernel.org
Signed-off-by: Cameron Williams <cang1(a)live.co.uk>
---
I argue for fixing this rather than removing due to this patch series (and
the code already in the kernel) referring to the rest of the cards in
the manufacturer's product line by hex ID, makes sense to me to
have the hex IDs all displayed correctly one way or another in the
one driver as the IS-200 and 400 are the only cards to use a definition instead.
v3 - v4:
Added Fixes: and Cc: tag
Added above note for reviewer.
v2 - v3:
Clarify commit message with better explanation of the change.
Re-submit patch series using git send-email to make threading work.
v1 - v2:
This is a resubmission series for the patch series below. That series
was lots of changes sent to lots of maintainers, this series is just for
the tty/serial/8250 subsystem.
[1] https://lore.kernel.org/all/DU0PR02MB789950E64D808DB57E9D7312C4F8A@DU0PR02M…
[2] https://lore.kernel.org/all/DU0PR02MB7899DE53DFC900EFB50E53F2C4F8A@DU0PR02M…
[3] https://lore.kernel.org/all/DU0PR02MB7899033E7E81EAF3694BC20AC4F8A@DU0PR02M…
[4] https://lore.kernel.org/all/DU0PR02MB7899EABA8C3DCAC94DCC79D4C4F8A@DU0PR02M…
drivers/tty/serial/8250/8250_pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 62a9bd30b4db..ecb4e9acc70d 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -4917,7 +4917,7 @@ static const struct pci_device_id serial_pci_tbl[] = {
* IntaShield IS-200
*/
{ PCI_VENDOR_ID_INTASHIELD, PCI_DEVICE_ID_INTASHIELD_IS200,
- PCI_ANY_ID, PCI_ANY_ID, 0, 0, /* 135a.0811 */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, /* 135a.0d80 */
pbn_b2_2_115200 },
/*
* IntaShield IS-400
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: afe: rescale: Accept only offset channels
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From bee448390e5166d019e9e037194d487ee94399d9 Mon Sep 17 00:00:00 2001
From: Linus Walleij <linus.walleij(a)linaro.org>
Date: Sat, 2 Sep 2023 21:46:20 +0200
Subject: iio: afe: rescale: Accept only offset channels
As noted by Jonathan Cameron: it is perfectly legal for a channel
to have an offset but no scale in addition to the raw interface.
The conversion will imply that scale is 1:1.
Make rescale_configure_channel() accept just scale, or just offset
to process a channel.
When a user asks for IIO_CHAN_INFO_OFFSET in rescale_read_raw()
we now have to deal with the fact that OFFSET could be present
but SCALE missing. Add code to simply scale 1:1 in this case.
Link: https://lore.kernel.org/linux-iio/CACRpkdZXBjHU4t-GVOCFxRO-AHGxKnxMeHD2s4Y4…
Fixes: 53ebee949980 ("iio: afe: iio-rescale: Support processed channels")
Fixes: 9decacd8b3a4 ("iio: afe: rescale: Fix boolean logic bug")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Peter Rosin <peda(a)axentia.se>
Link: https://lore.kernel.org/r/20230902-iio-rescale-only-offset-v2-1-988b807754c…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/afe/iio-rescale.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/afe/iio-rescale.c b/drivers/iio/afe/iio-rescale.c
index 1f280c360701..56e5913ab82d 100644
--- a/drivers/iio/afe/iio-rescale.c
+++ b/drivers/iio/afe/iio-rescale.c
@@ -214,8 +214,18 @@ static int rescale_read_raw(struct iio_dev *indio_dev,
return ret < 0 ? ret : -EOPNOTSUPP;
}
- ret = iio_read_channel_scale(rescale->source, &scale, &scale2);
- return rescale_process_offset(rescale, ret, scale, scale2,
+ if (iio_channel_has_info(rescale->source->channel,
+ IIO_CHAN_INFO_SCALE)) {
+ ret = iio_read_channel_scale(rescale->source, &scale, &scale2);
+ return rescale_process_offset(rescale, ret, scale, scale2,
+ schan_off, val, val2);
+ }
+
+ /*
+ * If we get here we have no scale so scale 1:1 but apply
+ * rescaler and offset, if any.
+ */
+ return rescale_process_offset(rescale, IIO_VAL_FRACTIONAL, 1, 1,
schan_off, val, val2);
default:
return -EINVAL;
@@ -280,8 +290,9 @@ static int rescale_configure_channel(struct device *dev,
chan->type = rescale->cfg->type;
if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) &&
- iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE)) {
- dev_info(dev, "using raw+scale source channel\n");
+ (iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE) ||
+ iio_channel_has_info(schan, IIO_CHAN_INFO_OFFSET))) {
+ dev_info(dev, "using raw+scale/offset source channel\n");
} else if (iio_channel_has_info(schan, IIO_CHAN_INFO_PROCESSED)) {
dev_info(dev, "using processed channel\n");
rescale->chan_processed = true;
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: exynos-adc: request second interupt only when touchscreen mode
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 865b080e3229102f160889328ce2e8e97aa65ea0 Mon Sep 17 00:00:00 2001
From: Marek Szyprowski <m.szyprowski(a)samsung.com>
Date: Mon, 9 Oct 2023 12:14:12 +0200
Subject: iio: exynos-adc: request second interupt only when touchscreen mode
is used
Second interrupt is needed only when touchscreen mode is used, so don't
request it unconditionally. This removes the following annoying warning
during boot:
exynos-adc 14d10000.adc: error -ENXIO: IRQ index 1 not found
Fixes: 2bb8ad9b44c5 ("iio: exynos-adc: add experimental touchscreen support")
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Link: https://lore.kernel.org/r/20231009101412.916922-1-m.szyprowski@samsung.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/exynos_adc.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/iio/adc/exynos_adc.c b/drivers/iio/adc/exynos_adc.c
index cff1ba57fb16..43c8af41b4a9 100644
--- a/drivers/iio/adc/exynos_adc.c
+++ b/drivers/iio/adc/exynos_adc.c
@@ -826,16 +826,26 @@ static int exynos_adc_probe(struct platform_device *pdev)
}
}
+ /* leave out any TS related code if unreachable */
+ if (IS_REACHABLE(CONFIG_INPUT)) {
+ has_ts = of_property_read_bool(pdev->dev.of_node,
+ "has-touchscreen") || pdata;
+ }
+
irq = platform_get_irq(pdev, 0);
if (irq < 0)
return irq;
info->irq = irq;
- irq = platform_get_irq(pdev, 1);
- if (irq == -EPROBE_DEFER)
- return irq;
+ if (has_ts) {
+ irq = platform_get_irq(pdev, 1);
+ if (irq == -EPROBE_DEFER)
+ return irq;
- info->tsirq = irq;
+ info->tsirq = irq;
+ } else {
+ info->tsirq = -1;
+ }
info->dev = &pdev->dev;
@@ -900,12 +910,6 @@ static int exynos_adc_probe(struct platform_device *pdev)
if (info->data->init_hw)
info->data->init_hw(info);
- /* leave out any TS related code if unreachable */
- if (IS_REACHABLE(CONFIG_INPUT)) {
- has_ts = of_property_read_bool(pdev->dev.of_node,
- "has-touchscreen") || pdata;
- }
-
if (pdata)
info->delay = pdata->delay;
else
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: adc: xilinx-xadc: Correct temperature offset/scale for
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From e2bd8c28b9bd835077eb65715d416d667694a80d Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 14 Sep 2023 18:10:19 -0600
Subject: iio: adc: xilinx-xadc: Correct temperature offset/scale for
UltraScale
The driver was previously using offset and scale values for the
temperature sensor readings which were only valid for 7-series devices.
Add per-device-type values for offset and scale and set them appropriately
for each device type.
Note that the values used for the UltraScale family are for UltraScale+
(i.e. the SYSMONE4 primitive) using the internal reference, as that seems
to be the most common configuration and the device tree values Xilinx's
device tree generator produces don't seem to give us anything to tell us
which configuration is used. However, the differences within the UltraScale
family seem fairly minor and it's closer than using the 7-series values
instead in any case.
Fixes: c2b7720a7905 ("iio: xilinx-xadc: Add basic support for Ultrascale System Monitor")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Acked-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Tested-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Link: https://lore.kernel.org/r/20230915001019.2862964-3-robert.hancock@calian.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/xilinx-xadc-core.c | 17 ++++++++++++++---
drivers/iio/adc/xilinx-xadc.h | 2 ++
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c
index d4d0d184a172..564c0cad0fc7 100644
--- a/drivers/iio/adc/xilinx-xadc-core.c
+++ b/drivers/iio/adc/xilinx-xadc-core.c
@@ -456,6 +456,9 @@ static const struct xadc_ops xadc_zynq_ops = {
.interrupt_handler = xadc_zynq_interrupt_handler,
.update_alarm = xadc_zynq_update_alarm,
.type = XADC_TYPE_S7,
+ /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
+ .temp_scale = 503975,
+ .temp_offset = 273150,
};
static const unsigned int xadc_axi_reg_offsets[] = {
@@ -566,6 +569,9 @@ static const struct xadc_ops xadc_7s_axi_ops = {
.interrupt_handler = xadc_axi_interrupt_handler,
.flags = XADC_FLAGS_BUFFERED | XADC_FLAGS_IRQ_OPTIONAL,
.type = XADC_TYPE_S7,
+ /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
+ .temp_scale = 503975,
+ .temp_offset = 273150,
};
static const struct xadc_ops xadc_us_axi_ops = {
@@ -577,6 +583,12 @@ static const struct xadc_ops xadc_us_axi_ops = {
.interrupt_handler = xadc_axi_interrupt_handler,
.flags = XADC_FLAGS_BUFFERED | XADC_FLAGS_IRQ_OPTIONAL,
.type = XADC_TYPE_US,
+ /**
+ * Values below are for UltraScale+ (SYSMONE4) using internal reference.
+ * See https://docs.xilinx.com/v/u/en-US/ug580-ultrascale-sysmon
+ */
+ .temp_scale = 509314,
+ .temp_offset = 280231,
};
static int _xadc_update_adc_reg(struct xadc *xadc, unsigned int reg,
@@ -945,8 +957,7 @@ static int xadc_read_raw(struct iio_dev *indio_dev,
*val2 = bits;
return IIO_VAL_FRACTIONAL_LOG2;
case IIO_TEMP:
- /* Temp in C = (val * 503.975) / 2**bits - 273.15 */
- *val = 503975;
+ *val = xadc->ops->temp_scale;
*val2 = bits;
return IIO_VAL_FRACTIONAL_LOG2;
default:
@@ -954,7 +965,7 @@ static int xadc_read_raw(struct iio_dev *indio_dev,
}
case IIO_CHAN_INFO_OFFSET:
/* Only the temperature channel has an offset */
- *val = -((273150 << bits) / 503975);
+ *val = -((xadc->ops->temp_offset << bits) / xadc->ops->temp_scale);
return IIO_VAL_INT;
case IIO_CHAN_INFO_SAMP_FREQ:
ret = xadc_read_samplerate(xadc);
diff --git a/drivers/iio/adc/xilinx-xadc.h b/drivers/iio/adc/xilinx-xadc.h
index 7d78ce698967..3036f4d613ff 100644
--- a/drivers/iio/adc/xilinx-xadc.h
+++ b/drivers/iio/adc/xilinx-xadc.h
@@ -85,6 +85,8 @@ struct xadc_ops {
unsigned int flags;
enum xadc_type type;
+ int temp_scale;
+ int temp_offset;
};
static inline int _xadc_read_adc_reg(struct xadc *xadc, unsigned int reg,
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 8d6b3ea4d9eaca80982442b68a292ce50ce0a135 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 14 Sep 2023 18:10:18 -0600
Subject: iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature
thresholds
In the probe function, the driver was reading out the thresholds already
set in the core, which can be configured by the user in the Vivado tools
when the FPGA image is built. However, it later clobbered those values
with zero or maximum values. In particular, the overtemperature shutdown
threshold register was overwritten with the max value, which effectively
prevents the FPGA from shutting down when the desired threshold was
eached, potentially risking hardware damage in that case.
Remove this code to leave the preconfigured default threshold values
intact.
The code was also disabling all alarms regardless of what enable state
they were left in by the FPGA image, including the overtemperature
shutdown feature. Leave these bits in their original state so they are
not unconditionally disabled.
Fixes: bdc8cda1d010 ("iio:adc: Add Xilinx XADC driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Acked-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Tested-by: O'Griofa, Conall <conall.ogriofa(a)amd.com>
Link: https://lore.kernel.org/r/20230915001019.2862964-2-robert.hancock@calian.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/xilinx-xadc-core.c | 22 ----------------------
1 file changed, 22 deletions(-)
diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c
index dba73300f894..d4d0d184a172 100644
--- a/drivers/iio/adc/xilinx-xadc-core.c
+++ b/drivers/iio/adc/xilinx-xadc-core.c
@@ -1423,28 +1423,6 @@ static int xadc_probe(struct platform_device *pdev)
if (ret)
return ret;
- /* Disable all alarms */
- ret = xadc_update_adc_reg(xadc, XADC_REG_CONF1, XADC_CONF1_ALARM_MASK,
- XADC_CONF1_ALARM_MASK);
- if (ret)
- return ret;
-
- /* Set thresholds to min/max */
- for (i = 0; i < 16; i++) {
- /*
- * Set max voltage threshold and both temperature thresholds to
- * 0xffff, min voltage threshold to 0.
- */
- if (i % 8 < 4 || i == 7)
- xadc->threshold[i] = 0xffff;
- else
- xadc->threshold[i] = 0;
- ret = xadc_write_adc_reg(xadc, XADC_REG_THRESHOLD(i),
- xadc->threshold[i]);
- if (ret)
- return ret;
- }
-
/* Go to non-buffered mode */
xadc_postdisable(indio_dev);
--
2.42.0
DisplayPort Alt Mode CTS test 10.3.8 states that both sides of the
connection shall be compatible with one another such that the connection
is not Source to Source or Sink to Sink.
The DisplayPort driver currently checks for a compatible pin configuration
that resolves into a source and sink combination. The CTS test is designed
to send a Discover Modes message that has a compatible pin configuration
but advertises the same port capability as the device; the current check
fails this.
Verify that the port and port partner resolve into a valid source and sink
combination before checking for a compatible pin configuration.
---
Changes since v1:
* Fixed styling errors
* Added DP_CAP_IS_UFP_D and DP_CAP_IS_DFP_D as macros to typec_dp.h
---
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
---
drivers/usb/typec/altmodes/displayport.c | 5 +++++
include/linux/usb/typec_dp.h | 2 ++
2 files changed, 7 insertions(+)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 718da02036d8..9c17955da570 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -578,6 +578,11 @@ int dp_altmode_probe(struct typec_altmode *alt)
/* FIXME: Port can only be DFP_U. */
+ /* Make sure that the port and partner can resolve into source and sink */
+ if (!(DP_CAP_IS_DFP_D(port->vdo) && DP_CAP_IS_UFP_D(alt->vdo)) &&
+ !(DP_CAP_IS_UFP_D(port->vdo) && DP_CAP_IS_DFP_D(alt->vdo)))
+ return -ENODEV;
+
/* Make sure we have compatiple pin configurations */
if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) &
DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) &&
diff --git a/include/linux/usb/typec_dp.h b/include/linux/usb/typec_dp.h
index 1f358098522d..4e6c0479307f 100644
--- a/include/linux/usb/typec_dp.h
+++ b/include/linux/usb/typec_dp.h
@@ -67,6 +67,8 @@ enum {
#define DP_CAP_UFP_D 1
#define DP_CAP_DFP_D 2
#define DP_CAP_DFP_D_AND_UFP_D 3
+#define DP_CAP_IS_UFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_UFP_D))
+#define DP_CAP_IS_DFP_D(_cap_) (!!(DP_CAP_CAPABILITY(_cap_) & DP_CAP_DFP_D))
#define DP_CAP_DP_SIGNALLING(_cap_) (((_cap_) & GENMASK(5, 2)) >> 2)
#define DP_CAP_SIGNALLING_HBR3 1
#define DP_CAP_SIGNALLING_UHBR10 2
base-commit: 5220d8b04a840fa09434072c866d032b163419e3
--
2.42.0.655.g421f12c284-goog
Le 21/10/2023 à 02:23, Sasha Levin a écrit :
> This is a note to let you know that I've just added the patch titled
>
> ice: Remove useless DMA-32 fallback configuration
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> ice-remove-useless-dma-32-fallback-configuration.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Why is it needed for backport, it is only dead code.
Another patch depends on it?
Looking *quickly* in other patches at [1], I've not seen anything that
conflicts.
CJ
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/com…
>
>
> commit 6f77725babf0559f90f19df76ff71f7807dff67f
> Author: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
> Date: Sun Jan 9 19:25:05 2022 +0100
>
> ice: Remove useless DMA-32 fallback configuration
>
> [ Upstream commit 9c3e54a632637f27d98fb0ec0c44f7039925809d ]
>
> As stated in [1], dma_set_mask() with a 64-bit mask never fails if
> dev->dma_mask is non-NULL.
> So, if it fails, the 32 bits case will also fail for the same reason.
>
> Simplify code and remove some dead code accordingly.
>
> [1]: https://lkml.org/lkml/2021/6/7/398
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
> Reviewed-by: Christoph Hellwig <hch(a)lst.de>
> Reviewed-by: Alexander Lobakin <alexandr.lobakin(a)intel.com>
> Tested-by: Gurucharan G <gurucharanx.g(a)intel.com>
> Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
> Stable-dep-of: 0288c3e709e5 ("ice: reset first in crash dump kernels")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 691c4320b6b1d..4aad089ea1f5d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -4292,8 +4292,6 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>
> /* set up for high or low DMA */
> err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
> - if (err)
> - err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> if (err) {
> dev_err(dev, "DMA configuration failed: 0x%x\n", err);
> return err;
[ Upstream commit 0df9dab891ff0d9b646d82e4fe038229e4c02451 ]
Stop zapping invalidate TDP MMU roots via work queue now that KVM
preserves TDP MMU roots until they are explicitly invalidated. Zapping
roots asynchronously was effectively a workaround to avoid stalling a vCPU
for an extended during if a vCPU unloaded a root, which at the time
happened whenever the guest toggled CR0.WP (a frequent operation for some
guest kernels).
While a clever hack, zapping roots via an unbound worker had subtle,
unintended consequences on host scheduling, especially when zapping
multiple roots, e.g. as part of a memslot. Because the work of zapping a
root is no longer bound to the task that initiated the zap, things like
the CPU affinity and priority of the original task get lost. Losing the
affinity and priority can be especially problematic if unbound workqueues
aren't affined to a small number of CPUs, as zapping multiple roots can
cause KVM to heavily utilize the majority of CPUs in the system, *beyond*
the CPUs KVM is already using to run vCPUs.
When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
zap can result in KVM occupying all logical CPUs for ~8ms, and result in
high priority tasks not being scheduled in in a timely manner. In v5.15,
which doesn't preserve unloaded roots, the issues were even more noticeable
as KVM would zap roots more frequently and could occupy all CPUs for 50ms+.
Consuming all CPUs for an extended duration can lead to significant jitter
throughout the system, e.g. on ChromeOS with virtio-gpu, deleting memslots
is a semi-frequent operation as memslots are deleted and recreated with
different host virtual addresses to react to host GPU drivers allocating
and freeing GPU blobs. On ChromeOS, the jitter manifests as audio blips
during games due to the audio server's tasks not getting scheduled in
promptly, despite the tasks having a high realtime priority.
Deleting memslots isn't exactly a fast path and should be avoided when
possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid the
memslot shenanigans, but KVM is squarely in the wrong. Not to mention
that removing the async zapping eliminates a non-trivial amount of
complexity.
Note, one of the subtle behaviors hidden behind the async zapping is that
KVM would zap invalidated roots only once (ignoring partial zaps from
things like mmu_notifier events). Preserve this behavior by adding a flag
to identify roots that are scheduled to be zapped versus roots that have
already been zapped but not yet freed.
Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
encounter invalid roots, as it's not at all obvious why zapping
invalidated roots shouldn't simply zap all invalid roots.
Reported-by: Pattara Teerapong <pteerapong(a)google.com>
Cc: David Stevens <stevensd(a)google.com>
Cc: Yiwei Zhang<zzyiwei(a)google.com>
Cc: Paul Hsia <paulhsia(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230916003916.2545000-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: David Matlack <dmatlack(a)google.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
---
Folks on Cc, it would be nice to get extra testing and/or reviews for this one
before it's picked up for 6.1, there were quite a few conflicts to resolve.
All of the conflicts were pretty straightforward, but I'd still appreciate an
extra set of eyeballs or three. Thanks!
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/mmu/mmu.c | 9 +--
arch/x86/kvm/mmu/mmu_internal.h | 15 ++--
arch/x86/kvm/mmu/tdp_mmu.c | 135 +++++++++++++-------------------
arch/x86/kvm/mmu/tdp_mmu.h | 4 +-
arch/x86/kvm/x86.c | 5 +-
6 files changed, 69 insertions(+), 102 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 08a84f801bfe..c1dcaa3d2d6e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1324,7 +1324,6 @@ struct kvm_arch {
* the thread holds the MMU lock in write mode.
*/
spinlock_t tdp_mmu_pages_lock;
- struct workqueue_struct *tdp_mmu_zap_wq;
#endif /* CONFIG_X86_64 */
/*
@@ -1727,7 +1726,7 @@ void kvm_mmu_vendor_module_exit(void);
void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
int kvm_mmu_create(struct kvm_vcpu *vcpu);
-int kvm_mmu_init_vm(struct kvm *kvm);
+void kvm_mmu_init_vm(struct kvm *kvm);
void kvm_mmu_uninit_vm(struct kvm *kvm);
void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2a6fec4e2d19..d30325e297a0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5994,19 +5994,16 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
kvm_mmu_zap_all_fast(kvm);
}
-int kvm_mmu_init_vm(struct kvm *kvm)
+void kvm_mmu_init_vm(struct kvm *kvm)
{
struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
- int r;
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
- r = kvm_mmu_init_tdp_mmu(kvm);
- if (r < 0)
- return r;
+ kvm_mmu_init_tdp_mmu(kvm);
node->track_write = kvm_mmu_pte_write;
node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
@@ -6019,8 +6016,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache;
kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO;
-
- return 0;
}
static void mmu_free_vm_memory_caches(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..0a9d5f2925c3 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -56,7 +56,12 @@ struct kvm_mmu_page {
bool tdp_mmu_page;
bool unsync;
- u8 mmu_valid_gen;
+ union {
+ u8 mmu_valid_gen;
+
+ /* Only accessed under slots_lock. */
+ bool tdp_mmu_scheduled_root_to_zap;
+ };
bool lpage_disallowed; /* Can't be replaced by an equiv large page */
/*
@@ -92,13 +97,7 @@ struct kvm_mmu_page {
struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
tdp_ptep_t ptep;
};
- union {
- DECLARE_BITMAP(unsync_child_bitmap, 512);
- struct {
- struct work_struct tdp_mmu_async_work;
- void *tdp_mmu_async_data;
- };
- };
+ DECLARE_BITMAP(unsync_child_bitmap, 512);
struct list_head lpage_disallowed_link;
#ifdef CONFIG_X86_32
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9b9fc4e834d0..c3b0f973375b 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -14,24 +14,16 @@ static bool __read_mostly tdp_mmu_enabled = true;
module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
/* Initializes the TDP MMU for the VM, if enabled. */
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm)
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
{
- struct workqueue_struct *wq;
-
if (!tdp_enabled || !READ_ONCE(tdp_mmu_enabled))
- return 0;
-
- wq = alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0);
- if (!wq)
- return -ENOMEM;
+ return;
/* This should not be changed for the lifetime of the VM. */
kvm->arch.tdp_mmu_enabled = true;
INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
- kvm->arch.tdp_mmu_zap_wq = wq;
- return 1;
}
/* Arbitrarily returns true so that this may be used in if statements. */
@@ -57,20 +49,15 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
* ultimately frees all roots.
*/
kvm_tdp_mmu_invalidate_all_roots(kvm);
-
- /*
- * Destroying a workqueue also first flushes the workqueue, i.e. no
- * need to invoke kvm_tdp_mmu_zap_invalidated_roots().
- */
- destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ kvm_tdp_mmu_zap_invalidated_roots(kvm);
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
/*
* Ensure that all the outstanding RCU callbacks to free shadow pages
- * can run before the VM is torn down. Work items on tdp_mmu_zap_wq
- * can call kvm_tdp_mmu_put_root and create new callbacks.
+ * can run before the VM is torn down. Putting the last reference to
+ * zapped roots will create new callbacks.
*/
rcu_barrier();
}
@@ -97,46 +84,6 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
tdp_mmu_free_sp(sp);
}
-static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
- bool shared);
-
-static void tdp_mmu_zap_root_work(struct work_struct *work)
-{
- struct kvm_mmu_page *root = container_of(work, struct kvm_mmu_page,
- tdp_mmu_async_work);
- struct kvm *kvm = root->tdp_mmu_async_data;
-
- read_lock(&kvm->mmu_lock);
-
- /*
- * A TLB flush is not necessary as KVM performs a local TLB flush when
- * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
- * to a different pCPU. Note, the local TLB flush on reuse also
- * invalidates any paging-structure-cache entries, i.e. TLB entries for
- * intermediate paging structures, that may be zapped, as such entries
- * are associated with the ASID on both VMX and SVM.
- */
- tdp_mmu_zap_root(kvm, root, true);
-
- /*
- * Drop the refcount using kvm_tdp_mmu_put_root() to test its logic for
- * avoiding an infinite loop. By design, the root is reachable while
- * it's being asynchronously zapped, thus a different task can put its
- * last reference, i.e. flowing through kvm_tdp_mmu_put_root() for an
- * asynchronously zapped root is unavoidable.
- */
- kvm_tdp_mmu_put_root(kvm, root, true);
-
- read_unlock(&kvm->mmu_lock);
-}
-
-static void tdp_mmu_schedule_zap_root(struct kvm *kvm, struct kvm_mmu_page *root)
-{
- root->tdp_mmu_async_data = kvm;
- INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
- queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
-}
-
void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
bool shared)
{
@@ -222,11 +169,11 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
#define for_each_valid_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared) \
__for_each_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared, true)
-#define for_each_tdp_mmu_root_yield_safe(_kvm, _root) \
- for (_root = tdp_mmu_next_root(_kvm, NULL, false, false); \
+#define for_each_tdp_mmu_root_yield_safe(_kvm, _root, _shared) \
+ for (_root = tdp_mmu_next_root(_kvm, NULL, _shared, false); \
_root; \
- _root = tdp_mmu_next_root(_kvm, _root, false, false)) \
- if (!kvm_lockdep_assert_mmu_lock_held(_kvm, false)) { \
+ _root = tdp_mmu_next_root(_kvm, _root, _shared, false)) \
+ if (!kvm_lockdep_assert_mmu_lock_held(_kvm, _shared)) { \
} else
/*
@@ -305,7 +252,7 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
* by a memslot update or by the destruction of the VM. Initialize the
* refcount to two; one reference for the vCPU, and one reference for
* the TDP MMU itself, which is held until the root is invalidated and
- * is ultimately put by tdp_mmu_zap_root_work().
+ * is ultimately put by kvm_tdp_mmu_zap_invalidated_roots().
*/
refcount_set(&root->tdp_mmu_root_count, 2);
@@ -963,7 +910,7 @@ bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, gfn_t start, gfn_t end, bool flush)
{
struct kvm_mmu_page *root;
- for_each_tdp_mmu_root_yield_safe(kvm, root)
+ for_each_tdp_mmu_root_yield_safe(kvm, root, false)
flush = tdp_mmu_zap_leafs(kvm, root, start, end, true, flush);
return flush;
@@ -985,7 +932,7 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
* is being destroyed or the userspace VMM has exited. In both cases,
* KVM_RUN is unreachable, i.e. no vCPUs will ever service the request.
*/
- for_each_tdp_mmu_root_yield_safe(kvm, root)
+ for_each_tdp_mmu_root_yield_safe(kvm, root, false)
tdp_mmu_zap_root(kvm, root, false);
}
@@ -995,18 +942,47 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
*/
void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
{
- flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ struct kvm_mmu_page *root;
+
+ read_lock(&kvm->mmu_lock);
+
+ for_each_tdp_mmu_root_yield_safe(kvm, root, true) {
+ if (!root->tdp_mmu_scheduled_root_to_zap)
+ continue;
+
+ root->tdp_mmu_scheduled_root_to_zap = false;
+ KVM_BUG_ON(!root->role.invalid, kvm);
+
+ /*
+ * A TLB flush is not necessary as KVM performs a local TLB
+ * flush when allocating a new root (see kvm_mmu_load()), and
+ * when migrating a vCPU to a different pCPU. Note, the local
+ * TLB flush on reuse also invalidates paging-structure-cache
+ * entries, i.e. TLB entries for intermediate paging structures,
+ * that may be zapped, as such entries are associated with the
+ * ASID on both VMX and SVM.
+ */
+ tdp_mmu_zap_root(kvm, root, true);
+
+ /*
+ * The referenced needs to be put *after* zapping the root, as
+ * the root must be reachable by mmu_notifiers while it's being
+ * zapped
+ */
+ kvm_tdp_mmu_put_root(kvm, root, true);
+ }
+
+ read_unlock(&kvm->mmu_lock);
}
/*
* Mark each TDP MMU root as invalid to prevent vCPUs from reusing a root that
* is about to be zapped, e.g. in response to a memslots update. The actual
- * zapping is performed asynchronously. Using a separate workqueue makes it
- * easy to ensure that the destruction is performed before the "fast zap"
- * completes, without keeping a separate list of invalidated roots; the list is
- * effectively the list of work items in the workqueue.
+ * zapping is done separately so that it happens with mmu_lock with read,
+ * whereas invalidating roots must be done with mmu_lock held for write (unless
+ * the VM is being destroyed).
*
- * Note, the asynchronous worker is gifted the TDP MMU's reference.
+ * Note, kvm_tdp_mmu_zap_invalidated_roots() is gifted the TDP MMU's reference.
* See kvm_tdp_mmu_get_vcpu_root_hpa().
*/
void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
@@ -1031,19 +1007,20 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
/*
* As above, mmu_lock isn't held when destroying the VM! There can't
* be other references to @kvm, i.e. nothing else can invalidate roots
- * or be consuming roots, but walking the list of roots does need to be
- * guarded against roots being deleted by the asynchronous zap worker.
+ * or get/put references to roots.
*/
- rcu_read_lock();
-
- list_for_each_entry_rcu(root, &kvm->arch.tdp_mmu_roots, link) {
+ list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
+ /*
+ * Note, invalid roots can outlive a memslot update! Invalid
+ * roots must be *zapped* before the memslot update completes,
+ * but a different task can acquire a reference and keep the
+ * root alive after its been zapped.
+ */
if (!root->role.invalid) {
+ root->tdp_mmu_scheduled_root_to_zap = true;
root->role.invalid = true;
- tdp_mmu_schedule_zap_root(kvm, root);
}
}
-
- rcu_read_unlock();
}
/*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index d0a9fe0770fd..c82a8bb321bb 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -65,7 +65,7 @@ u64 *kvm_tdp_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, u64 addr,
u64 *spte);
#ifdef CONFIG_X86_64
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm);
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return sp->tdp_mmu_page; }
@@ -86,7 +86,7 @@ static inline bool is_tdp_mmu(struct kvm_mmu *mmu)
return sp && is_tdp_mmu_page(sp) && sp->root_count;
}
#else
-static inline int kvm_mmu_init_tdp_mmu(struct kvm *kvm) { return 0; }
+static inline void kvm_mmu_init_tdp_mmu(struct kvm *kvm) {}
static inline void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) {}
static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return false; }
static inline bool is_tdp_mmu(struct kvm_mmu *mmu) { return false; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1931d3fcbbe0..b929254c7876 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12442,9 +12442,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (ret)
goto out;
- ret = kvm_mmu_init_vm(kvm);
- if (ret)
- goto out_page_track;
+ kvm_mmu_init_vm(kvm);
ret = static_call(kvm_x86_vm_init)(kvm);
if (ret)
@@ -12489,7 +12487,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
out_uninit_mmu:
kvm_mmu_uninit_vm(kvm);
-out_page_track:
kvm_page_track_cleanup(kvm);
out:
return ret;
base-commit: adc4d740ad9ec780657327c69ab966fa4fdf0e8e
--
2.42.0.655.g421f12c284-goog
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102147-granola-preschool-1418@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
User can request more metadata bytes than the device will write. Ensure
kernel buffer is initialized so we're not leaking unsanitized memory on
the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102146-postwar-smugly-830b@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
User can request more metadata bytes than the device will write. Ensure
kernel buffer is initialized so we're not leaking unsanitized memory on
the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102145-nacho-gulp-0f93@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
User can request more metadata bytes than the device will write. Ensure
kernel buffer is initialized so we're not leaking unsanitized memory on
the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102144-lavish-paralysis-4c58@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
User can request more metadata bytes than the device will write. Ensure
kernel buffer is initialized so we're not leaking unsanitized memory on
the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2b32c76e2b0154b98b9322ae7546b8156cd703e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102143-task-roundness-da6d@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b32c76e2b0154b98b9322ae7546b8156cd703e6 Mon Sep 17 00:00:00 2001
From: Keith Busch <kbusch(a)kernel.org>
Date: Mon, 16 Oct 2023 13:12:47 -0700
Subject: [PATCH] nvme: sanitize metadata bounce buffer for reads
User can request more metadata bytes than the device will write. Ensure
kernel buffer is initialized so we're not leaking unsanitized memory on
the copy-out.
Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Kanchan Joshi <joshi.k(a)samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f2..747c879e8982 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -108,9 +108,13 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
if (!buf)
goto out;
- ret = -EFAULT;
- if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
- goto out_free_meta;
+ if (req_op(req) == REQ_OP_DRV_OUT) {
+ ret = -EFAULT;
+ if (copy_from_user(buf, ubuf, len))
+ goto out_free_meta;
+ } else {
+ memset(buf, 0, len);
+ }
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
if (IS_ERR(bip)) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102129-transfer-runway-bac8@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:28:46 -0400
Subject: [PATCH] pNFS/flexfiles: Check the layout validity in
ff_layout_mirror_prepare_stats
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.
Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a1dc33864906..ef817a0475ff 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2520,9 +2520,9 @@ ff_layout_mirror_prepare_stats(struct pnfs_layout_hdr *lo,
return i;
}
-static int
-ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
+static int ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
{
+ struct pnfs_layout_hdr *lo;
struct nfs4_flexfile_layout *ff_layout;
const int dev_count = PNFS_LAYOUTSTATS_MAXDEV;
@@ -2533,11 +2533,14 @@ ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
return -ENOMEM;
spin_lock(&args->inode->i_lock);
- ff_layout = FF_LAYOUT_FROM_HDR(NFS_I(args->inode)->layout);
- args->num_dev = ff_layout_mirror_prepare_stats(&ff_layout->generic_hdr,
- &args->devinfo[0],
- dev_count,
- NFS4_FF_OP_LAYOUTSTATS);
+ lo = NFS_I(args->inode)->layout;
+ if (lo && pnfs_layout_is_valid(lo)) {
+ ff_layout = FF_LAYOUT_FROM_HDR(lo);
+ args->num_dev = ff_layout_mirror_prepare_stats(
+ &ff_layout->generic_hdr, &args->devinfo[0], dev_count,
+ NFS4_FF_OP_LAYOUTSTATS);
+ } else
+ args->num_dev = 0;
spin_unlock(&args->inode->i_lock);
if (!args->num_dev) {
kfree(args->devinfo);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102128-depress-ungloved-ee4e@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:28:46 -0400
Subject: [PATCH] pNFS/flexfiles: Check the layout validity in
ff_layout_mirror_prepare_stats
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.
Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a1dc33864906..ef817a0475ff 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2520,9 +2520,9 @@ ff_layout_mirror_prepare_stats(struct pnfs_layout_hdr *lo,
return i;
}
-static int
-ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
+static int ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
{
+ struct pnfs_layout_hdr *lo;
struct nfs4_flexfile_layout *ff_layout;
const int dev_count = PNFS_LAYOUTSTATS_MAXDEV;
@@ -2533,11 +2533,14 @@ ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
return -ENOMEM;
spin_lock(&args->inode->i_lock);
- ff_layout = FF_LAYOUT_FROM_HDR(NFS_I(args->inode)->layout);
- args->num_dev = ff_layout_mirror_prepare_stats(&ff_layout->generic_hdr,
- &args->devinfo[0],
- dev_count,
- NFS4_FF_OP_LAYOUTSTATS);
+ lo = NFS_I(args->inode)->layout;
+ if (lo && pnfs_layout_is_valid(lo)) {
+ ff_layout = FF_LAYOUT_FROM_HDR(lo);
+ args->num_dev = ff_layout_mirror_prepare_stats(
+ &ff_layout->generic_hdr, &args->devinfo[0], dev_count,
+ NFS4_FF_OP_LAYOUTSTATS);
+ } else
+ args->num_dev = 0;
spin_unlock(&args->inode->i_lock);
if (!args->num_dev) {
kfree(args->devinfo);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102128-paralysis-unwary-3d2b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:28:46 -0400
Subject: [PATCH] pNFS/flexfiles: Check the layout validity in
ff_layout_mirror_prepare_stats
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.
Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a1dc33864906..ef817a0475ff 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2520,9 +2520,9 @@ ff_layout_mirror_prepare_stats(struct pnfs_layout_hdr *lo,
return i;
}
-static int
-ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
+static int ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
{
+ struct pnfs_layout_hdr *lo;
struct nfs4_flexfile_layout *ff_layout;
const int dev_count = PNFS_LAYOUTSTATS_MAXDEV;
@@ -2533,11 +2533,14 @@ ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
return -ENOMEM;
spin_lock(&args->inode->i_lock);
- ff_layout = FF_LAYOUT_FROM_HDR(NFS_I(args->inode)->layout);
- args->num_dev = ff_layout_mirror_prepare_stats(&ff_layout->generic_hdr,
- &args->devinfo[0],
- dev_count,
- NFS4_FF_OP_LAYOUTSTATS);
+ lo = NFS_I(args->inode)->layout;
+ if (lo && pnfs_layout_is_valid(lo)) {
+ ff_layout = FF_LAYOUT_FROM_HDR(lo);
+ args->num_dev = ff_layout_mirror_prepare_stats(
+ &ff_layout->generic_hdr, &args->devinfo[0], dev_count,
+ NFS4_FF_OP_LAYOUTSTATS);
+ } else
+ args->num_dev = 0;
spin_unlock(&args->inode->i_lock);
if (!args->num_dev) {
kfree(args->devinfo);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102127-gray-quill-78d1@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:28:46 -0400
Subject: [PATCH] pNFS/flexfiles: Check the layout validity in
ff_layout_mirror_prepare_stats
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.
Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a1dc33864906..ef817a0475ff 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2520,9 +2520,9 @@ ff_layout_mirror_prepare_stats(struct pnfs_layout_hdr *lo,
return i;
}
-static int
-ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
+static int ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
{
+ struct pnfs_layout_hdr *lo;
struct nfs4_flexfile_layout *ff_layout;
const int dev_count = PNFS_LAYOUTSTATS_MAXDEV;
@@ -2533,11 +2533,14 @@ ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
return -ENOMEM;
spin_lock(&args->inode->i_lock);
- ff_layout = FF_LAYOUT_FROM_HDR(NFS_I(args->inode)->layout);
- args->num_dev = ff_layout_mirror_prepare_stats(&ff_layout->generic_hdr,
- &args->devinfo[0],
- dev_count,
- NFS4_FF_OP_LAYOUTSTATS);
+ lo = NFS_I(args->inode)->layout;
+ if (lo && pnfs_layout_is_valid(lo)) {
+ ff_layout = FF_LAYOUT_FROM_HDR(lo);
+ args->num_dev = ff_layout_mirror_prepare_stats(
+ &ff_layout->generic_hdr, &args->devinfo[0], dev_count,
+ NFS4_FF_OP_LAYOUTSTATS);
+ } else
+ args->num_dev = 0;
spin_unlock(&args->inode->i_lock);
if (!args->num_dev) {
kfree(args->devinfo);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102126-directed-excitable-b90e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:28:46 -0400
Subject: [PATCH] pNFS/flexfiles: Check the layout validity in
ff_layout_mirror_prepare_stats
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.
Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a1dc33864906..ef817a0475ff 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2520,9 +2520,9 @@ ff_layout_mirror_prepare_stats(struct pnfs_layout_hdr *lo,
return i;
}
-static int
-ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
+static int ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
{
+ struct pnfs_layout_hdr *lo;
struct nfs4_flexfile_layout *ff_layout;
const int dev_count = PNFS_LAYOUTSTATS_MAXDEV;
@@ -2533,11 +2533,14 @@ ff_layout_prepare_layoutstats(struct nfs42_layoutstat_args *args)
return -ENOMEM;
spin_lock(&args->inode->i_lock);
- ff_layout = FF_LAYOUT_FROM_HDR(NFS_I(args->inode)->layout);
- args->num_dev = ff_layout_mirror_prepare_stats(&ff_layout->generic_hdr,
- &args->devinfo[0],
- dev_count,
- NFS4_FF_OP_LAYOUTSTATS);
+ lo = NFS_I(args->inode)->layout;
+ if (lo && pnfs_layout_is_valid(lo)) {
+ ff_layout = FF_LAYOUT_FROM_HDR(lo);
+ args->num_dev = ff_layout_mirror_prepare_stats(
+ &ff_layout->generic_hdr, &args->devinfo[0], dev_count,
+ NFS4_FF_OP_LAYOUTSTATS);
+ } else
+ args->num_dev = 0;
spin_unlock(&args->inode->i_lock);
if (!args->num_dev) {
kfree(args->devinfo);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x f63955721a8020e979b99cc417dcb6da3106aa24
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102137-pawing-rift-8b65@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f63955721a8020e979b99cc417dcb6da3106aa24 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:20:19 -0400
Subject: [PATCH] pNFS: Fix a hang in nfs4_evict_inode()
We are not allowed to call pnfs_mark_matching_lsegs_return() without
also holding a reference to the layout header, since doing so could lead
to the reference count going to zero when we call
pnfs_layout_remove_lseg(). This again can lead to a hang when we get to
nfs4_evict_inode() and are unable to clear the layout pointer.
pnfs_layout_return_unused_byserver() is guilty of this behaviour, and
has been seen to trigger the refcount warning prior to a hang.
Fixes: b6d49ecd1081 ("NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 306cba0b9e69..84343aefbbd6 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2634,31 +2634,44 @@ pnfs_should_return_unused_layout(struct pnfs_layout_hdr *lo,
return mode == 0;
}
-static int
-pnfs_layout_return_unused_byserver(struct nfs_server *server, void *data)
+static int pnfs_layout_return_unused_byserver(struct nfs_server *server,
+ void *data)
{
const struct pnfs_layout_range *range = data;
+ const struct cred *cred;
struct pnfs_layout_hdr *lo;
struct inode *inode;
+ nfs4_stateid stateid;
+ enum pnfs_iomode iomode;
+
restart:
rcu_read_lock();
list_for_each_entry_rcu(lo, &server->layouts, plh_layouts) {
- if (!pnfs_layout_can_be_returned(lo) ||
+ inode = lo->plh_inode;
+ if (!inode || !pnfs_layout_can_be_returned(lo) ||
test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags))
continue;
- inode = lo->plh_inode;
spin_lock(&inode->i_lock);
- if (!pnfs_should_return_unused_layout(lo, range)) {
+ if (!lo->plh_inode ||
+ !pnfs_should_return_unused_layout(lo, range)) {
spin_unlock(&inode->i_lock);
continue;
}
+ pnfs_get_layout_hdr(lo);
+ pnfs_set_plh_return_info(lo, range->iomode, 0);
+ if (pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs,
+ range, 0) != 0 ||
+ !pnfs_prepare_layoutreturn(lo, &stateid, &cred, &iomode)) {
+ spin_unlock(&inode->i_lock);
+ rcu_read_unlock();
+ pnfs_put_layout_hdr(lo);
+ cond_resched();
+ goto restart;
+ }
spin_unlock(&inode->i_lock);
- inode = pnfs_grab_inode_layout_hdr(lo);
- if (!inode)
- continue;
rcu_read_unlock();
- pnfs_mark_layout_for_return(inode, range);
- iput(inode);
+ pnfs_send_layoutreturn(lo, &stateid, &cred, iomode, false);
+ pnfs_put_layout_hdr(lo);
cond_resched();
goto restart;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f63955721a8020e979b99cc417dcb6da3106aa24
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102135-blurt-entree-54ec@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f63955721a8020e979b99cc417dcb6da3106aa24 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sun, 8 Oct 2023 14:20:19 -0400
Subject: [PATCH] pNFS: Fix a hang in nfs4_evict_inode()
We are not allowed to call pnfs_mark_matching_lsegs_return() without
also holding a reference to the layout header, since doing so could lead
to the reference count going to zero when we call
pnfs_layout_remove_lseg(). This again can lead to a hang when we get to
nfs4_evict_inode() and are unable to clear the layout pointer.
pnfs_layout_return_unused_byserver() is guilty of this behaviour, and
has been seen to trigger the refcount warning prior to a hang.
Fixes: b6d49ecd1081 ("NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 306cba0b9e69..84343aefbbd6 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2634,31 +2634,44 @@ pnfs_should_return_unused_layout(struct pnfs_layout_hdr *lo,
return mode == 0;
}
-static int
-pnfs_layout_return_unused_byserver(struct nfs_server *server, void *data)
+static int pnfs_layout_return_unused_byserver(struct nfs_server *server,
+ void *data)
{
const struct pnfs_layout_range *range = data;
+ const struct cred *cred;
struct pnfs_layout_hdr *lo;
struct inode *inode;
+ nfs4_stateid stateid;
+ enum pnfs_iomode iomode;
+
restart:
rcu_read_lock();
list_for_each_entry_rcu(lo, &server->layouts, plh_layouts) {
- if (!pnfs_layout_can_be_returned(lo) ||
+ inode = lo->plh_inode;
+ if (!inode || !pnfs_layout_can_be_returned(lo) ||
test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags))
continue;
- inode = lo->plh_inode;
spin_lock(&inode->i_lock);
- if (!pnfs_should_return_unused_layout(lo, range)) {
+ if (!lo->plh_inode ||
+ !pnfs_should_return_unused_layout(lo, range)) {
spin_unlock(&inode->i_lock);
continue;
}
+ pnfs_get_layout_hdr(lo);
+ pnfs_set_plh_return_info(lo, range->iomode, 0);
+ if (pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs,
+ range, 0) != 0 ||
+ !pnfs_prepare_layoutreturn(lo, &stateid, &cred, &iomode)) {
+ spin_unlock(&inode->i_lock);
+ rcu_read_unlock();
+ pnfs_put_layout_hdr(lo);
+ cond_resched();
+ goto restart;
+ }
spin_unlock(&inode->i_lock);
- inode = pnfs_grab_inode_layout_hdr(lo);
- if (!inode)
- continue;
rcu_read_unlock();
- pnfs_mark_layout_for_return(inode, range);
- iput(inode);
+ pnfs_send_layoutreturn(lo, &stateid, &cred, iomode, false);
+ pnfs_put_layout_hdr(lo);
cond_resched();
goto restart;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c8befdc411e5fd1bf95a13e8744c8ca79b412bee
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102142-tiny-thing-872a@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c8befdc411e5fd1bf95a13e8744c8ca79b412bee Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Fri, 13 Oct 2023 16:57:05 +0200
Subject: [PATCH] pinctrl: qcom: lpass-lpi: fix concurrent register updates
The Qualcomm LPASS LPI pin controller driver uses one lock for guarding
Read-Modify-Write code for slew rate registers. However the pin
configuration and muxing registers have exactly the same RMW code but
are not protected.
Pin controller framework does not provide locking here, thus it is
possible to trigger simultaneous change of pin configuration registers
resulting in non-atomic changes.
Protect from concurrent access by re-using the same lock used to cover
the slew rate register. Using the same lock instead of adding second
one will make more sense, once we add support for newer Qualcomm SoC,
where slew rate is configured in the same register as pin
configuration/muxing.
Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20231013145705.219954-1-krzysztof.kozlowski@linar…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
index e5a418026ba3..0b2839d27fd6 100644
--- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
+++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
@@ -32,7 +32,8 @@ struct lpi_pinctrl {
char __iomem *tlmm_base;
char __iomem *slew_base;
struct clk_bulk_data clks[MAX_LPI_NUM_CLKS];
- struct mutex slew_access_lock;
+ /* Protects from concurrent register updates */
+ struct mutex lock;
DECLARE_BITMAP(ever_gpio, MAX_NR_GPIO);
const struct lpi_pinctrl_variant_data *data;
};
@@ -103,6 +104,7 @@ static int lpi_gpio_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
if (WARN_ON(i == g->nfuncs))
return -EINVAL;
+ mutex_lock(&pctrl->lock);
val = lpi_gpio_read(pctrl, pin, LPI_GPIO_CFG_REG);
/*
@@ -128,6 +130,7 @@ static int lpi_gpio_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
u32p_replace_bits(&val, i, LPI_GPIO_FUNCTION_MASK);
lpi_gpio_write(pctrl, pin, LPI_GPIO_CFG_REG, val);
+ mutex_unlock(&pctrl->lock);
return 0;
}
@@ -233,14 +236,14 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
if (slew_offset == LPI_NO_SLEW)
break;
- mutex_lock(&pctrl->slew_access_lock);
+ mutex_lock(&pctrl->lock);
sval = ioread32(pctrl->slew_base + LPI_SLEW_RATE_CTL_REG);
sval &= ~(LPI_SLEW_RATE_MASK << slew_offset);
sval |= arg << slew_offset;
iowrite32(sval, pctrl->slew_base + LPI_SLEW_RATE_CTL_REG);
- mutex_unlock(&pctrl->slew_access_lock);
+ mutex_unlock(&pctrl->lock);
break;
default:
return -EINVAL;
@@ -256,6 +259,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
lpi_gpio_write(pctrl, group, LPI_GPIO_VALUE_REG, val);
}
+ mutex_lock(&pctrl->lock);
val = lpi_gpio_read(pctrl, group, LPI_GPIO_CFG_REG);
u32p_replace_bits(&val, pullup, LPI_GPIO_PULL_MASK);
@@ -264,6 +268,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
u32p_replace_bits(&val, output_enabled, LPI_GPIO_OE_MASK);
lpi_gpio_write(pctrl, group, LPI_GPIO_CFG_REG, val);
+ mutex_unlock(&pctrl->lock);
return 0;
}
@@ -461,7 +466,7 @@ int lpi_pinctrl_probe(struct platform_device *pdev)
pctrl->chip.label = dev_name(dev);
pctrl->chip.can_sleep = false;
- mutex_init(&pctrl->slew_access_lock);
+ mutex_init(&pctrl->lock);
pctrl->ctrl = devm_pinctrl_register(dev, &pctrl->desc, pctrl);
if (IS_ERR(pctrl->ctrl)) {
@@ -483,7 +488,7 @@ int lpi_pinctrl_probe(struct platform_device *pdev)
return 0;
err_pinctrl:
- mutex_destroy(&pctrl->slew_access_lock);
+ mutex_destroy(&pctrl->lock);
clk_bulk_disable_unprepare(MAX_LPI_NUM_CLKS, pctrl->clks);
return ret;
@@ -495,7 +500,7 @@ int lpi_pinctrl_remove(struct platform_device *pdev)
struct lpi_pinctrl *pctrl = platform_get_drvdata(pdev);
int i;
- mutex_destroy(&pctrl->slew_access_lock);
+ mutex_destroy(&pctrl->lock);
clk_bulk_disable_unprepare(MAX_LPI_NUM_CLKS, pctrl->clks);
for (i = 0; i < pctrl->data->npins; i++)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c8befdc411e5fd1bf95a13e8744c8ca79b412bee
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102141-caring-alienate-2ec8@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c8befdc411e5fd1bf95a13e8744c8ca79b412bee Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Fri, 13 Oct 2023 16:57:05 +0200
Subject: [PATCH] pinctrl: qcom: lpass-lpi: fix concurrent register updates
The Qualcomm LPASS LPI pin controller driver uses one lock for guarding
Read-Modify-Write code for slew rate registers. However the pin
configuration and muxing registers have exactly the same RMW code but
are not protected.
Pin controller framework does not provide locking here, thus it is
possible to trigger simultaneous change of pin configuration registers
resulting in non-atomic changes.
Protect from concurrent access by re-using the same lock used to cover
the slew rate register. Using the same lock instead of adding second
one will make more sense, once we add support for newer Qualcomm SoC,
where slew rate is configured in the same register as pin
configuration/muxing.
Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20231013145705.219954-1-krzysztof.kozlowski@linar…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
index e5a418026ba3..0b2839d27fd6 100644
--- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
+++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
@@ -32,7 +32,8 @@ struct lpi_pinctrl {
char __iomem *tlmm_base;
char __iomem *slew_base;
struct clk_bulk_data clks[MAX_LPI_NUM_CLKS];
- struct mutex slew_access_lock;
+ /* Protects from concurrent register updates */
+ struct mutex lock;
DECLARE_BITMAP(ever_gpio, MAX_NR_GPIO);
const struct lpi_pinctrl_variant_data *data;
};
@@ -103,6 +104,7 @@ static int lpi_gpio_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
if (WARN_ON(i == g->nfuncs))
return -EINVAL;
+ mutex_lock(&pctrl->lock);
val = lpi_gpio_read(pctrl, pin, LPI_GPIO_CFG_REG);
/*
@@ -128,6 +130,7 @@ static int lpi_gpio_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
u32p_replace_bits(&val, i, LPI_GPIO_FUNCTION_MASK);
lpi_gpio_write(pctrl, pin, LPI_GPIO_CFG_REG, val);
+ mutex_unlock(&pctrl->lock);
return 0;
}
@@ -233,14 +236,14 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
if (slew_offset == LPI_NO_SLEW)
break;
- mutex_lock(&pctrl->slew_access_lock);
+ mutex_lock(&pctrl->lock);
sval = ioread32(pctrl->slew_base + LPI_SLEW_RATE_CTL_REG);
sval &= ~(LPI_SLEW_RATE_MASK << slew_offset);
sval |= arg << slew_offset;
iowrite32(sval, pctrl->slew_base + LPI_SLEW_RATE_CTL_REG);
- mutex_unlock(&pctrl->slew_access_lock);
+ mutex_unlock(&pctrl->lock);
break;
default:
return -EINVAL;
@@ -256,6 +259,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
lpi_gpio_write(pctrl, group, LPI_GPIO_VALUE_REG, val);
}
+ mutex_lock(&pctrl->lock);
val = lpi_gpio_read(pctrl, group, LPI_GPIO_CFG_REG);
u32p_replace_bits(&val, pullup, LPI_GPIO_PULL_MASK);
@@ -264,6 +268,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
u32p_replace_bits(&val, output_enabled, LPI_GPIO_OE_MASK);
lpi_gpio_write(pctrl, group, LPI_GPIO_CFG_REG, val);
+ mutex_unlock(&pctrl->lock);
return 0;
}
@@ -461,7 +466,7 @@ int lpi_pinctrl_probe(struct platform_device *pdev)
pctrl->chip.label = dev_name(dev);
pctrl->chip.can_sleep = false;
- mutex_init(&pctrl->slew_access_lock);
+ mutex_init(&pctrl->lock);
pctrl->ctrl = devm_pinctrl_register(dev, &pctrl->desc, pctrl);
if (IS_ERR(pctrl->ctrl)) {
@@ -483,7 +488,7 @@ int lpi_pinctrl_probe(struct platform_device *pdev)
return 0;
err_pinctrl:
- mutex_destroy(&pctrl->slew_access_lock);
+ mutex_destroy(&pctrl->lock);
clk_bulk_disable_unprepare(MAX_LPI_NUM_CLKS, pctrl->clks);
return ret;
@@ -495,7 +500,7 @@ int lpi_pinctrl_remove(struct platform_device *pdev)
struct lpi_pinctrl *pctrl = platform_get_drvdata(pdev);
int i;
- mutex_destroy(&pctrl->slew_access_lock);
+ mutex_destroy(&pctrl->lock);
clk_bulk_disable_unprepare(MAX_LPI_NUM_CLKS, pctrl->clks);
for (i = 0; i < pctrl->data->npins; i++)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 32a9cdb8869dc111a0c96cf8e1762be9684af15b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102105-washbasin-crimp-78d1@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 32a9cdb8869dc111a0c96cf8e1762be9684af15b Mon Sep 17 00:00:00 2001
From: Haibo Chen <haibo.chen(a)nxp.com>
Date: Wed, 30 Aug 2023 17:39:22 +0800
Subject: [PATCH] mmc: core: sdio: hold retuning if sdio in 1-bit mode
tuning only support in 4-bit mode or 8 bit mode, so in 1-bit mode,
need to hold retuning.
Find this issue when use manual tuning method on imx93. When system
resume back, SDIO WIFI try to switch back to 4 bit mode, first will
trigger retuning, and all tuning command failed.
Signed-off-by: Haibo Chen <haibo.chen(a)nxp.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Fixes: dfa13ebbe334 ("mmc: host: Add facility to support re-tuning")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230830093922.3095850-1-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
index f64b9ac76a5c..5914516df2f7 100644
--- a/drivers/mmc/core/sdio.c
+++ b/drivers/mmc/core/sdio.c
@@ -1089,8 +1089,14 @@ static int mmc_sdio_resume(struct mmc_host *host)
}
err = mmc_sdio_reinit_card(host);
} else if (mmc_card_wake_sdio_irq(host)) {
- /* We may have switched to 1-bit mode during suspend */
+ /*
+ * We may have switched to 1-bit mode during suspend,
+ * need to hold retuning, because tuning only supprt
+ * 4-bit mode or 8 bit mode.
+ */
+ mmc_retune_hold_now(host);
err = sdio_enable_4bit_bus(host->card);
+ mmc_retune_release(host);
}
if (err)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 32a9cdb8869dc111a0c96cf8e1762be9684af15b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102104-decal-avenue-0d88@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 32a9cdb8869dc111a0c96cf8e1762be9684af15b Mon Sep 17 00:00:00 2001
From: Haibo Chen <haibo.chen(a)nxp.com>
Date: Wed, 30 Aug 2023 17:39:22 +0800
Subject: [PATCH] mmc: core: sdio: hold retuning if sdio in 1-bit mode
tuning only support in 4-bit mode or 8 bit mode, so in 1-bit mode,
need to hold retuning.
Find this issue when use manual tuning method on imx93. When system
resume back, SDIO WIFI try to switch back to 4 bit mode, first will
trigger retuning, and all tuning command failed.
Signed-off-by: Haibo Chen <haibo.chen(a)nxp.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Fixes: dfa13ebbe334 ("mmc: host: Add facility to support re-tuning")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230830093922.3095850-1-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
index f64b9ac76a5c..5914516df2f7 100644
--- a/drivers/mmc/core/sdio.c
+++ b/drivers/mmc/core/sdio.c
@@ -1089,8 +1089,14 @@ static int mmc_sdio_resume(struct mmc_host *host)
}
err = mmc_sdio_reinit_card(host);
} else if (mmc_card_wake_sdio_irq(host)) {
- /* We may have switched to 1-bit mode during suspend */
+ /*
+ * We may have switched to 1-bit mode during suspend,
+ * need to hold retuning, because tuning only supprt
+ * 4-bit mode or 8 bit mode.
+ */
+ mmc_retune_hold_now(host);
err = sdio_enable_4bit_bus(host->card);
+ mmc_retune_release(host);
}
if (err)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 3e01d5254698ea3d18e09d96b974c762328352cd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102111-unclaimed-marbled-f39f@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e01d5254698ea3d18e09d96b974c762328352cd Mon Sep 17 00:00:00 2001
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
Date: Mon, 17 Jul 2023 21:42:19 +0200
Subject: [PATCH] mtd: rawnand: marvell: Ensure program page operations are
successful
The NAND core complies with the ONFI specification, which itself
mentions that after any program or erase operation, a status check
should be performed to see whether the operation was finished *and*
successful.
The NAND core offers helpers to finish a page write (sending the
"PAGE PROG" command, waiting for the NAND chip to be ready again, and
checking the operation status). But in some cases, advanced controller
drivers might want to optimize this and craft their own page write
helper to leverage additional hardware capabilities, thus not always
using the core facilities.
Some drivers, like this one, do not use the core helper to finish a page
write because the final cycles are automatically managed by the
hardware. In this case, the additional care must be taken to manually
perform the final status check.
Let's read the NAND chip status at the end of the page write helper and
return -EIO upon error.
Cc: stable(a)vger.kernel.org
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver")
Reported-by: Aviram Dali <aviramd(a)marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Tested-by: Ravi Chandra Minnikanti <rminnikanti(a)marvell.com>
Link: https://lore.kernel.org/linux-mtd/20230717194221.229778-1-miquel.raynal@boo…
diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index 2c94da7a3b3a..b841a81cb128 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -1165,6 +1165,7 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
.ndcb[2] = NDCB2_ADDR5_PAGE(page),
};
unsigned int oob_bytes = lt->spare_bytes + (raw ? lt->ecc_bytes : 0);
+ u8 status;
int ret;
/* NFCv2 needs more information about the operation being executed */
@@ -1198,7 +1199,18 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
ret = marvell_nfc_wait_op(chip,
PSEC_TO_MSEC(sdr->tPROG_max));
- return ret;
+ if (ret)
+ return ret;
+
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
+ return 0;
}
static int marvell_nfc_hw_ecc_hmg_write_page_raw(struct nand_chip *chip,
@@ -1627,6 +1639,7 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
int data_len = lt->data_bytes;
int spare_len = lt->spare_bytes;
int chunk, ret;
+ u8 status;
marvell_nfc_select_target(chip, chip->cur_cs);
@@ -1663,6 +1676,14 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
if (ret)
return ret;
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 3e01d5254698ea3d18e09d96b974c762328352cd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102110-prominent-seventh-4ea6@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e01d5254698ea3d18e09d96b974c762328352cd Mon Sep 17 00:00:00 2001
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
Date: Mon, 17 Jul 2023 21:42:19 +0200
Subject: [PATCH] mtd: rawnand: marvell: Ensure program page operations are
successful
The NAND core complies with the ONFI specification, which itself
mentions that after any program or erase operation, a status check
should be performed to see whether the operation was finished *and*
successful.
The NAND core offers helpers to finish a page write (sending the
"PAGE PROG" command, waiting for the NAND chip to be ready again, and
checking the operation status). But in some cases, advanced controller
drivers might want to optimize this and craft their own page write
helper to leverage additional hardware capabilities, thus not always
using the core facilities.
Some drivers, like this one, do not use the core helper to finish a page
write because the final cycles are automatically managed by the
hardware. In this case, the additional care must be taken to manually
perform the final status check.
Let's read the NAND chip status at the end of the page write helper and
return -EIO upon error.
Cc: stable(a)vger.kernel.org
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver")
Reported-by: Aviram Dali <aviramd(a)marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Tested-by: Ravi Chandra Minnikanti <rminnikanti(a)marvell.com>
Link: https://lore.kernel.org/linux-mtd/20230717194221.229778-1-miquel.raynal@boo…
diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index 2c94da7a3b3a..b841a81cb128 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -1165,6 +1165,7 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
.ndcb[2] = NDCB2_ADDR5_PAGE(page),
};
unsigned int oob_bytes = lt->spare_bytes + (raw ? lt->ecc_bytes : 0);
+ u8 status;
int ret;
/* NFCv2 needs more information about the operation being executed */
@@ -1198,7 +1199,18 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
ret = marvell_nfc_wait_op(chip,
PSEC_TO_MSEC(sdr->tPROG_max));
- return ret;
+ if (ret)
+ return ret;
+
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
+ return 0;
}
static int marvell_nfc_hw_ecc_hmg_write_page_raw(struct nand_chip *chip,
@@ -1627,6 +1639,7 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
int data_len = lt->data_bytes;
int spare_len = lt->spare_bytes;
int chunk, ret;
+ u8 status;
marvell_nfc_select_target(chip, chip->cur_cs);
@@ -1663,6 +1676,14 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
if (ret)
return ret;
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 3e01d5254698ea3d18e09d96b974c762328352cd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102108-winner-gorged-4d0d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e01d5254698ea3d18e09d96b974c762328352cd Mon Sep 17 00:00:00 2001
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
Date: Mon, 17 Jul 2023 21:42:19 +0200
Subject: [PATCH] mtd: rawnand: marvell: Ensure program page operations are
successful
The NAND core complies with the ONFI specification, which itself
mentions that after any program or erase operation, a status check
should be performed to see whether the operation was finished *and*
successful.
The NAND core offers helpers to finish a page write (sending the
"PAGE PROG" command, waiting for the NAND chip to be ready again, and
checking the operation status). But in some cases, advanced controller
drivers might want to optimize this and craft their own page write
helper to leverage additional hardware capabilities, thus not always
using the core facilities.
Some drivers, like this one, do not use the core helper to finish a page
write because the final cycles are automatically managed by the
hardware. In this case, the additional care must be taken to manually
perform the final status check.
Let's read the NAND chip status at the end of the page write helper and
return -EIO upon error.
Cc: stable(a)vger.kernel.org
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver")
Reported-by: Aviram Dali <aviramd(a)marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Tested-by: Ravi Chandra Minnikanti <rminnikanti(a)marvell.com>
Link: https://lore.kernel.org/linux-mtd/20230717194221.229778-1-miquel.raynal@boo…
diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index 2c94da7a3b3a..b841a81cb128 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -1165,6 +1165,7 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
.ndcb[2] = NDCB2_ADDR5_PAGE(page),
};
unsigned int oob_bytes = lt->spare_bytes + (raw ? lt->ecc_bytes : 0);
+ u8 status;
int ret;
/* NFCv2 needs more information about the operation being executed */
@@ -1198,7 +1199,18 @@ static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
ret = marvell_nfc_wait_op(chip,
PSEC_TO_MSEC(sdr->tPROG_max));
- return ret;
+ if (ret)
+ return ret;
+
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
+ return 0;
}
static int marvell_nfc_hw_ecc_hmg_write_page_raw(struct nand_chip *chip,
@@ -1627,6 +1639,7 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
int data_len = lt->data_bytes;
int spare_len = lt->spare_bytes;
int chunk, ret;
+ u8 status;
marvell_nfc_select_target(chip, chip->cur_cs);
@@ -1663,6 +1676,14 @@ static int marvell_nfc_hw_ecc_bch_write_page(struct nand_chip *chip,
if (ret)
return ret;
+ /* Check write status on the chip side */
+ ret = nand_status_op(chip, &status);
+ if (ret)
+ return ret;
+
+ if (status & NAND_STATUS_FAIL)
+ return -EIO;
+
return 0;
}
The patch below does not apply to the 6.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.5.y
git checkout FETCH_HEAD
git cherry-pick -x 8e15aee621618a3ee3abecaf1fd8c1428098b7ef
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102009-aqueduct-crate-9618@gregkh' --subject-prefix 'PATCH 6.5.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e15aee621618a3ee3abecaf1fd8c1428098b7ef Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:16 -0700
Subject: [PATCH] net: move altnames together with the netdevice
The altname nodes are currently not moved to the new netns
when netdevice itself moves:
[ ~]# ip netns add test
[ ~]# ip -netns test link add name eth0 type dummy
[ ~]# ip -netns test link property add dev eth0 altname some-name
[ ~]# ip -netns test link show dev some-name
2: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 1e:67:ed:19:3d:24 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip -netns test link set dev eth0 netns 1
[ ~]# ip link
...
3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip li show dev some-name
Device "some-name" does not exist.
Remove them from the hash table when device is unlisted
and add back when listed again.
Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 559705aeefe4..9f3f8930c691 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -381,6 +381,7 @@ static void netdev_name_node_alt_flush(struct net_device *dev)
/* Device list insertion */
static void list_netdevice(struct net_device *dev)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
@@ -391,6 +392,10 @@ static void list_netdevice(struct net_device *dev)
hlist_add_head_rcu(&dev->index_hlist,
dev_index_hash(net, dev->ifindex));
write_unlock(&dev_base_lock);
+
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_add(net, name_node);
+
/* We reserved the ifindex, this can't fail */
WARN_ON(xa_store(&net->dev_by_index, dev->ifindex, dev, GFP_KERNEL));
@@ -402,12 +407,16 @@ static void list_netdevice(struct net_device *dev)
*/
static void unlist_netdevice(struct net_device *dev, bool lock)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
xa_erase(&net->dev_by_index, dev->ifindex);
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_del(name_node);
+
/* Unlink dev from the device chain */
if (lock)
write_lock(&dev_base_lock);
@@ -10942,7 +10951,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
synchronize_net();
list_for_each_entry(dev, head, unreg_list) {
- struct netdev_name_node *name_node;
struct sk_buff *skb = NULL;
/* Shutdown queueing discipline. */
@@ -10970,9 +10978,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_uc_flush(dev);
dev_mc_flush(dev);
- netdev_for_each_altname(dev, name_node)
- netdev_name_node_del(name_node);
- synchronize_rcu();
netdev_name_node_alt_flush(dev);
netdev_name_node_free(dev->name_node);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1a83f4a7c156fa6bbd6b530e89fa3270bf3d9d1b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102023-ignore-graveness-861d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a83f4a7c156fa6bbd6b530e89fa3270bf3d9d1b Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:15 -0700
Subject: [PATCH] net: avoid UAF on deleted altname
Altnames are accessed under RCU (dev_get_by_name_rcu())
but freed by kfree() with no synchronization point.
Each node has one or two allocations (node and a variable-size
name, sometimes the name is netdev->name). Adding rcu_heads
here is a bit tedious. Besides most code which unlists the names
already has rcu barriers - so take the simpler approach of adding
synchronize_rcu(). Note that the one on the unregistration path
(which matters more) is removed by the next fix.
Fixes: ff92741270bf ("net: introduce name_node struct to be used in hashlist")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index ae557193b77c..559705aeefe4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -345,7 +345,6 @@ int netdev_name_node_alt_create(struct net_device *dev, const char *name)
static void __netdev_name_node_alt_destroy(struct netdev_name_node *name_node)
{
list_del(&name_node->list);
- netdev_name_node_del(name_node);
kfree(name_node->name);
netdev_name_node_free(name_node);
}
@@ -364,6 +363,8 @@ int netdev_name_node_alt_destroy(struct net_device *dev, const char *name)
if (name_node == dev->name_node || name_node->dev != dev)
return -EINVAL;
+ netdev_name_node_del(name_node);
+ synchronize_rcu();
__netdev_name_node_alt_destroy(name_node);
return 0;
@@ -10941,6 +10942,7 @@ void unregister_netdevice_many_notify(struct list_head *head,
synchronize_net();
list_for_each_entry(dev, head, unreg_list) {
+ struct netdev_name_node *name_node;
struct sk_buff *skb = NULL;
/* Shutdown queueing discipline. */
@@ -10968,6 +10970,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_uc_flush(dev);
dev_mc_flush(dev);
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_del(name_node);
+ synchronize_rcu();
netdev_name_node_alt_flush(dev);
netdev_name_node_free(dev->name_node);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1a83f4a7c156fa6bbd6b530e89fa3270bf3d9d1b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102011-rubble-require-8127@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a83f4a7c156fa6bbd6b530e89fa3270bf3d9d1b Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:15 -0700
Subject: [PATCH] net: avoid UAF on deleted altname
Altnames are accessed under RCU (dev_get_by_name_rcu())
but freed by kfree() with no synchronization point.
Each node has one or two allocations (node and a variable-size
name, sometimes the name is netdev->name). Adding rcu_heads
here is a bit tedious. Besides most code which unlists the names
already has rcu barriers - so take the simpler approach of adding
synchronize_rcu(). Note that the one on the unregistration path
(which matters more) is removed by the next fix.
Fixes: ff92741270bf ("net: introduce name_node struct to be used in hashlist")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index ae557193b77c..559705aeefe4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -345,7 +345,6 @@ int netdev_name_node_alt_create(struct net_device *dev, const char *name)
static void __netdev_name_node_alt_destroy(struct netdev_name_node *name_node)
{
list_del(&name_node->list);
- netdev_name_node_del(name_node);
kfree(name_node->name);
netdev_name_node_free(name_node);
}
@@ -364,6 +363,8 @@ int netdev_name_node_alt_destroy(struct net_device *dev, const char *name)
if (name_node == dev->name_node || name_node->dev != dev)
return -EINVAL;
+ netdev_name_node_del(name_node);
+ synchronize_rcu();
__netdev_name_node_alt_destroy(name_node);
return 0;
@@ -10941,6 +10942,7 @@ void unregister_netdevice_many_notify(struct list_head *head,
synchronize_net();
list_for_each_entry(dev, head, unreg_list) {
+ struct netdev_name_node *name_node;
struct sk_buff *skb = NULL;
/* Shutdown queueing discipline. */
@@ -10968,6 +10970,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_uc_flush(dev);
dev_mc_flush(dev);
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_del(name_node);
+ synchronize_rcu();
netdev_name_node_alt_flush(dev);
netdev_name_node_free(dev->name_node);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8e15aee621618a3ee3abecaf1fd8c1428098b7ef
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102012-trustee-handball-1567@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e15aee621618a3ee3abecaf1fd8c1428098b7ef Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:16 -0700
Subject: [PATCH] net: move altnames together with the netdevice
The altname nodes are currently not moved to the new netns
when netdevice itself moves:
[ ~]# ip netns add test
[ ~]# ip -netns test link add name eth0 type dummy
[ ~]# ip -netns test link property add dev eth0 altname some-name
[ ~]# ip -netns test link show dev some-name
2: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 1e:67:ed:19:3d:24 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip -netns test link set dev eth0 netns 1
[ ~]# ip link
...
3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip li show dev some-name
Device "some-name" does not exist.
Remove them from the hash table when device is unlisted
and add back when listed again.
Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 559705aeefe4..9f3f8930c691 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -381,6 +381,7 @@ static void netdev_name_node_alt_flush(struct net_device *dev)
/* Device list insertion */
static void list_netdevice(struct net_device *dev)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
@@ -391,6 +392,10 @@ static void list_netdevice(struct net_device *dev)
hlist_add_head_rcu(&dev->index_hlist,
dev_index_hash(net, dev->ifindex));
write_unlock(&dev_base_lock);
+
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_add(net, name_node);
+
/* We reserved the ifindex, this can't fail */
WARN_ON(xa_store(&net->dev_by_index, dev->ifindex, dev, GFP_KERNEL));
@@ -402,12 +407,16 @@ static void list_netdevice(struct net_device *dev)
*/
static void unlist_netdevice(struct net_device *dev, bool lock)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
xa_erase(&net->dev_by_index, dev->ifindex);
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_del(name_node);
+
/* Unlink dev from the device chain */
if (lock)
write_lock(&dev_base_lock);
@@ -10942,7 +10951,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
synchronize_net();
list_for_each_entry(dev, head, unreg_list) {
- struct netdev_name_node *name_node;
struct sk_buff *skb = NULL;
/* Shutdown queueing discipline. */
@@ -10970,9 +10978,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_uc_flush(dev);
dev_mc_flush(dev);
- netdev_for_each_altname(dev, name_node)
- netdev_name_node_del(name_node);
- synchronize_rcu();
netdev_name_node_alt_flush(dev);
netdev_name_node_free(dev->name_node);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8e15aee621618a3ee3abecaf1fd8c1428098b7ef
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102011-escapable-flattop-efeb@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e15aee621618a3ee3abecaf1fd8c1428098b7ef Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:16 -0700
Subject: [PATCH] net: move altnames together with the netdevice
The altname nodes are currently not moved to the new netns
when netdevice itself moves:
[ ~]# ip netns add test
[ ~]# ip -netns test link add name eth0 type dummy
[ ~]# ip -netns test link property add dev eth0 altname some-name
[ ~]# ip -netns test link show dev some-name
2: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 1e:67:ed:19:3d:24 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip -netns test link set dev eth0 netns 1
[ ~]# ip link
...
3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
altname some-name
[ ~]# ip li show dev some-name
Device "some-name" does not exist.
Remove them from the hash table when device is unlisted
and add back when listed again.
Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 559705aeefe4..9f3f8930c691 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -381,6 +381,7 @@ static void netdev_name_node_alt_flush(struct net_device *dev)
/* Device list insertion */
static void list_netdevice(struct net_device *dev)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
@@ -391,6 +392,10 @@ static void list_netdevice(struct net_device *dev)
hlist_add_head_rcu(&dev->index_hlist,
dev_index_hash(net, dev->ifindex));
write_unlock(&dev_base_lock);
+
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_add(net, name_node);
+
/* We reserved the ifindex, this can't fail */
WARN_ON(xa_store(&net->dev_by_index, dev->ifindex, dev, GFP_KERNEL));
@@ -402,12 +407,16 @@ static void list_netdevice(struct net_device *dev)
*/
static void unlist_netdevice(struct net_device *dev, bool lock)
{
+ struct netdev_name_node *name_node;
struct net *net = dev_net(dev);
ASSERT_RTNL();
xa_erase(&net->dev_by_index, dev->ifindex);
+ netdev_for_each_altname(dev, name_node)
+ netdev_name_node_del(name_node);
+
/* Unlink dev from the device chain */
if (lock)
write_lock(&dev_base_lock);
@@ -10942,7 +10951,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
synchronize_net();
list_for_each_entry(dev, head, unreg_list) {
- struct netdev_name_node *name_node;
struct sk_buff *skb = NULL;
/* Shutdown queueing discipline. */
@@ -10970,9 +10978,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_uc_flush(dev);
dev_mc_flush(dev);
- netdev_for_each_altname(dev, name_node)
- netdev_name_node_del(name_node);
- synchronize_rcu();
netdev_name_node_alt_flush(dev);
netdev_name_node_free(dev->name_node);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 7663d522099ecc464512164e660bc771b2ff7b64
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102049-endanger-chief-b8ab@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7663d522099ecc464512164e660bc771b2ff7b64 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:14 -0700
Subject: [PATCH] net: check for altname conflicts when changing netdev's netns
It's currently possible to create an altname conflicting
with an altname or real name of another device by creating
it in another netns and moving it over:
[ ~]$ ip link add dev eth0 type dummy
[ ~]$ ip netns add test
[ ~]$ ip -netns test link add dev ethX netns test type dummy
[ ~]$ ip -netns test link property add dev ethX altname eth0
[ ~]$ ip -netns test link set dev ethX netns 1
[ ~]$ ip link
...
3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
...
5: ethX: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 26:b7:28:78:38:0f brd ff:ff:ff:ff:ff:ff
altname eth0
Create a macro for walking the altnames, this hopefully makes
it clearer that the list we walk contains only altnames.
Which is otherwise not entirely intuitive.
Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index f109ad34d660..ae557193b77c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1086,7 +1086,8 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
for_each_netdev(net, d) {
struct netdev_name_node *name_node;
- list_for_each_entry(name_node, &d->name_node->list, list) {
+
+ netdev_for_each_altname(d, name_node) {
if (!sscanf(name_node->name, name, &i))
continue;
if (i < 0 || i >= max_netdevices)
@@ -11051,6 +11052,7 @@ EXPORT_SYMBOL(unregister_netdev);
int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
+ struct netdev_name_node *name_node;
struct net *net_old = dev_net(dev);
char new_name[IFNAMSIZ] = {};
int err, new_nsid;
@@ -11083,6 +11085,11 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
if (err < 0)
goto out;
}
+ /* Check that none of the altnames conflicts. */
+ err = -EEXIST;
+ netdev_for_each_altname(dev, name_node)
+ if (netdev_name_in_use(net, name_node->name))
+ goto out;
/* Check that new_ifindex isn't used yet. */
if (new_ifindex) {
diff --git a/net/core/dev.h b/net/core/dev.h
index e075e198092c..fa2e9c5c4122 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -62,6 +62,9 @@ struct netdev_name_node {
int netdev_get_name(struct net *net, char *name, int ifindex);
int dev_change_name(struct net_device *dev, const char *newname);
+#define netdev_for_each_altname(dev, namenode) \
+ list_for_each_entry((namenode), &(dev)->name_node->list, list)
+
int netdev_name_node_alt_create(struct net_device *dev, const char *name);
int netdev_name_node_alt_destroy(struct net_device *dev, const char *name);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 7663d522099ecc464512164e660bc771b2ff7b64
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102047-purplish-sectional-1245@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7663d522099ecc464512164e660bc771b2ff7b64 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:14 -0700
Subject: [PATCH] net: check for altname conflicts when changing netdev's netns
It's currently possible to create an altname conflicting
with an altname or real name of another device by creating
it in another netns and moving it over:
[ ~]$ ip link add dev eth0 type dummy
[ ~]$ ip netns add test
[ ~]$ ip -netns test link add dev ethX netns test type dummy
[ ~]$ ip -netns test link property add dev ethX altname eth0
[ ~]$ ip -netns test link set dev ethX netns 1
[ ~]$ ip link
...
3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
...
5: ethX: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 26:b7:28:78:38:0f brd ff:ff:ff:ff:ff:ff
altname eth0
Create a macro for walking the altnames, this hopefully makes
it clearer that the list we walk contains only altnames.
Which is otherwise not entirely intuitive.
Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index f109ad34d660..ae557193b77c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1086,7 +1086,8 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
for_each_netdev(net, d) {
struct netdev_name_node *name_node;
- list_for_each_entry(name_node, &d->name_node->list, list) {
+
+ netdev_for_each_altname(d, name_node) {
if (!sscanf(name_node->name, name, &i))
continue;
if (i < 0 || i >= max_netdevices)
@@ -11051,6 +11052,7 @@ EXPORT_SYMBOL(unregister_netdev);
int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
+ struct netdev_name_node *name_node;
struct net *net_old = dev_net(dev);
char new_name[IFNAMSIZ] = {};
int err, new_nsid;
@@ -11083,6 +11085,11 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
if (err < 0)
goto out;
}
+ /* Check that none of the altnames conflicts. */
+ err = -EEXIST;
+ netdev_for_each_altname(dev, name_node)
+ if (netdev_name_in_use(net, name_node->name))
+ goto out;
/* Check that new_ifindex isn't used yet. */
if (new_ifindex) {
diff --git a/net/core/dev.h b/net/core/dev.h
index e075e198092c..fa2e9c5c4122 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -62,6 +62,9 @@ struct netdev_name_node {
int netdev_get_name(struct net *net, char *name, int ifindex);
int dev_change_name(struct net_device *dev, const char *newname);
+#define netdev_for_each_altname(dev, namenode) \
+ list_for_each_entry((namenode), &(dev)->name_node->list, list)
+
int netdev_name_node_alt_create(struct net_device *dev, const char *name);
int netdev_name_node_alt_destroy(struct net_device *dev, const char *name);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 311cca40661f428b7aa114fb5af578cfdbe3e8b6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102028-tartness-disown-ace1@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 311cca40661f428b7aa114fb5af578cfdbe3e8b6 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:13 -0700
Subject: [PATCH] net: fix ifname in netlink ntf during netns move
dev_get_valid_name() overwrites the netdev's name on success.
This makes it hard to use in prepare-commit-like fashion,
where we do validation first, and "commit" to the change
later.
Factor out a helper which lets us save the new name to a buffer.
Use it to fix the problem of notification on netns move having
incorrect name:
5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff
[ ~]# ip link set dev eth0 netns 1 name eth1
ip monitor inside netns:
Deleted inet eth0
Deleted inet6 eth0
Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7
Name is reported as eth1 in old netns for ifindex 5, already renamed.
Fixes: d90310243fd7 ("net: device name allocation cleanups")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5aaf5753d4e4..f109ad34d660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1123,6 +1123,26 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
return -ENFILE;
}
+static int dev_prep_valid_name(struct net *net, struct net_device *dev,
+ const char *want_name, char *out_name)
+{
+ int ret;
+
+ if (!dev_valid_name(want_name))
+ return -EINVAL;
+
+ if (strchr(want_name, '%')) {
+ ret = __dev_alloc_name(net, want_name, out_name);
+ return ret < 0 ? ret : 0;
+ } else if (netdev_name_in_use(net, want_name)) {
+ return -EEXIST;
+ } else if (out_name != want_name) {
+ strscpy(out_name, want_name, IFNAMSIZ);
+ }
+
+ return 0;
+}
+
static int dev_alloc_name_ns(struct net *net,
struct net_device *dev,
const char *name)
@@ -1160,19 +1180,13 @@ EXPORT_SYMBOL(dev_alloc_name);
static int dev_get_valid_name(struct net *net, struct net_device *dev,
const char *name)
{
- BUG_ON(!net);
-
- if (!dev_valid_name(name))
- return -EINVAL;
-
- if (strchr(name, '%'))
- return dev_alloc_name_ns(net, dev, name);
- else if (netdev_name_in_use(net, name))
- return -EEXIST;
- else if (dev->name != name)
- strscpy(dev->name, name, IFNAMSIZ);
+ char buf[IFNAMSIZ];
+ int ret;
- return 0;
+ ret = dev_prep_valid_name(net, dev, name, buf);
+ if (ret >= 0)
+ strscpy(dev->name, buf, IFNAMSIZ);
+ return ret;
}
/**
@@ -11038,6 +11052,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
struct net *net_old = dev_net(dev);
+ char new_name[IFNAMSIZ] = {};
int err, new_nsid;
ASSERT_RTNL();
@@ -11064,7 +11079,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
/* We get here if we can't use the current device name */
if (!pat)
goto out;
- err = dev_get_valid_name(net, dev, pat);
+ err = dev_prep_valid_name(net, dev, pat, new_name);
if (err < 0)
goto out;
}
@@ -11135,6 +11150,9 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
netdev_adjacent_add_links(dev);
+ if (new_name[0]) /* Rename the netdev to prepared name */
+ strscpy(dev->name, new_name, IFNAMSIZ);
+
/* Fixup kobjects */
err = device_rename(&dev->dev, dev->name);
WARN_ON(err);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 311cca40661f428b7aa114fb5af578cfdbe3e8b6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102027-writing-grab-f97c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 311cca40661f428b7aa114fb5af578cfdbe3e8b6 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:13 -0700
Subject: [PATCH] net: fix ifname in netlink ntf during netns move
dev_get_valid_name() overwrites the netdev's name on success.
This makes it hard to use in prepare-commit-like fashion,
where we do validation first, and "commit" to the change
later.
Factor out a helper which lets us save the new name to a buffer.
Use it to fix the problem of notification on netns move having
incorrect name:
5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff
[ ~]# ip link set dev eth0 netns 1 name eth1
ip monitor inside netns:
Deleted inet eth0
Deleted inet6 eth0
Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7
Name is reported as eth1 in old netns for ifindex 5, already renamed.
Fixes: d90310243fd7 ("net: device name allocation cleanups")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5aaf5753d4e4..f109ad34d660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1123,6 +1123,26 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
return -ENFILE;
}
+static int dev_prep_valid_name(struct net *net, struct net_device *dev,
+ const char *want_name, char *out_name)
+{
+ int ret;
+
+ if (!dev_valid_name(want_name))
+ return -EINVAL;
+
+ if (strchr(want_name, '%')) {
+ ret = __dev_alloc_name(net, want_name, out_name);
+ return ret < 0 ? ret : 0;
+ } else if (netdev_name_in_use(net, want_name)) {
+ return -EEXIST;
+ } else if (out_name != want_name) {
+ strscpy(out_name, want_name, IFNAMSIZ);
+ }
+
+ return 0;
+}
+
static int dev_alloc_name_ns(struct net *net,
struct net_device *dev,
const char *name)
@@ -1160,19 +1180,13 @@ EXPORT_SYMBOL(dev_alloc_name);
static int dev_get_valid_name(struct net *net, struct net_device *dev,
const char *name)
{
- BUG_ON(!net);
-
- if (!dev_valid_name(name))
- return -EINVAL;
-
- if (strchr(name, '%'))
- return dev_alloc_name_ns(net, dev, name);
- else if (netdev_name_in_use(net, name))
- return -EEXIST;
- else if (dev->name != name)
- strscpy(dev->name, name, IFNAMSIZ);
+ char buf[IFNAMSIZ];
+ int ret;
- return 0;
+ ret = dev_prep_valid_name(net, dev, name, buf);
+ if (ret >= 0)
+ strscpy(dev->name, buf, IFNAMSIZ);
+ return ret;
}
/**
@@ -11038,6 +11052,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
struct net *net_old = dev_net(dev);
+ char new_name[IFNAMSIZ] = {};
int err, new_nsid;
ASSERT_RTNL();
@@ -11064,7 +11079,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
/* We get here if we can't use the current device name */
if (!pat)
goto out;
- err = dev_get_valid_name(net, dev, pat);
+ err = dev_prep_valid_name(net, dev, pat, new_name);
if (err < 0)
goto out;
}
@@ -11135,6 +11150,9 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
netdev_adjacent_add_links(dev);
+ if (new_name[0]) /* Rename the netdev to prepared name */
+ strscpy(dev->name, new_name, IFNAMSIZ);
+
/* Fixup kobjects */
err = device_rename(&dev->dev, dev->name);
WARN_ON(err);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 311cca40661f428b7aa114fb5af578cfdbe3e8b6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102026-cruelly-untaken-8b7c@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 311cca40661f428b7aa114fb5af578cfdbe3e8b6 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:13 -0700
Subject: [PATCH] net: fix ifname in netlink ntf during netns move
dev_get_valid_name() overwrites the netdev's name on success.
This makes it hard to use in prepare-commit-like fashion,
where we do validation first, and "commit" to the change
later.
Factor out a helper which lets us save the new name to a buffer.
Use it to fix the problem of notification on netns move having
incorrect name:
5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff
[ ~]# ip link set dev eth0 netns 1 name eth1
ip monitor inside netns:
Deleted inet eth0
Deleted inet6 eth0
Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7
Name is reported as eth1 in old netns for ifindex 5, already renamed.
Fixes: d90310243fd7 ("net: device name allocation cleanups")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5aaf5753d4e4..f109ad34d660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1123,6 +1123,26 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
return -ENFILE;
}
+static int dev_prep_valid_name(struct net *net, struct net_device *dev,
+ const char *want_name, char *out_name)
+{
+ int ret;
+
+ if (!dev_valid_name(want_name))
+ return -EINVAL;
+
+ if (strchr(want_name, '%')) {
+ ret = __dev_alloc_name(net, want_name, out_name);
+ return ret < 0 ? ret : 0;
+ } else if (netdev_name_in_use(net, want_name)) {
+ return -EEXIST;
+ } else if (out_name != want_name) {
+ strscpy(out_name, want_name, IFNAMSIZ);
+ }
+
+ return 0;
+}
+
static int dev_alloc_name_ns(struct net *net,
struct net_device *dev,
const char *name)
@@ -1160,19 +1180,13 @@ EXPORT_SYMBOL(dev_alloc_name);
static int dev_get_valid_name(struct net *net, struct net_device *dev,
const char *name)
{
- BUG_ON(!net);
-
- if (!dev_valid_name(name))
- return -EINVAL;
-
- if (strchr(name, '%'))
- return dev_alloc_name_ns(net, dev, name);
- else if (netdev_name_in_use(net, name))
- return -EEXIST;
- else if (dev->name != name)
- strscpy(dev->name, name, IFNAMSIZ);
+ char buf[IFNAMSIZ];
+ int ret;
- return 0;
+ ret = dev_prep_valid_name(net, dev, name, buf);
+ if (ret >= 0)
+ strscpy(dev->name, buf, IFNAMSIZ);
+ return ret;
}
/**
@@ -11038,6 +11052,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
struct net *net_old = dev_net(dev);
+ char new_name[IFNAMSIZ] = {};
int err, new_nsid;
ASSERT_RTNL();
@@ -11064,7 +11079,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
/* We get here if we can't use the current device name */
if (!pat)
goto out;
- err = dev_get_valid_name(net, dev, pat);
+ err = dev_prep_valid_name(net, dev, pat, new_name);
if (err < 0)
goto out;
}
@@ -11135,6 +11150,9 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
netdev_adjacent_add_links(dev);
+ if (new_name[0]) /* Rename the netdev to prepared name */
+ strscpy(dev->name, new_name, IFNAMSIZ);
+
/* Fixup kobjects */
err = device_rename(&dev->dev, dev->name);
WARN_ON(err);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 311cca40661f428b7aa114fb5af578cfdbe3e8b6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102025-divisibly-deck-9f5a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 311cca40661f428b7aa114fb5af578cfdbe3e8b6 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:13 -0700
Subject: [PATCH] net: fix ifname in netlink ntf during netns move
dev_get_valid_name() overwrites the netdev's name on success.
This makes it hard to use in prepare-commit-like fashion,
where we do validation first, and "commit" to the change
later.
Factor out a helper which lets us save the new name to a buffer.
Use it to fix the problem of notification on netns move having
incorrect name:
5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff
[ ~]# ip link set dev eth0 netns 1 name eth1
ip monitor inside netns:
Deleted inet eth0
Deleted inet6 eth0
Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7
Name is reported as eth1 in old netns for ifindex 5, already renamed.
Fixes: d90310243fd7 ("net: device name allocation cleanups")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5aaf5753d4e4..f109ad34d660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1123,6 +1123,26 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
return -ENFILE;
}
+static int dev_prep_valid_name(struct net *net, struct net_device *dev,
+ const char *want_name, char *out_name)
+{
+ int ret;
+
+ if (!dev_valid_name(want_name))
+ return -EINVAL;
+
+ if (strchr(want_name, '%')) {
+ ret = __dev_alloc_name(net, want_name, out_name);
+ return ret < 0 ? ret : 0;
+ } else if (netdev_name_in_use(net, want_name)) {
+ return -EEXIST;
+ } else if (out_name != want_name) {
+ strscpy(out_name, want_name, IFNAMSIZ);
+ }
+
+ return 0;
+}
+
static int dev_alloc_name_ns(struct net *net,
struct net_device *dev,
const char *name)
@@ -1160,19 +1180,13 @@ EXPORT_SYMBOL(dev_alloc_name);
static int dev_get_valid_name(struct net *net, struct net_device *dev,
const char *name)
{
- BUG_ON(!net);
-
- if (!dev_valid_name(name))
- return -EINVAL;
-
- if (strchr(name, '%'))
- return dev_alloc_name_ns(net, dev, name);
- else if (netdev_name_in_use(net, name))
- return -EEXIST;
- else if (dev->name != name)
- strscpy(dev->name, name, IFNAMSIZ);
+ char buf[IFNAMSIZ];
+ int ret;
- return 0;
+ ret = dev_prep_valid_name(net, dev, name, buf);
+ if (ret >= 0)
+ strscpy(dev->name, buf, IFNAMSIZ);
+ return ret;
}
/**
@@ -11038,6 +11052,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
struct net *net_old = dev_net(dev);
+ char new_name[IFNAMSIZ] = {};
int err, new_nsid;
ASSERT_RTNL();
@@ -11064,7 +11079,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
/* We get here if we can't use the current device name */
if (!pat)
goto out;
- err = dev_get_valid_name(net, dev, pat);
+ err = dev_prep_valid_name(net, dev, pat, new_name);
if (err < 0)
goto out;
}
@@ -11135,6 +11150,9 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
netdev_adjacent_add_links(dev);
+ if (new_name[0]) /* Rename the netdev to prepared name */
+ strscpy(dev->name, new_name, IFNAMSIZ);
+
/* Fixup kobjects */
err = device_rename(&dev->dev, dev->name);
WARN_ON(err);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 311cca40661f428b7aa114fb5af578cfdbe3e8b6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102024-blame-romp-1612@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 311cca40661f428b7aa114fb5af578cfdbe3e8b6 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 17 Oct 2023 18:38:13 -0700
Subject: [PATCH] net: fix ifname in netlink ntf during netns move
dev_get_valid_name() overwrites the netdev's name on success.
This makes it hard to use in prepare-commit-like fashion,
where we do validation first, and "commit" to the change
later.
Factor out a helper which lets us save the new name to a buffer.
Use it to fix the problem of notification on netns move having
incorrect name:
5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff
[ ~]# ip link set dev eth0 netns 1 name eth1
ip monitor inside netns:
Deleted inet eth0
Deleted inet6 eth0
Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7
Name is reported as eth1 in old netns for ifindex 5, already renamed.
Fixes: d90310243fd7 ("net: device name allocation cleanups")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Reviewed-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5aaf5753d4e4..f109ad34d660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1123,6 +1123,26 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf)
return -ENFILE;
}
+static int dev_prep_valid_name(struct net *net, struct net_device *dev,
+ const char *want_name, char *out_name)
+{
+ int ret;
+
+ if (!dev_valid_name(want_name))
+ return -EINVAL;
+
+ if (strchr(want_name, '%')) {
+ ret = __dev_alloc_name(net, want_name, out_name);
+ return ret < 0 ? ret : 0;
+ } else if (netdev_name_in_use(net, want_name)) {
+ return -EEXIST;
+ } else if (out_name != want_name) {
+ strscpy(out_name, want_name, IFNAMSIZ);
+ }
+
+ return 0;
+}
+
static int dev_alloc_name_ns(struct net *net,
struct net_device *dev,
const char *name)
@@ -1160,19 +1180,13 @@ EXPORT_SYMBOL(dev_alloc_name);
static int dev_get_valid_name(struct net *net, struct net_device *dev,
const char *name)
{
- BUG_ON(!net);
-
- if (!dev_valid_name(name))
- return -EINVAL;
-
- if (strchr(name, '%'))
- return dev_alloc_name_ns(net, dev, name);
- else if (netdev_name_in_use(net, name))
- return -EEXIST;
- else if (dev->name != name)
- strscpy(dev->name, name, IFNAMSIZ);
+ char buf[IFNAMSIZ];
+ int ret;
- return 0;
+ ret = dev_prep_valid_name(net, dev, name, buf);
+ if (ret >= 0)
+ strscpy(dev->name, buf, IFNAMSIZ);
+ return ret;
}
/**
@@ -11038,6 +11052,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat, int new_ifindex)
{
struct net *net_old = dev_net(dev);
+ char new_name[IFNAMSIZ] = {};
int err, new_nsid;
ASSERT_RTNL();
@@ -11064,7 +11079,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
/* We get here if we can't use the current device name */
if (!pat)
goto out;
- err = dev_get_valid_name(net, dev, pat);
+ err = dev_prep_valid_name(net, dev, pat, new_name);
if (err < 0)
goto out;
}
@@ -11135,6 +11150,9 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
netdev_adjacent_add_links(dev);
+ if (new_name[0]) /* Rename the netdev to prepared name */
+ strscpy(dev->name, new_name, IFNAMSIZ);
+
/* Fixup kobjects */
err = device_rename(&dev->dev, dev->name);
WARN_ON(err);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x a13b67c9a015c4e21601ef9aa4ec9c5d972df1b4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102023-poster-opulently-4c96@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a13b67c9a015c4e21601ef9aa4ec9c5d972df1b4 Mon Sep 17 00:00:00 2001
From: Pedro Tammela <pctammela(a)mojatatu.com>
Date: Tue, 17 Oct 2023 11:36:02 -0300
Subject: [PATCH] net/sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a
inner curve
Christian Theune says:
I upgraded from 6.1.38 to 6.1.55 this morning and it broke my traffic shaping script,
leaving me with a non-functional uplink on a remote router.
A 'rt' curve cannot be used as a inner curve (parent class), but we were
allowing such configurations since the qdisc was introduced. Such
configurations would trigger a UAF as Budimir explains:
The parent will have vttree_insert() called on it in init_vf(),
but will not have vttree_remove() called on it in update_vf()
because it does not have the HFSC_FSC flag set.
The qdisc always assumes that inner classes have the HFSC_FSC flag set.
This is by design as it doesn't make sense 'qdisc wise' for an 'rt'
curve to be an inner curve.
Budimir's original patch disallows users to add classes with a 'rt'
parent, but this is too strict as it breaks users that have been using
'rt' as a inner class. Another approach, taken by this patch, is to
upgrade the inner 'rt' into a 'sc', warning the user in the process.
It avoids the UAF reported by Budimir while also being more permissive
to bad scripts/users/code using 'rt' as a inner class.
Users checking the `tc class ls [...]` or `tc class get [...]` dumps would
observe the curve change and are potentially breaking with this change.
v1->v2: https://lore.kernel.org/all/20231013151057.2611860-1-pctammela@mojatatu.com/
- Correct 'Fixes' tag and merge with revert (Jakub)
Cc: Christian Theune <ct(a)flyingcircus.io>
Cc: Budimir Markovic <markovicbudimir(a)gmail.com>
Fixes: b3d26c5702c7 ("net/sched: sch_hfsc: Ensure inner classes have fsc curve")
Signed-off-by: Pedro Tammela <pctammela(a)mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
Link: https://lore.kernel.org/r/20231017143602.3191556-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 3554085bc2be..880c5f16b29c 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -902,6 +902,14 @@ hfsc_change_usc(struct hfsc_class *cl, struct tc_service_curve *usc,
cl->cl_flags |= HFSC_USC;
}
+static void
+hfsc_upgrade_rt(struct hfsc_class *cl)
+{
+ cl->cl_fsc = cl->cl_rsc;
+ rtsc_init(&cl->cl_virtual, &cl->cl_fsc, cl->cl_vt, cl->cl_total);
+ cl->cl_flags |= HFSC_FSC;
+}
+
static const struct nla_policy hfsc_policy[TCA_HFSC_MAX + 1] = {
[TCA_HFSC_RSC] = { .len = sizeof(struct tc_service_curve) },
[TCA_HFSC_FSC] = { .len = sizeof(struct tc_service_curve) },
@@ -1011,10 +1019,6 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
if (parent == NULL)
return -ENOENT;
}
- if (!(parent->cl_flags & HFSC_FSC) && parent != &q->root) {
- NL_SET_ERR_MSG(extack, "Invalid parent - parent class must have FSC");
- return -EINVAL;
- }
if (classid == 0 || TC_H_MAJ(classid ^ sch->handle) != 0)
return -EINVAL;
@@ -1065,6 +1069,12 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
cl->cf_tree = RB_ROOT;
sch_tree_lock(sch);
+ /* Check if the inner class is a misconfigured 'rt' */
+ if (!(parent->cl_flags & HFSC_FSC) && parent != &q->root) {
+ NL_SET_ERR_MSG(extack,
+ "Forced curve change on parent 'rt' to 'sc'");
+ hfsc_upgrade_rt(parent);
+ }
qdisc_class_hash_insert(&q->clhash, &cl->cl_common);
list_add_tail(&cl->siblings, &parent->children);
if (parent->level == 0)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 61b40cefe51af005c72dbdcf975a3d166c6e6406
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102042-clinic-pruning-a3ca@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 61b40cefe51af005c72dbdcf975a3d166c6e6406 Mon Sep 17 00:00:00 2001
From: Jinjie Ruan <ruanjinjie(a)huawei.com>
Date: Wed, 11 Oct 2023 11:24:19 +0800
Subject: [PATCH] net: dsa: bcm_sf2: Fix possible memory leak in
bcm_sf2_mdio_register()
In bcm_sf2_mdio_register(), the class_find_device() will call get_device()
to increment reference count for priv->master_mii_bus->dev if
of_mdio_find_bus() succeeds. If mdiobus_alloc() or mdiobus_register()
fails, it will call get_device() twice without decrement reference count
for the device. And it is the same if bcm_sf2_mdio_register() succeeds but
fails in bcm_sf2_sw_probe(), or if bcm_sf2_sw_probe() succeeds. If the
reference count has not decremented to zero, the dev related resource will
not be freed.
So remove the get_device() in bcm_sf2_mdio_register(), and call
put_device() if mdiobus_alloc() or mdiobus_register() fails and in
bcm_sf2_mdio_unregister() to solve the issue.
And as Simon suggested, unwind from errors for bcm_sf2_mdio_register() and
just return 0 if it succeeds to make it cleaner.
Fixes: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Jinjie Ruan <ruanjinjie(a)huawei.com>
Suggested-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
Link: https://lore.kernel.org/r/20231011032419.2423290-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 72374b066f64..cd1f240c90f3 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -617,17 +617,16 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
dn = of_find_compatible_node(NULL, NULL, "brcm,unimac-mdio");
priv->master_mii_bus = of_mdio_find_bus(dn);
if (!priv->master_mii_bus) {
- of_node_put(dn);
- return -EPROBE_DEFER;
+ err = -EPROBE_DEFER;
+ goto err_of_node_put;
}
- get_device(&priv->master_mii_bus->dev);
priv->master_mii_dn = dn;
priv->slave_mii_bus = mdiobus_alloc();
if (!priv->slave_mii_bus) {
- of_node_put(dn);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto err_put_master_mii_bus_dev;
}
priv->slave_mii_bus->priv = priv;
@@ -684,11 +683,17 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
}
err = mdiobus_register(priv->slave_mii_bus);
- if (err && dn) {
- mdiobus_free(priv->slave_mii_bus);
- of_node_put(dn);
- }
+ if (err && dn)
+ goto err_free_slave_mii_bus;
+ return 0;
+
+err_free_slave_mii_bus:
+ mdiobus_free(priv->slave_mii_bus);
+err_put_master_mii_bus_dev:
+ put_device(&priv->master_mii_bus->dev);
+err_of_node_put:
+ of_node_put(dn);
return err;
}
@@ -696,6 +701,7 @@ static void bcm_sf2_mdio_unregister(struct bcm_sf2_priv *priv)
{
mdiobus_unregister(priv->slave_mii_bus);
mdiobus_free(priv->slave_mii_bus);
+ put_device(&priv->master_mii_bus->dev);
of_node_put(priv->master_mii_dn);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 61b40cefe51af005c72dbdcf975a3d166c6e6406
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102041-speculate-spew-1d42@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 61b40cefe51af005c72dbdcf975a3d166c6e6406 Mon Sep 17 00:00:00 2001
From: Jinjie Ruan <ruanjinjie(a)huawei.com>
Date: Wed, 11 Oct 2023 11:24:19 +0800
Subject: [PATCH] net: dsa: bcm_sf2: Fix possible memory leak in
bcm_sf2_mdio_register()
In bcm_sf2_mdio_register(), the class_find_device() will call get_device()
to increment reference count for priv->master_mii_bus->dev if
of_mdio_find_bus() succeeds. If mdiobus_alloc() or mdiobus_register()
fails, it will call get_device() twice without decrement reference count
for the device. And it is the same if bcm_sf2_mdio_register() succeeds but
fails in bcm_sf2_sw_probe(), or if bcm_sf2_sw_probe() succeeds. If the
reference count has not decremented to zero, the dev related resource will
not be freed.
So remove the get_device() in bcm_sf2_mdio_register(), and call
put_device() if mdiobus_alloc() or mdiobus_register() fails and in
bcm_sf2_mdio_unregister() to solve the issue.
And as Simon suggested, unwind from errors for bcm_sf2_mdio_register() and
just return 0 if it succeeds to make it cleaner.
Fixes: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Jinjie Ruan <ruanjinjie(a)huawei.com>
Suggested-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
Link: https://lore.kernel.org/r/20231011032419.2423290-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 72374b066f64..cd1f240c90f3 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -617,17 +617,16 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
dn = of_find_compatible_node(NULL, NULL, "brcm,unimac-mdio");
priv->master_mii_bus = of_mdio_find_bus(dn);
if (!priv->master_mii_bus) {
- of_node_put(dn);
- return -EPROBE_DEFER;
+ err = -EPROBE_DEFER;
+ goto err_of_node_put;
}
- get_device(&priv->master_mii_bus->dev);
priv->master_mii_dn = dn;
priv->slave_mii_bus = mdiobus_alloc();
if (!priv->slave_mii_bus) {
- of_node_put(dn);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto err_put_master_mii_bus_dev;
}
priv->slave_mii_bus->priv = priv;
@@ -684,11 +683,17 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
}
err = mdiobus_register(priv->slave_mii_bus);
- if (err && dn) {
- mdiobus_free(priv->slave_mii_bus);
- of_node_put(dn);
- }
+ if (err && dn)
+ goto err_free_slave_mii_bus;
+ return 0;
+
+err_free_slave_mii_bus:
+ mdiobus_free(priv->slave_mii_bus);
+err_put_master_mii_bus_dev:
+ put_device(&priv->master_mii_bus->dev);
+err_of_node_put:
+ of_node_put(dn);
return err;
}
@@ -696,6 +701,7 @@ static void bcm_sf2_mdio_unregister(struct bcm_sf2_priv *priv)
{
mdiobus_unregister(priv->slave_mii_bus);
mdiobus_free(priv->slave_mii_bus);
+ put_device(&priv->master_mii_bus->dev);
of_node_put(priv->master_mii_dn);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 61b40cefe51af005c72dbdcf975a3d166c6e6406
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102039-chance-unseen-916d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 61b40cefe51af005c72dbdcf975a3d166c6e6406 Mon Sep 17 00:00:00 2001
From: Jinjie Ruan <ruanjinjie(a)huawei.com>
Date: Wed, 11 Oct 2023 11:24:19 +0800
Subject: [PATCH] net: dsa: bcm_sf2: Fix possible memory leak in
bcm_sf2_mdio_register()
In bcm_sf2_mdio_register(), the class_find_device() will call get_device()
to increment reference count for priv->master_mii_bus->dev if
of_mdio_find_bus() succeeds. If mdiobus_alloc() or mdiobus_register()
fails, it will call get_device() twice without decrement reference count
for the device. And it is the same if bcm_sf2_mdio_register() succeeds but
fails in bcm_sf2_sw_probe(), or if bcm_sf2_sw_probe() succeeds. If the
reference count has not decremented to zero, the dev related resource will
not be freed.
So remove the get_device() in bcm_sf2_mdio_register(), and call
put_device() if mdiobus_alloc() or mdiobus_register() fails and in
bcm_sf2_mdio_unregister() to solve the issue.
And as Simon suggested, unwind from errors for bcm_sf2_mdio_register() and
just return 0 if it succeeds to make it cleaner.
Fixes: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Jinjie Ruan <ruanjinjie(a)huawei.com>
Suggested-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
Link: https://lore.kernel.org/r/20231011032419.2423290-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 72374b066f64..cd1f240c90f3 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -617,17 +617,16 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
dn = of_find_compatible_node(NULL, NULL, "brcm,unimac-mdio");
priv->master_mii_bus = of_mdio_find_bus(dn);
if (!priv->master_mii_bus) {
- of_node_put(dn);
- return -EPROBE_DEFER;
+ err = -EPROBE_DEFER;
+ goto err_of_node_put;
}
- get_device(&priv->master_mii_bus->dev);
priv->master_mii_dn = dn;
priv->slave_mii_bus = mdiobus_alloc();
if (!priv->slave_mii_bus) {
- of_node_put(dn);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto err_put_master_mii_bus_dev;
}
priv->slave_mii_bus->priv = priv;
@@ -684,11 +683,17 @@ static int bcm_sf2_mdio_register(struct dsa_switch *ds)
}
err = mdiobus_register(priv->slave_mii_bus);
- if (err && dn) {
- mdiobus_free(priv->slave_mii_bus);
- of_node_put(dn);
- }
+ if (err && dn)
+ goto err_free_slave_mii_bus;
+ return 0;
+
+err_free_slave_mii_bus:
+ mdiobus_free(priv->slave_mii_bus);
+err_put_master_mii_bus_dev:
+ put_device(&priv->master_mii_bus->dev);
+err_of_node_put:
+ of_node_put(dn);
return err;
}
@@ -696,6 +701,7 @@ static void bcm_sf2_mdio_unregister(struct bcm_sf2_priv *priv)
{
mdiobus_unregister(priv->slave_mii_bus);
mdiobus_free(priv->slave_mii_bus);
+ put_device(&priv->master_mii_bus->dev);
of_node_put(priv->master_mii_dn);
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 1c2709cfff1dedbb9591e989e2f001484208d914
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102017-pedometer-skier-c67d@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c2709cfff1dedbb9591e989e2f001484208d914 Mon Sep 17 00:00:00 2001
From: Neal Cardwell <ncardwell(a)google.com>
Date: Sun, 15 Oct 2023 13:47:00 -0400
Subject: [PATCH] tcp: fix excessive TLP and RACK timeouts from HZ rounding
We discovered from packet traces of slow loss recovery on kernels with
the default HZ=250 setting (and min_rtt < 1ms) that after reordering,
when receiving a SACKed sequence range, the RACK reordering timer was
firing after about 16ms rather than the desired value of roughly
min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer
calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels
with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the
exact same issue.
This commit fixes the TLP transmit timer and RACK reordering timer
floor calculation to more closely match the intended 2ms floor even on
kernels with HZ=250. It does this by adding in a new
TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies,
instead of the current approach of converting to jiffies and then
adding th TCP_TIMEOUT_MIN value of 2 jiffies.
Our testing has verified that on kernels with HZ=1000, as expected,
this does not produce significant changes in behavior, but on kernels
with the default HZ=250 the latency improvement can be large. For
example, our tests show that for HZ=250 kernels at low RTTs this fix
roughly halves the latency for the RACK reorder timer: instead of
mostly firing at 16ms it mostly fires at 8ms.
Suggested-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Fixes: bb4d991a28cc ("tcp: adjust tail loss probe timeout")
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Link: https://lore.kernel.org/r/20231015174700.2206872-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7b1a720691ae..4b03ca7cb8a5 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -141,6 +141,9 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
#define TCP_RTO_MAX ((unsigned)(120*HZ))
#define TCP_RTO_MIN ((unsigned)(HZ/5))
#define TCP_TIMEOUT_MIN (2U) /* Min timeout for TCP timers in jiffies */
+
+#define TCP_TIMEOUT_MIN_US (2*USEC_PER_MSEC) /* Min TCP timeout in microsecs */
+
#define TCP_TIMEOUT_INIT ((unsigned)(1*HZ)) /* RFC6298 2.1 initial RTO value */
#define TCP_TIMEOUT_FALLBACK ((unsigned)(3*HZ)) /* RFC 1122 initial RTO value, now
* used as a fallback RTO for the
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 9c8c42c280b7..bbd85672fda7 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2788,7 +2788,7 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
- u32 timeout, rto_delta_us;
+ u32 timeout, timeout_us, rto_delta_us;
int early_retrans;
/* Don't do any loss probe on a Fast Open connection before 3WHS
@@ -2812,11 +2812,12 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
* sample is available then probe after TCP_TIMEOUT_INIT.
*/
if (tp->srtt_us) {
- timeout = usecs_to_jiffies(tp->srtt_us >> 2);
+ timeout_us = tp->srtt_us >> 2;
if (tp->packets_out == 1)
- timeout += TCP_RTO_MIN;
+ timeout_us += tcp_rto_min_us(sk);
else
- timeout += TCP_TIMEOUT_MIN;
+ timeout_us += TCP_TIMEOUT_MIN_US;
+ timeout = usecs_to_jiffies(timeout_us);
} else {
timeout = TCP_TIMEOUT_INIT;
}
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index acf4869c5d3b..bba10110fbbc 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -104,7 +104,7 @@ bool tcp_rack_mark_lost(struct sock *sk)
tp->rack.advanced = 0;
tcp_rack_detect_loss(sk, &timeout);
if (timeout) {
- timeout = usecs_to_jiffies(timeout) + TCP_TIMEOUT_MIN;
+ timeout = usecs_to_jiffies(timeout + TCP_TIMEOUT_MIN_US);
inet_csk_reset_xmit_timer(sk, ICSK_TIME_REO_TIMEOUT,
timeout, inet_csk(sk)->icsk_rto);
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 700b2b439766e8aab8a7174991198497345bd411
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102023-rubble-eggshell-ae24@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 700b2b439766e8aab8a7174991198497345bd411 Mon Sep 17 00:00:00 2001
From: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org>
Date: Tue, 17 Oct 2023 08:49:45 +0900
Subject: [PATCH] fprobe: Fix to ensure the number of active retprobes is not
zero
The number of active retprobes can be zero but it is not acceptable,
so return EINVAL error if detected.
Link: https://lore.kernel.org/all/169750018550.186853.11198884812017796410.stgit@…
Reported-by: wuqiang.matt <wuqiang.matt(a)bytedance.com>
Closes: https://lore.kernel.org/all/20231016222103.cb9f426edc60220eabd8aa6a@kernel.…
Fixes: 5b0ab78998e3 ("fprobe: Add exit_handler support")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 3b21f4063258..881f90f0cbcf 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -189,7 +189,7 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
{
int i, size;
- if (num < 0)
+ if (num <= 0)
return -EINVAL;
if (!fp->exit_handler) {
@@ -202,8 +202,8 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
size = fp->nr_maxactive;
else
size = num * num_possible_cpus() * 2;
- if (size < 0)
- return -E2BIG;
+ if (size <= 0)
+ return -EINVAL;
fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
if (!fp->rethook)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 6d41d4fe28724db16ca1016df0713a07e0cc7448
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102006-detector-tweed-1a0c@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d41d4fe28724db16ca1016df0713a07e0cc7448 Mon Sep 17 00:00:00 2001
From: Dong Chenchen <dongchenchen2(a)huawei.com>
Date: Tue, 15 Aug 2023 22:18:34 +0800
Subject: [PATCH] net: xfrm: skip policies marked as dead while reinserting
policies
BUG: KASAN: slab-use-after-free in xfrm_policy_inexact_list_reinsert+0xb6/0x430
Read of size 1 at addr ffff8881051f3bf8 by task ip/668
CPU: 2 PID: 668 Comm: ip Not tainted 6.5.0-rc5-00182-g25aa0bebba72-dirty #64
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x72/0xa0
print_report+0xd0/0x620
kasan_report+0xb6/0xf0
xfrm_policy_inexact_list_reinsert+0xb6/0x430
xfrm_policy_inexact_insert_node.constprop.0+0x537/0x800
xfrm_policy_inexact_alloc_chain+0x23f/0x320
xfrm_policy_inexact_insert+0x6b/0x590
xfrm_policy_insert+0x3b1/0x480
xfrm_add_policy+0x23c/0x3c0
xfrm_user_rcv_msg+0x2d0/0x510
netlink_rcv_skb+0x10d/0x2d0
xfrm_netlink_rcv+0x49/0x60
netlink_unicast+0x3fe/0x540
netlink_sendmsg+0x528/0x970
sock_sendmsg+0x14a/0x160
____sys_sendmsg+0x4fc/0x580
___sys_sendmsg+0xef/0x160
__sys_sendmsg+0xf7/0x1b0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x73/0xdd
The root cause is:
cpu 0 cpu1
xfrm_dump_policy
xfrm_policy_walk
list_move_tail
xfrm_add_policy
... ...
xfrm_policy_inexact_list_reinsert
list_for_each_entry_reverse
if (!policy->bydst_reinsert)
//read non-existent policy
xfrm_dump_policy_done
xfrm_policy_walk_done
list_del(&walk->walk.all);
If dump_one_policy() returns err (triggered by netlink socket),
xfrm_policy_walk() will move walk initialized by socket to list
net->xfrm.policy_all. so this socket becomes visible in the global
policy list. The head *walk can be traversed when users add policies
with different prefixlen and trigger xfrm_policy node merge.
The issue can also be triggered by policy list traversal while rehashing
and flushing policies.
It can be fixed by skip such "policies" with walk.dead set to 1.
Fixes: 9cf545ebd591 ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Fixes: 12a169e7d8f4 ("ipsec: Put dumpers on the dump list")
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index d6b405782b63..113fb7e9cdaf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -851,7 +851,7 @@ static void xfrm_policy_inexact_list_reinsert(struct net *net,
struct hlist_node *newpos = NULL;
bool matches_s, matches_d;
- if (!policy->bydst_reinsert)
+ if (policy->walk.dead || !policy->bydst_reinsert)
continue;
WARN_ON_ONCE(policy->family != family);
@@ -1256,8 +1256,11 @@ static void xfrm_hash_rebuild(struct work_struct *work)
struct xfrm_pol_inexact_bin *bin;
u8 dbits, sbits;
+ if (policy->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(policy->index);
- if (policy->walk.dead || dir >= XFRM_POLICY_MAX)
+ if (dir >= XFRM_POLICY_MAX)
continue;
if ((dir & XFRM_POLICY_MASK) == XFRM_POLICY_OUT) {
@@ -1823,9 +1826,11 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->type != type)
continue;
@@ -1862,9 +1867,11 @@ int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->xdo.dev != dev)
continue;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6d41d4fe28724db16ca1016df0713a07e0cc7448
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102004-imbecile-uneasy-23a8@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d41d4fe28724db16ca1016df0713a07e0cc7448 Mon Sep 17 00:00:00 2001
From: Dong Chenchen <dongchenchen2(a)huawei.com>
Date: Tue, 15 Aug 2023 22:18:34 +0800
Subject: [PATCH] net: xfrm: skip policies marked as dead while reinserting
policies
BUG: KASAN: slab-use-after-free in xfrm_policy_inexact_list_reinsert+0xb6/0x430
Read of size 1 at addr ffff8881051f3bf8 by task ip/668
CPU: 2 PID: 668 Comm: ip Not tainted 6.5.0-rc5-00182-g25aa0bebba72-dirty #64
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x72/0xa0
print_report+0xd0/0x620
kasan_report+0xb6/0xf0
xfrm_policy_inexact_list_reinsert+0xb6/0x430
xfrm_policy_inexact_insert_node.constprop.0+0x537/0x800
xfrm_policy_inexact_alloc_chain+0x23f/0x320
xfrm_policy_inexact_insert+0x6b/0x590
xfrm_policy_insert+0x3b1/0x480
xfrm_add_policy+0x23c/0x3c0
xfrm_user_rcv_msg+0x2d0/0x510
netlink_rcv_skb+0x10d/0x2d0
xfrm_netlink_rcv+0x49/0x60
netlink_unicast+0x3fe/0x540
netlink_sendmsg+0x528/0x970
sock_sendmsg+0x14a/0x160
____sys_sendmsg+0x4fc/0x580
___sys_sendmsg+0xef/0x160
__sys_sendmsg+0xf7/0x1b0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x73/0xdd
The root cause is:
cpu 0 cpu1
xfrm_dump_policy
xfrm_policy_walk
list_move_tail
xfrm_add_policy
... ...
xfrm_policy_inexact_list_reinsert
list_for_each_entry_reverse
if (!policy->bydst_reinsert)
//read non-existent policy
xfrm_dump_policy_done
xfrm_policy_walk_done
list_del(&walk->walk.all);
If dump_one_policy() returns err (triggered by netlink socket),
xfrm_policy_walk() will move walk initialized by socket to list
net->xfrm.policy_all. so this socket becomes visible in the global
policy list. The head *walk can be traversed when users add policies
with different prefixlen and trigger xfrm_policy node merge.
The issue can also be triggered by policy list traversal while rehashing
and flushing policies.
It can be fixed by skip such "policies" with walk.dead set to 1.
Fixes: 9cf545ebd591 ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Fixes: 12a169e7d8f4 ("ipsec: Put dumpers on the dump list")
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index d6b405782b63..113fb7e9cdaf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -851,7 +851,7 @@ static void xfrm_policy_inexact_list_reinsert(struct net *net,
struct hlist_node *newpos = NULL;
bool matches_s, matches_d;
- if (!policy->bydst_reinsert)
+ if (policy->walk.dead || !policy->bydst_reinsert)
continue;
WARN_ON_ONCE(policy->family != family);
@@ -1256,8 +1256,11 @@ static void xfrm_hash_rebuild(struct work_struct *work)
struct xfrm_pol_inexact_bin *bin;
u8 dbits, sbits;
+ if (policy->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(policy->index);
- if (policy->walk.dead || dir >= XFRM_POLICY_MAX)
+ if (dir >= XFRM_POLICY_MAX)
continue;
if ((dir & XFRM_POLICY_MASK) == XFRM_POLICY_OUT) {
@@ -1823,9 +1826,11 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->type != type)
continue;
@@ -1862,9 +1867,11 @@ int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->xdo.dev != dev)
continue;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6d41d4fe28724db16ca1016df0713a07e0cc7448
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102003-shank-happiness-527c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d41d4fe28724db16ca1016df0713a07e0cc7448 Mon Sep 17 00:00:00 2001
From: Dong Chenchen <dongchenchen2(a)huawei.com>
Date: Tue, 15 Aug 2023 22:18:34 +0800
Subject: [PATCH] net: xfrm: skip policies marked as dead while reinserting
policies
BUG: KASAN: slab-use-after-free in xfrm_policy_inexact_list_reinsert+0xb6/0x430
Read of size 1 at addr ffff8881051f3bf8 by task ip/668
CPU: 2 PID: 668 Comm: ip Not tainted 6.5.0-rc5-00182-g25aa0bebba72-dirty #64
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x72/0xa0
print_report+0xd0/0x620
kasan_report+0xb6/0xf0
xfrm_policy_inexact_list_reinsert+0xb6/0x430
xfrm_policy_inexact_insert_node.constprop.0+0x537/0x800
xfrm_policy_inexact_alloc_chain+0x23f/0x320
xfrm_policy_inexact_insert+0x6b/0x590
xfrm_policy_insert+0x3b1/0x480
xfrm_add_policy+0x23c/0x3c0
xfrm_user_rcv_msg+0x2d0/0x510
netlink_rcv_skb+0x10d/0x2d0
xfrm_netlink_rcv+0x49/0x60
netlink_unicast+0x3fe/0x540
netlink_sendmsg+0x528/0x970
sock_sendmsg+0x14a/0x160
____sys_sendmsg+0x4fc/0x580
___sys_sendmsg+0xef/0x160
__sys_sendmsg+0xf7/0x1b0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x73/0xdd
The root cause is:
cpu 0 cpu1
xfrm_dump_policy
xfrm_policy_walk
list_move_tail
xfrm_add_policy
... ...
xfrm_policy_inexact_list_reinsert
list_for_each_entry_reverse
if (!policy->bydst_reinsert)
//read non-existent policy
xfrm_dump_policy_done
xfrm_policy_walk_done
list_del(&walk->walk.all);
If dump_one_policy() returns err (triggered by netlink socket),
xfrm_policy_walk() will move walk initialized by socket to list
net->xfrm.policy_all. so this socket becomes visible in the global
policy list. The head *walk can be traversed when users add policies
with different prefixlen and trigger xfrm_policy node merge.
The issue can also be triggered by policy list traversal while rehashing
and flushing policies.
It can be fixed by skip such "policies" with walk.dead set to 1.
Fixes: 9cf545ebd591 ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Fixes: 12a169e7d8f4 ("ipsec: Put dumpers on the dump list")
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index d6b405782b63..113fb7e9cdaf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -851,7 +851,7 @@ static void xfrm_policy_inexact_list_reinsert(struct net *net,
struct hlist_node *newpos = NULL;
bool matches_s, matches_d;
- if (!policy->bydst_reinsert)
+ if (policy->walk.dead || !policy->bydst_reinsert)
continue;
WARN_ON_ONCE(policy->family != family);
@@ -1256,8 +1256,11 @@ static void xfrm_hash_rebuild(struct work_struct *work)
struct xfrm_pol_inexact_bin *bin;
u8 dbits, sbits;
+ if (policy->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(policy->index);
- if (policy->walk.dead || dir >= XFRM_POLICY_MAX)
+ if (dir >= XFRM_POLICY_MAX)
continue;
if ((dir & XFRM_POLICY_MASK) == XFRM_POLICY_OUT) {
@@ -1823,9 +1826,11 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->type != type)
continue;
@@ -1862,9 +1867,11 @@ int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->xdo.dev != dev)
continue;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6d41d4fe28724db16ca1016df0713a07e0cc7448
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102002-overplant-unseated-c388@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d41d4fe28724db16ca1016df0713a07e0cc7448 Mon Sep 17 00:00:00 2001
From: Dong Chenchen <dongchenchen2(a)huawei.com>
Date: Tue, 15 Aug 2023 22:18:34 +0800
Subject: [PATCH] net: xfrm: skip policies marked as dead while reinserting
policies
BUG: KASAN: slab-use-after-free in xfrm_policy_inexact_list_reinsert+0xb6/0x430
Read of size 1 at addr ffff8881051f3bf8 by task ip/668
CPU: 2 PID: 668 Comm: ip Not tainted 6.5.0-rc5-00182-g25aa0bebba72-dirty #64
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x72/0xa0
print_report+0xd0/0x620
kasan_report+0xb6/0xf0
xfrm_policy_inexact_list_reinsert+0xb6/0x430
xfrm_policy_inexact_insert_node.constprop.0+0x537/0x800
xfrm_policy_inexact_alloc_chain+0x23f/0x320
xfrm_policy_inexact_insert+0x6b/0x590
xfrm_policy_insert+0x3b1/0x480
xfrm_add_policy+0x23c/0x3c0
xfrm_user_rcv_msg+0x2d0/0x510
netlink_rcv_skb+0x10d/0x2d0
xfrm_netlink_rcv+0x49/0x60
netlink_unicast+0x3fe/0x540
netlink_sendmsg+0x528/0x970
sock_sendmsg+0x14a/0x160
____sys_sendmsg+0x4fc/0x580
___sys_sendmsg+0xef/0x160
__sys_sendmsg+0xf7/0x1b0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x73/0xdd
The root cause is:
cpu 0 cpu1
xfrm_dump_policy
xfrm_policy_walk
list_move_tail
xfrm_add_policy
... ...
xfrm_policy_inexact_list_reinsert
list_for_each_entry_reverse
if (!policy->bydst_reinsert)
//read non-existent policy
xfrm_dump_policy_done
xfrm_policy_walk_done
list_del(&walk->walk.all);
If dump_one_policy() returns err (triggered by netlink socket),
xfrm_policy_walk() will move walk initialized by socket to list
net->xfrm.policy_all. so this socket becomes visible in the global
policy list. The head *walk can be traversed when users add policies
with different prefixlen and trigger xfrm_policy node merge.
The issue can also be triggered by policy list traversal while rehashing
and flushing policies.
It can be fixed by skip such "policies" with walk.dead set to 1.
Fixes: 9cf545ebd591 ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Fixes: 12a169e7d8f4 ("ipsec: Put dumpers on the dump list")
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index d6b405782b63..113fb7e9cdaf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -851,7 +851,7 @@ static void xfrm_policy_inexact_list_reinsert(struct net *net,
struct hlist_node *newpos = NULL;
bool matches_s, matches_d;
- if (!policy->bydst_reinsert)
+ if (policy->walk.dead || !policy->bydst_reinsert)
continue;
WARN_ON_ONCE(policy->family != family);
@@ -1256,8 +1256,11 @@ static void xfrm_hash_rebuild(struct work_struct *work)
struct xfrm_pol_inexact_bin *bin;
u8 dbits, sbits;
+ if (policy->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(policy->index);
- if (policy->walk.dead || dir >= XFRM_POLICY_MAX)
+ if (dir >= XFRM_POLICY_MAX)
continue;
if ((dir & XFRM_POLICY_MASK) == XFRM_POLICY_OUT) {
@@ -1823,9 +1826,11 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->type != type)
continue;
@@ -1862,9 +1867,11 @@ int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
again:
list_for_each_entry(pol, &net->xfrm.policy_all, walk.all) {
+ if (pol->walk.dead)
+ continue;
+
dir = xfrm_policy_id2dir(pol->index);
- if (pol->walk.dead ||
- dir >= XFRM_POLICY_MAX ||
+ if (dir >= XFRM_POLICY_MAX ||
pol->xdo.dev != dev)
continue;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102005-strained-decompose-2eae@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b Mon Sep 17 00:00:00 2001
From: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Date: Fri, 15 Sep 2023 19:20:41 +0800
Subject: [PATCH] xfrm6: fix inet6_dev refcount underflow problem
There are race conditions that may lead to inet6_dev refcount underflow
in xfrm6_dst_destroy() and rt6_uncached_list_flush_dev().
One of the refcount underflow bugs is shown below:
(cpu 1) | (cpu 2)
xfrm6_dst_destroy() |
... |
in6_dev_put() |
| rt6_uncached_list_flush_dev()
... | ...
| in6_dev_put()
rt6_uncached_list_del() | ...
... |
xfrm6_dst_destroy() calls rt6_uncached_list_del() after in6_dev_put(),
so rt6_uncached_list_flush_dev() has a chance to call in6_dev_put()
again for the same inet6_dev.
Fix it by moving in6_dev_put() after rt6_uncached_list_del() in
xfrm6_dst_destroy().
Fixes: 510c321b5571 ("xfrm: reuse uncached_list to track xdsts")
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 188224a76685..45d0f9a8b28c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -117,10 +117,10 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
- if (likely(xdst->u.rt6.rt6i_idev))
- in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
rt6_uncached_list_del(&xdst->u.rt6);
+ if (likely(xdst->u.rt6.rt6i_idev))
+ in6_dev_put(xdst->u.rt6.rt6i_idev);
xfrm_dst_destroy(xdst);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102004-collision-feminism-9a66@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b Mon Sep 17 00:00:00 2001
From: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Date: Fri, 15 Sep 2023 19:20:41 +0800
Subject: [PATCH] xfrm6: fix inet6_dev refcount underflow problem
There are race conditions that may lead to inet6_dev refcount underflow
in xfrm6_dst_destroy() and rt6_uncached_list_flush_dev().
One of the refcount underflow bugs is shown below:
(cpu 1) | (cpu 2)
xfrm6_dst_destroy() |
... |
in6_dev_put() |
| rt6_uncached_list_flush_dev()
... | ...
| in6_dev_put()
rt6_uncached_list_del() | ...
... |
xfrm6_dst_destroy() calls rt6_uncached_list_del() after in6_dev_put(),
so rt6_uncached_list_flush_dev() has a chance to call in6_dev_put()
again for the same inet6_dev.
Fix it by moving in6_dev_put() after rt6_uncached_list_del() in
xfrm6_dst_destroy().
Fixes: 510c321b5571 ("xfrm: reuse uncached_list to track xdsts")
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 188224a76685..45d0f9a8b28c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -117,10 +117,10 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
- if (likely(xdst->u.rt6.rt6i_idev))
- in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
rt6_uncached_list_del(&xdst->u.rt6);
+ if (likely(xdst->u.rt6.rt6i_idev))
+ in6_dev_put(xdst->u.rt6.rt6i_idev);
xfrm_dst_destroy(xdst);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102003-anagram-decency-29a0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b Mon Sep 17 00:00:00 2001
From: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Date: Fri, 15 Sep 2023 19:20:41 +0800
Subject: [PATCH] xfrm6: fix inet6_dev refcount underflow problem
There are race conditions that may lead to inet6_dev refcount underflow
in xfrm6_dst_destroy() and rt6_uncached_list_flush_dev().
One of the refcount underflow bugs is shown below:
(cpu 1) | (cpu 2)
xfrm6_dst_destroy() |
... |
in6_dev_put() |
| rt6_uncached_list_flush_dev()
... | ...
| in6_dev_put()
rt6_uncached_list_del() | ...
... |
xfrm6_dst_destroy() calls rt6_uncached_list_del() after in6_dev_put(),
so rt6_uncached_list_flush_dev() has a chance to call in6_dev_put()
again for the same inet6_dev.
Fix it by moving in6_dev_put() after rt6_uncached_list_del() in
xfrm6_dst_destroy().
Fixes: 510c321b5571 ("xfrm: reuse uncached_list to track xdsts")
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 188224a76685..45d0f9a8b28c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -117,10 +117,10 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
- if (likely(xdst->u.rt6.rt6i_idev))
- in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
rt6_uncached_list_del(&xdst->u.rt6);
+ if (likely(xdst->u.rt6.rt6i_idev))
+ in6_dev_put(xdst->u.rt6.rt6i_idev);
xfrm_dst_destroy(xdst);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102002-laurel-arise-0294@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b Mon Sep 17 00:00:00 2001
From: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Date: Fri, 15 Sep 2023 19:20:41 +0800
Subject: [PATCH] xfrm6: fix inet6_dev refcount underflow problem
There are race conditions that may lead to inet6_dev refcount underflow
in xfrm6_dst_destroy() and rt6_uncached_list_flush_dev().
One of the refcount underflow bugs is shown below:
(cpu 1) | (cpu 2)
xfrm6_dst_destroy() |
... |
in6_dev_put() |
| rt6_uncached_list_flush_dev()
... | ...
| in6_dev_put()
rt6_uncached_list_del() | ...
... |
xfrm6_dst_destroy() calls rt6_uncached_list_del() after in6_dev_put(),
so rt6_uncached_list_flush_dev() has a chance to call in6_dev_put()
again for the same inet6_dev.
Fix it by moving in6_dev_put() after rt6_uncached_list_del() in
xfrm6_dst_destroy().
Fixes: 510c321b5571 ("xfrm: reuse uncached_list to track xdsts")
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 188224a76685..45d0f9a8b28c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -117,10 +117,10 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
- if (likely(xdst->u.rt6.rt6i_idev))
- in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
rt6_uncached_list_del(&xdst->u.rt6);
+ if (likely(xdst->u.rt6.rt6i_idev))
+ in6_dev_put(xdst->u.rt6.rt6i_idev);
xfrm_dst_destroy(xdst);
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102001-barometer-press-265e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc9b364bb1d58d3dae270c7a931a8cc717dc2b3b Mon Sep 17 00:00:00 2001
From: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Date: Fri, 15 Sep 2023 19:20:41 +0800
Subject: [PATCH] xfrm6: fix inet6_dev refcount underflow problem
There are race conditions that may lead to inet6_dev refcount underflow
in xfrm6_dst_destroy() and rt6_uncached_list_flush_dev().
One of the refcount underflow bugs is shown below:
(cpu 1) | (cpu 2)
xfrm6_dst_destroy() |
... |
in6_dev_put() |
| rt6_uncached_list_flush_dev()
... | ...
| in6_dev_put()
rt6_uncached_list_del() | ...
... |
xfrm6_dst_destroy() calls rt6_uncached_list_del() after in6_dev_put(),
so rt6_uncached_list_flush_dev() has a chance to call in6_dev_put()
again for the same inet6_dev.
Fix it by moving in6_dev_put() after rt6_uncached_list_del() in
xfrm6_dst_destroy().
Fixes: 510c321b5571 ("xfrm: reuse uncached_list to track xdsts")
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 188224a76685..45d0f9a8b28c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -117,10 +117,10 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
- if (likely(xdst->u.rt6.rt6i_idev))
- in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
rt6_uncached_list_del(&xdst->u.rt6);
+ if (likely(xdst->u.rt6.rt6i_idev))
+ in6_dev_put(xdst->u.rt6.rt6i_idev);
xfrm_dst_destroy(xdst);
}
Some Jasperlake Chromebooks overwrite the system vendor DMI value to the
name of the OEM that manufactured the device. This breaks Chromebook
quirk detection as it expects the system vendor to be "Google".
Add another quirk detection entry that looks for "Google" in the BIOS
version.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Hasemeyer <markhas(a)chromium.org>
---
sound/hda/intel-dsp-config.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/sound/hda/intel-dsp-config.c b/sound/hda/intel-dsp-config.c
index 24a948baf1bc..756fa0aa69bb 100644
--- a/sound/hda/intel-dsp-config.c
+++ b/sound/hda/intel-dsp-config.c
@@ -336,6 +336,12 @@ static const struct config_entry config_table[] = {
DMI_MATCH(DMI_SYS_VENDOR, "Google"),
}
},
+ {
+ .ident = "Google firmware",
+ .matches = {
+ DMI_MATCH(DMI_BIOS_VERSION, "Google"),
+ }
+ },
{}
}
},
--
2.42.0.655.g421f12c284-goog
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 2f3389c73832ad90b63208c0fc281ad080114c7a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102029-display-provolone-fab2@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2f3389c73832ad90b63208c0fc281ad080114c7a Mon Sep 17 00:00:00 2001
From: Manish Chopra <manishc(a)marvell.com>
Date: Fri, 13 Oct 2023 18:48:12 +0530
Subject: [PATCH] qed: fix LL2 RX buffer allocation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Driver allocates the LL2 rx buffers from kmalloc()
area to construct the skb using slab_build_skb()
The required size allocation seems to have overlooked
for accounting both skb_shared_info size and device
placement padding bytes which results into the below
panic when doing skb_put() for a standard MTU sized frame.
skbuff: skb_over_panic: text:ffffffffc0b0225f len:1514 put:1514
head:ff3dabceaf39c000 data:ff3dabceaf39c042 tail:0x62c end:0x566
dev:<NULL>
…
skb_panic+0x48/0x4a
skb_put.cold+0x10/0x10
qed_ll2b_complete_rx_packet+0x14f/0x260 [qed]
qed_ll2_rxq_handle_completion.constprop.0+0x169/0x200 [qed]
qed_ll2_rxq_completion+0xba/0x320 [qed]
qed_int_sp_dpc+0x1a7/0x1e0 [qed]
This patch fixes this by accouting skb_shared_info and device
placement padding size bytes when allocating the buffers.
Cc: David S. Miller <davem(a)davemloft.net>
Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support")
Signed-off-by: Manish Chopra <manishc(a)marvell.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 717a0b3f89bd..ab5ef254a748 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -113,7 +113,10 @@ static void qed_ll2b_complete_tx_packet(void *cxt,
static int qed_ll2_alloc_buffer(struct qed_dev *cdev,
u8 **data, dma_addr_t *phys_addr)
{
- *data = kmalloc(cdev->ll2->rx_size, GFP_ATOMIC);
+ size_t size = cdev->ll2->rx_size + NET_SKB_PAD +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+ *data = kmalloc(size, GFP_ATOMIC);
if (!(*data)) {
DP_INFO(cdev, "Failed to allocate LL2 buffer data\n");
return -ENOMEM;
@@ -2589,7 +2592,7 @@ static int qed_ll2_start(struct qed_dev *cdev, struct qed_ll2_params *params)
INIT_LIST_HEAD(&cdev->ll2->list);
spin_lock_init(&cdev->ll2->lock);
- cdev->ll2->rx_size = NET_SKB_PAD + ETH_HLEN +
+ cdev->ll2->rx_size = PRM_DMA_PAD_BYTES_NUM + ETH_HLEN +
L1_CACHE_BYTES + params->mtu;
/* Allocate memory for LL2.
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 2f3389c73832ad90b63208c0fc281ad080114c7a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102030-strict-cake-e8e2@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2f3389c73832ad90b63208c0fc281ad080114c7a Mon Sep 17 00:00:00 2001
From: Manish Chopra <manishc(a)marvell.com>
Date: Fri, 13 Oct 2023 18:48:12 +0530
Subject: [PATCH] qed: fix LL2 RX buffer allocation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Driver allocates the LL2 rx buffers from kmalloc()
area to construct the skb using slab_build_skb()
The required size allocation seems to have overlooked
for accounting both skb_shared_info size and device
placement padding bytes which results into the below
panic when doing skb_put() for a standard MTU sized frame.
skbuff: skb_over_panic: text:ffffffffc0b0225f len:1514 put:1514
head:ff3dabceaf39c000 data:ff3dabceaf39c042 tail:0x62c end:0x566
dev:<NULL>
…
skb_panic+0x48/0x4a
skb_put.cold+0x10/0x10
qed_ll2b_complete_rx_packet+0x14f/0x260 [qed]
qed_ll2_rxq_handle_completion.constprop.0+0x169/0x200 [qed]
qed_ll2_rxq_completion+0xba/0x320 [qed]
qed_int_sp_dpc+0x1a7/0x1e0 [qed]
This patch fixes this by accouting skb_shared_info and device
placement padding size bytes when allocating the buffers.
Cc: David S. Miller <davem(a)davemloft.net>
Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support")
Signed-off-by: Manish Chopra <manishc(a)marvell.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 717a0b3f89bd..ab5ef254a748 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -113,7 +113,10 @@ static void qed_ll2b_complete_tx_packet(void *cxt,
static int qed_ll2_alloc_buffer(struct qed_dev *cdev,
u8 **data, dma_addr_t *phys_addr)
{
- *data = kmalloc(cdev->ll2->rx_size, GFP_ATOMIC);
+ size_t size = cdev->ll2->rx_size + NET_SKB_PAD +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+ *data = kmalloc(size, GFP_ATOMIC);
if (!(*data)) {
DP_INFO(cdev, "Failed to allocate LL2 buffer data\n");
return -ENOMEM;
@@ -2589,7 +2592,7 @@ static int qed_ll2_start(struct qed_dev *cdev, struct qed_ll2_params *params)
INIT_LIST_HEAD(&cdev->ll2->list);
spin_lock_init(&cdev->ll2->lock);
- cdev->ll2->rx_size = NET_SKB_PAD + ETH_HLEN +
+ cdev->ll2->rx_size = PRM_DMA_PAD_BYTES_NUM + ETH_HLEN +
L1_CACHE_BYTES + params->mtu;
/* Allocate memory for LL2.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3ebebb2c1eca92a15107b2d7aeff34196fd9e217
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102002-maximum-tray-450b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3ebebb2c1eca92a15107b2d7aeff34196fd9e217 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Tue, 3 Oct 2023 17:55:56 +0200
Subject: [PATCH] ASoC: codecs: wcd938x: fix runtime PM imbalance on remove
Make sure to balance the runtime PM operations, including the disable
count, on driver unbind.
Fixes: 16572522aece ("ASoC: codecs: wcd938x-sdw: add SoundWire driver")
Cc: stable(a)vger.kernel.org # 5.14
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Link: https://lore.kernel.org/r/20231003155558.27079-6-johan+linaro@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
index 679c627f7eaa..d27b919c63b4 100644
--- a/sound/soc/codecs/wcd938x.c
+++ b/sound/soc/codecs/wcd938x.c
@@ -3620,9 +3620,15 @@ static int wcd938x_probe(struct platform_device *pdev)
static void wcd938x_remove(struct platform_device *pdev)
{
- struct wcd938x_priv *wcd938x = dev_get_drvdata(&pdev->dev);
+ struct device *dev = &pdev->dev;
+ struct wcd938x_priv *wcd938x = dev_get_drvdata(dev);
+
+ component_master_del(dev, &wcd938x_comp_ops);
+
+ pm_runtime_disable(dev);
+ pm_runtime_set_suspended(dev);
+ pm_runtime_dont_use_autosuspend(dev);
- component_master_del(&pdev->dev, &wcd938x_comp_ops);
regulator_bulk_disable(WCD938X_MAX_SUPPLY, wcd938x->supplies);
regulator_bulk_free(WCD938X_MAX_SUPPLY, wcd938x->supplies);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3ebebb2c1eca92a15107b2d7aeff34196fd9e217
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102002-womanhood-mushy-54a2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3ebebb2c1eca92a15107b2d7aeff34196fd9e217 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Tue, 3 Oct 2023 17:55:56 +0200
Subject: [PATCH] ASoC: codecs: wcd938x: fix runtime PM imbalance on remove
Make sure to balance the runtime PM operations, including the disable
count, on driver unbind.
Fixes: 16572522aece ("ASoC: codecs: wcd938x-sdw: add SoundWire driver")
Cc: stable(a)vger.kernel.org # 5.14
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Link: https://lore.kernel.org/r/20231003155558.27079-6-johan+linaro@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
index 679c627f7eaa..d27b919c63b4 100644
--- a/sound/soc/codecs/wcd938x.c
+++ b/sound/soc/codecs/wcd938x.c
@@ -3620,9 +3620,15 @@ static int wcd938x_probe(struct platform_device *pdev)
static void wcd938x_remove(struct platform_device *pdev)
{
- struct wcd938x_priv *wcd938x = dev_get_drvdata(&pdev->dev);
+ struct device *dev = &pdev->dev;
+ struct wcd938x_priv *wcd938x = dev_get_drvdata(dev);
+
+ component_master_del(dev, &wcd938x_comp_ops);
+
+ pm_runtime_disable(dev);
+ pm_runtime_set_suspended(dev);
+ pm_runtime_dont_use_autosuspend(dev);
- component_master_del(&pdev->dev, &wcd938x_comp_ops);
regulator_bulk_disable(WCD938X_MAX_SUPPLY, wcd938x->supplies);
regulator_bulk_free(WCD938X_MAX_SUPPLY, wcd938x->supplies);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x dcc583c225e659d5da34b4ad83914fd6b51e3dbf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102025-copious-thud-be0f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcc583c225e659d5da34b4ad83914fd6b51e3dbf Mon Sep 17 00:00:00 2001
From: Chen-Yu Tsai <wenst(a)chromium.org>
Date: Wed, 4 Oct 2023 16:32:24 +0800
Subject: [PATCH] drm/mediatek: Correctly free sg_table in gem prime vmap
The MediaTek DRM driver implements GEM PRIME vmap by fetching the
sg_table for the object, iterating through the pages, and then
vmapping them. In essence, unlike the GEM DMA helpers which vmap
when the object is first created or imported, the MediaTek version
does it on request.
Unfortunately, the code never correctly frees the sg_table contents.
This results in a kernel memory leak. On a Hayato device with a text
console on the internal display, this results in the system running
out of memory in a few days from all the console screen cursor updates.
Add sg_free_table() to correctly free the contents of the sg_table. This
was missing despite explicitly required by mtk_gem_prime_get_sg_table().
Also move the "out" shortcut label to after the kfree() call for the
sg_table. Having sg_free_table() together with kfree() makes more sense.
The shortcut is only used when the object already has a kernel address,
in which case the pointer is NULL and kfree() does nothing. Hence this
change causes no functional change.
Fixes: 3df64d7b0a4f ("drm/mediatek: Implement gem prime vmap/vunmap function")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: CK Hu <ck.hu(a)mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20231004083226.1940055…
Signed-off-by: Chun-Kuang Hu <chunkuang.hu(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 9f364df52478..0e0a41b2f57f 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -239,6 +239,7 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
npages = obj->size >> PAGE_SHIFT;
mtk_gem->pages = kcalloc(npages, sizeof(*mtk_gem->pages), GFP_KERNEL);
if (!mtk_gem->pages) {
+ sg_free_table(sgt);
kfree(sgt);
return -ENOMEM;
}
@@ -248,12 +249,15 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
pgprot_writecombine(PAGE_KERNEL));
if (!mtk_gem->kvaddr) {
+ sg_free_table(sgt);
kfree(sgt);
kfree(mtk_gem->pages);
return -ENOMEM;
}
-out:
+ sg_free_table(sgt);
kfree(sgt);
+
+out:
iosys_map_set_vaddr(map, mtk_gem->kvaddr);
return 0;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x dcc583c225e659d5da34b4ad83914fd6b51e3dbf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102024-bagginess-ultra-3e5b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcc583c225e659d5da34b4ad83914fd6b51e3dbf Mon Sep 17 00:00:00 2001
From: Chen-Yu Tsai <wenst(a)chromium.org>
Date: Wed, 4 Oct 2023 16:32:24 +0800
Subject: [PATCH] drm/mediatek: Correctly free sg_table in gem prime vmap
The MediaTek DRM driver implements GEM PRIME vmap by fetching the
sg_table for the object, iterating through the pages, and then
vmapping them. In essence, unlike the GEM DMA helpers which vmap
when the object is first created or imported, the MediaTek version
does it on request.
Unfortunately, the code never correctly frees the sg_table contents.
This results in a kernel memory leak. On a Hayato device with a text
console on the internal display, this results in the system running
out of memory in a few days from all the console screen cursor updates.
Add sg_free_table() to correctly free the contents of the sg_table. This
was missing despite explicitly required by mtk_gem_prime_get_sg_table().
Also move the "out" shortcut label to after the kfree() call for the
sg_table. Having sg_free_table() together with kfree() makes more sense.
The shortcut is only used when the object already has a kernel address,
in which case the pointer is NULL and kfree() does nothing. Hence this
change causes no functional change.
Fixes: 3df64d7b0a4f ("drm/mediatek: Implement gem prime vmap/vunmap function")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: CK Hu <ck.hu(a)mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20231004083226.1940055…
Signed-off-by: Chun-Kuang Hu <chunkuang.hu(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 9f364df52478..0e0a41b2f57f 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -239,6 +239,7 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
npages = obj->size >> PAGE_SHIFT;
mtk_gem->pages = kcalloc(npages, sizeof(*mtk_gem->pages), GFP_KERNEL);
if (!mtk_gem->pages) {
+ sg_free_table(sgt);
kfree(sgt);
return -ENOMEM;
}
@@ -248,12 +249,15 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
pgprot_writecombine(PAGE_KERNEL));
if (!mtk_gem->kvaddr) {
+ sg_free_table(sgt);
kfree(sgt);
kfree(mtk_gem->pages);
return -ENOMEM;
}
-out:
+ sg_free_table(sgt);
kfree(sgt);
+
+out:
iosys_map_set_vaddr(map, mtk_gem->kvaddr);
return 0;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x dcc583c225e659d5da34b4ad83914fd6b51e3dbf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102023-praising-mumbo-a93b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dcc583c225e659d5da34b4ad83914fd6b51e3dbf Mon Sep 17 00:00:00 2001
From: Chen-Yu Tsai <wenst(a)chromium.org>
Date: Wed, 4 Oct 2023 16:32:24 +0800
Subject: [PATCH] drm/mediatek: Correctly free sg_table in gem prime vmap
The MediaTek DRM driver implements GEM PRIME vmap by fetching the
sg_table for the object, iterating through the pages, and then
vmapping them. In essence, unlike the GEM DMA helpers which vmap
when the object is first created or imported, the MediaTek version
does it on request.
Unfortunately, the code never correctly frees the sg_table contents.
This results in a kernel memory leak. On a Hayato device with a text
console on the internal display, this results in the system running
out of memory in a few days from all the console screen cursor updates.
Add sg_free_table() to correctly free the contents of the sg_table. This
was missing despite explicitly required by mtk_gem_prime_get_sg_table().
Also move the "out" shortcut label to after the kfree() call for the
sg_table. Having sg_free_table() together with kfree() makes more sense.
The shortcut is only used when the object already has a kernel address,
in which case the pointer is NULL and kfree() does nothing. Hence this
change causes no functional change.
Fixes: 3df64d7b0a4f ("drm/mediatek: Implement gem prime vmap/vunmap function")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: CK Hu <ck.hu(a)mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20231004083226.1940055…
Signed-off-by: Chun-Kuang Hu <chunkuang.hu(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 9f364df52478..0e0a41b2f57f 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -239,6 +239,7 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
npages = obj->size >> PAGE_SHIFT;
mtk_gem->pages = kcalloc(npages, sizeof(*mtk_gem->pages), GFP_KERNEL);
if (!mtk_gem->pages) {
+ sg_free_table(sgt);
kfree(sgt);
return -ENOMEM;
}
@@ -248,12 +249,15 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
pgprot_writecombine(PAGE_KERNEL));
if (!mtk_gem->kvaddr) {
+ sg_free_table(sgt);
kfree(sgt);
kfree(mtk_gem->pages);
return -ENOMEM;
}
-out:
+ sg_free_table(sgt);
kfree(sgt);
+
+out:
iosys_map_set_vaddr(map, mtk_gem->kvaddr);
return 0;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 72377ab2d671befd6390a1d5677f5cca61235b65
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102002-grooving-carnivore-aa31@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 72377ab2d671befd6390a1d5677f5cca61235b65 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Wed, 18 Oct 2023 11:23:54 -0700
Subject: [PATCH] mptcp: more conservative check for zero probes
Christoph reported that the MPTCP protocol can find the subflow-level
write queue unexpectedly not empty while crafting a zero-window probe,
hitting a warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 188 at net/mptcp/protocol.c:1312 mptcp_sendmsg_frag+0xc06/0xe70
Modules linked in:
CPU: 0 PID: 188 Comm: kworker/0:2 Not tainted 6.6.0-rc2-g1176aa719d7a #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:mptcp_sendmsg_frag+0xc06/0xe70 net/mptcp/protocol.c:1312
RAX: 47d0530de347ff6a RBX: 47d0530de347ff6b RCX: ffff8881015d3c00
RDX: ffff8881015d3c00 RSI: 47d0530de347ff6b RDI: 47d0530de347ff6b
RBP: 47d0530de347ff6b R08: ffffffff8243c6a8 R09: ffffffff82042d9c
R10: 0000000000000002 R11: ffffffff82056850 R12: ffff88812a13d580
R13: 0000000000000001 R14: ffff88812b375e50 R15: ffff88812bbf3200
FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000695118 CR3: 0000000115dfc001 CR4: 0000000000170ef0
Call Trace:
<TASK>
__subflow_push_pending+0xa4/0x420 net/mptcp/protocol.c:1545
__mptcp_push_pending+0x128/0x3b0 net/mptcp/protocol.c:1614
mptcp_release_cb+0x218/0x5b0 net/mptcp/protocol.c:3391
release_sock+0xf6/0x100 net/core/sock.c:3521
mptcp_worker+0x6e8/0x8f0 net/mptcp/protocol.c:2746
process_scheduled_works+0x341/0x690 kernel/workqueue.c:2630
worker_thread+0x3a7/0x610 kernel/workqueue.c:2784
kthread+0x143/0x180 kernel/kthread.c:388
ret_from_fork+0x4d/0x60 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:304
</TASK>
The root cause of the issue is that expectations are wrong: e.g. due
to MPTCP-level re-injection we can hit the critical condition.
Explicitly avoid the zero-window probe when the subflow write queue
is not empty and drop the related warnings.
Reported-by: Christoph Paasch <cpaasch(a)apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/444
Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <martineau(a)kernel.org>
Link: https://lore.kernel.org/r/20231018-send-net-20231018-v1-3-17ecb002e41d@kern…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index d1902373c974..4e30e5ba3795 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1298,7 +1298,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
if (copy == 0) {
u64 snd_una = READ_ONCE(msk->snd_una);
- if (snd_una != msk->snd_nxt) {
+ if (snd_una != msk->snd_nxt || tcp_write_queue_tail(ssk)) {
tcp_remove_empty_skb(ssk);
return 0;
}
@@ -1306,11 +1306,6 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
zero_window_probe = true;
data_seq = snd_una - 1;
copy = 1;
-
- /* all mptcp-level data is acked, no skbs should be present into the
- * ssk write queue
- */
- WARN_ON_ONCE(reuse_skb);
}
copy = min_t(size_t, copy, info->limit - info->sent);
@@ -1339,7 +1334,6 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
if (reuse_skb) {
TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
mpext->data_len += copy;
- WARN_ON_ONCE(zero_window_probe);
goto out;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b29a2acd36dd7a33c63f260df738fb96baa3d4f8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102044-ice-badass-92b5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b29a2acd36dd7a33c63f260df738fb96baa3d4f8 Mon Sep 17 00:00:00 2001
From: Roman Kagan <rkagan(a)amazon.de>
Date: Thu, 4 May 2023 14:00:42 +0200
Subject: [PATCH] KVM: x86/pmu: Truncate counter value to allowed width on
write
Performance counters are defined to have width less than 64 bits. The
vPMU code maintains the counters in u64 variables but assumes the value
to fit within the defined width. However, for Intel non-full-width
counters (MSR_IA32_PERFCTRx) the value receieved from the guest is
truncated to 32 bits and then sign-extended to full 64 bits. If a
negative value is set, it's sign-extended to 64 bits, but then in
kvm_pmu_incr_counter() it's incremented, truncated, and compared to the
previous value for overflow detection.
That previous value is not truncated, so it always evaluates bigger than
the truncated new one, and a PMI is injected. If the PMI handler writes
a negative counter value itself, the vCPU never quits the PMI loop.
Turns out that Linux PMI handler actually does write the counter with
the value just read with RDPMC, so when no full-width support is exposed
via MSR_IA32_PERF_CAPABILITIES, and the guest initializes the counter to
a negative value, it locks up.
This has been observed in the field, for example, when the guest configures
atop to use perfevents and runs two instances of it simultaneously.
To address the problem, maintain the invariant that the counter value
always fits in the defined bit width, by truncating the received value
in the respective set_msr methods. For better readability, factor the
out into a helper function, pmc_write_counter(), shared by vmx and svm
parts.
Fixes: 9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Roman Kagan <rkagan(a)amazon.de>
Link: https://lore.kernel.org/all/20230504120042.785651-1-rkagan@amazon.de
Tested-by: Like Xu <likexu(a)tencent.com>
[sean: tweak changelog, s/set/write in the helper]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7d9ba301c090..1d64113de488 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -74,6 +74,12 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
return counter & pmc_bitmask(pmc);
}
+static inline void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
+{
+ pmc->counter += val - pmc_read_counter(pmc);
+ pmc->counter &= pmc_bitmask(pmc);
+}
+
static inline void pmc_release_perf_event(struct kvm_pmc *pmc)
{
if (pmc->perf_event) {
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index cef5a3d0abd0..373ff6a6687b 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -160,7 +160,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
/* MSR_PERFCTRn */
pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
if (pmc) {
- pmc->counter += data - pmc_read_counter(pmc);
+ pmc_write_counter(pmc, data);
pmc_update_sample_period(pmc);
return 0;
}
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f2efa0bf7ae8..820d3e1f6b4f 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -436,11 +436,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (!msr_info->host_initiated &&
!(msr & MSR_PMC_FULL_WIDTH_BIT))
data = (s64)(s32)data;
- pmc->counter += data - pmc_read_counter(pmc);
+ pmc_write_counter(pmc, data);
pmc_update_sample_period(pmc);
break;
} else if ((pmc = get_fixed_pmc(pmu, msr))) {
- pmc->counter += data - pmc_read_counter(pmc);
+ pmc_write_counter(pmc, data);
pmc_update_sample_period(pmc);
break;
} else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
When calculating the hotness threshold for lru_prio scheme of
DAMON_LRU_SORT, the module divides some values by the maximum
nr_accesses. However, due to the type of the related variables, simple
division-based calculation of the divisor can return zero. As a result,
divide-by-zero is possible. Fix it by using damon_max_nr_accesses(),
which handles the case.
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Cc: <stable(a)vger.kernel.org> # 6.0.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/lru_sort.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 3ecdcc029443..f2e5f9431892 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -195,9 +195,7 @@ static int damon_lru_sort_apply_parameters(void)
if (err)
return err;
- /* aggr_interval / sample_interval is the maximum nr_accesses */
- hot_thres = damon_lru_sort_mon_attrs.aggr_interval /
- damon_lru_sort_mon_attrs.sample_interval *
+ hot_thres = damon_max_nr_accesses(&damon_lru_sort_mon_attrs) *
hot_thres_access_freq / 1000;
scheme = damon_lru_sort_new_hot_scheme(hot_thres);
if (!scheme)
--
2.34.1
When calculating the hotness of each region for the under-quota regions
prioritization, DAMON divides some values by the maximum nr_accesses.
However, due to the type of the related variables, simple division-based
calculation of the divisor can return zero. As a result, divide-by-zero
is possible. Fix it by using damon_max_nr_accesses(), which handles the
case.
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/ops-common.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index ac1c3fa80f98..d25d99cb5f2b 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -73,7 +73,6 @@ void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr
int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
struct damos *s)
{
- unsigned int max_nr_accesses;
int freq_subscore;
unsigned int age_in_sec;
int age_in_log, age_subscore;
@@ -81,8 +80,8 @@ int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
unsigned int age_weight = s->quota.weight_age;
int hotness;
- max_nr_accesses = c->attrs.aggr_interval / c->attrs.sample_interval;
- freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses;
+ freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE /
+ damon_max_nr_accesses(&c->attrs);
age_in_sec = (unsigned long)r->age * c->attrs.aggr_interval / 1000000;
for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
--
2.34.1
When monitoring attributes are changed, DAMON updates access rate of the
monitoring results accordingly. For that, it divides some values by the
maximum nr_accesses. However, due to the type of the related variables,
simple division-based calculation of the divisor can return zero. As a
result, divide-by-zero is possible. Fix it by using
damon_max_nr_accesses(), which handles the case.
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes")
Cc: <stable(a)vger.kernel.org> # 6.3.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/core.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 9f4f7c378cf3..e194c8075235 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -500,20 +500,14 @@ static unsigned int damon_age_for_new_attrs(unsigned int age,
static unsigned int damon_accesses_bp_to_nr_accesses(
unsigned int accesses_bp, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return accesses_bp * max_nr_accesses / 10000;
+ return accesses_bp * damon_max_nr_accesses(attrs) / 10000;
}
/* convert nr_accesses to access ratio in bp (per 10,000) */
static unsigned int damon_nr_accesses_to_accesses_bp(
unsigned int nr_accesses, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return nr_accesses * 10000 / max_nr_accesses;
+ return nr_accesses * 10000 / damon_max_nr_accesses(attrs);
}
static unsigned int damon_nr_accesses_for_new_attrs(unsigned int nr_accesses,
--
2.34.1
The maximum nr_accesses of given DAMON context can be calculated by
dividing the aggregation interval by the sampling interval. Some logics
in DAMON uses the maximum nr_accesses as a divisor. Hence, the value
shouldn't be zero. Such case is avoided since DAMON avoids setting the
agregation interval as samller than the sampling interval. However,
since nr_accesses is unsigned int while the intervals are unsigned long,
the maximum nr_accesses could be zero while casting. Implement a
function that handles the corner case.
Note that this commit is not fixing the real issue since this is only
introducing the safe function that will replaces the problematic
divisions. The replacements will be made by followup commits, to make
backporting on stable series easier.
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
include/linux/damon.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 27b995c22497..ab2f17d9926b 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -681,6 +681,13 @@ static inline bool damon_target_has_pid(const struct damon_ctx *ctx)
return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR;
}
+static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs)
+{
+ /* {aggr,sample}_interval are unsigned long, hence could overflow */
+ return min(attrs->aggr_interval / attrs->sample_interval,
+ (unsigned long)UINT_MAX);
+}
+
int damon_start(struct damon_ctx **ctxs, int nr_ctxs, bool exclusive);
int damon_stop(struct damon_ctx **ctxs, int nr_ctxs);
--
2.34.1
The patch titled
Subject: mm/damon/lru_sort: avoid divide-by-zero in hot threshold calculation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/lru_sort: avoid divide-by-zero in hot threshold calculation
Date: Thu, 19 Oct 2023 19:49:23 +0000
When calculating the hotness threshold for lru_prio scheme of
DAMON_LRU_SORT, the module divides some values by the maximum nr_accesses.
However, due to the type of the related variables, simple division-based
calculation of the divisor can return zero. As a result, divide-by-zero
is possible. Fix it by using damon_max_nr_accesses(), which handles the
case.
Link: https://lkml.kernel.org/r/20231019194924.100347-5-sj@kernel.org
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation
+++ a/mm/damon/lru_sort.c
@@ -193,9 +193,7 @@ static int damon_lru_sort_apply_paramete
if (err)
return err;
- /* aggr_interval / sample_interval is the maximum nr_accesses */
- hot_thres = damon_lru_sort_mon_attrs.aggr_interval /
- damon_lru_sort_mon_attrs.sample_interval *
+ hot_thres = damon_max_nr_accesses(&damon_lru_sort_mon_attrs) *
hot_thres_access_freq / 1000;
scheme = damon_lru_sort_new_hot_scheme(hot_thres);
if (!scheme)
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation.patch
mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update.patch
mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation.patch
mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation.patch
The patch titled
Subject: mm/damon/ops-common: avoid divide-by-zero during region hotness calculation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/ops-common: avoid divide-by-zero during region hotness calculation
Date: Thu, 19 Oct 2023 19:49:22 +0000
When calculating the hotness of each region for the under-quota regions
prioritization, DAMON divides some values by the maximum nr_accesses.
However, due to the type of the related variables, simple division-based
calculation of the divisor can return zero. As a result, divide-by-zero
is possible. Fix it by using damon_max_nr_accesses(), which handles the
case.
Link: https://lkml.kernel.org/r/20231019194924.100347-4-sj@kernel.org
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.16+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/ops-common.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/damon/ops-common.c~mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation
+++ a/mm/damon/ops-common.c
@@ -73,7 +73,6 @@ void damon_pmdp_mkold(pmd_t *pmd, struct
int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
struct damos *s)
{
- unsigned int max_nr_accesses;
int freq_subscore;
unsigned int age_in_sec;
int age_in_log, age_subscore;
@@ -81,8 +80,8 @@ int damon_hot_score(struct damon_ctx *c,
unsigned int age_weight = s->quota.weight_age;
int hotness;
- max_nr_accesses = c->attrs.aggr_interval / c->attrs.sample_interval;
- freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses;
+ freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE /
+ damon_max_nr_accesses(&c->attrs);
age_in_sec = (unsigned long)r->age * c->attrs.aggr_interval / 1000000;
for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation.patch
mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update.patch
mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation.patch
mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation.patch
The patch titled
Subject: mm/damon/core: avoid divide-by-zero during monitoring results update
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/core: avoid divide-by-zero during monitoring results update
Date: Thu, 19 Oct 2023 19:49:21 +0000
When monitoring attributes are changed, DAMON updates access rate of the
monitoring results accordingly. For that, it divides some values by the
maximum nr_accesses. However, due to the type of the related variables,
simple division-based calculation of the divisor can return zero. As a
result, divide-by-zero is possible. Fix it by using
damon_max_nr_accesses(), which handles the case.
Link: https://lkml.kernel.org/r/20231019194924.100347-3-sj@kernel.org
Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.3+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
--- a/mm/damon/core.c~mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update
+++ a/mm/damon/core.c
@@ -476,20 +476,14 @@ static unsigned int damon_age_for_new_at
static unsigned int damon_accesses_bp_to_nr_accesses(
unsigned int accesses_bp, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return accesses_bp * max_nr_accesses / 10000;
+ return accesses_bp * damon_max_nr_accesses(attrs) / 10000;
}
/* convert nr_accesses to access ratio in bp (per 10,000) */
static unsigned int damon_nr_accesses_to_accesses_bp(
unsigned int nr_accesses, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return nr_accesses * 10000 / max_nr_accesses;
+ return nr_accesses * 10000 / damon_max_nr_accesses(attrs);
}
static unsigned int damon_nr_accesses_for_new_attrs(unsigned int nr_accesses,
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation.patch
mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update.patch
mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation.patch
mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation.patch
The patch titled
Subject: mm/damon: implement a function for max nr_accesses safe calculation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon: implement a function for max nr_accesses safe calculation
Date: Thu, 19 Oct 2023 19:49:20 +0000
Patch series "avoid divide-by-zero due to max_nr_accesses overflow".
The maximum nr_accesses of given DAMON context can be calculated by
dividing the aggregation interval by the sampling interval. Some logics
in DAMON uses the maximum nr_accesses as a divisor. Hence, the value
shouldn't be zero. Such case is avoided since DAMON avoids setting the
agregation interval as samller than the sampling interval. However, since
nr_accesses is unsigned int while the intervals are unsigned long, the
maximum nr_accesses could be zero while casting.
Avoid the divide-by-zero by implementing a function that handles the
corner case (first patch), and replaces the vulnerable direct max
nr_accesses calculations (remaining patches).
Note that the patches for the replacements are divided for broken commits,
to make backporting on required tres easier. Especially, the last patch
is for a patch that not yet merged into the mainline but in mm tree.
This patch (of 4):
The maximum nr_accesses of given DAMON context can be calculated by
dividing the aggregation interval by the sampling interval. Some logics
in DAMON uses the maximum nr_accesses as a divisor. Hence, the value
shouldn't be zero. Such case is avoided since DAMON avoids setting the
agregation interval as samller than the sampling interval. However, since
nr_accesses is unsigned int while the intervals are unsigned long, the
maximum nr_accesses could be zero while casting. Implement a function
that handles the corner case.
Note that this commit is not fixing the real issue since this is only
introducing the safe function that will replaces the problematic
divisions. The replacements will be made by followup commits, to make
backporting on stable series easier.
Link: https://lkml.kernel.org/r/20231019194924.100347-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231019194924.100347-2-sj@kernel.org
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.16+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/damon.h | 7 +++++++
1 file changed, 7 insertions(+)
--- a/include/linux/damon.h~mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation
+++ a/include/linux/damon.h
@@ -642,6 +642,13 @@ static inline bool damon_target_has_pid(
return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR;
}
+static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs)
+{
+ /* {aggr,sample}_interval are unsigned long, hence could overflow */
+ return min(attrs->aggr_interval / attrs->sample_interval,
+ (unsigned long)UINT_MAX);
+}
+
int damon_start(struct damon_ctx **ctxs, int nr_ctxs, bool exclusive);
int damon_stop(struct damon_ctx **ctxs, int nr_ctxs);
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-implement-a-function-for-max-nr_accesses-safe-calculation.patch
mm-damon-core-avoid-divide-by-zero-during-monitoring-results-update.patch
mm-damon-ops-common-avoid-divide-by-zero-during-region-hotness-calculation.patch
mm-damon-lru_sort-avoid-divide-by-zero-in-hot-threshold-calculation.patch
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c68681ae46eaaa1640b52fe366d21a93b2185df5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102058-bullish-chess-e399@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c68681ae46eaaa1640b52fe366d21a93b2185df5 Mon Sep 17 00:00:00 2001
From: Albert Huang <huangjie.albert(a)bytedance.com>
Date: Wed, 11 Oct 2023 15:48:51 +0800
Subject: [PATCH] net/smc: fix smc clc failed issue when netdevice not in
init_net
If the netdevice is within a container and communicates externally
through network technologies such as VxLAN, we won't be able to find
routing information in the init_net namespace. To address this issue,
we need to add a struct net parameter to the smc_ib_find_route function.
This allow us to locate the routing information within the corresponding
net namespace, ensuring the correct completion of the SMC CLC interaction.
Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment")
Signed-off-by: Albert Huang <huangjie.albert(a)bytedance.com>
Reviewed-by: Dust Li <dust.li(a)linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20231011074851.95280-1-huangjie.albert@bytedance.…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bacdd971615e..7a874da90c7f 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1201,6 +1201,7 @@ static int smc_connect_rdma_v2_prepare(struct smc_sock *smc,
(struct smc_clc_msg_accept_confirm_v2 *)aclc;
struct smc_clc_first_contact_ext *fce =
smc_get_clc_first_contact_ext(clc_v2, false);
+ struct net *net = sock_net(&smc->sk);
int rc;
if (!ini->first_contact_peer || aclc->hdr.version == SMC_V1)
@@ -1210,7 +1211,7 @@ static int smc_connect_rdma_v2_prepare(struct smc_sock *smc,
memcpy(ini->smcrv2.nexthop_mac, &aclc->r0.lcl.mac, ETH_ALEN);
ini->smcrv2.uses_gateway = false;
} else {
- if (smc_ib_find_route(smc->clcsock->sk->sk_rcv_saddr,
+ if (smc_ib_find_route(net, smc->clcsock->sk->sk_rcv_saddr,
smc_ib_gid_to_ipv4(aclc->r0.lcl.gid),
ini->smcrv2.nexthop_mac,
&ini->smcrv2.uses_gateway))
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9b66d6aeeb1a..89981dbe46c9 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -193,7 +193,7 @@ bool smc_ib_port_active(struct smc_ib_device *smcibdev, u8 ibport)
return smcibdev->pattr[ibport - 1].state == IB_PORT_ACTIVE;
}
-int smc_ib_find_route(__be32 saddr, __be32 daddr,
+int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
u8 nexthop_mac[], u8 *uses_gateway)
{
struct neighbour *neigh = NULL;
@@ -205,7 +205,7 @@ int smc_ib_find_route(__be32 saddr, __be32 daddr,
if (daddr == cpu_to_be32(INADDR_NONE))
goto out;
- rt = ip_route_output_flow(&init_net, &fl4, NULL);
+ rt = ip_route_output_flow(net, &fl4, NULL);
if (IS_ERR(rt))
goto out;
if (rt->rt_uses_gateway && rt->rt_gw_family != AF_INET)
@@ -235,6 +235,7 @@ static int smc_ib_determine_gid_rcu(const struct net_device *ndev,
if (smcrv2 && attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP &&
smc_ib_gid_to_ipv4((u8 *)&attr->gid) != cpu_to_be32(INADDR_NONE)) {
struct in_device *in_dev = __in_dev_get_rcu(ndev);
+ struct net *net = dev_net(ndev);
const struct in_ifaddr *ifa;
bool subnet_match = false;
@@ -248,7 +249,7 @@ static int smc_ib_determine_gid_rcu(const struct net_device *ndev,
}
if (!subnet_match)
goto out;
- if (smcrv2->daddr && smc_ib_find_route(smcrv2->saddr,
+ if (smcrv2->daddr && smc_ib_find_route(net, smcrv2->saddr,
smcrv2->daddr,
smcrv2->nexthop_mac,
&smcrv2->uses_gateway))
diff --git a/net/smc/smc_ib.h b/net/smc/smc_ib.h
index 4df5f8c8a0a1..ef8ac2b7546d 100644
--- a/net/smc/smc_ib.h
+++ b/net/smc/smc_ib.h
@@ -112,7 +112,7 @@ void smc_ib_sync_sg_for_device(struct smc_link *lnk,
int smc_ib_determine_gid(struct smc_ib_device *smcibdev, u8 ibport,
unsigned short vlan_id, u8 gid[], u8 *sgid_index,
struct smc_init_info_smcrv2 *smcrv2);
-int smc_ib_find_route(__be32 saddr, __be32 daddr,
+int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
u8 nexthop_mac[], u8 *uses_gateway);
bool smc_ib_is_valid_local_systemid(void);
int smcr_nl_get_device(struct sk_buff *skb, struct netlink_callback *cb);
The patch below does not apply to the 6.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.5.y
git checkout FETCH_HEAD
git cherry-pick -x c68681ae46eaaa1640b52fe366d21a93b2185df5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102057-arming-autism-6d65@gregkh' --subject-prefix 'PATCH 6.5.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c68681ae46eaaa1640b52fe366d21a93b2185df5 Mon Sep 17 00:00:00 2001
From: Albert Huang <huangjie.albert(a)bytedance.com>
Date: Wed, 11 Oct 2023 15:48:51 +0800
Subject: [PATCH] net/smc: fix smc clc failed issue when netdevice not in
init_net
If the netdevice is within a container and communicates externally
through network technologies such as VxLAN, we won't be able to find
routing information in the init_net namespace. To address this issue,
we need to add a struct net parameter to the smc_ib_find_route function.
This allow us to locate the routing information within the corresponding
net namespace, ensuring the correct completion of the SMC CLC interaction.
Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment")
Signed-off-by: Albert Huang <huangjie.albert(a)bytedance.com>
Reviewed-by: Dust Li <dust.li(a)linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20231011074851.95280-1-huangjie.albert@bytedance.…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bacdd971615e..7a874da90c7f 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1201,6 +1201,7 @@ static int smc_connect_rdma_v2_prepare(struct smc_sock *smc,
(struct smc_clc_msg_accept_confirm_v2 *)aclc;
struct smc_clc_first_contact_ext *fce =
smc_get_clc_first_contact_ext(clc_v2, false);
+ struct net *net = sock_net(&smc->sk);
int rc;
if (!ini->first_contact_peer || aclc->hdr.version == SMC_V1)
@@ -1210,7 +1211,7 @@ static int smc_connect_rdma_v2_prepare(struct smc_sock *smc,
memcpy(ini->smcrv2.nexthop_mac, &aclc->r0.lcl.mac, ETH_ALEN);
ini->smcrv2.uses_gateway = false;
} else {
- if (smc_ib_find_route(smc->clcsock->sk->sk_rcv_saddr,
+ if (smc_ib_find_route(net, smc->clcsock->sk->sk_rcv_saddr,
smc_ib_gid_to_ipv4(aclc->r0.lcl.gid),
ini->smcrv2.nexthop_mac,
&ini->smcrv2.uses_gateway))
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9b66d6aeeb1a..89981dbe46c9 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -193,7 +193,7 @@ bool smc_ib_port_active(struct smc_ib_device *smcibdev, u8 ibport)
return smcibdev->pattr[ibport - 1].state == IB_PORT_ACTIVE;
}
-int smc_ib_find_route(__be32 saddr, __be32 daddr,
+int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
u8 nexthop_mac[], u8 *uses_gateway)
{
struct neighbour *neigh = NULL;
@@ -205,7 +205,7 @@ int smc_ib_find_route(__be32 saddr, __be32 daddr,
if (daddr == cpu_to_be32(INADDR_NONE))
goto out;
- rt = ip_route_output_flow(&init_net, &fl4, NULL);
+ rt = ip_route_output_flow(net, &fl4, NULL);
if (IS_ERR(rt))
goto out;
if (rt->rt_uses_gateway && rt->rt_gw_family != AF_INET)
@@ -235,6 +235,7 @@ static int smc_ib_determine_gid_rcu(const struct net_device *ndev,
if (smcrv2 && attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP &&
smc_ib_gid_to_ipv4((u8 *)&attr->gid) != cpu_to_be32(INADDR_NONE)) {
struct in_device *in_dev = __in_dev_get_rcu(ndev);
+ struct net *net = dev_net(ndev);
const struct in_ifaddr *ifa;
bool subnet_match = false;
@@ -248,7 +249,7 @@ static int smc_ib_determine_gid_rcu(const struct net_device *ndev,
}
if (!subnet_match)
goto out;
- if (smcrv2->daddr && smc_ib_find_route(smcrv2->saddr,
+ if (smcrv2->daddr && smc_ib_find_route(net, smcrv2->saddr,
smcrv2->daddr,
smcrv2->nexthop_mac,
&smcrv2->uses_gateway))
diff --git a/net/smc/smc_ib.h b/net/smc/smc_ib.h
index 4df5f8c8a0a1..ef8ac2b7546d 100644
--- a/net/smc/smc_ib.h
+++ b/net/smc/smc_ib.h
@@ -112,7 +112,7 @@ void smc_ib_sync_sg_for_device(struct smc_link *lnk,
int smc_ib_determine_gid(struct smc_ib_device *smcibdev, u8 ibport,
unsigned short vlan_id, u8 gid[], u8 *sgid_index,
struct smc_init_info_smcrv2 *smcrv2);
-int smc_ib_find_route(__be32 saddr, __be32 daddr,
+int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
u8 nexthop_mac[], u8 *uses_gateway);
bool smc_ib_is_valid_local_systemid(void);
int smcr_nl_get_device(struct sk_buff *skb, struct netlink_callback *cb);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 0288c3e709e5fabd51e84715c5c798a02f43061a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102021-polygraph-modular-418a@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0288c3e709e5fabd51e84715c5c798a02f43061a Mon Sep 17 00:00:00 2001
From: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Date: Wed, 11 Oct 2023 16:33:33 -0700
Subject: [PATCH] ice: reset first in crash dump kernels
When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.
To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.
The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c > /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Reported-by: Vishal Agrawal <vagrawal(a)redhat.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh(a)canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha(a)intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c8286adae946..6550c46e4e36 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6,6 +6,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <generated/utsrelease.h>
+#include <linux/crash_dump.h>
#include "ice.h"
#include "ice_base.h"
#include "ice_lib.h"
@@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
return -EINVAL;
}
+ /* when under a kdump kernel initiate a reset before enabling the
+ * device in order to clear out any pending DMA transactions. These
+ * transactions can cause some systems to machine check when doing
+ * the pcim_enable_device() below.
+ */
+ if (is_kdump_kernel()) {
+ pci_save_state(pdev);
+ pci_clear_master(pdev);
+ err = pcie_flr(pdev);
+ if (err)
+ return err;
+ pci_restore_state(pdev);
+ }
+
/* this driver uses devres, see
* Documentation/driver-api/driver-model/devres.rst
*/
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 0288c3e709e5fabd51e84715c5c798a02f43061a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102020-spyglass-frostlike-54b5@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0288c3e709e5fabd51e84715c5c798a02f43061a Mon Sep 17 00:00:00 2001
From: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Date: Wed, 11 Oct 2023 16:33:33 -0700
Subject: [PATCH] ice: reset first in crash dump kernels
When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.
To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.
The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c > /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Reported-by: Vishal Agrawal <vagrawal(a)redhat.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh(a)canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha(a)intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c8286adae946..6550c46e4e36 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6,6 +6,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <generated/utsrelease.h>
+#include <linux/crash_dump.h>
#include "ice.h"
#include "ice_base.h"
#include "ice_lib.h"
@@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
return -EINVAL;
}
+ /* when under a kdump kernel initiate a reset before enabling the
+ * device in order to clear out any pending DMA transactions. These
+ * transactions can cause some systems to machine check when doing
+ * the pcim_enable_device() below.
+ */
+ if (is_kdump_kernel()) {
+ pci_save_state(pdev);
+ pci_clear_master(pdev);
+ err = pcie_flr(pdev);
+ if (err)
+ return err;
+ pci_restore_state(pdev);
+ }
+
/* this driver uses devres, see
* Documentation/driver-api/driver-model/devres.rst
*/
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 242e34500a32631f85c2b4eb6cb42a368a39e54f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102051-monsoon-democracy-4b10@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 242e34500a32631f85c2b4eb6cb42a368a39e54f Mon Sep 17 00:00:00 2001
From: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Date: Tue, 10 Oct 2023 13:30:59 -0700
Subject: [PATCH] ice: fix over-shifted variable
Since the introduction of the ice driver the code has been
double-shifting the RSS enabling field, because the define already has
shifts in it and can't have the regular pattern of "a << shiftval &
mask" applied.
Most places in the code got it right, but one line was still wrong. Fix
this one location for easy backports to stable. An in-progress patch
fixes the defines to "standard" and will be applied as part of the
regular -next process sometime after this one.
Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
CC: stable(a)vger.kernel.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha(a)intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller(a)intel.com>
Link: https://lore.kernel.org/r/20231010203101.406248-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 7bf9b7069754..73bbf06a76db 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1201,8 +1201,7 @@ static void ice_set_rss_vsi_ctx(struct ice_vsi_ctx *ctxt, struct ice_vsi *vsi)
ctxt->info.q_opt_rss = ((lut_type << ICE_AQ_VSI_Q_OPT_RSS_LUT_S) &
ICE_AQ_VSI_Q_OPT_RSS_LUT_M) |
- ((hash_type << ICE_AQ_VSI_Q_OPT_RSS_HASH_S) &
- ICE_AQ_VSI_Q_OPT_RSS_HASH_M);
+ (hash_type & ICE_AQ_VSI_Q_OPT_RSS_HASH_M);
}
static void
The patch below does not apply to the 6.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.5.y
git checkout FETCH_HEAD
git cherry-pick -x acab8ff29a2a226409cfe04e6d2e0896928c1b3a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102005-stomp-defy-0f8e@gregkh' --subject-prefix 'PATCH 6.5.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From acab8ff29a2a226409cfe04e6d2e0896928c1b3a Mon Sep 17 00:00:00 2001
From: Iulia Tanasescu <iulia.tanasescu(a)nxp.com>
Date: Thu, 28 Sep 2023 10:52:57 +0300
Subject: [PATCH] Bluetooth: ISO: Fix invalid context error
This moves the hci_le_terminate_big_sync call from rx_work
to cmd_sync_work, to avoid calling sleeping function from
an invalid context.
Reported-by: syzbot+c715e1bd8dfbcb1ab176(a)syzkaller.appspotmail.com
Fixes: a0bfde167b50 ("Bluetooth: ISO: Add support for connecting multiple BISes")
Signed-off-by: Iulia Tanasescu <iulia.tanasescu(a)nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index 31d02b54eea1..e6cfc65abcb8 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -7021,6 +7021,14 @@ static void hci_le_cis_req_evt(struct hci_dev *hdev, void *data,
hci_dev_unlock(hdev);
}
+static int hci_iso_term_big_sync(struct hci_dev *hdev, void *data)
+{
+ u8 handle = PTR_UINT(data);
+
+ return hci_le_terminate_big_sync(hdev, handle,
+ HCI_ERROR_LOCAL_HOST_TERM);
+}
+
static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data,
struct sk_buff *skb)
{
@@ -7065,16 +7073,17 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data,
rcu_read_lock();
}
+ rcu_read_unlock();
+
if (!ev->status && !i)
/* If no BISes have been connected for the BIG,
* terminate. This is in case all bound connections
* have been closed before the BIG creation
* has completed.
*/
- hci_le_terminate_big_sync(hdev, ev->handle,
- HCI_ERROR_LOCAL_HOST_TERM);
+ hci_cmd_sync_queue(hdev, hci_iso_term_big_sync,
+ UINT_PTR(ev->handle), NULL);
- rcu_read_unlock();
hci_dev_unlock(hdev);
}
The patch below does not apply to the 6.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.5.y
git checkout FETCH_HEAD
git cherry-pick -x a239110ee8e0b0aafa265f0d54f7a16744855e70
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023102019-shove-stagnant-982e@gregkh' --subject-prefix 'PATCH 6.5.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a239110ee8e0b0aafa265f0d54f7a16744855e70 Mon Sep 17 00:00:00 2001
From: Pauli Virtanen <pav(a)iki.fi>
Date: Sat, 30 Sep 2023 15:53:32 +0300
Subject: [PATCH] Bluetooth: hci_sync: always check if connection is alive
before deleting
In hci_abort_conn_sync it is possible that conn is deleted concurrently
by something else, also e.g. when waiting for hdev->lock. This causes
double deletion of the conn, so UAF or conn_hash.list corruption.
Fix by having all code paths check that the connection is still in
conn_hash before deleting it, while holding hdev->lock which prevents
any races.
Log (when powering off while BAP streaming, occurs rarely):
=======================================================================
kernel BUG at lib/list_debug.c:56!
...
? __list_del_entry_valid (lib/list_debug.c:56)
hci_conn_del (net/bluetooth/hci_conn.c:154) bluetooth
hci_abort_conn_sync (net/bluetooth/hci_sync.c:5415) bluetooth
? __pfx_hci_abort_conn_sync+0x10/0x10 [bluetooth]
? lock_release+0x1d5/0x3c0
? hci_disconnect_all_sync.constprop.0+0xb2/0x230 [bluetooth]
? __pfx_lock_release+0x10/0x10
? __kmem_cache_free+0x14d/0x2e0
hci_disconnect_all_sync.constprop.0+0xda/0x230 [bluetooth]
? __pfx_hci_disconnect_all_sync.constprop.0+0x10/0x10 [bluetooth]
? hci_clear_adv_sync+0x14f/0x170 [bluetooth]
? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
hci_set_powered_sync+0x293/0x450 [bluetooth]
=======================================================================
Fixes: 94d9ba9f9888 ("Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync")
Signed-off-by: Pauli Virtanen <pav(a)iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index d06e07a0ea5a..a15ab0b874a9 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -5369,6 +5369,7 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
{
int err = 0;
u16 handle = conn->handle;
+ bool disconnect = false;
struct hci_conn *c;
switch (conn->state) {
@@ -5399,24 +5400,15 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
hci_dev_unlock(hdev);
return 0;
case BT_BOUND:
- hci_dev_lock(hdev);
- hci_conn_failed(conn, reason);
- hci_dev_unlock(hdev);
- return 0;
+ break;
default:
- hci_dev_lock(hdev);
- conn->state = BT_CLOSED;
- hci_disconn_cfm(conn, reason);
- hci_conn_del(conn);
- hci_dev_unlock(hdev);
- return 0;
+ disconnect = true;
+ break;
}
hci_dev_lock(hdev);
- /* Check if the connection hasn't been cleanup while waiting
- * commands to complete.
- */
+ /* Check if the connection has been cleaned up concurrently */
c = hci_conn_hash_lookup_handle(hdev, handle);
if (!c || c != conn) {
err = 0;
@@ -5428,7 +5420,13 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
* or in case of LE it was still scanning so it can be cleanup
* safely.
*/
- hci_conn_failed(conn, reason);
+ if (disconnect) {
+ conn->state = BT_CLOSED;
+ hci_disconn_cfm(conn, reason);
+ hci_conn_del(conn);
+ } else {
+ hci_conn_failed(conn, reason);
+ }
unlock:
hci_dev_unlock(hdev);
This is the start of the stable review cycle for the 5.4.237 release.
There are 68 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.237-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.237-rc1
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: add missing discipline function
Masahiro Yamada <masahiroy(a)kernel.org>
UML: define RUNTIME_DISCARD_EXIT
Tom Saeger <tom.saeger(a)oracle.com>
sh: define RUNTIME_DISCARD_EXIT
Masahiro Yamada <masahiroy(a)kernel.org>
s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
Masahiro Yamada <masahiroy(a)kernel.org>
arch: fix broken BuildID for arm64 and riscv
H.J. Lu <hjl.tools(a)gmail.com>
x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDS
John Harrison <John.C.Harrison(a)Intel.com>
drm/i915: Don't use BAR mappings for ring buffers with LLC
Corey Minyard <cminyard(a)mvista.com>
ipmi:watchdog: Set panic count to proper value on a panic
Yejune Deng <yejune.deng(a)gmail.com>
ipmi/watchdog: replace atomic_add() and atomic_sub()
Paul Elder <paul.elder(a)ideasonboard.com>
media: ov5640: Fix analogue gain control
Alvaro Karsz <alvaro.karsz(a)solid-run.com>
PCI: Avoid FLR for SolidRun SNET DPU rev 1
Alvaro Karsz <alvaro.karsz(a)solid-run.com>
PCI: Add SolidRun vendor ID
Nathan Chancellor <nathan(a)kernel.org>
macintosh: windfarm: Use unsigned type for 1-bit bitfields
Edward Humes <aurxenon(a)lunos.org>
alpha: fix R_ALPHA_LITERAL reloc for large modules
Christophe Leroy <christophe.leroy(a)csgroup.eu>
powerpc: Check !irq instead of irq == NO_IRQ and remove NO_IRQ
xurui <xurui(a)kylinos.cn>
MIPS: Fix a compilation issue
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
clk: qcom: mmcc-apq8084: remove spdm clocks
Jan Kara <jack(a)suse.cz>
ext4: Fix deadlock during directory rename
Alexandre Ghiti <alexghiti(a)rivosinc.com>
riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
D. Wythe <alibuda(a)linux.alibaba.com>
net/smc: fix fallback failed while sendmsg with fastopen
Chandrakanth Patil <chandrakanth.patil(a)broadcom.com>
scsi: megaraid_sas: Update max supported LD IDs to 240
Lorenz Bauer <lorenz.bauer(a)isovalent.com>
btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
Florian Westphal <fw(a)strlen.de>
netfilter: tproxy: fix deadlock due to missing BH disable
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Avoid order-5 memory allocation for TPA data
Shigeru Yoshida <syoshida(a)redhat.com>
net: caif: Fix use-after-free in cfusbl_device_notify()
Yuiko Oshino <yuiko.oshino(a)microchip.com>
net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver
Lee Jones <lee.jones(a)linaro.org>
net: usb: lan78xx: Remove lots of set but unused 'ret' variables
Hangbin Liu <liuhangbin(a)gmail.com>
selftests: nft_nat: ensuring the listening side is up before starting the client
Eric Dumazet <edumazet(a)google.com>
ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping()
Kang Chen <void0red(a)gmail.com>
nfc: fdp: add null check of devm_kmalloc_array in fdp_nci_i2c_read_device_properties
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register
Jan Kara <jack(a)suse.cz>
ext4: Fix possible corruption when moving a directory
Bart Van Assche <bvanassche(a)acm.org>
scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
Volker Lendecke <vl(a)samba.org>
cifs: Fix uninitialized memory read in smb3_qfs_tcon()
Amir Goldstein <amir73il(a)gmail.com>
SMB3: Backup intent flag missing from some more ops
Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
ARM: dts: exynos: correct TMU phandle in Odroid XU3 family
Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
ARM: dts: exynos: correct TMU phandle in Odroid HC1
Marek Szyprowski <m.szyprowski(a)samsung.com>
ARM: dts: exynos: Add GPU thermal zone cooling maps for Odroid XU3/XU4/HC1
Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
ARM: dts: exynos: correct TMU phandle in Exynos5250
Krzysztof Kozlowski <krzk(a)kernel.org>
ARM: dts: exynos: Override thermal by label in Exynos5250
Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
ARM: dts: exynos: correct TMU phandle in Exynos4210
Krzysztof Kozlowski <krzk(a)kernel.org>
ARM: dts: exynos: Override thermal by label in Exynos4210
Jacob Pan <jacob.jun.pan(a)linux.intel.com>
iommu/vt-d: Fix PASID directory pointer coherency
Marc Zyngier <maz(a)kernel.org>
irqdomain: Fix domain registration race
Bixuan Cui <cuibixuan(a)huawei.com>
irqdomain: Change the type of 'size' in __irq_domain_add() to be consistent
Corey Minyard <cminyard(a)mvista.com>
ipmi:ssif: Add a timer between request retries
Corey Minyard <cminyard(a)mvista.com>
ipmi:ssif: Increase the message retry time
Corey Minyard <cminyard(a)mvista.com>
ipmi:ssif: Remove rtc_us_timer
Corey Minyard <cminyard(a)mvista.com>
ipmi:ssif: resend_msg() cannot fail
Liguang Zhang <zhangliguang(a)linux.alibaba.com>
ipmi:ssif: make ssif_i2c_send() void
Gavrilov Ilia <Ilia.Gavrilov(a)infotecs.ru>
iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter
Kim Phillips <kim.phillips(a)amd.com>
iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid options
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands
Jani Nikula <jani.nikula(a)intel.com>
drm/edid: fix AVI infoframe aspect ratio handling
Wayne Lin <Wayne.Lin(a)amd.com>
drm/edid: Add aspect ratios to HDMI 4K modes
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/edid: Fix HDMI VIC handling
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/edid: Extract drm_mode_cea_vic()
Fedor Pchelkin <pchelkin(a)ispras.ru>
nfc: change order inside nfc_se_io error path
Zhihao Cheng <chengzhihao1(a)huawei.com>
ext4: zero i_disksize when initializing the bootloader inode
Ye Bin <yebin10(a)huawei.com>
ext4: fix WARNING in ext4_update_inline_data
Ye Bin <yebin10(a)huawei.com>
ext4: move where set the MAY_INLINE_DATA flag is set
Darrick J. Wong <djwong(a)kernel.org>
ext4: fix another off-by-one fsmap error on 1k block filesystems
Eric Whitney <enwlinux(a)gmail.com>
ext4: fix RENAME_WHITEOUT handling for inline directories
Harry Wentland <harry.wentland(a)amd.com>
drm/connector: print max_requested_bpc in state debugfs
Andrew Cooper <andrew.cooper3(a)citrix.com>
x86/CPU/AMD: Disable XSAVES on AMD family 0x17
Theodore Ts'o <tytso(a)mit.edu>
fs: prevent out-of-bounds array speculation when closing a file descriptor
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 51 +++-
Makefile | 4 +-
arch/alpha/kernel/module.c | 4 +-
arch/arm/boot/dts/exynos4210.dtsi | 35 ++-
arch/arm/boot/dts/exynos5250.dtsi | 38 ++-
arch/arm/boot/dts/exynos5422-odroidhc1.dts | 38 ++-
arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi | 67 ++++-
arch/mips/include/asm/mach-rc32434/pci.h | 2 +-
arch/powerpc/include/asm/irq.h | 3 -
arch/powerpc/kernel/vmlinux.lds.S | 6 +-
arch/powerpc/platforms/44x/fsp2.c | 2 +-
arch/riscv/kernel/stacktrace.c | 2 +-
arch/s390/kernel/vmlinux.lds.S | 2 +
arch/sh/kernel/vmlinux.lds.S | 1 +
arch/um/kernel/vmlinux.lds.S | 2 +-
arch/x86/kernel/cpu/amd.c | 9 +
arch/x86/kernel/vmlinux.lds.S | 2 +
drivers/char/ipmi/ipmi_ssif.c | 146 ++++-------
drivers/char/ipmi/ipmi_watchdog.c | 8 +-
drivers/clk/qcom/mmcc-apq8084.c | 271 ---------------------
drivers/gpu/drm/drm_atomic.c | 1 +
drivers/gpu/drm/drm_edid.c | 130 ++++++----
drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +-
drivers/iommu/amd_iommu_init.c | 105 ++++++--
drivers/iommu/intel-pasid.c | 7 +
drivers/macintosh/windfarm_lm75_sensor.c | 4 +-
drivers/macintosh/windfarm_smu_sensors.c | 4 +-
drivers/media/i2c/ov5640.c | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 23 +-
drivers/net/phy/microchip.c | 32 +++
drivers/net/usb/lan78xx.c | 189 ++++++--------
drivers/nfc/fdp/i2c.c | 4 +
drivers/pci/quirks.c | 8 +
drivers/s390/block/dasd_diag.c | 7 +-
drivers/s390/block/dasd_fba.c | 7 +-
drivers/s390/block/dasd_int.h | 1 -
drivers/scsi/hosts.c | 2 +
drivers/scsi/megaraid/megaraid_sas.h | 2 +
drivers/scsi/megaraid/megaraid_sas_fp.c | 2 +-
fs/cifs/cifsacl.c | 14 +-
fs/cifs/cifsfs.c | 2 +-
fs/cifs/cifsglob.h | 6 +-
fs/cifs/cifsproto.h | 8 +
fs/cifs/connect.c | 2 +-
fs/cifs/dir.c | 5 +-
fs/cifs/file.c | 10 +-
fs/cifs/inode.c | 8 +-
fs/cifs/ioctl.c | 2 +-
fs/cifs/link.c | 18 +-
fs/cifs/smb1ops.c | 19 +-
fs/cifs/smb2inode.c | 9 +-
fs/cifs/smb2ops.c | 92 +++----
fs/cifs/smb2proto.h | 2 +-
fs/ext4/fsmap.c | 2 +
fs/ext4/inline.c | 1 -
fs/ext4/inode.c | 7 +-
fs/ext4/ioctl.c | 1 +
fs/ext4/namei.c | 36 ++-
fs/ext4/xattr.c | 3 +
fs/file.c | 1 +
include/asm-generic/vmlinux.lds.h | 16 +-
include/linux/irqdomain.h | 2 +-
include/linux/pci_ids.h | 2 +
include/net/netfilter/nf_tproxy.h | 7 +
kernel/bpf/btf.c | 1 +
kernel/irq/irqdomain.c | 62 +++--
net/caif/caif_usb.c | 3 +
net/ipv4/netfilter/nf_tproxy_ipv4.c | 2 +-
net/ipv6/ila/ila_xlat.c | 1 +
net/ipv6/netfilter/nf_tproxy_ipv6.c | 2 +-
net/nfc/netlink.c | 2 +-
net/smc/af_smc.c | 13 +-
tools/testing/selftests/netfilter/nft_nat.sh | 2 +
74 files changed, 783 insertions(+), 811 deletions(-)
Dear Stable Maintainers,
I am submitting several individual patches for consideration to be applied to the Linux stable kernel v6.1.
Below, you will find the necessary information for each patch:
Patch 1:
Subject: igc: remove I226 Qbv BaseTime restriction
Commit ID: b8897dc54e3b
Reason for Application: Remove the Qbv BaseTime restriction for I226 so that the BaseTime can be scheduled to the future time
Target Kernel Version: v6.1
Patch 2:
Subject: igc: enable Qbv configuration for 2nd GCL
Commit ID: 5ac1231ac14d
Reason for Application: Make reset task only executes for i225 and Qbv disabling to allow i226 configure for 2nd GCL without resetting the adapter.
Target Kernel Version: v6.1
Patch 3:
Subject: igc: Remove reset adapter task for i226 during disable tsn config
Commit ID: 1d1b4c63ba73
Reason for Application: Removes the power cycle restriction so that when user configure/remove any TSN mode, it would not go into power cycle reset adapter.
Target Kernel Version: v6.1
Patch 4:
Subject: igc: Add qbv_config_change_errors counter
Commit ID: ae4fe4698300
Reason for Application: Add ConfigChangeError(qbv_config_change_errors) when user try to set the AdminBaseTime to past value while the current GCL is still running.
Target Kernel Version: v6.1
Patch 5:
Subject: igc: Add condition for qbv_config_change_errors counter
Commit ID: ed89b74d2dc9
Reason for Application: Fix patch ae4fe4698300 ("igc: Add qbv_config_change_errors counter")
Target Kernel Version: v6.1
Patch 6:
Subject: igc: Fix race condition in PTP Tx code
Commit ID: 9c50e2b150c8
Reason for Application: Fix race condition introduced by patch 2c344ae24501 ("igc: Add support for TX timestamping")
Target Kernel Version: v6.1
Thank you for your time and consideration.
Please let me know if you require any additional information or have any questions regarding these patch submissions.
Best regards,
Faizal
faizal.abdul.rahim(a)intel.com
Add support for the IX-100, IX-200 and IX-400 serial cards.
Cc: stable(a)vger.kernel.org
Signed-off-by: Cameron Williams <cang1(a)live.co.uk>
---
For stable: This patch is only applicable to 5.15 LTS and up, other LTS
kernels dont have the neccesary quirk options available (in patch 11) to
have these cards work correctly.
v3 - v4:
Add Cc: tag.
v2 - v3:
Re-submit patch series using git send-email to make threading work.
v1 - v2:
This is a resubmission series for the patch series below. That series
was lots of changes sent to lots of maintainers, this series is just for
the tty/serial/8250 subsystem.
[1] https://lore.kernel.org/all/DU0PR02MB789950E64D808DB57E9D7312C4F8A@DU0PR02M…
[2] https://lore.kernel.org/all/DU0PR02MB7899DE53DFC900EFB50E53F2C4F8A@DU0PR02M…
[3] https://lore.kernel.org/all/DU0PR02MB7899033E7E81EAF3694BC20AC4F8A@DU0PR02M…
[4] https://lore.kernel.org/all/DU0PR02MB7899EABA8C3DCAC94DCC79D4C4F8A@DU0PR02M…
drivers/tty/serial/8250/8250_pci.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 65ac7fdd361f..af970b372cc2 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -4931,6 +4931,27 @@ static const struct pci_device_id serial_pci_tbl[] = {
{ PCI_VENDOR_ID_INTASHIELD, PCI_DEVICE_ID_INTASHIELD_IS400,
PCI_ANY_ID, PCI_ANY_ID, 0, 0, /* 135a.0dc0 */
pbn_b2_4_115200 },
+ /*
+ * IntaShield IX-100
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4027,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_1_15625000 },
+ /*
+ * IntaShield IX-200
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4028,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_2_15625000 },
+ /*
+ * IntaShield IX-400
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4029,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
/* Brainboxes Devices */
/*
* Brainboxes UC-101
--
2.42.0
The PX-835 and PX-846 are variants of each other, clarify that
in the comment for the card.
This patch makes no functional difference.
Fixes: ef5a03a26c87 ("tty: 8250: Add support for Brainboxes PX cards.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Cameron Williams <cang1(a)live.co.uk>
---
For stable: This patch is only applicable to 5.15 LTS and up, other LTS
kernels dont have PX card support.
v3 - v4:
Split patch v3 part 5 into multiple Fixes patches and an Additions patch.
Add Fixes: and Cc: tag.
v2 - v3:
Alter commit message a little to make the additions/fixes cleaner
Re-submit patch series using git send-email to make threading work.
v1 - v2:
This is a resubmission series for the patch series below. That series
was lots of changes sent to lots of maintainers, this series is just for
the tty/serial/8250 subsystem.
[1] https://lore.kernel.org/all/DU0PR02MB789950E64D808DB57E9D7312C4F8A@DU0PR02M…
[2] https://lore.kernel.org/all/DU0PR02MB7899DE53DFC900EFB50E53F2C4F8A@DU0PR02M…
[3] https://lore.kernel.org/all/DU0PR02MB7899033E7E81EAF3694BC20AC4F8A@DU0PR02M…
[4] https://lore.kernel.org/all/DU0PR02MB7899EABA8C3DCAC94DCC79D4C4F8A@DU0PR02M…
drivers/tty/serial/8250/8250_pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index a68ae56e5578..f1acf5a4704f 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -5246,7 +5246,7 @@ static const struct pci_device_id serial_pci_tbl[] = {
0, 0,
pbn_oxsemi_2_15625000 },
/*
- * Brainboxes PX-846
+ * Brainboxes PX-835/PX-846
*/
{ PCI_VENDOR_ID_INTASHIELD, 0x4008,
PCI_ANY_ID, PCI_ANY_ID,
--
2.42.0
The PX-803/PX-857 are variants of each other, add a note.
Additionally fix up the port counts for the card (2, not 1).
Fixes: ef5a03a26c87 ("tty: 8250: Add support for Brainboxes PX cards.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Cameron Williams <cang1(a)live.co.uk>
---
For stable: This patch is only applicable to 5.15 LTS and up, other LTS
kernels dont have PX card support.
v3 - v4:
Split patch v3 part 5 into multiple Fixes patches and an Additions patch.
Add Fixes: and Cc: tag.
v2 - v3:
Alter commit message a little to make the additions/fixes cleaner
Re-submit patch series using git send-email to make threading work.
v1 - v2:
This is a resubmission series for the patch series below. That series
was lots of changes sent to lots of maintainers, this series is just for
the tty/serial/8250 subsystem.
[1] https://lore.kernel.org/all/DU0PR02MB789950E64D808DB57E9D7312C4F8A@DU0PR02M…
[2] https://lore.kernel.org/all/DU0PR02MB7899DE53DFC900EFB50E53F2C4F8A@DU0PR02M…
[3] https://lore.kernel.org/all/DU0PR02MB7899033E7E81EAF3694BC20AC4F8A@DU0PR02M…
[4] https://lore.kernel.org/all/DU0PR02MB7899EABA8C3DCAC94DCC79D4C4F8A@DU0PR02M…
drivers/tty/serial/8250/8250_pci.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 59074a709254..a68ae56e5578 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -5235,16 +5235,16 @@ static const struct pci_device_id serial_pci_tbl[] = {
0, 0,
pbn_oxsemi_4_15625000 },
/*
- * Brainboxes PX-803
+ * Brainboxes PX-803/PX-857
*/
{ PCI_VENDOR_ID_INTASHIELD, 0x4009,
PCI_ANY_ID, PCI_ANY_ID,
0, 0,
- pbn_b0_1_115200 },
+ pbn_b0_2_115200 },
{ PCI_VENDOR_ID_INTASHIELD, 0x401E,
PCI_ANY_ID, PCI_ANY_ID,
0, 0,
- pbn_oxsemi_1_15625000 },
+ pbn_oxsemi_2_15625000 },
/*
* Brainboxes PX-846
*/
--
2.42.0
The UC-257 is a serial + LPT card, so remove it from this driver.
A patch has been submitted to add it to parport_serial instead.
Additionaly, the UC-431 does not use this card ID, only the UC-420
does. The 431 is a 3-port card and there is no generic 3-port configuration
available, so remove reference to it from this driver.
Fixes: 152d1afa834c ("tty: Add support for Brainboxes UC cards.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Cameron Williams <cang1(a)live.co.uk>
---
With regards to the parport_serial patch, I mistakenly did not copy in
the LKML, only the maintainer, so I have no link to the patch, my mistake
and apologies.
v3 - v4:
Split patch v3 part 2 into Fixes and Additions
Add Fixes: and Cc: tag.
v2 - v3:
Re-submit patch series using git send-email to make threading work.
v1 - v2:
This is a resubmission series for the patch series below. That series
was lots of changes sent to lots of maintainers, this series is just for
the tty/serial/8250 subsystem.
[1] https://lore.kernel.org/all/DU0PR02MB789950E64D808DB57E9D7312C4F8A@DU0PR02M…
[2] https://lore.kernel.org/all/DU0PR02MB7899DE53DFC900EFB50E53F2C4F8A@DU0PR02M…
[3] https://lore.kernel.org/all/DU0PR02MB7899033E7E81EAF3694BC20AC4F8A@DU0PR02M…
[4] https://lore.kernel.org/all/DU0PR02MB7899EABA8C3DCAC94DCC79D4C4F8A@DU0PR02M…
drivers/tty/serial/8250/8250_pci.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index ecb4e9acc70d..a59b9b8eaa68 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -4940,13 +4940,6 @@ static const struct pci_device_id serial_pci_tbl[] = {
PCI_ANY_ID, PCI_ANY_ID,
0, 0,
pbn_b2_1_115200 },
- /*
- * Brainboxes UC-257
- */
- { PCI_VENDOR_ID_INTASHIELD, 0x0861,
- PCI_ANY_ID, PCI_ANY_ID,
- 0, 0,
- pbn_b2_2_115200 },
/*
* Brainboxes UC-260/271/701/756
*/
@@ -5026,7 +5019,7 @@ static const struct pci_device_id serial_pci_tbl[] = {
0, 0,
pbn_b2_4_115200 },
/*
- * Brainboxes UC-420/431
+ * Brainboxes UC-420
*/
{ PCI_VENDOR_ID_INTASHIELD, 0x0921,
PCI_ANY_ID, PCI_ANY_ID,
--
2.42.0
The quilt patch titled
Subject: x86/mm: drop 4MB restriction on minimal NUMA node size
has been removed from the -mm tree. Its filename was
x86-mm-drop-4mb-restriction-on-minimal-numa-node-size.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Mike Rapoport (IBM)" <rppt(a)kernel.org>
Subject: x86/mm: drop 4MB restriction on minimal NUMA node size
Date: Tue, 17 Oct 2023 09:22:15 +0300
Qi Zheng reports crashes in a production environment and provides a
simplified example as a reproducer:
For example, if we use qemu to start a two NUMA node kernel,
one of the nodes has 2M memory (less than NODE_MIN_SIZE),
and the other node has 2G, then we will encounter the
following panic:
[ 0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 0.150783] #PF: supervisor write access in kernel mode
[ 0.151488] #PF: error_code(0x0002) - not-present page
<...>
[ 0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
<...>
[ 0.169781] Call Trace:
[ 0.170159] <TASK>
[ 0.170448] deactivate_slab+0x187/0x3c0
[ 0.171031] ? bootstrap+0x1b/0x10e
[ 0.171559] ? preempt_count_sub+0x9/0xa0
[ 0.172145] ? kmem_cache_alloc+0x12c/0x440
[ 0.172735] ? bootstrap+0x1b/0x10e
[ 0.173236] bootstrap+0x6b/0x10e
[ 0.173720] kmem_cache_init+0x10a/0x188
[ 0.174240] start_kernel+0x415/0x6ac
[ 0.174738] secondary_startup_64_no_verify+0xe0/0xeb
[ 0.175417] </TASK>
[ 0.175713] Modules linked in:
[ 0.176117] CR2: 0000000000000000
The crashes happen because of inconsistency between nodemask that has
nodes with less than 4MB as memoryless and the actual memory fed into
core mm.
The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
empty node in SRAT parsing") that introduced minimal size of a NUMA node
does not explain why a node size cannot be less than 4MB and what boot
failures this restriction might fix.
Since then a lot has changed and core mm won't confuse badly about small
node sizes.
Drop the limitation for the minimal node size.
Link: https://lkml.kernel.org/r/20231017062215.171670-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Reported-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Closes: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.c…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Tested-by: Mario Casquero <mcasquer(a)redhat.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Ziljstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/include/asm/numa.h | 7 -------
arch/x86/mm/numa.c | 7 -------
2 files changed, 14 deletions(-)
--- a/arch/x86/include/asm/numa.h~x86-mm-drop-4mb-restriction-on-minimal-numa-node-size
+++ a/arch/x86/include/asm/numa.h
@@ -12,13 +12,6 @@
#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-/*
- * Too small node sizes may confuse the VM badly. Usually they
- * result from BIOS bugs. So dont recognize nodes as standalone
- * NUMA entities that have less than this amount of RAM listed:
- */
-#define NODE_MIN_SIZE (4*1024*1024)
-
extern int numa_off;
/*
--- a/arch/x86/mm/numa.c~x86-mm-drop-4mb-restriction-on-minimal-numa-node-size
+++ a/arch/x86/mm/numa.c
@@ -601,13 +601,6 @@ static int __init numa_register_memblks(
if (start >= end)
continue;
- /*
- * Don't confuse VM with a node that doesn't have the
- * minimum amount of memory:
- */
- if (end && (end - start) < NODE_MIN_SIZE)
- continue;
-
alloc_node_data(nid);
}
_
Patches currently in -mm which might be from rppt(a)kernel.org are
Several stable branches (4.14 to 5.15 inclusive) belatedly got
backports of commit 1202cdd665315c ("Remove DECnet support from
kernel"). This causes a minor regression for the documentation build,
which was fixed upstream by:
commit 1faa34672f8a17a3e155e74bde9648564e9480d6
Author: Bagas Sanjaya <bagasdotme(a)gmail.com>
Date: Wed Aug 24 10:58:04 2022 +0700
Documentation: sysctl: align cells in second content column
Please apply this to the affected branches.
Ben.
--
Ben Hutchings
Who are all these weirdos? - David Bowie, on joining IRC
Please backport commit 786dee864804 ("mm/memory_hotplug: ratelimit page
migration warnings") to 5.10 stable branch.
Commit message:
"""
When offlining memory the system can attempt to migrate a lot of pages, if
there are problems with migration this can flood the logs. Printing all
the data hogs the CPU and cause some RT threads to run for a long time,
which may have some bad consequences.
Rate limit the page migration warnings in order to avoid this.
"""
We are sometimes seeing RCU stalls while offlining memory with the 5.10
kernel due to the printing of these messages.
Applying this patch solves the problem by ratelimiting the prints.
Thanks,
Puranjay
> If this holds up without regressions than all LTSes. That's what Amir
> and Leah did for some other work. I can add that to the comment for
> clarity.
Unfortunately, this has not held up in LTSes without causing
regressions, specifically in crun:
Crun issue and patch
1. https://github.com/containers/crun/issues/1308
2. https://github.com/containers/crun/pull/1309
Debian bug report
1. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1053821
I think it should be reverted in LTSes and possibly in upstream.
Yours kindly, Jesse Hathaway
P.S. apologies for not having the correct threading headers. I am not on
the list.
I'm announcing the release of the 6.1.58 kernel.
All users of the 6.1 kernel series must upgrade.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
fs/nfs/direct.c | 134 ++++++++++++++---------------------------------
fs/nfs/write.c | 23 +++-----
include/linux/nfs_page.h | 4 -
lib/test_meminit.c | 2
5 files changed, 55 insertions(+), 110 deletions(-)
Greg Kroah-Hartman (7):
Revert "NFS: More fixes for nfs_direct_write_reschedule_io()"
Revert "NFS: Use the correct commit info in nfs_join_page_group()"
Revert "NFS: More O_DIRECT accounting fixes for error paths"
Revert "NFS: Fix O_DIRECT locking issues"
Revert "NFS: Fix error handling for O_DIRECT write scheduling"
lib/test_meminit: fix off-by-one error in test_pages()
Linux 6.1.58
Hi,
As reported before[1], we found another regression in 6.1 when doing
performance comparisons with 5.10. This one is caused by CONFIG_DEBUG_PREEMPT
being enabled by default by the following upstream commit if you have the
right config dependencies enabled (commit is introduced in v5.16-rc1):
"""
commit c597bfddc9e9e8a63817252b67c3ca0e544ace26
Author: Frederic Weisbecker <frederic(a)kernel.org>
Date: Tue Sep 14 12:31:34 2021 +0200
sched: Provide Kconfig support for default dynamic preempt mode
"""
We found up to 8% performance improvement with CONFIG_DEBUG_PREEMPT
disabled in different perf benchmarks (including UnixBench process
creation and redis). The root cause is explained in the commit log
below which is merged in 6.3 and applies (almost) clealy on 6.1.59.
"""
commit cc6003916ed46d7a67d91ee32de0f9138047d55f
Author: Hyeonggon Yoo <42.hyeyoo(a)gmail.com>
Date: Sat Jan 21 12:39:42 2023 +0900
lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default
In workloads where this_cpu operations are frequently performed,
enabling DEBUG_PREEMPT may result in significant increase in
runtime overhead due to frequent invocation of
__this_cpu_preempt_check() function.
This can be demonstrated through benchmarks such as hackbench where this
configuration results in a 10% reduction in performance, primarily due to
the added overhead within memcg charging path.
"""
[1] https://lore.kernel.org/stable/010edf5a-453d-4c98-9c07-12e75d3f983c@amazon.…
Igor Raits and Bagas Sanjaya report a RQCF_ACT_SKIP leak warning.
Link: https://lore.kernel.org/all/a5dd536d-041a-2ce9-f4b7-64d8d85c86dc@gmail.com
This warning may be triggered in the following situations:
CPU0 CPU1
__schedule()
*rq->clock_update_flags <<= 1;* unregister_fair_sched_group()
pick_next_task_fair+0x4a/0x410 destroy_cfs_bandwidth()
newidle_balance+0x115/0x3e0 for_each_possible_cpu(i) *i=0*
rq_unpin_lock(this_rq, rf) __cfsb_csd_unthrottle()
raw_spin_rq_unlock(this_rq)
rq_lock(*CPU0_rq*, &rf)
rq_clock_start_loop_update()
rq->clock_update_flags & RQCF_ACT_SKIP <--
raw_spin_rq_lock(this_rq)
The purpose of RQCF_ACT_SKIP is to skip the update rq clock,
but the update is very early in __schedule(), but we clear
RQCF_*_SKIP very late, causing it to span that gap above
and triggering this warning.
In __schedule() we can clear the RQCF_*_SKIP flag immediately
after update_rq_clock() to avoid this RQCF_ACT_SKIP leak warning.
And set rq->clock_update_flags to RQCF_UPDATED to avoid
rq->clock_update_flags < RQCF_ACT_SKIP warning that may be triggered later.
Fixes: ebb83d84e49b ("sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()")
Cc: stable(a)vger.kernel.org
Reported-by: Igor Raits <igor.raits(a)gmail.com>
Reported-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Closes: https://lore.kernel.org/all/20230913082424.73252-1-jiahao.os@bytedance.com
Suggested-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Hao Jia <jiahao.os(a)bytedance.com>
---
kernel/sched/core.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 802551e0009b..afb8d213155b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5374,8 +5374,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
/* switch_mm_cid() requires the memory barriers above. */
switch_mm_cid(rq, prev, next);
- rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
-
prepare_lock_switch(rq, next, rf);
/* Here we just switch the register state and the stack. */
@@ -6615,6 +6613,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
/* Promote REQ to ACT */
rq->clock_update_flags <<= 1;
update_rq_clock(rq);
+ rq->clock_update_flags = RQCF_UPDATED;
switch_count = &prev->nivcsw;
@@ -6694,8 +6693,6 @@ static void __sched notrace __schedule(unsigned int sched_mode)
/* Also unlocks the rq: */
rq = context_switch(rq, prev, next, &rf);
} else {
- rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
-
rq_unpin_lock(rq, &rf);
__balance_callbacks(rq);
raw_spin_rq_unlock_irq(rq);
--
2.39.2
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.
Cc: stable(a)vger.kernel.org
Signed-off-by: Francis Laniel <flaniel(a)linux.microsoft.com>
---
.../ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc | 13 +++++++++++++
1 file changed, 13 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
new file mode 100644
index 000000000000..bc9514428dba
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
@@ -0,0 +1,13 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test failure of registering kprobe on non unique symbol
+# requires: kprobe_events
+
+SYMBOL='name_show'
+
+# We skip this test on kernel where SYMBOL is unique or does not exist.
+if [ "$(grep -c -E "[[:alnum:]]+ t ${SYMBOL}" /proc/kallsyms)" -le '1' ]; then
+ exit_unsupported
+fi
+
+! echo "p:test_non_unique ${SYMBOL}" > kprobe_events
--
2.34.1
When a kprobe is attached to a function that's name is not unique (is
static and shares the name with other functions in the kernel), the
kprobe is attached to the first function it finds. This is a bug as the
function that it is attaching to is not necessarily the one that the
user wants to attach to.
Instead of blindly picking a function to attach to what is ambiguous,
error with EADDRNOTAVAIL to let the user know that this function is not
unique, and that the user must use another unique function with an
address offset to get to the function they want to attach to.
Cc: stable(a)vger.kernel.org
Fixes: 413d37d1eb69 ("tracing: Add kprobe-based event tracer")
Suggested-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Francis Laniel <flaniel(a)linux.microsoft.com>
Link: https://lore.kernel.org/lkml/20230819101105.b0c104ae4494a7d1f2eea742@kernel…
---
kernel/trace/trace_kprobe.c | 63 +++++++++++++++++++++++++++++++++++++
kernel/trace/trace_probe.h | 1 +
2 files changed, 64 insertions(+)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3d7a180a8427..a8fef6ab0872 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -705,6 +705,25 @@ static struct notifier_block trace_kprobe_module_nb = {
.priority = 1 /* Invoked after kprobe module callback */
};
+static int count_symbols(void *data, unsigned long unused)
+{
+ unsigned int *count = data;
+
+ (*count)++;
+
+ return 0;
+}
+
+static unsigned int number_of_same_symbols(char *func_name)
+{
+ unsigned int count;
+
+ count = 0;
+ kallsyms_on_each_match_symbol(count_symbols, func_name, &count);
+
+ return count;
+}
+
static int __trace_kprobe_create(int argc, const char *argv[])
{
/*
@@ -836,6 +855,31 @@ static int __trace_kprobe_create(int argc, const char *argv[])
}
}
+ if (symbol && !strchr(symbol, ':')) {
+ unsigned int count;
+
+ count = number_of_same_symbols(symbol);
+ if (count > 1) {
+ /*
+ * Users should use ADDR to remove the ambiguity of
+ * using KSYM only.
+ */
+ trace_probe_log_err(0, NON_UNIQ_SYMBOL);
+ ret = -EADDRNOTAVAIL;
+
+ goto error;
+ } else if (count == 0) {
+ /*
+ * We can return ENOENT earlier than when register the
+ * kprobe.
+ */
+ trace_probe_log_err(0, BAD_PROBE_ADDR);
+ ret = -ENOENT;
+
+ goto error;
+ }
+ }
+
trace_probe_log_set_index(0);
if (event) {
ret = traceprobe_parse_event_name(&event, &group, gbuf,
@@ -1695,6 +1739,7 @@ static int unregister_kprobe_event(struct trace_kprobe *tk)
}
#ifdef CONFIG_PERF_EVENTS
+
/* create a trace_kprobe, but don't add it to global lists */
struct trace_event_call *
create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
@@ -1705,6 +1750,24 @@ create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
int ret;
char *event;
+ if (func) {
+ unsigned int count;
+
+ count = number_of_same_symbols(func);
+ if (count > 1)
+ /*
+ * Users should use addr to remove the ambiguity of
+ * using func only.
+ */
+ return ERR_PTR(-EADDRNOTAVAIL);
+ else if (count == 0)
+ /*
+ * We can return ENOENT earlier than when register the
+ * kprobe.
+ */
+ return ERR_PTR(-ENOENT);
+ }
+
/*
* local trace_kprobes are not added to dyn_event, so they are never
* searched in find_trace_kprobe(). Therefore, there is no concern of
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 02b432ae7513..850d9ecb6765 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -450,6 +450,7 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call,
C(BAD_MAXACT, "Invalid maxactive number"), \
C(MAXACT_TOO_BIG, "Maxactive is too big"), \
C(BAD_PROBE_ADDR, "Invalid probed address or symbol"), \
+ C(NON_UNIQ_SYMBOL, "The symbol is not unique"), \
C(BAD_RETPROBE, "Retprobe address must be an function entry"), \
C(NO_TRACEPOINT, "Tracepoint is not found"), \
C(BAD_ADDR_SUFFIX, "Invalid probed address suffix"), \
--
2.34.1
Hi.
In the kernel source code, it exists different functions which share the same
name but which have, of course, different addresses as they can be defined in
different modules:
# Kernel was compiled with CONFIG_NTFS_FS and CONFIG_NTFS3_FS as built-in.
root@vm-amd64:~# grep ntfs_file_write_iter /proc/kallsyms
ffffffff814ce3c0 t __pfx_ntfs_file_write_iter
ffffffff814ce3d0 t ntfs_file_write_iter
ffffffff814fc8a0 t __pfx_ntfs_file_write_iter
ffffffff814fc8b0 t ntfs_file_write_iter
This can be source of troubles when you create a PMU kprobe for such a function,
as it will only install one for the first address (e.g. 0xffffffff814ce3d0 in
the above).
This could lead to some troubles were BPF based tools does not report any event
because the second function is not called:
root@vm-amd64:/mnt# mount | grep /mnt
/foo.img on /mnt type ntfs3 (rw,relatime,uid=0,gid=0,iocharset=utf8)
# ig is a tool which installs a PMU kprobe on ntfs_file_write_iter().
root@vm-amd64:/mnt# ig trace fsslower -m 0 -f ntfs3 --host &> /tmp/foo &
[1] 207
root@vm-amd64:/mnt# dd if=./foo of=./bar count=3
3+0 records in
3+0 records out
1536 bytes (1.5 kB, 1.5 KiB) copied, 0.00543323 s, 283 kB/s
root@vm-amd64:/mnt# fg
ig trace fsslower -m 0 -f ntfs3 --host &> /tmp/foo
^Croot@vm-amd64:/mnt# more /tmp/foo
RUNTIME.CONTAINERNAME RUNTIME.CONTAIN… PID COMM
T BYTES OFFSET LAT FILE
214 dd
R 512 0 766 foo
214 dd
R 512 512 9 foo
214 dd
As you can see in the above, only read events are reported and no write because
the kprobe is installed for the old ntfs_file_write_iter() and not the ntfs3
one.
The same behavior occurs with sysfs kprobe:
root@vm-amd64:/# echo 'p:probe/ntfs_file_write_iter ntfs_file_write_iter' > /sys/kernel/tracing/kprobe_events
root@vm-amd64:/# cat /sys/kernel/tracing/kprobe_events
p:probe/ntfs_file_write_iter ntfs_file_write_iter
root@vm-amd64:/# mount | grep /mnt
/foo.img on /mnt type ntfs3 (rw,relatime,uid=0,gid=0,iocharset=utf8)
root@vm-amd64:/# perf record -e probe:ntfs_file_write_iter &
[1] 210
root@vm-amd64:/# cd /mnt/
root@vm-amd64:/mnt# dd if=./foo of=./bar count=3
3+0 records in
3+0 records out
1536 bytes (1.5 kB, 1.5 KiB) copied, 0.00234793 s, 654 kB/s
root@vm-amd64:/mnt# cd -
/
root@vm-amd64:/# fg
perf record -e probe:ntfs_file_write_iter
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.056 MB perf.data ]
root@vm-amd64:/# perf report
Error:
The perf.data data has no samples!
# To display the perf.data header info, please use --header/--header-only optio>
#
In this contribution, I modified the functions creating sysfs and PMU kprobes to
test if the function name given as argument matches several symbols.
In this case, these functions return EADDRNOTAVAIL to indicate the user to use
addr and offs to remove this ambiguity.
So, when the above BPF tool is run, the following error message is printed:
root@vm-amd64:~# ig trace fsslower -m 0 -f ntfs3 --host &> /tmp/foo &
[1] 228
root@vm-amd64:~# more /tmp/foo
RUNTIME.CONTAINERNAME RUNTIME.CONTAIN… PID COMM
T BYTES OFFSET LAT FILE
Error: running gadget: running gadget: installing tracer: attaching kprobe: crea
ting perf_kprobe PMU (arch-specific fallback for "ntfs_file_write_iter"): token
ntfs_file_write_iter: opening perf event: cannot assign requested address
And the same with sysfs kprobe:
root@vm-amd64:/# echo 'p:probe/ntfs_file_write_iter ntfs_file_write_iter' > /sys/kernel/tracing/kprobe_events
-bash: echo: write error: Cannot assign requested address
Note that, this does not influence perf as it installs kprobes as offset on
_text:
root@vm-amd64:/# perf probe --add ntfs_file_write_iter
Added new events:
probe:ntfs_file_write_iter (on ntfs_file_write_iter)
probe:ntfs_file_write_iter (on ntfs_file_write_iter)
...
root@vm-amd64:/# cat /sys/kernel/tracing/kprobe_events
p:probe/ntfs_file_write_iter _text+5039088
p:probe/ntfs_file_write_iter _text+5228752
Note that, this contribution is the conclusion of a previous RFC which intended
to install a PMU kprobe for all matching symbols [1, 2].
If you see any way to improve this contribution, particularly if you have an
idea to add tests or documentation for this behavior, please share your
feedback.
Changes since:
v1:
* Use EADDRNOTAVAIL instead of adding a new error code.
* Correct also this behavior for sysfs kprobe.
v2:
* Count the number of symbols corresponding to function name and return
EADDRNOTAVAIL if higher than 1.
* Return ENOENT if above count is 0, as it would be returned later by while
registering the kprobe.
v3:
* Check symbol does not contain ':' before testing its uniqueness.
* Add a selftest to check this is not possible to install a kprobe for a non
unique symbol.
v5:
* No changes, just add linux-stable as recipient.
Francis Laniel (2):
tracing/kprobes: Return EADDRNOTAVAIL when func matches several
symbols
selftests/ftrace: Add new test case which checks non unique symbol
kernel/trace/trace_kprobe.c | 63 +++++++++++++++++++
kernel/trace/trace_probe.h | 1 +
.../test.d/kprobe/kprobe_non_uniq_symbol.tc | 13 ++++
3 files changed, 77 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
Best regards and thank you in advance.
---
[1]: https://lore.kernel.org/lkml/20230816163517.112518-1-flaniel@linux.microsof…
[2]: https://lore.kernel.org/lkml/20230819101105.b0c104ae4494a7d1f2eea742@kernel…
--
2.34.1
On Sun, Sep 03, 2023 at 09:19:13PM +0900, Kenta Sato wrote:
> Hi,
>
> I am using the FriendlyElec NanoPi R4S board.
> When I update the kernel from 6.4.7 to 6.4.11, 6.4.13, and 6.5.1, it
> doesn't recognize some USB devices.
>
> The board has two USB 3.0 ports. I connected 1) BUFFALO USB Flash Disk
> (high-speed) and 2) NETGEAR A6210 (SuperSpeed) to each port.
> 1) is often not recognized. On the other hand, 2) was working while I
> was testing.
> Regardless of whether a USB device is connected, I could see the below
> message on dmesg:
>
> [ 0.740993] phy phy-ff7c0000.phy.8: phy poweron failed --> -110
> [ 0.741585] dwc3 fe800000.usb: error -ETIMEDOUT: failed to initialize core
> [ 0.742334] dwc3: probe of fe800000.usb failed with error -110
> [ 0.751635] rockchip-usb2phy ff770000.syscon:usb2phy@e460:
> Requested PHY is disabled
>
> Is there any idea on this?
>
> The cause seems to be related to this commit. I tried reverting this
> change and the issue seemed to be solved.
>
> >From 317d6e4c12b46bde61248ea4ab5e19f68cbd1c57 Mon Sep 17 00:00:00 2001
> From: Jisheng Zhang <jszhang(a)kernel.org>
> Date: Wed, 28 Jun 2023 00:20:18 +0800
> Subject: usb: dwc3: don't reset device side if dwc3 was configured as
> host-only
>
> commit e835c0a4e23c38531dcee5ef77e8d1cf462658c7 upstream.
>
> Commit c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on
> system_suspend in host mode") replaces check for HOST only dr_mode with
> current_dr_role. But during booting, the current_dr_role isn't
> initialized, thus the device side reset is always issued even if dwc3
> was configured as host-only. What's more, on some platforms with host
> only dwc3, aways issuing device side reset by accessing device register
> block can cause kernel panic.
>
> Fixes: c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on
> system_suspend in host mode")
> Cc: stable <stable(a)kernel.org>
> Signed-off-by: Jisheng Zhang <jszhang(a)kernel.org>
> Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
> Link: https://lore.kernel.org/r/20230627162018.739-1-jszhang@kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> drivers/usb/dwc3/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
>
Thanks for the regression report. I'm adding it to regzbot:
#regzbot ^introduced: e835c0a4e23c38
#regzbot title: some USB devices unrecognized caused by not resetting dwc3 device if it is host-only
--
An old man doll... just what I always wanted! - Clara
With every release of LLVM, both of these sanitizers eat up more and
more of the stack. So, set FRAME_WARN to 0 if either of them is enabled
for a given build.
Cc: stable(a)vger.kernel.org
Signed-off-by: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
---
lib/Kconfig.debug | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 39d1d93164bd..15ad742729ca 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -429,11 +429,10 @@ endif # DEBUG_INFO
config FRAME_WARN
int "Warn for stack frames larger than"
range 0 8192
- default 0 if KMSAN
+ default 0 if KASAN || KCSAN || KMSAN
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 2048 if PARISC
default 1536 if (!64BIT && XTENSA)
- default 1280 if KASAN && !64BIT
default 1024 if !64BIT
default 2048 if 64BIT
help
--
2.42.0
When calculating the hotness threshold for lru_prio scheme of
DAMON_LRU_SORT, the module divides some values by the maximum
nr_accesses. However, due to the type of the related variables, simple
division-based calculation of the divisor can return zero. As a result,
divide-by-zero is possible. Fix it by using damon_max_nr_accesses(),
which handles the case.
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Cc: <stable(a)vger.kernel.org> # 6.0.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/lru_sort.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 3ecdcc029443..f2e5f9431892 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -195,9 +195,7 @@ static int damon_lru_sort_apply_parameters(void)
if (err)
return err;
- /* aggr_interval / sample_interval is the maximum nr_accesses */
- hot_thres = damon_lru_sort_mon_attrs.aggr_interval /
- damon_lru_sort_mon_attrs.sample_interval *
+ hot_thres = damon_max_nr_accesses(&damon_lru_sort_mon_attrs) *
hot_thres_access_freq / 1000;
scheme = damon_lru_sort_new_hot_scheme(hot_thres);
if (!scheme)
--
2.34.1
When calculating the hotness of each region for the under-quota regions
prioritization, DAMON divides some values by the maximum nr_accesses.
However, due to the type of the related variables, simple division-based
calculation of the divisor can return zero. As a result, divide-by-zero
is possible. Fix it by using damon_max_nr_accesses(), which handles the
case.
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/ops-common.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index ac1c3fa80f98..d25d99cb5f2b 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -73,7 +73,6 @@ void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr
int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
struct damos *s)
{
- unsigned int max_nr_accesses;
int freq_subscore;
unsigned int age_in_sec;
int age_in_log, age_subscore;
@@ -81,8 +80,8 @@ int damon_hot_score(struct damon_ctx *c, struct damon_region *r,
unsigned int age_weight = s->quota.weight_age;
int hotness;
- max_nr_accesses = c->attrs.aggr_interval / c->attrs.sample_interval;
- freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses;
+ freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE /
+ damon_max_nr_accesses(&c->attrs);
age_in_sec = (unsigned long)r->age * c->attrs.aggr_interval / 1000000;
for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
--
2.34.1
When monitoring attributes are changed, DAMON updates access rate of the
monitoring results accordingly. For that, it divides some values by the
maximum nr_accesses. However, due to the type of the related variables,
simple division-based calculation of the divisor can return zero. As a
result, divide-by-zero is possible. Fix it by using
damon_max_nr_accesses(), which handles the case.
Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes")
Cc: <stable(a)vger.kernel.org> # 6.3.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/core.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 9f4f7c378cf3..e194c8075235 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -500,20 +500,14 @@ static unsigned int damon_age_for_new_attrs(unsigned int age,
static unsigned int damon_accesses_bp_to_nr_accesses(
unsigned int accesses_bp, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return accesses_bp * max_nr_accesses / 10000;
+ return accesses_bp * damon_max_nr_accesses(attrs) / 10000;
}
/* convert nr_accesses to access ratio in bp (per 10,000) */
static unsigned int damon_nr_accesses_to_accesses_bp(
unsigned int nr_accesses, struct damon_attrs *attrs)
{
- unsigned int max_nr_accesses =
- attrs->aggr_interval / attrs->sample_interval;
-
- return nr_accesses * 10000 / max_nr_accesses;
+ return nr_accesses * 10000 / damon_max_nr_accesses(attrs);
}
static unsigned int damon_nr_accesses_for_new_attrs(unsigned int nr_accesses,
--
2.34.1
The maximum nr_accesses of given DAMON context can be calculated by
dividing the aggregation interval by the sampling interval. Some logics
in DAMON uses the maximum nr_accesses as a divisor. Hence, the value
shouldn't be zero. Such case is avoided since DAMON avoids setting the
agregation interval as samller than the sampling interval. However,
since nr_accesses is unsigned int while the intervals are unsigned long,
the maximum nr_accesses could be zero while casting. Implement a
function that handles the corner case.
Note that this commit is not fixing the real issue since this is only
introducing the safe function that will replaces the problematic
divisions. The replacements will be made by followup commits, to make
backporting on stable series easier.
Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization")
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
include/linux/damon.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 27b995c22497..ab2f17d9926b 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -681,6 +681,13 @@ static inline bool damon_target_has_pid(const struct damon_ctx *ctx)
return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR;
}
+static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs)
+{
+ /* {aggr,sample}_interval are unsigned long, hence could overflow */
+ return min(attrs->aggr_interval / attrs->sample_interval,
+ (unsigned long)UINT_MAX);
+}
+
int damon_start(struct damon_ctx **ctxs, int nr_ctxs, bool exclusive);
int damon_stop(struct damon_ctx **ctxs, int nr_ctxs);
--
2.34.1
I've received reports that an ethernet usb dongle doesn't work (google
internal bug 304028301)...
Investigation shows that we have 5.10 (GKI) with USB_NET_AX8817X=y and
AX88796B_PHY not set.
I *think* this configuration combination makes no sense?
[note: I'm unsure how many different phy's this driver supports...]
Obviously, we could simply turn it on 'manually'... but:
commit dde25846925765a88df8964080098174495c1f10
Author: Oleksij Rempel <o.rempel(a)pengutronix.de>
Date: Mon Jun 7 10:27:22 2021 +0200
net: usb/phy: asix: add support for ax88772A/C PHYs
Add support for build-in x88772A/C PHYs
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
includes (as a side effect):
drivers/net/usb/Kconfig
@@ -164,6 +164,7 @@ config USB_NET_AX8817X
depends on USB_USBNET
select CRC32
select PHYLIB
+ select AX88796B_PHY
default y
which presumably makes this (particular problem) a non issue on 5.15+
I'm guessing the above fix (ie. commit dde25846925765a88df8964080098174495c1f10)
could (should?) simply be backported to older stable kernels?
I've verified it cherrypicks cleanly and builds (on x86_64 5.10 gki),
ie.
$ git checkout android/kernel/common/android13-5.10
$ git cherry-pick -x dde25846925765a88df8964080098174495c1f10
$ make ARCH=x86_64 gki_defconfig
$ egrep -i ax88796b < .config
CONFIG_AX88796B_PHY=y
$ make -j50
./drivers/net/phy/ax88796b.o gets built
I've sourced a copy of the problematic hardware, but I'm hitting
problems where on at least two (both my chromebook and 1 of the 2
usb-c ports on my lenovo laptop) totally different usb
controllers/ports it doesn't even usb enumerate (ie. nothing in dmesg,
no show on lsusb), which is making testing difficult (unsure if I just
got a bad sample)...
- Maciej
Hi,
I've tested the below on both linux-6.5.7 and mainline linux-6.6-rc6,
both of which seem to have the same issue.
GDB 13.2 isn't able to load vmlinux-gdb.py as it throws the following:
Traceback (most recent call last):
File "/home/user/debug_kernel/linux-6.6-rc6/vmlinux-gdb.py", line
25, in <module>
import linux.constants
File "/home/user/debug_kernel/linux-6.6-rc6/scripts/gdb/linux/constants.py",
line 11, in <module>
LX_hrtimer_resolution = gdb.parse_and_eval("hrtimer_resolution")
gdb.error: 'hrtimer_resolution' has unknown type; cast it to its declared type
I've built-linux like so:
make defconfig
scripts/config --disable SYSTEM_TRUSTED_KEYS
scripts/config --disable SYSTEM_REVOCATION_KEYS
scripts/config --set-str SYSTEM_TRUSTED_KEYS ""
scripts/config -e CONFIG_DEBUG_INFO -e CONFIG_GDB_SCRIPTS -e
CONFIG_FRAME_POINTER
make -j$(nproc)
make scripts_gdb
$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
$ gdb --version
GNU gdb (GDB) 13.2
This is my first time submitting a bug to the LK mailing list, please
let me know if this format is not correct or if you need more
information.
Thanks.
This reverts commit 108a36d07c01edbc5942d27c92494d1c6e4d45a0.
It was reported that this fix breaks the possibility to remove existing WoL
flags. For example:
~$ ethtool lan2
...
Supports Wake-on: pg
Wake-on: d
...
~$ ethtool -s lan2 wol gp
~$ ethtool lan2
...
Wake-on: pg
...
~$ ethtool -s lan2 wol d
~$ ethtool lan2
...
Wake-on: pg
...
This worked correctly before this commit because we were always updating
a zero bitmap (since commit 6699170376ab ("ethtool: fix application of
verbose no_mask bitset"), that is) so that the rest was left zero
naturally. But now the 1->0 change (old_val is true, bit not present in
netlink nest) no longer works.
Reported-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Reported-by: Michal Kubecek <mkubecek(a)suse.cz>
Closes: https://lore.kernel.org/netdev/20231019095140.l6fffnszraeb6iiw@lion.mk-sys.…
Cc: stable(a)vger.kernel.org
Fixes: 108a36d07c01 ("ethtool: Fix mod state of verbose no_mask bitset")
Signed-off-by: Kory Maincent <kory.maincent(a)bootlin.com>
---
This patch is reverted for now as we are approaching the end of the
merge-window. The real fix that fix the mod value will be sent later
on the next merge-window.
---
net/ethtool/bitset.c | 32 ++++++--------------------------
1 file changed, 6 insertions(+), 26 deletions(-)
diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c
index 883ed9be81f9..0515d6604b3b 100644
--- a/net/ethtool/bitset.c
+++ b/net/ethtool/bitset.c
@@ -431,10 +431,8 @@ ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
ethnl_string_array_t names,
struct netlink_ext_ack *extack, bool *mod)
{
- u32 *orig_bitmap, *saved_bitmap = NULL;
struct nlattr *bit_attr;
bool no_mask;
- bool dummy;
int rem;
int ret;
@@ -450,22 +448,8 @@ ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
}
no_mask = tb[ETHTOOL_A_BITSET_NOMASK];
- if (no_mask) {
- unsigned int nwords = DIV_ROUND_UP(nbits, 32);
- unsigned int nbytes = nwords * sizeof(u32);
-
- /* The bitmap size is only the size of the map part without
- * its mask part.
- */
- saved_bitmap = kcalloc(nwords, sizeof(u32), GFP_KERNEL);
- if (!saved_bitmap)
- return -ENOMEM;
- memcpy(saved_bitmap, bitmap, nbytes);
- ethnl_bitmap32_clear(bitmap, 0, nbits, &dummy);
- orig_bitmap = saved_bitmap;
- } else {
- orig_bitmap = bitmap;
- }
+ if (no_mask)
+ ethnl_bitmap32_clear(bitmap, 0, nbits, mod);
nla_for_each_nested(bit_attr, tb[ETHTOOL_A_BITSET_BITS], rem) {
bool old_val, new_val;
@@ -474,14 +458,13 @@ ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
if (nla_type(bit_attr) != ETHTOOL_A_BITSET_BITS_BIT) {
NL_SET_ERR_MSG_ATTR(extack, bit_attr,
"only ETHTOOL_A_BITSET_BITS_BIT allowed in ETHTOOL_A_BITSET_BITS");
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
ret = ethnl_parse_bit(&idx, &new_val, nbits, bit_attr, no_mask,
names, extack);
if (ret < 0)
- goto out;
- old_val = orig_bitmap[idx / 32] & ((u32)1 << (idx % 32));
+ return ret;
+ old_val = bitmap[idx / 32] & ((u32)1 << (idx % 32));
if (new_val != old_val) {
if (new_val)
bitmap[idx / 32] |= ((u32)1 << (idx % 32));
@@ -491,10 +474,7 @@ ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
}
}
- ret = 0;
-out:
- kfree(saved_bitmap);
- return ret;
+ return 0;
}
static int ethnl_compact_sanity_checks(unsigned int nbits,
---
base-commit: a602ee3176a81280b829c9f0cf259450f7982168
change-id: 20231019-feature_ptp_bitset_fix-6bf75671b33e
Best regards,
--
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
Patch 1 corrects the logic for MP_JOIN tests where 0 RSTs are expected.
Patch 2 ensures MPTCP packets are not incorrectly coalesced in the TCP
backlog queue.
Patch 3 avoids a zero-window probe and associated WARN_ON_ONCE() in an
expected MPTCP reinjection scenario.
Patches 4 & 5 allow an initial MPTCP subflow to be closed cleanly
instead of always sending RST. Associated selftest is updated.
Signed-off-by: Mat Martineau <martineau(a)kernel.org>
---
Geliang Tang (1):
mptcp: avoid sending RST when closing the initial subflow
Matthieu Baerts (2):
selftests: mptcp: join: correctly check for no RST
selftests: mptcp: join: no RST when rm subflow/addr
Paolo Abeni (2):
tcp: check mptcp-level constraints for backlog coalescing
mptcp: more conservative check for zero probes
net/ipv4/tcp_ipv4.c | 1 +
net/mptcp/protocol.c | 36 ++++++++++++++++---------
tools/testing/selftests/net/mptcp/mptcp_join.sh | 21 +++++++++++++--
3 files changed, 43 insertions(+), 15 deletions(-)
---
base-commit: 2915240eddba96b37de4c7e9a3d0ac6f9548454b
change-id: 20231018-send-net-20231018-ac6b38df05e2
Best regards,
--
Mat Martineau <martineau(a)kernel.org>
The ath11k active pdevs are protected by RCU but the temperature event
handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a
read-side critical section as reported by RCU lockdep:
=============================
WARNING: suspicious RCU usage
6.6.0-rc6 #7 Not tainted
-----------------------------
drivers/net/wireless/ath/ath11k/mac.c:638 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/0/0.
...
Call trace:
...
lockdep_rcu_suspicious+0x16c/0x22c
ath11k_mac_get_ar_by_pdev_id+0x194/0x1b0 [ath11k]
ath11k_wmi_tlv_op_rx+0xa84/0x2c1c [ath11k]
ath11k_htc_rx_completion_handler+0x388/0x510 [ath11k]
Mark the code in question as an RCU read-side critical section to avoid
any potential use-after-free issues.
Fixes: a41d10348b01 ("ath11k: add thermal sensor device support")
Cc: stable(a)vger.kernel.org # 5.7
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/net/wireless/ath/ath11k/wmi.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
index 23ad6825e5be..980ff588325d 100644
--- a/drivers/net/wireless/ath/ath11k/wmi.c
+++ b/drivers/net/wireless/ath/ath11k/wmi.c
@@ -8383,6 +8383,8 @@ ath11k_wmi_pdev_temperature_event(struct ath11k_base *ab,
ath11k_dbg(ab, ATH11K_DBG_WMI, "event pdev temperature ev temp %d pdev_id %d\n",
ev->temp, ev->pdev_id);
+ rcu_read_lock();
+
ar = ath11k_mac_get_ar_by_pdev_id(ab, ev->pdev_id);
if (!ar) {
ath11k_warn(ab, "invalid pdev id in pdev temperature ev %d", ev->pdev_id);
@@ -8392,6 +8394,8 @@ ath11k_wmi_pdev_temperature_event(struct ath11k_base *ab,
ath11k_thermal_event_temperature(ar, ev->temp);
+ rcu_read_unlock();
+
kfree(tb);
}
--
2.41.0
From: Agustin Gutierrez <agustin.gutierrez(a)amd.com>
[Why]
Some ASICs keep backlight powered on after dpms off
command has been issued.
[How]
The check for no edp power sequencing was never going to pass.
The value is never changed from what it is set by design.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2765
Reviewed-by: Swapnil Patel <swapnil.patel(a)amd.com>
Acked-by: Roman Li <roman.li(a)amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/link/link_dpms.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
index 4538451945b4..34a4a8c0e18c 100644
--- a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
+++ b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
@@ -1932,8 +1932,7 @@ static void disable_link_dp(struct dc_link *link,
dp_disable_link_phy(link, link_res, signal);
if (link->connector_signal == SIGNAL_TYPE_EDP) {
- if (!link->dc->config.edp_no_power_sequencing &&
- !link->skip_implict_edp_power_control)
+ if (!link->skip_implict_edp_power_control)
link->dc->hwss.edp_power_control(link, false);
}
--
2.34.1
The static files placement by `rustdoc` changed in Rust 1.67.0 [1],
but the custom code we have to replace the logo in the generated
HTML files did not get updated.
Thus update it to have the Linux logo again in the output.
Hopefully `rustdoc` will eventually support a custom logo from
a local file [2], so that we do not need to maintain this hack
on our side.
Link: https://github.com/rust-lang/rust/pull/101702 [1]
Link: https://github.com/rust-lang/rfcs/pull/3226 [2]
Fixes: 3ed03f4da06e ("rust: upgrade to Rust 1.68.2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
rust/Makefile | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/rust/Makefile b/rust/Makefile
index 87958e864be0..08af1f869f0c 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -93,15 +93,14 @@ quiet_cmd_rustdoc = RUSTDOC $(if $(rustdoc_host),H, ) $<
# and then retouch the generated files.
rustdoc: rustdoc-core rustdoc-macros rustdoc-compiler_builtins \
rustdoc-alloc rustdoc-kernel
- $(Q)cp $(srctree)/Documentation/images/logo.svg $(rustdoc_output)
- $(Q)cp $(srctree)/Documentation/images/COPYING-logo $(rustdoc_output)
+ $(Q)cp $(srctree)/Documentation/images/logo.svg $(rustdoc_output)/static.files/
+ $(Q)cp $(srctree)/Documentation/images/COPYING-logo $(rustdoc_output)/static.files/
$(Q)find $(rustdoc_output) -name '*.html' -type f -print0 | xargs -0 sed -Ei \
- -e 's:rust-logo\.svg:logo.svg:g' \
- -e 's:rust-logo\.png:logo.svg:g' \
- -e 's:favicon\.svg:logo.svg:g' \
- -e 's:<link rel="alternate icon" type="image/png" href="[./]*favicon-(16x16|32x32)\.png">::g'
- $(Q)echo '.logo-container > img { object-fit: contain; }' \
- >> $(rustdoc_output)/rustdoc.css
+ -e 's:rust-logo-[0-9a-f]+\.svg:logo.svg:g' \
+ -e 's:favicon-[0-9a-f]+\.svg:logo.svg:g' \
+ -e 's:<link rel="alternate icon" type="image/png" href="[/.]+/static\.files/favicon-(16x16|32x32)-[0-9a-f]+\.png">::g'
+ $(Q)for f in $(rustdoc_output)/static.files/rustdoc-*.css; do \
+ echo ".logo-container > img { object-fit: contain; }" >> $$f; done
rustdoc-macros: private rustdoc_host = yes
rustdoc-macros: private rustc_target_flags = --crate-type proc-macro \
--
2.42.0
From: "Joel Fernandes (Google)" <joel(a)joelfernandes.org>
There are instances where rcu_cpu_stall_reset() is called when jiffies
did not get a chance to update for a long time. Before jiffies is
updated, the CPU stall detector can go off triggering false-positives
where a just-started grace period appears to be ages old. In the past,
we disabled stall detection in rcu_cpu_stall_reset() however this got
changed [1]. This is resulting in false-positives in KGDB usecase [2].
Fix this by deferring the update of jiffies to the third run of the FQS
loop. This is more robust, as, even if rcu_cpu_stall_reset() is called
just before jiffies is read, we would end up pushing out the jiffies
read by 3 more FQS loops. Meanwhile the CPU stall detection will be
delayed and we will not get any false positives.
[1] https://lore.kernel.org/all/20210521155624.174524-2-senozhatsky@chromium.or…
[2] https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Tested with rcutorture.cpu_stall option as well to verify stall behavior
with/without patch.
Tested-by: Huacai Chen <chenhuacai(a)loongson.cn>
Reported-by: Binbin Zhou <zhoubinbin(a)loongson.cn>
Closes: https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Suggested-by: Paul McKenney <paulmck(a)kernel.org>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
---
kernel/rcu/tree.c | 12 ++++++++++++
kernel/rcu/tree.h | 4 ++++
kernel/rcu/tree_stall.h | 20 ++++++++++++++++++--
3 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cb1caefa8bd0..d85779f67aea 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1556,10 +1556,22 @@ static bool rcu_gp_fqs_check_wake(int *gfp)
*/
static void rcu_gp_fqs(bool first_time)
{
+ int nr_fqs = READ_ONCE(rcu_state.nr_fqs_jiffies_stall);
struct rcu_node *rnp = rcu_get_root();
WRITE_ONCE(rcu_state.gp_activity, jiffies);
WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
+
+ WARN_ON_ONCE(nr_fqs > 3);
+ /* Only countdown nr_fqs for stall purposes if jiffies moves. */
+ if (nr_fqs) {
+ if (nr_fqs == 1) {
+ WRITE_ONCE(rcu_state.jiffies_stall,
+ jiffies + rcu_jiffies_till_stall_check());
+ }
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs);
+ }
+
if (first_time) {
/* Collect dyntick-idle snapshots. */
force_qs_rnp(dyntick_save_progress_counter);
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 192536916f9a..e9821a8422db 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -386,6 +386,10 @@ struct rcu_state {
/* in jiffies. */
unsigned long jiffies_stall; /* Time at which to check */
/* for CPU stalls. */
+ int nr_fqs_jiffies_stall; /* Number of fqs loops after
+ * which read jiffies and set
+ * jiffies_stall. Stall
+ * warnings disabled if !0. */
unsigned long jiffies_resched; /* Time at which to resched */
/* a reluctant CPU. */
unsigned long n_force_qs_gpstart; /* Snapshot of n_force_qs at */
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 49544f932279..ac8e86babe44 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -150,12 +150,17 @@ static void panic_on_rcu_stall(void)
/**
* rcu_cpu_stall_reset - restart stall-warning timeout for current grace period
*
+ * To perform the reset request from the caller, disable stall detection until
+ * 3 fqs loops have passed. This is required to ensure a fresh jiffies is
+ * loaded. It should be safe to do from the fqs loop as enough timer
+ * interrupts and context switches should have passed.
+ *
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
- WRITE_ONCE(rcu_state.jiffies_stall,
- jiffies + rcu_jiffies_till_stall_check());
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3);
+ WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX);
}
//////////////////////////////////////////////////////////////////////////////
@@ -171,6 +176,7 @@ static void record_gp_stall_check_time(void)
WRITE_ONCE(rcu_state.gp_start, j);
j1 = rcu_jiffies_till_stall_check();
smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
+ WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0);
WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
@@ -726,6 +732,16 @@ static void check_cpu_stall(struct rcu_data *rdp)
!rcu_gp_in_progress())
return;
rcu_stall_kick_kthreads();
+
+ /*
+ * Check if it was requested (via rcu_cpu_stall_reset()) that the FQS
+ * loop has to set jiffies to ensure a non-stale jiffies value. This
+ * is required to have good jiffies value after coming out of long
+ * breaks of jiffies updates. Not doing so can cause false positives.
+ */
+ if (READ_ONCE(rcu_state.nr_fqs_jiffies_stall) > 0)
+ return;
+
j = jiffies;
/*
--
2.34.1
From: Ajay Singh <ajay.kathat(a)microchip.com>
Enabling KASAN and running some iperf tests raises some memory issues with
vmm_table:
BUG: KASAN: slab-out-of-bounds in wilc_wlan_handle_txq+0x6ac/0xdb4
Write of size 4 at addr c3a61540 by task wlan0-tx/95
KASAN detects that we are writing data beyond range allocated to vmm_table.
There is indeed a mismatch between the size passed to allocator in
wilc_wlan_init, and the range of possible indexes used later: allocation
size is missing a multiplication by sizeof(u32)
Fixes: 40b717bfcefa ("wifi: wilc1000: fix DMA on stack objects")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ajay Singh <ajay.kathat(a)microchip.com>
Signed-off-by: Alexis Lothoré <alexis.lothore(a)bootlin.com>
---
Changes in v3:
- Use kcalloc instead of kzalloc
- Link to v2: https://lore.kernel.org/r/20231016-wilc1000_tx_oops-v2-1-8d1982a29ef1@bootl…
Changes in v2:
- keep dedicated dynamic allocation for vmm_table
- Link to v1: https://lore.kernel.org/r/20231013-wilc1000_tx_oops-v1-1-3761beb9524d@bootl…
---
drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.c b/drivers/net/wireless/microchip/wilc1000/wlan.c
index 58bbf50081e4..9eb115c79c90 100644
--- a/drivers/net/wireless/microchip/wilc1000/wlan.c
+++ b/drivers/net/wireless/microchip/wilc1000/wlan.c
@@ -1492,7 +1492,7 @@ int wilc_wlan_init(struct net_device *dev)
}
if (!wilc->vmm_table)
- wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE, GFP_KERNEL);
+ wilc->vmm_table = kcalloc(WILC_VMM_TBL_SIZE, sizeof(u32), GFP_KERNEL);
if (!wilc->vmm_table) {
ret = -ENOBUFS;
---
base-commit: ea12d85cbfd6b08fff40a4fefccc011b6cfadf8e
change-id: 20231012-wilc1000_tx_oops-58ce91ee3e93
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
DisplayPort Alt Mode CTS test 10.3.8 states that both sides of the
connection shall be compatible with one another such that the connection
is not Source to Source or Sink to Sink.
The DisplayPort driver currently checks for a compatible pin configuration
that resolves into a source and sink combination. The CTS test is designed
to send a Discover Modes message that has a compatible pin configuration
but advertises the same port capability as the device; the current check
fails this.
Verify that the port and port partner resolve into a valid source and sink
combination before checking for a compatible pin configuration.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
---
drivers/usb/typec/altmodes/displayport.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 718da02036d8..3b35a6b8cb72 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -575,9 +575,18 @@ int dp_altmode_probe(struct typec_altmode *alt)
struct fwnode_handle *fwnode;
struct dp_altmode *dp;
int ret;
+ int port_cap, partner_cap;
/* FIXME: Port can only be DFP_U. */
+ /* Make sure that the port and partner can resolve into source and sink */
+ port_cap = DP_CAP_CAPABILITY(port->vdo);
+ partner_cap = DP_CAP_CAPABILITY(alt->vdo);
+ if (!((port_cap & DP_CAP_DFP_D) && (partner_cap & DP_CAP_UFP_D)) &&
+ !((port_cap & DP_CAP_UFP_D) && (partner_cap & DP_CAP_DFP_D))) {
+ return -ENODEV;
+ }
+
/* Make sure we have compatiple pin configurations */
if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) &
DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) &&
base-commit: d0d27ef87e1ca974ed93ed4f7d3c123cbd392ba6
--
2.42.0.655.g421f12c284-goog
This patch series was ultimately created to fix a bug reported via
BZ203687. Halil did some problem determination and concluded that the
problem was due to the fact that all queues removed from the guest's AP
configuration are not reset. For example, when an adapter or domain is
assigned to the mdev matrix, if a queue device associated with the adapter
or domain is not bound to the vfio_ap device driver, the adapter to which
the queue is attached will be removed from the guest's configuration. The
problem is, removing the adapter also implicitly removes all queues
attached to that adapter. Those queues also need to be reset to avert
leaking crypto data should any of those queues get assigned to another
guest or back to the host.
The original intent was to ensure that all queues removed from a guest's
AP configuration get reset. The testing included permutations of various
scenarios involving the binding/unbinding of queues either manually, or
due to dynamic host AP configuration changes initiated via an SE or HMC
attached to a DPM-enabled LPAR. This testing revealed several issues that
are also addressed via this patch series.
Note that several of the patches has a 'Fixes:' tag as well as a
Cc: <stable(a)vger.kernel.org> tag. I'm not sure whether this is necessary
because the patches not related to the reset issue are probably rarely
if ever encountered, so I'd like an opinion on that also.
Tony Krowiak (7):
s390/vfio-ap: always filter entire AP matrix
s390/vfio-ap: circumvent filtering for adapters/domains not in host
config
s390/vfio-ap: do not reset queue removed from host config
s390/vfio-ap: let 'on_scan_complete' callback filter matrix and update
guest's APCB
s390/vfio-ap: allow reset of subset of queues assigned to matrix mdev
s390/vfio-ap: reset queues filtered from the guest's AP config
s390/vfio-ap: reset queues associated with adapter for queue unbound
from driver
drivers/s390/crypto/vfio_ap_ops.c | 294 +++++++++++++++++++-------
drivers/s390/crypto/vfio_ap_private.h | 25 ++-
2 files changed, 234 insertions(+), 85 deletions(-)
--
2.41.0
The quilt patch titled
Subject: selftests/clone3: Fix broken test under !CONFIG_TIME_NS
has been removed from the -mm tree. Its filename was
selftests-clone3-fix-broken-test-under-config_time_ns.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Subject: selftests/clone3: Fix broken test under !CONFIG_TIME_NS
Date: Tue, 11 Jul 2023 17:13:34 +0800
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0)
# Invalid argument - Failed to create new process
# [7538] clone3() with flags says: -22 expected 0
not ok 18 [7538] Result (-22) is different than expected (0)
...
# Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
...
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loon…
Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Signed-off-by: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/clone3/clone3.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/clone3/clone3.c~selftests-clone3-fix-broken-test-under-config_time_ns
+++ a/tools/testing/selftests/clone3/clone3.c
@@ -196,7 +196,12 @@ int main(int argc, char *argv[])
CLONE3_ARGS_NO_TEST);
/* Do a clone3() in a new time namespace */
- test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ if (access("/proc/self/ns/time", F_OK) == 0) {
+ test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ } else {
+ ksft_print_msg("Time namespaces are not supported\n");
+ ksft_test_result_skip("Skipping clone3() with CLONE_NEWTIME\n");
+ }
/* Do a clone3() with exit signal (SIGCHLD) in flags */
test_clone3(SIGCHLD, 0, -EINVAL, CLONE3_ARGS_NO_TEST);
_
Patches currently in -mm which might be from yangtiezhu(a)loongson.cn are
The quilt patch titled
Subject: selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
has been removed from the -mm tree. Its filename was
selftests-mm-include-mman-header-to-access-mremap_dontunmap-identifier.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Samasth Norway Ananda <samasth.norway.ananda(a)oracle.com>
Subject: selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
Date: Thu, 12 Oct 2023 08:52:57 -0700
Definition for MREMAP_DONTUNMAP is not present in glibc older than 2.32
thus throwing an undeclared error when running make on mm. Including
linux/mman.h solves the build error for people having older glibc.
Link: https://lkml.kernel.org/r/20231012155257.891776-1-samasth.norway.ananda@ora…
Fixes: 0183d777c29a ("selftests: mm: remove duplicate unneeded defines")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda(a)oracle.com>
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Closes: https://lore.kernel.org/linux-mm/CA+G9fYvV-71XqpCr_jhdDfEtN701fBdG3q+=bafaZ…
Tested-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/mremap_dontunmap.c | 1 +
1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/mm/mremap_dontunmap.c~selftests-mm-include-mman-header-to-access-mremap_dontunmap-identifier
+++ a/tools/testing/selftests/mm/mremap_dontunmap.c
@@ -7,6 +7,7 @@
*/
#define _GNU_SOURCE
#include <sys/mman.h>
+#include <linux/mman.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
_
Patches currently in -mm which might be from samasth.norway.ananda(a)oracle.com are
The quilt patch titled
Subject: mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-check-damos-regions-update-progress-from-before_terminate.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
Date: Sat, 7 Oct 2023 20:04:32 +0000
DAMON_SYSFS can receive DAMOS tried regions update request while kdamond
is already out of the main loop and before_terminate callback
(damon_sysfs_before_terminate() in this case) is not yet called. And
damon_sysfs_handle_cmd() can further be finished before the callback is
invoked. Then, damon_sysfs_before_terminate() unlocks damon_sysfs_lock,
which is not locked by anyone. This happens because the callback function
assumes damon_sysfs_cmd_request_callback() should be called before it.
Check if the assumption was true before doing the unlock, to avoid this
problem.
Link: https://lkml.kernel.org/r/20231007200432.3110-1-sj@kernel.org
Fixes: f1d13cacabe1 ("mm/damon/sysfs: implement DAMOS tried regions update command")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.2.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-check-damos-regions-update-progress-from-before_terminate
+++ a/mm/damon/sysfs.c
@@ -1208,6 +1208,8 @@ static int damon_sysfs_set_targets(struc
return 0;
}
+static bool damon_sysfs_schemes_regions_updating;
+
static void damon_sysfs_before_terminate(struct damon_ctx *ctx)
{
struct damon_target *t, *next;
@@ -1219,8 +1221,10 @@ static void damon_sysfs_before_terminate
cmd = damon_sysfs_cmd_request.cmd;
if (kdamond && ctx == kdamond->damon_ctx &&
(cmd == DAMON_SYSFS_CMD_UPDATE_SCHEMES_TRIED_REGIONS ||
- cmd == DAMON_SYSFS_CMD_UPDATE_SCHEMES_TRIED_BYTES)) {
+ cmd == DAMON_SYSFS_CMD_UPDATE_SCHEMES_TRIED_BYTES) &&
+ damon_sysfs_schemes_regions_updating) {
damon_sysfs_schemes_update_regions_stop(ctx);
+ damon_sysfs_schemes_regions_updating = false;
mutex_unlock(&damon_sysfs_lock);
}
@@ -1340,7 +1344,6 @@ static int damon_sysfs_commit_input(stru
static int damon_sysfs_cmd_request_callback(struct damon_ctx *c)
{
struct damon_sysfs_kdamond *kdamond;
- static bool damon_sysfs_schemes_regions_updating;
bool total_bytes_only = false;
int err = 0;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-schemes-do-not-update-tried-regions-more-than-one-damon-snapshot.patch
mm-damon-sysfs-avoid-empty-scheme-tried-regions-for-large-apply-interval.patch
docs-admin-guide-mm-damon-usage-update-for-tried-regions-update-time-interval.patch
The quilt patch titled
Subject: hugetlbfs: clear resv_map pointer if mmap fails
has been removed from the -mm tree. Its filename was
hugetlbfs-clear-resv_map-pointer-if-mmap-fails.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Rik van Riel <riel(a)surriel.com>
Subject: hugetlbfs: clear resv_map pointer if mmap fails
Date: Thu, 5 Oct 2023 23:59:06 -0400
Patch series "hugetlbfs: close race between MADV_DONTNEED and page fault", v7.
Malloc libraries, like jemalloc and tcalloc, take decisions on when to
call madvise independently from the code in the main application.
This sometimes results in the application page faulting on an address,
right after the malloc library has shot down the backing memory with
MADV_DONTNEED.
Usually this is harmless, because we always have some 4kB pages sitting
around to satisfy a page fault. However, with hugetlbfs systems often
allocate only the exact number of huge pages that the application wants.
Due to TLB batching, hugetlbfs MADV_DONTNEED will free pages outside of
any lock taken on the page fault path, which can open up the following
race condition:
CPU 1 CPU 2
MADV_DONTNEED
unmap page
shoot down TLB entry
page fault
fail to allocate a huge page
killed with SIGBUS
free page
Fix that race by extending the hugetlb_vma_lock locking scheme to also
cover private hugetlb mappings (with resv_map), and pulling the locking
from __unmap_hugepage_final_range into helper functions called from
zap_page_range_single. This ensures page faults stay locked out of the
MADV_DONTNEED VMA until the huge pages have actually been freed.
This patch (of 3):
Hugetlbfs leaves a dangling pointer in the VMA if mmap fails. This has
not been a problem so far, but other code in this patch series tries to
follow that pointer.
Link: https://lkml.kernel.org/r/20231006040020.3677377-1-riel@surriel.com
Link: https://lkml.kernel.org/r/20231006040020.3677377-2-riel@surriel.com
Fixes: 04ada095dcfc ("hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-clear-resv_map-pointer-if-mmap-fails
+++ a/mm/hugetlb.c
@@ -1138,8 +1138,7 @@ static void set_vma_resv_map(struct vm_a
VM_BUG_ON_VMA(!is_vm_hugetlb_page(vma), vma);
VM_BUG_ON_VMA(vma->vm_flags & VM_MAYSHARE, vma);
- set_vma_private_data(vma, (get_vma_private_data(vma) &
- HPAGE_RESV_MASK) | (unsigned long)map);
+ set_vma_private_data(vma, (unsigned long)map);
}
static void set_vma_resv_flags(struct vm_area_struct *vma, unsigned long flags)
@@ -6811,8 +6810,10 @@ out_err:
*/
if (chg >= 0 && add < 0)
region_abort(resv_map, from, to, regions_needed);
- if (vma && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+ if (vma && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
kref_put(&resv_map->refs, resv_map_release);
+ set_vma_resv_map(vma, NULL);
+ }
return false;
}
_
Patches currently in -mm which might be from riel(a)surriel.com are
The quilt patch titled
Subject: mm: zswap: fix pool refcount bug around shrink_worker()
has been removed from the -mm tree. Its filename was
mm-zswap-fix-pool-refcount-bug-around-shrink_worker.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: zswap: fix pool refcount bug around shrink_worker()
Date: Fri, 6 Oct 2023 12:00:24 -0400
When a zswap store fails due to the limit, it acquires a pool reference
and queues the shrinker. When the shrinker runs, it drops the reference.
However, there can be multiple store attempts before the shrinker wakes up
and runs once. This results in reference leaks and eventual saturation
warnings for the pool refcount.
Fix this by dropping the reference again when the shrinker is already
queued. This ensures one reference per shrinker run.
Link: https://lkml.kernel.org/r/20231006160024.170748-1-hannes@cmpxchg.org
Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Chris Mason <clm(a)fb.com>
Acked-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/zswap.c~mm-zswap-fix-pool-refcount-bug-around-shrink_worker
+++ a/mm/zswap.c
@@ -1383,8 +1383,8 @@ reject:
shrink:
pool = zswap_pool_last_get();
- if (pool)
- queue_work(shrink_wq, &pool->shrink_work);
+ if (pool && !queue_work(shrink_wq, &pool->shrink_work))
+ zswap_pool_put(pool);
goto reject;
}
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
The UART supports an auto-RTS mode in which the RTS pin is automatically
activated during transmission. So mark this mode as being supported even
if RTS is not controlled by the driver but the UART.
Also the serial core expects now at least one of both modes rts-on-send or
rts-after-send to be supported. This is since during sanitization
unsupported flags are deleted from a RS485 configuration set by userspace.
However if the configuration ends up with both flags unset, the core prints
a warning since it considers such a configuration invalid (see
uart_sanitize_serial_rs485()).
Cc: stable(a)vger.kernel.org
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/8250/8250_exar.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
This patch is only required if "serial: core: fix sanitizing check for RTS
settings" is applied, which is also part of this series.
Therefore it lacks a "Fixes:" tag.
diff --git a/drivers/tty/serial/8250/8250_exar.c b/drivers/tty/serial/8250/8250_exar.c
index 077c3ba3539e..8385be846840 100644
--- a/drivers/tty/serial/8250/8250_exar.c
+++ b/drivers/tty/serial/8250/8250_exar.c
@@ -446,7 +446,7 @@ static int generic_rs485_config(struct uart_port *port, struct ktermios *termios
}
static const struct serial_rs485 generic_rs485_supported = {
- .flags = SER_RS485_ENABLED,
+ .flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND,
};
static const struct exar8250_platform exar8250_default_platform = {
@@ -490,7 +490,8 @@ static int iot2040_rs485_config(struct uart_port *port, struct ktermios *termios
}
static const struct serial_rs485 iot2040_rs485_supported = {
- .flags = SER_RS485_ENABLED | SER_RS485_RX_DURING_TX | SER_RS485_TERMINATE_BUS,
+ .flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND |
+ SER_RS485_RX_DURING_TX | SER_RS485_TERMINATE_BUS,
};
static const struct property_entry iot2040_gpio_properties[] = {
--
2.40.1
In serial_omap_rs485() RS485 support may be deactivated due to a missing
RTS GPIO. This is done by nullifying the ports rs485_supported struct.
After that however the serial_omap_rs485_supported struct is assigned to
the same structure unconditionally, which results in an unintended
reactivation of RS485 support.
Fix this by callling serial_omap_rs485() after the assignment of
rs485_supported.
Fixes: e2752ae3cfc9 ("serial: omap: Disallow RS-485 if rts-gpio is not specified")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/omap-serial.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
index 0ead88c5a19a..4f7ee4392034 100644
--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -1604,10 +1604,6 @@ static int serial_omap_probe(struct platform_device *pdev)
dev_info(up->port.dev, "no wakeirq for uart%d\n",
up->port.line);
- ret = serial_omap_probe_rs485(up, &pdev->dev);
- if (ret < 0)
- goto err_rs485;
-
sprintf(up->name, "OMAP UART%d", up->port.line);
up->port.mapbase = mem->start;
up->port.membase = base;
@@ -1622,6 +1618,10 @@ static int serial_omap_probe(struct platform_device *pdev)
DEFAULT_CLK_SPEED);
}
+ ret = serial_omap_probe_rs485(up, &pdev->dev);
+ if (ret < 0)
+ goto err_rs485;
+
up->latency = PM_QOS_CPU_LATENCY_DEFAULT_VALUE;
up->calc_latency = PM_QOS_CPU_LATENCY_DEFAULT_VALUE;
cpu_latency_qos_add_request(&up->pm_qos_request, up->latency);
--
2.40.1
If the imx driver cannot support RS485 it sets the ports rs485_supported
structure to NULL. But it still calls uart_get_rs485_mode() which may set
the RS485_ENABLED flag nevertheless.
This may lead to an attempt to configure RS485 even if it is not supported
when the flag is evaluated in uart_configure_port() at port startup.
Avoid this by bailing out of uart_get_rs485_mode() if the RS485_ENABLED
flag is not supported by the caller.
With this fix a check for RTS availability is now obsolete in the imx
driver, since it can not evaluate to true any more. Remove this check, too.
Fixes: 00d7a00e2a6f ("serial: imx: Fill in rs485_supported")
Cc: Shawn Guo <shawnguo(a)kernel.org>
Cc: Sascha Hauer <s.hauer(a)pengutronix.de>
Cc: stable(a)vger.kernel.org
Suggested-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/imx.c | 4 ----
drivers/tty/serial/serial_core.c | 3 +++
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index edb2ec6a5567..c8c19bf8585d 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -2332,10 +2332,6 @@ static int imx_uart_probe(struct platform_device *pdev)
return ret;
}
- if (sport->port.rs485.flags & SER_RS485_ENABLED &&
- (!sport->have_rtscts && !sport->have_rtsgpio))
- dev_err(&pdev->dev, "no RTS control, disabling rs485\n");
-
/*
* If using the i.MX UART RTS/CTS control then the RTS (CTS_B)
* signal cannot be set low during transmission in case the
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 49dd5e864719..c85275a573e3 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -3586,6 +3586,9 @@ int uart_get_rs485_mode(struct uart_port *port)
int ret;
int rx_during_tx_gpio_flag;
+ if (!(port->rs485_supported.flags & SER_RS485_ENABLED))
+ return 0;
+
ret = device_property_read_u32_array(dev, "rs485-rts-delay",
rs485_delay, 2);
if (!ret) {
--
2.40.1
Some uart drivers specify a rs485_config() function and then decide later
to disable RS485 support for some reason (e.g. imx and ar933).
In these cases userspace may be able to activate RS485 via TIOCSRS485
nevertheless, since in uart_set_rs485_config() an existing rs485_config()
function indicates that RS485 is supported.
Make sure that this is not longer possible by checking the uarts
rs485_supported.flags instead and bailing out if SER_RS485_ENABLED is not
set.
Furthermore instead of returning an empty structure return -ENOTTY if the
RS485 configuration is requested via TIOCGRS485 but RS485 is not supported.
This has a small impact on userspace visibility but it is consistent with
the -ENOTTY error for TIOCGRS485.
Fixes: e849145e1fdd ("serial: ar933x: Fill in rs485_supported")
Fixes: 55e18c6b6d42 ("serial: imx: Remove serial_rs485 sanitization")
Cc: Shawn Guo <shawnguo(a)kernel.org>
Cc: Sascha Hauer <s.hauer(a)pengutronix.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/serial_core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index f1d889f9fd6e..49dd5e864719 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1443,6 +1443,9 @@ static int uart_get_rs485_config(struct uart_port *port,
unsigned long flags;
struct serial_rs485 aux;
+ if (!(port->rs485_supported.flags & SER_RS485_ENABLED))
+ return -ENOTTY;
+
spin_lock_irqsave(&port->lock, flags);
aux = port->rs485;
spin_unlock_irqrestore(&port->lock, flags);
@@ -1460,7 +1463,7 @@ static int uart_set_rs485_config(struct tty_struct *tty, struct uart_port *port,
int ret;
unsigned long flags;
- if (!port->rs485_config)
+ if (!(port->rs485_supported.flags & SER_RS485_ENABLED))
return -ENOTTY;
if (copy_from_user(&rs485, rs485_user, sizeof(*rs485_user)))
--
2.40.1
Among other things uart_sanitize_serial_rs485() tests the sanity of the RTS
settings in a RS485 configuration that has been passed by userspace.
If RTS-on-send and RTS-after-send are both set or unset the configuration
is adjusted and RTS-after-send is disabled and RTS-on-send enabled.
This however makes only sense if both RTS modes are actually supported by
the driver.
With commit be2e2cb1d281 ("serial: Sanitize rs485_struct") the code does
take the driver support into account but only checks if one of both RTS
modes are supported. This may lead to the errorneous result of RTS-on-send
being set even if only RTS-after-send is supported.
Fix this by changing the implemented logic: First clear all unsupported
flags in the RS485 configuration, then adjust an invalid RTS setting by
taking into account which RTS mode is supported.
Cc: stable(a)vger.kernel.org
Fixes: be2e2cb1d281 ("serial: Sanitize rs485_struct")
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/serial_core.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index ef9e9014bfab..f1d889f9fd6e 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1370,19 +1370,27 @@ static void uart_sanitize_serial_rs485(struct uart_port *port, struct serial_rs4
return;
}
+ rs485->flags &= supported_flags;
+
/* Pick sane settings if the user hasn't */
- if ((supported_flags & (SER_RS485_RTS_ON_SEND|SER_RS485_RTS_AFTER_SEND)) &&
- !(rs485->flags & SER_RS485_RTS_ON_SEND) ==
+ if (!(rs485->flags & SER_RS485_RTS_ON_SEND) ==
!(rs485->flags & SER_RS485_RTS_AFTER_SEND)) {
- dev_warn_ratelimited(port->dev,
- "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n",
- port->name, port->line);
- rs485->flags |= SER_RS485_RTS_ON_SEND;
- rs485->flags &= ~SER_RS485_RTS_AFTER_SEND;
- supported_flags |= SER_RS485_RTS_ON_SEND|SER_RS485_RTS_AFTER_SEND;
- }
+ if (supported_flags & SER_RS485_RTS_ON_SEND) {
+ rs485->flags |= SER_RS485_RTS_ON_SEND;
+ rs485->flags &= ~SER_RS485_RTS_AFTER_SEND;
- rs485->flags &= supported_flags;
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n",
+ port->name, port->line);
+ } else {
+ rs485->flags |= SER_RS485_RTS_AFTER_SEND;
+ rs485->flags &= ~SER_RS485_RTS_ON_SEND;
+
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): invalid RTS setting, using RTS_AFTER_SEND instead\n",
+ port->name, port->line);
+ }
+ }
uart_sanitize_serial_rs485_delays(port, rs485);
--
2.40.1
If the RS485 feature RX-during-TX is supported by means of a GPIO set the
according supported flag. Otherwise setting this feature from userspace may
not be possible, since in uart_sanitize_serial_rs485() the passed RS485
configuration is matched against the supported features and unsupported
settings are thereby removed and thus take no effect.
Cc: stable(a)vger.kernel.org
Fixes: 163f080eb717 ("serial: core: Add option to output RS485 RX_DURING_TX state via GPIO")
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/serial_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 35e014f83465..ef9e9014bfab 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -3632,6 +3632,8 @@ int uart_get_rs485_mode(struct uart_port *port)
port->rs485_rx_during_tx_gpio = NULL;
return dev_err_probe(dev, ret, "Cannot get rs485-rx-during-tx-gpios\n");
}
+ if (port->rs485_rx_during_tx_gpio)
+ port->rs485_supported.flags |= SER_RS485_RX_DURING_TX;
return 0;
}
--
2.40.1
Both the imx and stm32 driver set the rx-during-tx GPIO in rs485_config().
Since this function is called with the port lock held, this can be an
problem in case that setting the GPIO line can sleep (e.g. if a GPIO
expander is used which is connected via SPI or I2C).
Avoid this issue by moving the GPIO setting outside of the port lock into
the serial core and thus making it a generic feature.
Since both setting the term and the rx-during-tx GPIO is done at the same
code place, move it into a common function.
Fixes: c54d48543689 ("serial: stm32: Add support for rs485 RX_DURING_TX output GPIO")
Fixes: ca530cfa968c ("serial: imx: Add support for RS485 RX_DURING_TX output GPIO")
Cc: Shawn Guo <shawnguo(a)kernel.org>
Cc: Sascha Hauer <s.hauer(a)pengutronix.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
drivers/tty/serial/imx.c | 4 ----
drivers/tty/serial/serial_core.c | 15 +++++++++++----
drivers/tty/serial/stm32-usart.c | 5 +----
3 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 13cb78340709..edb2ec6a5567 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -1947,10 +1947,6 @@ static int imx_uart_rs485_config(struct uart_port *port, struct ktermios *termio
rs485conf->flags & SER_RS485_RX_DURING_TX)
imx_uart_start_rx(port);
- if (port->rs485_rx_during_tx_gpio)
- gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio,
- !!(rs485conf->flags & SER_RS485_RX_DURING_TX));
-
return 0;
}
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index d5ba6e90bd95..35e014f83465 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1391,14 +1391,21 @@ static void uart_sanitize_serial_rs485(struct uart_port *port, struct serial_rs4
memset(rs485->padding1, 0, sizeof(rs485->padding1));
}
-static void uart_set_rs485_termination(struct uart_port *port,
- const struct serial_rs485 *rs485)
+/*
+ * Set optional RS485 GPIOs for bus termination and data reception during
+ * transmission. These GPIOs are controlled by the serial core independently
+ * from the UART driver.
+ */
+static void uart_set_rs485_gpios(struct uart_port *port,
+ const struct serial_rs485 *rs485)
{
if (!(rs485->flags & SER_RS485_ENABLED))
return;
gpiod_set_value_cansleep(port->rs485_term_gpio,
!!(rs485->flags & SER_RS485_TERMINATE_BUS));
+ gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio,
+ !!(rs485->flags & SER_RS485_RX_DURING_TX));
}
static int uart_rs485_config(struct uart_port *port)
@@ -1411,7 +1418,7 @@ static int uart_rs485_config(struct uart_port *port)
return 0;
uart_sanitize_serial_rs485(port, rs485);
- uart_set_rs485_termination(port, rs485);
+ uart_set_rs485_gpios(port, rs485);
spin_lock_irqsave(&port->lock, flags);
ret = port->rs485_config(port, NULL, rs485);
@@ -1455,7 +1462,7 @@ static int uart_set_rs485_config(struct tty_struct *tty, struct uart_port *port,
if (ret)
return ret;
uart_sanitize_serial_rs485(port, &rs485);
- uart_set_rs485_termination(port, &rs485);
+ uart_set_rs485_gpios(port, &rs485);
spin_lock_irqsave(&port->lock, flags);
ret = port->rs485_config(port, &tty->termios, &rs485);
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 5e9cf0c48813..8eb13bf055f2 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -226,10 +226,7 @@ static int stm32_usart_config_rs485(struct uart_port *port, struct ktermios *ter
stm32_usart_clr_bits(port, ofs->cr1, BIT(cfg->uart_enable_bit));
- if (port->rs485_rx_during_tx_gpio)
- gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio,
- !!(rs485conf->flags & SER_RS485_RX_DURING_TX));
- else
+ if (!port->rs485_rx_during_tx_gpio)
rs485conf->flags |= SER_RS485_RX_DURING_TX;
if (rs485conf->flags & SER_RS485_ENABLED) {
--
2.40.1
Hi Greg,
It recently came to my attention that this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/?i…
[Upstream commit 69cba9d3c1284e0838ae408830a02c4a063104bc]
... which is marked with Fixes tag for a change that went into 5.9
kernel, was taken into 6.4 and 6.1 stable trees.
However, I do not see this in the 5.15 stable tree.
I got emails about this fix being taken to the 6.4 and 6.1 stable. But
I do not see any communication about 5.15 kernel.
Was this missed? Or is there something in the process that I missed?
Based on the kernel documentation about commit tags, I assumed that
for commits that have the "Fixes: " tag, it was not necessary to add
the "CC: stable" as well.
Please let me know if that understanding is wrong.
Regarding this particular fix, I discussed this with Steve, and he
agrees that this fix needs to go into all stable kernels as well.
--
Regards,
Shyam
An email was sent to you about receiving a pending funds but I'm
surprised that you never bothered to respond.
Please URGENTLY use my regular email address: mgr.audu(a)yahoo.com
Yours faithfully
Audit Manager
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
[ Upstream commit 41bae58df411f9accf01ea660730649b2fab1dab ]
asoc_simple_probe() is used for both "DT probe" (A) and "platform probe"
(B). It uses "goto err" when error case, but it is not needed for
"platform probe" case (B). Thus it is using "return" directly there.
static int asoc_simple_probe(...)
{
^ if (...) {
| ...
(A) if (ret < 0)
| goto err;
v } else {
^ ...
| if (ret < 0)
(B) return -Exxx;
v }
...
^ if (ret < 0)
(C) goto err;
v ...
err:
(D) simple_util_clean_reference(card);
return ret;
}
Both case are using (C) part, and it calls (D) when err case.
But (D) will do nothing for (B) case.
Because of these behavior, current code itself is not wrong,
but is confusable, and more, static analyzing tool will warning on
(B) part (should use goto err).
To avoid static analyzing tool warning, this patch uses "goto err"
on (B) part.
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Link: https://lore.kernel.org/r/87o7hy7mlh.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/generic/simple-card.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/generic/simple-card.c b/sound/soc/generic/simple-card.c
index 6959a74a6f491..c5db2d9d44edd 100644
--- a/sound/soc/generic/simple-card.c
+++ b/sound/soc/generic/simple-card.c
@@ -441,10 +441,12 @@ static int asoc_simple_card_probe(struct platform_device *pdev)
} else {
struct asoc_simple_card_info *cinfo;
+ ret = -EINVAL;
+
cinfo = dev->platform_data;
if (!cinfo) {
dev_err(dev, "no info for asoc-simple-card\n");
- return -EINVAL;
+ goto err;
}
if (!cinfo->name ||
@@ -453,7 +455,7 @@ static int asoc_simple_card_probe(struct platform_device *pdev)
!cinfo->platform ||
!cinfo->cpu_dai.name) {
dev_err(dev, "insufficient asoc_simple_card_info settings\n");
- return -EINVAL;
+ goto err;
}
card->name = (cinfo->card) ? cinfo->card : cinfo->name;
--
2.40.1
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
[ Upstream commit 41bae58df411f9accf01ea660730649b2fab1dab ]
asoc_simple_probe() is used for both "DT probe" (A) and "platform probe"
(B). It uses "goto err" when error case, but it is not needed for
"platform probe" case (B). Thus it is using "return" directly there.
static int asoc_simple_probe(...)
{
^ if (...) {
| ...
(A) if (ret < 0)
| goto err;
v } else {
^ ...
| if (ret < 0)
(B) return -Exxx;
v }
...
^ if (ret < 0)
(C) goto err;
v ...
err:
(D) simple_util_clean_reference(card);
return ret;
}
Both case are using (C) part, and it calls (D) when err case.
But (D) will do nothing for (B) case.
Because of these behavior, current code itself is not wrong,
but is confusable, and more, static analyzing tool will warning on
(B) part (should use goto err).
To avoid static analyzing tool warning, this patch uses "goto err"
on (B) part.
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Link: https://lore.kernel.org/r/87o7hy7mlh.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/generic/simple-card.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/generic/simple-card.c b/sound/soc/generic/simple-card.c
index 64bf3560c1d1c..7567ee380283e 100644
--- a/sound/soc/generic/simple-card.c
+++ b/sound/soc/generic/simple-card.c
@@ -404,10 +404,12 @@ static int asoc_simple_card_probe(struct platform_device *pdev)
} else {
struct asoc_simple_card_info *cinfo;
+ ret = -EINVAL;
+
cinfo = dev->platform_data;
if (!cinfo) {
dev_err(dev, "no info for asoc-simple-card\n");
- return -EINVAL;
+ goto err;
}
if (!cinfo->name ||
@@ -416,7 +418,7 @@ static int asoc_simple_card_probe(struct platform_device *pdev)
!cinfo->platform ||
!cinfo->cpu_dai.name) {
dev_err(dev, "insufficient asoc_simple_card_info settings\n");
- return -EINVAL;
+ goto err;
}
card->name = (cinfo->card) ? cinfo->card : cinfo->name;
--
2.40.1
From: Szilard Fabian <szfabian(a)bluemarch.art>
[ Upstream commit 80f39e1c27ba9e5a1ea7e68e21c569c9d8e46062 ]
In the initial boot stage the integrated keyboard of Fujitsu Lifebook E5411
refuses to work and it's not possible to type for example a dm-crypt
passphrase without the help of an external keyboard.
i8042.nomux kernel parameter resolves this issue but using that a PS/2
mouse is detected. This input device is unused even when the i2c-hid-acpi
kernel module is blacklisted making the integrated ELAN touchpad
(04F3:308A) not working at all.
Since the integrated touchpad is managed by the i2c_designware input
driver in the Linux kernel and you can't find a PS/2 mouse port on the
computer I think it's safe to not use the PS/2 mouse port at all.
Signed-off-by: Szilard Fabian <szfabian(a)bluemarch.art>
Link: https://lore.kernel.org/r/20231004011749.101789-1-szfabian@bluemarch.art
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/input/serio/i8042-x86ia64io.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h
index 700655741bf28..5df6eef18228d 100644
--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -609,6 +609,14 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
},
.driver_data = (void *)(SERIO_QUIRK_NOMUX)
},
+ {
+ /* Fujitsu Lifebook E5411 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU CLIENT COMPUTING LIMITED"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "LIFEBOOK E5411"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOAUX)
+ },
{
/* Gigabyte M912 */
.matches = {
--
2.40.1
From: Szilard Fabian <szfabian(a)bluemarch.art>
[ Upstream commit 80f39e1c27ba9e5a1ea7e68e21c569c9d8e46062 ]
In the initial boot stage the integrated keyboard of Fujitsu Lifebook E5411
refuses to work and it's not possible to type for example a dm-crypt
passphrase without the help of an external keyboard.
i8042.nomux kernel parameter resolves this issue but using that a PS/2
mouse is detected. This input device is unused even when the i2c-hid-acpi
kernel module is blacklisted making the integrated ELAN touchpad
(04F3:308A) not working at all.
Since the integrated touchpad is managed by the i2c_designware input
driver in the Linux kernel and you can't find a PS/2 mouse port on the
computer I think it's safe to not use the PS/2 mouse port at all.
Signed-off-by: Szilard Fabian <szfabian(a)bluemarch.art>
Link: https://lore.kernel.org/r/20231004011749.101789-1-szfabian@bluemarch.art
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/input/serio/i8042-acpipnpio.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/input/serio/i8042-acpipnpio.h b/drivers/input/serio/i8042-acpipnpio.h
index 1bd5898abb97d..09528c0a8a34e 100644
--- a/drivers/input/serio/i8042-acpipnpio.h
+++ b/drivers/input/serio/i8042-acpipnpio.h
@@ -609,6 +609,14 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
},
.driver_data = (void *)(SERIO_QUIRK_NOMUX)
},
+ {
+ /* Fujitsu Lifebook E5411 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU CLIENT COMPUTING LIMITED"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "LIFEBOOK E5411"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOAUX)
+ },
{
/* Gigabyte M912 */
.matches = {
--
2.40.1
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
[ Upstream commit 41bae58df411f9accf01ea660730649b2fab1dab ]
asoc_simple_probe() is used for both "DT probe" (A) and "platform probe"
(B). It uses "goto err" when error case, but it is not needed for
"platform probe" case (B). Thus it is using "return" directly there.
static int asoc_simple_probe(...)
{
^ if (...) {
| ...
(A) if (ret < 0)
| goto err;
v } else {
^ ...
| if (ret < 0)
(B) return -Exxx;
v }
...
^ if (ret < 0)
(C) goto err;
v ...
err:
(D) simple_util_clean_reference(card);
return ret;
}
Both case are using (C) part, and it calls (D) when err case.
But (D) will do nothing for (B) case.
Because of these behavior, current code itself is not wrong,
but is confusable, and more, static analyzing tool will warning on
(B) part (should use goto err).
To avoid static analyzing tool warning, this patch uses "goto err"
on (B) part.
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Link: https://lore.kernel.org/r/87o7hy7mlh.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/generic/simple-card.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/generic/simple-card.c b/sound/soc/generic/simple-card.c
index 283aa21879aa5..95e4c53cd90c7 100644
--- a/sound/soc/generic/simple-card.c
+++ b/sound/soc/generic/simple-card.c
@@ -680,10 +680,12 @@ static int asoc_simple_probe(struct platform_device *pdev)
struct snd_soc_dai_link *dai_link = priv->dai_link;
struct simple_dai_props *dai_props = priv->dai_props;
+ ret = -EINVAL;
+
cinfo = dev->platform_data;
if (!cinfo) {
dev_err(dev, "no info for asoc-simple-card\n");
- return -EINVAL;
+ goto err;
}
if (!cinfo->name ||
@@ -692,7 +694,7 @@ static int asoc_simple_probe(struct platform_device *pdev)
!cinfo->platform ||
!cinfo->cpu_dai.name) {
dev_err(dev, "insufficient asoc_simple_card_info settings\n");
- return -EINVAL;
+ goto err;
}
cpus = dai_link->cpus;
--
2.40.1
Commit 495184ac91bb ("mt76: mt7915: add support for applying
pre-calibration data") was fundamentally broken and never worked.
The idea (before NVMEM support) was to expand the MTD function and pass
an additional offset. For normal EEPROM load the offset would always be
0. For the purpose of precal loading, an offset was passed that was
internally the size of EEPROM, since precal data is right after the
EEPROM.
Problem is that the offset value passed is never handled and is actually
overwrite by
offset = be32_to_cpup(list);
ret = mtd_read(mtd, offset, len, &retlen, eep);
resulting in the passed offset value always ingnored. (and even passing
garbage data as precal as the start of the EEPROM is getting read)
Fix this by adding to the current offset value, the offset from DT to
correctly read the piece of data at the requested location.
Cc: stable(a)vger.kernel.org
Fixes: 495184ac91bb ("mt76: mt7915: add support for applying pre-calibration data")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/net/wireless/mediatek/mt76/eeprom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/mediatek/mt76/eeprom.c b/drivers/net/wireless/mediatek/mt76/eeprom.c
index 36564930aef1..2558788f7ffb 100644
--- a/drivers/net/wireless/mediatek/mt76/eeprom.c
+++ b/drivers/net/wireless/mediatek/mt76/eeprom.c
@@ -67,7 +67,7 @@ static int mt76_get_of_epprom_from_mtd(struct mt76_dev *dev, void *eep, int offs
goto out_put_node;
}
- offset = be32_to_cpup(list);
+ offset += be32_to_cpup(list);
ret = mtd_read(mtd, offset, len, &retlen, eep);
put_mtd_device(mtd);
if (mtd_is_bitflip(ret))
--
2.40.1
> Now i wounder if you are fixing the wrong thing. Maybe you should be
> fixing the PHY so it does not report up and then down? You say 'very
> snall intervals', which should in fact be 1 second. So is the PHY
> reporting link for a number of poll intervals? 1min to 10 minutes?
>
> Andrew
>
> Yes, according to the log records, the phy polls every second,
> but the link status changes take time.
> Generally, it takes 10 seconds for the phy to detect link down,
> but occasionally it takes several minutes to detect link down,
What PHY driver is this?
It is not so clear what should actually happen with auto-neg turned
off. With it on, and the link going down, the PHY should react after
about 1 second. It is not supposed to react faster than that, although
some PHYs allow fast link down notification to be configured.
Have you checked 802.3 to see what it says about auto-neg off and link
down detection?
I personally would not suppress this behaviour in the MAC
driver. Otherwise you are going to have funny combinations of special
cases of a feature which very few people actually use, making your
maintenance costs higher.
Andrew
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
commit 6c2f421174273de8f83cde4286d1c076d43a2d35 upstream.
Several core drivers and buses expect that driver_override is a
dynamically allocated memory thus later they can kfree() it.
However such assumption is not documented, there were in the past and
there are already users setting it to a string literal. This leads to
kfree() of static memory during device release (e.g. in error paths or
during unbind):
kernel BUG at ../mm/slub.c:3960!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
...
(kfree) from [<c058da50>] (platform_device_release+0x88/0xb4)
(platform_device_release) from [<c0585be0>] (device_release+0x2c/0x90)
(device_release) from [<c0a69050>] (kobject_put+0xec/0x20c)
(kobject_put) from [<c0f2f120>] (exynos5_clk_probe+0x154/0x18c)
(exynos5_clk_probe) from [<c058de70>] (platform_drv_probe+0x6c/0xa4)
(platform_drv_probe) from [<c058b7ac>] (really_probe+0x280/0x414)
(really_probe) from [<c058baf4>] (driver_probe_device+0x78/0x1c4)
(driver_probe_device) from [<c0589854>] (bus_for_each_drv+0x74/0xb8)
(bus_for_each_drv) from [<c058b48c>] (__device_attach+0xd4/0x16c)
(__device_attach) from [<c058a638>] (bus_probe_device+0x88/0x90)
(bus_probe_device) from [<c05871fc>] (device_add+0x3dc/0x62c)
(device_add) from [<c075ff10>] (of_platform_device_create_pdata+0x94/0xbc)
(of_platform_device_create_pdata) from [<c07600ec>] (of_platform_bus_create+0x1a8/0x4fc)
(of_platform_bus_create) from [<c0760150>] (of_platform_bus_create+0x20c/0x4fc)
(of_platform_bus_create) from [<c07605f0>] (of_platform_populate+0x84/0x118)
(of_platform_populate) from [<c0f3c964>] (of_platform_default_populate_init+0xa0/0xb8)
(of_platform_default_populate_init) from [<c01031f8>] (do_one_initcall+0x8c/0x404)
Provide a helper which clearly documents the usage of driver_override.
This will allow later to reuse the helper and reduce the amount of
duplicated code.
Convert the platform driver to use a new helper and make the
driver_override field const char (it is not modified by the core).
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-2-krzysztof.kozlowski@linar…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Lee Jones <lee(a)kernel.org>
---
drivers/base/driver.c | 69 +++++++++++++++++++++++++++++++++
drivers/base/platform.c | 28 ++-----------
include/linux/device.h | 2 +
include/linux/platform_device.h | 6 ++-
4 files changed, 80 insertions(+), 25 deletions(-)
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 857c8f1b876e5..668c6c8c22f1f 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -29,6 +29,75 @@ static struct device *next_device(struct klist_iter *i)
return dev;
}
+/**
+ * driver_set_override() - Helper to set or clear driver override.
+ * @dev: Device to change
+ * @override: Address of string to change (e.g. &device->driver_override);
+ * The contents will be freed and hold newly allocated override.
+ * @s: NUL-terminated string, new driver name to force a match, pass empty
+ * string to clear it ("" or "\n", where the latter is only for sysfs
+ * interface).
+ * @len: length of @s
+ *
+ * Helper to set or clear driver override in a device, intended for the cases
+ * when the driver_override field is allocated by driver/bus code.
+ *
+ * Returns: 0 on success or a negative error code on failure.
+ */
+int driver_set_override(struct device *dev, const char **override,
+ const char *s, size_t len)
+{
+ const char *new, *old;
+ char *cp;
+
+ if (!override || !s)
+ return -EINVAL;
+
+ /*
+ * The stored value will be used in sysfs show callback (sysfs_emit()),
+ * which has a length limit of PAGE_SIZE and adds a trailing newline.
+ * Thus we can store one character less to avoid truncation during sysfs
+ * show.
+ */
+ if (len >= (PAGE_SIZE - 1))
+ return -EINVAL;
+
+ if (!len) {
+ /* Empty string passed - clear override */
+ device_lock(dev);
+ old = *override;
+ *override = NULL;
+ device_unlock(dev);
+ kfree(old);
+
+ return 0;
+ }
+
+ cp = strnchr(s, len, '\n');
+ if (cp)
+ len = cp - s;
+
+ new = kstrndup(s, len, GFP_KERNEL);
+ if (!new)
+ return -ENOMEM;
+
+ device_lock(dev);
+ old = *override;
+ if (cp != s) {
+ *override = new;
+ } else {
+ /* "\n" passed - clear override */
+ kfree(new);
+ *override = NULL;
+ }
+ device_unlock(dev);
+
+ kfree(old);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(driver_set_override);
+
/**
* driver_for_each_device - Iterator for devices bound to a driver.
* @drv: Driver we're iterating.
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 2f89e618b142c..a09e7a681f7a7 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -891,31 +891,11 @@ static ssize_t driver_override_store(struct device *dev,
const char *buf, size_t count)
{
struct platform_device *pdev = to_platform_device(dev);
- char *driver_override, *old, *cp;
-
- /* We need to keep extra room for a newline */
- if (count >= (PAGE_SIZE - 1))
- return -EINVAL;
-
- driver_override = kstrndup(buf, count, GFP_KERNEL);
- if (!driver_override)
- return -ENOMEM;
-
- cp = strchr(driver_override, '\n');
- if (cp)
- *cp = '\0';
-
- device_lock(dev);
- old = pdev->driver_override;
- if (strlen(driver_override)) {
- pdev->driver_override = driver_override;
- } else {
- kfree(driver_override);
- pdev->driver_override = NULL;
- }
- device_unlock(dev);
+ int ret;
- kfree(old);
+ ret = driver_set_override(dev, &pdev->driver_override, buf, count);
+ if (ret)
+ return ret;
return count;
}
diff --git a/include/linux/device.h b/include/linux/device.h
index 37e359d81a86f..bccd367c11de5 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -330,6 +330,8 @@ extern int __must_check driver_create_file(struct device_driver *driver,
extern void driver_remove_file(struct device_driver *driver,
const struct driver_attribute *attr);
+int driver_set_override(struct device *dev, const char **override,
+ const char *s, size_t len);
extern int __must_check driver_for_each_device(struct device_driver *drv,
struct device *start,
void *data,
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 9e5c98fcea8c6..8268439975b21 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -29,7 +29,11 @@ struct platform_device {
struct resource *resource;
const struct platform_device_id *id_entry;
- char *driver_override; /* Driver name to force a match */
+ /*
+ * Driver name to force a match. Do not set directly, because core
+ * frees it. Use driver_set_override() to set or clear it.
+ */
+ const char *driver_override;
/* MFD cell pointer */
struct mfd_cell *mfd_cell;
--
2.42.0.655.g421f12c284-goog
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
commit 6c2f421174273de8f83cde4286d1c076d43a2d35 upstream.
Several core drivers and buses expect that driver_override is a
dynamically allocated memory thus later they can kfree() it.
However such assumption is not documented, there were in the past and
there are already users setting it to a string literal. This leads to
kfree() of static memory during device release (e.g. in error paths or
during unbind):
kernel BUG at ../mm/slub.c:3960!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
...
(kfree) from [<c058da50>] (platform_device_release+0x88/0xb4)
(platform_device_release) from [<c0585be0>] (device_release+0x2c/0x90)
(device_release) from [<c0a69050>] (kobject_put+0xec/0x20c)
(kobject_put) from [<c0f2f120>] (exynos5_clk_probe+0x154/0x18c)
(exynos5_clk_probe) from [<c058de70>] (platform_drv_probe+0x6c/0xa4)
(platform_drv_probe) from [<c058b7ac>] (really_probe+0x280/0x414)
(really_probe) from [<c058baf4>] (driver_probe_device+0x78/0x1c4)
(driver_probe_device) from [<c0589854>] (bus_for_each_drv+0x74/0xb8)
(bus_for_each_drv) from [<c058b48c>] (__device_attach+0xd4/0x16c)
(__device_attach) from [<c058a638>] (bus_probe_device+0x88/0x90)
(bus_probe_device) from [<c05871fc>] (device_add+0x3dc/0x62c)
(device_add) from [<c075ff10>] (of_platform_device_create_pdata+0x94/0xbc)
(of_platform_device_create_pdata) from [<c07600ec>] (of_platform_bus_create+0x1a8/0x4fc)
(of_platform_bus_create) from [<c0760150>] (of_platform_bus_create+0x20c/0x4fc)
(of_platform_bus_create) from [<c07605f0>] (of_platform_populate+0x84/0x118)
(of_platform_populate) from [<c0f3c964>] (of_platform_default_populate_init+0xa0/0xb8)
(of_platform_default_populate_init) from [<c01031f8>] (do_one_initcall+0x8c/0x404)
Provide a helper which clearly documents the usage of driver_override.
This will allow later to reuse the helper and reduce the amount of
duplicated code.
Convert the platform driver to use a new helper and make the
driver_override field const char (it is not modified by the core).
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-2-krzysztof.kozlowski@linar…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Lee Jones <lee(a)kernel.org>
---
drivers/base/driver.c | 69 +++++++++++++++++++++++++++++++++
drivers/base/platform.c | 28 ++-----------
include/linux/device/driver.h | 2 +
include/linux/platform_device.h | 6 ++-
4 files changed, 80 insertions(+), 25 deletions(-)
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 8c0d33e182fd5..1b9d47b10bd0a 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -30,6 +30,75 @@ static struct device *next_device(struct klist_iter *i)
return dev;
}
+/**
+ * driver_set_override() - Helper to set or clear driver override.
+ * @dev: Device to change
+ * @override: Address of string to change (e.g. &device->driver_override);
+ * The contents will be freed and hold newly allocated override.
+ * @s: NUL-terminated string, new driver name to force a match, pass empty
+ * string to clear it ("" or "\n", where the latter is only for sysfs
+ * interface).
+ * @len: length of @s
+ *
+ * Helper to set or clear driver override in a device, intended for the cases
+ * when the driver_override field is allocated by driver/bus code.
+ *
+ * Returns: 0 on success or a negative error code on failure.
+ */
+int driver_set_override(struct device *dev, const char **override,
+ const char *s, size_t len)
+{
+ const char *new, *old;
+ char *cp;
+
+ if (!override || !s)
+ return -EINVAL;
+
+ /*
+ * The stored value will be used in sysfs show callback (sysfs_emit()),
+ * which has a length limit of PAGE_SIZE and adds a trailing newline.
+ * Thus we can store one character less to avoid truncation during sysfs
+ * show.
+ */
+ if (len >= (PAGE_SIZE - 1))
+ return -EINVAL;
+
+ if (!len) {
+ /* Empty string passed - clear override */
+ device_lock(dev);
+ old = *override;
+ *override = NULL;
+ device_unlock(dev);
+ kfree(old);
+
+ return 0;
+ }
+
+ cp = strnchr(s, len, '\n');
+ if (cp)
+ len = cp - s;
+
+ new = kstrndup(s, len, GFP_KERNEL);
+ if (!new)
+ return -ENOMEM;
+
+ device_lock(dev);
+ old = *override;
+ if (cp != s) {
+ *override = new;
+ } else {
+ /* "\n" passed - clear override */
+ kfree(new);
+ *override = NULL;
+ }
+ device_unlock(dev);
+
+ kfree(old);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(driver_set_override);
+
/**
* driver_for_each_device - Iterator for devices bound to a driver.
* @drv: Driver we're iterating.
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 88aef93eb4ddf..647066229fec3 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1046,31 +1046,11 @@ static ssize_t driver_override_store(struct device *dev,
const char *buf, size_t count)
{
struct platform_device *pdev = to_platform_device(dev);
- char *driver_override, *old, *cp;
-
- /* We need to keep extra room for a newline */
- if (count >= (PAGE_SIZE - 1))
- return -EINVAL;
-
- driver_override = kstrndup(buf, count, GFP_KERNEL);
- if (!driver_override)
- return -ENOMEM;
-
- cp = strchr(driver_override, '\n');
- if (cp)
- *cp = '\0';
-
- device_lock(dev);
- old = pdev->driver_override;
- if (strlen(driver_override)) {
- pdev->driver_override = driver_override;
- } else {
- kfree(driver_override);
- pdev->driver_override = NULL;
- }
- device_unlock(dev);
+ int ret;
- kfree(old);
+ ret = driver_set_override(dev, &pdev->driver_override, buf, count);
+ if (ret)
+ return ret;
return count;
}
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index ee7ba5b5417e5..a44f5adeaef5a 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -150,6 +150,8 @@ extern int __must_check driver_create_file(struct device_driver *driver,
extern void driver_remove_file(struct device_driver *driver,
const struct driver_attribute *attr);
+int driver_set_override(struct device *dev, const char **override,
+ const char *s, size_t len);
extern int __must_check driver_for_each_device(struct device_driver *drv,
struct device *start,
void *data,
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 17f9cd5626c83..e7a83b0218077 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -30,7 +30,11 @@ struct platform_device {
struct resource *resource;
const struct platform_device_id *id_entry;
- char *driver_override; /* Driver name to force a match */
+ /*
+ * Driver name to force a match. Do not set directly, because core
+ * frees it. Use driver_set_override() to set or clear it.
+ */
+ const char *driver_override;
/* MFD cell pointer */
struct mfd_cell *mfd_cell;
--
2.42.0.655.g421f12c284-goog
Commit 495184ac91bb ("mt76: mt7915: add support for applying
pre-calibration data") was fundamentally broken and never worked.
The idea (before NVMEM support) was to expand the MTD function and pass
an additional offset. For normal EEPROM load the offset would always be
0. For the purpose of precal loading, an offset was passed that was
internally the size of EEPROM, since precal data is right after the
EEPROM.
Problem is that the offset value passed is never handled and is actually
overwrite by
offset = be32_to_cpup(list);
ret = mtd_read(mtd, offset, len, &retlen, eep);
resulting in the passed offset value always ingnored. (and even passing
garbage data as precal as the start of the EEPROM is getting read)
Fix this by adding to the current offset value, the offset from DT to
correctly read the piece of data at the requested location.
Cc: stable(a)vger.kernel.org
Fixes: 495184ac91bb ("mt76: mt7915: add support for applying pre-calibration data")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/net/wireless/mediatek/mt76/eeprom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/mediatek/mt76/eeprom.c b/drivers/net/wireless/mediatek/mt76/eeprom.c
index 36564930aef1..2558788f7ffb 100644
--- a/drivers/net/wireless/mediatek/mt76/eeprom.c
+++ b/drivers/net/wireless/mediatek/mt76/eeprom.c
@@ -67,7 +67,7 @@ static int mt76_get_of_epprom_from_mtd(struct mt76_dev *dev, void *eep, int offs
goto out_put_node;
}
- offset = be32_to_cpup(list);
+ offset += be32_to_cpup(list);
ret = mtd_read(mtd, offset, len, &retlen, eep);
put_mtd_device(mtd);
if (mtd_is_bitflip(ret))
--
2.40.1
Don't apply the stimer's counter side effects when modifying its
value from user-space, as this may trigger spurious interrupts.
For example:
- The stimer is configured in auto-enable mode.
- The stimer's count is set and the timer enabled.
- The stimer expires, an interrupt is injected.
- The VM is live migrated.
- The stimer config and count are deserialized, auto-enable is ON, the
stimer is re-enabled.
- The stimer expires right away, and injects an unwarranted interrupt.
Cc: stable(a)vger.kernel.org
Fixes: 1f4b34f825e8 ("kvm/x86: Hyper-V SynIC timers")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
---
Changes since v2:
- reword commit message/subject.
Changes since v1:
- Cover all 'stimer->config.enable' updates.
arch/x86/kvm/hyperv.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 7c2dac6824e2..238afd7335e4 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -727,10 +727,12 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
stimer_cleanup(stimer);
stimer->count = count;
- if (stimer->count == 0)
- stimer->config.enable = 0;
- else if (stimer->config.auto_enable)
- stimer->config.enable = 1;
+ if (!host) {
+ if (stimer->count == 0)
+ stimer->config.enable = 0;
+ else if (stimer->config.auto_enable)
+ stimer->config.enable = 1;
+ }
if (stimer->config.enable)
stimer_mark_pending(stimer, false);
--
2.40.1
This is a new document based on my 2022 blog post:
https://blogs.oracle.com/linux/post/backporting-patches-using-git
Although this is aimed at stable contributors and distro maintainers,
it does also contain useful tips and tricks for anybody who needs to
resolve merge conflicts.
By adding this to the kernel as documentation we can more easily point
to it e.g. from stable emails about failed backports, as well as allow
the community to modify it over time if necessary.
I've added this under process/ since it also has
process/applying-patches.rst. Another interesting document is
maintainer/rebasing-and-merging.rst which maybe should eventually refer
to this one, but I'm leaving that as a future cleanup.
Thanks to Harshit Mogalapalli for helping with the original blog post
as well as this updated document and Bagas Sanjaya for providing
thoughtful feedback.
v2: fixed heading style, link style, placeholder style, other comments
Link: https://lore.kernel.org/linux-doc/20230303162553.17212-1-vegard.nossum@orac…
Cc: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Jason A. Donenfeld <Jason(a)zx2c4.com>
Cc: Konstantin Ryabitsev <konstantin(a)linuxfoundation.org>
Signed-off-by: Vegard Nossum <vegard.nossum(a)oracle.com>
---
Documentation/process/backporting.rst | 514 ++++++++++++++++++++++++++
Documentation/process/index.rst | 1 +
2 files changed, 515 insertions(+)
create mode 100644 Documentation/process/backporting.rst
diff --git a/Documentation/process/backporting.rst b/Documentation/process/backporting.rst
new file mode 100644
index 0000000000000..7593980af9659
--- /dev/null
+++ b/Documentation/process/backporting.rst
@@ -0,0 +1,514 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Backporting and conflict resolution
+===================================
+
+:Author: Vegard Nossum <vegard.nossum(a)oracle.com>
+
+.. contents::
+ :local:
+ :depth: 3
+ :backlinks: none
+
+Introduction
+============
+
+Some developers may never really have to deal with backporting patches,
+merging branches, or resolving conflicts in their day-to-day work, so
+when a merge conflict does pop up, it can be daunting. Luckily,
+resolving conflicts is a skill like any other, and there are many useful
+techniques you can use to make the process smoother and increase your
+confidence in the result.
+
+This document aims to be a comprehensive, step-by-step guide to
+backporting and conflict resolution.
+
+Applying the patch to a tree
+============================
+
+Sometimes the patch you are backporting already exists as a git commit,
+in which case you just cherry-pick it directly using
+``git cherry-pick``. However, if the patch comes from an email, as it
+often does for the Linux kernel, you will need to apply it to a tree
+using ``git am``.
+
+If you've ever used ``git am``, you probably already know that it is
+quite picky about the patch applying perfectly to your source tree. In
+fact, you've probably had nightmares about ``.rej`` files and trying to
+edit the patch to make it apply.
+
+It is strongly recommended to instead find an appropriate base version
+where the patch applies cleanly and *then* cherry-pick it over to your
+destination tree, as this will make git output conflict markers and let
+you resolve conflicts with the help of git and any other conflict
+resolution tools you might prefer to use. For example, if you want to
+apply a patch that just arrived on LKML to an older stable kernel, you
+can apply it to the most recent mainline kernel and then cherry-pick it
+to your older stable branch.
+
+It's generally better to use the exact same base as the one the patch
+was generated from, but it doesn't really matter that much as long as it
+applies cleanly and isn't too far from the original base. The only
+problem with applying the patch to the "wrong" base is that it may pull
+in more unrelated changes in the context of the diff when cherry-picking
+it to the older branch.
+
+If you are using `b4`_. and you are applying the patch directly from an
+email, you can use ``b4 am`` with the options ``-g``/``--guess-base``
+and ``-3``/``--prep-3way`` to do some of this automatically (see the
+`b4 presentation`_ for more information). However, the rest of this
+article will assume that you are doing a plain ``git cherry-pick``.
+
+.. _b4: https://people.kernel.org/monsieuricon/introducing-b4-and-patch-attestation
+.. _b4 presentation: https://youtu.be/mF10hgVIx9o?t=2996
+
+Once you have the patch in git, you can go ahead and cherry-pick it into
+your source tree. Don't forget to cherry-pick with ``-x`` if you want a
+written record of where the patch came from!
+
+Note that if you are submiting a patch for stable, the format is
+slightly different; the first line after the subject line needs tobe
+either::
+
+ commit <upstream commit> upstream
+
+or::
+
+ [ Upstream commit <upstream commit> ]
+
+Resolving conflicts
+===================
+
+Uh-oh; the cherry-pick failed with a vaguely threatening message::
+
+ CONFLICT (content): Merge conflict
+
+What to do now?
+
+In general, conflicts appear when the context of the patch (i.e., the
+lines being changed and/or the lines surrounding the changes) doesn't
+match what's in the tree you are trying to apply the patch *to*.
+
+For backports, what likely happened was that the branch you are
+backporting from contains patches not in the branch you are backporting
+to. However, the reverse is also possible. In any case, the result is a
+conflict that needs to be resolved.
+
+If your attempted cherry-pick fails with a conflict, git automatically
+edits the files to include so-called conflict markers showing you where
+the conflict is and how the two branches have diverged. Resolving the
+conflict typically means editing the end result in such a way that it
+takes into account these other commits.
+
+Resolving the conflict can be done either by hand in a regular text
+editor or using a dedicated conflict resolution tool.
+
+Many people prefer to use their regular text editor and edit the
+conflict directly, as it may be easier to understand what you're doing
+and to control the final result. There are definitely pros and cons to
+each method, and sometimes there's value in using both.
+
+We will not cover using dedicated merge tools here beyond providing some
+pointers to various tools that you could use:
+
+- `vimdiff/gvimdiff <https://linux.die.net/man/1/vimdiff>`__
+- `KDiff3 <http://kdiff3.sourceforge.net/>`__
+- `TortoiseMerge <https://tortoisesvn.net/TortoiseMerge.html>`__
+- `Meld <https://meldmerge.org/help/>`__
+- `P4Merge <https://www.perforce.com/products/helix-core-apps/merge-diff-tool-p4merge>`__
+- `Beyond Compare <https://www.scootersoftware.com/>`__
+- `IntelliJ <https://www.jetbrains.com/help/idea/resolve-conflicts.html>`__
+- `VSCode <https://code.visualstudio.com/docs/editor/versioncontrol>`__
+
+To configure git to work with these, see ``git mergetool --help`` or
+the official `git-mergetool documentation`_.
+
+.. _git-mergetool documentation: https://git-scm.com/docs/git-mergetool
+
+Prerequisite patches
+--------------------
+
+Most conflicts happen because the branch you are backporting to is
+missing some patches compared to the branch you are backporting *from*.
+In the more general case (such as merging two independent branches),
+development could have happened on either branch, or the branches have
+simply diverged -- perhaps your older branch had some other backports
+applied to it that themselves needed conflict resolutions, causing a
+divergence.
+
+It's important to always identify the commit or commits that caused the
+conflict, as otherwise you cannot be confident in the correctness of
+your resolution. As an added bonus, especially if the patch is in an
+area you're not that famliar with, the changelogs of these commits will
+often give you the context to understand the code and potential problems
+or pitfalls with your conflict resolution.
+
+git log
+~~~~~~~
+
+A good first step is to look at ``git log`` for the file that has the
+conflict -- this is usually sufficient when there aren't a lot of
+patches to the file, but may get confusing if the file is big and
+frequently patched. You should run ``git log`` on the range of commits
+between your currently checked-out branch (``HEAD``) and the parent of
+the patch you are picking (``<commit>``), i.e.::
+
+ git log HEAD..<commit>^ -- <path>
+
+Even better, if you want to restrict this output to a single function
+(because that's where the conflict appears), you can use the following
+syntax::
+
+ git log -L:'\<function\>':<path> HEAD..<commit>^
+
+.. note::
+ The ``\<`` and ``\>`` around the function name ensure that the
+ matches are anchored on a word boundary. This is important, as this
+ part is actually a regex and git only follows the first match, so
+ if you use ``-L:thread_stack:kernel/fork.c`` it may only give you
+ results for the function ``try_release_thread_stack_to_cache`` even
+ though there are many other functions in that file containing the
+ string ``thread_stack`` in their names.
+
+Another useful option for ``git log`` is ``-G``, which allows you to
+filter on certain strings appearing in the diffs of the commits you are
+listing::
+
+ git log -G'regex' HEAD..<commit>^ -- <path>
+
+This can also be a handy way to quickly find when something (e.g. a
+function call or a variable) was changed, added, or removed. The search
+string is a regular expression, which means you can potentially search
+for more specific things like assignments to a specific struct member::
+
+ git log -G'\->index\>.*='
+
+git blame
+~~~~~~~~~
+
+Another way to find prerequisite commits (albeit only the most recent
+one for a given conflict) is to run ``git blame``. In this case, you
+need to run it against the parent commit of the patch you are
+cherry-picking and the file where the conflict appared, i.e.::
+
+ git blame <commit>^ -- <path>
+
+This command also accepts the ``-L`` argument (for restricting the
+output to a single function), but in this case you specify the filename
+at the end of the command as usual::
+
+ git blame -L:'\<function\>' <commit>^ -- <path>
+
+Navigate to the place where the conflict occurred. The first column of
+the blame output is the commit ID of the patch that added a given line
+of code.
+
+It might be a good idea to ``git show`` these commits and see if they
+look like they might be the source of the conflict. Sometimes there will
+be more than one of these commits, either because multiple commits
+changed different lines of the same conflict area *or* because multiple
+subsequent patches changed the same line (or lines) multiple times. In
+the latter case, you may have to run ``git blame`` again and specify the
+older version of the file to look at in order to dig further back in
+the history of the file.
+
+Prerequisite vs. incidental patches
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Having found the patch that caused the conflict, you need to determine
+whether it is a prerequisite for the patch you are backporting or
+whether it is just incidental and can be skipped. An incidental patch
+would be one that touches the same code as the patch you are
+backporting, but does not change the semantics of the code in any
+material way. For example, a whitespace cleanup patch is completely
+incidental -- likewise, a patch that simply renames a function or a
+variable would be incidental as well. On the other hand, if the function
+being changed does not even exist in your current branch then this would
+not be incidental at all and you need to carefully consider whether the
+patch adding the function should be cherry-picked first.
+
+If you find that there is a necessary prerequisite patch, then you need
+to stop and cherry-pick that instead. If you've already resolved some
+conflicts in a different file and don't want to do it again, you can
+create a temporary copy of that file.
+
+To abort the current cherry-pick, go ahead and run
+``git cherry-pick --abort``, then restart the cherry-picking process
+with the commit ID of the prerequisite patch instead.
+
+Understanding conflict markers
+------------------------------
+
+Combined diffs
+~~~~~~~~~~~~~~
+
+Let's say you've decided against picking (or reverting) additional
+patches and you just want to resolve the conflict. Git will have
+inserted conflict markers into your file. Out of the box, this will look
+something like::
+
+ <<<<<<< HEAD
+ this is what's in your current tree before cherry-picking
+ =======
+ this is what the patch wants it to be after cherry-picking
+ >>>>>>> <commit>... title
+
+This is what you would see if you opened the file in your editor.
+However, if you were to run run ``git diff`` without any arguments, the
+output would look something like this::
+
+ $ git diff
+ [...]
+ ++<<<<<<<< HEAD
+ +this is what's in your current tree before cherry-picking
+ ++========
+ + this is what the patch wants it to be after cherry-picking
+ ++>>>>>>>> <commit>... title
+
+When you are resolving a conflict, the behavior of ``git diff`` differs
+from its normal behavior. Notice the two columns of diff markers
+instead of the usual one; this is a so-called "`combined diff`_", here
+showing the 3-way diff (or diff-of-diffs) between
+
+#. the current branch (before cherry-picking) and the current working
+ directory, and
+#. the current branch (before cherry-picking) and the file as it looks
+ after the original patch has been applied.
+
+.. _combined diff: https://git-scm.com/docs/diff-format#_combined_diff_format
+
+
+Better diffs
+~~~~~~~~~~~~
+
+3-way combined diffs include all the other changes that happened to the
+file between your current branch and the branch you are cherry-picking
+from. While this is useful for spotting other changes that you need to
+take into account, this also makes the output of ``git diff`` somewhat
+intimidating and difficult to read. You may instead prefer to run
+``git diff HEAD`` (or ``git diff --ours``) which shows only the diff
+between the current branch before cherry-picking and the current working
+directory. It looks like this::
+
+ $ git diff HEAD
+ [...]
+ +<<<<<<<< HEAD
+ this is what's in your current tree before cherry-picking
+ +========
+ +this is what the patch wants it to be after cherry-picking
+ +>>>>>>>> <commit>... title
+
+As you can see, this reads just like any other diff and makes it clear
+which lines are in the current branch and which lines are being added
+because they are part of the merge conflict or the patch being
+cherry-picked.
+
+Merge styles and diff3
+~~~~~~~~~~~~~~~~~~~~~~
+
+The default conflict marker style shown above is known as the ``merge``
+style. There is also another style available, known as the ``diff3``
+style, which looks like this::
+
+ <<<<<<< HEAD
+ this is what is in your current tree before cherry-picking
+ ||||||| parent of <commit> (title)
+ this is what the patch expected to find there
+ =======
+ this is what the patch wants it to be after being applied
+ >>>>>>> <commit> (title)
+
+As you can see, this has 3 parts instead of 2, and includes what git
+expected to find there but didn't. Some people vastly prefer this style
+as it makes it much clearer what the patch actually changed; i.e., it
+allows you to compare the before-and-after versions of the file for the
+commit you are cherry-picking. This allows you to make better decisions
+about how to resolve the conflict.
+
+To change conflict marker styles, you can use the following command::
+
+ git config merge.conflictStyle diff3
+
+There is a third option, ``zdiff3``, introduced in `Git 2.35`_,
+which has the same 3 sections as ``diff3``, but where common lines have
+been trimmed off, making the conflict area smaller in some cases.
+
+.. _Git 2.35: https://github.blog/2022-01-24-highlights-from-git-2-35/
+
+Iterating on conflict resolutions
+---------------------------------
+
+The first step in any conflict resolution process is to understand the
+patch you are backporting. For the Linux kernel this is especially
+important, since an incorrect change can lead to the whole system
+crashing -- or worse, an undetected security vulnerability.
+
+Understanding the patch can be easy or difficult depending on the patch
+itself, the changelog, and your familiarity with the code being changed.
+However, a good question for every change (or every hunk of the patch)
+might be: "Why is this hunk in the patch?" The answers to these
+questions will inform your conflict resolution.
+
+Resolution process
+~~~~~~~~~~~~~~~~~~
+
+Sometimes the easiest thing to do is to just remove all but the first
+part of the conflict, leaving the file essentially unchanged, and apply
+the changes by hand. Perhaps the patch is changing a function call
+argument from ``0`` to ``1`` while a conflicting change added an
+entirely new (and insignificant) parameter to the end of the parameter
+list; in that case, it's easy enough to change the argument from ``0``
+to ``1`` by hand and leave the rest of the arguments alone. This
+technique of manually applying changes is mostly useful if the conflict
+pulled in a lot of unrelated context that you don't really need to care
+about.
+
+For particularly nasty conflicts with many conflict markers, you can use
+``git add`` or ``git add -i`` to selectively stage your resolutions to
+get them out of the way; this also lets you use ``git diff HEAD`` to
+always see what remains to be resolved or ``git diff --cached`` to see
+what your patch looks like so far.
+
+Function arguments
+~~~~~~~~~~~~~~~~~~
+
+Pay attention to changing function arguments! It's easy to gloss over
+details and think that two lines are the same but actually they differ
+in some small detail like which variable was passed as an argument
+(especially if the two variables are both a single character that look
+the same, like i and j).
+
+Error handling
+~~~~~~~~~~~~~~
+
+If you cherry-pick a patch that includes a ``goto`` statement (typically
+for error handling), it is absolutely imperative to double check that
+the target label is still correct in the branch you are backporting to.
+Error handling is typically located at the bottom of the function, so it
+may not be part of the conflict even though could have been changed by
+other patches.
+
+A good way to ensure that you review the error paths is to always use
+``git diff -W`` and ``git show -W`` (AKA ``--function-context``) when
+inspecting your changes. For C code, this will show you the whole
+function that's being changed in a patch. One of the things that often
+go wrong during backports is that something else in the function changed
+on either of the branches that you're backporting from or to. By
+including the whole function in the diff you get more context and can
+more easily spot problems that might otherwise go unnoticed.
+
+Dealing with file renames
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+One of the most annoying things that can happen while backporting a
+patch is discovering that one of the files being patched has been
+renamed, as that typically means git won't even put in conflict markers,
+but will just throw up its hands and say (paraphrased): "Unmerged path!
+You do the work..."
+
+There are generally a few ways to deal with this. If the patch to the
+renamed file is small, like a one-line change, the easiest thing is to
+just go ahead and apply the change by hand and be done with it. On the
+other hand, if the change is big or complicated, you definitely don't
+want to do it by hand.
+
+Sometimes the right thing to do will be to also backport the patch that
+did the rename, but that's definitely not the most common case. Instead,
+what you can do is to temporarily rename the file in the branch you're
+backporting to (using ``git mv`` and committing the result), restart the
+attempt to cherry-pick the patch, rename the file back (``git mv`` and
+committing again), and finally squash the result using ``git rebase -i``
+(see the `rebase tutorial`_) so it appears as a single commit when you
+are done.
+
+.. _rebase tutorial: https://medium.com/@slamflipstrom/a-beginners-guide-to-squashing-commits-wi…
+
+Verifying the result
+====================
+
+colordiff
+---------
+
+Having committed a conflict-free new patch, you can now compare your
+patch to the original patch. It is highly recommended that you use a
+tool such as `colordiff`_ that can show two files side by side and color
+them according to the changes between them::
+
+ colordiff -yw -W 200 <(git diff -W <upstream commit>^-) <(git diff -W HEAD^-) | less -SR
+
+.. _colordiff: https://www.colordiff.org/
+
+Here, ``-y`` means to do a side-by-side comparison; ``-w`` ignores
+whitespace, and ``-W 200`` sets the width of the output (as otherwise it
+will use 130 by default, which is often a bit too little).
+
+The ``rev^-`` syntax is a handy shorthand for ``rev^..rev``, essentially
+giving you just the diff for that single commit; also see
+the official `git rev-parse documentation`_.
+
+.. _git rev-parse documentation: https://git-scm.com/docs/git-rev-parse#_other_rev_parent_shorthand_notations
+
+Again, note the inclusion of ``-W`` for ``git diff``; this ensures that
+you will see the full function for any function that has changed.
+
+One incredibly important thing that colordiff does is to highlight lines
+that are different. For example, if an error-handling ``goto`` has
+changed labels between the original and backported patch, colordiff will
+show these side-by-side but highlighted in a different color. Thus, it
+is easy to see that the two ``goto`` statements are jumping to different
+labels. Likewise, lines that were not modified by either patch but
+differ in the context will also be highlighted and thus stand out during
+a manual inspection.
+
+Of course, this is just a visual inspection; the real test is building
+and running the patched kernel (or program).
+
+Build testing
+-------------
+
+We won't cover runtime testing here, but it can be a good idea to build
+just the files touched by the patch as a quick sanity check. For the
+Linux kernel you can build single files like this, assuming you have the
+``.config`` and build environment set up correctly::
+
+ make path/to/file.o
+
+Note that this won't discover linker errors, so you should still do a
+full build after verifying that the single file compiles. By compiling
+the single file first you can avoid having to wait for a full build *in
+case* there are compiler errors in any of the files you've changed.
+
+Runtime testing
+---------------
+
+Even a successful build or boot test is not necessarily enough to rule
+out a missing dependency somewhere. Even though the chances are small,
+there could be code changes where two independent changes to the same
+file result in no conflicts, no compile-time errors, and runtime errors
+only in exceptional cases.
+
+One concrete example of this was a pair of patches to the system call
+entry code where the first patch saved/restored a register and a later
+patch made use of the same register somewhere in the middle of this
+sequence. Since there was no overlap between the changes, one could
+cherry-pick the second patch, have no conflicts, and believe that
+everything was fine, when in fact the code was now scribbling over an
+unsaved register.
+
+Although the vast majority of errors will be caught during compilation
+or by superficially exercising the code, the only way to *really* verify
+a backport is to review the final patch with the same level of scrutiny
+as you would (or should) give to any other patch. Having unit tests and
+regression tests or other types of automatic testing can help increase
+the confidence in the correctness of a backport.
+
+Examples
+========
+
+The above shows roughly the idealized process of backporting a patch.
+For a more concrete example, see this video tutorial where two patches
+are backported from mainline to stable:
+`Backporting Linux Kernel Patches`_.
+
+.. _Backporting Linux Kernel Patches: https://youtu.be/sBR7R1V2FeA
diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
index b501cd977053f..13e522752f880 100644
--- a/Documentation/process/index.rst
+++ b/Documentation/process/index.rst
@@ -66,6 +66,7 @@ lack of a better place.
:maxdepth: 1
applying-patches
+ backporting
adding-syscalls
magic-number
volatile-considered-harmful
--
2.25.1
The patch titled
Subject: x86/mm: drop 4MB restriction on minimal NUMA node size
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
x86-mm-drop-4mb-restriction-on-minimal-numa-node-size.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Mike Rapoport (IBM)" <rppt(a)kernel.org>
Subject: x86/mm: drop 4MB restriction on minimal NUMA node size
Date: Tue, 17 Oct 2023 09:22:15 +0300
Qi Zheng reports crashes in a production environment and provides a
simplified example as a reproducer:
For example, if we use qemu to start a two NUMA node kernel,
one of the nodes has 2M memory (less than NODE_MIN_SIZE),
and the other node has 2G, then we will encounter the
following panic:
[ 0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 0.150783] #PF: supervisor write access in kernel mode
[ 0.151488] #PF: error_code(0x0002) - not-present page
<...>
[ 0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
<...>
[ 0.169781] Call Trace:
[ 0.170159] <TASK>
[ 0.170448] deactivate_slab+0x187/0x3c0
[ 0.171031] ? bootstrap+0x1b/0x10e
[ 0.171559] ? preempt_count_sub+0x9/0xa0
[ 0.172145] ? kmem_cache_alloc+0x12c/0x440
[ 0.172735] ? bootstrap+0x1b/0x10e
[ 0.173236] bootstrap+0x6b/0x10e
[ 0.173720] kmem_cache_init+0x10a/0x188
[ 0.174240] start_kernel+0x415/0x6ac
[ 0.174738] secondary_startup_64_no_verify+0xe0/0xeb
[ 0.175417] </TASK>
[ 0.175713] Modules linked in:
[ 0.176117] CR2: 0000000000000000
The crashes happen because of inconsistency between nodemask that has
nodes with less than 4MB as memoryless and the actual memory fed into
core mm.
The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
empty node in SRAT parsing") that introduced minimal size of a NUMA node
does not explain why a node size cannot be less than 4MB and what boot
failures this restriction might fix.
Since then a lot has changed and core mm won't confuse badly about small
node sizes.
Drop the limitation for the minimal node size.
Link: https://lkml.kernel.org/r/20231017062215.171670-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Reported-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Closes: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.c…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Ziljstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/include/asm/numa.h | 7 -------
arch/x86/mm/numa.c | 7 -------
2 files changed, 14 deletions(-)
--- a/arch/x86/include/asm/numa.h~x86-mm-drop-4mb-restriction-on-minimal-numa-node-size
+++ a/arch/x86/include/asm/numa.h
@@ -12,13 +12,6 @@
#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-/*
- * Too small node sizes may confuse the VM badly. Usually they
- * result from BIOS bugs. So dont recognize nodes as standalone
- * NUMA entities that have less than this amount of RAM listed:
- */
-#define NODE_MIN_SIZE (4*1024*1024)
-
extern int numa_off;
/*
--- a/arch/x86/mm/numa.c~x86-mm-drop-4mb-restriction-on-minimal-numa-node-size
+++ a/arch/x86/mm/numa.c
@@ -601,13 +601,6 @@ static int __init numa_register_memblks(
if (start >= end)
continue;
- /*
- * Don't confuse VM with a node that doesn't have the
- * minimum amount of memory:
- */
- if (end && (end - start) < NODE_MIN_SIZE)
- continue;
-
alloc_node_data(nid);
}
_
Patches currently in -mm which might be from rppt(a)kernel.org are
x86-mm-drop-4mb-restriction-on-minimal-numa-node-size.patch
nios2-define-virtual-address-space-for-modules.patch
mm-introduce-execmem_text_alloc-and-execmem_free.patch
mm-execmem-arch-convert-simple-overrides-of-module_alloc-to-execmem.patch
mm-execmem-arch-convert-remaining-overrides-of-module_alloc-to-execmem.patch
modules-execmem-drop-module_alloc.patch
mm-execmem-introduce-execmem_data_alloc.patch
arm64-execmem-extend-execmem_params-for-generated-code-allocations.patch
riscv-extend-execmem_params-for-generated-code-allocations.patch
powerpc-extend-execmem_params-for-kprobes-allocations.patch
powerpc-extend-execmem_params-for-kprobes-allocations-fix.patch
arch-make-execmem-setup-available-regardless-of-config_modules.patch
x86-ftrace-enable-dynamic-ftrace-without-config_modules.patch
kprobes-remove-dependency-on-config_modules.patch
bpf-remove-config_bpf_jit-dependency-on-config_modules-of.patch
From: Ajay Singh <ajay.kathat(a)microchip.com>
Enabling KASAN and running some iperf tests raises some memory issues with
vmm_table:
BUG: KASAN: slab-out-of-bounds in wilc_wlan_handle_txq+0x6ac/0xdb4
Write of size 4 at addr c3a61540 by task wlan0-tx/95
KASAN detects that we are writing data beyond range allocated to vmm_table.
There is indeed a mismatch between the size passed to allocator in
wilc_wlan_init, and the range of possible indexes used later: allocation
size is missing a multiplication by sizeof(u32)
Fixes: 40b717bfcefa ("wifi: wilc1000: fix DMA on stack objects")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ajay Singh <ajay.kathat(a)microchip.com>
Signed-off-by: Alexis Lothoré <alexis.lothore(a)bootlin.com>
---
Changes in v2:
- keep dedicated dynamic allocation for vmm_table
- Link to v1: https://lore.kernel.org/r/20231013-wilc1000_tx_oops-v1-1-3761beb9524d@bootl…
---
drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.c b/drivers/net/wireless/microchip/wilc1000/wlan.c
index 58bbf50081e4..e4113f2dfadf 100644
--- a/drivers/net/wireless/microchip/wilc1000/wlan.c
+++ b/drivers/net/wireless/microchip/wilc1000/wlan.c
@@ -1492,7 +1492,7 @@ int wilc_wlan_init(struct net_device *dev)
}
if (!wilc->vmm_table)
- wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE, GFP_KERNEL);
+ wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE * sizeof(u32), GFP_KERNEL);
if (!wilc->vmm_table) {
ret = -ENOBUFS;
---
base-commit: ea12d85cbfd6b08fff40a4fefccc011b6cfadf8e
change-id: 20231012-wilc1000_tx_oops-58ce91ee3e93
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Fei has reported that KASAN triggers during apply_alternatives() on
a 5-level paging machine:
BUG: KASAN: out-of-bounds in rcu_is_watching()
Read of size 4 at addr ff110003ee6419a0 by task swapper/0/0
...
__asan_load4()
rcu_is_watching()
trace_hardirqs_on()
text_poke_early()
apply_alternatives()
...
On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
gets patched. It includes KASAN code, where KASAN_SHADOW_START depends on
__VIRTUAL_MASK_SHIFT, which is defined with cpu_feature_enabled().
KASAN gets confused when apply_alternatives() patches the
KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
static, by replacing __VIRTUAL_MASK_SHIFT with 56, works around the issue.
Fix it for real by disabling KASAN while the kernel is patching alternatives.
[ mingo: updated the changelog ]
Fixes: 6657fca06e3f ("x86/mm: Allow to boot without LA57 if CONFIG_X86_5LEVEL=y")
Reported-by: Fei Yang <fei.yang(a)intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20231012100424.1456-1-kirill.shutemov@linux.intel…
(cherry picked from commit d35652a5fc9944784f6f50a5c979518ff8dacf61)
---
arch/x86/kernel/alternative.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 517ee01503be..73be3931e4f0 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -403,6 +403,17 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
u8 insn_buff[MAX_PATCH_LEN];
DPRINTK(ALT, "alt table %px, -> %px", start, end);
+
+ /*
+ * In the case CONFIG_X86_5LEVEL=y, KASAN_SHADOW_START is defined using
+ * cpu_feature_enabled(X86_FEATURE_LA57) and is therefore patched here.
+ * During the process, KASAN becomes confused seeing partial LA57
+ * conversion and triggers a false-positive out-of-bound report.
+ *
+ * Disable KASAN until the patching is complete.
+ */
+ kasan_disable_current();
+
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite previously scanned alternative code.
@@ -452,6 +463,8 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
text_poke_early(instr, insn_buff, insn_buff_sz);
}
+
+ kasan_enable_current();
}
static inline bool is_jcc32(struct insn *insn)
--
2.25.1
[ Upstream commit 3971442870713de527684398416970cf025b4f89 ]
The ravb_stop() should call cancel_work_sync(). Otherwise,
ravb_tx_timeout_work() is possible to use the freed priv after
ravb_remove() was called like below:
CPU0 CPU1
ravb_tx_timeout()
ravb_remove()
unregister_netdev()
free_netdev(ndev)
// free priv
ravb_tx_timeout_work()
// use priv
unregister_netdev() will call .ndo_stop() so that ravb_stop() is
called. And, after phy_stop() is called, netif_carrier_off()
is also called. So that .ndo_tx_timeout() will not be called
after phy_stop().
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Reported-by: Zheng Wang <zyytlz.wz(a)163.com>
Closes: https://lore.kernel.org/netdev/20230725030026.1664873-1-zyytlz.wz@163.com/
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov(a)omp.ru>
Link: https://lore.kernel.org/r/20231005011201.14368-3-yoshihiro.shimoda.uh@renes…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # for 5.10.x and 5.4.x
---
drivers/net/ethernet/renesas/ravb_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index a59da6a11976..f218bacec001 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1706,6 +1706,8 @@ static int ravb_close(struct net_device *ndev)
of_phy_deregister_fixed_link(np);
}
+ cancel_work_sync(&priv->work);
+
if (priv->chip_id != RCAR_GEN2) {
free_irq(priv->tx_irqs[RAVB_NC], ndev);
free_irq(priv->rx_irqs[RAVB_NC], ndev);
--
2.25.1
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
If we can't find a free fence register to handle a fault in the GMADR
range just return VM_FAULT_NOPAGE without populating the PTE so that
userspace will retry the access and trigger another fault. Eventually
we should find a free fence and the fault will get properly handled.
A further improvement idea might be to reserve a fence (or one per CPU?)
for the express purpose of handling faults without having to retry. But
that would require some additional work.
Looks like this may have gotten broken originally by
commit 39965b376601 ("drm/i915: don't trash the gtt when running out of fences")
as that changed the errno to -EDEADLK which wasn't handle by the gtt
fault code either. But later in commit 2feeb52859fc ("drm/i915/gt: Fix
-EDEADLK handling regression") I changed it again to -ENOBUFS as -EDEADLK
was now getting used for the ww mutex dance. So this fix only makes
sense after that last commit.
Cc: stable(a)vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9479
Fixes: 2feeb52859fc ("drm/i915/gt: Fix -EDEADLK handling regression")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/gem/i915_gem_mman.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index aa4d842d4c5a..310654542b42 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -235,6 +235,7 @@ static vm_fault_t i915_error_to_vmf_fault(int err)
case 0:
case -EAGAIN:
case -ENOSPC: /* transient failure to evict? */
+ case -ENOBUFS: /* temporarily out of fences? */
case -ERESTARTSYS:
case -EINTR:
case -EBUSY:
--
2.41.0
From: Paolo Bonzini <pbonzini(a)redhat.com>
svm_recalc_instruction_intercepts() is always called at least once
before the vCPU is started, so the setting or clearing of the RDTSCP
intercept can be dropped from the TSC_AUX virtualization support.
Extracted from a patch by Tom Lendacky.
Cc: stable(a)vger.kernel.org
Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
(cherry picked from commit e8d93d5d93f85949e7299be289c6e7e1154b2f78)
Signed-off-by: Michael Roth <michael.roth(a)amd.com>
---
arch/x86/kvm/svm/sev.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index b9a0a939d59f..fa1fb81323b5 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3027,11 +3027,8 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
(guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP) ||
- guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDPID))) {
+ guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDPID)))
set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
- if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP))
- svm_clr_intercept(svm, INTERCEPT_RDTSCP);
- }
}
void sev_init_vmcb(struct vcpu_svm *svm)
--
2.25.1