This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l2-core: hold videodev_lock until dev reg, finishes
Author: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Date: Fri Feb 23 09:45:36 2024 +0100
After the new V4L2 device node was registered, some additional
initialization was done before the device node was marked as
'registered'. During the time between creating the device node
and marking it as 'registered' it was possible to open the
device node, which would return -ENODEV since the 'registered'
flag was not yet set.
Hold the videodev_lock mutex from just before the device node
is registered until the 'registered' flag is set. Since v4l2_open
will take the same lock, it will wait until this registration
process is finished. This resolves this race condition.
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Reviewed-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # for vi4.18 and up
drivers/media/v4l2-core/v4l2-dev.c | 3 +++
1 file changed, 3 insertions(+)
---
diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
index e39e9742fdb5..be2ba7ca5de2 100644
--- a/drivers/media/v4l2-core/v4l2-dev.c
+++ b/drivers/media/v4l2-core/v4l2-dev.c
@@ -1039,8 +1039,10 @@ int __video_register_device(struct video_device *vdev,
vdev->dev.devt = MKDEV(VIDEO_MAJOR, vdev->minor);
vdev->dev.parent = vdev->dev_parent;
dev_set_name(&vdev->dev, "%s%d", name_base, vdev->num);
+ mutex_lock(&videodev_lock);
ret = device_register(&vdev->dev);
if (ret < 0) {
+ mutex_unlock(&videodev_lock);
pr_err("%s: device_register failed\n", __func__);
goto cleanup;
}
@@ -1060,6 +1062,7 @@ int __video_register_device(struct video_device *vdev,
/* Part 6: Activate this minor. The char device can now be used. */
set_bit(V4L2_FL_REGISTERED, &vdev->flags);
+ mutex_unlock(&videodev_lock);
return 0;
Some device drivers support devices that enable them to annotate whether a
Rx skb refers to a packet that was processed by the MACsec offloading
functionality of the device. Logic in the Rx handling for MACsec offload
does not utilize this information to preemptively avoid forwarding to the
macsec netdev currently. Because of this, things like multicast messages
such as ARP requests are forwarded to the macsec netdev whether the message
received was MACsec encrypted or not. The goal of this patch series is to
improve the Rx handling for MACsec offload for devices capable of
annotating skbs received that were decrypted by the NIC offload for MACsec.
Here is a summary of the issue that occurs with the existing logic today.
* The current design of the MACsec offload handling path tries to use
"best guess" mechanisms for determining whether a packet associated
with the currently handled skb in the datapath was processed via HW
offload
* The best guess mechanism uses the following heuristic logic (in order of
precedence)
- Check if header destination MAC address matches MACsec netdev MAC
address -> forward to MACsec port
- Check if packet is multicast traffic -> forward to MACsec port
- MACsec security channel was able to be looked up from skb offload
context (mlx5 only) -> forward to MACsec port
* Problem: plaintext traffic can potentially solicit a MACsec encrypted
response from the offload device
- Core aspect of MACsec is that it identifies unauthorized LAN connections
and excludes them from communication
+ This behavior can be seen when not enabling offload for MACsec
- The offload behavior violates this principle in MACsec
I believe this behavior is a security bug since applications utilizing
MACsec could be exploited using this behavior, and the correct way to
resolve this is by having the hardware correctly indicate whether MACsec
offload occurred for the packet or not. In the patches in this series, I
leave a warning for when the problematic path occurs because I cannot
figure out a secure way to fix the security issue that applies to the core
MACsec offload handling in the Rx path without breaking MACsec offload for
other vendors.
Shown at the bottom is an example use case where plaintext traffic sent to
a physical port of a NIC configured for MACsec offload is unable to be
handled correctly by the software stack when the NIC provides awareness to
the kernel about whether the received packet is MACsec traffic or not. In
this specific example, plaintext ARP requests are being responded with
MACsec encrypted ARP replies (which leads to routing information being
unable to be built for the requester).
Side 1
ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.1/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 1 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:66
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip macsec add macsec0 rx sci 2 on
ip macsec add macsec0 rx sci 2 sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip address flush macsec0
ip address add 2.2.2.1/24 dev macsec0
ip link set dev macsec0 up
ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:88
ip address flush macsec_vlan
ip address add 3.3.3.1/24 dev macsec_vlan
ip link set dev macsec_vlan up
Side 2
ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.2/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 2 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:77
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip macsec add macsec0 rx sci 1 on
ip macsec add macsec0 rx sci 1 sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip address flush macsec0
ip address add 2.2.2.2/24 dev macsec0
ip link set dev macsec0 up
ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:99
ip address flush macsec_vlan
ip address add 3.3.3.2/24 dev macsec_vlan
ip link set dev macsec_vlan up
Side 1
ping -I mlx5_1 1.1.1.2
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 mlx5_1: 56(84) bytes of data.
From 1.1.1.1 icmp_seq=1 Destination Host Unreachable
ping: sendmsg: No route to host
From 1.1.1.1 icmp_seq=2 Destination Host Unreachable
From 1.1.1.1 icmp_seq=3 Destination Host Unreachable
Link: https://github.com/Binary-Eater/macsec-rx-offload/blob/trunk/MACsec_violati…
Link: https://lore.kernel.org/netdev/87r0l25y1c.fsf@nvidia.com/
Link: https://lore.kernel.org/netdev/20231116182900.46052-1-rrameshbabu@nvidia.co…
Cc: Sabrina Dubroca <sd(a)queasysnail.net>
Cc: stable(a)vger.kernel.org
Signed-off-by: Rahul Rameshbabu <rrameshbabu(a)nvidia.com>
---
Rahul Rameshbabu (3):
macsec: Enable devices to advertise whether they update sk_buff md_dst
during offloads
macsec: Detect if Rx skb is macsec-related for offloading devices that
update md_dst
net/mlx5e: Advertise mlx5 ethernet driver updates sk_buff md_dst for
MACsec
.../mellanox/mlx5/core/en_accel/macsec.c | 1 +
drivers/net/macsec.c | 57 ++++++++++++++++---
include/net/macsec.h | 2 +
3 files changed, 51 insertions(+), 9 deletions(-)
--
2.42.0
Syzkaller reports out of bounds on shift in ntfs_init_from_boot(). The problem
was fixed in upstream with patch 91a4b1ee78cb100b19b70f077c247f211110348f.
This can be fixed in branch 6.1 with the following patch.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Link: https://syzkaller.appspot.com/bug?extid=010986becd65dbf9464b
Konstantin Komarov (1):
fs/ntfs3: Fix shift-out-of-bounds in ntfs_fill_super
fs/ntfs3/ntfs_fs.h | 2 ++
fs/ntfs3/super.c | 50 +++++++++++++++++++++++++++++-----------------
2 files changed, 34 insertions(+), 18 deletions(-)
--
2.34.1
The cros_ec_uart_probe() function calls devm_serdev_device_open() before
it calls serdev_device_set_client_ops(). This can trigger a NULL pointer
dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
CPU: 5 PID: 103 Comm: kworker/u16:3 Not tainted 6.8.4-zen1-1-zen #1 4a88f2661038c2a3bb69aa70fb41a5735338823c
Hardware name: Google Morphius/Morphius, BIOS MrChromebox-4.22.2-1-g2a93624aebf 01/22/2024
Workqueue: events_unbound flush_to_ldisc
RIP: 0010:ttyport_receive_buf+0x3f/0xf0
...
Call Trace:
<TASK>
? __die+0x10f/0x120
? page_fault_oops+0x171/0x4e0
? srso_return_thunk+0x5/0x5f
? exc_page_fault+0x7f/0x180
? asm_exc_page_fault+0x26/0x30
? ttyport_receive_buf+0x3f/0xf0
flush_to_ldisc+0x9b/0x1c0
process_one_work+0x17b/0x340
worker_thread+0x301/0x490
? __pfx_worker_thread+0x10/0x10
kthread+0xe8/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
A simplified version of crashing code is as follows:
static inline size_t serdev_controller_receive_buf(struct serdev_controller *ctrl,
const u8 *data,
size_t count)
{
struct serdev_device *serdev = ctrl->serdev;
if (!serdev || !serdev->ops->receive_buf) // CRASH!
return 0;
return serdev->ops->receive_buf(serdev, data, count);
}
static size_t ttyport_receive_buf(struct tty_port *port, const u8 *cp,
const u8 *fp, size_t count)
{
struct serdev_controller *ctrl = port->client_data;
[...]
if (!test_bit(SERPORT_ACTIVE, &serport->flags))
return 0;
ret = serdev_controller_receive_buf(ctrl, cp, count);
[...]
return ret;
}
It assumes that if SERPORT_ACTIVE is set and serdev exists, serdev->ops
will also exist. This conflicts with the existing cros_ec_uart_probe()
logic, as it first calls devm_serdev_device_open() (which sets
SERPORT_ACTIVE), and only later sets serdev->ops via
serdev_device_set_client_ops().
Commit 01f95d42b8f4 ("platform/chrome: cros_ec_uart: fix race
condition") attempted to fix a similar race condition, but while doing
so, made the window of error for this race condition to happen much
wider.
Attempt to fix the race condition again, making sure we fully setup
before calling devm_serdev_device_open().
Fixes: 01f95d42b8f4 ("platform/chrome: cros_ec_uart: fix race condition")
Cc: stable(a)vger.kernel.org
Signed-off-by: Noah Loomans <noah(a)noahloomans.com>
---
This is my first time contributing to Linux, I hope this is a good
patch. Feedback on how to improve is welcome!
drivers/platform/chrome/cros_ec_uart.c | 28 +++++++++++++-------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/platform/chrome/cros_ec_uart.c b/drivers/platform/chrome/cros_ec_uart.c
index 8ea867c2a01a..62bc24f6dcc7 100644
--- a/drivers/platform/chrome/cros_ec_uart.c
+++ b/drivers/platform/chrome/cros_ec_uart.c
@@ -263,12 +263,6 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
if (!ec_dev)
return -ENOMEM;
- ret = devm_serdev_device_open(dev, serdev);
- if (ret) {
- dev_err(dev, "Unable to open UART device");
- return ret;
- }
-
serdev_device_set_drvdata(serdev, ec_dev);
init_waitqueue_head(&ec_uart->response.wait_queue);
@@ -280,14 +274,6 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
return ret;
}
- ret = serdev_device_set_baudrate(serdev, ec_uart->baudrate);
- if (ret < 0) {
- dev_err(dev, "Failed to set up host baud rate (%d)", ret);
- return ret;
- }
-
- serdev_device_set_flow_control(serdev, ec_uart->flowcontrol);
-
/* Initialize ec_dev for cros_ec */
ec_dev->phys_name = dev_name(dev);
ec_dev->dev = dev;
@@ -301,6 +287,20 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
serdev_device_set_client_ops(serdev, &cros_ec_uart_client_ops);
+ ret = devm_serdev_device_open(dev, serdev);
+ if (ret) {
+ dev_err(dev, "Unable to open UART device");
+ return ret;
+ }
+
+ ret = serdev_device_set_baudrate(serdev, ec_uart->baudrate);
+ if (ret < 0) {
+ dev_err(dev, "Failed to set up host baud rate (%d)", ret);
+ return ret;
+ }
+
+ serdev_device_set_flow_control(serdev, ec_uart->flowcontrol);
+
return cros_ec_register(ec_dev);
}
--
2.44.0
From: Wayne Lin <wayne.lin(a)amd.com>
[Why]
Like commit ec5fa9fcdeca ("drm/amd/display: Adjust the MST resume flow"), we
want to avoid handling mst topology changes before restoring the old state.
If we enable DP_UP_REQ_EN before calling drm_atomic_helper_resume(), have
changce to handle CSN event first and fire hotplug event before restoring the
cached state.
[How]
Disable mst branch sending up request event before we restoring the cached state.
DP_UP_REQ_EN will be set later when we call drm_dp_mst_topology_mgr_resume().
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Hersen Wu <hersenxs.wu(a)amd.com>
Signed-off-by: Wayne Lin <wayne.lin(a)amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 9d36dba914e9..961b5984afa0 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2429,7 +2429,6 @@ static void resume_mst_branch_status(struct drm_dp_mst_topology_mgr *mgr)
ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL,
DP_MST_EN |
- DP_UP_REQ_EN |
DP_UPSTREAM_IS_SRC);
if (ret < 0) {
drm_dbg_kms(mgr->dev, "mst write failed - undocked during suspend?\n");
--
2.37.3
This is a note to let you know that I've just added the patch titled
iio:imu: adis16475: Fix sync mode setting
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 74a72baf204fd509bbe8b53eec35e39869d94341 Mon Sep 17 00:00:00 2001
From: Ramona Gradinariu <ramona.bolboaca13(a)gmail.com>
Date: Fri, 5 Apr 2024 07:53:09 +0300
Subject: iio:imu: adis16475: Fix sync mode setting
Fix sync mode setting by applying the necessary shift bits.
Fixes: fff7352bf7a3 ("iio: imu: Add support for adis16475")
Signed-off-by: Ramona Gradinariu <ramona.bolboaca13(a)gmail.com>
Reviewed-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240405045309.816328-2-ramona.bolboaca13@gmail.c…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/adis16475.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/imu/adis16475.c b/drivers/iio/imu/adis16475.c
index 01f55cc902fa..060a21c70460 100644
--- a/drivers/iio/imu/adis16475.c
+++ b/drivers/iio/imu/adis16475.c
@@ -1289,6 +1289,7 @@ static int adis16475_config_sync_mode(struct adis16475 *st)
struct device *dev = &st->adis.spi->dev;
const struct adis16475_sync *sync;
u32 sync_mode;
+ u16 val;
/* default to internal clk */
st->clk_freq = st->info->int_clk * 1000;
@@ -1350,8 +1351,9 @@ static int adis16475_config_sync_mode(struct adis16475 *st)
* I'm keeping this for simplicity and avoiding extra variables
* in chip_info.
*/
+ val = ADIS16475_SYNC_MODE(sync->sync_mode);
ret = __adis_update_bits(&st->adis, ADIS16475_REG_MSG_CTRL,
- ADIS16475_SYNC_MODE_MASK, sync->sync_mode);
+ ADIS16475_SYNC_MODE_MASK, val);
if (ret)
return ret;
--
2.44.0
This is a note to let you know that I've just added the patch titled
iio: accel: mxc4005: Reset chip on probe() and resume()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 6b8cffdc4a31e4a72f75ecd1bc13fbf0dafee390 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 26 Mar 2024 12:37:00 +0100
Subject: iio: accel: mxc4005: Reset chip on probe() and resume()
On some designs the chip is not properly reset when powered up at boot or
after a suspend/resume cycle.
Use the sw-reset feature to ensure that the chip is in a clean state
after probe() / resume() and in the case of resume() restore the settings
(scale, trigger-enabled).
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218578
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20240326113700.56725-3-hdegoede@redhat.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/mxc4005.c | 68 +++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)
diff --git a/drivers/iio/accel/mxc4005.c b/drivers/iio/accel/mxc4005.c
index 111f4bcf24ad..63c3566a533b 100644
--- a/drivers/iio/accel/mxc4005.c
+++ b/drivers/iio/accel/mxc4005.c
@@ -5,6 +5,7 @@
* Copyright (c) 2014, Intel Corporation.
*/
+#include <linux/delay.h>
#include <linux/module.h>
#include <linux/i2c.h>
#include <linux/iio/iio.h>
@@ -36,6 +37,7 @@
#define MXC4005_REG_INT_CLR1 0x01
#define MXC4005_REG_INT_CLR1_BIT_DRDYC 0x01
+#define MXC4005_REG_INT_CLR1_SW_RST 0x10
#define MXC4005_REG_CONTROL 0x0D
#define MXC4005_REG_CONTROL_MASK_FSR GENMASK(6, 5)
@@ -43,6 +45,9 @@
#define MXC4005_REG_DEVICE_ID 0x0E
+/* Datasheet does not specify a reset time, this is a conservative guess */
+#define MXC4005_RESET_TIME_US 2000
+
enum mxc4005_axis {
AXIS_X,
AXIS_Y,
@@ -66,6 +71,8 @@ struct mxc4005_data {
s64 timestamp __aligned(8);
} scan;
bool trigger_enabled;
+ unsigned int control;
+ unsigned int int_mask1;
};
/*
@@ -349,6 +356,7 @@ static int mxc4005_set_trigger_state(struct iio_trigger *trig,
return ret;
}
+ data->int_mask1 = val;
data->trigger_enabled = state;
mutex_unlock(&data->mutex);
@@ -384,6 +392,13 @@ static int mxc4005_chip_init(struct mxc4005_data *data)
dev_dbg(data->dev, "MXC4005 chip id %02x\n", reg);
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_CLR1,
+ MXC4005_REG_INT_CLR1_SW_RST);
+ if (ret < 0)
+ return dev_err_probe(data->dev, ret, "resetting chip\n");
+
+ fsleep(MXC4005_RESET_TIME_US);
+
ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK0, 0);
if (ret < 0)
return dev_err_probe(data->dev, ret, "writing INT_MASK0\n");
@@ -479,6 +494,58 @@ static int mxc4005_probe(struct i2c_client *client)
return devm_iio_device_register(&client->dev, indio_dev);
}
+static int mxc4005_suspend(struct device *dev)
+{
+ struct iio_dev *indio_dev = dev_get_drvdata(dev);
+ struct mxc4005_data *data = iio_priv(indio_dev);
+ int ret;
+
+ /* Save control to restore it on resume */
+ ret = regmap_read(data->regmap, MXC4005_REG_CONTROL, &data->control);
+ if (ret < 0)
+ dev_err(data->dev, "failed to read reg_control\n");
+
+ return ret;
+}
+
+static int mxc4005_resume(struct device *dev)
+{
+ struct iio_dev *indio_dev = dev_get_drvdata(dev);
+ struct mxc4005_data *data = iio_priv(indio_dev);
+ int ret;
+
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_CLR1,
+ MXC4005_REG_INT_CLR1_SW_RST);
+ if (ret) {
+ dev_err(data->dev, "failed to reset chip: %d\n", ret);
+ return ret;
+ }
+
+ fsleep(MXC4005_RESET_TIME_US);
+
+ ret = regmap_write(data->regmap, MXC4005_REG_CONTROL, data->control);
+ if (ret) {
+ dev_err(data->dev, "failed to restore control register\n");
+ return ret;
+ }
+
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK0, 0);
+ if (ret) {
+ dev_err(data->dev, "failed to restore interrupt 0 mask\n");
+ return ret;
+ }
+
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1, data->int_mask1);
+ if (ret) {
+ dev_err(data->dev, "failed to restore interrupt 1 mask\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(mxc4005_pm_ops, mxc4005_suspend, mxc4005_resume);
+
static const struct acpi_device_id mxc4005_acpi_match[] = {
{"MXC4005", 0},
{"MXC6655", 0},
@@ -506,6 +573,7 @@ static struct i2c_driver mxc4005_driver = {
.name = MXC4005_DRV_NAME,
.acpi_match_table = mxc4005_acpi_match,
.of_match_table = mxc4005_of_match,
+ .pm = pm_sleep_ptr(&mxc4005_pm_ops),
},
.probe = mxc4005_probe,
.id_table = mxc4005_id,
--
2.44.0
This is a note to let you know that I've just added the patch titled
iio: accel: mxc4005: Interrupt handling fixes
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 57a1592784d622ecee0b71940c65429173996b33 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Tue, 26 Mar 2024 12:36:59 +0100
Subject: iio: accel: mxc4005: Interrupt handling fixes
There are 2 issues with interrupt handling in the mxc4005 driver:
1. mxc4005_set_trigger_state() writes MXC4005_REG_INT_MASK1_BIT_DRDYE
(0x01) to INT_MASK1 to enable the interrupt, but to disable the interrupt
it writes ~MXC4005_REG_INT_MASK1_BIT_DRDYE which is 0xfe, so it enables
all other interrupt sources in the INT_SRC1 register. On the MXC4005 this
is not an issue because only bit 0 of the register is used. On the MXC6655
OTOH this is a problem since bit7 is used as TC (Temperature Compensation)
disable bit and writing 1 to this disables Temperature Compensation which
should only be done when running self-tests on the chip.
Write 0 instead of ~MXC4005_REG_INT_MASK1_BIT_DRDYE to disable
the interrupts to fix this.
2. The datasheets for the MXC4005 / MXC6655 do not state what the reset
value for the INT_MASK0 and INT_MASK1 registers is and since these are
write only we also cannot learn this from the hw. Presumably the reset
value for both is all 0, which means all interrupts disabled.
Explicitly set both registers to 0 from mxc4005_chip_init() to ensure
both masks are actually set to 0.
Fixes: 79846e33aac1 ("iio: accel: mxc4005: add support for mxc6655")
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20240326113700.56725-2-hdegoede@redhat.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/mxc4005.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/accel/mxc4005.c b/drivers/iio/accel/mxc4005.c
index 61839be501c2..111f4bcf24ad 100644
--- a/drivers/iio/accel/mxc4005.c
+++ b/drivers/iio/accel/mxc4005.c
@@ -27,9 +27,13 @@
#define MXC4005_REG_ZOUT_UPPER 0x07
#define MXC4005_REG_ZOUT_LOWER 0x08
+#define MXC4005_REG_INT_MASK0 0x0A
+
#define MXC4005_REG_INT_MASK1 0x0B
#define MXC4005_REG_INT_MASK1_BIT_DRDYE 0x01
+#define MXC4005_REG_INT_CLR0 0x00
+
#define MXC4005_REG_INT_CLR1 0x01
#define MXC4005_REG_INT_CLR1_BIT_DRDYC 0x01
@@ -113,7 +117,9 @@ static bool mxc4005_is_readable_reg(struct device *dev, unsigned int reg)
static bool mxc4005_is_writeable_reg(struct device *dev, unsigned int reg)
{
switch (reg) {
+ case MXC4005_REG_INT_CLR0:
case MXC4005_REG_INT_CLR1:
+ case MXC4005_REG_INT_MASK0:
case MXC4005_REG_INT_MASK1:
case MXC4005_REG_CONTROL:
return true;
@@ -330,17 +336,13 @@ static int mxc4005_set_trigger_state(struct iio_trigger *trig,
{
struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
struct mxc4005_data *data = iio_priv(indio_dev);
+ unsigned int val;
int ret;
mutex_lock(&data->mutex);
- if (state) {
- ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1,
- MXC4005_REG_INT_MASK1_BIT_DRDYE);
- } else {
- ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1,
- ~MXC4005_REG_INT_MASK1_BIT_DRDYE);
- }
+ val = state ? MXC4005_REG_INT_MASK1_BIT_DRDYE : 0;
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1, val);
if (ret < 0) {
mutex_unlock(&data->mutex);
dev_err(data->dev, "failed to update reg_int_mask1");
@@ -382,6 +384,14 @@ static int mxc4005_chip_init(struct mxc4005_data *data)
dev_dbg(data->dev, "MXC4005 chip id %02x\n", reg);
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK0, 0);
+ if (ret < 0)
+ return dev_err_probe(data->dev, ret, "writing INT_MASK0\n");
+
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1, 0);
+ if (ret < 0)
+ return dev_err_probe(data->dev, ret, "writing INT_MASK1\n");
+
return 0;
}
--
2.44.0
This is a note to let you know that I've just added the patch titled
dt-bindings: iio: health: maxim,max30102: fix compatible check
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 89384a2b656b9dace4c965432a209d5c9c3a2a6f Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Sat, 16 Mar 2024 23:56:57 +0100
Subject: dt-bindings: iio: health: maxim,max30102: fix compatible check
The "maxim,green-led-current-microamp" property is only available for
the max30105 part (it provides an extra green LED), and must be set to
false for the max30102 part.
Instead, the max30100 part has been used for that, which is not
supported by this binding (it has its own binding).
This error was introduced during the txt to yaml conversion.
Fixes: 5a6a65b11e3a ("dt-bindings:iio:health:maxim,max30102: txt to yaml conversion")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Acked-by: Conor Dooley <conor.dooley(a)microchip.com>
Link: https://lore.kernel.org/r/20240316-max30102_binding_fix-v1-1-e8e58f69ef8a@g…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
.../devicetree/bindings/iio/health/maxim,max30102.yaml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/iio/health/maxim,max30102.yaml b/Documentation/devicetree/bindings/iio/health/maxim,max30102.yaml
index c13c10c8d65d..eed0df9d3a23 100644
--- a/Documentation/devicetree/bindings/iio/health/maxim,max30102.yaml
+++ b/Documentation/devicetree/bindings/iio/health/maxim,max30102.yaml
@@ -42,7 +42,7 @@ allOf:
properties:
compatible:
contains:
- const: maxim,max30100
+ const: maxim,max30102
then:
properties:
maxim,green-led-current-microamp: false
--
2.44.0
This is a note to let you know that I've just added the patch titled
iio: pressure: Fixes BME280 SPI driver data
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 546a4f4b5f4d930ea57f5510e109acf08eca5e87 Mon Sep 17 00:00:00 2001
From: Vasileios Amoiridis <vassilisamir(a)gmail.com>
Date: Sat, 16 Mar 2024 12:07:42 +0100
Subject: iio: pressure: Fixes BME280 SPI driver data
Use bme280_chip_info structure instead of bmp280_chip_info
in SPI support for the BME280 sensor.
Fixes: 0b0b772637cd ("iio: pressure: bmp280: Use chip_info pointers for each chip as driver data")
Signed-off-by: Vasileios Amoiridis <vassilisamir(a)gmail.com>
Link: https://lore.kernel.org/r/20240316110743.1998400-2-vassilisamir@gmail.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/pressure/bmp280-spi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/pressure/bmp280-spi.c b/drivers/iio/pressure/bmp280-spi.c
index a444d4b2978b..038d36aad3eb 100644
--- a/drivers/iio/pressure/bmp280-spi.c
+++ b/drivers/iio/pressure/bmp280-spi.c
@@ -127,7 +127,7 @@ static const struct of_device_id bmp280_of_spi_match[] = {
{ .compatible = "bosch,bmp180", .data = &bmp180_chip_info },
{ .compatible = "bosch,bmp181", .data = &bmp180_chip_info },
{ .compatible = "bosch,bmp280", .data = &bmp280_chip_info },
- { .compatible = "bosch,bme280", .data = &bmp280_chip_info },
+ { .compatible = "bosch,bme280", .data = &bme280_chip_info },
{ .compatible = "bosch,bmp380", .data = &bmp380_chip_info },
{ .compatible = "bosch,bmp580", .data = &bmp580_chip_info },
{ },
@@ -139,7 +139,7 @@ static const struct spi_device_id bmp280_spi_id[] = {
{ "bmp180", (kernel_ulong_t)&bmp180_chip_info },
{ "bmp181", (kernel_ulong_t)&bmp180_chip_info },
{ "bmp280", (kernel_ulong_t)&bmp280_chip_info },
- { "bme280", (kernel_ulong_t)&bmp280_chip_info },
+ { "bme280", (kernel_ulong_t)&bme280_chip_info },
{ "bmp380", (kernel_ulong_t)&bmp380_chip_info },
{ "bmp580", (kernel_ulong_t)&bmp580_chip_info },
{ }
--
2.44.0
This is a note to let you know that I've just added the patch titled
iio: pressure: Fixes SPI support for BMP3xx devices
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 5ca29ea4e4073b3caba750efe155b1bd4c597ca9 Mon Sep 17 00:00:00 2001
From: Vasileios Amoiridis <vassilisamir(a)gmail.com>
Date: Sat, 16 Mar 2024 12:07:43 +0100
Subject: iio: pressure: Fixes SPI support for BMP3xx devices
Bosch does not use unique BMPxxx_CHIP_ID for the different versions
of the device which leads to misidentification of devices if their
ID is used. Use a new value in the chip_info structure instead of
the BMPxxx_CHIP_ID, in order to choose the correct regmap_bus to
be used.
Fixes: a9dd9ba32311 ("iio: pressure: Fixes BMP38x and BMP390 SPI support")
Signed-off-by: Vasileios Amoiridis <vassilisamir(a)gmail.com>
Link: https://lore.kernel.org/r/20240316110743.1998400-3-vassilisamir@gmail.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/pressure/bmp280-core.c | 1 +
drivers/iio/pressure/bmp280-spi.c | 9 ++-------
drivers/iio/pressure/bmp280.h | 1 +
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/pressure/bmp280-core.c b/drivers/iio/pressure/bmp280-core.c
index fe8734468ed3..62e9e93d915d 100644
--- a/drivers/iio/pressure/bmp280-core.c
+++ b/drivers/iio/pressure/bmp280-core.c
@@ -1233,6 +1233,7 @@ const struct bmp280_chip_info bmp380_chip_info = {
.chip_id = bmp380_chip_ids,
.num_chip_id = ARRAY_SIZE(bmp380_chip_ids),
.regmap_config = &bmp380_regmap_config,
+ .spi_read_extra_byte = true,
.start_up_time = 2000,
.channels = bmp380_channels,
.num_channels = 2,
diff --git a/drivers/iio/pressure/bmp280-spi.c b/drivers/iio/pressure/bmp280-spi.c
index 038d36aad3eb..4e19ea0b4d39 100644
--- a/drivers/iio/pressure/bmp280-spi.c
+++ b/drivers/iio/pressure/bmp280-spi.c
@@ -96,15 +96,10 @@ static int bmp280_spi_probe(struct spi_device *spi)
chip_info = spi_get_device_match_data(spi);
- switch (chip_info->chip_id[0]) {
- case BMP380_CHIP_ID:
- case BMP390_CHIP_ID:
+ if (chip_info->spi_read_extra_byte)
bmp_regmap_bus = &bmp380_regmap_bus;
- break;
- default:
+ else
bmp_regmap_bus = &bmp280_regmap_bus;
- break;
- }
regmap = devm_regmap_init(&spi->dev,
bmp_regmap_bus,
diff --git a/drivers/iio/pressure/bmp280.h b/drivers/iio/pressure/bmp280.h
index 4012387d7956..5812a344ed8e 100644
--- a/drivers/iio/pressure/bmp280.h
+++ b/drivers/iio/pressure/bmp280.h
@@ -423,6 +423,7 @@ struct bmp280_chip_info {
int num_chip_id;
const struct regmap_config *regmap_config;
+ bool spi_read_extra_byte;
const struct iio_chan_spec *channels;
int num_channels;
--
2.44.0
We flush the rebind worker during the vm close phase, however in places
like preempt_fence_work_func() we seem to queue the rebind worker
without first checking if the vm has already been closed. The concern
here is the vm being closed with the worker flushed, but then being
rearmed later, which looks like potential uaf, since there is no actual
refcounting to track the queued worker. We can't take the vm->lock here
in preempt_rebind_work_func() to first check if the vm is closed since
that will deadlock, so instead flush the worker again when the vm
refcount reaches zero.
v2:
- Grabbing vm->lock in the preempt worker creates a deadlock, so
checking the closed state is tricky. Instead flush the worker when
the refcount reaches zero. It should be impossible to queue the
preempt worker without already holding vm ref.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_vm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 2ba7c920a8af..71de9848bdc2 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1509,6 +1509,9 @@ static void vm_destroy_work_func(struct work_struct *w)
/* xe_vm_close_and_put was not called? */
xe_assert(xe, !vm->size);
+ if (xe_vm_in_preempt_fence_mode(vm))
+ flush_work(&vm->preempt.rebind_work);
+
mutex_destroy(&vm->snap_mutex);
if (!(vm->flags & XE_VM_FLAG_MIGRATION))
--
2.44.0
From: Eric Biggers <ebiggers(a)google.com>
Since the signature self-test uses RSA and SHA-256, it must only be
enabled when those algorithms are enabled. Otherwise it fails and
panics the kernel on boot-up.
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202404221528.51d75177-lkp@intel.com
Fixes: 3cde3174eb91 ("certs: Add FIPS selftests")
Cc: stable(a)vger.kernel.org
Cc: Simo Sorce <simo(a)redhat.com>
Cc: David Howells <dhowells(a)redhat.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/Kconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/crypto/asymmetric_keys/Kconfig b/crypto/asymmetric_keys/Kconfig
index 59ec726b7c77..4abc58c55efa 100644
--- a/crypto/asymmetric_keys/Kconfig
+++ b/crypto/asymmetric_keys/Kconfig
@@ -83,7 +83,9 @@ config FIPS_SIGNATURE_SELFTEST
for FIPS.
depends on KEYS
depends on ASYMMETRIC_KEY_TYPE
depends on PKCS7_MESSAGE_PARSER=X509_CERTIFICATE_PARSER
depends on X509_CERTIFICATE_PARSER
+ depends on CRYPTO_RSA
+ depends on CRYPTO_SHA256
endif # ASYMMETRIC_KEY_TYPE
base-commit: ed30a4a51bb196781c8058073ea720133a65596f
--
2.44.0
From: Eric Biggers <ebiggers(a)google.com>
Make ASYMMETRIC_PUBLIC_KEY_SUBTYPE select CRYPTO_SIG to avoid build
errors like the following, which were possible with
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y && CONFIG_CRYPTO_SIG=n:
ld: vmlinux.o: in function `public_key_verify_signature':
(.text+0x306280): undefined reference to `crypto_alloc_sig'
ld: (.text+0x306300): undefined reference to `crypto_sig_set_pubkey'
ld: (.text+0x306324): undefined reference to `crypto_sig_verify'
ld: (.text+0x30636c): undefined reference to `crypto_sig_set_privkey'
Fixes: 63ba4d67594a ("KEYS: asymmetric: Use new crypto interface without scatterlists")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/crypto/asymmetric_keys/Kconfig b/crypto/asymmetric_keys/Kconfig
index 59ec726b7c77..3f089abd6fc9 100644
--- a/crypto/asymmetric_keys/Kconfig
+++ b/crypto/asymmetric_keys/Kconfig
@@ -13,10 +13,11 @@ if ASYMMETRIC_KEY_TYPE
config ASYMMETRIC_PUBLIC_KEY_SUBTYPE
tristate "Asymmetric public-key crypto algorithm subtype"
select MPILIB
select CRYPTO_HASH_INFO
select CRYPTO_AKCIPHER
+ select CRYPTO_SIG
select CRYPTO_HASH
help
This option provides support for asymmetric public key type handling.
If signature generation and/or verification are to be used,
appropriate hash algorithms (such as SHA-1) must be available.
base-commit: ed30a4a51bb196781c8058073ea720133a65596f
--
2.44.0
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1e560864159d002b453da42bd2c13a1805515a20
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021334-each-residence-41ce@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1e560864159d ("PCI/ASPM: Fix deadlock when enabling ASPM")
ac865f00af29 ("Merge tag 'pci-v6.7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e560864159d002b453da42bd2c13a1805515a20 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Tue, 30 Jan 2024 11:02:43 +0100
Subject: [PATCH] PCI/ASPM: Fix deadlock when enabling ASPM
A last minute revert in 6.7-final introduced a potential deadlock when
enabling ASPM during probe of Qualcomm PCIe controllers as reported by
lockdep:
============================================
WARNING: possible recursive locking detected
6.7.0 #40 Not tainted
--------------------------------------------
kworker/u16:5/90 is trying to acquire lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc
but task is already holding lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(pci_bus_sem);
lock(pci_bus_sem);
*** DEADLOCK ***
Call trace:
print_deadlock_bug+0x25c/0x348
__lock_acquire+0x10a4/0x2064
lock_acquire+0x1e8/0x318
down_read+0x60/0x184
pcie_aspm_pm_state_change+0x58/0xdc
pci_set_full_power_state+0xa8/0x114
pci_set_power_state+0xc4/0x120
qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom]
pci_walk_bus+0x64/0xbc
qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom]
The deadlock can easily be reproduced on machines like the Lenovo ThinkPad
X13s by adding a delay to increase the race window during asynchronous
probe where another thread can take a write lock.
Add a new pci_set_power_state_locked() and associated helper functions that
can be called with the PCI bus semaphore held to avoid taking the read lock
twice.
Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com
Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org
Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: <stable(a)vger.kernel.org> # 6.7
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 9c2137dae429..826b5016a101 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -386,21 +386,8 @@ void pci_bus_add_devices(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_add_devices);
-/** pci_walk_bus - walk devices on/under bus, calling callback.
- * @top bus whose devices should be walked
- * @cb callback to be called for each device found
- * @userdata arbitrary pointer to be passed to callback.
- *
- * Walk the given bus, including any bridged devices
- * on buses under this bus. Call the provided callback
- * on each device found.
- *
- * We check the return of @cb each time. If it returns anything
- * other than 0, we break out.
- *
- */
-void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
- void *userdata)
+static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
+ void *userdata, bool locked)
{
struct pci_dev *dev;
struct pci_bus *bus;
@@ -408,7 +395,8 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
int retval;
bus = top;
- down_read(&pci_bus_sem);
+ if (!locked)
+ down_read(&pci_bus_sem);
next = top->devices.next;
for (;;) {
if (next == &bus->devices) {
@@ -431,10 +419,37 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
if (retval)
break;
}
- up_read(&pci_bus_sem);
+ if (!locked)
+ up_read(&pci_bus_sem);
+}
+
+/**
+ * pci_walk_bus - walk devices on/under bus, calling callback.
+ * @top: bus whose devices should be walked
+ * @cb: callback to be called for each device found
+ * @userdata: arbitrary pointer to be passed to callback
+ *
+ * Walk the given bus, including any bridged devices
+ * on buses under this bus. Call the provided callback
+ * on each device found.
+ *
+ * We check the return of @cb each time. If it returns anything
+ * other than 0, we break out.
+ */
+void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+ __pci_walk_bus(top, cb, userdata, false);
}
EXPORT_SYMBOL_GPL(pci_walk_bus);
+void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+ lockdep_assert_held(&pci_bus_sem);
+
+ __pci_walk_bus(top, cb, userdata, true);
+}
+EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
+
struct pci_bus *pci_bus_get(struct pci_bus *bus)
{
if (bus)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 10f2d0bb86be..2ce2a3bd932b 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -972,7 +972,7 @@ static int qcom_pcie_enable_aspm(struct pci_dev *pdev, void *userdata)
* Downstream devices need to be in D0 state before enabling PCI PM
* substates.
*/
- pci_set_power_state(pdev, PCI_D0);
+ pci_set_power_state_locked(pdev, PCI_D0);
pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
return 0;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d8f11a078924..9ab9b1008d8b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1354,6 +1354,7 @@ int pci_power_up(struct pci_dev *dev)
/**
* pci_set_full_power_state - Put a PCI device into D0 and update its state
* @dev: PCI device to power up
+ * @locked: whether pci_bus_sem is held
*
* Call pci_power_up() to put @dev into D0, read from its PCI_PM_CTRL register
* to confirm the state change, restore its BARs if they might be lost and
@@ -1363,7 +1364,7 @@ int pci_power_up(struct pci_dev *dev)
* to D0, it is more efficient to use pci_power_up() directly instead of this
* function.
*/
-static int pci_set_full_power_state(struct pci_dev *dev)
+static int pci_set_full_power_state(struct pci_dev *dev, bool locked)
{
u16 pmcsr;
int ret;
@@ -1399,7 +1400,7 @@ static int pci_set_full_power_state(struct pci_dev *dev)
}
if (dev->bus->self)
- pcie_aspm_pm_state_change(dev->bus->self);
+ pcie_aspm_pm_state_change(dev->bus->self, locked);
return 0;
}
@@ -1428,10 +1429,22 @@ void pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
pci_walk_bus(bus, __pci_dev_set_current_state, &state);
}
+static void __pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state, bool locked)
+{
+ if (!bus)
+ return;
+
+ if (locked)
+ pci_walk_bus_locked(bus, __pci_dev_set_current_state, &state);
+ else
+ pci_walk_bus(bus, __pci_dev_set_current_state, &state);
+}
+
/**
* pci_set_low_power_state - Put a PCI device into a low-power state.
* @dev: PCI device to handle.
* @state: PCI power state (D1, D2, D3hot) to put the device into.
+ * @locked: whether pci_bus_sem is held
*
* Use the device's PCI_PM_CTRL register to put it into a low-power state.
*
@@ -1442,7 +1455,7 @@ void pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
* 0 if device already is in the requested state.
* 0 if device's power state has been successfully changed.
*/
-static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state)
+static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool locked)
{
u16 pmcsr;
@@ -1496,29 +1509,12 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state)
pci_power_name(state));
if (dev->bus->self)
- pcie_aspm_pm_state_change(dev->bus->self);
+ pcie_aspm_pm_state_change(dev->bus->self, locked);
return 0;
}
-/**
- * pci_set_power_state - Set the power state of a PCI device
- * @dev: PCI device to handle.
- * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
- *
- * Transition a device to a new power state, using the platform firmware and/or
- * the device's PCI PM registers.
- *
- * RETURN VALUE:
- * -EINVAL if the requested state is invalid.
- * -EIO if device does not support PCI PM or its PM capabilities register has a
- * wrong version, or device doesn't support the requested state.
- * 0 if the transition is to D1 or D2 but D1 and D2 are not supported.
- * 0 if device already is in the requested state.
- * 0 if the transition is to D3 but D3 is not supported.
- * 0 if device's power state has been successfully changed.
- */
-int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+static int __pci_set_power_state(struct pci_dev *dev, pci_power_t state, bool locked)
{
int error;
@@ -1542,7 +1538,7 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
return 0;
if (state == PCI_D0)
- return pci_set_full_power_state(dev);
+ return pci_set_full_power_state(dev, locked);
/*
* This device is quirked not to be put into D3, so don't put it in
@@ -1556,16 +1552,16 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
* To put the device in D3cold, put it into D3hot in the native
* way, then put it into D3cold using platform ops.
*/
- error = pci_set_low_power_state(dev, PCI_D3hot);
+ error = pci_set_low_power_state(dev, PCI_D3hot, locked);
if (pci_platform_power_transition(dev, PCI_D3cold))
return error;
/* Powering off a bridge may power off the whole hierarchy */
if (dev->current_state == PCI_D3cold)
- pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
+ __pci_bus_set_current_state(dev->subordinate, PCI_D3cold, locked);
} else {
- error = pci_set_low_power_state(dev, state);
+ error = pci_set_low_power_state(dev, state, locked);
if (pci_platform_power_transition(dev, state))
return error;
@@ -1573,8 +1569,38 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
return 0;
}
+
+/**
+ * pci_set_power_state - Set the power state of a PCI device
+ * @dev: PCI device to handle.
+ * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
+ *
+ * Transition a device to a new power state, using the platform firmware and/or
+ * the device's PCI PM registers.
+ *
+ * RETURN VALUE:
+ * -EINVAL if the requested state is invalid.
+ * -EIO if device does not support PCI PM or its PM capabilities register has a
+ * wrong version, or device doesn't support the requested state.
+ * 0 if the transition is to D1 or D2 but D1 and D2 are not supported.
+ * 0 if device already is in the requested state.
+ * 0 if the transition is to D3 but D3 is not supported.
+ * 0 if device's power state has been successfully changed.
+ */
+int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+{
+ return __pci_set_power_state(dev, state, false);
+}
EXPORT_SYMBOL(pci_set_power_state);
+int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state)
+{
+ lockdep_assert_held(&pci_bus_sem);
+
+ return __pci_set_power_state(dev, state, true);
+}
+EXPORT_SYMBOL(pci_set_power_state_locked);
+
#define PCI_EXP_SAVE_REGS 7
static struct pci_cap_saved_state *_pci_find_saved_cap(struct pci_dev *pci_dev,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2336a8d1edab..e9750b1b19ba 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -571,12 +571,12 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
#ifdef CONFIG_PCIEASPM
void pcie_aspm_init_link_state(struct pci_dev *pdev);
void pcie_aspm_exit_link_state(struct pci_dev *pdev);
-void pcie_aspm_pm_state_change(struct pci_dev *pdev);
+void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
#else
static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
-static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
+static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { }
static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
#endif
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 5a0066ecc3c5..bc0bd86695ec 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1003,8 +1003,11 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
up_read(&pci_bus_sem);
}
-/* @pdev: the root port or switch downstream port */
-void pcie_aspm_pm_state_change(struct pci_dev *pdev)
+/*
+ * @pdev: the root port or switch downstream port
+ * @locked: whether pci_bus_sem is held
+ */
+void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked)
{
struct pcie_link_state *link = pdev->link_state;
@@ -1014,12 +1017,14 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev)
* Devices changed PM state, we should recheck if latency
* meets all functions' requirement
*/
- down_read(&pci_bus_sem);
+ if (!locked)
+ down_read(&pci_bus_sem);
mutex_lock(&aspm_lock);
pcie_update_aspm_capable(link->root);
pcie_config_aspm_path(link);
mutex_unlock(&aspm_lock);
- up_read(&pci_bus_sem);
+ if (!locked)
+ up_read(&pci_bus_sem);
}
void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index add9368e6314..7ab0d13672da 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1422,6 +1422,7 @@ int pci_load_and_free_saved_state(struct pci_dev *dev,
struct pci_saved_state **state);
int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state);
int pci_set_power_state(struct pci_dev *dev, pci_power_t state);
+int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state);
pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state);
bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
void pci_pme_active(struct pci_dev *dev, bool enable);
@@ -1625,6 +1626,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
void *userdata);
+void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
+ void *userdata);
int pci_cfg_space_size(struct pci_dev *dev);
unsigned char pci_bus_max_busnr(struct pci_bus *bus);
void pci_setup_bridge(struct pci_bus *bus);
@@ -2025,6 +2028,8 @@ static inline int pci_save_state(struct pci_dev *dev) { return 0; }
static inline void pci_restore_state(struct pci_dev *dev) { }
static inline int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
{ return 0; }
+static inline int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state)
+{ return 0; }
static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
{ return 0; }
static inline pci_power_t pci_choose_state(struct pci_dev *dev,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1e560864159d002b453da42bd2c13a1805515a20
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021328-stylus-ooze-f752@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
1e560864159d ("PCI/ASPM: Fix deadlock when enabling ASPM")
ac865f00af29 ("Merge tag 'pci-v6.7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e560864159d002b453da42bd2c13a1805515a20 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Tue, 30 Jan 2024 11:02:43 +0100
Subject: [PATCH] PCI/ASPM: Fix deadlock when enabling ASPM
A last minute revert in 6.7-final introduced a potential deadlock when
enabling ASPM during probe of Qualcomm PCIe controllers as reported by
lockdep:
============================================
WARNING: possible recursive locking detected
6.7.0 #40 Not tainted
--------------------------------------------
kworker/u16:5/90 is trying to acquire lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc
but task is already holding lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(pci_bus_sem);
lock(pci_bus_sem);
*** DEADLOCK ***
Call trace:
print_deadlock_bug+0x25c/0x348
__lock_acquire+0x10a4/0x2064
lock_acquire+0x1e8/0x318
down_read+0x60/0x184
pcie_aspm_pm_state_change+0x58/0xdc
pci_set_full_power_state+0xa8/0x114
pci_set_power_state+0xc4/0x120
qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom]
pci_walk_bus+0x64/0xbc
qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom]
The deadlock can easily be reproduced on machines like the Lenovo ThinkPad
X13s by adding a delay to increase the race window during asynchronous
probe where another thread can take a write lock.
Add a new pci_set_power_state_locked() and associated helper functions that
can be called with the PCI bus semaphore held to avoid taking the read lock
twice.
Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com
Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org
Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: <stable(a)vger.kernel.org> # 6.7
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 9c2137dae429..826b5016a101 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -386,21 +386,8 @@ void pci_bus_add_devices(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_add_devices);
-/** pci_walk_bus - walk devices on/under bus, calling callback.
- * @top bus whose devices should be walked
- * @cb callback to be called for each device found
- * @userdata arbitrary pointer to be passed to callback.
- *
- * Walk the given bus, including any bridged devices
- * on buses under this bus. Call the provided callback
- * on each device found.
- *
- * We check the return of @cb each time. If it returns anything
- * other than 0, we break out.
- *
- */
-void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
- void *userdata)
+static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
+ void *userdata, bool locked)
{
struct pci_dev *dev;
struct pci_bus *bus;
@@ -408,7 +395,8 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
int retval;
bus = top;
- down_read(&pci_bus_sem);
+ if (!locked)
+ down_read(&pci_bus_sem);
next = top->devices.next;
for (;;) {
if (next == &bus->devices) {
@@ -431,10 +419,37 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
if (retval)
break;
}
- up_read(&pci_bus_sem);
+ if (!locked)
+ up_read(&pci_bus_sem);
+}
+
+/**
+ * pci_walk_bus - walk devices on/under bus, calling callback.
+ * @top: bus whose devices should be walked
+ * @cb: callback to be called for each device found
+ * @userdata: arbitrary pointer to be passed to callback
+ *
+ * Walk the given bus, including any bridged devices
+ * on buses under this bus. Call the provided callback
+ * on each device found.
+ *
+ * We check the return of @cb each time. If it returns anything
+ * other than 0, we break out.
+ */
+void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+ __pci_walk_bus(top, cb, userdata, false);
}
EXPORT_SYMBOL_GPL(pci_walk_bus);
+void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+ lockdep_assert_held(&pci_bus_sem);
+
+ __pci_walk_bus(top, cb, userdata, true);
+}
+EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
+
struct pci_bus *pci_bus_get(struct pci_bus *bus)
{
if (bus)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 10f2d0bb86be..2ce2a3bd932b 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -972,7 +972,7 @@ static int qcom_pcie_enable_aspm(struct pci_dev *pdev, void *userdata)
* Downstream devices need to be in D0 state before enabling PCI PM
* substates.
*/
- pci_set_power_state(pdev, PCI_D0);
+ pci_set_power_state_locked(pdev, PCI_D0);
pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
return 0;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d8f11a078924..9ab9b1008d8b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1354,6 +1354,7 @@ int pci_power_up(struct pci_dev *dev)
/**
* pci_set_full_power_state - Put a PCI device into D0 and update its state
* @dev: PCI device to power up
+ * @locked: whether pci_bus_sem is held
*
* Call pci_power_up() to put @dev into D0, read from its PCI_PM_CTRL register
* to confirm the state change, restore its BARs if they might be lost and
@@ -1363,7 +1364,7 @@ int pci_power_up(struct pci_dev *dev)
* to D0, it is more efficient to use pci_power_up() directly instead of this
* function.
*/
-static int pci_set_full_power_state(struct pci_dev *dev)
+static int pci_set_full_power_state(struct pci_dev *dev, bool locked)
{
u16 pmcsr;
int ret;
@@ -1399,7 +1400,7 @@ static int pci_set_full_power_state(struct pci_dev *dev)
}
if (dev->bus->self)
- pcie_aspm_pm_state_change(dev->bus->self);
+ pcie_aspm_pm_state_change(dev->bus->self, locked);
return 0;
}
@@ -1428,10 +1429,22 @@ void pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
pci_walk_bus(bus, __pci_dev_set_current_state, &state);
}
+static void __pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state, bool locked)
+{
+ if (!bus)
+ return;
+
+ if (locked)
+ pci_walk_bus_locked(bus, __pci_dev_set_current_state, &state);
+ else
+ pci_walk_bus(bus, __pci_dev_set_current_state, &state);
+}
+
/**
* pci_set_low_power_state - Put a PCI device into a low-power state.
* @dev: PCI device to handle.
* @state: PCI power state (D1, D2, D3hot) to put the device into.
+ * @locked: whether pci_bus_sem is held
*
* Use the device's PCI_PM_CTRL register to put it into a low-power state.
*
@@ -1442,7 +1455,7 @@ void pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
* 0 if device already is in the requested state.
* 0 if device's power state has been successfully changed.
*/
-static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state)
+static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool locked)
{
u16 pmcsr;
@@ -1496,29 +1509,12 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state)
pci_power_name(state));
if (dev->bus->self)
- pcie_aspm_pm_state_change(dev->bus->self);
+ pcie_aspm_pm_state_change(dev->bus->self, locked);
return 0;
}
-/**
- * pci_set_power_state - Set the power state of a PCI device
- * @dev: PCI device to handle.
- * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
- *
- * Transition a device to a new power state, using the platform firmware and/or
- * the device's PCI PM registers.
- *
- * RETURN VALUE:
- * -EINVAL if the requested state is invalid.
- * -EIO if device does not support PCI PM or its PM capabilities register has a
- * wrong version, or device doesn't support the requested state.
- * 0 if the transition is to D1 or D2 but D1 and D2 are not supported.
- * 0 if device already is in the requested state.
- * 0 if the transition is to D3 but D3 is not supported.
- * 0 if device's power state has been successfully changed.
- */
-int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+static int __pci_set_power_state(struct pci_dev *dev, pci_power_t state, bool locked)
{
int error;
@@ -1542,7 +1538,7 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
return 0;
if (state == PCI_D0)
- return pci_set_full_power_state(dev);
+ return pci_set_full_power_state(dev, locked);
/*
* This device is quirked not to be put into D3, so don't put it in
@@ -1556,16 +1552,16 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
* To put the device in D3cold, put it into D3hot in the native
* way, then put it into D3cold using platform ops.
*/
- error = pci_set_low_power_state(dev, PCI_D3hot);
+ error = pci_set_low_power_state(dev, PCI_D3hot, locked);
if (pci_platform_power_transition(dev, PCI_D3cold))
return error;
/* Powering off a bridge may power off the whole hierarchy */
if (dev->current_state == PCI_D3cold)
- pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
+ __pci_bus_set_current_state(dev->subordinate, PCI_D3cold, locked);
} else {
- error = pci_set_low_power_state(dev, state);
+ error = pci_set_low_power_state(dev, state, locked);
if (pci_platform_power_transition(dev, state))
return error;
@@ -1573,8 +1569,38 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
return 0;
}
+
+/**
+ * pci_set_power_state - Set the power state of a PCI device
+ * @dev: PCI device to handle.
+ * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
+ *
+ * Transition a device to a new power state, using the platform firmware and/or
+ * the device's PCI PM registers.
+ *
+ * RETURN VALUE:
+ * -EINVAL if the requested state is invalid.
+ * -EIO if device does not support PCI PM or its PM capabilities register has a
+ * wrong version, or device doesn't support the requested state.
+ * 0 if the transition is to D1 or D2 but D1 and D2 are not supported.
+ * 0 if device already is in the requested state.
+ * 0 if the transition is to D3 but D3 is not supported.
+ * 0 if device's power state has been successfully changed.
+ */
+int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+{
+ return __pci_set_power_state(dev, state, false);
+}
EXPORT_SYMBOL(pci_set_power_state);
+int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state)
+{
+ lockdep_assert_held(&pci_bus_sem);
+
+ return __pci_set_power_state(dev, state, true);
+}
+EXPORT_SYMBOL(pci_set_power_state_locked);
+
#define PCI_EXP_SAVE_REGS 7
static struct pci_cap_saved_state *_pci_find_saved_cap(struct pci_dev *pci_dev,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2336a8d1edab..e9750b1b19ba 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -571,12 +571,12 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
#ifdef CONFIG_PCIEASPM
void pcie_aspm_init_link_state(struct pci_dev *pdev);
void pcie_aspm_exit_link_state(struct pci_dev *pdev);
-void pcie_aspm_pm_state_change(struct pci_dev *pdev);
+void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
#else
static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
-static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
+static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { }
static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
#endif
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 5a0066ecc3c5..bc0bd86695ec 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1003,8 +1003,11 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
up_read(&pci_bus_sem);
}
-/* @pdev: the root port or switch downstream port */
-void pcie_aspm_pm_state_change(struct pci_dev *pdev)
+/*
+ * @pdev: the root port or switch downstream port
+ * @locked: whether pci_bus_sem is held
+ */
+void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked)
{
struct pcie_link_state *link = pdev->link_state;
@@ -1014,12 +1017,14 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev)
* Devices changed PM state, we should recheck if latency
* meets all functions' requirement
*/
- down_read(&pci_bus_sem);
+ if (!locked)
+ down_read(&pci_bus_sem);
mutex_lock(&aspm_lock);
pcie_update_aspm_capable(link->root);
pcie_config_aspm_path(link);
mutex_unlock(&aspm_lock);
- up_read(&pci_bus_sem);
+ if (!locked)
+ up_read(&pci_bus_sem);
}
void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index add9368e6314..7ab0d13672da 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1422,6 +1422,7 @@ int pci_load_and_free_saved_state(struct pci_dev *dev,
struct pci_saved_state **state);
int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state);
int pci_set_power_state(struct pci_dev *dev, pci_power_t state);
+int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state);
pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state);
bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
void pci_pme_active(struct pci_dev *dev, bool enable);
@@ -1625,6 +1626,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
void *userdata);
+void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
+ void *userdata);
int pci_cfg_space_size(struct pci_dev *dev);
unsigned char pci_bus_max_busnr(struct pci_bus *bus);
void pci_setup_bridge(struct pci_bus *bus);
@@ -2025,6 +2028,8 @@ static inline int pci_save_state(struct pci_dev *dev) { return 0; }
static inline void pci_restore_state(struct pci_dev *dev) { }
static inline int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
{ return 0; }
+static inline int pci_set_power_state_locked(struct pci_dev *dev, pci_power_t state)
+{ return 0; }
static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
{ return 0; }
static inline pci_power_t pci_choose_state(struct pci_dev *dev,
Hi,
We've got a collection of bug reports about how if a Thunderbolt device
is connected at bootup it doesn't behave the same as if it were
hotplugged by the user after bootup. Issues range from non-functional
devices or display devices working at lower performance.
All of the issues stem from a pre-OS firmware initializing the USB4
controller and the OS re-using those tunnels. This has been fixed in
6.9-rc1 by resetting the controller to a fresh state and discarding
whatever firmware has done. I'd like to bring it back to 6.6.y LTS and
6.8.y stable.
01da6b99d49f6 thunderbolt: Introduce tb_port_reset()
b35c1d7b11da8 thunderbolt: Introduce tb_path_deactivate_hop()
ec8162b3f0683 thunderbolt: Make tb_switch_reset() support Thunderbolt 2,
3 and USB4 routers
9a54c5f3dbde thunderbolt: Reset topology created by the boot firmware
Thanks!
From: Charles Keepax <ckeepax(a)opensource.cirrus.com>
[ Upstream commit 6588732445ff19f6183f0fa72ddedf67e5a5be32 ]
MIPS appears to define a RST symbol at a high level, which clashes
with some register naming in the driver. Since there is currently
no case for running this driver on MIPS devices simply cut off the
build of this driver on MIPS.
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311071303.JJMAOjy4-lkp@intel.com/
Suggested-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Link: https://lore.kernel.org/r/20231115162853.1891940-1-ckeepax@opensource.cirru…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pinctrl/cirrus/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pinctrl/cirrus/Kconfig b/drivers/pinctrl/cirrus/Kconfig
index 530426a74f751..b3cea8d56c4f6 100644
--- a/drivers/pinctrl/cirrus/Kconfig
+++ b/drivers/pinctrl/cirrus/Kconfig
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0-only
config PINCTRL_LOCHNAGAR
tristate "Cirrus Logic Lochnagar pinctrl driver"
- depends on MFD_LOCHNAGAR
+ # Avoid clash caused by MIPS defining RST, which is used in the driver
+ depends on MFD_LOCHNAGAR && !MIPS
select GPIOLIB
select PINMUX
select PINCONF
--
2.42.0
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x c0205eaf3af9f5db14d4b5ee4abacf4a583c3c50
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042334-agenda-nutlike-cb77@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
c0205eaf3af9 ("userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE")
eb1521dad8f3 ("userfaultfd: handle zeropage moves by UFFDIO_MOVE")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0205eaf3af9f5db14d4b5ee4abacf4a583c3c50 Mon Sep 17 00:00:00 2001
From: Lokesh Gidra <lokeshgidra(a)google.com>
Date: Thu, 4 Apr 2024 10:17:26 -0700
Subject: [PATCH] userfaultfd: change src_folio after ensuring it's unpinned in
UFFDIO_MOVE
Commit d7a08838ab74 ("mm: userfaultfd: fix unexpected change to src_folio
when UFFDIO_MOVE fails") moved the src_folio->{mapping, index} changing to
after clearing the page-table and ensuring that it's not pinned. This
avoids failure of swapout+migration and possibly memory corruption.
However, the commit missed fixing it in the huge-page case.
Link: https://lkml.kernel.org/r/20240404171726.2302435-1-lokeshgidra@google.com
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Lokesh Gidra <lokeshgidra(a)google.com>
Cc: Nicolas Geoffray <ngeoffray(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Qi Zheng <zhengqi.arch(a)bytedance.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9859aa4f7553..89f58c7603b2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2259,9 +2259,6 @@ int move_pages_huge_pmd(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, pm
goto unlock_ptls;
}
- folio_move_anon_rmap(src_folio, dst_vma);
- WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
-
src_pmdval = pmdp_huge_clear_flush(src_vma, src_addr, src_pmd);
/* Folio got pinned from under us. Put it back and fail the move. */
if (folio_maybe_dma_pinned(src_folio)) {
@@ -2270,6 +2267,9 @@ int move_pages_huge_pmd(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, pm
goto unlock_ptls;
}
+ folio_move_anon_rmap(src_folio, dst_vma);
+ WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
+
_dst_pmd = mk_huge_pmd(&src_folio->page, dst_vma->vm_page_prot);
/* Follow mremap() behavior and treat the entry dirty after the move */
_dst_pmd = pmd_mkwrite(pmd_mkdirty(_dst_pmd), dst_vma);
Some device drivers support devices that enable them to annotate whether a
Rx skb refers to a packet that was processed by the MACsec offloading
functionality of the device. Logic in the Rx handling for MACsec offload
does not utilize this information to preemptively avoid forwarding to the
macsec netdev currently. Because of this, things like multicast messages or
unicast messages with an unmatched destination address such as ARP requests
are forwarded to the macsec netdev whether the message received was MACsec
encrypted or not. The goal of this patch series is to improve the Rx
handling for MACsec offload for devices capable of annotating skbs received
that were decrypted by the NIC offload for MACsec.
Here is a summary of the issue that occurs with the existing logic today.
* The current design of the MACsec offload handling path tries to use
"best guess" mechanisms for determining whether a packet associated
with the currently handled skb in the datapath was processed via HW
offload
* The best guess mechanism uses the following heuristic logic (in order of
precedence)
- Check if header destination MAC address matches MACsec netdev MAC
address -> forward to MACsec port
- Check if packet is multicast traffic -> forward to MACsec port
- MACsec security channel was able to be looked up from skb offload
context (mlx5 only) -> forward to MACsec port
* Problem: plaintext traffic can potentially solicit a MACsec encrypted
response from the offload device
- Core aspect of MACsec is that it identifies unauthorized LAN connections
and excludes them from communication
+ This behavior can be seen when not enabling offload for MACsec
- The offload behavior violates this principle in MACsec
I believe this behavior is a security bug since applications utilizing
MACsec could be exploited using this behavior, and the correct way to
resolve this is by having the hardware correctly indicate whether MACsec
offload occurred for the packet or not. In the patches in this series, I
leave a warning for when the problematic path occurs because I cannot
figure out a secure way to fix the security issue that applies to the core
MACsec offload handling in the Rx path without breaking MACsec offload for
other vendors.
Shown at the bottom is an example use case where plaintext traffic sent to
a physical port of a NIC configured for MACsec offload is unable to be
handled correctly by the software stack when the NIC provides awareness to
the kernel about whether the received packet is MACsec traffic or not. In
this specific example, plaintext ARP requests are being responded with
MACsec encrypted ARP replies (which leads to routing information being
unable to be built for the requester).
Side 1
ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.1/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 1 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:66
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip macsec add macsec0 rx sci 2 on
ip macsec add macsec0 rx sci 2 sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip address flush macsec0
ip address add 2.2.2.1/24 dev macsec0
ip link set dev macsec0 up
# macsec0 enters promiscuous mode.
# This enables all traffic received on macsec_vlan to be processed by
# the macsec offload rx datapath. This however means that traffic
# meant to be received by mlx5_1 will be incorrectly steered to
# macsec0 as well.
ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:88
ip address flush macsec_vlan
ip address add 3.3.3.1/24 dev macsec_vlan
ip link set dev macsec_vlan up
Side 2
ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.2/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 2 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:77
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip macsec add macsec0 rx sci 1 on
ip macsec add macsec0 rx sci 1 sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip address flush macsec0
ip address add 2.2.2.2/24 dev macsec0
ip link set dev macsec0 up
# macsec0 enters promiscuous mode.
# This enables all traffic received on macsec_vlan to be processed by
# the macsec offload rx datapath. This however means that traffic
# meant to be received by mlx5_1 will be incorrectly steered to
# macsec0 as well.
ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:99
ip address flush macsec_vlan
ip address add 3.3.3.2/24 dev macsec_vlan
ip link set dev macsec_vlan up
Side 1
ping -I mlx5_1 1.1.1.2
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 mlx5_1: 56(84) bytes of data.
From 1.1.1.1 icmp_seq=1 Destination Host Unreachable
ping: sendmsg: No route to host
From 1.1.1.1 icmp_seq=2 Destination Host Unreachable
From 1.1.1.1 icmp_seq=3 Destination Host Unreachable
Changes:
v1->v2:
* Fixed series subject to detail the issue being fixed
* Removed strange characters from cover letter
* Added comment in example that illustrates the impact involving
promiscuous mode
* Added patch for generalizing packet type detection
* Added Fixes: tags and targeting net
* Removed pointless warning in the heuristic Rx path for macsec offload
* Applied small refactor in Rx path offload to minimize scope of rx_sc
local variable
Link: https://github.com/Binary-Eater/macsec-rx-offload/blob/trunk/MACsec_violati…
Link: https://lore.kernel.org/netdev/20240419011740.333714-1-rrameshbabu@nvidia.c…
Link: https://lore.kernel.org/netdev/87r0l25y1c.fsf@nvidia.com/
Link: https://lore.kernel.org/netdev/20231116182900.46052-1-rrameshbabu@nvidia.co…
Cc: Sabrina Dubroca <sd(a)queasysnail.net>
Cc: stable(a)vger.kernel.org
Signed-off-by: Rahul Rameshbabu <rrameshbabu(a)nvidia.com>
---
Rahul Rameshbabu (4):
macsec: Enable devices to advertise whether they update sk_buff md_dst
during offloads
ethernet: Add helper for assigning packet type when dest address does
not match device address
macsec: Detect if Rx skb is macsec-related for offloading devices that
update md_dst
net/mlx5e: Advertise mlx5 ethernet driver updates sk_buff md_dst for
MACsec
.../mellanox/mlx5/core/en_accel/macsec.c | 1 +
drivers/net/macsec.c | 46 +++++++++++++++----
include/linux/etherdevice.h | 24 ++++++++++
include/net/macsec.h | 2 +
net/ethernet/eth.c | 12 +----
5 files changed, 64 insertions(+), 21 deletions(-)
--
2.42.0
On Fri, Apr 19, 2024 at 11:46:56PM +0800, George Guo wrote:
> Subject: [PATCH 4.19.y v6 0/2] Double-free bug discovery on testing trigger-field-variable-support.tc
>
> 1) About v4-0001-tracing-Remove-hist-trigger-synth_var_refs.patch:
>
> The reason I am backporting this patch is that no one found the double-free bug
> at that time, then later the code was removed on upstream, but
> 4.19-stable has the bug.
Both now queued up, thanks
greg k-h
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x ea73179e64131bcd29ba6defd33732abdf8ca14b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024021902-authentic-handpick-da09@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
ea73179e6413 ("powerpc/ftrace: Ignore ftrace locations in exit text sections")
2ec36570c358 ("powerpc/ftrace: Fix indentation in ftrace.h")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ea73179e64131bcd29ba6defd33732abdf8ca14b Mon Sep 17 00:00:00 2001
From: Naveen N Rao <naveen(a)kernel.org>
Date: Tue, 13 Feb 2024 23:24:10 +0530
Subject: [PATCH] powerpc/ftrace: Ignore ftrace locations in exit text sections
Michael reported that we are seeing an ftrace bug on bootup when KASAN
is enabled and we are using -fpatchable-function-entry:
ftrace: allocating 47780 entries in 18 pages
ftrace-powerpc: 0xc0000000020b3d5c: No module provided for non-kernel address
------------[ ftrace bug ]------------
ftrace faulted on modifying
[<c0000000020b3d5c>] 0xc0000000020b3d5c
Initializing ftrace call sites
ftrace record flags: 0
(0)
expected tramp: c00000000008cef4
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2180 ftrace_bug+0x3c0/0x424
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0-rc3-00120-g0f71dcfb4aef #860
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
NIP: c0000000003aa81c LR: c0000000003aa818 CTR: 0000000000000000
REGS: c0000000033cfab0 TRAP: 0700 Not tainted (6.5.0-rc3-00120-g0f71dcfb4aef)
MSR: 8000000002021033 <SF,VEC,ME,IR,DR,RI,LE> CR: 28028240 XER: 00000000
CFAR: c0000000002781a8 IRQMASK: 3
...
NIP [c0000000003aa81c] ftrace_bug+0x3c0/0x424
LR [c0000000003aa818] ftrace_bug+0x3bc/0x424
Call Trace:
ftrace_bug+0x3bc/0x424 (unreliable)
ftrace_process_locs+0x5f4/0x8a0
ftrace_init+0xc0/0x1d0
start_kernel+0x1d8/0x484
With CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y and
CONFIG_KASAN=y, compiler emits nops in functions that it generates for
registering and unregistering global variables (unlike with -pg and
-mprofile-kernel where calls to _mcount() are not generated in those
functions). Those functions then end up in INIT_TEXT and EXIT_TEXT
respectively. We don't expect to see any profiled functions in
EXIT_TEXT, so ftrace_init_nop() assumes that all addresses that aren't
in the core kernel text belongs to a module. Since these functions do
not match that criteria, we see the above bug.
Address this by having ftrace ignore all locations in the text exit
sections of vmlinux.
Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry")
Cc: stable(a)vger.kernel.org # v6.6+
Reported-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Naveen N Rao <naveen(a)kernel.org>
Reviewed-by: Benjamin Gray <bgray(a)linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/20240213175410.1091313-1-naveen@kernel.org
diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 1ebd2ca97f12..107fc5a48456 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -20,14 +20,6 @@
#ifndef __ASSEMBLY__
extern void _mcount(void);
-static inline unsigned long ftrace_call_adjust(unsigned long addr)
-{
- if (IS_ENABLED(CONFIG_ARCH_USING_PATCHABLE_FUNCTION_ENTRY))
- addr += MCOUNT_INSN_SIZE;
-
- return addr;
-}
-
unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
unsigned long sp);
@@ -142,8 +134,10 @@ static inline u8 this_cpu_get_ftrace_enabled(void) { return 1; }
#ifdef CONFIG_FUNCTION_TRACER
extern unsigned int ftrace_tramp_text[], ftrace_tramp_init[];
void ftrace_free_init_tramp(void);
+unsigned long ftrace_call_adjust(unsigned long addr);
#else
static inline void ftrace_free_init_tramp(void) { }
+static inline unsigned long ftrace_call_adjust(unsigned long addr) { return addr; }
#endif
#endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index ea26665f82cf..f43f3a6b0051 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -14,6 +14,7 @@ typedef struct func_desc func_desc_t;
extern char __head_end[];
extern char __srwx_boundary[];
+extern char __exittext_begin[], __exittext_end[];
/* Patch sites */
extern s32 patch__call_flush_branch_caches1;
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 82010629cf88..d8d6b4fd9a14 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -27,10 +27,22 @@
#include <asm/ftrace.h>
#include <asm/syscall.h>
#include <asm/inst.h>
+#include <asm/sections.h>
#define NUM_FTRACE_TRAMPS 2
static unsigned long ftrace_tramps[NUM_FTRACE_TRAMPS];
+unsigned long ftrace_call_adjust(unsigned long addr)
+{
+ if (addr >= (unsigned long)__exittext_begin && addr < (unsigned long)__exittext_end)
+ return 0;
+
+ if (IS_ENABLED(CONFIG_ARCH_USING_PATCHABLE_FUNCTION_ENTRY))
+ addr += MCOUNT_INSN_SIZE;
+
+ return addr;
+}
+
static ppc_inst_t ftrace_create_branch_inst(unsigned long ip, unsigned long addr, int link)
{
ppc_inst_t op;
diff --git a/arch/powerpc/kernel/trace/ftrace_64_pg.c b/arch/powerpc/kernel/trace/ftrace_64_pg.c
index 7b85c3b460a3..12fab1803bcf 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_pg.c
+++ b/arch/powerpc/kernel/trace/ftrace_64_pg.c
@@ -37,6 +37,11 @@
#define NUM_FTRACE_TRAMPS 8
static unsigned long ftrace_tramps[NUM_FTRACE_TRAMPS];
+unsigned long ftrace_call_adjust(unsigned long addr)
+{
+ return addr;
+}
+
static ppc_inst_t
ftrace_call_replace(unsigned long ip, unsigned long addr, int link)
{
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 1c5970df3233..f420df7888a7 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -281,7 +281,9 @@ SECTIONS
* to deal with references from __bug_table
*/
.exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) {
+ __exittext_begin = .;
EXIT_TEXT
+ __exittext_end = .;
}
. = ALIGN(PAGE_SIZE);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 059a49aa2e25c58f90b50151f109dd3c4cdb3a47
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041412-subduing-brewing-cd04@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
059a49aa2e25 ("virtio_net: Do not send RSS key if it is not supported")
fb6e30a72539 ("net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops")
02cbfba1add5 ("idpf: add ethtool callbacks")
a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll")
3a8845af66ed ("idpf: add RX splitq napi poll support")
c2d548cad150 ("idpf: add TX splitq napi poll support")
6818c4d5b3c2 ("idpf: add splitq start_xmit")
d4d558718266 ("idpf: initialize interrupts and enable vport")
95af467d9a4e ("idpf: configure resources for RX queues")
1c325aac10a8 ("idpf: configure resources for TX queues")
ce1b75d0635c ("idpf: add ptypes and MAC filter support")
0fe45467a104 ("idpf: add create vport and netdev configuration")
4930fbf419a7 ("idpf: add core init and interrupt request")
8077c727561a ("idpf: add controlq init and reset checks")
e850efed5e15 ("idpf: add module register and probe functionality")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 059a49aa2e25c58f90b50151f109dd3c4cdb3a47 Mon Sep 17 00:00:00 2001
From: Breno Leitao <leitao(a)debian.org>
Date: Wed, 3 Apr 2024 08:43:12 -0700
Subject: [PATCH] virtio_net: Do not send RSS key if it is not supported
There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.
Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:
# ethtool -X eth0 hfunc toeplitz
This is how the problem happens:
1) ethtool_set_rxfh() calls virtnet_set_rxfh()
2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
3) virtnet_commit_rss_command() populates 4 entries for the rss
scatter-gather
4) Since the command above does not have a key, then the last
scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;
5) This buffer is passed to qemu, but qemu is not happy with a buffer
with zero length, and do the following in virtqueue_map_desc() (QEMU
function):
if (!sz) {
virtio_error(vdev, "virtio: zero sized buffers are not allowed");
6) virtio_error() (also QEMU function) set the device as broken
vdev->broken = true;
7) Qemu bails out, and do not repond this crazy kernel.
8) The kernel is waiting for the response to come back (function
virtnet_send_command())
9) The kernel is waiting doing the following :
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
!virtqueue_is_broken(vi->cvq))
cpu_relax();
10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.
Fix it by not sending RSS commands if the feature is not available in
the device.
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Cc: stable(a)vger.kernel.org
Cc: qemu-devel(a)nongnu.org
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Reviewed-by: Heng Qi <hengqi(a)linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo(a)linux.alibaba.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c22d1118a133..115c3c5414f2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3807,6 +3807,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
struct netlink_ext_ack *extack)
{
struct virtnet_info *vi = netdev_priv(dev);
+ bool update = false;
int i;
if (rxfh->hfunc != ETH_RSS_HASH_NO_CHANGE &&
@@ -3814,13 +3815,28 @@ static int virtnet_set_rxfh(struct net_device *dev,
return -EOPNOTSUPP;
if (rxfh->indir) {
+ if (!vi->has_rss)
+ return -EOPNOTSUPP;
+
for (i = 0; i < vi->rss_indir_table_size; ++i)
vi->ctrl->rss.indirection_table[i] = rxfh->indir[i];
+ update = true;
}
- if (rxfh->key)
- memcpy(vi->ctrl->rss.key, rxfh->key, vi->rss_key_size);
- virtnet_commit_rss_command(vi);
+ if (rxfh->key) {
+ /* If either _F_HASH_REPORT or _F_RSS are negotiated, the
+ * device provides hash calculation capabilities, that is,
+ * hash_key is configured.
+ */
+ if (!vi->has_rss && !vi->has_rss_hash_report)
+ return -EOPNOTSUPP;
+
+ memcpy(vi->ctrl->rss.key, rxfh->key, vi->rss_key_size);
+ update = true;
+ }
+
+ if (update)
+ virtnet_commit_rss_command(vi);
return 0;
}
@@ -4729,13 +4745,15 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_HASH_REPORT))
vi->has_rss_hash_report = true;
- if (virtio_has_feature(vdev, VIRTIO_NET_F_RSS))
+ if (virtio_has_feature(vdev, VIRTIO_NET_F_RSS)) {
vi->has_rss = true;
- if (vi->has_rss || vi->has_rss_hash_report) {
vi->rss_indir_table_size =
virtio_cread16(vdev, offsetof(struct virtio_net_config,
rss_max_indirection_table_length));
+ }
+
+ if (vi->has_rss || vi->has_rss_hash_report) {
vi->rss_key_size =
virtio_cread8(vdev, offsetof(struct virtio_net_config, rss_max_key_size));
Hello.
These are the remaining bugfix patches for the MT7530 DSA subdriver.
They didn't apply as is to the 6.1 stable tree so I have submitted the
adjusted versions in this thread. Please apply them in the order
they were submitted.
Signed-off-by: Arınç ÜNAL <arinc.unal(a)arinc9.com>
---
Arınç ÜNAL (3):
net: dsa: mt7530: set all CPU ports in MT7531_CPU_PMAP
net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530
net: dsa: mt7530: fix enabling EEE on MT7531 switch on all boards
Vladimir Oltean (1):
net: dsa: introduce preferred_default_local_cpu_port and use on MT7530
drivers/net/dsa/mt7530.c | 52 ++++++++++++++++++++++++++++++++++--------------
drivers/net/dsa/mt7530.h | 2 ++
include/net/dsa.h | 8 ++++++++
net/dsa/dsa2.c | 24 +++++++++++++++++++++-
4 files changed, 70 insertions(+), 16 deletions(-)
---
base-commit: 6741e066ec7633450d3186946035c1f80c4226b8
change-id: 20240420-for-stable-6-1-backports-54c939276879
Best regards,
--
Arınç ÜNAL <arinc.unal(a)arinc9.com>
Hello.
These are the remaining bugfix patches for the MT7530 DSA subdriver.
They didn't apply as is to the 6.8 stable tree so I have submitted the
adjusted versions in this thread. Please apply them in the order
they were submitted.
Signed-off-by: Arınç ÜNAL <arinc.unal(a)arinc9.com>
---
Arınç ÜNAL (2):
net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530
net: dsa: mt7530: fix enabling EEE on MT7531 switch on all boards
drivers/net/dsa/mt7530.c | 22 +++++++++++++++-------
drivers/net/dsa/mt7530.h | 1 +
2 files changed, 16 insertions(+), 7 deletions(-)
---
base-commit: 12dadc409c2bd8538c6ee0e56e191efde6d92007
change-id: 20240420-for-stable-6-8-backports-5a210f6cfb0c
Best regards,
--
Arınç ÜNAL <arinc.unal(a)arinc9.com>
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041908-ethically-floss-e1ea@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042344-phonics-simile-0b3c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
The commit 407d1a51921e ("PCI: Create device tree node for bridge")
creates of_node for PCI devices.
During the insertion handling of these new DT nodes done by of_platform,
new devices (struct device) are created. For each PCI devices a struct
device is already present (created and handled by the PCI core).
Having a second struct device to represent the exact same PCI device is
not correct.
On the of_node creation:
- tell the of_platform that there is no need to create a device for this
node (OF_POPULATED flag),
- link this newly created of_node to the already present device,
- tell fwnode that the device attached to this of_node is ready using
fwnode_dev_initialized().
On the of_node removal, revert the operations.
Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/pci/of.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 51e3dd0ea5ab..5afd2731e876 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -615,7 +615,8 @@ void of_pci_remove_node(struct pci_dev *pdev)
np = pci_device_to_OF_node(pdev);
if (!np || !of_node_check_flag(np, OF_DYNAMIC))
return;
- pdev->dev.of_node = NULL;
+
+ device_remove_of_node(&pdev->dev);
of_changeset_revert(np->data);
of_changeset_destroy(np->data);
@@ -668,12 +669,22 @@ void of_pci_make_dev_node(struct pci_dev *pdev)
if (ret)
goto out_free_node;
+ /*
+ * This of_node will be added to an existing device.
+ * Avoid any device creation and use the existing device
+ */
+ of_node_set_flag(np, OF_POPULATED);
+ np->fwnode.dev = &pdev->dev;
+ fwnode_dev_initialized(&np->fwnode, true);
+
ret = of_changeset_apply(cset);
if (ret)
goto out_free_node;
np->data = cset;
- pdev->dev.of_node = np;
+
+ /* Add the of_node to the existing device */
+ device_add_of_node(&pdev->dev, np);
kfree(name);
return;
--
2.44.0
From: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
The previous patch forgot to unlock in the error path
Link: https://lore.kernel.org/all/Zh%2fHpAGFqa7YAFuM@duo.ucw.cz
Reported-by: Pavel Machek <pavel(a)denx.de>
Cc: stable(a)vger.kernel.org
Fixes: 7411055db5ce ("btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()")
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
---
Note for stable: the mutex has been renamed from delete_unused_bgs_mutex
in 5.13, so the 5.10 and 4.19 backports need a trivial rename:
s/reclaim_bgs_lock/delete_unused_bgs_mutex/
If required I'll send branch-specific patches after this is merged.
---
fs/btrfs/volumes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f15591f3e54f..ef6bd2f4251b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3455,6 +3455,7 @@ static int btrfs_relocate_sys_chunks(struct btrfs_fs_info *fs_info)
* alignment and size).
*/
ret = -EUCLEAN;
+ mutex_unlock(&fs_info->reclaim_bgs_lock);
goto error;
}
---
base-commit: 2668e3ae2ef36d5e7c52f818ad7d90822c037de4
change-id: 20240419-btrfs_unlock-95e0b3e2e2fc
Best regards,
--
Dominique Martinet | Asmadeus
SPECULATION_MITIGATIONS is currently defined only for x86. As a result,
IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) is always false for other
archs. f337a6a21e2f effectively set "mitigations=off" by default on
non-x86 archs, which is not desired behavior. Jakub observed this
change when running bpf selftests on s390 and arm64.
Fix this by moving SPECULATION_MITIGATIONS to arch/Kconfig so that it is
available in all archs and thus can be used safely in kernel/cpu.c
Fixes: f337a6a21e2f ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n")
Cc: stable(a)vger.kernel.org
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Daniel Sneddon <daniel.sneddon(a)linux.intel.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Song Liu <song(a)kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 10 ----------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 9f066785bb71..8f4af75005f8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1609,4 +1609,14 @@ config CC_HAS_SANE_FUNCTION_ALIGNMENT
# strict alignment always, even with -falign-functions.
def_bool CC_HAS_MIN_FUNCTION_ALIGNMENT || CC_IS_CLANG
+menuconfig SPECULATION_MITIGATIONS
+ bool "Mitigations for speculative execution vulnerabilities"
+ default y
+ help
+ Say Y here to enable options which enable mitigations for
+ speculative execution hardware vulnerabilities.
+
+ If you say N, all mitigations will be disabled. You really
+ should know what you are doing to say so.
+
endmenu
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39886bab943a..50c890fce5e0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2486,16 +2486,6 @@ config PREFIX_SYMBOLS
def_bool y
depends on CALL_PADDING && !CFI_CLANG
-menuconfig SPECULATION_MITIGATIONS
- bool "Mitigations for speculative execution vulnerabilities"
- default y
- help
- Say Y here to enable options which enable mitigations for
- speculative execution hardware vulnerabilities.
-
- If you say N, all mitigations will be disabled. You really
- should know what you are doing to say so.
-
if SPECULATION_MITIGATIONS
config MITIGATION_PAGE_TABLE_ISOLATION
--
2.43.0
With deferred IO enabled, a page fault happens when data is written to the
framebuffer device. Then driver determines which page is being updated by
calculating the offset of the written virtual address within the virtual
memory area, and uses this offset to get the updated page within the
internal buffer. This page is later copied to hardware (thus the name
"deferred IO").
This offset calculation is only correct if the virtual memory area is
mapped to the beginning of the internal buffer. Otherwise this is wrong.
For example, if users do:
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the
internal buffer. This offset 0xff000 is not accounted for, and wrong page
is updated.
Correct the calculation by using vmf->pgoff instead. With this change, the
variable "offset" will no longer hold the exact offset value, but it is
rounded down to multiples of PAGE_SIZE. But this is still correct, because
this variable is only used to calculate the page offset.
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@or…
Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
v2:
- simplify the patch by using vfg->pgoff
- remove tested-by tag, as the patch is now different
drivers/video/fbdev/core/fb_defio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index 1ae1d35a5942..b9607d5a370d 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -196,7 +196,7 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
*/
static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
{
- unsigned long offset = vmf->address - vmf->vma->vm_start;
+ unsigned long offset = vmf->pgoff << PAGE_SHIFT;
struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
--
2.39.2
The Elan eKTH5015M touch controller on the X13s requires a 300 ms delay
before sending commands after having deasserted reset during power on.
Switch to the Elan specific binding so that the OS can determine the
required power-on sequence and make sure that the controller is always
detected during boot.
Note that the always-on 1.8 V supply (s10b) is not used by the
controller directly and should not be described.
Fixes: 32c231385ed4 ("arm64: dts: qcom: sc8280xp: add Lenovo Thinkpad X13s devicetree")
Cc: stable(a)vger.kernel.org # 6.0
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
.../dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index b220ba4fba23..e27c8a21125c 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -674,15 +674,16 @@ &i2c4 {
status = "okay";
- /* FIXME: verify */
touchscreen@10 {
- compatible = "hid-over-i2c";
+ compatible = "elan,ekth5015m", "elan,ekth6915";
reg = <0x10>;
- hid-descr-addr = <0x1>;
interrupts-extended = <&tlmm 175 IRQ_TYPE_LEVEL_LOW>;
- vdd-supply = <&vreg_misc_3p3>;
- vddl-supply = <&vreg_s10b>;
+ reset-gpios = <&tlmm 99 (GPIO_ACTIVE_LOW | GPIO_OPEN_DRAIN)>;
+ no-reset-on-power-off;
+
+ vcc33-supply = <&vreg_misc_3p3>;
+ vccio-supply = <&vreg_misc_3p3>;
pinctrl-names = "default";
pinctrl-0 = <&ts0_default>;
@@ -1637,8 +1638,8 @@ int-n-pins {
reset-n-pins {
pins = "gpio99";
function = "gpio";
- output-high;
- drive-strength = <16>;
+ drive-strength = <2>;
+ bias-disable;
};
};
--
2.43.2
priv.runtime_map is only allocated when efi_novamap is not set.
Otherwise, it is an uninitialized value.
In the error path, it is freed unconditionally.
Avoid passing an uninitialized value to free_pool.
Free priv.runtime_map only when it was allocated.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Fixes: f80d26043af9 ("efi: libstub: avoid efi_get_memory_map() for allocating the virt map")
Signed-off-by: Hagar Hemdan <hagarhem(a)amazon.com>
---
drivers/firmware/efi/libstub/fdt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index 70e9789ff9de..6a337f1f8787 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -335,8 +335,8 @@ efi_status_t allocate_new_fdt_and_exit_boot(void *handle,
fail:
efi_free(fdt_size, fdt_addr);
-
- efi_bs_call(free_pool, priv.runtime_map);
+ if (!efi_novamap)
+ efi_bs_call(free_pool, priv.runtime_map);
return EFI_LOAD_ERROR;
}
--
2.40.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x a60ccade88f926e871a57176e86a34bbf0db0098
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042326-stubbed-gauze-ca0a@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
a60ccade88f9 ("drm/vmwgfx: Fix crtc's atomic check conditional")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a60ccade88f926e871a57176e86a34bbf0db0098 Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:10 -0400
Subject: [PATCH] drm/vmwgfx: Fix crtc's atomic check conditional
The conditional was supposed to prevent enabling of a crtc state
without a set primary plane. Accidently it also prevented disabling
crtc state with a set primary plane. Neither is correct.
Fix the conditional and just driver-warn when a crtc state has been
enabled without a primary plane which will help debug broken userspace.
Fixes IGT's kms_atomic_interruptible and kms_atomic_transition tests.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 06ec41909e31 ("drm/vmwgfx: Add and connect CRTC helper functions")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Ian Forbes <ian.forbes(a)broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev(a)broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-5-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index cd4925346ed4..84ae4e10a2eb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -933,6 +933,7 @@ int vmw_du_cursor_plane_atomic_check(struct drm_plane *plane,
int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
+ struct vmw_private *vmw = vmw_priv(crtc->dev);
struct drm_crtc_state *new_state = drm_atomic_get_new_crtc_state(state,
crtc);
struct vmw_display_unit *du = vmw_crtc_to_du(new_state->crtc);
@@ -940,9 +941,13 @@ int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
bool has_primary = new_state->plane_mask &
drm_plane_mask(crtc->primary);
- /* We always want to have an active plane with an active CRTC */
- if (has_primary != new_state->enable)
- return -EINVAL;
+ /*
+ * This is fine in general, but broken userspace might expect
+ * some actual rendering so give a clue as why it's blank.
+ */
+ if (new_state->enable && !has_primary)
+ drm_dbg_driver(&vmw->drm,
+ "CRTC without a primary plane will be blank.\n");
if (new_state->connector_mask != connector_mask &&
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042310-nylon-condense-e215@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
280345d0d03b ("squashfs: convert to ctime accessor functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042308-reassign-skylight-6cfd@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
280345d0d03b ("squashfs: convert to ctime accessor functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042302-prepaid-rural-046c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
280345d0d03b ("squashfs: convert to ctime accessor functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042357-second-squishier-5617@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
280345d0d03b ("squashfs: convert to ctime accessor functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042355-elk-rumor-538b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
280345d0d03b ("squashfs: convert to ctime accessor functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 9253c54e01b6505d348afbc02abaa4d9f8a01395
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042352-affected-violation-2f60@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
9253c54e01b6 ("Squashfs: check the inode number is not the invalid value of zero")
a1f13ed8c748 ("squashfs: convert to new timestamp accessors")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9253c54e01b6505d348afbc02abaa4d9f8a01395 Mon Sep 17 00:00:00 2001
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Date: Mon, 8 Apr 2024 23:02:06 +0100
Subject: [PATCH] Squashfs: check the inode number is not the invalid value of
zero
Syskiller has produced an out of bounds access in fill_meta_index().
That out of bounds access is ultimately caused because the inode
has an inode number with the invalid value of zero, which was not checked.
The reason this causes the out of bounds access is due to following
sequence of events:
1. Fill_meta_index() is called to allocate (via empty_meta_index())
and fill a metadata index. It however suffers a data read error
and aborts, invalidating the newly returned empty metadata index.
It does this by setting the inode number of the index to zero,
which means unused (zero is not a valid inode number).
2. When fill_meta_index() is subsequently called again on another
read operation, locate_meta_index() returns the previous index
because it matches the inode number of 0. Because this index
has been returned it is expected to have been filled, and because
it hasn't been, an out of bounds access is performed.
This patch adds a sanity check which checks that the inode number
is not zero when the inode is created and returns -EINVAL if it is.
[phillip(a)squashfs.org.uk: whitespace fix]
Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c
index aa3411354e66..16bd693d0b3a 100644
--- a/fs/squashfs/inode.c
+++ b/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
gid_t i_gid;
int err;
+ inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+ if (inode->i_ino == 0)
+ return -EINVAL;
+
err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
if (err)
return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct super_block *sb, struct inode *inode,
i_uid_write(inode, i_uid);
i_gid_write(inode, i_gid);
- inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x a60ccade88f926e871a57176e86a34bbf0db0098
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042353-oversight-despair-5bb8@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
a60ccade88f9 ("drm/vmwgfx: Fix crtc's atomic check conditional")
29b77ad7b9ca ("drm/atomic: Pass the full state to CRTC atomic_check")
c489573b5b6c ("Merge drm/drm-next into drm-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a60ccade88f926e871a57176e86a34bbf0db0098 Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:10 -0400
Subject: [PATCH] drm/vmwgfx: Fix crtc's atomic check conditional
The conditional was supposed to prevent enabling of a crtc state
without a set primary plane. Accidently it also prevented disabling
crtc state with a set primary plane. Neither is correct.
Fix the conditional and just driver-warn when a crtc state has been
enabled without a primary plane which will help debug broken userspace.
Fixes IGT's kms_atomic_interruptible and kms_atomic_transition tests.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 06ec41909e31 ("drm/vmwgfx: Add and connect CRTC helper functions")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Ian Forbes <ian.forbes(a)broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev(a)broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-5-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index cd4925346ed4..84ae4e10a2eb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -933,6 +933,7 @@ int vmw_du_cursor_plane_atomic_check(struct drm_plane *plane,
int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
+ struct vmw_private *vmw = vmw_priv(crtc->dev);
struct drm_crtc_state *new_state = drm_atomic_get_new_crtc_state(state,
crtc);
struct vmw_display_unit *du = vmw_crtc_to_du(new_state->crtc);
@@ -940,9 +941,13 @@ int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
bool has_primary = new_state->plane_mask &
drm_plane_mask(crtc->primary);
- /* We always want to have an active plane with an active CRTC */
- if (has_primary != new_state->enable)
- return -EINVAL;
+ /*
+ * This is fine in general, but broken userspace might expect
+ * some actual rendering so give a clue as why it's blank.
+ */
+ if (new_state->enable && !has_primary)
+ drm_dbg_driver(&vmw->drm,
+ "CRTC without a primary plane will be blank.\n");
if (new_state->connector_mask != connector_mask &&
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a60ccade88f926e871a57176e86a34bbf0db0098
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042351-plausibly-bucket-235e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
a60ccade88f9 ("drm/vmwgfx: Fix crtc's atomic check conditional")
29b77ad7b9ca ("drm/atomic: Pass the full state to CRTC atomic_check")
c489573b5b6c ("Merge drm/drm-next into drm-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a60ccade88f926e871a57176e86a34bbf0db0098 Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:10 -0400
Subject: [PATCH] drm/vmwgfx: Fix crtc's atomic check conditional
The conditional was supposed to prevent enabling of a crtc state
without a set primary plane. Accidently it also prevented disabling
crtc state with a set primary plane. Neither is correct.
Fix the conditional and just driver-warn when a crtc state has been
enabled without a primary plane which will help debug broken userspace.
Fixes IGT's kms_atomic_interruptible and kms_atomic_transition tests.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 06ec41909e31 ("drm/vmwgfx: Add and connect CRTC helper functions")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Ian Forbes <ian.forbes(a)broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev(a)broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-5-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index cd4925346ed4..84ae4e10a2eb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -933,6 +933,7 @@ int vmw_du_cursor_plane_atomic_check(struct drm_plane *plane,
int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
+ struct vmw_private *vmw = vmw_priv(crtc->dev);
struct drm_crtc_state *new_state = drm_atomic_get_new_crtc_state(state,
crtc);
struct vmw_display_unit *du = vmw_crtc_to_du(new_state->crtc);
@@ -940,9 +941,13 @@ int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
bool has_primary = new_state->plane_mask &
drm_plane_mask(crtc->primary);
- /* We always want to have an active plane with an active CRTC */
- if (has_primary != new_state->enable)
- return -EINVAL;
+ /*
+ * This is fine in general, but broken userspace might expect
+ * some actual rendering so give a clue as why it's blank.
+ */
+ if (new_state->enable && !has_primary)
+ drm_dbg_driver(&vmw->drm,
+ "CRTC without a primary plane will be blank.\n");
if (new_state->connector_mask != connector_mask &&
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a60ccade88f926e871a57176e86a34bbf0db0098
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042350-wikipedia-jogging-8ed0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
a60ccade88f9 ("drm/vmwgfx: Fix crtc's atomic check conditional")
29b77ad7b9ca ("drm/atomic: Pass the full state to CRTC atomic_check")
c489573b5b6c ("Merge drm/drm-next into drm-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a60ccade88f926e871a57176e86a34bbf0db0098 Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:10 -0400
Subject: [PATCH] drm/vmwgfx: Fix crtc's atomic check conditional
The conditional was supposed to prevent enabling of a crtc state
without a set primary plane. Accidently it also prevented disabling
crtc state with a set primary plane. Neither is correct.
Fix the conditional and just driver-warn when a crtc state has been
enabled without a primary plane which will help debug broken userspace.
Fixes IGT's kms_atomic_interruptible and kms_atomic_transition tests.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 06ec41909e31 ("drm/vmwgfx: Add and connect CRTC helper functions")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Ian Forbes <ian.forbes(a)broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev(a)broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-5-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index cd4925346ed4..84ae4e10a2eb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -933,6 +933,7 @@ int vmw_du_cursor_plane_atomic_check(struct drm_plane *plane,
int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
+ struct vmw_private *vmw = vmw_priv(crtc->dev);
struct drm_crtc_state *new_state = drm_atomic_get_new_crtc_state(state,
crtc);
struct vmw_display_unit *du = vmw_crtc_to_du(new_state->crtc);
@@ -940,9 +941,13 @@ int vmw_du_crtc_atomic_check(struct drm_crtc *crtc,
bool has_primary = new_state->plane_mask &
drm_plane_mask(crtc->primary);
- /* We always want to have an active plane with an active CRTC */
- if (has_primary != new_state->enable)
- return -EINVAL;
+ /*
+ * This is fine in general, but broken userspace might expect
+ * some actual rendering so give a clue as why it's blank.
+ */
+ if (new_state->enable && !has_primary)
+ drm_dbg_driver(&vmw->drm,
+ "CRTC without a primary plane will be blank.\n");
if (new_state->connector_mask != connector_mask &&
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x d4c972bff3129a9dd4c22a3999fd8eba1a81531a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042337-sizzle-quadrant-ebd6@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
d4c972bff312 ("drm/vmwgfx: Sort primary plane formats by order of preference")
8bb75aeb58bd ("drm/vmwgfx: validate the screen formats")
92f1d09ca4ed ("drm: Switch to %p4cc format modifier")
d8713d6684a4 ("drm/vmwgfx/vmwgfx_kms: Mark vmw_{cursor,primary}_plane_formats as __maybe_unused")
5fbd41d3bf12 ("Merge tag 'drm-misc-next-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d4c972bff3129a9dd4c22a3999fd8eba1a81531a Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:11 -0400
Subject: [PATCH] drm/vmwgfx: Sort primary plane formats by order of preference
The table of primary plane formats wasn't sorted at all, leading to
applications picking our least desirable formats by defaults.
Sort the primary plane formats according to our order of preference.
Nice side-effect of this change is that it makes IGT's kms_atomic
plane-invalid-params pass because the test picks the first format
which for vmwgfx was DRM_FORMAT_XRGB1555 and uses fb's with odd sizes
which make Pixman, which IGT depends on assert due to the fact that our
16bpp formats aren't 32 bit aligned like Pixman requires all formats
to be.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 36cc79bc9077 ("drm/vmwgfx: Add universal plane support")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Acked-by: Pekka Paalanen <pekka.paalanen(a)collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-6-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index a94947b588e8..19a843da87b7 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -243,10 +243,10 @@ struct vmw_framebuffer_bo {
static const uint32_t __maybe_unused vmw_primary_plane_formats[] = {
- DRM_FORMAT_XRGB1555,
- DRM_FORMAT_RGB565,
DRM_FORMAT_XRGB8888,
DRM_FORMAT_ARGB8888,
+ DRM_FORMAT_RGB565,
+ DRM_FORMAT_XRGB1555,
};
static const uint32_t __maybe_unused vmw_cursor_plane_formats[] = {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d4c972bff3129a9dd4c22a3999fd8eba1a81531a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042336-upscale-specks-c8aa@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
d4c972bff312 ("drm/vmwgfx: Sort primary plane formats by order of preference")
8bb75aeb58bd ("drm/vmwgfx: validate the screen formats")
92f1d09ca4ed ("drm: Switch to %p4cc format modifier")
d8713d6684a4 ("drm/vmwgfx/vmwgfx_kms: Mark vmw_{cursor,primary}_plane_formats as __maybe_unused")
5fbd41d3bf12 ("Merge tag 'drm-misc-next-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d4c972bff3129a9dd4c22a3999fd8eba1a81531a Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:11 -0400
Subject: [PATCH] drm/vmwgfx: Sort primary plane formats by order of preference
The table of primary plane formats wasn't sorted at all, leading to
applications picking our least desirable formats by defaults.
Sort the primary plane formats according to our order of preference.
Nice side-effect of this change is that it makes IGT's kms_atomic
plane-invalid-params pass because the test picks the first format
which for vmwgfx was DRM_FORMAT_XRGB1555 and uses fb's with odd sizes
which make Pixman, which IGT depends on assert due to the fact that our
16bpp formats aren't 32 bit aligned like Pixman requires all formats
to be.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 36cc79bc9077 ("drm/vmwgfx: Add universal plane support")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Acked-by: Pekka Paalanen <pekka.paalanen(a)collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-6-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index a94947b588e8..19a843da87b7 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -243,10 +243,10 @@ struct vmw_framebuffer_bo {
static const uint32_t __maybe_unused vmw_primary_plane_formats[] = {
- DRM_FORMAT_XRGB1555,
- DRM_FORMAT_RGB565,
DRM_FORMAT_XRGB8888,
DRM_FORMAT_ARGB8888,
+ DRM_FORMAT_RGB565,
+ DRM_FORMAT_XRGB1555,
};
static const uint32_t __maybe_unused vmw_cursor_plane_formats[] = {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d4c972bff3129a9dd4c22a3999fd8eba1a81531a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042335-prowling-handwash-80c5@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
d4c972bff312 ("drm/vmwgfx: Sort primary plane formats by order of preference")
8bb75aeb58bd ("drm/vmwgfx: validate the screen formats")
92f1d09ca4ed ("drm: Switch to %p4cc format modifier")
d8713d6684a4 ("drm/vmwgfx/vmwgfx_kms: Mark vmw_{cursor,primary}_plane_formats as __maybe_unused")
5fbd41d3bf12 ("Merge tag 'drm-misc-next-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d4c972bff3129a9dd4c22a3999fd8eba1a81531a Mon Sep 17 00:00:00 2001
From: Zack Rusin <zack.rusin(a)broadcom.com>
Date: Thu, 11 Apr 2024 22:55:11 -0400
Subject: [PATCH] drm/vmwgfx: Sort primary plane formats by order of preference
The table of primary plane formats wasn't sorted at all, leading to
applications picking our least desirable formats by defaults.
Sort the primary plane formats according to our order of preference.
Nice side-effect of this change is that it makes IGT's kms_atomic
plane-invalid-params pass because the test picks the first format
which for vmwgfx was DRM_FORMAT_XRGB1555 and uses fb's with odd sizes
which make Pixman, which IGT depends on assert due to the fact that our
16bpp formats aren't 32 bit aligned like Pixman requires all formats
to be.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 36cc79bc9077 ("drm/vmwgfx: Add universal plane support")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.12+
Acked-by: Pekka Paalanen <pekka.paalanen(a)collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-6-zack.r…
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index a94947b588e8..19a843da87b7 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -243,10 +243,10 @@ struct vmw_framebuffer_bo {
static const uint32_t __maybe_unused vmw_primary_plane_formats[] = {
- DRM_FORMAT_XRGB1555,
- DRM_FORMAT_RGB565,
DRM_FORMAT_XRGB8888,
DRM_FORMAT_ARGB8888,
+ DRM_FORMAT_RGB565,
+ DRM_FORMAT_XRGB1555,
};
static const uint32_t __maybe_unused vmw_cursor_plane_formats[] = {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x b6976f323a8687cc0d55bc92c2086fd934324ed5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042340-laborious-overact-587b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
b6976f323a86 ("drm/ttm: stop pooling cached NUMA pages v2")
fd37721803c6 ("mm, treewide: introduce NR_PAGE_ORDERS")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6976f323a8687cc0d55bc92c2086fd934324ed5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <ckoenig.leichtzumerken(a)gmail.com>
Date: Mon, 15 Apr 2024 15:48:21 +0200
Subject: [PATCH] drm/ttm: stop pooling cached NUMA pages v2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We only pool write combined and uncached allocations because they
require extra overhead on allocation and release.
If we also pool cached NUMA it not only means some extra unnecessary
overhead, but also that under memory pressure it can happen that
pages from the wrong NUMA node enters the pool and are re-used
over and over again.
This can lead to performance reduction after running into memory
pressure.
v2: restructure and cleanup the code a bit from the internal hack to
test this.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Fixes: 4482d3c94d7f ("drm/ttm: add NUMA node id to the pool")
CC: stable(a)vger.kernel.org
Reviewed-by: Felix Kuehling <felix.kuehling(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415134821.1919-1-christi…
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 112438d965ff..6e1fd6985ffc 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -288,17 +288,23 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
enum ttm_caching caching,
unsigned int order)
{
- if (pool->use_dma_alloc || pool->nid != NUMA_NO_NODE)
+ if (pool->use_dma_alloc)
return &pool->caching[caching].orders[order];
#ifdef CONFIG_X86
switch (caching) {
case ttm_write_combined:
+ if (pool->nid != NUMA_NO_NODE)
+ return &pool->caching[caching].orders[order];
+
if (pool->use_dma32)
return &global_dma32_write_combined[order];
return &global_write_combined[order];
case ttm_uncached:
+ if (pool->nid != NUMA_NO_NODE)
+ return &pool->caching[caching].orders[order];
+
if (pool->use_dma32)
return &global_dma32_uncached[order];
@@ -566,11 +572,17 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
pool->use_dma_alloc = use_dma_alloc;
pool->use_dma32 = use_dma32;
- if (use_dma_alloc || nid != NUMA_NO_NODE) {
- for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
- for (j = 0; j < NR_PAGE_ORDERS; ++j)
- ttm_pool_type_init(&pool->caching[i].orders[j],
- pool, i, j);
+ for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
+ for (j = 0; j < NR_PAGE_ORDERS; ++j) {
+ struct ttm_pool_type *pt;
+
+ /* Initialize only pool types which are actually used */
+ pt = ttm_pool_select_type(pool, i, j);
+ if (pt != &pool->caching[i].orders[j])
+ continue;
+
+ ttm_pool_type_init(pt, pool, i, j);
+ }
}
}
EXPORT_SYMBOL(ttm_pool_init);
@@ -599,10 +611,16 @@ void ttm_pool_fini(struct ttm_pool *pool)
{
unsigned int i, j;
- if (pool->use_dma_alloc || pool->nid != NUMA_NO_NODE) {
- for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i)
- for (j = 0; j < NR_PAGE_ORDERS; ++j)
- ttm_pool_type_fini(&pool->caching[i].orders[j]);
+ for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
+ for (j = 0; j < NR_PAGE_ORDERS; ++j) {
+ struct ttm_pool_type *pt;
+
+ pt = ttm_pool_select_type(pool, i, j);
+ if (pt != &pool->caching[i].orders[j])
+ continue;
+
+ ttm_pool_type_fini(pt);
+ }
}
/* We removed the pool types from the LRU, but we need to also make sure
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 46dad3c1e57897ab9228332f03e1c14798d2d3b9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042347-create-item-ebf8@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
46dad3c1e578 ("init/main.c: Fix potential static_command_line memory overflow")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 46dad3c1e57897ab9228332f03e1c14798d2d3b9 Mon Sep 17 00:00:00 2001
From: Yuntao Wang <ytcoode(a)gmail.com>
Date: Fri, 12 Apr 2024 16:17:32 +0800
Subject: [PATCH] init/main.c: Fix potential static_command_line memory
overflow
We allocate memory of size 'xlen + strlen(boot_command_line) + 1' for
static_command_line, but the strings copied into static_command_line are
extra_command_line and command_line, rather than extra_command_line and
boot_command_line.
When strlen(command_line) > strlen(boot_command_line), static_command_line
will overflow.
This patch just recovers strlen(command_line) which was miss-consolidated
with strlen(boot_command_line) in the commit f5c7310ac73e ("init/main: add
checks for the return value of memblock_alloc*()")
Link: https://lore.kernel.org/all/20240412081733.35925-2-ytcoode@gmail.com/
Fixes: f5c7310ac73e ("init/main: add checks for the return value of memblock_alloc*()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Yuntao Wang <ytcoode(a)gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
diff --git a/init/main.c b/init/main.c
index 881f6230ee59..5dcf5274c09c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -636,6 +636,8 @@ static void __init setup_command_line(char *command_line)
if (!saved_command_line)
panic("%s: Failed to allocate %zu bytes\n", __func__, len + ilen);
+ len = xlen + strlen(command_line) + 1;
+
static_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
if (!static_command_line)
panic("%s: Failed to allocate %zu bytes\n", __func__, len);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 49ff3b4aec51e3abfc9369997cc603319b02af9a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042325-rockstar-civic-9c1a@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
49ff3b4aec51 ("KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms")
a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ff3b4aec51e3abfc9369997cc603319b02af9a Mon Sep 17 00:00:00 2001
From: Sandipan Das <sandipan.das(a)amd.com>
Date: Fri, 5 Apr 2024 16:55:55 -0700
Subject: [PATCH] KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD
platforms
On AMD and Hygon platforms, the local APIC does not automatically set
the mask bit of the LVTPC register when handling a PMI and there is
no need to clear it in the kernel's PMI handler.
For guests, the mask bit is currently set by kvm_apic_local_deliver()
and unless it is cleared by the guest kernel's PMI handler, PMIs stop
arriving and break use-cases like sampling with perf record.
This does not affect non-PerfMonV2 guests because PMIs are handled in
the guest kernel by x86_pmu_handle_irq() which always clears the LVTPC
mask bit irrespective of the vendor.
Before:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (1 samples) ]
After:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
Fixes: a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sandipan Das <sandipan.das(a)amd.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
[sean: use is_intel_compatible instead of !is_amd_or_hygon()]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-ID: <20240405235603.1173076-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index cf37586f0466..ebf41023be38 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2776,7 +2776,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
r = __apic_accept_irq(apic, mode, vector, 1, trig_mode, NULL);
- if (r && lvt_type == APIC_LVTPC)
+ if (r && lvt_type == APIC_LVTPC &&
+ guest_cpuid_is_intel_compatible(apic->vcpu))
kvm_lapic_set_reg(apic, APIC_LVTPC, reg | APIC_LVT_MASKED);
return r;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 49ff3b4aec51e3abfc9369997cc603319b02af9a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042320-deflected-pester-4998@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
49ff3b4aec51 ("KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms")
a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ff3b4aec51e3abfc9369997cc603319b02af9a Mon Sep 17 00:00:00 2001
From: Sandipan Das <sandipan.das(a)amd.com>
Date: Fri, 5 Apr 2024 16:55:55 -0700
Subject: [PATCH] KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD
platforms
On AMD and Hygon platforms, the local APIC does not automatically set
the mask bit of the LVTPC register when handling a PMI and there is
no need to clear it in the kernel's PMI handler.
For guests, the mask bit is currently set by kvm_apic_local_deliver()
and unless it is cleared by the guest kernel's PMI handler, PMIs stop
arriving and break use-cases like sampling with perf record.
This does not affect non-PerfMonV2 guests because PMIs are handled in
the guest kernel by x86_pmu_handle_irq() which always clears the LVTPC
mask bit irrespective of the vendor.
Before:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (1 samples) ]
After:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
Fixes: a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sandipan Das <sandipan.das(a)amd.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
[sean: use is_intel_compatible instead of !is_amd_or_hygon()]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-ID: <20240405235603.1173076-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index cf37586f0466..ebf41023be38 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2776,7 +2776,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
r = __apic_accept_irq(apic, mode, vector, 1, trig_mode, NULL);
- if (r && lvt_type == APIC_LVTPC)
+ if (r && lvt_type == APIC_LVTPC &&
+ guest_cpuid_is_intel_compatible(apic->vcpu))
kvm_lapic_set_reg(apic, APIC_LVTPC, reg | APIC_LVT_MASKED);
return r;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 49ff3b4aec51e3abfc9369997cc603319b02af9a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042306-overfeed-simmering-3d3c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
49ff3b4aec51 ("KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms")
a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ff3b4aec51e3abfc9369997cc603319b02af9a Mon Sep 17 00:00:00 2001
From: Sandipan Das <sandipan.das(a)amd.com>
Date: Fri, 5 Apr 2024 16:55:55 -0700
Subject: [PATCH] KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD
platforms
On AMD and Hygon platforms, the local APIC does not automatically set
the mask bit of the LVTPC register when handling a PMI and there is
no need to clear it in the kernel's PMI handler.
For guests, the mask bit is currently set by kvm_apic_local_deliver()
and unless it is cleared by the guest kernel's PMI handler, PMIs stop
arriving and break use-cases like sampling with perf record.
This does not affect non-PerfMonV2 guests because PMIs are handled in
the guest kernel by x86_pmu_handle_irq() which always clears the LVTPC
mask bit irrespective of the vendor.
Before:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (1 samples) ]
After:
$ perf record -e cycles:u true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
Fixes: a16eb25b09c0 ("KVM: x86: Mask LVTPC when handling a PMI")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sandipan Das <sandipan.das(a)amd.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
[sean: use is_intel_compatible instead of !is_amd_or_hygon()]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-ID: <20240405235603.1173076-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index cf37586f0466..ebf41023be38 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2776,7 +2776,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
r = __apic_accept_irq(apic, mode, vector, 1, trig_mode, NULL);
- if (r && lvt_type == APIC_LVTPC)
+ if (r && lvt_type == APIC_LVTPC &&
+ guest_cpuid_is_intel_compatible(apic->vcpu))
kvm_lapic_set_reg(apic, APIC_LVTPC, reg | APIC_LVT_MASKED);
return r;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 54c4ec5f8c471b7c1137a1f769648549c423c026
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042357-dropbox-unhitched-ffb8@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
54c4ec5f8c47 ("serial: mxs-auart: add spinlock around changing cts state")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c4ec5f8c471b7c1137a1f769648549c423c026 Mon Sep 17 00:00:00 2001
From: Emil Kronborg <emil.kronborg(a)protonmail.com>
Date: Wed, 20 Mar 2024 12:15:36 +0000
Subject: [PATCH] serial: mxs-auart: add spinlock around changing cts state
The uart_handle_cts_change() function in serial_core expects the caller
to hold uport->lock. For example, I have seen the below kernel splat,
when the Bluetooth driver is loaded on an i.MX28 board.
[ 85.119255] ------------[ cut here ]------------
[ 85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec
[ 85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs
[ 85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1
[ 85.151396] Hardware name: Freescale MXS (Device Tree)
[ 85.156679] Workqueue: hci0 hci_power_on [bluetooth]
(...)
[ 85.191765] uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4
[ 85.198787] mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210
(...)
Cc: stable(a)vger.kernel.org
Fixes: 4d90bb147ef6 ("serial: core: Document and assert lock requirements for irq helpers")
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Emil Kronborg <emil.kronborg(a)protonmail.com>
Link: https://lore.kernel.org/r/20240320121530.11348-1-emil.kronborg@protonmail.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/mxs-auart.c b/drivers/tty/serial/mxs-auart.c
index 4749331fe618..1e8853eae504 100644
--- a/drivers/tty/serial/mxs-auart.c
+++ b/drivers/tty/serial/mxs-auart.c
@@ -1086,11 +1086,13 @@ static void mxs_auart_set_ldisc(struct uart_port *port,
static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
{
- u32 istat;
+ u32 istat, stat;
struct mxs_auart_port *s = context;
u32 mctrl_temp = s->mctrl_prev;
- u32 stat = mxs_read(s, REG_STAT);
+ uart_port_lock(&s->port);
+
+ stat = mxs_read(s, REG_STAT);
istat = mxs_read(s, REG_INTR);
/* ack irq */
@@ -1126,6 +1128,8 @@ static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
istat &= ~AUART_INTR_TXIS;
}
+ uart_port_unlock(&s->port);
+
return IRQ_HANDLED;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 54c4ec5f8c471b7c1137a1f769648549c423c026
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042355-imagines-such-7bf6@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
54c4ec5f8c47 ("serial: mxs-auart: add spinlock around changing cts state")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c4ec5f8c471b7c1137a1f769648549c423c026 Mon Sep 17 00:00:00 2001
From: Emil Kronborg <emil.kronborg(a)protonmail.com>
Date: Wed, 20 Mar 2024 12:15:36 +0000
Subject: [PATCH] serial: mxs-auart: add spinlock around changing cts state
The uart_handle_cts_change() function in serial_core expects the caller
to hold uport->lock. For example, I have seen the below kernel splat,
when the Bluetooth driver is loaded on an i.MX28 board.
[ 85.119255] ------------[ cut here ]------------
[ 85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec
[ 85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs
[ 85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1
[ 85.151396] Hardware name: Freescale MXS (Device Tree)
[ 85.156679] Workqueue: hci0 hci_power_on [bluetooth]
(...)
[ 85.191765] uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4
[ 85.198787] mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210
(...)
Cc: stable(a)vger.kernel.org
Fixes: 4d90bb147ef6 ("serial: core: Document and assert lock requirements for irq helpers")
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Emil Kronborg <emil.kronborg(a)protonmail.com>
Link: https://lore.kernel.org/r/20240320121530.11348-1-emil.kronborg@protonmail.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/mxs-auart.c b/drivers/tty/serial/mxs-auart.c
index 4749331fe618..1e8853eae504 100644
--- a/drivers/tty/serial/mxs-auart.c
+++ b/drivers/tty/serial/mxs-auart.c
@@ -1086,11 +1086,13 @@ static void mxs_auart_set_ldisc(struct uart_port *port,
static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
{
- u32 istat;
+ u32 istat, stat;
struct mxs_auart_port *s = context;
u32 mctrl_temp = s->mctrl_prev;
- u32 stat = mxs_read(s, REG_STAT);
+ uart_port_lock(&s->port);
+
+ stat = mxs_read(s, REG_STAT);
istat = mxs_read(s, REG_INTR);
/* ack irq */
@@ -1126,6 +1128,8 @@ static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
istat &= ~AUART_INTR_TXIS;
}
+ uart_port_unlock(&s->port);
+
return IRQ_HANDLED;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 54c4ec5f8c471b7c1137a1f769648549c423c026
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042354-childless-album-3181@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
54c4ec5f8c47 ("serial: mxs-auart: add spinlock around changing cts state")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c4ec5f8c471b7c1137a1f769648549c423c026 Mon Sep 17 00:00:00 2001
From: Emil Kronborg <emil.kronborg(a)protonmail.com>
Date: Wed, 20 Mar 2024 12:15:36 +0000
Subject: [PATCH] serial: mxs-auart: add spinlock around changing cts state
The uart_handle_cts_change() function in serial_core expects the caller
to hold uport->lock. For example, I have seen the below kernel splat,
when the Bluetooth driver is loaded on an i.MX28 board.
[ 85.119255] ------------[ cut here ]------------
[ 85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec
[ 85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs
[ 85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1
[ 85.151396] Hardware name: Freescale MXS (Device Tree)
[ 85.156679] Workqueue: hci0 hci_power_on [bluetooth]
(...)
[ 85.191765] uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4
[ 85.198787] mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210
(...)
Cc: stable(a)vger.kernel.org
Fixes: 4d90bb147ef6 ("serial: core: Document and assert lock requirements for irq helpers")
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Emil Kronborg <emil.kronborg(a)protonmail.com>
Link: https://lore.kernel.org/r/20240320121530.11348-1-emil.kronborg@protonmail.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/mxs-auart.c b/drivers/tty/serial/mxs-auart.c
index 4749331fe618..1e8853eae504 100644
--- a/drivers/tty/serial/mxs-auart.c
+++ b/drivers/tty/serial/mxs-auart.c
@@ -1086,11 +1086,13 @@ static void mxs_auart_set_ldisc(struct uart_port *port,
static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
{
- u32 istat;
+ u32 istat, stat;
struct mxs_auart_port *s = context;
u32 mctrl_temp = s->mctrl_prev;
- u32 stat = mxs_read(s, REG_STAT);
+ uart_port_lock(&s->port);
+
+ stat = mxs_read(s, REG_STAT);
istat = mxs_read(s, REG_INTR);
/* ack irq */
@@ -1126,6 +1128,8 @@ static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
istat &= ~AUART_INTR_TXIS;
}
+ uart_port_unlock(&s->port);
+
return IRQ_HANDLED;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 54c4ec5f8c471b7c1137a1f769648549c423c026
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042353-mayday-porcupine-efb6@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
54c4ec5f8c47 ("serial: mxs-auart: add spinlock around changing cts state")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c4ec5f8c471b7c1137a1f769648549c423c026 Mon Sep 17 00:00:00 2001
From: Emil Kronborg <emil.kronborg(a)protonmail.com>
Date: Wed, 20 Mar 2024 12:15:36 +0000
Subject: [PATCH] serial: mxs-auart: add spinlock around changing cts state
The uart_handle_cts_change() function in serial_core expects the caller
to hold uport->lock. For example, I have seen the below kernel splat,
when the Bluetooth driver is loaded on an i.MX28 board.
[ 85.119255] ------------[ cut here ]------------
[ 85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec
[ 85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs
[ 85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1
[ 85.151396] Hardware name: Freescale MXS (Device Tree)
[ 85.156679] Workqueue: hci0 hci_power_on [bluetooth]
(...)
[ 85.191765] uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4
[ 85.198787] mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210
(...)
Cc: stable(a)vger.kernel.org
Fixes: 4d90bb147ef6 ("serial: core: Document and assert lock requirements for irq helpers")
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Emil Kronborg <emil.kronborg(a)protonmail.com>
Link: https://lore.kernel.org/r/20240320121530.11348-1-emil.kronborg@protonmail.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/mxs-auart.c b/drivers/tty/serial/mxs-auart.c
index 4749331fe618..1e8853eae504 100644
--- a/drivers/tty/serial/mxs-auart.c
+++ b/drivers/tty/serial/mxs-auart.c
@@ -1086,11 +1086,13 @@ static void mxs_auart_set_ldisc(struct uart_port *port,
static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
{
- u32 istat;
+ u32 istat, stat;
struct mxs_auart_port *s = context;
u32 mctrl_temp = s->mctrl_prev;
- u32 stat = mxs_read(s, REG_STAT);
+ uart_port_lock(&s->port);
+
+ stat = mxs_read(s, REG_STAT);
istat = mxs_read(s, REG_INTR);
/* ack irq */
@@ -1126,6 +1128,8 @@ static irqreturn_t mxs_auart_irq_handle(int irq, void *context)
istat &= ~AUART_INTR_TXIS;
}
+ uart_port_unlock(&s->port);
+
return IRQ_HANDLED;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 631426ba1d45a8672b177ee85ad4cabe760dd131
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042338-ideally-setting-9717@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
631426ba1d45 ("mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly")
0f20bba1688b ("mm/gup: explicitly define and check internal GUP flags, disallow FOLL_TOUCH")
6cd06ab12d1a ("gup: make the stack expansion warning a bit more targeted")
9471f1f2f502 ("Merge branch 'expand-stack'")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 631426ba1d45a8672b177ee85ad4cabe760dd131 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Thu, 14 Mar 2024 17:12:59 +0100
Subject: [PATCH] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle
VM_FAULT_RETRY properly
Darrick reports that in some cases where pread() would fail with -EIO and
mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ /
MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.
While the madvise() call can be interrupted by a signal, this is not the
desired behavior. MADV_POPULATE_READ / MADV_POPULATE_WRITE should behave
like page faults in that case: fail and not retry forever.
A reproducer can be found at [1].
The reason is that __get_user_pages(), as called by
faultin_vma_page_range(), will not handle VM_FAULT_RETRY in a proper way:
it will simply return 0 when VM_FAULT_RETRY happened, making
madvise_populate()->faultin_vma_page_range() retry again and again, never
setting FOLL_TRIED->FAULT_FLAG_TRIED for __get_user_pages().
__get_user_pages_locked() does what we want, but duplicating that logic in
faultin_vma_page_range() feels wrong.
So let's use __get_user_pages_locked() instead, that will detect
VM_FAULT_RETRY and set FOLL_TRIED when retrying, making the fault handler
return VM_FAULT_SIGBUS (VM_FAULT_ERROR) at some point, propagating -EFAULT
from faultin_page() to __get_user_pages(), all the way to
madvise_populate().
But, there is an issue: __get_user_pages_locked() will end up re-taking
the MM lock and then __get_user_pages() will do another VMA lookup. In
the meantime, the VMA layout could have changed and we'd fail with
different error codes than we'd want to.
As __get_user_pages() will currently do a new VMA lookup either way, let
it do the VMA handling in a different way, controlled by a new
FOLL_MADV_POPULATE flag, effectively moving these checks from
madvise_populate() + faultin_page_range() in there.
With this change, Darricks reproducer properly fails with -EFAULT, as
documented for MADV_POPULATE_READ / MADV_POPULATE_WRITE.
[1] https://lore.kernel.org/all/20240313171936.GN1927156@frogsfrogsfrogs/
Link: https://lkml.kernel.org/r/20240314161300.382526-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240314161300.382526-2-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Darrick J. Wong <djwong(a)kernel.org>
Closes: https://lore.kernel.org/all/20240311223815.GW1927156@frogsfrogsfrogs/
Cc: Darrick J. Wong <djwong(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index af8edadc05d1..1611e73b1121 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1206,6 +1206,22 @@ static long __get_user_pages(struct mm_struct *mm,
/* first iteration or cross vma bound */
if (!vma || start >= vma->vm_end) {
+ /*
+ * MADV_POPULATE_(READ|WRITE) wants to handle VMA
+ * lookups+error reporting differently.
+ */
+ if (gup_flags & FOLL_MADV_POPULATE) {
+ vma = vma_lookup(mm, start);
+ if (!vma) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ if (check_vma_flags(vma, gup_flags)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ goto retry;
+ }
vma = gup_vma_lookup(mm, start);
if (!vma && in_gate_area(mm, start)) {
ret = get_gate_page(mm, start & PAGE_MASK,
@@ -1685,35 +1701,35 @@ long populate_vma_page_range(struct vm_area_struct *vma,
}
/*
- * faultin_vma_page_range() - populate (prefault) page tables inside the
- * given VMA range readable/writable
+ * faultin_page_range() - populate (prefault) page tables inside the
+ * given range readable/writable
*
* This takes care of mlocking the pages, too, if VM_LOCKED is set.
*
- * @vma: target vma
+ * @mm: the mm to populate page tables in
* @start: start address
* @end: end address
* @write: whether to prefault readable or writable
* @locked: whether the mmap_lock is still held
*
- * Returns either number of processed pages in the vma, or a negative error
- * code on error (see __get_user_pages()).
+ * Returns either number of processed pages in the MM, or a negative error
+ * code on error (see __get_user_pages()). Note that this function reports
+ * errors related to VMAs, such as incompatible mappings, as expected by
+ * MADV_POPULATE_(READ|WRITE).
*
- * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
- * covered by the VMA. If it's released, *@locked will be set to 0.
+ * The range must be page-aligned.
+ *
+ * mm->mmap_lock must be held. If it's released, *@locked will be set to 0.
*/
-long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
- unsigned long end, bool write, int *locked)
+long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked)
{
- struct mm_struct *mm = vma->vm_mm;
unsigned long nr_pages = (end - start) / PAGE_SIZE;
int gup_flags;
long ret;
VM_BUG_ON(!PAGE_ALIGNED(start));
VM_BUG_ON(!PAGE_ALIGNED(end));
- VM_BUG_ON_VMA(start < vma->vm_start, vma);
- VM_BUG_ON_VMA(end > vma->vm_end, vma);
mmap_assert_locked(mm);
/*
@@ -1725,19 +1741,13 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
* a poisoned page.
* !FOLL_FORCE: Require proper access permissions.
*/
- gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE;
+ gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE |
+ FOLL_MADV_POPULATE;
if (write)
gup_flags |= FOLL_WRITE;
- /*
- * We want to report -EINVAL instead of -EFAULT for any permission
- * problems or incompatible mappings.
- */
- if (check_vma_flags(vma, gup_flags))
- return -EINVAL;
-
- ret = __get_user_pages(mm, start, nr_pages, gup_flags,
- NULL, locked);
+ ret = __get_user_pages_locked(mm, start, nr_pages, NULL, locked,
+ gup_flags);
lru_add_drain();
return ret;
}
diff --git a/mm/internal.h b/mm/internal.h
index 7e486f2c502c..07ad2675a88b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -686,9 +686,8 @@ struct anon_vma *folio_anon_vma(struct folio *folio);
void unmap_mapping_folio(struct folio *folio);
extern long populate_vma_page_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end, int *locked);
-extern long faultin_vma_page_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end,
- bool write, int *locked);
+extern long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked);
extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
unsigned long bytes);
@@ -1127,10 +1126,13 @@ enum {
FOLL_FAST_ONLY = 1 << 20,
/* allow unlocking the mmap lock */
FOLL_UNLOCKABLE = 1 << 21,
+ /* VMA lookup+checks compatible with MADV_POPULATE_(READ|WRITE) */
+ FOLL_MADV_POPULATE = 1 << 22,
};
#define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
- FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
+ FOLL_FAST_ONLY | FOLL_UNLOCKABLE | \
+ FOLL_MADV_POPULATE)
/*
* Indicates for which pages that are write-protected in the page table,
diff --git a/mm/madvise.c b/mm/madvise.c
index 44a498c94158..1a073fcc4c0c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -908,27 +908,14 @@ static long madvise_populate(struct vm_area_struct *vma,
{
const bool write = behavior == MADV_POPULATE_WRITE;
struct mm_struct *mm = vma->vm_mm;
- unsigned long tmp_end;
int locked = 1;
long pages;
*prev = vma;
while (start < end) {
- /*
- * We might have temporarily dropped the lock. For example,
- * our VMA might have been split.
- */
- if (!vma || start >= vma->vm_end) {
- vma = vma_lookup(mm, start);
- if (!vma)
- return -ENOMEM;
- }
-
- tmp_end = min_t(unsigned long, end, vma->vm_end);
/* Populate (prefault) page tables readable/writable. */
- pages = faultin_vma_page_range(vma, start, tmp_end, write,
- &locked);
+ pages = faultin_page_range(mm, start, end, write, &locked);
if (!locked) {
mmap_read_lock(mm);
locked = 1;
@@ -949,7 +936,7 @@ static long madvise_populate(struct vm_area_struct *vma,
pr_warn_once("%s: unhandled return value: %ld\n",
__func__, pages);
fallthrough;
- case -ENOMEM:
+ case -ENOMEM: /* No VMA or out of memory. */
return -ENOMEM;
}
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 631426ba1d45a8672b177ee85ad4cabe760dd131
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042336-tinfoil-trusting-765e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
631426ba1d45 ("mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly")
0f20bba1688b ("mm/gup: explicitly define and check internal GUP flags, disallow FOLL_TOUCH")
6cd06ab12d1a ("gup: make the stack expansion warning a bit more targeted")
9471f1f2f502 ("Merge branch 'expand-stack'")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 631426ba1d45a8672b177ee85ad4cabe760dd131 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Thu, 14 Mar 2024 17:12:59 +0100
Subject: [PATCH] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle
VM_FAULT_RETRY properly
Darrick reports that in some cases where pread() would fail with -EIO and
mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ /
MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.
While the madvise() call can be interrupted by a signal, this is not the
desired behavior. MADV_POPULATE_READ / MADV_POPULATE_WRITE should behave
like page faults in that case: fail and not retry forever.
A reproducer can be found at [1].
The reason is that __get_user_pages(), as called by
faultin_vma_page_range(), will not handle VM_FAULT_RETRY in a proper way:
it will simply return 0 when VM_FAULT_RETRY happened, making
madvise_populate()->faultin_vma_page_range() retry again and again, never
setting FOLL_TRIED->FAULT_FLAG_TRIED for __get_user_pages().
__get_user_pages_locked() does what we want, but duplicating that logic in
faultin_vma_page_range() feels wrong.
So let's use __get_user_pages_locked() instead, that will detect
VM_FAULT_RETRY and set FOLL_TRIED when retrying, making the fault handler
return VM_FAULT_SIGBUS (VM_FAULT_ERROR) at some point, propagating -EFAULT
from faultin_page() to __get_user_pages(), all the way to
madvise_populate().
But, there is an issue: __get_user_pages_locked() will end up re-taking
the MM lock and then __get_user_pages() will do another VMA lookup. In
the meantime, the VMA layout could have changed and we'd fail with
different error codes than we'd want to.
As __get_user_pages() will currently do a new VMA lookup either way, let
it do the VMA handling in a different way, controlled by a new
FOLL_MADV_POPULATE flag, effectively moving these checks from
madvise_populate() + faultin_page_range() in there.
With this change, Darricks reproducer properly fails with -EFAULT, as
documented for MADV_POPULATE_READ / MADV_POPULATE_WRITE.
[1] https://lore.kernel.org/all/20240313171936.GN1927156@frogsfrogsfrogs/
Link: https://lkml.kernel.org/r/20240314161300.382526-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240314161300.382526-2-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Darrick J. Wong <djwong(a)kernel.org>
Closes: https://lore.kernel.org/all/20240311223815.GW1927156@frogsfrogsfrogs/
Cc: Darrick J. Wong <djwong(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index af8edadc05d1..1611e73b1121 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1206,6 +1206,22 @@ static long __get_user_pages(struct mm_struct *mm,
/* first iteration or cross vma bound */
if (!vma || start >= vma->vm_end) {
+ /*
+ * MADV_POPULATE_(READ|WRITE) wants to handle VMA
+ * lookups+error reporting differently.
+ */
+ if (gup_flags & FOLL_MADV_POPULATE) {
+ vma = vma_lookup(mm, start);
+ if (!vma) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ if (check_vma_flags(vma, gup_flags)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ goto retry;
+ }
vma = gup_vma_lookup(mm, start);
if (!vma && in_gate_area(mm, start)) {
ret = get_gate_page(mm, start & PAGE_MASK,
@@ -1685,35 +1701,35 @@ long populate_vma_page_range(struct vm_area_struct *vma,
}
/*
- * faultin_vma_page_range() - populate (prefault) page tables inside the
- * given VMA range readable/writable
+ * faultin_page_range() - populate (prefault) page tables inside the
+ * given range readable/writable
*
* This takes care of mlocking the pages, too, if VM_LOCKED is set.
*
- * @vma: target vma
+ * @mm: the mm to populate page tables in
* @start: start address
* @end: end address
* @write: whether to prefault readable or writable
* @locked: whether the mmap_lock is still held
*
- * Returns either number of processed pages in the vma, or a negative error
- * code on error (see __get_user_pages()).
+ * Returns either number of processed pages in the MM, or a negative error
+ * code on error (see __get_user_pages()). Note that this function reports
+ * errors related to VMAs, such as incompatible mappings, as expected by
+ * MADV_POPULATE_(READ|WRITE).
*
- * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
- * covered by the VMA. If it's released, *@locked will be set to 0.
+ * The range must be page-aligned.
+ *
+ * mm->mmap_lock must be held. If it's released, *@locked will be set to 0.
*/
-long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
- unsigned long end, bool write, int *locked)
+long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked)
{
- struct mm_struct *mm = vma->vm_mm;
unsigned long nr_pages = (end - start) / PAGE_SIZE;
int gup_flags;
long ret;
VM_BUG_ON(!PAGE_ALIGNED(start));
VM_BUG_ON(!PAGE_ALIGNED(end));
- VM_BUG_ON_VMA(start < vma->vm_start, vma);
- VM_BUG_ON_VMA(end > vma->vm_end, vma);
mmap_assert_locked(mm);
/*
@@ -1725,19 +1741,13 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
* a poisoned page.
* !FOLL_FORCE: Require proper access permissions.
*/
- gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE;
+ gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE |
+ FOLL_MADV_POPULATE;
if (write)
gup_flags |= FOLL_WRITE;
- /*
- * We want to report -EINVAL instead of -EFAULT for any permission
- * problems or incompatible mappings.
- */
- if (check_vma_flags(vma, gup_flags))
- return -EINVAL;
-
- ret = __get_user_pages(mm, start, nr_pages, gup_flags,
- NULL, locked);
+ ret = __get_user_pages_locked(mm, start, nr_pages, NULL, locked,
+ gup_flags);
lru_add_drain();
return ret;
}
diff --git a/mm/internal.h b/mm/internal.h
index 7e486f2c502c..07ad2675a88b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -686,9 +686,8 @@ struct anon_vma *folio_anon_vma(struct folio *folio);
void unmap_mapping_folio(struct folio *folio);
extern long populate_vma_page_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end, int *locked);
-extern long faultin_vma_page_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end,
- bool write, int *locked);
+extern long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked);
extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
unsigned long bytes);
@@ -1127,10 +1126,13 @@ enum {
FOLL_FAST_ONLY = 1 << 20,
/* allow unlocking the mmap lock */
FOLL_UNLOCKABLE = 1 << 21,
+ /* VMA lookup+checks compatible with MADV_POPULATE_(READ|WRITE) */
+ FOLL_MADV_POPULATE = 1 << 22,
};
#define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
- FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
+ FOLL_FAST_ONLY | FOLL_UNLOCKABLE | \
+ FOLL_MADV_POPULATE)
/*
* Indicates for which pages that are write-protected in the page table,
diff --git a/mm/madvise.c b/mm/madvise.c
index 44a498c94158..1a073fcc4c0c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -908,27 +908,14 @@ static long madvise_populate(struct vm_area_struct *vma,
{
const bool write = behavior == MADV_POPULATE_WRITE;
struct mm_struct *mm = vma->vm_mm;
- unsigned long tmp_end;
int locked = 1;
long pages;
*prev = vma;
while (start < end) {
- /*
- * We might have temporarily dropped the lock. For example,
- * our VMA might have been split.
- */
- if (!vma || start >= vma->vm_end) {
- vma = vma_lookup(mm, start);
- if (!vma)
- return -ENOMEM;
- }
-
- tmp_end = min_t(unsigned long, end, vma->vm_end);
/* Populate (prefault) page tables readable/writable. */
- pages = faultin_vma_page_range(vma, start, tmp_end, write,
- &locked);
+ pages = faultin_page_range(mm, start, end, write, &locked);
if (!locked) {
mmap_read_lock(mm);
locked = 1;
@@ -949,7 +936,7 @@ static long madvise_populate(struct vm_area_struct *vma,
pr_warn_once("%s: unhandled return value: %ld\n",
__func__, pages);
fallthrough;
- case -ENOMEM:
+ case -ENOMEM: /* No VMA or out of memory. */
return -ENOMEM;
}
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 631426ba1d45a8672b177ee85ad4cabe760dd131
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042332-mop-broiling-603b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
631426ba1d45 ("mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly")
0f20bba1688b ("mm/gup: explicitly define and check internal GUP flags, disallow FOLL_TOUCH")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 631426ba1d45a8672b177ee85ad4cabe760dd131 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Thu, 14 Mar 2024 17:12:59 +0100
Subject: [PATCH] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle
VM_FAULT_RETRY properly
Darrick reports that in some cases where pread() would fail with -EIO and
mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ /
MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.
While the madvise() call can be interrupted by a signal, this is not the
desired behavior. MADV_POPULATE_READ / MADV_POPULATE_WRITE should behave
like page faults in that case: fail and not retry forever.
A reproducer can be found at [1].
The reason is that __get_user_pages(), as called by
faultin_vma_page_range(), will not handle VM_FAULT_RETRY in a proper way:
it will simply return 0 when VM_FAULT_RETRY happened, making
madvise_populate()->faultin_vma_page_range() retry again and again, never
setting FOLL_TRIED->FAULT_FLAG_TRIED for __get_user_pages().
__get_user_pages_locked() does what we want, but duplicating that logic in
faultin_vma_page_range() feels wrong.
So let's use __get_user_pages_locked() instead, that will detect
VM_FAULT_RETRY and set FOLL_TRIED when retrying, making the fault handler
return VM_FAULT_SIGBUS (VM_FAULT_ERROR) at some point, propagating -EFAULT
from faultin_page() to __get_user_pages(), all the way to
madvise_populate().
But, there is an issue: __get_user_pages_locked() will end up re-taking
the MM lock and then __get_user_pages() will do another VMA lookup. In
the meantime, the VMA layout could have changed and we'd fail with
different error codes than we'd want to.
As __get_user_pages() will currently do a new VMA lookup either way, let
it do the VMA handling in a different way, controlled by a new
FOLL_MADV_POPULATE flag, effectively moving these checks from
madvise_populate() + faultin_page_range() in there.
With this change, Darricks reproducer properly fails with -EFAULT, as
documented for MADV_POPULATE_READ / MADV_POPULATE_WRITE.
[1] https://lore.kernel.org/all/20240313171936.GN1927156@frogsfrogsfrogs/
Link: https://lkml.kernel.org/r/20240314161300.382526-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240314161300.382526-2-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Darrick J. Wong <djwong(a)kernel.org>
Closes: https://lore.kernel.org/all/20240311223815.GW1927156@frogsfrogsfrogs/
Cc: Darrick J. Wong <djwong(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index af8edadc05d1..1611e73b1121 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1206,6 +1206,22 @@ static long __get_user_pages(struct mm_struct *mm,
/* first iteration or cross vma bound */
if (!vma || start >= vma->vm_end) {
+ /*
+ * MADV_POPULATE_(READ|WRITE) wants to handle VMA
+ * lookups+error reporting differently.
+ */
+ if (gup_flags & FOLL_MADV_POPULATE) {
+ vma = vma_lookup(mm, start);
+ if (!vma) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ if (check_vma_flags(vma, gup_flags)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ goto retry;
+ }
vma = gup_vma_lookup(mm, start);
if (!vma && in_gate_area(mm, start)) {
ret = get_gate_page(mm, start & PAGE_MASK,
@@ -1685,35 +1701,35 @@ long populate_vma_page_range(struct vm_area_struct *vma,
}
/*
- * faultin_vma_page_range() - populate (prefault) page tables inside the
- * given VMA range readable/writable
+ * faultin_page_range() - populate (prefault) page tables inside the
+ * given range readable/writable
*
* This takes care of mlocking the pages, too, if VM_LOCKED is set.
*
- * @vma: target vma
+ * @mm: the mm to populate page tables in
* @start: start address
* @end: end address
* @write: whether to prefault readable or writable
* @locked: whether the mmap_lock is still held
*
- * Returns either number of processed pages in the vma, or a negative error
- * code on error (see __get_user_pages()).
+ * Returns either number of processed pages in the MM, or a negative error
+ * code on error (see __get_user_pages()). Note that this function reports
+ * errors related to VMAs, such as incompatible mappings, as expected by
+ * MADV_POPULATE_(READ|WRITE).
*
- * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
- * covered by the VMA. If it's released, *@locked will be set to 0.
+ * The range must be page-aligned.
+ *
+ * mm->mmap_lock must be held. If it's released, *@locked will be set to 0.
*/
-long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
- unsigned long end, bool write, int *locked)
+long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked)
{
- struct mm_struct *mm = vma->vm_mm;
unsigned long nr_pages = (end - start) / PAGE_SIZE;
int gup_flags;
long ret;
VM_BUG_ON(!PAGE_ALIGNED(start));
VM_BUG_ON(!PAGE_ALIGNED(end));
- VM_BUG_ON_VMA(start < vma->vm_start, vma);
- VM_BUG_ON_VMA(end > vma->vm_end, vma);
mmap_assert_locked(mm);
/*
@@ -1725,19 +1741,13 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
* a poisoned page.
* !FOLL_FORCE: Require proper access permissions.
*/
- gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE;
+ gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE |
+ FOLL_MADV_POPULATE;
if (write)
gup_flags |= FOLL_WRITE;
- /*
- * We want to report -EINVAL instead of -EFAULT for any permission
- * problems or incompatible mappings.
- */
- if (check_vma_flags(vma, gup_flags))
- return -EINVAL;
-
- ret = __get_user_pages(mm, start, nr_pages, gup_flags,
- NULL, locked);
+ ret = __get_user_pages_locked(mm, start, nr_pages, NULL, locked,
+ gup_flags);
lru_add_drain();
return ret;
}
diff --git a/mm/internal.h b/mm/internal.h
index 7e486f2c502c..07ad2675a88b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -686,9 +686,8 @@ struct anon_vma *folio_anon_vma(struct folio *folio);
void unmap_mapping_folio(struct folio *folio);
extern long populate_vma_page_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end, int *locked);
-extern long faultin_vma_page_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end,
- bool write, int *locked);
+extern long faultin_page_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool write, int *locked);
extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
unsigned long bytes);
@@ -1127,10 +1126,13 @@ enum {
FOLL_FAST_ONLY = 1 << 20,
/* allow unlocking the mmap lock */
FOLL_UNLOCKABLE = 1 << 21,
+ /* VMA lookup+checks compatible with MADV_POPULATE_(READ|WRITE) */
+ FOLL_MADV_POPULATE = 1 << 22,
};
#define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
- FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
+ FOLL_FAST_ONLY | FOLL_UNLOCKABLE | \
+ FOLL_MADV_POPULATE)
/*
* Indicates for which pages that are write-protected in the page table,
diff --git a/mm/madvise.c b/mm/madvise.c
index 44a498c94158..1a073fcc4c0c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -908,27 +908,14 @@ static long madvise_populate(struct vm_area_struct *vma,
{
const bool write = behavior == MADV_POPULATE_WRITE;
struct mm_struct *mm = vma->vm_mm;
- unsigned long tmp_end;
int locked = 1;
long pages;
*prev = vma;
while (start < end) {
- /*
- * We might have temporarily dropped the lock. For example,
- * our VMA might have been split.
- */
- if (!vma || start >= vma->vm_end) {
- vma = vma_lookup(mm, start);
- if (!vma)
- return -ENOMEM;
- }
-
- tmp_end = min_t(unsigned long, end, vma->vm_end);
/* Populate (prefault) page tables readable/writable. */
- pages = faultin_vma_page_range(vma, start, tmp_end, write,
- &locked);
+ pages = faultin_page_range(mm, start, end, write, &locked);
if (!locked) {
mmap_read_lock(mm);
locked = 1;
@@ -949,7 +936,7 @@ static long madvise_populate(struct vm_area_struct *vma,
pr_warn_once("%s: unhandled return value: %ld\n",
__func__, pages);
fallthrough;
- case -ENOMEM:
+ case -ENOMEM: /* No VMA or out of memory. */
return -ENOMEM;
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042301-anthill-recant-a287@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042357-unnamed-riverboat-0b22@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042354-dividers-trowel-ba4a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042350-unaware-very-3433@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042347-polar-gains-eb8f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042343-imitate-divinity-9e38@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042340-retrieval-unhook-57b9@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5bfc311dd6c376d350b39028b9000ad766ddc934
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042323-ransack-tightness-3eba@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
5bfc311dd6c3 ("usb: xhci: correct return value in case of STS_HCE")
84ac5e4fa517 ("xhci: move event processing for one interrupter to a separate function")
e30e9ad9ed66 ("xhci: update event ring dequeue pointer position to controller correctly")
143e64df1bda ("xhci: remove unnecessary event_ring_deq parameter from xhci_handle_event()")
becbd202af84 ("xhci: make isoc_bei_interval variable interrupter specific.")
4f022aad80dc ("xhci: Add interrupt pending autoclear flag to each interrupter")
c99b38c41234 ("xhci: add support to allocate several interrupters")
47f503cf5f79 ("xhci: split free interrupter into separate remove and free parts")
d1830364e963 ("xhci: Simplify event ring dequeue pointer update for port change events")
08cc5616798d ("xhci: Clean up ERST_PTR_MASK inversion")
28084d3fcc3c ("xhci: Use more than one Event Ring segment")
044818a6cd80 ("xhci: Set DESI bits in ERDP register correctly")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5bfc311dd6c376d350b39028b9000ad766ddc934 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Thu, 4 Apr 2024 15:11:05 +0300
Subject: [PATCH] usb: xhci: correct return value in case of STS_HCE
If we get STS_HCE we give up on the interrupt, but for the purpose
of IRQ handling that still counts as ours. We may return IRQ_NONE
only if we are positive that it wasn't ours. Hence correct the default.
Fixes: 2a25e66d676d ("xhci: print warning when HCE was set")
Cc: stable(a)vger.kernel.org # v6.2+
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240404121106.2842417-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52278afea94b..575f0fd9c9f1 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3133,7 +3133,7 @@ static int xhci_handle_events(struct xhci_hcd *xhci, struct xhci_interrupter *ir
irqreturn_t xhci_irq(struct usb_hcd *hcd)
{
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
- irqreturn_t ret = IRQ_NONE;
+ irqreturn_t ret = IRQ_HANDLED;
u32 status;
spin_lock(&xhci->lock);
@@ -3141,12 +3141,13 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
status = readl(&xhci->op_regs->status);
if (status == ~(u32)0) {
xhci_hc_died(xhci);
- ret = IRQ_HANDLED;
goto out;
}
- if (!(status & STS_EINT))
+ if (!(status & STS_EINT)) {
+ ret = IRQ_NONE;
goto out;
+ }
if (status & STS_HCE) {
xhci_warn(xhci, "WARNING: Host Controller Error\n");
@@ -3156,7 +3157,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
if (status & STS_FATAL) {
xhci_warn(xhci, "WARNING: Host System Error\n");
xhci_halt(xhci);
- ret = IRQ_HANDLED;
goto out;
}
@@ -3167,7 +3167,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
*/
status |= STS_EINT;
writel(status, &xhci->op_regs->status);
- ret = IRQ_HANDLED;
/* This is the handler of the primary interrupter */
xhci_handle_events(xhci, xhci->interrupters[0]);
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 5bfc311dd6c376d350b39028b9000ad766ddc934
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042322-lunchbox-flint-5a2b@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
5bfc311dd6c3 ("usb: xhci: correct return value in case of STS_HCE")
84ac5e4fa517 ("xhci: move event processing for one interrupter to a separate function")
e30e9ad9ed66 ("xhci: update event ring dequeue pointer position to controller correctly")
143e64df1bda ("xhci: remove unnecessary event_ring_deq parameter from xhci_handle_event()")
becbd202af84 ("xhci: make isoc_bei_interval variable interrupter specific.")
4f022aad80dc ("xhci: Add interrupt pending autoclear flag to each interrupter")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5bfc311dd6c376d350b39028b9000ad766ddc934 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Thu, 4 Apr 2024 15:11:05 +0300
Subject: [PATCH] usb: xhci: correct return value in case of STS_HCE
If we get STS_HCE we give up on the interrupt, but for the purpose
of IRQ handling that still counts as ours. We may return IRQ_NONE
only if we are positive that it wasn't ours. Hence correct the default.
Fixes: 2a25e66d676d ("xhci: print warning when HCE was set")
Cc: stable(a)vger.kernel.org # v6.2+
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240404121106.2842417-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52278afea94b..575f0fd9c9f1 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3133,7 +3133,7 @@ static int xhci_handle_events(struct xhci_hcd *xhci, struct xhci_interrupter *ir
irqreturn_t xhci_irq(struct usb_hcd *hcd)
{
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
- irqreturn_t ret = IRQ_NONE;
+ irqreturn_t ret = IRQ_HANDLED;
u32 status;
spin_lock(&xhci->lock);
@@ -3141,12 +3141,13 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
status = readl(&xhci->op_regs->status);
if (status == ~(u32)0) {
xhci_hc_died(xhci);
- ret = IRQ_HANDLED;
goto out;
}
- if (!(status & STS_EINT))
+ if (!(status & STS_EINT)) {
+ ret = IRQ_NONE;
goto out;
+ }
if (status & STS_HCE) {
xhci_warn(xhci, "WARNING: Host Controller Error\n");
@@ -3156,7 +3157,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
if (status & STS_FATAL) {
xhci_warn(xhci, "WARNING: Host System Error\n");
xhci_halt(xhci);
- ret = IRQ_HANDLED;
goto out;
}
@@ -3167,7 +3167,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
*/
status |= STS_EINT;
writel(status, &xhci->op_regs->status);
- ret = IRQ_HANDLED;
/* This is the handler of the primary interrupter */
xhci_handle_events(xhci, xhci->interrupters[0]);
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 91f10a3d21f2313485178d49efef8a3ba02bd8c7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042332-depose-frenzy-e027@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
91f10a3d21f2 ("Revert "drm/amd/display: fix USB-C flag update after enc10 feature init"")
7d1e9d0369e4 ("drm/amd/display: Check DP Alt mode DPCS state via DMUB")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 91f10a3d21f2313485178d49efef8a3ba02bd8c7 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Fri, 29 Mar 2024 18:03:03 -0400
Subject: [PATCH] Revert "drm/amd/display: fix USB-C flag update after enc10
feature init"
This reverts commit b5abd7f983e14054593dc91d6df2aa5f8cc67652.
This change breaks DSC on 4k monitors at 144Hz over USB-C.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3254
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Charlene Liu <charlene.liu(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
index e224a028d68a..8a0460e86309 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
@@ -248,14 +248,12 @@ void dcn32_link_encoder_construct(
enc10->base.hpd_source = init_data->hpd_source;
enc10->base.connector = init_data->connector;
+ if (enc10->base.connector.id == CONNECTOR_ID_USBC)
+ enc10->base.features.flags.bits.DP_IS_USB_C = 1;
+
enc10->base.preferred_engine = ENGINE_ID_UNKNOWN;
enc10->base.features = *enc_features;
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
-
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
enc10->base.transmitter = init_data->transmitter;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
index 81e349d5835b..da94e5309fba 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
@@ -184,6 +184,8 @@ void dcn35_link_encoder_construct(
enc10->base.hpd_source = init_data->hpd_source;
enc10->base.connector = init_data->connector;
+ if (enc10->base.connector.id == CONNECTOR_ID_USBC)
+ enc10->base.features.flags.bits.DP_IS_USB_C = 1;
enc10->base.preferred_engine = ENGINE_ID_UNKNOWN;
@@ -238,8 +240,6 @@ void dcn35_link_encoder_construct(
}
enc10->base.features.flags.bits.HDMI_6GB_EN = 1;
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
if (bp_funcs->get_connector_speed_cap_info)
result = bp_funcs->get_connector_speed_cap_info(enc10->base.ctx->dc_bios,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042348-galleria-wielder-75a5@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042346-correct-footgear-b44a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042345-brunette-striking-6f32@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
With deferred IO enabled, a page fault happens when data is written to the
framebuffer device. Then driver determines which page is being updated by
calculating the offset of the written virtual address within the virtual
memory area, and uses this offset to get the updated page within the
internal buffer. This page is later copied to hardware (thus the name
"deferred IO").
This calculation is only correct if the virtual memory area is mapped to
the beginning of the internal buffer. Otherwise this is wrong. For example,
if users do:
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the
internal buffer. This offset 0xff000 is not accounted for, and wrong page
is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for
example:
int fd = open("/dev/fb0", O_RDWR, 0);
char *ptr = (char *) 0x20000000ul;
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
*ptr = 0; // write #1
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
*ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address
(0x20000000ul), and the driver mistakenly thinks the same page is being
written to. When the second write happens, the driver thinks this is the
same page as the last time, and reuse the page from write #1. The driver
then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the
correct page unlocked. It is unclear what will happen with memory
management subsystem after that, but likely something terrible.
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@or…
Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
drivers/video/fbdev/core/fb_defio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index dae96c9f61cf..d5d6cd9e8b29 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
*/
static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
{
- unsigned long offset = vmf->address - vmf->vma->vm_start;
+ unsigned long offset = vmf->address - vmf->vma->vm_start
+ + (vmf->vma->vm_pgoff << PAGE_SHIFT);
struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
--
2.39.2
Inspired by a patch from [Justin][1] I took a closer look at kdb_read().
Despite Justin's patch being a (correct) one-line manipulation it was a
tough patch to review because the surrounding code was hard to read and
it looked like there were unfixed problems.
This series isn't enough to make kdb_read() beautiful but it does make
it shorter, easier to reason about and fixes two buffer overflows and a
screen redraw problem!
[1]: https://lore.kernel.org/all/20240403-strncpy-kernel-debug-kdb-kdb_io-c-v1-1…
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
Changes in v2:
- No code changes!
- I belatedly realized that one of the cleanups actually fixed a buffer
overflow so there are changes to Cc: (to add stable@...) and to one
of the patch descriptions.
- Link to v1: https://lore.kernel.org/r/20240416-kgdb_read_refactor-v1-0-b18c2d01076d@lin…
---
Daniel Thompson (7):
kdb: Fix buffer overflow during tab-complete
kdb: Use format-strings rather than '\0' injection in kdb_read()
kdb: Fix console handling when editing and tab-completing commands
kdb: Merge identical case statements in kdb_read()
kdb: Use format-specifiers rather than memset() for padding in kdb_read()
kdb: Replace double memcpy() with memmove() in kdb_read()
kdb: Simplify management of tmpbuffer in kdb_read()
kernel/debug/kdb/kdb_io.c | 133 ++++++++++++++++++++--------------------------
1 file changed, 58 insertions(+), 75 deletions(-)
---
base-commit: dccce9b8780618986962ba37c373668bcf426866
change-id: 20240415-kgdb_read_refactor-2ea2dfc15dbb
Best regards,
--
Daniel Thompson <daniel.thompson(a)linaro.org>
[ Upstream commit 9fb9eb4b59acc607e978288c96ac7efa917153d4 ]
systems, using igb driver, crash while executing poweroff command
as per following call stack:
crash> bt -a
PID: 62583 TASK: ffff97ebbf28dc40 CPU: 0 COMMAND: "poweroff"
#0 [ffffa7adcd64f8a0] machine_kexec at ffffffffa606c7c1
#1 [ffffa7adcd64f900] __crash_kexec at ffffffffa613bb52
#2 [ffffa7adcd64f9d0] panic at ffffffffa6099c45
#3 [ffffa7adcd64fa50] oops_end at ffffffffa603359a
#4 [ffffa7adcd64fa78] die at ffffffffa6033c32
#5 [ffffa7adcd64faa8] do_trap at ffffffffa60309a0
#6 [ffffa7adcd64faf8] do_error_trap at ffffffffa60311e7
#7 [ffffa7adcd64fbc0] do_invalid_op at ffffffffa6031320
#8 [ffffa7adcd64fbd0] invalid_op at ffffffffa6a01f2a
[exception RIP: free_msi_irqs+408]
RIP: ffffffffa645d248 RSP: ffffa7adcd64fc88 RFLAGS: 00010286
RAX: ffff97eb1396fe00 RBX: 0000000000000000 RCX: ffff97eb1396fe00
RDX: ffff97eb1396fe00 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffa7adcd64fcb0 R8: 0000000000000002 R9: 000000000000fbff
R10: 0000000000000000 R11: 0000000000000000 R12: ffff98c047af4720
R13: ffff97eb87cd32a0 R14: ffff97eb87cd3000 R15: ffffa7adcd64fd57
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffffa7adcd64fc80] free_msi_irqs at ffffffffa645d0fc
#10 [ffffa7adcd64fcb8] pci_disable_msix at ffffffffa645d896
#11 [ffffa7adcd64fce0] igb_reset_interrupt_capability at ffffffffc024f335 [igb]
#12 [ffffa7adcd64fd08] __igb_shutdown at ffffffffc0258ed7 [igb]
#13 [ffffa7adcd64fd48] igb_shutdown at ffffffffc025908b [igb]
#14 [ffffa7adcd64fd70] pci_device_shutdown at ffffffffa6441e3a
#15 [ffffa7adcd64fd98] device_shutdown at ffffffffa6570260
#16 [ffffa7adcd64fdc8] kernel_power_off at ffffffffa60c0725
#17 [ffffa7adcd64fdd8] SYSC_reboot at ffffffffa60c08f1
#18 [ffffa7adcd64ff18] sys_reboot at ffffffffa60c09ee
#19 [ffffa7adcd64ff28] do_syscall_64 at ffffffffa6003ca9
#20 [ffffa7adcd64ff50] entry_SYSCALL_64_after_hwframe at ffffffffa6a001b1
This happens because igb_shutdown has not yet freed up allocated irqs and
free_msi_irqs finds irq_has_action true for involved msi irqs here and this
condition triggers BUG_ON.
Freeing irqs before proceeding further in igb_clear_interrupt_scheme,
fixes this problem.
Signed-off-by: Imran Khan <imran.f.khan(a)oracle.com>
---
This issue does not happen in v5.17 or later kernel versions because
'commit 9fb9eb4b59ac ("PCI/MSI: Let core code free MSI descriptors")',
explicitly frees up MSI based irqs and hence indirectly fixes this issue
as well. Also this is why I have mentioned this commit as equivalent
upstream commit. But this upstream change itself is dependent on a bunch
of changes starting from 'commit 288c81ce4be7 ("PCI/MSI: Move code into a
separate directory")', which refactored msi driver into multiple parts.
So another way of fixing this issue would be to backport these patches and
get this issue implictly fixed.
Kindly let me know if my current patch is not acceptable and in that case
will it be fine if I backport the above mentioned msi driver refactoring
patches to LST.
drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 7c42a99be5065..5b59fb65231ba 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -107,6 +107,7 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
static void igb_free_all_tx_resources(struct igb_adapter *);
static void igb_free_all_rx_resources(struct igb_adapter *);
static void igb_setup_mrqc(struct igb_adapter *);
+static void igb_free_irq(struct igb_adapter *adapter);
static int igb_probe(struct pci_dev *, const struct pci_device_id *);
static void igb_remove(struct pci_dev *pdev);
static int igb_sw_init(struct igb_adapter *);
@@ -1080,6 +1081,7 @@ static void igb_free_q_vectors(struct igb_adapter *adapter)
*/
static void igb_clear_interrupt_scheme(struct igb_adapter *adapter)
{
+ igb_free_irq(adapter);
igb_free_q_vectors(adapter);
igb_reset_interrupt_capability(adapter);
}
--
2.34.1
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 72b779bc09422..d1a0f36dcdd43 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -101,7 +101,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0791480bf922b..88ca5015f987e 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -86,7 +86,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0d9b7d453a877..75907f77f9e38 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -87,7 +87,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 5e2657c1dbbe6..a0c5a372dcf62 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -85,7 +85,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index ea695c4a7a3fb..3bdf6df4b553e 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -83,7 +83,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 32572982f72e6..e337fec9b18e1 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -83,7 +83,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
Removes the output of this purely informational message from the
kernel buffer:
"ntfs3: Max link count 4000"
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Cc: stable(a)vger.kernel.org
---
fs/ntfs3/super.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 9df7c20d066f..ac4722011140 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1804,8 +1804,6 @@ static int __init init_ntfs_fs(void)
{
int err;
- pr_info("ntfs3: Max link count %u\n", NTFS_LINK_MAX);
-
if (IS_ENABLED(CONFIG_NTFS3_FS_POSIX_ACL))
pr_info("ntfs3: Enabled Linux POSIX ACLs support\n");
if (IS_ENABLED(CONFIG_NTFS3_64BIT_CLUSTER))
--
2.34.1
When counting and checking hard links in an ntfs file record,
struct MFT_REC {
struct NTFS_RECORD_HEADER rhdr; // 'FILE'
__le16 seq; // 0x10: Sequence number for this record.
>> __le16 hard_links; // 0x12: The number of hard links to record.
__le16 attr_off; // 0x14: Offset to attributes.
...
the ntfs3 driver ignored short names (DOS names), causing the link count
to be reduced by 1 and messages to be output to dmesg.
For Windows, such a situation is a minor error, meaning chkdsk does not report
errors on such a volume, and in the case of using the /f switch, it silently
corrects them, reporting that no errors were found. This does not affect
the consistency of the file system.
Nevertheless, the behavior in the ntfs3 driver is incorrect and
changes the content of the file system. This patch should fix that.
PS: most likely, there has been a confusion of concepts
MFT_REC::hard_links and inode::__i_nlink.
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Cc: stable(a)vger.kernel.org
---
fs/ntfs3/inode.c | 7 ++++---
fs/ntfs3/record.c | 11 ++---------
2 files changed, 6 insertions(+), 12 deletions(-)
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index eb7a8c9fba01..05f169018c4e 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -37,7 +37,7 @@ static struct inode *ntfs_read_mft(struct inode *inode,
bool is_dir;
unsigned long ino = inode->i_ino;
u32 rp_fa = 0, asize, t32;
- u16 roff, rsize, names = 0;
+ u16 roff, rsize, names = 0, links = 0;
const struct ATTR_FILE_NAME *fname = NULL;
const struct INDEX_ROOT *root;
struct REPARSE_DATA_BUFFER rp; // 0x18 bytes
@@ -200,11 +200,12 @@ static struct inode *ntfs_read_mft(struct inode *inode,
rsize < SIZEOF_ATTRIBUTE_FILENAME)
goto out;
+ names += 1;
fname = Add2Ptr(attr, roff);
if (fname->type == FILE_NAME_DOS)
goto next_attr;
- names += 1;
+ links += 1;
if (name && name->len == fname->name_len &&
!ntfs_cmp_names_cpu(name, (struct le_str *)&fname->name_len,
NULL, false))
@@ -429,7 +430,7 @@ static struct inode *ntfs_read_mft(struct inode *inode,
ni->mi.dirty = true;
}
- set_nlink(inode, names);
+ set_nlink(inode, links);
if (S_ISDIR(mode)) {
ni->std_fa |= FILE_ATTRIBUTE_DIRECTORY;
diff --git a/fs/ntfs3/record.c b/fs/ntfs3/record.c
index 6aa3a9d44df1..6c76503edc20 100644
--- a/fs/ntfs3/record.c
+++ b/fs/ntfs3/record.c
@@ -534,16 +534,9 @@ bool mi_remove_attr(struct ntfs_inode *ni, struct mft_inode *mi,
if (aoff + asize > used)
return false;
- if (ni && is_attr_indexed(attr)) {
+ if (ni && is_attr_indexed(attr) && attr->type == ATTR_NAME) {
u16 links = le16_to_cpu(ni->mi.mrec->hard_links);
- struct ATTR_FILE_NAME *fname =
- attr->type != ATTR_NAME ?
- NULL :
- resident_data_ex(attr,
- SIZEOF_ATTRIBUTE_FILENAME);
- if (fname && fname->type == FILE_NAME_DOS) {
- /* Do not decrease links count deleting DOS name. */
- } else if (!links) {
+ if (!links) {
/* minor error. Not critical. */
} else {
ni->mi.mrec->hard_links = cpu_to_le16(links - 1);
--
2.34.1
When KUnit tests are enabled, under very big kernel configurations
(e.g. `allyesconfig`), we can trigger a `rustdoc` ICE [1]:
RUSTDOC TK rust/kernel/lib.rs
error: the compiler unexpectedly panicked. this is a bug.
The reason is that this build step has a duplicated `@rustc_cfg` argument,
which contains the kernel configuration, and thus a lot of arguments. The
factor 2 happens to be enough to reach the ICE.
Thus remove the unneeded `@rustc_cfg`. By doing so, we clean up the
command and workaround the ICE.
The ICE has been fixed in the upcoming Rust 1.79 [2].
Cc: stable(a)vger.kernel.org
Fixes: a66d733da801 ("rust: support running Rust documentation tests as KUnit ones")
Link: https://github.com/rust-lang/rust/issues/122722 [1]
Link: https://github.com/rust-lang/rust/pull/122840 [2]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
rust/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/rust/Makefile b/rust/Makefile
index 846e6ab9d5a9..86a125c4243c 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -175,7 +175,6 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
OBJTREE=$(abspath $(objtree)) \
$(RUSTDOC) --test $(rust_flags) \
- @$(objtree)/include/generated/rustc_cfg \
-L$(objtree)/$(obj) --extern alloc --extern kernel \
--extern build_error --extern macros \
--extern bindings --extern uapi \
base-commit: 4cece764965020c22cff7665b18a012006359095
--
2.44.0
On Sun, Apr 21, 2024 at 01:11:19PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> configs/hardening: Fix disabling UBSAN configurations
>
> to the 6.8-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> configs-hardening-fix-disabling-ubsan-configurations.patch
> and it can be found in the queue-6.8 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit a54fba0bb1f52707b423c908e153d6429d08db58
> Author: Nathan Chancellor <nathan(a)kernel.org>
> Date: Thu Apr 11 11:11:06 2024 -0700
>
> configs/hardening: Fix disabling UBSAN configurations
>
> [ Upstream commit e048d668f2969cf2b76e0fa21882a1b3bb323eca ]
While I think backporting this makes sense, I don't know that
backporting 918327e9b7ff ("ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL") to
resolve the conflict with 6.8 is entirely necessary (or beneficial, I
don't know how Kees feels about it though). I've attached a version that
applies cleanly to 6.8, in case it is desirable.
Cheers,
Nathan
Hi,
Based on the issue reported here [0], commit 0553e753ea9e ("net/mlx5: E-
switch, store eswitch pointer before registering devlink_param") is missing from
kernel 6.6.x due to a missing tag. Could you please include it in 6.6.x and
6.8.x?
[0] https://lore.kernel.org/netdev/20240422170428.32576-1-oxana@cloudflare.com/
Thanks,
Dragos
The patch titled
Subject: maple_tree: fix mas_empty_area_rev() null pointer dereference
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
maple_tree-fix-mas_empty_area_rev-null-pointer-dereference.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: maple_tree: fix mas_empty_area_rev() null pointer dereference
Date: Mon, 22 Apr 2024 16:33:49 -0400
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL. This will lead to a null pointer dereference when checking
information in the NULL node, which is done in mas_data_end().
Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.
A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.
Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: Marius Fleischer <fleischermarius(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Tested-by: Marius Fleischer <fleischermarius(a)gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--- a/lib/maple_tree.c~maple_tree-fix-mas_empty_area_rev-null-pointer-dereference
+++ a/lib/maple_tree.c
@@ -5109,18 +5109,18 @@ int mas_empty_area_rev(struct ma_state *
if (size == 0 || max - min < size - 1)
return -EINVAL;
- if (mas_is_start(mas)) {
+ if (mas_is_start(mas))
mas_start(mas);
- mas->offset = mas_data_end(mas);
- } else if (mas->offset >= 2) {
- mas->offset -= 2;
- } else if (!mas_rewind_node(mas)) {
+ else if ((mas->offset < 2) && (!mas_rewind_node(mas)))
return -EBUSY;
- }
- /* Empty set. */
- if (mas_is_none(mas) || mas_is_ptr(mas))
+ if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
return mas_sparse_area(mas, min, max, size, false);
+ else if (mas->offset >= 2)
+ mas->offset -= 2;
+ else
+ mas->offset = mas_data_end(mas);
+
/* The start of the window can only be within these values. */
mas->index = min;
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
maple_tree-fix-mas_empty_area_rev-null-pointer-dereference.patch
Currently, when kdb is compiled with keyboard support, then we will use
schedule_work() to provoke reset of the keyboard status. Unfortunately
schedule_work() gets called from the kgdboc post-debug-exception
handler. That risks deadlock since schedule_work() is not NMI-safe and,
even on platforms where the NMI is not directly used for debugging, the
debug trap can have NMI-like behaviour depending on where breakpoints
are placed.
Fix this by using the irq work system, which is NMI-safe, to defer the
call to schedule_work() to a point when it is safe to call.
Reported-by: Liuye <liu.yeC(a)h3c.com>
Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
drivers/tty/serial/kgdboc.c | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 7ce7bb1640054..adcea70fd7507 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -19,6 +19,7 @@
#include <linux/console.h>
#include <linux/vt_kern.h>
#include <linux/input.h>
+#include <linux/irq_work.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/serial_core.h>
@@ -48,6 +49,25 @@ static struct kgdb_io kgdboc_earlycon_io_ops;
static int (*earlycon_orig_exit)(struct console *con);
#endif /* IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE) */
+/*
+ * When we leave the debug trap handler we need to reset the keyboard status
+ * (since the original keyboard state gets partially clobbered by kdb use of
+ * the keyboard).
+ *
+ * The path to deliver the reset is somewhat circuitous.
+ *
+ * To deliver the reset we register an input handler, reset the keyboard and
+ * then deregister the input handler. However, to get this done right, we do
+ * have to carefully manage the calling context because we can only register
+ * input handlers from task context.
+ *
+ * In particular we need to trigger the action from the debug trap handler with
+ * all its NMI and/or NMI-like oddities. To solve this the kgdboc trap exit code
+ * (the "post_exception" callback) uses irq_work_queue(), which is NMI-safe, to
+ * schedule a callback from a hardirq context. From there we have to defer the
+ * work again, this time using schedule_Work(), to get a callback using the
+ * system workqueue, which runs in task context.
+ */
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -99,10 +119,17 @@ static void kgdboc_restore_input_helper(struct work_struct *dummy)
static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
+static void kgdboc_queue_restore_input_helper(struct irq_work *unused)
+{
+ schedule_work(&kgdboc_restore_input_work);
+}
+
+static DEFINE_IRQ_WORK(kgdboc_restore_input_irq_work, kgdboc_queue_restore_input_helper);
+
static void kgdboc_restore_input(void)
{
if (likely(system_state == SYSTEM_RUNNING))
- schedule_work(&kgdboc_restore_input_work);
+ irq_work_queue(&kgdboc_restore_input_irq_work);
}
static int kgdboc_register_kbd(char **cptr)
@@ -133,6 +160,7 @@ static void kgdboc_unregister_kbd(void)
i--;
}
}
+ irq_work_sync(&kgdboc_restore_input_irq_work);
flush_work(&kgdboc_restore_input_work);
}
#else /* ! CONFIG_KDB_KEYBOARD */
---
base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
change-id: 20240419-kgdboc_fix_schedule_work-f0cb44b8a354
Best regards,
--
Daniel Thompson <daniel.thompson(a)linaro.org>
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL. This will lead to a null pointer dereference when
checking information in the NULL node, which is done in mas_data_end().
Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.
A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.
Reported-by: Marius Fleischer <fleischermarius(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Tested-by: Marius Fleischer <fleischermarius(a)gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: maple-tree(a)lists.infradead.org
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
lib/maple_tree.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 55e1b35bf877..2d7d27e6ae3c 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5109,18 +5109,18 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
if (size == 0 || max - min < size - 1)
return -EINVAL;
- if (mas_is_start(mas)) {
+ if (mas_is_start(mas))
mas_start(mas);
- mas->offset = mas_data_end(mas);
- } else if (mas->offset >= 2) {
- mas->offset -= 2;
- } else if (!mas_rewind_node(mas)) {
+ else if ((mas->offset < 2) && (!mas_rewind_node(mas)))
return -EBUSY;
- }
- /* Empty set. */
- if (mas_is_none(mas) || mas_is_ptr(mas))
+ if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
return mas_sparse_area(mas, min, max, size, false);
+ else if (mas->offset >= 2)
+ mas->offset -= 2;
+ else
+ mas->offset = mas_data_end(mas);
+
/* The start of the window can only be within these values. */
mas->index = min;
--
2.43.0
The patch titled
Subject: mm/userfaultfd: reset ptes when close() for wr-protected ones
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/userfaultfd: reset ptes when close() for wr-protected ones
Date: Mon, 22 Apr 2024 09:33:11 -0400
Userfaultfd unregister includes a step to remove wr-protect bits from all
the relevant pgtable entries, but that only covered an explicit
UFFDIO_UNREGISTER ioctl, not a close() on the userfaultfd itself. Cover
that too. This fixes a WARN trace.
Link: https://lore.kernel.org/all/000000000000ca4df20616a0fe16@google.com/
Analyzed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lkml.kernel.org/r/20240422133311.2987675-1-peterx@redhat.com
Reported-by: syzbot+d8426b591c36b21c750e(a)syzkaller.appspotmail.com
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/userfaultfd.c~mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones
+++ a/fs/userfaultfd.c
@@ -895,6 +895,10 @@ static int userfaultfd_release(struct in
prev = vma;
continue;
}
+ /* Reset ptes for the whole vma range if wr-protected */
+ if (userfaultfd_wp(vma))
+ uffd_wp_range(vma, vma->vm_start,
+ vma->vm_end - vma->vm_start, false);
new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
vma = vma_modify_flags_uffd(&vmi, prev, vma, vma->vm_start,
vma->vm_end, new_flags,
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-hugetlb-fix-missing-hugetlb_lock-for-resv-uncharge.patch
mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones.patch
mm-hmm-process-pud-swap-entry-without-pud_huge.patch
mm-gup-cache-p4d-in-follow_p4d_mask.patch
mm-gup-check-p4d-presence-before-going-on.patch
mm-x86-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-sparc-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-arm-use-macros-to-define-pmd-pud-helpers.patch
mm-arm-redefine-pmd_huge-with-pmd_leaf.patch
mm-arm64-merge-pxd_huge-and-pxd_leaf-definitions.patch
mm-powerpc-redefine-pxd_huge-with-pxd_leaf.patch
mm-gup-merge-pxd-huge-mapping-checks.patch
mm-treewide-replace-pxd_huge-with-pxd_leaf.patch
mm-treewide-remove-pxd_huge.patch
mm-arm-remove-pmd_thp_or_huge.patch
mm-document-pxd_leaf-api.patch
mm-always-initialise-folio-_deferred_list-fix.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation-fix.patch
mm-kconfig-config_pgtable_has_huge_leaves.patch
mm-hugetlb-declare-hugetlbfs_pagecache_present-non-static.patch
mm-make-hpage_pxd_-macros-even-if-thp.patch
mm-introduce-vma_pgtable_walk_beginend.patch
mm-arch-provide-pud_pfn-fallback.patch
mm-arch-provide-pud_pfn-fallback-fix.patch
mm-gup-drop-folio_fast_pin_allowed-in-hugepd-processing.patch
mm-gup-refactor-record_subpages-to-find-1st-small-page.patch
mm-gup-handle-hugetlb-for-no_page_table.patch
mm-gup-cache-pudp-in-follow_pud_mask.patch
mm-gup-handle-huge-pud-for-follow_pud_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask-fix.patch
mm-gup-handle-hugepd-for-follow_page.patch
mm-gup-handle-hugetlb-in-the-generic-follow_page_mask-code.patch
mm-allow-anon-exclusive-check-over-hugetlb-tail-pages.patch
mm-hugetlb-assert-hugetlb_lock-in-__hugetlb_cgroup_commit_charge.patch
mm-page_table_check-support-userfault-wr-protect-entries.patch
commit 2f4a4d63a193be6fd530d180bb13c3592052904c modified
cpc_read/cpc_write to use access_width to read CPC registers. For PCC
registers the access width field in the ACPI register macro specifies
the PCC subspace id. For non-zero PCC subspace id the access width is
incorrectly treated as access width. This causes errors when reading
from PCC registers in the CPPC driver.
For PCC registers base the size of read/write on the bit width field.
The debug message in cpc_read/cpc_write is updated to print relevant
information for the address space type used to read the register.
Signed-off-by: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
Tested-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha(a)linux.microsoft.com>
Cc: 5.15+ <stable(a)vger.kernel.org> # 5.15+
---
When testing v6.9-rc1 kernel on AmpereOne system dmesg showed that
cpufreq policy had failed to initialize on some cores during boot because
cpufreq->get() always returned 0. On this system CPPC registers are in PCC
subspace index 2 that are 32 bits wide. With this patch the CPPC driver
interpreted the access width field as 16 bits, causing the register read
to roll over too quickly to provide valid values during frequency
computation.
v2:
- Use size variable in debug print message
- Use size instead of reg->bit_width for acpi_os_read_memory and
acpi_os_write_memory
v3:
- Fix language in error messages in cpc_read/cpc_write
drivers/acpi/cppc_acpi.c | 53 ++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..7d476988fae3 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1002,14 +1002,14 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
}
*val = 0;
+ size = GET_BIT_WIDTH(reg);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
u32 val_u32;
acpi_status status;
status = acpi_os_read_port((acpi_io_address)reg->address,
- &val_u32, width);
+ &val_u32, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to read SystemIO port %llx\n",
reg->address);
@@ -1018,17 +1018,22 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = val_u32;
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_read_ffh(cpu, reg, val);
else
return acpi_os_read_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
switch (size) {
case 8:
@@ -1044,8 +1049,13 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = readq_relaxed(vaddr);
break;
default:
- pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot read %u bit width from system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
return -EFAULT;
}
@@ -1063,12 +1073,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
struct cpc_reg *reg = ®_res->cpc_entry.reg;
+ size = GET_BIT_WIDTH(reg);
+
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
acpi_status status;
status = acpi_os_write_port((acpi_io_address)reg->address,
- (u32)val, width);
+ (u32)val, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to write SystemIO port %llx\n",
reg->address);
@@ -1076,17 +1087,22 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
}
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_write_ffh(cpu, reg, val);
else
return acpi_os_write_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
val = MASK_VAL(reg, val);
@@ -1105,8 +1121,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
writeq_relaxed(val, vaddr);
break;
default:
- pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot write %u bit width to system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
ret_val = -EFAULT;
break;
}
--
2.43.1
Commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for
system memory accesses") neglected to properly wrap the bit_offset shift
when it comes to applying the mask. This may cause incorrect values to be
read and may cause the cpufreq module not be loaded.
[ 11.059751] cpu_capacity: CPU0 missing/invalid highest performance.
[ 11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs
Also, corrected the bitmask generation in GENMASK (extra bit being added).
Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Jarred White <jarredwhite(a)linux.microsoft.com>
CC: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
CC: stable(a)vger.kernel.org #5.15+
---
drivers/acpi/cppc_acpi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..00a30ca35e78 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -170,8 +170,8 @@ show_cppc_data(cppc_get_perf_ctrs, cppc_perf_fb_ctrs, wraparound_time);
#define GET_BIT_WIDTH(reg) ((reg)->access_width ? (8 << ((reg)->access_width - 1)) : (reg)->bit_width)
/* Shift and apply the mask for CPC reads/writes */
-#define MASK_VAL(reg, val) ((val) >> ((reg)->bit_offset & \
- GENMASK(((reg)->bit_width), 0)))
+#define MASK_VAL(reg, val) (((val) >> (reg)->bit_offset) & \
+ GENMASK(((reg)->bit_width) - 1, 0))
static ssize_t show_feedback_ctrs(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
--
2.33.8
From: George Shen <george.shen(a)amd.com>
Theoretically rare corner case where ceil(Y) results in rounding up to
an integer. If this happens, the 1 should be carried over to the X
value.
CC: stable(a)vger.kernel.org
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Signed-off-by: George Shen <george.shen(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
---
.../drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
index 48e63550f696..03b4ac2f1991 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
@@ -395,6 +395,12 @@ void dcn31_hpo_dp_link_enc_set_throttled_vcp_size(
x),
25));
+ // If y rounds up to integer, carry it over to x.
+ if (y >> 25) {
+ x += 1;
+ y = 0;
+ }
+
switch (stream_encoder_inst) {
case 0:
REG_SET_2(DP_DPHY_SYM32_VC_RATE_CNTL0, 0,
--
2.44.0
An async dio write to a sparse file can generate a lot of extents
and when we unlink this file (using rm), the kernel can be busy in umapping
and freeing those extents as part of transaction processing.
Add cond_resched() in xfs_defer_finish_noroll() to avoid soft lockups
messages. Here is a call trace of such soft lockup.
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [rm:81335]
CPU: 1 PID: 81335 Comm: rm Kdump: loaded Tainted: G L X 5.14.21-150500.53-default
NIP [c00800001b174768] xfs_extent_busy_trim+0xc0/0x2a0 [xfs]
LR [c00800001b1746f4] xfs_extent_busy_trim+0x4c/0x2a0 [xfs]
Call Trace:
0xc0000000a8268340 (unreliable)
xfs_alloc_compute_aligned+0x5c/0x150 [xfs]
xfs_alloc_ag_vextent_size+0x1dc/0x8c0 [xfs]
xfs_alloc_ag_vextent+0x17c/0x1c0 [xfs]
xfs_alloc_fix_freelist+0x274/0x4b0 [xfs]
xfs_free_extent_fix_freelist+0x84/0xe0 [xfs]
__xfs_free_extent+0xa0/0x240 [xfs]
xfs_trans_free_extent+0x6c/0x140 [xfs]
xfs_defer_finish_noroll+0x2b0/0x650 [xfs]
xfs_inactive_truncate+0xe8/0x140 [xfs]
xfs_fs_destroy_inode+0xdc/0x320 [xfs]
destroy_inode+0x6c/0xc0
do_unlinkat+0x1fc/0x410
sys_unlinkat+0x58/0xb0
system_call_exception+0x15c/0x330
system_call_common+0xec/0x250
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
cc: stable(a)vger.kernel.org
cc: Ojaswin Mujoo <ojaswin(a)linux.ibm.com>
---
fs/xfs/libxfs/xfs_defer.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index c13276095cc0..cb185b97447d 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -705,6 +705,7 @@ xfs_defer_finish_noroll(
error = xfs_defer_finish_one(*tp, dfp);
if (error && error != -EAGAIN)
goto out_shutdown;
+ cond_resched();
}
/* Requeue the paused items in the outgoing transaction. */
--
2.44.0
Qualcomm ROME controllers can be registered from the Bluetooth line
discipline and in this case the HCI UART serdev pointer is NULL.
Add the missing sanity check to prevent a NULL-pointer dereference when
setup() is called for a non-serdev controller.
Fixes: e9b3e5b8c657 ("Bluetooth: hci_qca: only assign wakeup with serial port support")
Cc: stable(a)vger.kernel.org # 6.2
Cc: Zhengping Jiang <jiangzp(a)google.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/bluetooth/hci_qca.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 94c85f4fbf3b..b621a0a40ea4 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1958,8 +1958,10 @@ static int qca_setup(struct hci_uart *hu)
qca_debugfs_init(hdev);
hu->hdev->hw_error = qca_hw_error;
hu->hdev->cmd_timeout = qca_cmd_timeout;
- if (device_can_wakeup(hu->serdev->ctrl->dev.parent))
- hu->hdev->wakeup = qca_wakeup;
+ if (hu->serdev) {
+ if (device_can_wakeup(hu->serdev->ctrl->dev.parent))
+ hu->hdev->wakeup = qca_wakeup;
+ }
} else if (ret == -ENOENT) {
/* No patch/nvm-config found, run with original fw/config */
set_bit(QCA_ROM_FW, &qca->flags);
--
2.43.2
Qualcomm ROME controllers can be registered from the Bluetooth line
discipline and in this case the HCI UART serdev pointer is NULL.
Add the missing sanity check to prevent a NULL-pointer dereference when
wakeup() is called for a non-serdev controller during suspend.
Just return true for now to restore the original behaviour and address
the crash with pre-6.2 kernels, which do not have commit e9b3e5b8c657
("Bluetooth: hci_qca: only assign wakeup with serial port support") that
causes the crash to happen already at setup() time.
Fixes: c1a74160eaf1 ("Bluetooth: hci_qca: Add device_may_wakeup support")
Cc: stable(a)vger.kernel.org # 5.13
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/bluetooth/hci_qca.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 92fa20f5ac7d..94c85f4fbf3b 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1672,6 +1672,9 @@ static bool qca_wakeup(struct hci_dev *hdev)
struct hci_uart *hu = hci_get_drvdata(hdev);
bool wakeup;
+ if (!hu->serdev)
+ return true;
+
/* BT SoC attached through the serial bus is handled by the serdev driver.
* So we need to use the device handle of the serdev driver to get the
* status of device may wakeup.
--
2.43.2
If none of the clusters are added because of some error, fail to load
driver without presenting root domain. In this case root domain will
present invalid data.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Fixes: 01c10f88c9b7 ("platform/x86/intel-uncore-freq: tpmi: Provide cluster level control")
Cc: <stable(a)vger.kernel.org> # 6.5+
---
This error can be reproduced in the pre production hardware only.
So can go through regular cycle and they apply to stable.
.../x86/intel/uncore-frequency/uncore-frequency-tpmi.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
index bd75d61ff8a6..587437211d72 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
@@ -240,6 +240,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
bool read_blocked = 0, write_blocked = 0;
struct intel_tpmi_plat_info *plat_info;
struct tpmi_uncore_struct *tpmi_uncore;
+ bool uncore_sysfs_added = false;
int ret, i, pkg = 0;
int num_resources;
@@ -384,9 +385,15 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
}
/* Point to next cluster offset */
cluster_offset >>= UNCORE_MAX_CLUSTER_PER_DOMAIN;
+ uncore_sysfs_added = true;
}
}
+ if (!uncore_sysfs_added) {
+ ret = -ENODEV;
+ goto remove_clusters;
+ }
+
auxiliary_set_drvdata(auxdev, tpmi_uncore);
tpmi_uncore->root_cluster.root_domain = true;
--
2.40.1
Nick Bowler reported that sparc64 failed to bring all his CPU's online,
and that turned out to be an easy fix.
The sparc64 build was rather noisy with a lot of warnings which had
irritated me enough to go ahead and fix them.
With this set of patches my arch/sparc/ is almost warning free for
all{no,yes,mod}config + defconfig builds.
There is one warning about "clone3 not implemented", which I have ignored.
The warning fixes hides the fact that sparc64 is not yet y2038 prepared,
and it would be preferable if someone knowledgeable would fix this
poperly.
All fixes looks like 6.9 material to me.
Sam
---
Sam Ravnborg (10):
sparc64: Fix prototype warning for init_vdso_image
sparc64: Fix prototype warnings in traps_64.c
sparc64: Fix prototype warning for vmemmap_free
sparc64: Fix prototype warning for alloc_irqstack_bootmem
sparc64: Fix prototype warning for uprobe_trap
sparc64: Fix prototype warning for dma_4v_iotsb_bind
sparc64: Fix prototype warnings in adi_64.c
sparc64: Fix prototype warning for sched_clock
sparc64: Fix number of online CPUs
sparc64: Fix prototype warnings for vdso
arch/sparc/include/asm/smp_64.h | 2 --
arch/sparc/include/asm/vdso.h | 10 ++++++++++
arch/sparc/kernel/adi_64.c | 14 +++++++-------
arch/sparc/kernel/kernel.h | 4 ++++
arch/sparc/kernel/pci_sun4v.c | 6 +++---
arch/sparc/kernel/prom_64.c | 4 +++-
arch/sparc/kernel/setup_64.c | 3 +--
arch/sparc/kernel/smp_64.c | 14 --------------
arch/sparc/kernel/time_64.c | 1 +
arch/sparc/kernel/traps_64.c | 10 +++++-----
arch/sparc/kernel/uprobes.c | 2 ++
arch/sparc/mm/init_64.c | 5 -----
arch/sparc/vdso/vclock_gettime.c | 1 +
arch/sparc/vdso/vma.c | 5 +++--
14 files changed, 40 insertions(+), 41 deletions(-)
---
base-commit: 84b76d05828a1909e20d0f66553b876b801f98c8
change-id: 20240329-sparc64-warnings-668cc90ef53b
Best regards,
--
Sam Ravnborg <sam(a)ravnborg.org>
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: mc: mark the media devnode as registered from the, start
Author: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Date: Fri Feb 23 09:46:19 2024 +0100
First the media device node was created, and if successful it was
marked as 'registered'. This leaves a small race condition where
an application can open the device node and get an error back
because the 'registered' flag was not yet set.
Change the order: first set the 'registered' flag, then actually
register the media device node. If that fails, then clear the flag.
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Acked-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Fixes: cf4b9211b568 ("[media] media: Media device node support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
drivers/media/mc/mc-devnode.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
---
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 7f67825c8757..318e267e798e 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -245,15 +245,14 @@ int __must_check media_devnode_register(struct media_device *mdev,
kobject_set_name(&devnode->cdev.kobj, "media%d", devnode->minor);
/* Part 3: Add the media and char device */
+ set_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
ret = cdev_device_add(&devnode->cdev, &devnode->dev);
if (ret < 0) {
+ clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
pr_err("%s: cdev_device_add failed\n", __func__);
goto cdev_add_error;
}
- /* Part 4: Activate this minor. The char device can now be used. */
- set_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
-
return 0;
cdev_add_error:
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
Notes:
v2: rebased, send fix patch separately per Greg's feedback.
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index bad28cf42010..5834e829f391 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5365,7 +5365,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 7270d4d22207..5b7c80b99ae8 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -421,7 +421,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
--
2.44.0.769.g3c40516874-goog
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6d029c25b71f2de2838a6f093ce0fa0e69336154
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041509-triangle-parlor-1783@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d029c25b71f2de2838a6f093ce0fa0e69336154 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Tue, 9 Apr 2024 15:38:03 +0200
Subject: [PATCH] selftests/timers/posix_timers: Reimplement
check_timer_distribution()
check_timer_distribution() runs ten threads in a busy loop and tries to
test that the kernel distributes a process posix CPU timer signal to every
thread over time.
There is not guarantee that this is true even after commit bcb7ee79029d
("posix-timers: Prefer delivery of signals to the current thread") because
that commit only avoids waking up the sleeping process leader thread, but
that has nothing to do with the actual signal delivery.
As the signal is process wide the first thread which observes sigpending
and wins the race to lock sighand will deliver the signal. Testing shows
that this hangs on a regular base because some threads never win the race.
The comment "This primarily tests that the kernel does not favour any one."
is wrong. The kernel does favour a thread which hits the timer interrupt
when CLOCK_PROCESS_CPUTIME_ID expires.
Rewrite the test so it only checks that the group leader sleeping in join()
never receives SIGALRM and the thread which burns CPU cycles receives all
signals.
In older kernels which do not have commit bcb7ee79029d ("posix-timers:
Prefer delivery of signals to the current thread") the test-case fails
immediately, the very 1st tick wakes the leader up. Otherwise it quickly
succeeds after 100 ticks.
CI testing wants to use newer selftest versions on stable kernels. In this
case the test is guaranteed to fail.
So check in the failure case whether the kernel version is less than v6.3
and skip the test result in that case.
[ tglx: Massaged change log, renamed the version check helper ]
Fixes: e797203fb3ba ("selftests/timers/posix_timers: Test delivery of signals across threads")
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240409133802.GD29396@redhat.com
diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h
index 541bf192e30e..973b18e156b2 100644
--- a/tools/testing/selftests/kselftest.h
+++ b/tools/testing/selftests/kselftest.h
@@ -51,6 +51,7 @@
#include <stdarg.h>
#include <string.h>
#include <stdio.h>
+#include <sys/utsname.h>
#endif
#ifndef ARRAY_SIZE
@@ -388,4 +389,16 @@ static inline __printf(1, 2) int ksft_exit_skip(const char *msg, ...)
exit(KSFT_SKIP);
}
+static inline int ksft_min_kernel_version(unsigned int min_major,
+ unsigned int min_minor)
+{
+ unsigned int major, minor;
+ struct utsname info;
+
+ if (uname(&info) || sscanf(info.release, "%u.%u.", &major, &minor) != 2)
+ ksft_exit_fail_msg("Can't parse kernel version\n");
+
+ return major > min_major || (major == min_major && minor >= min_minor);
+}
+
#endif /* __KSELFTEST_H */
diff --git a/tools/testing/selftests/timers/posix_timers.c b/tools/testing/selftests/timers/posix_timers.c
index d49dd3ffd0d9..d86a0e00711e 100644
--- a/tools/testing/selftests/timers/posix_timers.c
+++ b/tools/testing/selftests/timers/posix_timers.c
@@ -184,80 +184,71 @@ static int check_timer_create(int which)
return 0;
}
-int remain;
-__thread int got_signal;
+static pthread_t ctd_thread;
+static volatile int ctd_count, ctd_failed;
-static void *distribution_thread(void *arg)
+static void ctd_sighandler(int sig)
{
- while (__atomic_load_n(&remain, __ATOMIC_RELAXED));
- return NULL;
+ if (pthread_self() != ctd_thread)
+ ctd_failed = 1;
+ ctd_count--;
}
-static void distribution_handler(int nr)
+static void *ctd_thread_func(void *arg)
{
- if (!__atomic_exchange_n(&got_signal, 1, __ATOMIC_RELAXED))
- __atomic_fetch_sub(&remain, 1, __ATOMIC_RELAXED);
-}
-
-/*
- * Test that all running threads _eventually_ receive CLOCK_PROCESS_CPUTIME_ID
- * timer signals. This primarily tests that the kernel does not favour any one.
- */
-static int check_timer_distribution(void)
-{
- int err, i;
- timer_t id;
- const int nthreads = 10;
- pthread_t threads[nthreads];
struct itimerspec val = {
.it_value.tv_sec = 0,
.it_value.tv_nsec = 1000 * 1000,
.it_interval.tv_sec = 0,
.it_interval.tv_nsec = 1000 * 1000,
};
+ timer_t id;
- remain = nthreads + 1; /* worker threads + this thread */
- signal(SIGALRM, distribution_handler);
- err = timer_create(CLOCK_PROCESS_CPUTIME_ID, NULL, &id);
- if (err < 0) {
- ksft_perror("Can't create timer");
- return -1;
- }
- err = timer_settime(id, 0, &val, NULL);
- if (err < 0) {
- ksft_perror("Can't set timer");
- return -1;
- }
+ /* 1/10 seconds to ensure the leader sleeps */
+ usleep(10000);
- for (i = 0; i < nthreads; i++) {
- err = pthread_create(&threads[i], NULL, distribution_thread,
- NULL);
- if (err) {
- ksft_print_msg("Can't create thread: %s (%d)\n",
- strerror(errno), errno);
- return -1;
- }
- }
+ ctd_count = 100;
+ if (timer_create(CLOCK_PROCESS_CPUTIME_ID, NULL, &id))
+ return "Can't create timer\n";
+ if (timer_settime(id, 0, &val, NULL))
+ return "Can't set timer\n";
- /* Wait for all threads to receive the signal. */
- while (__atomic_load_n(&remain, __ATOMIC_RELAXED));
+ while (ctd_count > 0 && !ctd_failed)
+ ;
- for (i = 0; i < nthreads; i++) {
- err = pthread_join(threads[i], NULL);
- if (err) {
- ksft_print_msg("Can't join thread: %s (%d)\n",
- strerror(errno), errno);
- return -1;
- }
- }
+ if (timer_delete(id))
+ return "Can't delete timer\n";
- if (timer_delete(id)) {
- ksft_perror("Can't delete timer");
- return -1;
- }
+ return NULL;
+}
- ksft_test_result_pass("check_timer_distribution\n");
+/*
+ * Test that only the running thread receives the timer signal.
+ */
+static int check_timer_distribution(void)
+{
+ const char *errmsg;
+
+ signal(SIGALRM, ctd_sighandler);
+
+ errmsg = "Can't create thread\n";
+ if (pthread_create(&ctd_thread, NULL, ctd_thread_func, NULL))
+ goto err;
+
+ errmsg = "Can't join thread\n";
+ if (pthread_join(ctd_thread, (void **)&errmsg) || errmsg)
+ goto err;
+
+ if (!ctd_failed)
+ ksft_test_result_pass("check signal distribution\n");
+ else if (ksft_min_kernel_version(6, 3))
+ ksft_test_result_fail("check signal distribution\n");
+ else
+ ksft_test_result_skip("check signal distribution (old kernel)\n");
return 0;
+err:
+ ksft_print_msg(errmsg);
+ return -1;
}
int main(int argc, char **argv)
Hi
We have published a survey on E-bike Drive and Deceleration System. If you have further interest in this or related reports, we are happy to share sample reports for your reference, please contact: kiko(a)vicmarketresearch.com
Some of the prominent players reviewed in the research report include:
Shimano
Bosch
Yamaha
Bafang Electric
Brose
Ananda
Aikem
TQ-Group
Panasonic
MAHLE
……
The primary objectives of this report are to provide
1) global market size and forecasts, growth rates, market dynamics, industry structure and developments, market situation, trends;
2) global market share and ranking by company;
3) comprehensive presentation of the global market for E-bike Drive and Deceleration System, with both quantitative and qualitative analysis through detailed segmentation;
4) detailed value chain analysis and review of growth factors essential for the existing market players and new entrants;
5) emerging opportunities in the market and the future impact of major drivers and restraints of the market.
Segment by Type
Mid Motor
Geared Hub Motor
Direct Drive Hub Motor
Segment by Application
OEM market
Aftermarket
More companies that not listed here are also available.
Thanks for you reading.
Each RPMh VRM accelerator resource has 3 or 4 contiguous 4-byte aligned
addresses associated with it. These control voltage, enable state, mode,
and in legacy targets, voltage headroom. The current in-flight request
checking logic looks for exact address matches. Requests for different
addresses of the same RPMh resource as thus not detected as in-flight.
Add new cmd-db API cmd_db_match_resource_addr() to enhance the in-flight
request check for VRM requests by ignoring the address offset.
This ensures that only one request is allowed to be in-flight for a given
VRM resource. This is needed to avoid scenarios where request commands are
carried out by RPMh hardware out-of-order leading to LDO regulator
over-current protection triggering.
Fixes: 658628e7ef78 ("drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs")
cc: stable(a)vger.kernel.org
Reviewed-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
Tested-by: Elliot Berman <quic_eberman(a)quicinc.com> # sm8650-qrd
Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com>
---
Changes in v4:
- Simplify cmd_db_match_resource_addr()
- Remove unrelated changes to newly added logic
- Update function description comments
- Replace Signed-off-by: with Tested-by: from Elliot
- Link to v3: https://lore.kernel.org/r/20240212-rpmh-rsc-fixes-v3-1-1be0d705dbb5@quicinc…
Changes in v3:
- Fix s-o-b chain
- Add cmd-db API to compare addresses
- Reuse already defined resource types in cmd-db
- Add Fixes tag and Cc to stable
- Retain Reviewed-by tag of v2
- Link to v2: https://lore.kernel.org/r/20240119-rpmh-rsc-fixes-v2-1-e42c0a9e36f0@quicinc…
Changes in v2:
- Use GENMASK() and FIELD_GET()
- Link to v1: https://lore.kernel.org/r/20240117-rpmh-rsc-fixes-v1-1-71ee4f8f72a4@quicinc…
---
drivers/soc/qcom/cmd-db.c | 32 +++++++++++++++++++++++++++++++-
drivers/soc/qcom/rpmh-rsc.c | 3 ++-
include/soc/qcom/cmd-db.h | 10 +++++++++-
3 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/drivers/soc/qcom/cmd-db.c b/drivers/soc/qcom/cmd-db.c
index a5fd68411bed..86fb2cd4f484 100644
--- a/drivers/soc/qcom/cmd-db.c
+++ b/drivers/soc/qcom/cmd-db.c
@@ -1,6 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (c) 2016-2018, 2020, The Linux Foundation. All rights reserved. */
+/*
+ * Copyright (c) 2016-2018, 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#include <linux/bitfield.h>
#include <linux/debugfs.h>
#include <linux/kernel.h>
#include <linux/module.h>
@@ -17,6 +21,8 @@
#define MAX_SLV_ID 8
#define SLAVE_ID_MASK 0x7
#define SLAVE_ID_SHIFT 16
+#define SLAVE_ID(addr) FIELD_GET(GENMASK(19, 16), addr)
+#define VRM_ADDR(addr) FIELD_GET(GENMASK(19, 4), addr)
/**
* struct entry_header: header for each entry in cmddb
@@ -220,6 +226,30 @@ const void *cmd_db_read_aux_data(const char *id, size_t *len)
}
EXPORT_SYMBOL_GPL(cmd_db_read_aux_data);
+/**
+ * cmd_db_match_resource_addr() - Compare if both Resource addresses are same
+ *
+ * @addr1: Resource address to compare
+ * @addr2: Resource address to compare
+ *
+ * Return: true if two addresses refer to the same resource, false otherwise
+ */
+bool cmd_db_match_resource_addr(u32 addr1, u32 addr2)
+{
+ /*
+ * Each RPMh VRM accelerator resource has 3 or 4 contiguous 4-byte
+ * aligned addresses associated with it. Ignore the offset to check
+ * for VRM requests.
+ */
+ if (addr1 == addr2)
+ return true;
+ else if (SLAVE_ID(addr1) == CMD_DB_HW_VRM && VRM_ADDR(addr1) == VRM_ADDR(addr2))
+ return true;
+
+ return false;
+}
+EXPORT_SYMBOL_GPL(cmd_db_match_resource_addr);
+
/**
* cmd_db_read_slave_id - Get the slave ID for a given resource address
*
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index a021dc71807b..daf64be966fe 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023-2024, Qualcomm Innovation Center, Inc. All rights reserved.
*/
#define pr_fmt(fmt) "%s " fmt, KBUILD_MODNAME
@@ -557,7 +558,7 @@ static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
addr = read_tcs_cmd(drv, drv->regs[RSC_DRV_CMD_ADDR], i, j);
for (k = 0; k < msg->num_cmds; k++) {
- if (addr == msg->cmds[k].addr)
+ if (cmd_db_match_resource_addr(msg->cmds[k].addr, addr))
return -EBUSY;
}
}
diff --git a/include/soc/qcom/cmd-db.h b/include/soc/qcom/cmd-db.h
index c8bb56e6852a..47a6cab75e63 100644
--- a/include/soc/qcom/cmd-db.h
+++ b/include/soc/qcom/cmd-db.h
@@ -1,5 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (c) 2016-2018, The Linux Foundation. All rights reserved. */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
#ifndef __QCOM_COMMAND_DB_H__
#define __QCOM_COMMAND_DB_H__
@@ -21,6 +24,8 @@ u32 cmd_db_read_addr(const char *resource_id);
const void *cmd_db_read_aux_data(const char *resource_id, size_t *len);
+bool cmd_db_match_resource_addr(u32 addr1, u32 addr2);
+
enum cmd_db_hw_type cmd_db_read_slave_id(const char *resource_id);
int cmd_db_ready(void);
@@ -31,6 +36,9 @@ static inline u32 cmd_db_read_addr(const char *resource_id)
static inline const void *cmd_db_read_aux_data(const char *resource_id, size_t *len)
{ return ERR_PTR(-ENODEV); }
+static inline bool cmd_db_match_resource_addr(u32 addr1, u32 addr2)
+{ return false; }
+
static inline enum cmd_db_hw_type cmd_db_read_slave_id(const char *resource_id)
{ return -ENODEV; }
---
base-commit: 615d300648869c774bd1fe54b4627bb0c20faed4
change-id: 20240210-rpmh-rsc-fixes-372a79ab364b
Best regards,
--
Maulik Shah <quic_mkshah(a)quicinc.com>
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f120a24c9ae6..2596cbfa16d0 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5408,7 +5408,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 221ab7a6384a..3c522698083f 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -426,7 +426,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
--
2.44.0.683.g7961c838ac-goog
There are two issues with SDHC2 configuration for SA8155P-ADP,
which prevent use of SDHC2 and causes issues with ethernet:
- Card Detect pin for SHDC2 on SA8155P-ADP is connected to gpio4 of
PMM8155AU_1, not to SoC itself. SoC's gpio4 is used for DWMAC
TX. If sdhc driver probes after dwmac driver, it reconfigures
gpio4 and this breaks Ethernet MAC.
- pinctrl configuration mentions gpio96 as CD pin. It seems it was
copied from some SM8150 example, because as mentioned above,
correct CD pin is gpio4 on PMM8155AU_1.
This patch fixes both mentioned issues by providing correct pin handle
and pinctrl configuration.
Fixes: 0deb2624e2d0 ("arm64: dts: qcom: sa8155p-adp: Add support for uSD card")
Cc: stable(a)vger.kernel.org
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk(a)epam.com>
---
In v3:
- Moved regulator changes to a separate patch
- Renamed pinctrl node
- Moved pinctrl node so it will appear in alphabetical order
In v2:
- Added "Fixes:" tag
- CCed stable ML
- Fixed pinctrl configuration
- Extended voltage range for L13C voltage regulator
---
arch/arm64/boot/dts/qcom/sa8155p-adp.dts | 30 ++++++++++--------------
1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
index 5e4287f8c8cd1..b2cf2c988336c 100644
--- a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
+++ b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
@@ -367,6 +367,16 @@ queue0 {
};
};
+&pmm8155au_1_gpios {
+ pmm8155au_1_sdc2_cd: sdc2-cd-default-state {
+ pins = "gpio4";
+ function = "normal";
+ input-enable;
+ bias-pull-up;
+ power-source = <0>;
+ };
+};
+
&qupv3_id_1 {
status = "okay";
};
@@ -384,10 +394,10 @@ &remoteproc_cdsp {
&sdhc_2 {
status = "okay";
- cd-gpios = <&tlmm 4 GPIO_ACTIVE_LOW>;
+ cd-gpios = <&pmm8155au_1_gpios 4 GPIO_ACTIVE_LOW>;
pinctrl-names = "default", "sleep";
- pinctrl-0 = <&sdc2_on>;
- pinctrl-1 = <&sdc2_off>;
+ pinctrl-0 = <&sdc2_on &pmm8155au_1_sdc2_cd>;
+ pinctrl-1 = <&sdc2_off &pmm8155au_1_sdc2_cd>;
vqmmc-supply = <&vreg_l13c_2p96>; /* IO line power */
vmmc-supply = <&vreg_l17a_2p96>; /* Card power line */
bus-width = <4>;
@@ -505,13 +515,6 @@ data-pins {
bias-pull-up; /* pull up */
drive-strength = <16>; /* 16 MA */
};
-
- sd-cd-pins {
- pins = "gpio96";
- function = "gpio";
- bias-pull-up; /* pull up */
- drive-strength = <2>; /* 2 MA */
- };
};
sdc2_off: sdc2-off-state {
@@ -532,13 +535,6 @@ data-pins {
bias-pull-up; /* pull up */
drive-strength = <2>; /* 2 MA */
};
-
- sd-cd-pins {
- pins = "gpio96";
- function = "gpio";
- bias-pull-up; /* pull up */
- drive-strength = <2>; /* 2 MA */
- };
};
usb2phy_ac_en1_default: usb2phy-ac-en1-default-state {
--
2.44.0
Since some version of Linux (I think since some 5.x), my computer
started to turn off randomly, especially when under a load. A workaround
is to put it into power saving mode (using Power in Gnome Settings).
This makes it almost 2 times slower.
Unmodified Asus X515EA. I recently updated BIOS to Asus's version 315,
but this didn't help.
Previously we cleared it from dust and replaced the thermal paste, but
that didn't fix the problem.
I am attaching dmidecode.
In my previous email, I forgot to tell Linux version:
$ uname -a
Linux victor 6.5.0-28-generic #29-Ubuntu SMP PREEMPT_DYNAMIC Thu Mar 28
23:46:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Hi,
The Framework 13 quirk for s2idle got extended to cover another BIOS
version very recently.
Can you please backport this to 6.6.y and later as well?
f609e7b1b49e ("platform/x86/amd/pmc: Extend Framework 13 quirk to more
BIOSes")
Thanks,
Hello.
These are the remaining bugfix patches for the MT7530 DSA subdriver. They
didn't apply as is to the 5.15 stable tree so I have submitted the adjusted
versions in this thread. Please apply them in the order they were
submitted.
Signed-off-by: Arınç ÜNAL <arinc.unal(a)arinc9.com>
---
Arınç ÜNAL (3):
net: dsa: mt7530: set all CPU ports in MT7531_CPU_PMAP
net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530
net: dsa: mt7530: fix enabling EEE on MT7531 switch on all boards
Vladimir Oltean (1):
net: dsa: introduce preferred_default_local_cpu_port and use on MT7530
drivers/net/dsa/mt7530.c | 54 ++++++++++++++++++++++++++++++++++--------------
drivers/net/dsa/mt7530.h | 2 ++
include/net/dsa.h | 8 +++++++
net/dsa/dsa2.c | 24 ++++++++++++++++++++-
4 files changed, 71 insertions(+), 17 deletions(-)
---
base-commit: c52b9710c83d3b8ab63bb217cc7c8b61e13f12cd
change-id: 20240420-for-stable-5-15-backports-d1fca1fee8c9
Best regards,
--
Arınç ÜNAL <arinc.unal(a)arinc9.com>
On 4/19/24 2:47 PM, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> ravb: Group descriptor types used in Rx ring
>
> to the 6.8-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> ravb-group-descriptor-types-used-in-rx-ring.patch
> and it can be found in the queue-6.8 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit fb17fd565be203e2aa62544a586a72430c457751
> Author: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
> Date: Mon Mar 4 12:08:53 2024 +0100
>
> ravb: Group descriptor types used in Rx ring
>
> [ Upstream commit 4123c3fbf8632e5c553222bf1c10b3a3e0a8dc06 ]
>
> The Rx ring can either be made up of normal or extended descriptors, not
> a mix of the two at the same time. Make this explicit by grouping the
> two variables in a rx_ring union.
>
> The extension of the storage for more than one queue of normal
> descriptors from a single to NUM_RX_QUEUE queues have no practical
> effect. But aids in making the code readable as the code that uses it
> already piggyback on other members of struct ravb_private that are
> arrays of max length NUM_RX_QUEUE, e.g. rx_desc_dma. This will also make
> further refactoring easier.
>
> While at it, rename the normal descriptor Rx ring to make it clear it's
> not strictly related to the GbEthernet E-MAC IP found in RZ/G2L, normal
> descriptors could be used on R-Car SoCs too.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
> Reviewed-by: Paul Barker <paul.barker.ct(a)bp.renesas.com>
> Reviewed-by: Sergey Shtylyov <s.shtylyov(a)omp.ru>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Stable-dep-of: def52db470df ("net: ravb: Count packets instead of descriptors in R-Car RX path")
Hm, I doubt this patch is actually necessary here...
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[...]
MBR, Sergey
Since some version of Linux (I think since some 5.x), my computer
started to turn off randomly, especially when under a load. A workaround
is to put it into power saving mode (using Power in Gnome Settings).
This makes it almost 2 times slower.
Unmodified Asus X515EA. I recently updated BIOS to Asus's version 315,
but this didn't help.
Previously we cleared it from dust and replaced the thermal paste, but
that didn't fix the problem.
I am attaching dmidecode.
The patch titled
Subject: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
Date: Fri, 19 Apr 2024 16:58:19 +0800
When I did memory failure tests recently, below warning occurs:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
<TASK>
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
panic+0x326/0x350
check_panic_on_warn+0x4f/0x50
__warn+0x98/0x190
report_bug+0x18e/0x1a0
handle_bug+0x3d/0x70
exc_invalid_op+0x18/0x70
asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
After git bisecting and digging into the code, I believe the root cause is
that _deferred_list field of folio is unioned with _hugetlb_subpool field.
In __update_and_free_hugetlb_folio(), folio->_deferred_list is
initialized leading to corrupted folio->_hugetlb_subpool when folio is
hugetlb. Later free_huge_folio() will use _hugetlb_subpool and above
warning happens.
But it is assumed hugetlb flag must have been cleared when calling
folio_put() in update_and_free_hugetlb_folio(). This assumption is broken
due to below race:
CPU1 CPU2
dissolve_free_huge_page update_and_free_pages_bulk
update_and_free_hugetlb_folio hugetlb_vmemmap_restore_folios
folio_clear_hugetlb_vmemmap_optimized
clear_flag = folio_test_hugetlb_vmemmap_optimized
if (clear_flag) <-- False, it's already cleared.
__folio_clear_hugetlb(folio) <-- Hugetlb is not cleared.
folio_put
free_huge_folio <-- free_the_page is expected.
list_for_each_entry()
__folio_clear_hugetlb <-- Too late.
Fix this issue by checking whether folio is hugetlb directly instead of
checking clear_flag to close the race window.
Link: https://lkml.kernel.org/r/20240419085819.1901645-1-linmiaohe@huawei.com
Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio
+++ a/mm/hugetlb.c
@@ -1781,7 +1781,7 @@ static void __update_and_free_hugetlb_fo
* If vmemmap pages were allocated above, then we need to clear the
* hugetlb destructor under the hugetlb lock.
*/
- if (clear_dtor) {
+ if (folio_test_hugetlb(folio)) {
spin_lock_irq(&hugetlb_lock);
__clear_hugetlb_destructor(h, folio);
spin_unlock_irq(&hugetlb_lock);
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio.patch
(struct dirty_throttle_control *)->thresh is an unsigned long, but is
passed as the u32 divisor argument to div_u64(). On architectures where
unsigned long is 64 bytes, the argument will be implicitly truncated.
Use div64_u64() instead of div_u64() so that the value used in the "is
this a safe division" check is the same as the divisor.
Also, remove redundant cast of the numerator to u64, as that should
happen implicitly.
This would be difficult to exploit in memcg domain, given the
ratio-based arithmetic domain_drity_limits() uses, but is much easier in
global writeback domain with a BDI_CAP_STRICTLIMIT-backing device, using
e.g. vm.dirty_bytes=(1<<32)*PAGE_SIZE so that dtc->thresh == (1<<32)
Fixes: f6789593d5ce ("mm/page-writeback.c: fix divide by zero in bdi_dirty_limits()")
Cc: Maxim Patlasov <MPatlasov(a)parallels.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
---
mm/page-writeback.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index cd4e4ae77c40a..02147b61712bc 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1638,7 +1638,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
*/
dtc->wb_thresh = __wb_calc_thresh(dtc);
dtc->wb_bg_thresh = dtc->thresh ?
- div_u64((u64)dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
+ div64_u64(dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
/*
* In order to avoid the stacked BDI deadlock we need
--
2.43.0.429.g432eaa2c6b-goog
Hi Nam,
+CC stable( heads up as this is a regression affecting 5.15.y and
probably others, Greg: this was reproducible upstream so reported
everything w.r.t upstream code but initially found on 5.15.y)
On 19/04/24 20:29, Nam Cao wrote:
> On 2024-04-18 Harshit Mogalapalli wrote:
>> While fuzzing 5.15.y kernel with Syzkaller, we noticed a INFO: task hung
>> bug in fb_deferred_io_work()
>
> I think the problem is because of improper offset address calculation.
> The kernel calculate address offset with:
> offset = vmf->address - vmf->vma->vm_start
>
> Now the problem is that your C program mmap the framebuffer at 2
> different offsets:
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
>
> but the kernel doesn't take these different offsets into account.
> So, 2 different pages are mistakenly recognized as the same page.
>
> Can you try the following patch?
This patch works well against the reproducer, this simplified repro and
the longer repro which syzkaller generated couldn't trigger any hang
with the below patch applied.
Thanks,
Harshit
>
> Best regards,
> Nam
>
> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
> index dae96c9f61cf..d5d6cd9e8b29 100644
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
> */
> static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
> {
> - unsigned long offset = vmf->address - vmf->vma->vm_start;
> + unsigned long offset = vmf->address - vmf->vma->vm_start
> + + (vmf->vma->vm_pgoff << PAGE_SHIFT);
> struct page *page = vmf->page;
>
> file_update_time(vmf->vma->vm_file);
>
>
commit 3aadf100f93d80815685493d60cd8cab206403df upstream.
This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
nak'd it, and Greg said on the thread that it links that he wasn't going
to take it either, especially since it's not his code or his tree, but
then, seemingly accidentally, it got pushed up some months later, in
what looks like a mistake, with no further discussion in the linked
thread. So revert it, since it's clearly not intended.
Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
Cc: stable(a)vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531095119.11202-2-bchalios@amazon.es
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/virt/vmgenid.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index b67a28da4702..a1c467a0e9f7 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
static void vmgenid_notify(struct acpi_device *device, u32 event)
{
struct vmgenid_state *state = acpi_driver_data(device);
- char *envp[] = { "NEW_VMGENID=1", NULL };
u8 old_id[VMGENID_SIZE];
memcpy(old_id, state->this_id, sizeof(old_id));
@@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
if (!memcmp(old_id, state->this_id, sizeof(old_id)))
return;
add_vmfork_randomness(state->this_id, sizeof(state->this_id));
- kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
}
static const struct acpi_device_id vmgenid_ids[] = {
--
2.44.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041901-roundish-flint-1143@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041903-pentagon-deceiver-fe1d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041905-obvious-blush-10d8@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
There is a recent report on UFFDIO_COPY over hugetlb:
https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/
350: lockdep_assert_held(&hugetlb_lock);
Should be an issue in hugetlb but triggered in an userfault context, where
it goes into the unlikely path where two threads modifying the resv map
together. Mike has a fix in that path for resv uncharge but it looks like
the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd()
will update the cgroup pointer, so it requires to be called with the lock
held.
Looks like a stable material, so have it copied.
Reported-by: syzbot+4b8077a5fccc61c385a1(a)syzkaller.appspotmail.com
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 26ab9dfc7d63..3158a55ce567 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
rsv_adjust = hugepage_subpool_put_pages(spool, 1);
hugetlb_acct_memory(h, -rsv_adjust);
- if (deferred_reserve)
+ if (deferred_reserve) {
+ spin_lock_irq(&hugetlb_lock);
hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
pages_per_huge_page(h), folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
}
if (!memcg_charge_ret)
--
2.44.0
Hi,
Roland Rosenfeld reported in Debian a regression after the update to
the 6.1.85 based kernel, with his USB ethernet device not anymore
able to use the usb ethernet names.
https://bugs.debian.org/1069082
it is somehow linked to the already reported regression
https://lore.kernel.org/regressions/ZhFl6xueHnuVHKdp@nuc/ but has
another aspect. I'm quoting his original report:
> Dear Maintainer,
>
> when upgrading from 6.1.76-1 to 6.1.85-1 my USB ethernet device
> ID 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
> is no longer named enx00249bXXXXXX but eth0.
>
> I see the following in dmsg:
>
> [ 1.484345] usb 4-5: Manufacturer: ASIX Elec. Corp.
> [ 1.484661] usb 4-5: SerialNumber: 0000249BXXXXXX
> [ 1.496312] ax88179_178a 4-5:1.0 eth0: register 'ax88179_178a' at usb-0000:00:14.0-5, ASIX AX88179 USB 3.0 Gigabit Ethernet, d2:60:4c:YY:YY:YY
> [ 1.497746] usbcore: registered new interface driver ax88179_178a
>
> Unplugging and plugging again does not solve the issue, but the
> interface still is named eth0.
>
> Maybe it has to do with the following commit from
> https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.85
>
> commit fc77240f6316d17fc58a8881927c3732b1d75d51
> Author: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
> Date: Wed Apr 3 15:21:58 2024 +0200
>
> net: usb: ax88179_178a: avoid the interface always configured as random address
>
> commit 2e91bb99b9d4f756e92e83c4453f894dda220f09 upstream.
>
> After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
> consecutive device resets"), reset is not executed from bind operation and
> mac address is not read from the device registers or the devicetree at that
> moment. Since the check to configure if the assigned mac address is random
> or not for the interface, happens after the bind operation from
> usbnet_probe, the interface keeps configured as random address, although the
> address is correctly read and set during open operation (the only reset
> now).
>
> In order to keep only one reset for the device and to avoid the interface
> always configured as random address, after reset, configure correctly the
> suitable field from the driver, if the mac address is read successfully from
> the device registers or the devicetree. Take into account if a locally
> administered address (random) was previously stored.
>
> cc: stable(a)vger.kernel.org # 6.6+
> Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
> Reported-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
> Reviewed-by: Simon Horman <horms(a)kernel.org>
> Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
> Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
> Seems, that I'm not alone with this issue, there are also reports in
> https://www.reddit.com/r/debian/comments/1c304xn/linuximageamd64_61851_usb_…
> and https://infosec.space/@topher/112276500329020316
>
>
> All other (pci based) network interfaces still use there static names
> (enp0s25, enp2s0, enp3s0), only the usb ethernet name is broken with
> the new kernel.
>
> Greetings
> Roland
Roland confirmed that reverting both fc77240f6316 ("net: usb:
ax88179_178a: avoid the interface always configured as random
address") and 5c4cbec5106d ("net: usb: ax88179_178a: avoid two
consecutive device resets") fixes the problem.
Confirmation: https://bugs.debian.org/1069082#27
Regards,
Salvatore
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 325f3fb551f8cd672dbbfc4cf58b14f9ee3fc9e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041524-monoxide-kilobyte-1c44@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
325f3fb551f8 ("kprobes: Fix possible use-after-free issue on kprobe registration")
1efda38d6f9b ("kprobes: Prohibit probes in gate area")
28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
223a76b268c9 ("kprobes: Fix coding style issues")
9c89bb8e3272 ("kprobes: treewide: Cleanup the error messages for kprobes")
02afb8d6048d ("kprobe: Simplify prepare_kprobe() by dropping redundant version")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 325f3fb551f8cd672dbbfc4cf58b14f9ee3fc9e8 Mon Sep 17 00:00:00 2001
From: Zheng Yejian <zhengyejian1(a)huawei.com>
Date: Wed, 10 Apr 2024 09:58:02 +0800
Subject: [PATCH] kprobes: Fix possible use-after-free issue on kprobe
registration
When unloading a module, its state is changing MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take
a time. `is_module_text_address()` and `__module_text_address()`
works with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one is succeeded but the
next one is failed because module->state becomes MODULE_STATE_UNFORMED
between those operations.
In `check_kprobe_address_safe()`, if the second `__module_text_address()`
is failed, that is ignored because it expected a kernel_text address.
But it may have failed simply because module->state has been changed
to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify
non-exist module text address (use-after-free).
To fix this problem, we should not use separated `is_module_text_address()`
and `__module_text_address()`, but use only `__module_text_address()`
once and do `try_module_get(module)` which is only available with
MODULE_STATE_LIVE.
Link: https://lore.kernel.org/all/20240410015802.265220-1-zhengyejian1@huawei.com/
Fixes: 28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9d9095e81792..65adc815fc6e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
jump_label_lock();
preempt_disable();
- /* Ensure it is not in reserved area nor out of text */
- if (!(core_kernel_text((unsigned long) p->addr) ||
- is_module_text_address((unsigned long) p->addr)) ||
- in_gate_area_no_mm((unsigned long) p->addr) ||
+ /* Ensure the address is in a text area, and find a module if exists. */
+ *probed_mod = NULL;
+ if (!core_kernel_text((unsigned long) p->addr)) {
+ *probed_mod = __module_text_address((unsigned long) p->addr);
+ if (!(*probed_mod)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+ /* Ensure it is not in reserved area. */
+ if (in_gate_area_no_mm((unsigned long) p->addr) ||
within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr) ||
static_call_text_reserved(p->addr, p->addr) ||
@@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
goto out;
}
- /* Check if 'p' is probing a module. */
- *probed_mod = __module_text_address((unsigned long) p->addr);
+ /* Get module refcount and reject __init functions for loaded modules. */
if (*probed_mod) {
/*
* We must hold a refcount of the probed module while updating
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64620e0a1e712a778095bd35cbb277dc2259281f Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Tue, 11 Jan 2022 14:43:41 +0000
Subject: [PATCH] bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang(a)gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0b3f4d683eb..c72c57a6684f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5318,9 +5318,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+ if (arg_type == ARG_PTR_TO_ALLOC_MEM)
+ goto force_off_check;
break;
/* All the rest must be rejected: */
default:
+force_off_check:
err = __check_ptr_off_reg(env, reg, regno,
type == PTR_TO_BTF_ID);
if (err < 0)
This is the start of the stable review cycle for the 6.6.28 release.
There are 122 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.28-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.28-rc1
Fudongwang <fudong.wang(a)amd.com>
drm/amd/display: fix disable otg wa logic in DCN316
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4
Tim Huang <Tim.Huang(a)amd.com>
drm/amdgpu: fix incorrect number of active RBs for gfx11
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu: always force full reset for SOC21
Lijo Lazar <lijo.lazar(a)amd.com>
drm/amdgpu: Reset dGPU if suspend got aborted
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Disable port sync when bigjoiner is used
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915/cdclk: Fix CDCLK programming order when pipes are active
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Replace CONFIG_SPECTRE_BHI_{ON,OFF} with CONFIG_MITIGATION_SPECTRE_BHI
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Clarify that syscall hardening isn't a BHI mitigation
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Fix BHI handling of RRSBA
Ingo Molnar <mingo(a)kernel.org>
x86/bugs: Rename various 'ia32_cap' variables to 'x86_arch_cap_msr'
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Cache the value of MSR_IA32_ARCH_CAPABILITIES
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Fix BHI documentation
Daniel Sneddon <daniel.sneddon(a)linux.intel.com>
x86/bugs: Fix return type of spectre_bhi_state()
Arnd Bergmann <arnd(a)arndb.de>
irqflags: Explicitly ignore lockdep_hrtimer_exit() argument
Adam Dunlap <acdunlap(a)google.com>
x86/apic: Force native_apic_mem_read() to use the MOV instruction
John Stultz <jstultz(a)google.com>
selftests: timers: Fix abs() warning in posix_timers test
Sean Christopherson <seanjc(a)google.com>
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
Namhyung Kim <namhyung(a)kernel.org>
perf/x86: Fix out of range data
Gavin Shan <gshan(a)redhat.com>
vhost: Add smp_rmb() in vhost_enable_notify()
Gavin Shan <gshan(a)redhat.com>
vhost: Add smp_rmb() in vhost_vq_avail_empty()
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix spi lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-lsio: fix pwm lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-conn: fix usb lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix adc lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix can lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8qm-ss-dma: fix can lpcg indices
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/client: Fully protect modes[] with dev->mode_config.mutex
Boris Brezillon <boris.brezillon(a)collabora.com>
drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()
Jammy Huang <jammy_huang(a)aspeedtech.com>
drm/ast: Fix soft lockup
Harish Kasiviswanathan <Harish.Kasiviswanathan(a)amd.com>
drm/amdkfd: Reset GPU on queue preemption failure
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915/vrr: Disable VRR when using bigjoiner
Zack Rusin <zack.rusin(a)broadcom.com>
drm/vmwgfx: Enable DMA mappings with SEV
Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com>
accel/ivpu: Fix deadlock in context_xa
Alexander Wetzel <Alexander(a)wetzel-home.de>
scsi: sg: Avoid race in error handling & drop bogus warn
Alexander Wetzel <Alexander(a)wetzel-home.de>
scsi: sg: Avoid sg device teardown race
Zheng Yejian <zhengyejian1(a)huawei.com>
kprobes: Fix possible use-after-free issue on kprobe registration
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring/net: restore msg_control on sendzc retry
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: convert PREALLOC to PERTRANS after record_root_in_trans
Boris Burkov <boris(a)bur.io>
btrfs: record delayed inode root in transaction
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: correctly model root qgroup rsv in convert
Geliang Tang <tanggeliang(a)kylinos.cn>
selftests: mptcp: use += operator to append strings
Jacob Pan <jacob.jun.pan(a)linux.intel.com>
iommu/vt-d: Allocate local memory for page request queue
Xuchun Shang <xuchun.shang(a)linux.alibaba.com>
iommu/vt-d: Fix wrong use of pasid config
Arnd Bergmann <arnd(a)arndb.de>
tracing: hide unused ftrace_event_id_fops
David Arinzon <darinzon(a)amazon.com>
net: ena: Set tx_info->xdpf value to NULL
David Arinzon <darinzon(a)amazon.com>
net: ena: Use tx_ring instead of xdp_ring for XDP channel TX
David Arinzon <darinzon(a)amazon.com>
net: ena: Pass ena_adapter instead of net_device to ena_xmit_common()
David Arinzon <darinzon(a)amazon.com>
net: ena: Move XDP code to its new files
David Arinzon <darinzon(a)amazon.com>
net: ena: Fix incorrect descriptor free behavior
David Arinzon <darinzon(a)amazon.com>
net: ena: Wrong missing IO completions check order
David Arinzon <darinzon(a)amazon.com>
net: ena: Fix potential sign extension issue
Michal Luczaj <mhal(a)rbox.co>
af_unix: Fix garbage collector racing against connect()
Kuniyuki Iwashima <kuniyu(a)amazon.com>
af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
Arınç ÜNAL <arinc.unal(a)arinc9.com>
net: dsa: mt7530: trap link-local frames regardless of ST Port State
Gerd Bayer <gbayer(a)linux.ibm.com>
Revert "s390/ism: fix receive message buffer allocation"
Daniel Machon <daniel.machon(a)microchip.com>
net: sparx5: fix wrong config being used when reconfiguring PCS
Rahul Rameshbabu <rrameshbabu(a)nvidia.com>
net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
Carolina Jubran <cjubran(a)nvidia.com>
net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
Carolina Jubran <cjubran(a)nvidia.com>
net/mlx5e: Fix mlx5e_priv_init() cleanup flow
Cosmin Ratiu <cratiu(a)nvidia.com>
net/mlx5: Correctly compare pkt reformat ids
Cosmin Ratiu <cratiu(a)nvidia.com>
net/mlx5: Properly link new fs rules into the tree
Michael Liang <mliang(a)purestorage.com>
net/mlx5: offset comp irq index in name by one
Shay Drory <shayd(a)nvidia.com>
net/mlx5: Register devlink first under devlink lock
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: SF, Stop waiting for FW as teardown was called
Eric Dumazet <edumazet(a)google.com>
netfilter: complete validation of user input
Archie Pusaka <apusaka(a)chromium.org>
Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: SCO: Fix not validating setsockopt user input
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Fix using the same interval and window for Coded PHY
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Use QoS to determine which PHY to scan
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unset
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: ISO: Align broadcast sync_timeout with connection timeout
Jiri Benc <jbenc(a)redhat.com>
ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
Arnd Bergmann <arnd(a)arndb.de>
ipv4/route: avoid unused-but-set-variable warning
Arnd Bergmann <arnd(a)arndb.de>
ipv6: fib: hide unused 'pn' variable
Geetha sowjanya <gakula(a)marvell.com>
octeontx2-af: Fix NIX SQ mode and BP config
Kuniyuki Iwashima <kuniyu(a)amazon.com>
af_unix: Clear stale u->oob_skb.
Marek Vasut <marex(a)denx.de>
net: ks8851: Handle softirqs at the end of IRQ thread to fix hang
Marek Vasut <marex(a)denx.de>
net: ks8851: Inline ks8851_rx_skb()
Pavan Chebbi <pavan.chebbi(a)broadcom.com>
bnxt_en: Reset PTP tx_avail after possible firmware reset
Vikas Gupta <vikas.gupta(a)broadcom.com>
bnxt_en: Fix error recovery for RoCE ulp client
Vikas Gupta <vikas.gupta(a)broadcom.com>
bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()
Gerd Bayer <gbayer(a)linux.ibm.com>
s390/ism: fix receive message buffer allocation
Eric Dumazet <edumazet(a)google.com>
geneve: fix header validation in geneve[6]_xmit_skb
Ming Lei <ming.lei(a)redhat.com>
block: fix q->blkg_list corruption during disk rebind
Hariprasad Kelam <hkelam(a)marvell.com>
octeontx2-pf: Fix transmit scheduler resource leak
Eric Dumazet <edumazet(a)google.com>
xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
Petr Tesarik <petr(a)tesarici.cz>
u64_stats: fix u64_stats_init() for lockdep when used repeatedly in one file
Ilya Maximets <i.maximets(a)ovn.org>
net: openvswitch: fix unwanted error log on timeout policy probing
Dan Carpenter <dan.carpenter(a)linaro.org>
scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
Xiang Chen <chenxiang66(a)hisilicon.com>
scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
Arnd Bergmann <arnd(a)arndb.de>
nouveau: fix function cast warning
Alex Constantino <dreaming.about.electric.sheep(a)gmail.com>
Revert "drm/qxl: simplify qxl_fence_wait"
Kwangjin Ko <kwangjin.ko(a)sk.com>
cxl/core: Fix initialization of mbox_cmd.size_out in get event
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
drm/msm/dpu: don't allow overriding data from catalog
Dave Jiang <dave.jiang(a)intel.com>
cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before assigned
Yuquan Wang <wangyuquan1236(a)phytium.com.cn>
cxl/mem: Fix for the index of Clear Event Record Handle
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Make raw debugfs entries non-seekable
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix USB regression on Nokia N8x0
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: restore original power up/down steps
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: fix deferred probe
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: fix broken slot switch lookup
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix N810 MMC gpiod table
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix bogus MMC GPIO labels on Nokia N8x0
Nini Song <nini.song(a)mediatek.com>
media: cec: core: remove length check of Timer Status
Anna-Maria Behnsen <anna-maria(a)linutronix.de>
PM: s2idle: Make sure CPUs will wakeup directly on resume
Hans de Goede <hdegoede(a)redhat.com>
ACPI: scan: Do not increase dep_unmet for already met dependencies
Noah Loomans <noah(a)noahloomans.com>
platform/chrome: cros_ec_uart: properly fix race condition
Tim Huang <Tim.Huang(a)amd.com>
drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
Dmitry Antipov <dmantipov(a)yandex.ru>
Bluetooth: Fix memory leak in hci_req_sync_complete()
Steven Rostedt (Google) <rostedt(a)goodmis.org>
ring-buffer: Only update pages_touched when a new page is touched
Yu Kuai <yukuai3(a)huawei.com>
raid1: fix use-after-free for original bio in raid1_write_request()
Fabio Estevam <festevam(a)denx.de>
ARM: dts: imx7s-warp: Pass OV2680 link-frequencies
Gavin Shan <gshan(a)redhat.com>
arm64: tlb: Fix TLBI RANGE operand
Sven Eckelmann <sven(a)narfation.org>
batman-adv: Avoid infinite loop trying to resize local TT
Damien Le Moal <dlemoal(a)kernel.org>
ata: libata-scsi: Fix ata_scsi_dev_rescan() error path
Igor Pylypiv <ipylypiv(a)google.com>
ata: libata-core: Allow command duration limits detection for ACS-4 drives
Steve French <stfrench(a)microsoft.com>
smb3: fix Open files on server counter going negative
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 22 +-
Documentation/admin-guide/kernel-parameters.txt | 12 +-
.../device_drivers/ethernet/amazon/ena.rst | 1 +
Makefile | 4 +-
arch/arm/boot/dts/nxp/imx/imx7s-warp.dts | 1 +
arch/arm/mach-omap2/board-n8x0.c | 23 +-
arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 16 +-
arch/arm64/boot/dts/freescale/imx8-ss-dma.dtsi | 36 +-
arch/arm64/boot/dts/freescale/imx8-ss-lsio.dtsi | 16 +-
arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi | 8 +-
arch/arm64/include/asm/tlbflush.h | 20 +-
arch/x86/Kconfig | 21 +-
arch/x86/events/core.c | 1 +
arch/x86/include/asm/apic.h | 3 +-
arch/x86/kernel/apic/apic.c | 6 +-
arch/x86/kernel/cpu/bugs.c | 82 ++-
arch/x86/kernel/cpu/common.c | 48 +-
block/blk-cgroup.c | 9 +-
block/blk-cgroup.h | 2 +
block/blk-core.c | 2 +
drivers/accel/ivpu/ivpu_drv.c | 2 +-
drivers/acpi/scan.c | 3 +-
drivers/ata/libata-core.c | 2 +-
drivers/ata/libata-scsi.c | 9 +-
drivers/cxl/core/mbox.c | 5 +-
drivers/cxl/core/regs.c | 5 +-
drivers/firmware/arm_scmi/raw_mode.c | 7 +-
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/soc21.c | 27 +-
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 +
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +-
.../amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c | 19 +-
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c | 12 +-
drivers/gpu/drm/ast/ast_dp.c | 3 +
drivers/gpu/drm/drm_client_modeset.c | 3 +-
drivers/gpu/drm/i915/display/intel_cdclk.c | 7 +-
drivers/gpu/drm/i915/display/intel_cdclk.h | 3 +
drivers/gpu/drm/i915/display/intel_ddi.c | 5 +
drivers/gpu/drm/i915/display/intel_vrr.c | 7 +
drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 10 +-
.../gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +-
drivers/gpu/drm/qxl/qxl_release.c | 50 +-
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 11 +-
drivers/iommu/intel/perfmon.c | 2 +-
drivers/iommu/intel/svm.c | 2 +-
drivers/md/raid1.c | 2 +-
drivers/media/cec/core/cec-adap.c | 14 -
drivers/mmc/host/omap.c | 48 +-
drivers/net/dsa/mt7530.c | 229 ++++++-
drivers/net/dsa/mt7530.h | 5 +
drivers/net/ethernet/amazon/ena/Makefile | 2 +-
drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 1 +
drivers/net/ethernet/amazon/ena/ena_netdev.c | 688 ++-------------------
drivers/net/ethernet/amazon/ena/ena_netdev.h | 83 +--
drivers/net/ethernet/amazon/ena/ena_xdp.c | 466 ++++++++++++++
drivers/net/ethernet/amazon/ena/ena_xdp.h | 152 +++++
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 6 +-
.../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 22 +-
drivers/net/ethernet/marvell/octeontx2/nic/qos.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 33 +-
drivers/net/ethernet/mellanox/mlx5/core/en/selq.c | 2 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 -
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 17 +-
drivers/net/ethernet/mellanox/mlx5/core/main.c | 37 +-
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 4 +-
.../ethernet/mellanox/mlx5/core/sf/dev/driver.c | 22 +-
drivers/net/ethernet/micrel/ks8851.h | 3 -
drivers/net/ethernet/micrel/ks8851_common.c | 16 +-
drivers/net/ethernet/micrel/ks8851_par.c | 11 -
drivers/net/ethernet/micrel/ks8851_spi.c | 11 -
.../net/ethernet/microchip/sparx5/sparx5_port.c | 4 +-
drivers/net/geneve.c | 4 +-
drivers/platform/chrome/cros_ec_uart.c | 28 +-
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
drivers/scsi/qla2xxx/qla_edif.c | 2 +-
drivers/scsi/sg.c | 20 +-
drivers/vhost/vhost.c | 28 +-
fs/btrfs/delayed-inode.c | 3 +
fs/btrfs/inode.c | 13 +-
fs/btrfs/ioctl.c | 37 +-
fs/btrfs/qgroup.c | 2 +
fs/btrfs/root-tree.c | 10 -
fs/btrfs/root-tree.h | 2 -
fs/btrfs/transaction.c | 17 +-
fs/smb/client/cached_dir.c | 4 +-
include/linux/dma-fence.h | 7 +
include/linux/irqflags.h | 2 +-
include/linux/u64_stats_sync.h | 9 +-
include/net/addrconf.h | 4 +
include/net/af_unix.h | 2 +-
include/net/bluetooth/bluetooth.h | 11 +
include/net/ip_tunnels.h | 33 +
io_uring/net.c | 1 +
kernel/cpu.c | 3 +-
kernel/kprobes.c | 18 +-
kernel/power/suspend.c | 6 +
kernel/trace/ring_buffer.c | 6 +-
kernel/trace/trace_events.c | 4 +
net/batman-adv/translation-table.c | 2 +-
net/bluetooth/hci_request.c | 4 +-
net/bluetooth/hci_sync.c | 66 +-
net/bluetooth/iso.c | 14 +-
net/bluetooth/l2cap_core.c | 3 +-
net/bluetooth/sco.c | 23 +-
net/ipv4/netfilter/arp_tables.c | 4 +
net/ipv4/netfilter/ip_tables.c | 4 +
net/ipv4/route.c | 4 +-
net/ipv6/addrconf.c | 7 +-
net/ipv6/ip6_fib.c | 7 +-
net/ipv6/netfilter/ip6_tables.c | 4 +
net/openvswitch/conntrack.c | 5 +-
net/unix/af_unix.c | 8 +-
net/unix/garbage.c | 35 +-
net/unix/scm.c | 8 +-
net/xdp/xsk.c | 2 +
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 53 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 30 +-
tools/testing/selftests/timers/posix_timers.c | 2 +-
123 files changed, 1765 insertions(+), 1263 deletions(-)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a4833e3abae132d613ce7da0e0c9a9465d1681fa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041925-walnut-flammable-adf1@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
a4833e3abae1 ("SUNRPC: Fix rpcgss_context trace event acceptor field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a4833e3abae132d613ce7da0e0c9a9465d1681fa Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Date: Wed, 10 Apr 2024 12:38:13 -0400
Subject: [PATCH] SUNRPC: Fix rpcgss_context trace event acceptor field
The rpcgss_context trace event acceptor field is a dynamically sized
string that records the "data" parameter. But this parameter is also
dependent on the "len" field to determine the size of the data.
It needs to use __string_len() helper macro where the length can be passed
in. It also incorrectly uses strncpy() to save it instead of
__assign_str(). As these macros can change, it is not wise to open code
them in trace events.
As of commit c759e609030c ("tracing: Remove __assign_str_len()"),
__assign_str() can be used for both __string() and __string_len() fields.
Before that commit, __assign_str_len() is required to be used. This needs
to be noted for backporting. (In actuality, commit c1fa617caeb0 ("tracing:
Rework __assign_str() and __string() to not duplicate getting the string")
is the commit that makes __string_str_len() obsolete).
Cc: stable(a)vger.kernel.org
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index ba2d96a1bc2f..f50fcafc69de 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -609,7 +609,7 @@ TRACE_EVENT(rpcgss_context,
__field(unsigned int, timeout)
__field(u32, window_size)
__field(int, len)
- __string(acceptor, data)
+ __string_len(acceptor, data, len)
),
TP_fast_assign(
@@ -618,7 +618,7 @@ TRACE_EVENT(rpcgss_context,
__entry->timeout = timeout;
__entry->window_size = window_size;
__entry->len = len;
- strncpy(__get_str(acceptor), data, len);
+ __assign_str(acceptor, data);
),
TP_printk("win_size=%u expiry=%lu now=%lu timeout=%u acceptor=%.*s",
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a4833e3abae132d613ce7da0e0c9a9465d1681fa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041953-duration-fructose-0bc1@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
a4833e3abae1 ("SUNRPC: Fix rpcgss_context trace event acceptor field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a4833e3abae132d613ce7da0e0c9a9465d1681fa Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Date: Wed, 10 Apr 2024 12:38:13 -0400
Subject: [PATCH] SUNRPC: Fix rpcgss_context trace event acceptor field
The rpcgss_context trace event acceptor field is a dynamically sized
string that records the "data" parameter. But this parameter is also
dependent on the "len" field to determine the size of the data.
It needs to use __string_len() helper macro where the length can be passed
in. It also incorrectly uses strncpy() to save it instead of
__assign_str(). As these macros can change, it is not wise to open code
them in trace events.
As of commit c759e609030c ("tracing: Remove __assign_str_len()"),
__assign_str() can be used for both __string() and __string_len() fields.
Before that commit, __assign_str_len() is required to be used. This needs
to be noted for backporting. (In actuality, commit c1fa617caeb0 ("tracing:
Rework __assign_str() and __string() to not duplicate getting the string")
is the commit that makes __string_str_len() obsolete).
Cc: stable(a)vger.kernel.org
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index ba2d96a1bc2f..f50fcafc69de 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -609,7 +609,7 @@ TRACE_EVENT(rpcgss_context,
__field(unsigned int, timeout)
__field(u32, window_size)
__field(int, len)
- __string(acceptor, data)
+ __string_len(acceptor, data, len)
),
TP_fast_assign(
@@ -618,7 +618,7 @@ TRACE_EVENT(rpcgss_context,
__entry->timeout = timeout;
__entry->window_size = window_size;
__entry->len = len;
- strncpy(__get_str(acceptor), data, len);
+ __assign_str(acceptor, data);
),
TP_printk("win_size=%u expiry=%lu now=%lu timeout=%u acceptor=%.*s",
From: Ard Biesheuvel <ardb(a)kernel.org>
This is the final batch of changes to bring linux-6.1.y in sync with
6.6 and later in terms of compatibility with tightened boot security
requirements imposed by MicroSoft, compliance with which is a
prerequisite for them to be willing to resume signing distro shim images
with the MS 3rd party secure boot certificate.
Without this, distros can only boot on off-the-shelf x86 PCs after
disabling secure boot explicitly.
Most of these changes appeared in v6.8 and have been backported to v6.6
already.
Ard Biesheuvel (20):
x86/efi: Drop EFI stub .bss from .data section
x86/efi: Disregard setup header of loaded image
x86/efistub: Reinstate soft limit for initrd loading
x86/efi: Drop alignment flags from PE section headers
x86/boot: Remove the 'bugger off' message
x86/boot: Omit compression buffer from PE/COFF image memory footprint
x86/boot: Drop redundant code setting the root device
x86/boot: Drop references to startup_64
x86/boot: Grab kernel_info offset from zoffset header directly
x86/boot: Set EFI handover offset directly in header asm
x86/boot: Define setup size in linker script
x86/boot: Derive file size from _edata symbol
x86/boot: Construct PE/COFF .text section from assembler
x86/boot: Drop PE/COFF .reloc section
x86/boot: Split off PE/COFF .data section
x86/boot: Increase section and file alignment to 4k/512
x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
x86/sme: Move early SME kernel encryption handling into .head.text
x86/sev: Move early startup code into .head.text section
x86/efistub: Remap kernel text read-only before dropping NX attribute
Hou Wenlong (2):
x86/head/64: Add missing __head annotation to startup_64_load_idt()
x86/head/64: Move the __head definition to <asm/init.h>
Pasha Tatashin (1):
x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros
arch/x86/boot/Makefile | 2 +-
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/misc.c | 1 +
arch/x86/boot/compressed/sev.c | 3 +
arch/x86/boot/compressed/vmlinux.lds.S | 6 +-
arch/x86/boot/header.S | 211 ++++++---------
arch/x86/boot/setup.ld | 14 +-
arch/x86/boot/tools/build.c | 273 +-------------------
arch/x86/include/asm/boot.h | 1 +
arch/x86/include/asm/init.h | 2 +
arch/x86/include/asm/mem_encrypt.h | 8 +-
arch/x86/include/asm/page_types.h | 12 +-
arch/x86/include/asm/sev.h | 10 +-
arch/x86/kernel/amd_gart_64.c | 2 +-
arch/x86/kernel/head64.c | 7 +-
arch/x86/kernel/sev-shared.c | 23 +-
arch/x86/kernel/sev.c | 11 +-
arch/x86/mm/mem_encrypt_boot.S | 4 +-
arch/x86/mm/mem_encrypt_identity.c | 58 ++---
arch/x86/mm/pat/set_memory.c | 6 +-
arch/x86/mm/pti.c | 2 +-
drivers/firmware/efi/libstub/Makefile | 7 -
drivers/firmware/efi/libstub/x86-stub.c | 58 ++---
23 files changed, 194 insertions(+), 529 deletions(-)
--
2.44.0.769.g3c40516874-goog
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1db7959aacd905e6487d0478ac01d89f86eb1e51
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041954-bullish-slingshot-109f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1db7959aacd9 ("btrfs: do not wait for short bulk allocation")
09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method")
397239ed6a6c ("btrfs: allow extent buffer helpers to skip cross-page handling")
94dbf7c0871f ("btrfs: free the allocated memory if btrfs_alloc_page_array() fails")
096d23016543 ("btrfs: refactor main loop in memmove_extent_buffer()")
13840f3f2837 ("btrfs: refactor main loop in memcpy_extent_buffer()")
730c374e5b2c ("btrfs: use write_extent_buffer() to implement write_extent_buffer_*id()")
cb22964f1dad ("btrfs: refactor extent buffer bitmaps operations")
52ea5bfbfa6d ("btrfs: move eb subpage preallocation out of the loop")
5a96341927b0 ("btrfs: subpage: make alloc_extent_buffer() handle previously uptodate range efficiently")
2af2aaf98205 ("btrfs: scrub: introduce structure for new BTRFS_STRIPE_LEN based interface")
5eb30ee26fa4 ("btrfs: raid56: introduce the main entrance for RMW path")
6486d21c99cb ("btrfs: raid56: extract rwm write bios assembly into a helper")
509c27aa2fb6 ("btrfs: raid56: extract the rmw bio list build code into a helper")
30e3c897f4a8 ("btrfs: raid56: extract the pq generation code into a helper")
2fc6822c99d7 ("btrfs: move scrub prototypes into scrub.h")
677074792a1d ("btrfs: move relocation prototypes into relocation.h")
33cf97a7b658 ("btrfs: move acl prototypes into acl.h")
af142b6f44d3 ("btrfs: move file prototypes to file.h")
7572dec8f522 ("btrfs: move ioctl prototypes into ioctl.h")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1db7959aacd905e6487d0478ac01d89f86eb1e51 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 26 Mar 2024 09:16:46 +1030
Subject: [PATCH] btrfs: do not wait for short bulk allocation
[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.
[CAUSE]
For btrfs_alloc_page_array() we always go alloc_pages_bulk_array(), and
even if the bulk allocation failed (fell back to single page
allocation) we still retry but with extra memalloc_retry_wait().
If the bulk alloc only returned one page a time, we would spend a lot of
time on the retry wait.
The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
incomplete batch memory allocations").
[FIX]
Although the commit mentioned that other filesystems do the wait, it's
not the case at least nowadays.
All the mainlined filesystems only call memalloc_retry_wait() if they
failed to allocate any page (not only for bulk allocation).
If there is any progress, they won't call memalloc_retry_wait() at all.
For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.
So I don't believe we should call memalloc_retry_wait() unconditionally
for short allocation.
Call memalloc_retry_wait() if it fails to allocate any page for tree
block allocation (which goes with __GFP_NOFAIL and may not need the
special handling anyway), and reduce the latency for
btrfs_alloc_page_array().
Reported-by: Julian Taylor <julian.taylor(a)1und1.de>
Tested-by: Julian Taylor <julian.taylor(a)1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel(a)dorminy.me>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b18034f2ab80..2776112dbdf8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -681,31 +681,21 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
gfp_t extra_gfp)
{
+ const gfp_t gfp = GFP_NOFS | extra_gfp;
unsigned int allocated;
for (allocated = 0; allocated < nr_pages;) {
unsigned int last = allocated;
- allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
- nr_pages, page_array);
-
- if (allocated == nr_pages)
- return 0;
-
- /*
- * During this iteration, no page could be allocated, even
- * though alloc_pages_bulk_array() falls back to alloc_page()
- * if it could not bulk-allocate. So we must be out of memory.
- */
- if (allocated == last) {
+ allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
+ if (unlikely(allocated == last)) {
+ /* No progress, fail and do cleanup. */
for (int i = 0; i < allocated; i++) {
__free_page(page_array[i]);
page_array[i] = NULL;
}
return -ENOMEM;
}
-
- memalloc_retry_wait(GFP_NOFS);
}
return 0;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1db7959aacd905e6487d0478ac01d89f86eb1e51
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041951-sports-hula-f2a5@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
1db7959aacd9 ("btrfs: do not wait for short bulk allocation")
09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method")
397239ed6a6c ("btrfs: allow extent buffer helpers to skip cross-page handling")
94dbf7c0871f ("btrfs: free the allocated memory if btrfs_alloc_page_array() fails")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1db7959aacd905e6487d0478ac01d89f86eb1e51 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 26 Mar 2024 09:16:46 +1030
Subject: [PATCH] btrfs: do not wait for short bulk allocation
[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.
[CAUSE]
For btrfs_alloc_page_array() we always go alloc_pages_bulk_array(), and
even if the bulk allocation failed (fell back to single page
allocation) we still retry but with extra memalloc_retry_wait().
If the bulk alloc only returned one page a time, we would spend a lot of
time on the retry wait.
The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
incomplete batch memory allocations").
[FIX]
Although the commit mentioned that other filesystems do the wait, it's
not the case at least nowadays.
All the mainlined filesystems only call memalloc_retry_wait() if they
failed to allocate any page (not only for bulk allocation).
If there is any progress, they won't call memalloc_retry_wait() at all.
For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.
So I don't believe we should call memalloc_retry_wait() unconditionally
for short allocation.
Call memalloc_retry_wait() if it fails to allocate any page for tree
block allocation (which goes with __GFP_NOFAIL and may not need the
special handling anyway), and reduce the latency for
btrfs_alloc_page_array().
Reported-by: Julian Taylor <julian.taylor(a)1und1.de>
Tested-by: Julian Taylor <julian.taylor(a)1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel(a)dorminy.me>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b18034f2ab80..2776112dbdf8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -681,31 +681,21 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
gfp_t extra_gfp)
{
+ const gfp_t gfp = GFP_NOFS | extra_gfp;
unsigned int allocated;
for (allocated = 0; allocated < nr_pages;) {
unsigned int last = allocated;
- allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
- nr_pages, page_array);
-
- if (allocated == nr_pages)
- return 0;
-
- /*
- * During this iteration, no page could be allocated, even
- * though alloc_pages_bulk_array() falls back to alloc_page()
- * if it could not bulk-allocate. So we must be out of memory.
- */
- if (allocated == last) {
+ allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
+ if (unlikely(allocated == last)) {
+ /* No progress, fail and do cleanup. */
for (int i = 0; i < allocated; i++) {
__free_page(page_array[i]);
page_array[i] = NULL;
}
return -ENOMEM;
}
-
- memalloc_retry_wait(GFP_NOFS);
}
return 0;
}
From: Joao Paulo Goncalves <joao.goncalves(a)toradex.com>
When using davinci-mcasp as CPU DAI with simple-card, there are some
conditions that cause simple-card to finish registering a sound card before
davinci-mcasp finishes registering all sound components. This creates a
non-working sound card from userspace with no problem indication apart
from not being able to play/record audio on a PCM stream. The issue
arises during simultaneous probe execution of both drivers. Specifically,
the simple-card driver, awaiting a CPU DAI, proceeds as soon as
davinci-mcasp registers its DAI. However, this process can lead to the
client mutex lock (client_mutex in soc-core.c) being held or davinci-mcasp
being preempted before PCM DMA registration on davinci-mcasp finishes.
This situation occurs when the probes of both drivers run concurrently.
Below is the code path for this condition. To solve the issue, defer
davinci-mcasp CPU DAI registration to the last step in the audio part of
it. This way, simple-card CPU DAI parsing will be deferred until all
audio components are registered.
Fail Code Path:
simple-card.c: probe starts
simple-card.c: simple_dai_link_of: simple_parse_node(..,cpu,..) returns EPROBE_DEFER, no CPU DAI yet
davinci-mcasp.c: probe starts
davinci-mcasp.c: devm_snd_soc_register_component() register CPU DAI
simple-card.c: probes again, finish CPU DAI parsing and call devm_snd_soc_register_card()
simple-card.c: finish probe
davinci-mcasp.c: *dma_pcm_platform_register() register PCM DMA
davinci-mcasp.c: probe finish
Cc: stable(a)vger.kernel.org
Fixes: 9fbd58cf4ab0 ("ASoC: davinci-mcasp: Choose PCM driver based on configured DMA controller")
Signed-off-by: Joao Paulo Goncalves <joao.goncalves(a)toradex.com>
---
sound/soc/ti/davinci-mcasp.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/sound/soc/ti/davinci-mcasp.c b/sound/soc/ti/davinci-mcasp.c
index b892d66f78470..1e760c3155213 100644
--- a/sound/soc/ti/davinci-mcasp.c
+++ b/sound/soc/ti/davinci-mcasp.c
@@ -2417,12 +2417,6 @@ static int davinci_mcasp_probe(struct platform_device *pdev)
mcasp_reparent_fck(pdev);
- ret = devm_snd_soc_register_component(&pdev->dev, &davinci_mcasp_component,
- &davinci_mcasp_dai[mcasp->op_mode], 1);
-
- if (ret != 0)
- goto err;
-
ret = davinci_mcasp_get_dma_type(mcasp);
switch (ret) {
case PCM_EDMA:
@@ -2449,6 +2443,12 @@ static int davinci_mcasp_probe(struct platform_device *pdev)
goto err;
}
+ ret = devm_snd_soc_register_component(&pdev->dev, &davinci_mcasp_component,
+ &davinci_mcasp_dai[mcasp->op_mode], 1);
+
+ if (ret != 0)
+ goto err;
+
no_audio:
ret = davinci_mcasp_init_gpiochip(mcasp);
if (ret) {
--
2.34.1
Framebuffer memory is allocated via vzalloc() from non-contiguous
physical pages. The physical framebuffer start address is therefore
meaningless. Do not set it.
The value is not used within the kernel and only exported to userspace
on dedicated ARM configs. No functional change is expected.
v2:
- refer to vzalloc() in commit message (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: a5b44c4adb16 ("drm/fbdev-generic: Always use shadow buffering")
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: Zack Rusin <zackr(a)vmware.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # v6.4+
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Reviewed-by: Zack Rusin <zack.rusin(a)broadcom.com>
Reviewed-by: Sui Jingfeng <sui.jingfeng(a)linux.dev>
Tested-by: Sui Jingfeng <sui.jingfeng(a)linux.dev>
Acked-by: Maxime Ripard <mripard(a)kernel.org>
---
drivers/gpu/drm/drm_fbdev_generic.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c
index be357f926faec..97e579c33d84a 100644
--- a/drivers/gpu/drm/drm_fbdev_generic.c
+++ b/drivers/gpu/drm/drm_fbdev_generic.c
@@ -113,7 +113,6 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper,
/* screen */
info->flags |= FBINFO_VIRTFB | FBINFO_READS_FAST;
info->screen_buffer = screen_buffer;
- info->fix.smem_start = page_to_phys(vmalloc_to_page(info->screen_buffer));
info->fix.smem_len = screen_size;
/* deferred I/O */
--
2.44.0
The patch titled
Subject: init: fix allocated page overlapping with PTR_ERR
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
init-fix-allocated-page-overlapping-with-ptr_err.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nam Cao <namcao(a)linutronix.de>
Subject: init: fix allocated page overlapping with PTR_ERR
Date: Thu, 18 Apr 2024 12:29:43 +0200
There is nothing preventing kernel memory allocators from allocating a
page that overlaps with PTR_ERR(), except for architecture-specific code
that setup memblock.
It was discovered that RISCV architecture doesn't setup memblock corectly,
leading to a page overlapping with PTR_ERR() being allocated, and
subsequently crashing the kernel (link in Close: )
The reported crash has nothing to do with PTR_ERR(): the last page (at
address 0xfffff000) being allocated leads to an unexpected arithmetic
overflow in ext4; but still, this page shouldn't be allocated in the first
place.
Because PTR_ERR() is an architecture-independent thing, we shouldn't ask
every single architecture to set this up. There may be other
architectures beside RISCV that have the same problem.
Fix this once and for all by reserving the physical memory page that may
be mapped to the last virtual memory page as part of low memory.
Unfortunately, this means if there is actual memory at this reserved
location, that memory will become inaccessible. However, if this page is
not reserved, it can only be accessed as high memory, so this doesn't
matter if high memory is not supported. Even if high memory is supported,
it is still only one page.
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong…
Link: https://lkml.kernel.org/r/20240418102943.180510-1-namcao@linutronix.de
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Reported-by: Bj��rn T��pel <bjorn(a)kernel.org>
Tested-by: Bj��rn T��pel <bjorn(a)kernel.org>
Reviewed-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Andreas Dilger <adilger(a)dilger.ca>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Changbin Du <changbin.du(a)huawei.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Krister Johansen <kjlx(a)templeofstupid.com>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/main.c | 1 +
1 file changed, 1 insertion(+)
--- a/init/main.c~init-fix-allocated-page-overlapping-with-ptr_err
+++ a/init/main.c
@@ -900,6 +900,7 @@ void start_kernel(void)
page_address_init();
pr_notice("%s", linux_banner);
early_security_init();
+ memblock_reserve(__pa(-PAGE_SIZE), PAGE_SIZE); /* reserve last page for ERR_PTR */
setup_arch(&command_line);
setup_boot_config();
setup_command_line(command_line);
_
Patches currently in -mm which might be from namcao(a)linutronix.de are
init-fix-allocated-page-overlapping-with-ptr_err.patch
The patch titled
Subject: stackdepot: respect __GFP_NOLOCKDEP allocation flag
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
stackdepot-respect-__gfp_nolockdep-allocation-flag.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Subject: stackdepot: respect __GFP_NOLOCKDEP allocation flag
Date: Thu, 18 Apr 2024 16:11:33 +0200
If stack_depot_save_flags() allocates memory it always drops
__GFP_NOLOCKDEP flag. So when KASAN tries to track __GFP_NOLOCKDEP
allocation we may end up with lockdep splat like bellow:
======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc3+ #49 Not tainted
------------------------------------------------------
kswapd0/149 is trying to acquire lock:
ffff88811346a920
(&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590
[xfs]
but task is already holding lock:
ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at:
balance_pgdat+0x5d9/0xad0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
__lock_acquire+0x7da/0x1030
lock_acquire+0x15d/0x400
fs_reclaim_acquire+0xb5/0x100
prepare_alloc_pages.constprop.0+0xc5/0x230
__alloc_pages+0x12a/0x3f0
alloc_pages_mpol+0x175/0x340
stack_depot_save_flags+0x4c5/0x510
kasan_save_stack+0x30/0x40
kasan_save_track+0x10/0x30
__kasan_slab_alloc+0x83/0x90
kmem_cache_alloc+0x15e/0x4a0
__alloc_object+0x35/0x370
__create_object+0x22/0x90
__kmalloc_node_track_caller+0x477/0x5b0
krealloc+0x5f/0x110
xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
xfs_iext_insert+0x2e/0x130 [xfs]
xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs]
xfs_btree_visit_block+0xfb/0x290 [xfs]
xfs_btree_visit_blocks+0x215/0x2c0 [xfs]
xfs_iread_extents+0x1a2/0x2e0 [xfs]
xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs]
iomap_iter+0x1d1/0x2d0
iomap_file_buffered_write+0x120/0x1a0
xfs_file_buffered_write+0x128/0x4b0 [xfs]
vfs_write+0x675/0x890
ksys_write+0xc3/0x160
do_syscall_64+0x94/0x170
entry_SYSCALL_64_after_hwframe+0x71/0x79
Always preserve __GFP_NOLOCKDEP to fix this.
Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com
Fixes: cd11016e5f52 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Reported-by: Xiubo Li <xiubli(a)redhat.com>
Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/
Reported-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource…
Suggested-by: Dave Chinner <david(a)fromorbit.com>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/stackdepot.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/lib/stackdepot.c~stackdepot-respect-__gfp_nolockdep-allocation-flag
+++ a/lib/stackdepot.c
@@ -627,10 +627,10 @@ depot_stack_handle_t stack_depot_save_fl
/*
* Zero out zone modifiers, as we don't have specific zone
* requirements. Keep the flags related to allocation in atomic
- * contexts and I/O.
+ * contexts, I/O, nolockdep.
*/
alloc_flags &= ~GFP_ZONEMASK;
- alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
+ alloc_flags &= (GFP_ATOMIC | GFP_KERNEL | __GFP_NOLOCKDEP);
alloc_flags |= __GFP_NOWARN;
page = alloc_pages(alloc_flags, DEPOT_POOL_ORDER);
if (page)
_
Patches currently in -mm which might be from ryabinin.a.a(a)gmail.com are
stackdepot-respect-__gfp_nolockdep-allocation-flag.patch
The patch titled
Subject: hugetlb: check for anon_vma prior to folio allocation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
hugetlb-check-for-anon_vma-prior-to-folio-allocation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Vishal Moola (Oracle)" <vishal.moola(a)gmail.com>
Subject: hugetlb: check for anon_vma prior to folio allocation
Date: Mon, 15 Apr 2024 14:17:47 -0700
Commit 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of
anon_vma_prepare()") may bailout after allocating a folio if we do not
hold the mmap lock. When this occurs, vmf_anon_prepare() will release the
vma lock. Hugetlb then attempts to call restore_reserve_on_error(), which
depends on the vma lock being held.
We can move vmf_anon_prepare() prior to the folio allocation in order to
avoid calling restore_reserve_on_error() without the vma lock.
Link: https://lkml.kernel.org/r/ZiFqSrSRLhIV91og@fedora
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Reported-by: syzbot+ad1b592fc4483655438b(a)syzkaller.appspotmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/mm/hugetlb.c~hugetlb-check-for-anon_vma-prior-to-folio-allocation
+++ a/mm/hugetlb.c
@@ -6261,6 +6261,12 @@ static vm_fault_t hugetlb_no_page(struct
VM_UFFD_MISSING);
}
+ if (!(vma->vm_flags & VM_MAYSHARE)) {
+ ret = vmf_anon_prepare(vmf);
+ if (unlikely(ret))
+ goto out;
+ }
+
folio = alloc_hugetlb_folio(vma, haddr, 0);
if (IS_ERR(folio)) {
/*
@@ -6297,15 +6303,12 @@ static vm_fault_t hugetlb_no_page(struct
*/
restore_reserve_on_error(h, vma, haddr, folio);
folio_put(folio);
+ ret = VM_FAULT_SIGBUS;
goto out;
}
new_pagecache_folio = true;
} else {
folio_lock(folio);
-
- ret = vmf_anon_prepare(vmf);
- if (unlikely(ret))
- goto backout_unlocked;
anon_rmap = 1;
}
} else {
_
Patches currently in -mm which might be from vishal.moola(a)gmail.com are
hugetlb-check-for-anon_vma-prior-to-folio-allocation.patch
hugetlb-convert-hugetlb_fault-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_no_page-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_no_page-to-use-struct-vm_fault-fix.patch
hugetlb-convert-hugetlb_wp-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_wp-to-use-struct-vm_fault-fix.patch
The patch titled
Subject: mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
Date: Thu, 18 Apr 2024 08:26:28 -0400
Christian reports a NULL deref in zswap that he bisected down to the zswap
shrinker. The issue also cropped up in the bug trackers of libguestfs [1]
and the Red Hat bugzilla [2].
The problem is that when memcg is disabled with the boot time flag, the
zswap shrinker might get called with sc->memcg == NULL. This is okay in
many places, like the lruvec operations. But it crashes in
memcg_page_state() - which is only used due to the non-node accounting of
cgroup's the zswap memory to begin with.
Nhat spotted that the memcg can be NULL in the memcg-disabled case, and I
was then able to reproduce the crash locally as well.
[1] https://github.com/libguestfs/libguestfs/issues/139
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2275252
Link: https://lkml.kernel.org/r/20240418124043.GC1055428@cmpxchg.org
Link: https://lkml.kernel.org/r/20240417143324.GA1055428@cmpxchg.org
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Christian Heusel <christian(a)heusel.eu>
Debugged-by: Nhat Pham <nphamcs(a)gmail.com>
Suggested-by: Nhat Pham <nphamcs(a)gmail.com>
Tested-by: Christian Heusel <christian(a)heusel.eu>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Richard W.M. Jones <rjones(a)redhat.com>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: <stable(a)vger.kernel.org> [v6.8]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
--- a/mm/zswap.c~mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory
+++ a/mm/zswap.c
@@ -1331,15 +1331,22 @@ static unsigned long zswap_shrinker_coun
if (!gfp_has_io_fs(sc->gfp_mask))
return 0;
-#ifdef CONFIG_MEMCG_KMEM
- mem_cgroup_flush_stats(memcg);
- nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
- nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
-#else
- /* use pool stats instead of memcg stats */
- nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
- nr_stored = atomic_read(&zswap_nr_stored);
-#endif
+ /*
+ * For memcg, use the cgroup-wide ZSWAP stats since we don't
+ * have them per-node and thus per-lruvec. Careful if memcg is
+ * runtime-disabled: we can get sc->memcg == NULL, which is ok
+ * for the lruvec, but not for memcg_page_state().
+ *
+ * Without memcg, use the zswap pool-wide metrics.
+ */
+ if (!mem_cgroup_disabled()) {
+ mem_cgroup_flush_stats(memcg);
+ nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
+ nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
+ } else {
+ nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
+ nr_stored = atomic_read(&zswap_nr_stored);
+ }
if (!nr_stored)
return 0;
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory.patch
mm-zswap-optimize-zswap-pool-size-tracking.patch
mm-zpool-return-pool-size-in-pages.patch
mm-page_alloc-remove-pcppage-migratetype-caching.patch
mm-page_alloc-optimize-free_unref_folios.patch
mm-page_alloc-fix-up-block-types-when-merging-compatible-blocks.patch
mm-page_alloc-move-free-pages-when-converting-block-during-isolation.patch
mm-page_alloc-fix-move_freepages_block-range-error.patch
mm-page_alloc-fix-freelist-movement-during-block-conversion.patch
mm-page_alloc-close-migratetype-race-between-freeing-and-stealing.patch
mm-page_isolation-prepare-for-hygienic-freelists.patch
mm-page_isolation-prepare-for-hygienic-freelists-fix.patch
mm-page_alloc-consolidate-free-page-accounting.patch
mm-page_alloc-consolidate-free-page-accounting-fix.patch
mm-page_alloc-consolidate-free-page-accounting-fix-2.patch
mm-page_alloc-batch-vmstat-updates-in-expand.patch
After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset operation, in which the default mac
address from the device is read, is not executed from bind operation and
the random address, that is pregenerated just in case, is direclty written
the first time in the device, so the default one from the device is not
even read. This writing is not dangerous because is volatile and the
default mac address is not missed.
In order to avoid this and keep the simplification to have only one
reset and reduce the delays, restore the reset from bind operation and
remove the reset that is commanded from open operation. The behavior is
the same but everything is ready for usbnet_probe.
Tested with ASIX AX88179 USB Gigabit Ethernet devices.
Restore the old behavior for the rest of possible devices because I don't
have the hardware to test.
cc: stable(a)vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Jarkko Palviainen <jarkko.palviainen(a)gmail.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
---
v2:
- Restore reset from bind operation to avoid problems with usbnet_probe.
v1: https://lore.kernel.org/netdev/20240410095603.502566-1-jtornosm@redhat.com/
drivers/net/usb/ax88179_178a.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index 69169842fa2f..a493fde1af3f 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1316,6 +1316,8 @@ static int ax88179_bind(struct usbnet *dev, struct usb_interface *intf)
netif_set_tso_max_size(dev->net, 16384);
+ ax88179_reset(dev);
+
return 0;
}
@@ -1694,7 +1696,6 @@ static const struct driver_info ax88179_info = {
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
- .reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
@@ -1707,7 +1708,6 @@ static const struct driver_info ax88178a_info = {
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
- .reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
--
2.44.0
Sorry, apologies to everyone. By accident I replied off the list.
Redoing it now on the list. More below.
On Mon, Apr 8, 2024 at 12:10 PM Christian König
<christian.koenig(a)amd.com> wrote:
>
> Am 08.04.24 um 18:04 schrieb Zack Rusin:
> > On Mon, Apr 8, 2024 at 11:59 AM Christian König
> > <christian.koenig(a)amd.com> wrote:
> >> Am 08.04.24 um 17:56 schrieb Zack Rusin:
> >>> Stop printing the TT memory decryption status info each time tt is created
> >>> and instead print it just once.
> >>>
> >>> Reduces the spam in the system logs when running guests with SEV enabled.
> >> Do we then really need this in the first place?
> > Thomas asked for it just to have an indication when those paths are
> > being used because they could potentially break things pretty bad. I
> > think it is useful knowing that those paths are hit (but only once).
> > It makes it pretty easy for me to tell whether bug reports with people
> > who report black screen can be answered with "the kernel needs to be
> > upgraded" ;)
>
> Sounds reasonable, but my expectation was rather that we would print
> something on the device level.
>
> If that's not feasible for whatever reason than printing it once works
> as well of course.
TBH, I think it's pretty convenient to have the drm_info in the TT
just to make sure that when drivers request use_dma_alloc on SEV
systems TT turns decryption on correctly, i.e. it's a nice sanity
check when reading the logs. But if you'd prefer it in the driver I
can move this logic there as well.
z
Commit 6d98eb95b450 ("binder: avoid potential data leakage when copying
txn") introduced changes to how binder objects are copied. In doing so,
it unintentionally removed an offset alignment check done through calls
to binder_alloc_copy_from_buffer() -> check_buffer().
These calls were replaced in binder_get_object() with copy_from_user(),
so now an explicit offset alignment check is needed here. This avoids
later complications when unwinding the objects gets harder.
It is worth noting this check existed prior to commit 7a67a39320df
("binder: add function to copy binder object from buffer"), likely
removed due to redundancy at the time.
Fixes: 6d98eb95b450 ("binder: avoid potential data leakage when copying txn")
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
drivers/android/binder.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index bad28cf42010..dd6923d37931 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1708,8 +1708,10 @@ static size_t binder_get_object(struct binder_proc *proc,
size_t object_size = 0;
read_size = min_t(size_t, sizeof(*object), buffer->data_size - offset);
- if (offset > buffer->data_size || read_size < sizeof(*hdr))
+ if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
+ !IS_ALIGNED(offset, sizeof(u32)))
return 0;
+
if (u) {
if (copy_from_user(object, u + offset, read_size))
return 0;
--
2.44.0.478.gd926399ef9-goog
Hi Greg,
gregkh(a)linuxfoundation.org wrote on Mon, Apr 15, 2024 at 01:18:20PM +0200:
> mailbox: imx: fix suspend failue
>
> to the 5.10-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mailbox-imx-fix-suspend-failue.patch
> and it can be found in the queue-5.10 subdirectory.
This only fixes typos in the commit message, but Mizobuchi sent a v2 here:
https://lore.kernel.org/all/20240412055648.1807780-1-mizo@atmark-techno.com…
(unfortunately you weren't in cc of the patch mail either, sorry... He
can resend if that helps with the process)
Peng Fan also gave his reviewed-by in the v2 thread, it's always
appreciable to get an ack from someone closer to the authors.
Since the code itself is identical, I'll leave the decision to update or
not up to you; I just can't unsee the "failue" in the summary :)
Thanks!
--
Dominique
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 978e5c19dfefc271e5550efba92fcef0d3f62864
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041502-handsaw-overpass-5b8a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 978e5c19dfefc271e5550efba92fcef0d3f62864 Mon Sep 17 00:00:00 2001
From: Alexey Izbyshev <izbyshev(a)ispras.ru>
Date: Fri, 5 Apr 2024 15:55:51 +0300
Subject: [PATCH] io_uring: Fix io_cqring_wait() not restoring sigmask on
get_timespec64() failure
This bug was introduced in commit 950e79dd7313 ("io_uring: minor
io_cqring_wait() optimization"), which was made in preparation for
adc8682ec690 ("io_uring: Add support for napi_busy_poll"). The latter
got reverted in cb3182167325 ("Revert "io_uring: Add support for
napi_busy_poll""), so simply undo the former as well.
Cc: stable(a)vger.kernel.org
Fixes: 950e79dd7313 ("io_uring: minor io_cqring_wait() optimization")
Signed-off-by: Alexey Izbyshev <izbyshev(a)ispras.ru>
Link: https://lore.kernel.org/r/20240405125551.237142-1-izbyshev@ispras.ru
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4521c2b66b98..c170a2b8d2cf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2602,19 +2602,6 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
if (__io_cqring_events_user(ctx) >= min_events)
return 0;
- if (sig) {
-#ifdef CONFIG_COMPAT
- if (in_compat_syscall())
- ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
- sigsz);
- else
-#endif
- ret = set_user_sigmask(sig, sigsz);
-
- if (ret)
- return ret;
- }
-
init_waitqueue_func_entry(&iowq.wq, io_wake_function);
iowq.wq.private = current;
INIT_LIST_HEAD(&iowq.wq.entry);
@@ -2633,6 +2620,19 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
io_napi_adjust_timeout(ctx, &iowq, &ts);
}
+ if (sig) {
+#ifdef CONFIG_COMPAT
+ if (in_compat_syscall())
+ ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
+ sigsz);
+ else
+#endif
+ ret = set_user_sigmask(sig, sigsz);
+
+ if (ret)
+ return ret;
+ }
+
io_napi_busy_loop(ctx, &iowq);
trace_io_uring_cqring_wait(ctx, min_events);
Please backport the following v6.7 commit:
commit be097997a273 ("KVM: arm64: Always invalidate TLB for stage-2 permission faults")
to stable kernels v5.15 and newer to fix:
It is possible for multiple vCPUs to fault on the same IPA and attempt
to resolve the fault. One of the page table walks will actually update
the PTE and the rest will return -EAGAIN per our race detection scheme.
KVM elides the TLB invalidation on the racing threads as the return
value is nonzero.
Before commit a12ab1378a88 ("KVM: arm64: Use local TLBI on permission
relaxation") KVM always used broadcast TLB invalidations when handling
permission faults, which had the convenient property of making the
stage-2 updates visible to all CPUs in the system. However now we do a
local invalidation, and TLBI elision leads to the vCPU thread faulting
again on the stale entry. Remember that the architecture permits the TLB
to cache translations that precipitate a permission fault.
Invalidate the TLB entry responsible for the permission fault if the
stage-2 descriptor has been relaxed, regardless of which thread actually
did the job.
Thank you!
doug rady
On 2/16/24 17:57, Macpaul Lin wrote:
> This patch fixes an issue where xhci1 was not functioning properly because
> the state and PHY settings were incorrect.
>
> The introduction of the 'force-mode' property in the phy-mtk-tphy driver
> allows for the correct initialization of xhci1 by updating the Device Tree
> settings accordingly.
>
> The necessary fixup which added support for the 'force-mode' switch in the
> phy-mtk-tphy driver.
> commit 9b27303003f5 ("phy: mediatek: tphy: add support force phy mode switch")
> Link: https://lore.kernel.org/r/20231211025624.28991-2-chunfeng.yun@mediatek.com
Dear AngeloGioacchino,
Just a soft reminding about the patch has been sent a while back for the
shared U3PHY and PCIe PHY setup for genio-1200 boards. I'm not sure if
you've missed this patch in mail box. :)
The patch is pretty important as it lets the device tree (dts) decide
whether to enable U3PHY or PCIe PHY. Because this is a shared hardware
phy and could only be configured in dts to decide which function to be
initialized, so it's something that should be included in the
board-specific dts files.
Do you think it needs to be resubmitted, or is it still in the queue for
review? It's meant to be ready for action from kernel version 6.8 onwards.
Looking forward to your thoughts on this. Let me know if there's
anything else you need from my side.
> Prior to this fix, the system would exhibit the following probe failure messages
> for xhci1:
> xhci-mtk 11290000.usb: supply vbus not found, using dummy regulator
> xhci-mtk 11290000.usb: uwk - reg:0x400, version:104
> xhci-mtk 11290000.usb: xHCI Host Controller
> xhci-mtk 11290000.usb: new USB bus registered, assigned bus number 5
> xhci-mtk 11290000.usb: clocks are not stable (0x1003d0f)
> xhci-mtk 11290000.usb: can't setup: -110
> xhci-mtk 11290000.usb: USB bus 5 deregistered
> xhci-mtk: probe of 11290000.usb failed with error -110
>
> With the application of this dts fixup, the aforementioned initialization errors
> are resolved and xhci1 is working.
>
> Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
> ---
> arch/arm64/boot/dts/mediatek/mt8395-genio-1200-evk.dts | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8395-genio-1200-evk.dts b/arch/arm64/boot/dts/mediatek/mt8395-genio-1200-evk.dts
> index 7fc515a07c65..e0b9f2615c11 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8395-genio-1200-evk.dts
> +++ b/arch/arm64/boot/dts/mediatek/mt8395-genio-1200-evk.dts
> @@ -854,6 +854,10 @@
>
> &u3phy1 {
> status = "okay";
> +
> + u3port1: usb-phy@700 {
> + mediatek,force-mode;
> + };
> };
>
> &u3phy2 {
> @@ -885,6 +889,8 @@
> };
>
> &xhci1 {
> + phys = <&u2port1 PHY_TYPE_USB2>,
> + <&u3port1 PHY_TYPE_USB3>;
> vusb33-supply = <&mt6359_vusb_ldo_reg>;
> status = "okay";
> };
Thanks
Macpaul Lin
In order to connect to networks which require 802.11w, add the
MFP_CAPABLE flag and let mac80211 do the actual crypto in software.
When a robust management frame is received, rx_dec->swdec is not set,
even though the HW did not decrypt it. Extend the check and don't set
RX_FLAG_DECRYPTED for these frames in order to use SW decryption.
Use the security flag in the RX descriptor for this purpose, like it is
done in the rtw88 driver.
Cc: stable(a)vger.kernel.org
Signed-off-by: Martin Kaistra <martin.kaistra(a)linutronix.de>
---
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h | 9 +++++++++
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 7 +++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
index fd92d23c43d91..16d884a3d87df 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
@@ -122,6 +122,15 @@ enum rtl8xxxu_rx_type {
RX_TYPE_ERROR = -1
};
+enum rtl8xxxu_rx_desc_enc {
+ RX_DESC_ENC_NONE = 0,
+ RX_DESC_ENC_WEP40 = 1,
+ RX_DESC_ENC_TKIP_WO_MIC = 2,
+ RX_DESC_ENC_TKIP_MIC = 3,
+ RX_DESC_ENC_AES = 4,
+ RX_DESC_ENC_WEP104 = 5,
+};
+
struct rtl8xxxu_rxdesc16 {
#ifdef __LITTLE_ENDIAN
u32 pktlen:14;
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 4a49f8f9d80f2..b15a30a54259e 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -6473,7 +6473,8 @@ int rtl8xxxu_parse_rxdesc16(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
rx_status->mactime = rx_desc->tsfl;
rx_status->flag |= RX_FLAG_MACTIME_START;
- if (!rx_desc->swdec)
+ if (!rx_desc->swdec &&
+ rx_desc->security != RX_DESC_ENC_NONE)
rx_status->flag |= RX_FLAG_DECRYPTED;
if (rx_desc->crc32)
rx_status->flag |= RX_FLAG_FAILED_FCS_CRC;
@@ -6578,7 +6579,8 @@ int rtl8xxxu_parse_rxdesc24(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
rx_status->mactime = rx_desc->tsfl;
rx_status->flag |= RX_FLAG_MACTIME_START;
- if (!rx_desc->swdec)
+ if (!rx_desc->swdec &&
+ rx_desc->security != RX_DESC_ENC_NONE)
rx_status->flag |= RX_FLAG_DECRYPTED;
if (rx_desc->crc32)
rx_status->flag |= RX_FLAG_FAILED_FCS_CRC;
@@ -7998,6 +8000,7 @@ static int rtl8xxxu_probe(struct usb_interface *interface,
ieee80211_hw_set(hw, HAS_RATE_CONTROL);
ieee80211_hw_set(hw, SUPPORT_FAST_XMIT);
ieee80211_hw_set(hw, AMPDU_AGGREGATION);
+ ieee80211_hw_set(hw, MFP_CAPABLE);
wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST);
--
2.39.2
In order to connect to networks which require 802.11w, add the
MFP_CAPABLE flag and let mac80211 do the actual crypto in software.
When a robust management frame is received, rx_dec->swdec is not set,
even though the HW did not decrypt it. Extend the check and don't set
RX_FLAG_DECRYPTED for these frames in order to use SW decryption.
Use the security flag in the RX descriptor for this purpose, like it is
done in the rtw88 driver.
Cc: stable(a)vger.kernel.org
Signed-off-by: Martin Kaistra <martin.kaistra(a)linutronix.de>
---
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h | 9 +++++++++
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 7 +++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
index fd92d23c43d91..4f2615dbfd0f0 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
@@ -122,6 +122,15 @@ enum rtl8xxxu_rx_type {
RX_TYPE_ERROR = -1
};
+enum rtw_rx_desc_enc {
+ RX_DESC_ENC_NONE = 0,
+ RX_DESC_ENC_WEP40 = 1,
+ RX_DESC_ENC_TKIP_WO_MIC = 2,
+ RX_DESC_ENC_TKIP_MIC = 3,
+ RX_DESC_ENC_AES = 4,
+ RX_DESC_ENC_WEP104 = 5,
+};
+
struct rtl8xxxu_rxdesc16 {
#ifdef __LITTLE_ENDIAN
u32 pktlen:14;
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 4a49f8f9d80f2..b15a30a54259e 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -6473,7 +6473,8 @@ int rtl8xxxu_parse_rxdesc16(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
rx_status->mactime = rx_desc->tsfl;
rx_status->flag |= RX_FLAG_MACTIME_START;
- if (!rx_desc->swdec)
+ if (!rx_desc->swdec &&
+ rx_desc->security != RX_DESC_ENC_NONE)
rx_status->flag |= RX_FLAG_DECRYPTED;
if (rx_desc->crc32)
rx_status->flag |= RX_FLAG_FAILED_FCS_CRC;
@@ -6578,7 +6579,8 @@ int rtl8xxxu_parse_rxdesc24(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
rx_status->mactime = rx_desc->tsfl;
rx_status->flag |= RX_FLAG_MACTIME_START;
- if (!rx_desc->swdec)
+ if (!rx_desc->swdec &&
+ rx_desc->security != RX_DESC_ENC_NONE)
rx_status->flag |= RX_FLAG_DECRYPTED;
if (rx_desc->crc32)
rx_status->flag |= RX_FLAG_FAILED_FCS_CRC;
@@ -7998,6 +8000,7 @@ static int rtl8xxxu_probe(struct usb_interface *interface,
ieee80211_hw_set(hw, HAS_RATE_CONTROL);
ieee80211_hw_set(hw, SUPPORT_FAST_XMIT);
ieee80211_hw_set(hw, AMPDU_AGGREGATION);
+ ieee80211_hw_set(hw, MFP_CAPABLE);
wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST);
--
2.39.2
We notice some platforms set "snps,dis_u3_susphy_quirk" and
"snps,dis_u2_susphy_quirk" when they should not need to. Just make sure that
the GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are clear during
initialization. The host initialization involved xhci. So the dwc3 needs to
implement the xhci_plat_priv->plat_start() for xhci to re-enable the suspend
bits.
Since there's a prerequisite patch to drivers/usb/host/xhci-plat.h that's not a
fix patch, this series should go on Greg's usb-testing branch instead of
usb-linus.
Changes in v2:
- Fix xhci-rzv2m build issue
Thinh Nguyen (2):
usb: xhci-plat: Don't include xhci.h
usb: dwc3: core: Prevent phy suspend during init
drivers/usb/dwc3/core.c | 90 +++++++++++++++--------------------
drivers/usb/dwc3/core.h | 1 +
drivers/usb/dwc3/gadget.c | 2 +
drivers/usb/dwc3/host.c | 27 +++++++++++
drivers/usb/host/xhci-plat.h | 4 +-
drivers/usb/host/xhci-rzv2m.c | 1 +
6 files changed, 72 insertions(+), 53 deletions(-)
base-commit: 3d122e6d27e417a9fa91181922743df26b2cd679
--
2.28.0
The patch titled
Subject: mm/hugetlb: fix missing hugetlb_lock for resv uncharge
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-fix-missing-hugetlb_lock-for-resv-uncharge.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/hugetlb: fix missing hugetlb_lock for resv uncharge
Date: Wed, 17 Apr 2024 17:18:35 -0400
There is a recent report on UFFDIO_COPY over hugetlb:
https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/
350: lockdep_assert_held(&hugetlb_lock);
Should be an issue in hugetlb but triggered in an userfault context, where
it goes into the unlikely path where two threads modifying the resv map
together. Mike has a fix in that path for resv uncharge but it looks like
the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd()
will update the cgroup pointer, so it requires to be called with the lock
held.
Link: https://lkml.kernel.org/r/20240417211836.2742593-3-peterx@redhat.com
Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: syzbot+4b8077a5fccc61c385a1(a)syzkaller.appspotmail.com
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-missing-hugetlb_lock-for-resv-uncharge
+++ a/mm/hugetlb.c
@@ -3268,9 +3268,12 @@ struct folio *alloc_hugetlb_folio(struct
rsv_adjust = hugepage_subpool_put_pages(spool, 1);
hugetlb_acct_memory(h, -rsv_adjust);
- if (deferred_reserve)
+ if (deferred_reserve) {
+ spin_lock_irq(&hugetlb_lock);
hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
pages_per_huge_page(h), folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
}
if (!memcg_charge_ret)
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-hugetlb-fix-missing-hugetlb_lock-for-resv-uncharge.patch
mm-hmm-process-pud-swap-entry-without-pud_huge.patch
mm-gup-cache-p4d-in-follow_p4d_mask.patch
mm-gup-check-p4d-presence-before-going-on.patch
mm-x86-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-sparc-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-arm-use-macros-to-define-pmd-pud-helpers.patch
mm-arm-redefine-pmd_huge-with-pmd_leaf.patch
mm-arm64-merge-pxd_huge-and-pxd_leaf-definitions.patch
mm-powerpc-redefine-pxd_huge-with-pxd_leaf.patch
mm-gup-merge-pxd-huge-mapping-checks.patch
mm-treewide-replace-pxd_huge-with-pxd_leaf.patch
mm-treewide-remove-pxd_huge.patch
mm-arm-remove-pmd_thp_or_huge.patch
mm-document-pxd_leaf-api.patch
mm-always-initialise-folio-_deferred_list-fix.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation-fix.patch
mm-kconfig-config_pgtable_has_huge_leaves.patch
mm-hugetlb-declare-hugetlbfs_pagecache_present-non-static.patch
mm-make-hpage_pxd_-macros-even-if-thp.patch
mm-introduce-vma_pgtable_walk_beginend.patch
mm-arch-provide-pud_pfn-fallback.patch
mm-arch-provide-pud_pfn-fallback-fix.patch
mm-gup-drop-folio_fast_pin_allowed-in-hugepd-processing.patch
mm-gup-refactor-record_subpages-to-find-1st-small-page.patch
mm-gup-handle-hugetlb-for-no_page_table.patch
mm-gup-cache-pudp-in-follow_pud_mask.patch
mm-gup-handle-huge-pud-for-follow_pud_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask-fix.patch
mm-gup-handle-hugepd-for-follow_page.patch
mm-gup-handle-hugetlb-in-the-generic-follow_page_mask-code.patch
mm-allow-anon-exclusive-check-over-hugetlb-tail-pages.patch
We flush the rebind worker during the vm close phase, however in places
like preempt_fence_work_func() we seem to queue the rebind worker
without first checking if the vm has already been closed. The concern
here is the vm being closed with the worker flushed, but then being
rearmed later, which looks like potential uaf, since there is no actual
refcounting to track the queued worker. To ensure this can't happen
prevent queueing the rebind worker once the vm has been closed.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_pt.c | 2 +-
drivers/gpu/drm/xe/xe_vm.h | 17 ++++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 5b7930f46cf3..e21461be904f 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -1327,7 +1327,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
}
if (!rebind && last_munmap_rebind &&
xe_vm_in_preempt_fence_mode(vm))
- xe_vm_queue_rebind_worker(vm);
+ xe_vm_queue_rebind_worker_locked(vm);
} else {
kfree(rfence);
kfree(ifence);
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 306cd0934a19..8420fbf19f6d 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -211,10 +211,20 @@ int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker);
int xe_vm_invalidate_vma(struct xe_vma *vma);
-static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
+static inline void xe_vm_queue_rebind_worker_locked(struct xe_vm *vm)
{
xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm));
- queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work);
+ lockdep_assert_held(&vm->lock);
+
+ if (!xe_vm_is_closed(vm))
+ queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work);
+}
+
+static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
+{
+ down_read(&vm->lock);
+ xe_vm_queue_rebind_worker_locked(vm);
+ up_read(&vm->lock);
}
/**
@@ -225,12 +235,13 @@ static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
* If the rebind functionality on a compute vm was disabled due
* to nothing to execute. Reactivate it and run the rebind worker.
* This function should be called after submitting a batch to a compute vm.
+ *
*/
static inline void xe_vm_reactivate_rebind(struct xe_vm *vm)
{
if (xe_vm_in_preempt_fence_mode(vm) && vm->preempt.rebind_deactivated) {
vm->preempt.rebind_deactivated = false;
- xe_vm_queue_rebind_worker(vm);
+ xe_vm_queue_rebind_worker_locked(vm);
}
}
--
2.44.0
PERST is active low according to the PCIe specification.
However, the existing pcie-dw-rockchip.c driver does:
gpiod_set_value(..., 0); msleep(100); gpiod_set_value(..., 1);
When asserting + deasserting PERST.
This is of course wrong, but because all the device trees for this
compatible string have also incorrectly marked this GPIO as ACTIVE_HIGH:
$ git grep -B 10 reset-gpios arch/arm64/boot/dts/rockchip/rk3568*
$ git grep -B 10 reset-gpios arch/arm64/boot/dts/rockchip/rk3588*
The actual toggling of PERST is correct.
(And we cannot change it anyway, since that would break device tree
compatibility.)
However, this driver does request the GPIO to be initialized as
GPIOD_OUT_HIGH, which does cause a silly sequence where PERST gets
toggled back and forth for no good reason.
Fix this by requesting the GPIO to be initialized as GPIOD_OUT_LOW
(which for this driver means PERST asserted).
This will avoid an unnecessary signal change where PERST gets deasserted
(by devm_gpiod_get_optional()) and then gets asserted
(by rockchip_pcie_start_link()) just a few instructions later.
Before patch, debug prints on EP side, when booting RC:
[ 845.606810] pci: PERST asserted by host!
[ 852.483985] pci: PERST de-asserted by host!
[ 852.503041] pci: PERST asserted by host!
[ 852.610318] pci: PERST de-asserted by host!
After patch, debug prints on EP side, when booting RC:
[ 125.107921] pci: PERST asserted by host!
[ 132.111429] pci: PERST de-asserted by host!
This extra, very short, PERST assertion + deassertion has been reported
to cause issues with certain WLAN controllers, e.g. RTL8822CE.
Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver")
Tested-by: Jianfeng Liu <liujianfeng1994(a)gmail.com>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Cc: stable(a)vger.kernel.org # 5.15+
---
Changes since V1:
-Picked up tags.
-CC stable.
-Clarified commit message.
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index d6842141d384..a909e42b4273 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -240,7 +240,7 @@ static int rockchip_pcie_resource_get(struct platform_device *pdev,
return PTR_ERR(rockchip->apb_base);
rockchip->rst_gpio = devm_gpiod_get_optional(&pdev->dev, "reset",
- GPIOD_OUT_HIGH);
+ GPIOD_OUT_LOW);
if (IS_ERR(rockchip->rst_gpio))
return PTR_ERR(rockchip->rst_gpio);
--
2.44.0
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
Guoyong- can you test this and tell me whether it fixes the problem you
were seeing? If so, I'll try to get this sent up for 6.9. -Jason
drivers/char/random.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
--
2.44.0
When an UART is opened that still has .throttled set from a previous
open, the RX interrupt is enabled but the irq handler doesn't consider
it. This easily results in a stuck irq with the effect to occupy the CPU
in a tight loop.
So reset the throttle state in .startup() to ensure that RX irqs are
handled.
Fixes: d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
---
drivers/tty/serial/stm32-usart.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 2bea1d7c9858..e1e7bc04c579 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -1080,6 +1080,7 @@ static int stm32_usart_startup(struct uart_port *port)
val |= USART_CR2_SWAP;
writel_relaxed(val, port->membase + ofs->cr2);
}
+ stm32_port->throttled = false;
/* RX FIFO Flush */
if (ofs->rqr != UNDEF_REG)
--
2.43.0
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
---
drivers/tty/serial/stm32-usart.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 8c66abcfe6ca..2bea1d7c9858 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -849,6 +849,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -857,11 +858,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -870,6 +874,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -884,6 +889,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -891,6 +897,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -900,9 +907,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
--
2.43.0
After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset operation, in which the default mac
address from the device is read, is not executed from bind operation and
the random address, that is pregenerated just in case, is direclty written
the first time in the device, so the default one from the device is not
even read. This writing is not dangerous because is volatile and the
default mac address is not missed.
In order to avoid this, do not allow writing a mac address directly into
the device, if the default mac address from the device has not been read
yet.
cc: stable(a)vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Jarkko Palviainen <jarkko.palviainen(a)gmail.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
---
drivers/net/usb/ax88179_178a.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index 69169842fa2f..650bb7b6badf 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -174,6 +174,7 @@ struct ax88179_data {
u32 wol_supported;
u32 wolopts;
u8 disconnecting;
+ u8 mac_address_read;
};
struct ax88179_int_data {
@@ -969,9 +970,12 @@ static int ax88179_change_mtu(struct net_device *net, int new_mtu)
static int ax88179_set_mac_addr(struct net_device *net, void *p)
{
struct usbnet *dev = netdev_priv(net);
+ struct ax88179_data *ax179_data = dev->driver_priv;
struct sockaddr *addr = p;
int ret;
+ if (!ax179_data->mac_address_read)
+ return -EAGAIN;
if (netif_running(net))
return -EBUSY;
if (!is_valid_ether_addr(addr->sa_data))
@@ -1256,6 +1260,7 @@ static int ax88179_led_setting(struct usbnet *dev)
static void ax88179_get_mac_addr(struct usbnet *dev)
{
+ struct ax88179_data *ax179_data = dev->driver_priv;
u8 mac[ETH_ALEN];
memset(mac, 0, sizeof(mac));
@@ -1281,6 +1286,9 @@ static void ax88179_get_mac_addr(struct usbnet *dev)
ax88179_write_cmd(dev, AX_ACCESS_MAC, AX_NODE_ID, ETH_ALEN, ETH_ALEN,
dev->net->dev_addr);
+
+ ax179_data->mac_address_read = 1;
+ netdev_info(dev->net, "MAC address: %pM\n", dev->net->dev_addr);
}
static int ax88179_bind(struct usbnet *dev, struct usb_interface *intf)
--
2.44.0
Without these patches, all tested laptop models using the es8326 audio codec
have no sound from the speakers and headphones, and the headset microphone
does not work. Although the initialization of the sound card is successful.
--
ver.2:
drop a commit that does not affect the fix of functionality;
added an explanation that does not work on the current version of the kernel
without patches.
--
Patches have been successfully tested on the latest 6.1.86 kernel.
[PATCH 6.1.y 1/6] ASoC: codecs: ES8326: Add es8326_mute function
[PATCH 6.1.y 2/6] ASoC: codecs: ES8326: Change Hp_detect register names
[PATCH 6.1.y 3/6] ASoC: codecs: ES8326: Change Volatile Reg function
[PATCH 6.1.y 4/6] ASoC: codecs: ES8326: Fix power-up sequence
[PATCH 6.1.y 5/6] ASOC: codecs: ES8326: Add calibration support for
[PATCH 6.1.y 6/6] ASoC: codecs: ES8326: Update jact detection function