On Tue, 2025-07-08 at 18:23 +0200, Greg Kroah-Hartman wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Niklas Schnelle <schnelle@linux.ibm.com>
[ Upstream commit 45537926dd2aaa9190ac0fac5a0fbeefcadfea95 ]
The error event information for PCI error events contains a function handle for the respective function. This handle is generally captured at the time the error event was recorded. Due to delays in processing or cascading issues, it may happen that during firmware recovery multiple events are generated. When processing these events in order Linux may already have recovered an affected function making the event information stale. Fix this by doing an unconditional CLP List PCI function retrieving the current function handle with the zdev->state_lock held and ignoring the event if its function handle is stale.
Cc: stable@vger.kernel.org
Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery")
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
 arch/s390/pci/pci_event.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index d969f36bf186f..fd83588f3c11d 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -257,6 +257,8 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 	struct zpci_dev *zdev = get_zdev_by_fid(ccdf->fid);
 	struct pci_dev *pdev = NULL;
 	pci_ers_result_t ers_res;
+	u32 fh = 0;
+	int rc;
 	zpci_dbg(3, "err fid:%x, fh:%x, pec:%x\n", ccdf->fid, ccdf->fh, ccdf->pec);
@@ -264,6 +266,16 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 	zpci_err_hex(ccdf, sizeof(*ccdf));
 	if (zdev) {
+		mutex_lock(&zdev->state_lock);
This won't compile: this tree is missing commit bcb5d6c76903 ("s390/pci: introduce lock to synchronize state of zpci_dev's").
+		rc = clp_refresh_fh(zdev->fid, &fh);
+		if (rc)
+			goto no_pdev;
+		if (!fh || ccdf->fh != fh) {
+			/* Ignore events with stale handles */
+			zpci_dbg(3, "err fid:%x, fh:%x (stale %x)\n",
+				 ccdf->fid, fh, ccdf->fh);
+			goto no_pdev;
+		}
 		zpci_update_fh(zdev, ccdf->fh);
 		if (zdev->zbus->bus)
 			pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn);
@@ -292,6 +304,8 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 	}
 	pci_dev_put(pdev);
 no_pdev:
+	if (zdev)
+		mutex_unlock(&zdev->state_lock);
Curiously, this patch was adjusted differently here than for 6.1.y; this one at least places the unlock in the same place as upstream.
 	zpci_zdev_put(zdev);
 }
Please drop this patch! Then, can we pull in commit bcb5d6c76903 ("s390/pci: introduce lock to synchronize state of zpci_dev's") as a prerequisite? This fix would still work for its specific issue without the mutex, i.e. by just adjusting the context, but I'd prefer to have both in stable.
Also, I wonder if it would be possible to have the subject of these kinds of mails indicate whether the backported patch was adjusted beyond line offsets or context? I think that would make it much easier to spot where extra attention is required.
Thanks, Niklas