On Mon, Sep 29, 2025 at 02:15:47AM -0700, Breno Leitao wrote:
Like pci_dev_aer_stats_incr(), pci_print_aer() may be called when dev->aer_info is NULL. Add a NULL check so that aer_ratelimit() does not dereference a NULL aer_info pointer. In that case it returns 1, which means the message is not rate limited, given this is fatal.
This prevents a kernel crash triggered by dereferencing a NULL pointer in aer_ratelimit(), ensuring safer handling of PCI devices that lack AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr() which already performs this NULL check.
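The guard described above can be illustrated with a small userspace sketch. This is not the kernel code: every `*_model` name below is a hypothetical stand-in, the counter-based limiter is a simplification of `__ratelimit()`, and the correctable case is included only as an assumption about how the rest of the switch looks. It only shows the shape of the early return when `aer_info` is NULL.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only. All *_model names are hypothetical stand-ins,
 * not the kernel's real definitions.
 */
enum aer_severity_model {
	AER_NONFATAL_MODEL,
	AER_FATAL_MODEL,
	AER_CORRECTABLE_MODEL,
};

/* Simplified limiter: allows "budget" more messages, then refuses. */
struct ratelimit_model {
	int budget;
};

struct aer_info_model {
	struct ratelimit_model correctable_ratelimit;
	struct ratelimit_model nonfatal_ratelimit;
};

struct pci_dev_model {
	struct aer_info_model *aer_info;	/* may be NULL */
};

/* Returns 1 if the message may be printed, 0 if it is suppressed. */
static int ratelimit_model_ok(struct ratelimit_model *rs)
{
	if (rs->budget <= 0)
		return 0;
	rs->budget--;
	return 1;
}

static int aer_ratelimit_model(struct pci_dev_model *dev,
			       unsigned int severity)
{
	/* The fix: without this check, the switch dereferences NULL. */
	if (!dev->aer_info)
		return 1;	/* 1 means "do not rate limit" */

	switch (severity) {
	case AER_NONFATAL_MODEL:
		return ratelimit_model_ok(&dev->aer_info->nonfatal_ratelimit);
	case AER_CORRECTABLE_MODEL:
		return ratelimit_model_ok(&dev->aer_info->correctable_ratelimit);
	default:
		return 1;	/* fatal errors are never rate limited here */
	}
}

/* Exercises both the NULL path and the normal path; returns 0 on success. */
static int aer_ratelimit_selfcheck(void)
{
	struct pci_dev_model no_info = { .aer_info = NULL };
	struct aer_info_model info = {
		.correctable_ratelimit = { .budget = 1 },
		.nonfatal_ratelimit = { .budget = 2 },
	};
	struct pci_dev_model with_info = { .aer_info = &info };

	/* NULL aer_info: always allowed, no crash. */
	assert(aer_ratelimit_model(&no_info, AER_NONFATAL_MODEL) == 1);

	/* Budget of 2: two messages pass, the third is suppressed. */
	assert(aer_ratelimit_model(&with_info, AER_NONFATAL_MODEL) == 1);
	assert(aer_ratelimit_model(&with_info, AER_NONFATAL_MODEL) == 1);
	assert(aer_ratelimit_model(&with_info, AER_NONFATAL_MODEL) == 0);
	return 0;
}
```

Calling `aer_ratelimit_selfcheck()` from a `main()` exercises both paths. Returning 1 on the NULL path means such a device is never throttled in this model, matching the commit log's description.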
Cc: stable@vger.kernel.org
Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
Signed-off-by: Breno Leitao <leitao@debian.org>
Thanks, Breno, I applied this to pci/aer for v6.18. I added a little more detail to the commit log because the path where we hit this is a bit obscure. Please take a look and see if it makes sense:
https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git/commit/?id=451f30b9...
- This problem is still happening upstream, and unfortunately no action was taken in the previous discussion.
- Link to previous post: https://lore.kernel.org/r/20250804-aer_crash_2-v1-1-fd06562c18a4@debian.org
 drivers/pci/pcie/aer.c | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index e286c197d7167..55abc5e17b8b1 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
 
 static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
 {
+	if (!dev->aer_info)
+		return 1;
+
 	switch (severity) {
 	case AER_NONFATAL:
 		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
change-id: 20250801-aer_crash_2-b21cc2ef0d00
Best regards,
Breno Leitao <leitao@debian.org>