From: Joerg M. Sigle <joerg.sigle@jsigle.com>

Patch 416fa531c816 commit 416fa531c8160151090206a51b829b9218b804d9 caused an immediate kernel panic on boot at RIP: 0010:__domain_mapping+0xa7/0x3a0 with longterm kernel 5.10.37 configured w/ CONFIG_INTEL_IOMMU_DEFAULT_ON=y due to removal of a check. Putting the check back in place fixes this.

The kernel panic was observed on various Intel Core i7 i5 i3 CPUs from Sandy Bridge, Haswell, Broadwell and Kaby Lake generations (at least). It may NOT be reproducible on some older CPU generations. Suppressing the panic with boot parameter intel_iommu=off is diagnostic.

See:
https://bugzilla.kernel.org/show_bug.cgi?id=213077
https://bugzilla.kernel.org/show_bug.cgi?id=213087
https://bugzilla.kernel.org/show_bug.cgi?id=213095

Signed-off-by: Joerg M. Sigle <joerg.sigle@jsigle.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
---
Hi Greg, thanks for your response.
> Are you sure that 5.10.38 doesn't already fix this issue? We resolved an issue in this area.
Yes, I'm sure.
Your a282b76166b13496967c70bd61ea8f03609d8a76 simply reverts the offending patch. My approach corrects it instead.
I have submitted that properly before, don't know why it hasn't found you yet.
I'm including the "proper" submission info above. The file with the patch is attached again.
If I should send it to another address, please tell me.
Communication with Lu Baolu who saw it last Monday is also attached below.
Hope this contribution is helpful. Thanks & kind regards! Joerg
On 20.05.2021 at 10:52, Greg KH wrote:
> On Thu, May 20, 2021 at 09:47:40AM +0200, Joerg M. Sigle wrote:
> > Dear colleagues
> > I've submitted a patch for 5.10.37 that wasn't included in 5.10.38, which would have corrected a patch that has been reverted instead.
> > More info: https://bugzilla.kernel.org/show_bug.cgi?id=213077
> > Now sending to the other kernel list, according to autoresponse from Greg Kroah-Hartman.
> > Thanks for any feedback & Kind regards, Joerg Sigle
> Are you sure that 5.10.38 doesn't already fix this issue? We resolved an issue in this area.
> And where is the patch, I can't find it in this email, can you submit it "properly" so that it can be reviewed?
> thanks,
> greg k-h
---
Dear colleagues,
Please find the suggested patch in the attachment, now reformatted to include the affected C function. It fixes a problem in LT kernel 5.10.37; I'm asking for inclusion into LT kernel 5.10.38.
I'm submitting this now, after receiving Lu Baolu's positive response attached below. Baolu, I hope that the line "Acked-by: Lu Baolu ..." is ok given your comment.
I hope I'm providing this in a useful way, following https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
I'm still unsure whether this line should be added above:
Cc: stable@vger.kernel.org
Please add this if needed, also considering Baolu's comment re. upstream/backported.
Thanks and kind regards to all! Joerg
On 17.05.2021 at 04:51, Lu Baolu wrote:
Hi Joerg,
On 5/16/21 7:57 AM, Joerg M. Sigle wrote:
Dear colleagues at Intel
could you please check the enclosed bug report and confirm whether the suggested patch is valid.
Thank you very much & kind regards - Joerg
-------- Forwarded Message --------
From: bugzilla-daemon@bugzilla.kernel.org
To: joerg.sigle@jsigle.com
Subject: [Bug 213077] Kernel 5.10.37 immediately panics at boot w/ Intel Core i7-4910MQ Haswell or Core i3-5010U Broadwell w/ custom .config CONFIG_INTEL_IOMMU_DEFAULT_ON=y, same config worked with 5.10.36, due to commit 416fa531c816 = a8ce9ebbecdfda3322bbcece6b3b25888217f8e3
Date: Sat, 15 May 2021 23:47:39 +0000
X-Envelope-To: joerg.sigle@jsigle.com
https://bugzilla.kernel.org/show_bug.cgi?id=213077
--- Comment #7 from Joerg M. Sigle (joerg.sigle@jsigle.com) ---
This patch:
416fa531c816 iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
commit 416fa531c8160151090206a51b829b9218b804d9
Upstream commit a8ce9ebbecdfda3322bbcece6b3b25888217f8e3
https://github.com/arter97/x86-kernel/commit/416fa531c8160151090206a51b829b9...
while doing other things, changed the conditional:
	if (!sg) {
		...
		sg_res = nr_pages;
		pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
	}
to an unconditional:
pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
Reinserting the check for !sg fixed the immediate panic on boot for me. Reverting the remainder of the same patch had not helped before.
Here's a possible patch for 5.10.37:
--- a/drivers/iommu/intel/iommu.c 2021-05-14 09:50:46.000000000 +0200
+++ b/drivers/iommu/intel/iommu.c 2021-05-16 01:02:17.816810690 +0200
@@ -2373,7 +2373,10 @@
 		}
 	}
 
-	pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
+	if (!sg) {
+		sg_res = nr_pages;
+		pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
+	}
 
 	while (nr_pages > 0) {
 		uint64_t tmp;
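[Illustration, not part of the original thread.] One plausible reading of why the missing check matters can be sketched as a simplified user-space model. It is not the driver code and makes several assumptions: that the mapping loop refills its per-segment counter from sg whenever sg_res is zero (as the 5.10 __domain_mapping() appears to do), that struct sg_model and would_deref_null_sg() are hypothetical names invented here, and that a single scatterlist segment is enough to show the effect. Under those assumptions, dropping the "if (!sg)" block leaves sg_res at 0 for callers that pass no scatterlist, so the loop tries to read from a NULL sg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12 /* 4 KiB VT-d pages */

/* Hypothetical stand-in for struct scatterlist: one segment only. */
struct sg_model {
	uint64_t phys;       /* page-aligned physical address */
	unsigned long pages; /* segment length in pages, must be > 0 */
};

/*
 * Model of the mapping loop. Returns 1 if the loop would have to read
 * from sg while sg is NULL (a NULL dereference in the kernel), else 0.
 * with_check selects the variant that keeps the "if (!sg)" block.
 */
static int would_deref_null_sg(const struct sg_model *sg,
			       unsigned long phys_pfn,
			       unsigned long nr_pages,
			       uint64_t attr, int with_check)
{
	unsigned long sg_res = 0; /* pages left in the current segment */
	uint64_t pteval = 0;

	assert(sg == NULL || sg->pages > 0);

	if (with_check) {
		if (!sg) {
			sg_res = nr_pages; /* no scatterlist: one contiguous run */
			pteval = ((uint64_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
		}
	} else {
		/* variant without the check: sg_res stays 0 */
		pteval = ((uint64_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
	}

	while (nr_pages > 0) {
		if (!sg_res) {
			if (!sg)
				return 1; /* the kernel would touch sg->... here */
			sg_res = sg->pages; /* refill from the segment */
			pteval = sg->phys | attr;
		}
		pteval += (uint64_t)1 << VTD_PAGE_SHIFT; /* advance to next page */
		nr_pages--;
		sg_res--;
	}
	(void)pteval; /* the model only tracks control flow */
	return 0;
}
```

In this model, a call with sg == NULL is safe only when with_check is nonzero; calls that pass a real segment behave the same in both variants, which would be consistent with the panic appearing only on the non-scatterlist mapping path.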
Could you please check this patch submission and pass it to upstream?
Above fix looks good to me.
This issue is caused by the back-ported patch for stable v5.10.37. There's no need for upstream.
Best regards, baolu
I have, however, NOT tried to understand what the code really does. So please ask the authors of patch 416fa531c816 whether their removal of the condition was intentional or a mere oversight. Thanks!
Thanks and kind regards, Joerg
On Thu, May 20, 2021 at 04:05:51PM +0200, Joerg M. Sigle wrote:
> From: Joerg M. Sigle <joerg.sigle@jsigle.com>
Odd subject line, what happened?
> Patch 416fa531c816 commit 416fa531c8160151090206a51b829b9218b804d9 caused an immediate kernel panic on boot at RIP: 0010:__domain_mapping+0xa7/0x3a0 with longterm kernel 5.10.37 configured w/ CONFIG_INTEL_IOMMU_DEFAULT_ON=y due to removal of a check. Putting the check back in place fixes this.
> The kernel panic was observed on various Intel Core i7 i5 i3 CPUs from Sandy Bridge, Haswell, Broadwell and Kaby Lake generations (at least). It may NOT be reproducible on some older CPU generations. Suppressing the panic with boot parameter intel_iommu=off is diagnostic.
> See:
> https://bugzilla.kernel.org/show_bug.cgi?id=213077
> https://bugzilla.kernel.org/show_bug.cgi?id=213087
> https://bugzilla.kernel.org/show_bug.cgi?id=213095
> Signed-off-by: Joerg M. Sigle <joerg.sigle@jsigle.com>
> Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
> Hi Greg, thanks for your response.
> > Are you sure that 5.10.38 doesn't already fix this issue? We resolved an issue in this area.
> Yes, I'm sure.
Did you test 5.10.38?
> Your a282b76166b13496967c70bd61ea8f03609d8a76 simply reverts the offending patch.
Yes, and then the next commit in the series applied it back in a different form, hopefully "fixed up" correctly.
> My approach corrects it instead.
How so? It does not apply to 5.10.38. Nor to 5.10.39-rc1.
> I have submitted that properly before, don't know why it hasn't found you yet.
Where on lore.kernel.org is it shown?
> I'm including the "proper" submission info above. The file with the patch is attached again.
That's not how to submit patches, take a look at the kernel documentation for all that we need.
But again, are you sure this is needed? If so, can you make it against 5.10.39-rc1?
thanks,
greg k-h