The stable release process documented in stable-kernel-rules.rst needs
to be updated to reflect the current procedure.
Patch 1 merely reorganizes the three submission option lists into
subsections of the procedure section.
Patch 2 contains the actual process documentation update.
Patches 3 and 4 update the "Tree" section by adding a link to the stable
-rc tree and correcting the link to the stable tree, respectively.
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Bagas Sanjaya (4):
Documentation: make option lists subsection of "Procedure for
submitting patches to the -stable tree" in stable-kernel-rules.rst
Documentation: update stable review cycle documentation
Documentation: add link to stable release candidate tree
Documentation: update stable tree link
Documentation/process/stable-kernel-rules.rst | 31 ++++++++++++++-----
1 file changed, 24 insertions(+), 7 deletions(-)
base-commit: ffb217a13a2eaf6d5bd974fc83036a53ca69f1e2
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 173ce1ca47c489135b2799f70f550e1319ba36d8 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells(a)redhat.com>
Date: Fri, 11 Mar 2022 15:58:21 +0000
Subject: [PATCH] afs: Fix potential thrashing in afs writeback
In afs_writepages_region(), if the dirty page we find is undergoing
writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go
round the loop trying the same page again and again with no pausing or
waiting unless and until another thread manages to clear the writeback
and fscache flags.
Fix this with three measures:
(1) Advance start to after the page we found.
(2) Break out of the loop and return if rescheduling is requested.
(3) Arbitrarily give up after a maximum of 5 skips.
Fixes: 31143d5d515e ("AFS: implement basic file write support")
Reported-by: Marc Dionne <marc.dionne(a)auristor.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Tested-by: Marc Dionne <marc.dionne(a)auristor.com>
Acked-by: Marc Dionne <marc.dionne(a)auristor.com>
Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@wa… # v1
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 5e9157d0da29..f447c902318d 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -703,7 +703,7 @@ static int afs_writepages_region(struct address_space *mapping,
struct folio *folio;
struct page *head_page;
ssize_t ret;
- int n;
+ int n, skips = 0;
_enter("%llx,%llx,", start, end);
@@ -754,8 +754,15 @@ static int afs_writepages_region(struct address_space *mapping,
#ifdef CONFIG_AFS_FSCACHE
folio_wait_fscache(folio);
#endif
+ } else {
+ start += folio_size(folio);
}
folio_put(folio);
+ if (wbc->sync_mode == WB_SYNC_NONE) {
+ if (skips >= 5 || need_resched())
+ break;
+ skips++;
+ }
continue;
}
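The three measures listed in the commit message can be illustrated
outside the kernel. What follows is only a toy sketch of the
non-waiting (WB_SYNC_NONE-like) path, not the afs code: struct toy_page,
toy_need_resched() and toy_writeback_nowait() are hypothetical names
invented for this illustration, and pages are indexed one at a time
rather than by folio size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SKIPS 5 /* mirrors the "maximum of 5 skips" in the commit message */

/* Hypothetical stand-in for a folio: only the state this sketch needs. */
struct toy_page {
	bool busy;    /* under writeback or being written to the cache */
	bool written; /* set once this sketch has "written back" the page */
};

/* Hypothetical stand-in for need_resched(); always false in this sketch. */
static bool toy_need_resched(void)
{
	return false;
}

/*
 * Walk pages[start..npages) without waiting on busy pages.
 * Returns the index at which the walk stopped.
 */
static size_t toy_writeback_nowait(struct toy_page *pages, size_t npages,
				   size_t start)
{
	int skips = 0;

	assert(pages != NULL || npages == 0);

	while (start < npages) {
		struct toy_page *p = &pages[start];

		if (p->busy) {
			/* (1) advance past the busy page instead of retrying it */
			start++;
			/* (2) and (3): stop on resched request or too many skips */
			if (skips >= MAX_SKIPS || toy_need_resched())
				break;
			skips++;
			continue;
		}
		p->written = true;
		start++;
	}
	return start;
}
```

Because start always moves forward, the loop can no longer spin on one
busy page; the skip counter only bounds how much work is wasted when
many pages in a row are busy.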
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
4.19-stable tree; the same backport request applies.
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
4.14-stable tree; the same backport request applies.
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
5.10-stable tree; the same backport request applies.
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
5.4-stable tree; the same backport request applies.
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
5.15-stable tree; the same backport request applies.
The same patch (commit 173ce1ca47c4, "afs: Fix potential thrashing in
afs writeback", quoted in full above) also does not apply to the
5.16-stable tree; the same backport request applies.
The driver_override field of a platform device should not be initialized
from static memory (a string literal) because the core later kfree()s it,
for example when driver_override is set via sysfs.
Use the dedicated helper to set driver_override properly.
Fixes: 77d8f3068c63 ("clk: imx: scu: add two cells binding support")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
---
drivers/clk/imx/clk-scu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/imx/clk-scu.c b/drivers/clk/imx/clk-scu.c
index 083da31dc3ea..4b2268b7d0d0 100644
--- a/drivers/clk/imx/clk-scu.c
+++ b/drivers/clk/imx/clk-scu.c
@@ -683,7 +683,12 @@ struct clk_hw *imx_clk_scu_alloc_dev(const char *name,
return ERR_PTR(ret);
}
- pdev->driver_override = "imx-scu-clk";
+ ret = driver_set_override(&pdev->dev, &pdev->driver_override,
+ "imx-scu-clk", strlen("imx-scu-clk"));
+ if (ret) {
+ platform_device_put(pdev);
+ return ERR_PTR(ret);
+ }
ret = imx_clk_scu_attach_pd(&pdev->dev, rsrc_id);
if (ret)
--
2.32.0
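The ownership rule behind this fix can be shown in a small userspace
sketch. It is not the kernel's driver_set_override() and does not
reproduce its behaviour in detail; struct toy_device, toy_set_override()
and toy_release() are hypothetical names used only here. The point is
that a field which some later path frees must only ever hold NULL or a
heap-allocated copy, never a string literal.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a platform device; not the kernel structure. */
struct toy_device {
	char *driver_override; /* always NULL or heap-allocated */
};

/*
 * Toy analogue of a "set override" helper: store a private heap copy of
 * the first len bytes of s and release any previous value.
 * Returns 0 on success or a negative errno-style value.
 */
static int toy_set_override(struct toy_device *dev, const char *s, size_t len)
{
	char *copy;

	assert(dev != NULL);
	if (s == NULL)
		return -EINVAL;

	copy = malloc(len + 1);
	if (copy == NULL)
		return -ENOMEM;
	memcpy(copy, s, len);
	copy[len] = '\0';

	/* Safe: the old value is NULL or came from malloc(). */
	free(dev->driver_override);
	dev->driver_override = copy;
	return 0;
}

/* Teardown path that frees the field, as the commit message describes. */
static void toy_release(struct toy_device *dev)
{
	assert(dev != NULL);
	free(dev->driver_override);
	dev->driver_override = NULL;
}
```

Had the field been assigned a string literal directly, the free() in
either function would be handed a pointer that was never allocated,
which is the userspace counterpart of the kfree() problem described
above.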
Hi,
The following two patches were "automatically" backported and applied in
4.14.y, 4.19.y, 5.4.y, 5.10.y, 5.15.y and 5.16.y. But they failed
to apply cleanly in v4.9.y due to some changes in the patch context
and one missing function in the older batman-adv version.
These problems have now been fixed manually.
Kind regards,
Sven
Sven Eckelmann (2):
batman-adv: Request iflink once in batadv-on-batadv check
batman-adv: Don't expect inter-netns unique iflink indices
net/batman-adv/hard-interface.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--
2.30.2
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13 Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Sat, 5 Mar 2022 18:07:14 +0100
Subject: [PATCH] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I pointed out the mix-up and asked him for guidance).
So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_FROM_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.
Note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.
To make this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 17706dc91ec9..1887d92e8e92 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,11 +130,3 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
-
-DMA_ATTR_OVERWRITE
-------------------
-
-This is a hint to the DMA-mapping subsystem that the device is expected to
-overwrite the entire mapped size, thus the caller does not require any of the
-previous buffer contents to be preserved. This allows bounce-buffering
-implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6150d11a607e..dca2b1355bb1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,14 +61,6 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
-/*
- * This is a hint to the DMA-mapping subsystem that the device is expected
- * to overwrite the entire mapped size, thus the caller does not require any
- * of the previous buffer contents to be preserved. This allows
- * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
- */
-#define DMA_ATTR_OVERWRITE (1UL << 10)
-
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bfc56cb21705..6db1c475ec82 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -627,10 +627,14 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
for (i = 0; i < nr_slots(alloc_size + offset); i++)
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
- if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
- dir == DMA_BIDIRECTIONAL))
- swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
+ /*
+ * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
+ * to the tlb buffer, if we knew for sure the device will
+ * overwirte the entire current content. But we don't. Thus
+ * unconditional bounce may prevent leaking swiotlb content (i.e.
+ * kernel memory) to user-space.
+ */
+ swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
@@ -697,10 +701,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir)
{
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
- swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
- else
- BUG_ON(dir != DMA_FROM_DEVICE);
+ /*
+ * Unconditional bounce is necessary to avoid corruption on
+ * sync_*_for_cpu or dma_ummap_* when the device didn't overwrite
+ * the whole lengt of the bounce buffer.
+ */
+ swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
+ BUG_ON(!valid_dma_direction(dir));
}
void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
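Why the copy into the bounce buffer has to be unconditional can be seen
in a toy model. This is only a sketch under simplifying assumptions, not
the swiotlb implementation: struct toy_mapping, toy_map(),
toy_device_write() and toy_unmap() are hypothetical names, the buffer
size is fixed, and the "device" is just a memset().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 8

/* Hypothetical toy model of one bounce-buffer mapping; not the swiotlb API. */
struct toy_mapping {
	unsigned char *orig;            /* the caller's buffer */
	unsigned char bounce[BUF_SIZE]; /* slot the "device" reads and writes */
};

/* Map: copy the caller's buffer into the bounce slot unconditionally. */
static void toy_map(struct toy_mapping *m, unsigned char *orig)
{
	assert(m != NULL && orig != NULL);
	m->orig = orig;
	memcpy(m->bounce, orig, BUF_SIZE);
}

/* Device writes only the first n bytes of the slot (a short transfer). */
static void toy_device_write(struct toy_mapping *m, unsigned char value,
			     size_t n)
{
	assert(m != NULL && n <= BUF_SIZE);
	memset(m->bounce, value, n);
}

/* Unmap or sync for the CPU: copy the whole slot back to the caller. */
static void toy_unmap(struct toy_mapping *m)
{
	assert(m != NULL && m->orig != NULL);
	memcpy(m->orig, m->bounce, BUF_SIZE);
}
```

If toy_map() skipped the copy for a from-device transfer and the device
then wrote fewer than BUF_SIZE bytes, toy_unmap() would hand whatever a
previous user left in the slot back to the caller. With the
unconditional copy, the untouched tail of the caller's buffer simply
keeps its own previous contents.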
The same patch (commit aa6f8dcbab47, "swiotlb: rework "fix info leak
with DMA_FROM_DEVICE"", quoted in full above) also does not apply to the
4.14-stable tree; the same backport request applies.
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13 Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Sat, 5 Mar 2022 18:07:14 +0100
Subject: [PATCH] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I pointed out the mix-up and asked him for guidance).
So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_FROM_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.
Note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.
To make this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 17706dc91ec9..1887d92e8e92 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,11 +130,3 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
-
-DMA_ATTR_OVERWRITE
-------------------
-
-This is a hint to the DMA-mapping subsystem that the device is expected to
-overwrite the entire mapped size, thus the caller does not require any of the
-previous buffer contents to be preserved. This allows bounce-buffering
-implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6150d11a607e..dca2b1355bb1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,14 +61,6 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
-/*
- * This is a hint to the DMA-mapping subsystem that the device is expected
- * to overwrite the entire mapped size, thus the caller does not require any
- * of the previous buffer contents to be preserved. This allows
- * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
- */
-#define DMA_ATTR_OVERWRITE (1UL << 10)
-
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bfc56cb21705..6db1c475ec82 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -627,10 +627,14 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
for (i = 0; i < nr_slots(alloc_size + offset); i++)
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
- if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
- dir == DMA_BIDIRECTIONAL))
- swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
+ /*
+ * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
+ * to the tlb buffer, if we knew for sure the device will
+ * overwrite the entire current content. But we don't. Thus
+ * unconditional bounce may prevent leaking swiotlb content (i.e.
+ * kernel memory) to user-space.
+ */
+ swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
@@ -697,10 +701,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir)
{
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
- swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
- else
- BUG_ON(dir != DMA_FROM_DEVICE);
+ /*
+ * Unconditional bounce is necessary to avoid corruption on
+ * sync_*_for_cpu or dma_unmap_* when the device didn't overwrite
+ * the whole length of the bounce buffer.
+ */
+ swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
+ BUG_ON(!valid_dma_direction(dir));
}
void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
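For readers following the rework above, the map-time condition that was merged by ddbd89deb7d3 can be written out as a plain predicate and compared with the unconditional bounce. This is only a user-space sketch: the toy_* names and the attribute bit values are stand-ins chosen for illustration, not the kernel's definitions. It shows that the merged condition skips the bounce for a DMA_FROM_DEVICE mapping whenever the skip-sync attribute is set, even if the caller never promised to overwrite the buffer.

```c
#include <stdbool.h>

/* Illustrative stand-ins; not the kernel's headers. */
#define TOY_ATTR_SKIP_CPU_SYNC	(1UL << 5)
#define TOY_ATTR_OVERWRITE	(1UL << 10)

enum toy_dir { TOY_BIDIRECTIONAL, TOY_TO_DEVICE, TOY_FROM_DEVICE };

/* Map-time bounce condition as merged by the first patch. */
static bool toy_merged_bounces_on_map(unsigned long attrs, enum toy_dir dir)
{
	return !(attrs & TOY_ATTR_SKIP_CPU_SYNC) &&
	       (!(attrs & TOY_ATTR_OVERWRITE) ||
		dir == TOY_TO_DEVICE || dir == TOY_BIDIRECTIONAL);
}

/* After the rework the bounce on map no longer depends on attrs or dir. */
static bool toy_reworked_bounces_on_map(unsigned long attrs, enum toy_dir dir)
{
	(void)attrs;
	(void)dir;
	return true;
}
```

With these helpers, toy_merged_bounces_on_map(TOY_ATTR_SKIP_CPU_SYNC, TOY_FROM_DEVICE) is false while the reworked variant is true for every input.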
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13 Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Sat, 5 Mar 2022 18:07:14 +0100
Subject: [PATCH] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I had pointed out the mix-up and asked him for guidance).
So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_FROM_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.
Let me note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.
To get this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 17706dc91ec9..1887d92e8e92 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,11 +130,3 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
-
-DMA_ATTR_OVERWRITE
-------------------
-
-This is a hint to the DMA-mapping subsystem that the device is expected to
-overwrite the entire mapped size, thus the caller does not require any of the
-previous buffer contents to be preserved. This allows bounce-buffering
-implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6150d11a607e..dca2b1355bb1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,14 +61,6 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
-/*
- * This is a hint to the DMA-mapping subsystem that the device is expected
- * to overwrite the entire mapped size, thus the caller does not require any
- * of the previous buffer contents to be preserved. This allows
- * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
- */
-#define DMA_ATTR_OVERWRITE (1UL << 10)
-
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bfc56cb21705..6db1c475ec82 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -627,10 +627,14 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
for (i = 0; i < nr_slots(alloc_size + offset); i++)
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
- if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
- dir == DMA_BIDIRECTIONAL))
- swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
+ /*
+ * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
+ * to the tlb buffer, if we knew for sure the device will
+ * overwrite the entire current content. But we don't. Thus
+ * unconditional bounce may prevent leaking swiotlb content (i.e.
+ * kernel memory) to user-space.
+ */
+ swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
@@ -697,10 +701,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir)
{
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
- swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
- else
- BUG_ON(dir != DMA_FROM_DEVICE);
+ /*
+ * Unconditional bounce is necessary to avoid corruption on
+ * sync_*_for_cpu or dma_unmap_* when the device didn't overwrite
+ * the whole length of the bounce buffer.
+ */
+ swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
+ BUG_ON(!valid_dma_direction(dir));
}
void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
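The remaining limitation noted in the commit message above (a dma_sync_* size smaller than the dma_[un]map_* size) can be illustrated with a small user-space model. This is one possible reading of "certain circumstances", not something the commit message spells out: the CPU rewrites the whole buffer, only part of it is synced for the device, the device writes nothing, and the full-length unmap then copies old bounce-buffer content over the CPU's newer data. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LEN 8

/* Hypothetical bounce buffer standing in for the swiotlb slots of a mapping. */
static unsigned char toy_tlb[TOY_LEN];

/* orig -> bounce buffer, the direction swiotlb_bounce(..., DMA_TO_DEVICE) copies. */
static void toy_bounce_to_tlb(const unsigned char *orig, size_t size)
{
	assert(size <= TOY_LEN);
	memcpy(toy_tlb, orig, size);
}

/* bounce buffer -> orig, as on unmap of a DMA_FROM_DEVICE mapping. */
static void toy_bounce_from_tlb(unsigned char *orig, size_t size)
{
	assert(size <= TOY_LEN);
	memcpy(orig, toy_tlb, size);
}

/*
 * Map TOY_LEN bytes, let the CPU rewrite the whole buffer, sync only
 * sync_size bytes for the device, then unmap the full length while the
 * device writes nothing. Returns how many CPU updates were lost.
 */
static size_t toy_lost_updates(size_t sync_size)
{
	unsigned char orig[TOY_LEN];
	size_t i, lost = 0;

	memset(orig, 0x11, sizeof(orig));
	toy_bounce_to_tlb(orig, sizeof(orig));	/* map: full bounce */

	memset(orig, 0x22, sizeof(orig));	/* CPU rewrites the buffer */
	toy_bounce_to_tlb(orig, sync_size);	/* sync for device: partial */

	toy_bounce_from_tlb(orig, sizeof(orig));	/* unmap: full copy back */

	for (i = 0; i < sizeof(orig); i++)
		if (orig[i] != 0x22)
			lost++;
	return lost;
}
```

In this model a full-length sync loses nothing, while syncing only the first half leaves the second half of the buffer with the older content after unmap.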
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13 Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Sat, 5 Mar 2022 18:07:14 +0100
Subject: [PATCH] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I had pointed out the mix-up and asked him for guidance).
So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_FROM_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.
Let me note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.
To get this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 17706dc91ec9..1887d92e8e92 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,11 +130,3 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
-
-DMA_ATTR_OVERWRITE
-------------------
-
-This is a hint to the DMA-mapping subsystem that the device is expected to
-overwrite the entire mapped size, thus the caller does not require any of the
-previous buffer contents to be preserved. This allows bounce-buffering
-implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6150d11a607e..dca2b1355bb1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,14 +61,6 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
-/*
- * This is a hint to the DMA-mapping subsystem that the device is expected
- * to overwrite the entire mapped size, thus the caller does not require any
- * of the previous buffer contents to be preserved. This allows
- * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
- */
-#define DMA_ATTR_OVERWRITE (1UL << 10)
-
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bfc56cb21705..6db1c475ec82 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -627,10 +627,14 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
for (i = 0; i < nr_slots(alloc_size + offset); i++)
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
- if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
- dir == DMA_BIDIRECTIONAL))
- swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
+ /*
+ * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
+ * to the tlb buffer, if we knew for sure the device will
+ * overwrite the entire current content. But we don't. Thus
+ * unconditional bounce may prevent leaking swiotlb content (i.e.
+ * kernel memory) to user-space.
+ */
+ swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
@@ -697,10 +701,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir)
{
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
- swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
- else
- BUG_ON(dir != DMA_FROM_DEVICE);
+ /*
+ * Unconditional bounce is necessary to avoid corruption on
+ * sync_*_for_cpu or dma_unmap_* when the device didn't overwrite
+ * the whole length of the bounce buffer.
+ */
+ swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
+ BUG_ON(!valid_dma_direction(dir));
}
void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Fri, 11 Feb 2022 02:12:52 +0100
Subject: [PATCH] swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer, as if the device were to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guests like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).
Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..17706dc91ec9 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
+
+DMA_ATTR_OVERWRITE
+------------------
+
+This is a hint to the DMA-mapping subsystem that the device is expected to
+overwrite the entire mapped size, thus the caller does not require any of the
+previous buffer contents to be preserved. This allows bounce-buffering
+implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..6150d11a607e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,14 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
+/*
+ * This is a hint to the DMA-mapping subsystem that the device is expected
+ * to overwrite the entire mapped size, thus the caller does not require any
+ * of the previous buffer contents to be preserved. This allows
+ * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
+ */
+#define DMA_ATTR_OVERWRITE (1UL << 10)
+
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..bfc56cb21705 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -628,7 +628,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
+ dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
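The sequence in steps 1) to 5) of the commit message above can be modelled in user space to show why copying the original buffer into the bounce buffer at map time closes the leak. This is a toy sketch under stated assumptions: a single static array stands in for a swiotlb slot that still holds data from earlier I/O, the "device" writes nothing (as with TEST UNIT READY), and every toy_* name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOT_SIZE 16

/* Hypothetical stand-in for one swiotlb slot; it keeps data from earlier I/O. */
static unsigned char toy_slot[TOY_SLOT_SIZE];

/* Map: optionally copy the original buffer into the slot (the extra bounce). */
static void toy_map(const unsigned char *orig, size_t len, int bounce_on_map)
{
	assert(len <= TOY_SLOT_SIZE);
	if (bounce_on_map)
		memcpy(toy_slot, orig, len);
}

/* Unmap for a from-device mapping: copy the slot back to the original buffer. */
static void toy_unmap(unsigned char *orig, size_t len)
{
	assert(len <= TOY_SLOT_SIZE);
	memcpy(orig, toy_slot, len);
}

/*
 * Run one transfer in which the device writes nothing and return the
 * number of non-zero bytes that end up in the caller's zeroed buffer.
 */
static size_t toy_leaked_bytes(int bounce_on_map)
{
	unsigned char user[TOY_SLOT_SIZE] = { 0 };	/* zero-initialised by the caller */
	size_t i, leaked = 0;

	memset(toy_slot, 0xAA, sizeof(toy_slot));	/* stale data from previous I/O */
	toy_map(user, sizeof(user), bounce_on_map);
	/* the device does not touch the slot */
	toy_unmap(user, sizeof(user));

	for (i = 0; i < sizeof(user); i++)
		if (user[i] != 0)
			leaked++;
	return leaked;
}
```

Without the bounce on map the caller sees all 16 stale bytes; with it, the zeroed buffer comes back unchanged, which is the transparency the commit message asks of swiotlb.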
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Fri, 11 Feb 2022 02:12:52 +0100
Subject: [PATCH] swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer, as if the device were to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guests like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).
Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..17706dc91ec9 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
+
+DMA_ATTR_OVERWRITE
+------------------
+
+This is a hint to the DMA-mapping subsystem that the device is expected to
+overwrite the entire mapped size, thus the caller does not require any of the
+previous buffer contents to be preserved. This allows bounce-buffering
+implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..6150d11a607e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,14 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
+/*
+ * This is a hint to the DMA-mapping subsystem that the device is expected
+ * to overwrite the entire mapped size, thus the caller does not require any
+ * of the previous buffer contents to be preserved. This allows
+ * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
+ */
+#define DMA_ATTR_OVERWRITE (1UL << 10)
+
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..bfc56cb21705 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -628,7 +628,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
+ dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Fri, 11 Feb 2022 02:12:52 +0100
Subject: [PATCH] swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer, as if the device were to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guests like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).
Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..17706dc91ec9 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
+
+DMA_ATTR_OVERWRITE
+------------------
+
+This is a hint to the DMA-mapping subsystem that the device is expected to
+overwrite the entire mapped size, thus the caller does not require any of the
+previous buffer contents to be preserved. This allows bounce-buffering
+implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..6150d11a607e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,14 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
+/*
+ * This is a hint to the DMA-mapping subsystem that the device is expected
+ * to overwrite the entire mapped size, thus the caller does not require any
+ * of the previous buffer contents to be preserved. This allows
+ * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
+ */
+#define DMA_ATTR_OVERWRITE (1UL << 10)
+
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..bfc56cb21705 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -628,7 +628,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
+ dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Fri, 11 Feb 2022 02:12:52 +0100
Subject: [PATCH] swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer, as if the device were to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guests like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).
Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..17706dc91ec9 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
+
+DMA_ATTR_OVERWRITE
+------------------
+
+This is a hint to the DMA-mapping subsystem that the device is expected to
+overwrite the entire mapped size, thus the caller does not require any of the
+previous buffer contents to be preserved. This allows bounce-buffering
+implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..6150d11a607e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,14 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
+/*
+ * This is a hint to the DMA-mapping subsystem that the device is expected
+ * to overwrite the entire mapped size, thus the caller does not require any
+ * of the previous buffer contents to be preserved. This allows
+ * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
+ */
+#define DMA_ATTR_OVERWRITE (1UL << 10)
+
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..bfc56cb21705 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -628,7 +628,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
+ dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic(a)linux.ibm.com>
Date: Fri, 11 Feb 2022 02:12:52 +0100
Subject: [PATCH] swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer, as if the device were to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guest like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).
Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
Signed-off-by: Halil Pasic <pasic(a)linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..17706dc91ec9 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
+
+DMA_ATTR_OVERWRITE
+------------------
+
+This is a hint to the DMA-mapping subsystem that the device is expected to
+overwrite the entire mapped size, thus the caller does not require any of the
+previous buffer contents to be preserved. This allows bounce-buffering
+implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..6150d11a607e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,14 @@
*/
#define DMA_ATTR_PRIVILEGED (1UL << 9)
+/*
+ * This is a hint to the DMA-mapping subsystem that the device is expected
+ * to overwrite the entire mapped size, thus the caller does not require any
+ * of the previous buffer contents to be preserved. This allows
+ * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
+ */
+#define DMA_ATTR_OVERWRITE (1UL << 10)
+
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
* be given to a device to use as a DMA source or target. It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..bfc56cb21705 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -628,7 +628,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
+ dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
return tlb_addr;
}
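The effect of the changed condition can be modelled in user space. The following is only a sketch under stated assumptions: the enum, the ATTR_* constants, SLOT_SIZE and both helper functions are hypothetical stand-ins, not kernel code. It shows that a DMA_FROM_DEVICE mapping whose device writes nothing hands stale slot contents back to the caller unless the original buffer is first copied into the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's direction and attribute values. */
enum dma_dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE };
#define ATTR_SKIP_CPU_SYNC (1UL << 5)
#define ATTR_OVERWRITE     (1UL << 10)

#define SLOT_SIZE 16

/* Mirrors the patched condition: copy the original buffer into the bounce
 * slot at map time unless the caller promised a full overwrite. */
static bool bounce_at_map(unsigned long attrs, enum dma_dir dir)
{
	return !(attrs & ATTR_SKIP_CPU_SYNC) &&
	       (!(attrs & ATTR_OVERWRITE) || dir == DIR_TO_DEVICE ||
		dir == DIR_BIDIRECTIONAL);
}

/* Map for DIR_FROM_DEVICE, let a device that writes nothing "complete",
 * then unmap. Returns true if stale slot contents reached the buffer. */
static bool stale_data_leaks(unsigned long attrs)
{
	unsigned char slot[SLOT_SIZE];
	unsigned char buf[SLOT_SIZE] = { 0 };	/* zero-initialised by caller */
	unsigned char zeros[SLOT_SIZE] = { 0 };

	memset(slot, 0xAA, sizeof(slot));	/* leftovers from earlier I/O */

	if (bounce_at_map(attrs, DIR_FROM_DEVICE))
		memcpy(slot, buf, sizeof(slot));	/* map: original -> slot */

	/* The device transfers nothing, as with TEST UNIT READY. */

	memcpy(buf, slot, sizeof(buf));		/* unmap: slot -> original */
	return memcmp(buf, zeros, sizeof(buf)) != 0;
}
```

In this model, passing ATTR_OVERWRITE reproduces the old behaviour (no copy at map time), which is only safe when the device really does overwrite the whole mapping.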
The logic in commit 2a5f1b67ec57 "KVM: arm64: Don't access PMCR_EL0 when no
PMU is available" relies on an empty reset handler being benign. This was
not the case in earlier kernel versions, so the stable backport of this
patch is causing problems.
KVM's behaviour in this area changed over time. In particular, prior to commit
03fdfb269009 ("KVM: arm64: Don't write junk to sysregs on reset"), an empty
reset handler will trigger a warning, as the guest registers have been
poisoned.
Prior to commit 20589c8cc47d ("arm/arm64: KVM: Don't panic on failure to
properly reset system registers"), this warning was a panic().
Instead of reverting the backport, make it write 0 to the sys_reg[] array.
This keeps the reset logic happy, and the dodgy value can't be seen by
the guest as it can't request the emulation.
The original bug was accessing the PMCR_EL0 register on CPUs that don't
implement that feature. There is no known silicon that does this, but
v4.9's ACPI support is unable to find the PMU, so triggers this code:
| Kernel panic - not syncing: Didn't reset vcpu_sys_reg(24)
| CPU: 1 PID: 3055 Comm: lkvm Not tainted 4.9.302-00032-g64e078a56789 #13476
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jul 30 2018
| Call trace:
| [<ffff00000808b4b0>] dump_backtrace+0x0/0x1a0
| [<ffff00000808b664>] show_stack+0x14/0x20
| [<ffff0000088f0e18>] dump_stack+0x98/0xb8
| [<ffff0000088eef08>] panic+0x118/0x274
| [<ffff0000080b50e0>] access_actlr+0x0/0x20
| [<ffff0000080b2620>] kvm_reset_vcpu+0x5c/0xac
| [<ffff0000080ac688>] kvm_arch_vcpu_ioctl+0x3e4/0x490
| [<ffff0000080a382c>] kvm_vcpu_ioctl+0x5b8/0x720
| [<ffff000008201e44>] do_vfs_ioctl+0x2f4/0x884
| [<ffff00000820244c>] SyS_ioctl+0x78/0x9c
| [<ffff000008083a9c>] __sys_trace_return+0x0/0x4
Cc: <stable(a)vger.kernel.org> # < v5.3 with 2a5f1b67ec57 backported
Signed-off-by: James Morse <james.morse(a)arm.com>
---
arch/arm64/kvm/sys_regs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 10d80456f38f..8d548fdbb6b2 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -451,8 +451,10 @@ static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
u64 pmcr, val;
/* No PMU available, PMCR_EL0 may UNDEF... */
- if (!kvm_arm_support_pmu_v3())
+ if (!kvm_arm_support_pmu_v3()) {
+ vcpu_sys_reg(vcpu, PMCR_EL0) = 0;
return;
+ }
pmcr = read_sysreg(pmcr_el0);
/*
--
2.30.2
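The interaction between the poisoned register file and an empty reset handler can be illustrated with a small user-space model. This is a sketch only: NR_REGS, POISON, REG_PMCR, the placeholder value 0x41 and all function names are assumptions made for the example, and the real reset path is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_REGS  4
#define POISON   0x4242424242424242ULL	/* illustrative poison pattern */
#define REG_PMCR 2			/* hypothetical slot index */

typedef void (*reset_fn)(uint64_t *regs, int idx, bool have_pmu);

static void reset_plain(uint64_t *regs, int idx, bool have_pmu)
{
	(void)have_pmu;
	regs[idx] = 0;
}

/* Backported behaviour: return without touching the slot. */
static void reset_pmcr_empty(uint64_t *regs, int idx, bool have_pmu)
{
	if (!have_pmu)
		return;
	regs[idx] = 0x41;	/* placeholder for a hardware-derived value */
}

/* Behaviour after this patch: store 0 before returning. */
static void reset_pmcr_fixed(uint64_t *regs, int idx, bool have_pmu)
{
	if (!have_pmu) {
		regs[idx] = 0;
		return;
	}
	regs[idx] = 0x41;	/* placeholder for a hardware-derived value */
}

/* Poison every slot, run the handlers, and return the index of the first
 * slot that is still poisoned, or -1 if every slot was reset. */
static int reset_all(uint64_t *regs, reset_fn pmcr_reset, bool have_pmu)
{
	int i;

	for (i = 0; i < NR_REGS; i++)
		regs[i] = POISON;
	for (i = 0; i < NR_REGS; i++)
		(i == REG_PMCR ? pmcr_reset : reset_plain)(regs, i, have_pmu);
	for (i = 0; i < NR_REGS; i++)
		if (regs[i] == POISON)
			return i;
	return -1;
}
```

In the model, a non-negative return from reset_all() corresponds to the condition that older kernels turn into a warning or a panic.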
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0bf476fc3624e3a72af4ba7340d430a91c18cd67 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 3 Mar 2022 12:10:27 -0600
Subject: [PATCH] net: macb: Fix lost RX packet wakeup race in NAPI receive
There is an oddity in the way the RSR register flags propagate to the
ISR register (and the actual interrupt output) on this hardware: it
appears that RSR register bits only result in ISR being asserted if the
interrupt was actually enabled at the time, so enabling interrupts with
RSR bits already set doesn't trigger an interrupt to be raised. There
was already a partial fix for this race in the macb_poll function where
it checked for RSR bits being set and re-triggered NAPI receive.
However, there was still a race window between checking RSR and
actually enabling interrupts, where a lost wakeup could happen. It's
necessary to check again after enabling interrupts to see if RSR was set
just prior to the interrupt being enabled, and re-trigger receive in that
case.
This issue was noticed in a point-to-point UDP request-response protocol
which periodically saw timeouts or abnormally high response times due to
received packets not being processed in a timely fashion. In many
applications, more packets arriving, including TCP retransmissions, would
cause the original packet to be processed, thus masking the issue.
Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done")
Cc: stable(a)vger.kernel.org
Co-developed-by: Scott McNutt <scott.mcnutt(a)siriusxm.com>
Signed-off-by: Scott McNutt <scott.mcnutt(a)siriusxm.com>
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Tested-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 98498a76ae16..d13f06cf0308 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1573,7 +1573,14 @@ static int macb_poll(struct napi_struct *napi, int budget)
if (work_done < budget) {
napi_complete_done(napi, work_done);
- /* Packets received while interrupts were disabled */
+ /* RSR bits only seem to propagate to raise interrupts when
+ * interrupts are enabled at the time, so if bits are already
+ * set due to packets received while interrupts were disabled,
+ * they will not cause another interrupt to be generated when
+ * interrupts are re-enabled.
+ * Check for this case here. This has been seen to happen
+ * around 30% of the time under heavy network load.
+ */
status = macb_readl(bp, RSR);
if (status) {
if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
@@ -1581,6 +1588,22 @@ static int macb_poll(struct napi_struct *napi, int budget)
napi_reschedule(napi);
} else {
queue_writel(queue, IER, bp->rx_intr_mask);
+
+ /* In rare cases, packets could have been received in
+ * the window between the check above and re-enabling
+ * interrupts. Therefore, a double-check is required
+ * to avoid losing a wakeup. This can potentially race
+ * with the interrupt handler doing the same actions
+ * if an interrupt is raised just after enabling them,
+ * but this should be harmless.
+ */
+ status = macb_readl(bp, RSR);
+ if (unlikely(status)) {
+ queue_writel(queue, IDR, bp->rx_intr_mask);
+ if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+ queue_writel(queue, ISR, MACB_BIT(RCOMP));
+ napi_schedule(napi);
+ }
}
}
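The race window can be made concrete with a single-threaded toy model. Everything here is an assumption for illustration (the struct, its fields and the function names are not taken from the driver); the late_packet flag simply injects an arrival between the status check and the interrupt enable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the status/interrupt behaviour described in the commit. */
struct nic {
	bool rsr;		/* receive status bit is set */
	bool irq_enabled;
	bool irq_raised;	/* an interrupt would wake the driver */
	bool napi_scheduled;	/* another poll pass has been requested */
};

/* A packet arrives: the status bit is set, but an interrupt is raised
 * only if interrupts are enabled at that moment. */
static void packet_arrives(struct nic *n)
{
	n->rsr = true;
	if (n->irq_enabled)
		n->irq_raised = true;
}

/* End of a poll pass. 'recheck' selects the behaviour added by the patch. */
static void poll_complete(struct nic *n, bool late_packet, bool recheck)
{
	n->napi_scheduled = false;
	if (n->rsr) {			/* first check, already present */
		n->napi_scheduled = true;
		return;
	}
	if (late_packet)
		packet_arrives(n);	/* interrupts are still disabled */
	n->irq_enabled = true;
	if (recheck && n->rsr) {	/* second check, added by the patch */
		n->irq_enabled = false;
		n->napi_scheduled = true;
	}
}

/* True if a received packet has nothing left to trigger its processing. */
static bool wakeup_lost(const struct nic *n)
{
	return n->rsr && !n->irq_raised && !n->napi_scheduled;
}
```

With late_packet set and recheck clear, the model ends up with a pending packet, enabled interrupts and no wakeup, which matches the timeouts described above.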
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0bf476fc3624e3a72af4ba7340d430a91c18cd67 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Thu, 3 Mar 2022 12:10:27 -0600
Subject: [PATCH] net: macb: Fix lost RX packet wakeup race in NAPI receive
There is an oddity in the way the RSR register flags propagate to the
ISR register (and the actual interrupt output) on this hardware: it
appears that RSR register bits only result in ISR being asserted if the
interrupt was actually enabled at the time, so enabling interrupts with
RSR bits already set doesn't trigger an interrupt to be raised. There
was already a partial fix for this race in the macb_poll function where
it checked for RSR bits being set and re-triggered NAPI receive.
However, there was still a race window between checking RSR and
actually enabling interrupts, where a lost wakeup could happen. It's
necessary to check again after enabling interrupts to see if RSR was set
just prior to the interrupt being enabled, and re-trigger receive in that
case.
This issue was noticed in a point-to-point UDP request-response protocol
which periodically saw timeouts or abnormally high response times due to
received packets not being processed in a timely fashion. In many
applications, more packets arriving, including TCP retransmissions, would
cause the original packet to be processed, thus masking the issue.
Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done")
Cc: stable(a)vger.kernel.org
Co-developed-by: Scott McNutt <scott.mcnutt(a)siriusxm.com>
Signed-off-by: Scott McNutt <scott.mcnutt(a)siriusxm.com>
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Tested-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 98498a76ae16..d13f06cf0308 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1573,7 +1573,14 @@ static int macb_poll(struct napi_struct *napi, int budget)
if (work_done < budget) {
napi_complete_done(napi, work_done);
- /* Packets received while interrupts were disabled */
+ /* RSR bits only seem to propagate to raise interrupts when
+ * interrupts are enabled at the time, so if bits are already
+ * set due to packets received while interrupts were disabled,
+ * they will not cause another interrupt to be generated when
+ * interrupts are re-enabled.
+ * Check for this case here. This has been seen to happen
+ * around 30% of the time under heavy network load.
+ */
status = macb_readl(bp, RSR);
if (status) {
if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
@@ -1581,6 +1588,22 @@ static int macb_poll(struct napi_struct *napi, int budget)
napi_reschedule(napi);
} else {
queue_writel(queue, IER, bp->rx_intr_mask);
+
+ /* In rare cases, packets could have been received in
+ * the window between the check above and re-enabling
+ * interrupts. Therefore, a double-check is required
+ * to avoid losing a wakeup. This can potentially race
+ * with the interrupt handler doing the same actions
+ * if an interrupt is raised just after enabling them,
+ * but this should be harmless.
+ */
+ status = macb_readl(bp, RSR);
+ if (unlikely(status)) {
+ queue_writel(queue, IDR, bp->rx_intr_mask);
+ if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+ queue_writel(queue, ISR, MACB_BIT(RCOMP));
+ napi_schedule(napi);
+ }
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 Mon Sep 17 00:00:00 2001
From: Miklos Szeredi <mszeredi(a)redhat.com>
Date: Mon, 7 Mar 2022 16:30:44 +0100
Subject: [PATCH] fuse: fix pipe buffer lifetime for direct_io
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.
On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.
This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.
Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index cd54a529460d..592730fd6e42 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -941,7 +941,17 @@ static int fuse_copy_page(struct fuse_copy_state *cs, struct page **pagep,
while (count) {
if (cs->write && cs->pipebufs && page) {
- return fuse_ref_page(cs, page, offset, count);
+ /*
+ * Can't control lifetime of pipe buffers, so always
+ * copy user pages.
+ */
+ if (cs->req->args->user_pages) {
+ err = fuse_copy_fill(cs);
+ if (err)
+ return err;
+ } else {
+ return fuse_ref_page(cs, page, offset, count);
+ }
} else if (!cs->len) {
if (cs->move_pages && page &&
offset == 0 && count == PAGE_SIZE) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 829094451774..0fc150c1c50b 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1413,6 +1413,7 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii,
(PAGE_SIZE - ret) & (PAGE_SIZE - 1);
}
+ ap->args.user_pages = true;
if (write)
ap->args.in_pages = true;
else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e8e59fbdefeb..eac4984cc753 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -256,6 +256,7 @@ struct fuse_args {
bool nocreds:1;
bool in_pages:1;
bool out_pages:1;
+ bool user_pages:1;
bool out_argvar:1;
bool page_zeroing:1;
bool page_replace:1;
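The lifetime problem comes down to aliasing versus copying. The sketch below is a user-space analogy, not FUSE code: PAGE, struct pipe_buf and the three helpers are hypothetical. A pipe buffer that merely references the writer's page observes whatever the writer stores there after write() has returned, while a copied buffer does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE 8	/* tiny "page" for illustration */

struct pipe_buf {
	char *data;
	int owned;	/* 1 if the pipe holds its own copy */
};

/* Behaviour before the fix: the pipe buffer aliases the writer's page. */
static struct pipe_buf pipe_ref_page(char *user_page)
{
	struct pipe_buf b = { user_page, 0 };
	return b;
}

/* Behaviour after the fix for user pages: copy into a fresh buffer.
 * data is NULL if the allocation fails. */
static struct pipe_buf pipe_copy_page(const char *user_page)
{
	struct pipe_buf b = { malloc(PAGE), 1 };

	if (b.data)
		memcpy(b.data, user_page, PAGE);
	return b;
}

/* Drop the buffer, freeing it only if the pipe owns the memory. */
static void pipe_release(struct pipe_buf *b)
{
	if (b->owned)
		free(b->data);
	b->data = NULL;
}
```

The copy costs one extra memcpy per page, which is the price of decoupling the pipe buffer's lifetime from the writer's memory.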
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 Mon Sep 17 00:00:00 2001
From: Miklos Szeredi <mszeredi(a)redhat.com>
Date: Mon, 7 Mar 2022 16:30:44 +0100
Subject: [PATCH] fuse: fix pipe buffer lifetime for direct_io
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.
On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.
This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.
Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index cd54a529460d..592730fd6e42 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -941,7 +941,17 @@ static int fuse_copy_page(struct fuse_copy_state *cs, struct page **pagep,
while (count) {
if (cs->write && cs->pipebufs && page) {
- return fuse_ref_page(cs, page, offset, count);
+ /*
+ * Can't control lifetime of pipe buffers, so always
+ * copy user pages.
+ */
+ if (cs->req->args->user_pages) {
+ err = fuse_copy_fill(cs);
+ if (err)
+ return err;
+ } else {
+ return fuse_ref_page(cs, page, offset, count);
+ }
} else if (!cs->len) {
if (cs->move_pages && page &&
offset == 0 && count == PAGE_SIZE) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 829094451774..0fc150c1c50b 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1413,6 +1413,7 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii,
(PAGE_SIZE - ret) & (PAGE_SIZE - 1);
}
+ ap->args.user_pages = true;
if (write)
ap->args.in_pages = true;
else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e8e59fbdefeb..eac4984cc753 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -256,6 +256,7 @@ struct fuse_args {
bool nocreds:1;
bool in_pages:1;
bool out_pages:1;
+ bool user_pages:1;
bool out_argvar:1;
bool page_zeroing:1;
bool page_replace:1;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 Mon Sep 17 00:00:00 2001
From: Miklos Szeredi <mszeredi(a)redhat.com>
Date: Mon, 7 Mar 2022 16:30:44 +0100
Subject: [PATCH] fuse: fix pipe buffer lifetime for direct_io
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.
On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.
This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.
Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index cd54a529460d..592730fd6e42 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -941,7 +941,17 @@ static int fuse_copy_page(struct fuse_copy_state *cs, struct page **pagep,
while (count) {
if (cs->write && cs->pipebufs && page) {
- return fuse_ref_page(cs, page, offset, count);
+ /*
+ * Can't control lifetime of pipe buffers, so always
+ * copy user pages.
+ */
+ if (cs->req->args->user_pages) {
+ err = fuse_copy_fill(cs);
+ if (err)
+ return err;
+ } else {
+ return fuse_ref_page(cs, page, offset, count);
+ }
} else if (!cs->len) {
if (cs->move_pages && page &&
offset == 0 && count == PAGE_SIZE) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 829094451774..0fc150c1c50b 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1413,6 +1413,7 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii,
(PAGE_SIZE - ret) & (PAGE_SIZE - 1);
}
+ ap->args.user_pages = true;
if (write)
ap->args.in_pages = true;
else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e8e59fbdefeb..eac4984cc753 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -256,6 +256,7 @@ struct fuse_args {
bool nocreds:1;
bool in_pages:1;
bool out_pages:1;
+ bool user_pages:1;
bool out_argvar:1;
bool page_zeroing:1;
bool page_replace:1;
Hi Greg,
As reported before, the mips failure for 5.15-stable will need two backports.
I can see one of them in the queue, but looks like you have missed:
b81e0c2372e6 ("block: drop unused includes in <linux/genhd.h>")
Can you please add it to the queue?
--
Regards
Sudip
The patch titled
Subject: mempolicy: mbind_range() set_policy() after vma_merge()
has been added to the -mm tree. Its filename is
mempolicy-mbind_range-set_policy-after-vma_merge.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mempolicy-mbind_range-set_policy-…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mempolicy-mbind_range-set_policy-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mempolicy: mbind_range() set_policy() after vma_merge()
v2.6.34 commit 9d8cebd4bcd7 ("mm: fix mbind vma merge problem") introduced
vma_merge() to mbind_range(); but unlike madvise, mlock and mprotect, it
put a "continue" to next vma where its precedents go to update flags on
current vma before advancing: that left vma with the wrong setting in the
infamous vma_merge() case 8.
v3.10 commit 1444f92c8498 ("mm: merging memory blocks resets mempolicy")
tried to fix that in vma_adjust(), without fully understanding the issue.
v3.11 commit 3964acd0dbec ("mm: mempolicy: fix mbind_range() &&
vma_adjust() interaction") reverted that, and went about the fix in the
right way, but chose to optimize out an unnecessary mpol_dup() with a
prior mpol_equal() test. But on tmpfs, that also pessimized out the vital
call to its ->set_policy(), leaving the new mbind unenforced.
The user visible effect was that the pages got allocated on the local
node (happened to be 0), after the mbind() caller had specifically
asked for them to be allocated on node 1. There was not any page
migration involved in the case reported: the pages simply got allocated
on the wrong node.
Just delete that optimization now (though it could be made conditional on
vma not having a set_policy). Also remove the "next" variable: it turned
out to be blameless, but also pointless.
Link: https://lkml.kernel.org/r/319e4db9-64ae-4bca-92f0-ade85d342ff@google.com
Fixes: 3964acd0dbec ("mm: mempolicy: fix mbind_range() && vma_adjust() interaction")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
--- a/mm/mempolicy.c~mempolicy-mbind_range-set_policy-after-vma_merge
+++ a/mm/mempolicy.c
@@ -786,7 +786,6 @@ static int vma_replace_policy(struct vm_
static int mbind_range(struct mm_struct *mm, unsigned long start,
unsigned long end, struct mempolicy *new_pol)
{
- struct vm_area_struct *next;
struct vm_area_struct *prev;
struct vm_area_struct *vma;
int err = 0;
@@ -801,8 +800,7 @@ static int mbind_range(struct mm_struct
if (start > vma->vm_start)
prev = vma;
- for (; vma && vma->vm_start < end; prev = vma, vma = next) {
- next = vma->vm_next;
+ for (; vma && vma->vm_start < end; prev = vma, vma = vma->vm_next) {
vmstart = max(start, vma->vm_start);
vmend = min(end, vma->vm_end);
@@ -817,10 +815,6 @@ static int mbind_range(struct mm_struct
anon_vma_name(vma));
if (prev) {
vma = prev;
- next = vma->vm_next;
- if (mpol_equal(vma_policy(vma), new_pol))
- continue;
- /* vma_merge() joined vma && vma->next, case 8 */
goto replace;
}
if (vma->vm_start != vmstart) {
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fs-delete-pf_swapwrite.patch
mm-__isolate_lru_page_prepare-in-isolate_migratepages_block.patch
tmpfs-support-for-file-creation-time-fix.patch
shmem-mapping_set_exiting-to-help-mapped-resilience.patch
tmpfs-do-not-allocate-pages-on-read.patch
mm-_install_special_mapping-apply-vm_locked_clear_mask.patch
mempolicy-mbind_range-set_policy-after-vma_merge.patch
mm-thp-refix-__split_huge_pmd_locked-for-migration-pmd.patch
mm-thp-clearpagedoublemap-in-first-page_add_file_rmap.patch
mm-delete-__clearpagewaiters.patch
mm-filemap_unaccount_folio-large-skip-mapcount-fixup.patch
mm-thp-fix-nr_file_mapped-accounting-in-page__file_rmap.patch
mm-warn-on-deleting-redirtied-only-if-accounted.patch
mm-unmap_mapping_range_tree-with-i_mmap_rwsem-shared.patch
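Why the mpol_equal() shortcut in the mempolicy patch above was harmful on tmpfs can be sketched with a deliberately simplified model. The assumptions are loud ones: policies are reduced to a single node number, the backing object's state is a single field, and struct vma, replace_policy() and bind_after_merge() are hypothetical names that only mimic the shape of the logic.

```c
#include <assert.h>
#include <stdbool.h>

struct vma {
	int policy_node;	/* policy recorded on the vma */
	int backing_node;	/* what the backing object actually enforces */
	bool has_set_policy;	/* mapping provides a set_policy hook */
};

/* Stand-in for replacing the policy: informs the backing object through
 * its hook (if any) and records the policy on the vma. */
static void replace_policy(struct vma *v, int node)
{
	if (v->has_set_policy)
		v->backing_node = node;
	v->policy_node = node;
}

/* Called on the vma returned by a successful merge. 'skip_if_equal'
 * selects the optimization that the patch deletes. */
static void bind_after_merge(struct vma *merged, int node, bool skip_if_equal)
{
	if (skip_if_equal && merged->policy_node == node)
		return;		/* hook is never called */
	replace_policy(merged, node);
}
```

In this model the merged vma already records the requested node, so the equality test skips the only call that would have told the backing object about it.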
RAID array check/repair operations benefit a lot from merging requests.
If we only check the previous entry for a merge attempt, many merges will
be missed. As a result, significant regression is observed for RAID check
and repair.
Fix this by checking more than just the previous entry when
plug->multiple_queues == true.
This improves the check/repair speed of a 20-HDD raid6 from 19 MB/s to
103 MB/s.
Fixes: d38a9c04c0d5 ("block: only check previous entry for plug merge attempt")
Cc: stable(a)vger.kernel.org # v5.16
Reported-by: Larkin Lowrey <llowrey(a)nuclearwinter.com>
Reported-by: Wilson Jonathan <i400sjon(a)gmail.com>
Reported-by: Roger Heflin <rogerheflin(a)gmail.com>
Signed-off-by: Song Liu <song(a)kernel.org>
---
block/blk-merge.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..57e2075fb2f4 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1089,12 +1089,14 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
if (!plug || rq_list_empty(plug->mq_list))
return false;
- /* check the previously added entry for a quick merge attempt */
- rq = rq_list_peek(&plug->mq_list);
- if (rq->q == q) {
- if (blk_attempt_bio_merge(q, rq, bio, nr_segs, false) ==
- BIO_MERGE_OK)
- return true;
+ rq_list_for_each(&plug->mq_list, rq) {
+ if (rq->q == q) {
+ if (blk_attempt_bio_merge(q, rq, bio, nr_segs, false) ==
+ BIO_MERGE_OK)
+ return true;
+ }
+ if (!plug->multiple_queues)
+ break;
}
return false;
}
--
2.30.2
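The difference between checking one entry and walking the list can be shown with a minimal model of a plug list. This is a sketch under assumptions: struct req, try_merge() and plug_merge() are illustrative names, only back merges are modelled, and the list is ordered with the most recently added request first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct req {
	int queue;			/* which device queue it targets */
	unsigned long start, end;	/* sector range [start, end) */
	struct req *next;		/* plug list, most recent first */
};

/* Back merge only: the bio must start exactly where the request ends. */
static bool try_merge(struct req *rq, unsigned long bio_start,
		      unsigned long bio_len)
{
	if (rq->end != bio_start)
		return false;
	rq->end += bio_len;
	return true;
}

/* Walk the plug list; stop after the first entry unless the plug is
 * known to hold requests for more than one queue. */
static bool plug_merge(struct req *list, bool multiple_queues, int queue,
		       unsigned long bio_start, unsigned long bio_len)
{
	for (struct req *rq = list; rq; rq = rq->next) {
		if (rq->queue == queue && try_merge(rq, bio_start, bio_len))
			return true;
		if (!multiple_queues)
			break;
	}
	return false;
}
```

With requests for several member disks interleaved in one plug, the mergeable request is rarely the most recent one, which is why the single-entry check misses so many merges in the RAID case.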
The patch titled
Subject: mm: madvise: skip unmapped vma holes passed to process_madvise
has been added to the -mm tree. Its filename is
mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-madvise-skip-unmapped-vma-hole…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-madvise-skip-unmapped-vma-hole…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Charan Teja Kalla <quic_charante(a)quicinc.com>
Subject: mm: madvise: skip unmapped vma holes passed to process_madvise
The process_madvise() system call is expected to skip holes in the VMAs
passed through the 'struct iovec' vector list. But do_madvise(), which
process_madvise() calls for each VMA, returns ENOMEM for unmapped
holes even though the VMA has been processed.
Thus process_madvise() should treat ENOMEM as expected, consider the
VMA passed to it as processed, and continue with the other VMAs in the
vector list. If -ENOMEM is returned to the user even though the VMA was
processed, the user cannot figure out where to start the next madvise.
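The loop behaviour described above can be modelled in userspace roughly as
follows. This is only a sketch: struct range, model_do_madvise() and
advise_all() are hypothetical names made up for illustration, and the model
does not call into the kernel at all.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one 'struct iovec' entry. */
struct range {
	size_t len;
	int has_hole;	/* model: do_madvise() would return -ENOMEM */
	int fails;	/* model: do_madvise() would return -EINVAL */
};

/* Model of do_madvise(); this is not the kernel function. */
static int model_do_madvise(const struct range *r)
{
	if (r->fails)
		return -EINVAL;
	if (r->has_hole)
		return -ENOMEM;
	return 0;
}

/*
 * Walk the vector. -ENOMEM is treated as "processed, hole skipped";
 * any other negative value stops the walk. Returns the number of bytes
 * advanced over and stores the last status in *status.
 */
static size_t advise_all(const struct range *v, size_t n, int *status)
{
	size_t done = 0;
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		ret = model_do_madvise(&v[i]);
		if (ret < 0 && ret != -ENOMEM)
			break;
		done += v[i].len;
	}
	*status = ret;
	return done;
}
```

In this model a range with a hole in the middle of the vector no longer
stops the walk, while any other error still does.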
Link: https://lkml.kernel.org/r/4f091776142f2ebf7b94018146de72318474e686.16470087…
Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/madvise.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/madvise.c~mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise
+++ a/mm/madvise.c
@@ -1428,9 +1428,16 @@ SYSCALL_DEFINE5(process_madvise, int, pi
while (iov_iter_count(&iter)) {
iovec = iov_iter_iovec(&iter);
+ /*
+ * do_madvise returns ENOMEM if unmapped holes are present
+ * in the passed VMA. process_madvise() is expected to skip
+ * unmapped holes passed to it in the 'struct iovec' list
+ * and not fail because of them. Thus treat -ENOMEM return
+ * from do_madvise as valid and continue processing.
+ */
ret = do_madvise(mm, (unsigned long)iovec.iov_base,
iovec.iov_len, behavior);
- if (ret < 0)
+ if (ret < 0 && ret != -ENOMEM)
break;
iov_iter_advance(&iter, iovec.iov_len);
}
_
Patches currently in -mm which might be from quic_charante(a)quicinc.com are
mm-vmscan-fix-documentation-for-page_check_references.patch
mm-madvise-return-correct-bytes-advised-with-process_madvise.patch
mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch
The patch titled
Subject: mm: madvise: return correct bytes advised with process_madvise
has been added to the -mm tree. Its filename is
mm-madvise-return-correct-bytes-advised-with-process_madvise.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-madvise-return-correct-bytes-a…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-madvise-return-correct-bytes-a…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Charan Teja Kalla <quic_charante(a)quicinc.com>
Subject: mm: madvise: return correct bytes advised with process_madvise
Patch series "mm: madvise: return correct bytes processed with
process_madvise", v2. With process_madvise(), always prefer returning a
nonzero count of processed bytes over an error. This helps the user
determine which VMA, passed in the 'struct iovec' vector list, failed to be
advised, and thus decide whether to retry or skip that VMA.
This patch (of 2):
The process_madvise() system call returns an error even after processing
some of the VMAs passed in the 'struct iovec' vector list, which leaves the
user unsure where to restart the advice. It also contradicts this
syscall's man page[1], which states that the "return
value may be less than the total number of requested bytes, if an error
occurred after some iovec elements were already processed.".
Consider a user who passes 10 VMAs in the 'struct iovec' vector list, of
which the first 9 are processed and one fails. The call then returns just
the error from the failed VMA even though the first 9 VMAs were processed,
leaving the user unsure which VMA failed. Returning the number of bytes
processed here lets the user work out which VMA failed and thus retry or
skip the advice on that VMA.
[1]https://man7.org/linux/man-pages/man2/process_madvise.2.html.
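As a rough illustration of the return-value rule described above, the
selection can be written in standard C as below. The helper name
madvise_result() and its parameters are hypothetical and exist only for
this sketch; the patch itself uses the GNU "a ? : b" shorthand inline.

```c
#include <errno.h>
#include <stddef.h>

/*
 * total_len: bytes requested across the whole vector
 * remaining: bytes not yet advanced over when the loop stopped
 * ret:       last status from the per-VMA call (0 or negative errno)
 *
 * Prefer a nonzero processed-byte count over the error code.
 */
static long madvise_result(size_t total_len, size_t remaining, long ret)
{
	size_t processed = total_len - remaining;

	/* Standard-C spelling of "processed ? : ret". */
	return processed ? (long)processed : ret;
}
```

With ten 4096-byte VMAs where the last one fails with -EINVAL, this model
returns 36864 rather than -EINVAL, so the caller can tell that the failure
is at the tenth entry. If nothing was processed, the error is still
returned.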
Link: https://lkml.kernel.org/r/cover.1647008754.git.quic_charante@quicinc.com
Link: https://lkml.kernel.org/r/125b61a0edcee5c2db8658aed9d06a43a19ccafc.16470087…
Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/madvise.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/madvise.c~mm-madvise-return-correct-bytes-advised-with-process_madvise
+++ a/mm/madvise.c
@@ -1435,8 +1435,7 @@ SYSCALL_DEFINE5(process_madvise, int, pi
iov_iter_advance(&iter, iovec.iov_len);
}
- if (ret == 0)
- ret = total_len - iov_iter_count(&iter);
+ ret = (total_len - iov_iter_count(&iter)) ? : ret;
release_mm:
mmput(mm);
_
Patches currently in -mm which might be from quic_charante(a)quicinc.com are
mm-vmscan-fix-documentation-for-page_check_references.patch
mm-madvise-return-correct-bytes-advised-with-process_madvise.patch
mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch
There is a limited amount of SGX memory (EPC) on each system. When that
memory is used up, SGX has its own swapping mechanism which is similar
in concept but totally separate from the core mm/* code. Instead of
swapping to disk, SGX swaps from EPC to normal RAM. That normal RAM
comes from a shared memory pseudo-file and can itself be swapped by the
core mm code. There is a hierarchy like this:
EPC <-> shmem <-> disk
After data is swapped back in from shmem to EPC, the shmem backing
storage needs to be freed. Currently, the backing shmem is not freed.
This effectively wastes the shmem while the enclave is running. The
memory is recovered when the enclave is destroyed and the backing
storage freed.
Sort this out by freeing memory with shmem_truncate_range(), as soon as
a page is faulted back to the EPC. In addition, free the memory for
PCMD pages as soon as all PCMD's in a page have been marked as unused
by zeroing its contents.
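The backing-storage layout and the "all PCMDs unused" check can be sketched
in userspace as below. The sizes are assumptions made for this sketch
(4096-byte pages, a 4096-byte SECS slot, 128-byte PCMD entries), and
pcmd_byte_offset() and pcmd_page_is_empty() are hypothetical names, not the
helpers added by the patch. The kernel uses memchr_inv() for the emptiness
check; a plain loop stands in for it here.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumed page size */
#define SECS_SIZE 4096UL	/* assumed sizeof(struct sgx_secs) */
#define PCMD_SIZE 128UL		/* assumed sizeof(struct sgx_pcmd) */

/*
 * Byte offset of the PCMD entry for page_index in the backing file:
 * enclave data first, then one SECS slot, then the PCMD entries.
 */
static unsigned long pcmd_byte_offset(unsigned long encl_size,
				      unsigned long page_index)
{
	return encl_size + SECS_SIZE + page_index * PCMD_SIZE;
}

/* True when every byte of a PCMD page is zero, i.e. no entry is in use. */
static bool pcmd_page_is_empty(const unsigned char *page)
{
	for (size_t i = 0; i < PAGE_SIZE; i++)
		if (page[i])
			return false;
	return true;
}
```

For a 16-page enclave and page index 33, the offset is 73856 bytes, which
falls in backing page 18 at byte 128 within that page. That matches the
older open-coded calculation, PFN_DOWN(size) + 1 + (33 >> 5) for the page
and (33 & 31) * 128 for the offset within it.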
Reported-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v6:
* Re-applied on top of tip/x86/sgx and fixed the merge conflict, i.e.
sgx_encl_get_backing() instead of sgx_encl_lookup_backing().
v5:
* Encapsulated file offset calculation for PCMD struct.
* Replaced "magic number" PAGE_SIZE with sizeof(struct sgx_secs) to make
the offset calculation more self-documenting.
v4:
* Sanitized the offset calculations.
v3:
* Resend.
v2:
* Rewrite commit message as proposed by Dave.
* Truncate PCMD pages (Dave).
---
arch/x86/kernel/cpu/sgx/encl.c | 57 ++++++++++++++++++++++++++++------
1 file changed, 48 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..6fa3d0a14b93 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -12,6 +12,30 @@
#include "encls.h"
#include "sgx.h"
+/*
+ * Calculate byte offset of a PCMD struct associated with an enclave page. PCMD's
+ * follow right after the EPC data in the backing storage. In addition to the
+ * visible enclave pages, there's one extra page slot for SECS, before PCMD
+ * structs.
+ */
+static inline pgoff_t sgx_encl_get_backing_page_pcmd_offset(struct sgx_encl *encl,
+ unsigned long page_index)
+{
+ pgoff_t epc_end_off = encl->size + sizeof(struct sgx_secs);
+
+ return epc_end_off + page_index * sizeof(struct sgx_pcmd);
+}
+
+/*
+ * Free a page from the backing storage in the given page index.
+ */
+static inline void sgx_encl_truncate_backing_page(struct sgx_encl *encl, unsigned long page_index)
+{
+ struct inode *inode = file_inode(encl->backing);
+
+ shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
+}
+
/*
* ELDU: Load an EPC page as unblocked. For more info, see "OS Management of EPC
* Pages" in the SDM.
@@ -22,9 +46,11 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
{
unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
struct sgx_encl *encl = encl_page->encl;
+ pgoff_t page_index, page_pcmd_off;
struct sgx_pageinfo pginfo;
struct sgx_backing b;
- pgoff_t page_index;
+ bool pcmd_page_empty;
+ u8 *pcmd_page;
int ret;
if (secs_page)
@@ -32,14 +58,16 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
else
page_index = PFN_DOWN(encl->size);
+ page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
+
ret = sgx_encl_get_backing(encl, page_index, &b);
if (ret)
return ret;
pginfo.addr = encl_page->desc & PAGE_MASK;
pginfo.contents = (unsigned long)kmap_atomic(b.contents);
- pginfo.metadata = (unsigned long)kmap_atomic(b.pcmd) +
- b.pcmd_offset;
+ pcmd_page = kmap_atomic(b.pcmd);
+ pginfo.metadata = (unsigned long)pcmd_page + b.pcmd_offset;
if (secs_page)
pginfo.secs = (u64)sgx_get_epc_virt_addr(secs_page);
@@ -55,11 +83,24 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
ret = -EFAULT;
}
- kunmap_atomic((void *)(unsigned long)(pginfo.metadata - b.pcmd_offset));
+ memset(pcmd_page + b.pcmd_offset, 0, sizeof(struct sgx_pcmd));
+
+ /*
+ * The area for the PCMD in the page was zeroed above. Check if the
+ * whole page is now empty meaning that all PCMD's have been zeroed:
+ */
+ pcmd_page_empty = !memchr_inv(pcmd_page, 0, PAGE_SIZE);
+
+ kunmap_atomic(pcmd_page);
kunmap_atomic((void *)(unsigned long)pginfo.contents);
sgx_encl_put_backing(&b, false);
+ sgx_encl_truncate_backing_page(encl, page_index);
+
+ if (pcmd_page_empty)
+ sgx_encl_truncate_backing_page(encl, PFN_DOWN(page_pcmd_off));
+
return ret;
}
@@ -577,7 +618,7 @@ static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
struct sgx_backing *backing)
{
- pgoff_t pcmd_index = PFN_DOWN(encl->size) + 1 + (page_index >> 5);
+ pgoff_t page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
struct page *contents;
struct page *pcmd;
@@ -585,7 +626,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
if (IS_ERR(contents))
return PTR_ERR(contents);
- pcmd = sgx_encl_get_backing_page(encl, pcmd_index);
+ pcmd = sgx_encl_get_backing_page(encl, PFN_DOWN(page_pcmd_off));
if (IS_ERR(pcmd)) {
put_page(contents);
return PTR_ERR(pcmd);
@@ -594,9 +635,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
backing->page_index = page_index;
backing->contents = contents;
backing->pcmd = pcmd;
- backing->pcmd_offset =
- (page_index & (PAGE_SIZE / sizeof(struct sgx_pcmd) - 1)) *
- sizeof(struct sgx_pcmd);
+ backing->pcmd_offset = page_pcmd_off & (PAGE_SIZE - 1);
return 0;
}
--
2.35.1
This is the start of the stable review cycle for the 4.9.306 release.
There are 38 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.306-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.306-rc2
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
WANG Chao <chao.wang(a)ucloud.cn>
x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
Masahiro Yamada <yamada.masahiro(a)socionext.com>
x86/build: Fix compiler support check for CONFIG_RETPOLINE
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
Lukas Bulwahn <lukas.bulwahn(a)gmail.com>
Documentation: refer to config RANDOMIZE_BASE for kernel address-space randomization
Josh Poimboeuf <jpoimboe(a)redhat.com>
Documentation: Add swapgs description to the Spectre v1 documentation
Tim Chen <tim.c.chen(a)linux.intel.com>
Documentation: Add section about CPU vulnerabilities for Spectre
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/retpoline: Remove minimal retpoline support
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
-------------
Diffstat:
Documentation/hw-vuln/index.rst | 1 +
Documentation/hw-vuln/spectre.rst | 785 +++++++++++++++++++++++++++++++
Documentation/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 +++-
arch/arm/kernel/entry-common.S | 24 +
arch/arm/kernel/spectre.c | 71 +++
arch/arm/kernel/traps.c | 65 ++-
arch/arm/kernel/vmlinux-xip.lds.S | 45 +-
arch/arm/kernel/vmlinux.lds.S | 45 +-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 199 ++++++--
arch/x86/Kconfig | 4 -
arch/x86/Makefile | 11 +-
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 41 +-
arch/x86/kernel/cpu/bugs.c | 225 ++++++---
drivers/block/xen-blkfront.c | 67 +--
drivers/firmware/psci.c | 15 +
drivers/net/xen-netfront.c | 54 ++-
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +-
drivers/xen/grant-table.c | 59 ++-
drivers/xen/xenbus/xenbus_client.c | 24 +-
include/linux/arm-smccc.h | 74 +++
include/linux/bpf.h | 11 +
include/linux/compiler-gcc.h | 2 +-
include/linux/module.h | 2 +-
include/xen/grant_table.h | 19 +-
kernel/sysctl.c | 8 +
scripts/mod/modpost.c | 2 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
35 files changed, 1763 insertions(+), 268 deletions(-)
This is the start of the stable review cycle for the 4.14.271 release.
There are 31 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.271-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.271-rc2
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen/9p: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++++--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 ++
arch/arm/include/asm/spectre.h | 32 ++++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++++++-
arch/arm/kernel/entry-common.S | 24 +++
arch/arm/kernel/spectre.c | 71 ++++++++
arch/arm/kernel/traps.c | 65 ++++++-
arch/arm/kernel/vmlinux-xip.lds.S | 45 +++--
arch/arm/kernel/vmlinux.lds.S | 45 +++--
arch/arm/mm/Kconfig | 11 ++
arch/arm/mm/proc-v7-bugs.c | 199 +++++++++++++++++++---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 214 +++++++++++++++++-------
drivers/block/xen-blkfront.c | 67 ++++----
drivers/firmware/psci.c | 15 ++
drivers/net/xen-netfront.c | 54 +++---
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +--
drivers/xen/grant-table.c | 59 ++++---
drivers/xen/xenbus/xenbus_client.c | 24 ++-
include/linux/arm-smccc.h | 74 ++++++++
include/linux/bpf.h | 11 ++
include/xen/grant_table.h | 19 ++-
kernel/sysctl.c | 8 +
net/9p/trans_xen.c | 14 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
30 files changed, 986 insertions(+), 264 deletions(-)
This is the start of the stable review cycle for the 4.19.234 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.234-rc2
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen/pvcalls: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen/9p: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Sami Tolvanen <samitolvanen(a)google.com>
kbuild: add CONFIG_LD_IS_LLD
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++++--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 ++
arch/arm/include/asm/spectre.h | 32 ++++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++++++-
arch/arm/kernel/entry-common.S | 24 +++
arch/arm/kernel/spectre.c | 71 ++++++++
arch/arm/kernel/traps.c | 65 ++++++-
arch/arm/kernel/vmlinux.lds.h | 43 ++++-
arch/arm/mm/Kconfig | 11 ++
arch/arm/mm/proc-v7-bugs.c | 201 +++++++++++++++++++---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 214 +++++++++++++++++-------
drivers/block/xen-blkfront.c | 63 ++++---
drivers/firmware/psci.c | 15 ++
drivers/net/xen-netfront.c | 54 +++---
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +--
drivers/xen/grant-table.c | 71 ++++----
drivers/xen/pvcalls-front.c | 8 +-
drivers/xen/xenbus/xenbus_client.c | 24 ++-
include/linux/arm-smccc.h | 74 ++++++++
include/linux/bpf.h | 11 ++
include/xen/grant_table.h | 19 ++-
init/Kconfig | 3 +
kernel/sysctl.c | 8 +
net/9p/trans_xen.c | 14 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
31 files changed, 963 insertions(+), 261 deletions(-)
This is the start of the stable review cycle for the 5.4.184 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.184-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.184-rc2
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE"
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen/pvcalls: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen/9p: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++++--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 ++
arch/arm/include/asm/spectre.h | 32 ++++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++++++-
arch/arm/kernel/entry-common.S | 24 +++
arch/arm/kernel/spectre.c | 71 ++++++++
arch/arm/kernel/traps.c | 65 ++++++-
arch/arm/kernel/vmlinux.lds.h | 43 ++++-
arch/arm/mm/Kconfig | 11 ++
arch/arm/mm/proc-v7-bugs.c | 200 +++++++++++++++++++---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 216 +++++++++++++++++-------
drivers/acpi/ec.c | 10 --
drivers/acpi/sleep.c | 14 +-
drivers/block/xen-blkfront.c | 63 ++++---
drivers/firmware/psci/psci.c | 15 ++
drivers/net/xen-netfront.c | 54 +++---
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +--
drivers/xen/grant-table.c | 71 ++++----
drivers/xen/pvcalls-front.c | 8 +-
drivers/xen/xenbus/xenbus_client.c | 24 ++-
include/linux/arm-smccc.h | 74 ++++++++
include/linux/bpf.h | 12 ++
include/xen/grant_table.h | 19 ++-
kernel/sysctl.c | 8 +
net/9p/trans_xen.c | 14 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
32 files changed, 970 insertions(+), 277 deletions(-)
Note, I'm sending all the patches again for all of the -rc2 releases as
there has been a lot of churn from what was in -rc1 to -rc2.
This is the start of the stable review cycle for the 5.16.14 release.
There are 53 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.16.14-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.16.14-rc2
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE"
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen/pvcalls: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen/9p: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor <nathan(a)kernel.org>
arm64: Do not include __READ_ONCE() block in assembly files
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
James Morse <james.morse(a)arm.com>
arm64: Use the clearbhb instruction in mitigations
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
James Morse <james.morse(a)arm.com>
arm64: Mitigate spectre style branch history side channels
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
James Morse <james.morse(a)arm.com>
arm64: Add percpu vectors for EL1
James Morse <james.morse(a)arm.com>
arm64: entry: Add macro for reading symbol addresses from the trampoline
James Morse <james.morse(a)arm.com>
arm64: entry: Add vectors that have the bhb mitigation sequences
James Morse <james.morse(a)arm.com>
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
James Morse <james.morse(a)arm.com>
arm64: entry: Allow the trampoline text to occupy multiple pages
James Morse <james.morse(a)arm.com>
arm64: entry: Make the kpti trampoline's kpti sequence optional
James Morse <james.morse(a)arm.com>
arm64: entry: Move trampoline macros out of ifdef'd section
James Morse <james.morse(a)arm.com>
arm64: entry: Don't assume tramp_vectors is the start of the vectors
James Morse <james.morse(a)arm.com>
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
James Morse <james.morse(a)arm.com>
arm64: entry: Move the trampoline data page before the text page
James Morse <james.morse(a)arm.com>
arm64: entry: Free up another register on kpti's tramp_exit path
James Morse <james.morse(a)arm.com>
arm64: entry: Make the trampoline cleanup optional
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
James Morse <james.morse(a)arm.com>
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
James Morse <james.morse(a)arm.com>
arm64: entry.S: Add ventry overflow sanity checks
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_RPRES
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_AFP
Joey Gouly <joey.gouly(a)arm.com>
arm64: add ID_AA64ISAR2_EL1 sys register
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 50 +--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Documentation/arm64/cpu-feature-registers.rst | 17 ++
Documentation/arm64/elf_hwcaps.rst | 8 +
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/include/asm/vmlinux.lds.h | 43 ++-
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++-
arch/arm/kernel/entry-common.S | 24 ++
arch/arm/kernel/spectre.c | 71 +++++
arch/arm/kernel/traps.c | 65 +++-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 208 ++++++++++---
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/assembler.h | 53 ++++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpufeature.h | 29 ++
arch/arm64/include/asm/cputype.h | 8 +
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/hwcap.h | 2 +
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/rwonce.h | 4 +-
arch/arm64/include/asm/sections.h | 5 +
arch/arm64/include/asm/spectre.h | 4 +
arch/arm64/include/asm/sysreg.h | 18 ++
arch/arm64/include/asm/vectors.h | 73 +++++
arch/arm64/include/uapi/asm/hwcap.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 5 +
arch/arm64/kernel/cpu_errata.c | 7 +
arch/arm64/kernel/cpufeature.c | 25 ++
arch/arm64/kernel/cpuinfo.c | 3 +
arch/arm64/kernel/entry.S | 214 +++++++++----
arch/arm64/kernel/image-vars.h | 4 +
arch/arm64/kernel/proton-pack.c | 391 +++++++++++++++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/arm.c | 5 +-
arch/arm64/kvm/hyp/hyp-entry.S | 9 +
arch/arm64/kvm/hyp/nvhe/mm.c | 4 +-
arch/arm64/kvm/hyp/vhe/switch.c | 9 +-
arch/arm64/kvm/hypercalls.c | 12 +
arch/arm64/kvm/psci.c | 18 +-
arch/arm64/kvm/sys_regs.c | 2 +-
arch/arm64/mm/mmu.c | 12 +-
arch/arm64/tools/cpucaps | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/alternative.c | 8 +-
arch/x86/kernel/cpu/bugs.c | 204 ++++++++++---
arch/x86/lib/retpoline.S | 2 +-
arch/x86/net/bpf_jit_comp.c | 2 +-
drivers/acpi/ec.c | 10 -
drivers/acpi/sleep.c | 14 +-
drivers/block/xen-blkfront.c | 63 ++--
drivers/net/xen-netfront.c | 54 ++--
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +-
drivers/xen/grant-table.c | 71 +++--
drivers/xen/pvcalls-front.c | 8 +-
drivers/xen/xenbus/xenbus_client.c | 24 +-
include/linux/arm-smccc.h | 5 +
include/linux/bpf.h | 12 +
include/xen/grant_table.h | 19 +-
kernel/sysctl.c | 7 +
net/9p/trans_xen.c | 14 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
68 files changed, 1782 insertions(+), 358 deletions(-)
This is the start of the stable review cycle for the 5.15.28 release.
There are 58 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.28-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.28-rc2
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE"
Juergen Gross <jgross(a)suse.com>
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross <jgross(a)suse.com>
xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross <jgross(a)suse.com>
xen/pvcalls: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen/9p: use alloc/free_pages_exact()
Juergen Gross <jgross(a)suse.com>
xen: remove gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross <jgross(a)suse.com>
xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross <jgross(a)suse.com>
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor <nathan(a)kernel.org>
arm64: Do not include __READ_ONCE() block in assembly files
Nathan Chancellor <nathan(a)kernel.org>
ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: fix co-processor register typo
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
James Morse <james.morse(a)arm.com>
arm64: Use the clearbhb instruction in mitigations
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
James Morse <james.morse(a)arm.com>
arm64: Mitigate spectre style branch history side channels
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
James Morse <james.morse(a)arm.com>
arm64: Add percpu vectors for EL1
James Morse <james.morse(a)arm.com>
arm64: entry: Add macro for reading symbol addresses from the trampoline
James Morse <james.morse(a)arm.com>
arm64: entry: Add vectors that have the bhb mitigation sequences
James Morse <james.morse(a)arm.com>
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
James Morse <james.morse(a)arm.com>
arm64: entry: Allow the trampoline text to occupy multiple pages
James Morse <james.morse(a)arm.com>
arm64: entry: Make the kpti trampoline's kpti sequence optional
James Morse <james.morse(a)arm.com>
arm64: entry: Move trampoline macros out of ifdef'd section
James Morse <james.morse(a)arm.com>
arm64: entry: Don't assume tramp_vectors is the start of the vectors
James Morse <james.morse(a)arm.com>
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
James Morse <james.morse(a)arm.com>
arm64: entry: Move the trampoline data page before the text page
James Morse <james.morse(a)arm.com>
arm64: entry: Free up another register on kpti's tramp_exit path
James Morse <james.morse(a)arm.com>
arm64: entry: Make the trampoline cleanup optional
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
James Morse <james.morse(a)arm.com>
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
James Morse <james.morse(a)arm.com>
arm64: entry.S: Add ventry overflow sanity checks
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_RPRES
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_AFP
Joey Gouly <joey.gouly(a)arm.com>
arm64: add ID_AA64ISAR2_EL1 sys register
Anshuman Khandual <anshuman.khandual(a)arm.com>
arm64: Add Cortex-X2 CPU part definition
Marc Zyngier <maz(a)kernel.org>
arm64: Add HWCAP for self-synchronising virtual counter
Suzuki K Poulose <suzuki.poulose(a)arm.com>
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Huang Pei <huangpei(a)loongson.cn>
slip: fix macro redefine warning
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++-
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Documentation/arm64/cpu-feature-registers.rst | 29 +-
Documentation/arm64/elf_hwcaps.rst | 12 +
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/include/asm/vmlinux.lds.h | 43 ++-
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++-
arch/arm/kernel/entry-common.S | 24 ++
arch/arm/kernel/spectre.c | 71 +++++
arch/arm/kernel/traps.c | 65 +++-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 208 ++++++++++---
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/assembler.h | 53 ++++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpufeature.h | 29 ++
arch/arm64/include/asm/cputype.h | 14 +
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/hwcap.h | 3 +
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/rwonce.h | 4 +-
arch/arm64/include/asm/sections.h | 5 +
arch/arm64/include/asm/spectre.h | 4 +
arch/arm64/include/asm/sysreg.h | 18 ++
arch/arm64/include/asm/vectors.h | 73 +++++
arch/arm64/include/uapi/asm/hwcap.h | 3 +
arch/arm64/include/uapi/asm/kvm.h | 5 +
arch/arm64/kernel/cpu_errata.c | 7 +
arch/arm64/kernel/cpufeature.c | 28 +-
arch/arm64/kernel/cpuinfo.c | 4 +
arch/arm64/kernel/entry.S | 214 +++++++++----
arch/arm64/kernel/image-vars.h | 4 +
arch/arm64/kernel/proton-pack.c | 391 +++++++++++++++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/arm.c | 5 +-
arch/arm64/kvm/hyp/hyp-entry.S | 9 +
arch/arm64/kvm/hyp/nvhe/mm.c | 4 +-
arch/arm64/kvm/hyp/vhe/switch.c | 9 +-
arch/arm64/kvm/hypercalls.c | 12 +
arch/arm64/kvm/psci.c | 18 +-
arch/arm64/kvm/sys_regs.c | 2 +-
arch/arm64/mm/mmu.c | 12 +-
arch/arm64/tools/cpucaps | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 205 +++++++++----
arch/x86/lib/retpoline.S | 2 +-
drivers/acpi/ec.c | 10 -
drivers/acpi/sleep.c | 14 +-
drivers/block/xen-blkfront.c | 63 ++--
drivers/net/slip/slip.h | 2 +
drivers/net/xen-netfront.c | 54 ++--
drivers/scsi/xen-scsifront.c | 3 +-
drivers/xen/gntalloc.c | 25 +-
drivers/xen/grant-table.c | 71 +++--
drivers/xen/pvcalls-front.c | 8 +-
drivers/xen/xenbus/xenbus_client.c | 24 +-
include/linux/arm-smccc.h | 5 +
include/linux/bpf.h | 12 +
include/xen/grant_table.h | 19 +-
kernel/sysctl.c | 7 +
net/9p/trans_xen.c | 14 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
67 files changed, 1800 insertions(+), 359 deletions(-)
I'm announcing the release of the 4.9.306 kernel.
All users of the 4.9 kernel series must upgrade.
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/hw-vuln/index.rst | 1
Documentation/hw-vuln/spectre.rst | 785 +++++++++++++++++++++++++++++++
Documentation/kernel-parameters.txt | 8
Makefile | 2
arch/arm/include/asm/assembler.h | 10
arch/arm/include/asm/spectre.h | 32 +
arch/arm/kernel/Makefile | 2
arch/arm/kernel/entry-armv.S | 79 ++-
arch/arm/kernel/entry-common.S | 24
arch/arm/kernel/spectre.c | 71 ++
arch/arm/kernel/traps.c | 65 ++
arch/arm/kernel/vmlinux-xip.lds.S | 45 +
arch/arm/kernel/vmlinux.lds.S | 45 +
arch/arm/mm/Kconfig | 11
arch/arm/mm/proc-v7-bugs.c | 199 ++++++-
arch/x86/Kconfig | 4
arch/x86/Makefile | 11
arch/x86/include/asm/cpufeatures.h | 2
arch/x86/include/asm/nospec-branch.h | 41 +
arch/x86/kernel/cpu/bugs.c | 225 ++++++--
drivers/block/xen-blkfront.c | 67 +-
drivers/firmware/psci.c | 15
drivers/net/xen-netfront.c | 54 +-
drivers/scsi/xen-scsifront.c | 3
drivers/xen/gntalloc.c | 25
drivers/xen/grant-table.c | 59 +-
drivers/xen/xenbus/xenbus_client.c | 24
include/linux/arm-smccc.h | 74 ++
include/linux/bpf.h | 11
include/linux/compiler-gcc.h | 2
include/linux/module.h | 2
include/xen/grant_table.h | 19
kernel/sysctl.c | 8
scripts/mod/modpost.c | 2
tools/arch/x86/include/asm/cpufeatures.h | 2
35 files changed, 1762 insertions(+), 267 deletions(-)
Borislav Petkov (1):
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
Emmanuel Gil Peyrot (1):
ARM: fix build error when BPF_SYSCALL is disabled
Greg Kroah-Hartman (1):
Linux 4.9.306
Josh Poimboeuf (4):
Documentation: Add swapgs description to the Spectre v1 documentation
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
x86/speculation: Warn about Spectre v2 LFENCE mitigation
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Juergen Gross (9):
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
xen/grant-table: add gnttab_try_end_foreign_access()
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
xen/gntalloc: don't use gnttab_query_foreign_access()
xen: remove gnttab_query_foreign_access()
xen/gnttab: fix gnttab_end_foreign_access() without page specified
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Kim Phillips (2):
x86/speculation: Use generic retpoline by default on AMD
x86/speculation: Update link to AMD speculation whitepaper
Lukas Bulwahn (1):
Documentation: refer to config RANDOMIZE_BASE for kernel address-space randomization
Mark Rutland (1):
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Masahiro Yamada (1):
x86/build: Fix compiler support check for CONFIG_RETPOLINE
Nathan Chancellor (1):
ARM: Do not use NOCROSSREFS directive with ld.lld
Peter Zijlstra (3):
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
x86/speculation: Add eIBRS + Retpoline options
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra (Intel) (1):
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Russell King (Oracle) (7):
ARM: report Spectre v2 status through sysfs
ARM: early traps initialisation
ARM: use LOADADDR() to get load address of sections
ARM: Spectre-BHB workaround
ARM: include unprivileged BPF status in Spectre V2 reporting
ARM: fix co-processor register typo
ARM: fix build warning in proc-v7-bugs.c
Steven Price (1):
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Tim Chen (1):
Documentation: Add section about CPU vulnerabilities for Spectre
WANG Chao (1):
x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
Zhenzhong Duan (3):
x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
x86/retpoline: Remove minimal retpoline support
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: 56348560d495d2501e87db559a61de717cd3ab02 ("debugfs: do not attempt to create a new file before the filesystem is initalized")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: boot
on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <oliver.sang(a)intel.com>
[ 22.678378][ T1] UBI error: cannot create "ubi" debugfs directory, error -2
[ 22.679890][ T1] UBI error: cannot initialize UBI, error -2
To reproduce:
# build kernel
cd linux
cp config-5.11.0-rc5-00034-g56348560d495 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
This is the start of the stable review cycle for the 5.16.14 release.
There are 37 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Mar 2022 15:58:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.16.14-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.16.14-rc1
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
James Morse <james.morse(a)arm.com>
arm64: Use the clearbhb instruction in mitigations
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
James Morse <james.morse(a)arm.com>
arm64: Mitigate spectre style branch history side channels
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
James Morse <james.morse(a)arm.com>
arm64: Add percpu vectors for EL1
James Morse <james.morse(a)arm.com>
arm64: entry: Add macro for reading symbol addresses from the trampoline
James Morse <james.morse(a)arm.com>
arm64: entry: Add vectors that have the bhb mitigation sequences
James Morse <james.morse(a)arm.com>
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
James Morse <james.morse(a)arm.com>
arm64: entry: Allow the trampoline text to occupy multiple pages
James Morse <james.morse(a)arm.com>
arm64: entry: Make the kpti trampoline's kpti sequence optional
James Morse <james.morse(a)arm.com>
arm64: entry: Move trampoline macros out of ifdef'd section
James Morse <james.morse(a)arm.com>
arm64: entry: Don't assume tramp_vectors is the start of the vectors
James Morse <james.morse(a)arm.com>
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
James Morse <james.morse(a)arm.com>
arm64: entry: Move the trampoline data page before the text page
James Morse <james.morse(a)arm.com>
arm64: entry: Free up another register on kpti's tramp_exit path
James Morse <james.morse(a)arm.com>
arm64: entry: Make the trampoline cleanup optional
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
James Morse <james.morse(a)arm.com>
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
James Morse <james.morse(a)arm.com>
arm64: entry.S: Add ventry overflow sanity checks
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_RPRES
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_AFP
Joey Gouly <joey.gouly(a)arm.com>
arm64: add ID_AA64ISAR2_EL1 sys register
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 50 +--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Documentation/arm64/cpu-feature-registers.rst | 17 ++
Documentation/arm64/elf_hwcaps.rst | 8 +
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/include/asm/vmlinux.lds.h | 35 ++-
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++-
arch/arm/kernel/entry-common.S | 24 ++
arch/arm/kernel/spectre.c | 71 +++++
arch/arm/kernel/traps.c | 65 +++-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 207 ++++++++++---
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/assembler.h | 53 ++++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpufeature.h | 29 ++
arch/arm64/include/asm/cputype.h | 8 +
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/hwcap.h | 2 +
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/sections.h | 5 +
arch/arm64/include/asm/spectre.h | 4 +
arch/arm64/include/asm/sysreg.h | 18 ++
arch/arm64/include/asm/vectors.h | 73 +++++
arch/arm64/include/uapi/asm/hwcap.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 5 +
arch/arm64/kernel/cpu_errata.c | 7 +
arch/arm64/kernel/cpufeature.c | 25 ++
arch/arm64/kernel/cpuinfo.c | 3 +
arch/arm64/kernel/entry.S | 214 +++++++++----
arch/arm64/kernel/image-vars.h | 4 +
arch/arm64/kernel/proton-pack.c | 391 +++++++++++++++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/arm.c | 5 +-
arch/arm64/kvm/hyp/hyp-entry.S | 9 +
arch/arm64/kvm/hyp/nvhe/mm.c | 4 +-
arch/arm64/kvm/hyp/vhe/switch.c | 9 +-
arch/arm64/kvm/hypercalls.c | 12 +
arch/arm64/kvm/psci.c | 18 +-
arch/arm64/kvm/sys_regs.c | 2 +-
arch/arm64/mm/mmu.c | 12 +-
arch/arm64/tools/cpucaps | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/alternative.c | 8 +-
arch/x86/kernel/cpu/bugs.c | 204 ++++++++++---
arch/x86/lib/retpoline.S | 2 +-
arch/x86/net/bpf_jit_comp.c | 2 +-
include/linux/arm-smccc.h | 5 +
include/linux/bpf.h | 12 +
kernel/sysctl.c | 7 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
56 files changed, 1606 insertions(+), 216 deletions(-)
The patch titled
Subject: ocfs2: fix crash when initialize filecheck kobj fails
has been added to the -mm tree. Its filename is
ocfs2-fix-crash-when-initialize-filecheck-kobj-fails.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/ocfs2-fix-crash-when-initialize-f…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-fix-crash-when-initialize-f…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Subject: ocfs2: fix crash when initialize filecheck kobj fails
Once s_root is set, generic_shutdown_super() will be called if fill_super()
fails. That means we will call ocfs2_dismount_volume() twice in that
case, which can lead to a kernel crash. Fix this by initializing the
filecheck kobj before setting s_root.
Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com
Fixes: 5f483c4abb50 ("ocfs2: add kobject for online file check")
Signed-off-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/super.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
--- a/fs/ocfs2/super.c~ocfs2-fix-crash-when-initialize-filecheck-kobj-fails
+++ a/fs/ocfs2/super.c
@@ -1105,17 +1105,6 @@ static int ocfs2_fill_super(struct super
goto read_super_error;
}
- root = d_make_root(inode);
- if (!root) {
- status = -ENOMEM;
- mlog_errno(status);
- goto read_super_error;
- }
-
- sb->s_root = root;
-
- ocfs2_complete_mount_recovery(osb);
-
osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
&ocfs2_kset->kobj);
if (!osb->osb_dev_kset) {
@@ -1133,6 +1122,17 @@ static int ocfs2_fill_super(struct super
goto read_super_error;
}
+ root = d_make_root(inode);
+ if (!root) {
+ status = -ENOMEM;
+ mlog_errno(status);
+ goto read_super_error;
+ }
+
+ sb->s_root = root;
+
+ ocfs2_complete_mount_recovery(osb);
+
if (ocfs2_mount_local(osb))
snprintf(nodestr, sizeof(nodestr), "local");
else
_
Patches currently in -mm which might be from joseph.qi(a)linux.alibaba.com are
ocfs2-fix-crash-when-initialize-filecheck-kobj-fails.patch
ocfs2-cleanup-some-return-variables.patch
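
The ordering problem described in the ocfs2 patch above can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions, not kernel code: every `model_*` name is hypothetical, the teardown merely counts its invocations, and the shutdown path is assumed to tear the volume down only once s_root has been set, as the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the super block state that matters here. */
struct model_sb {
    bool s_root_set;     /* models sb->s_root != NULL */
    int dismount_calls;  /* how many times the volume teardown ran */
};

/* Models ocfs2_dismount_volume(): it only counts invocations. */
static void model_dismount(struct model_sb *sb)
{
    sb->dismount_calls++;
}

/* Models the shutdown path: tears the volume down only once s_root is set. */
static void model_shutdown(struct model_sb *sb)
{
    if (sb->s_root_set)
        model_dismount(sb);
}

/*
 * Models fill_super(). root_first selects the old ordering (s_root set
 * before the kobj setup); kobj_fails simulates the kobj setup failing.
 */
static int model_fill_super(struct model_sb *sb, bool root_first, bool kobj_fails)
{
    if (root_first)
        sb->s_root_set = true;

    if (kobj_fails) {
        model_dismount(sb);  /* the error path cleans up by itself */
        return -1;
    }

    if (!root_first)
        sb->s_root_set = true;
    return 0;
}

/* Runs one mount attempt and returns how often the teardown was called. */
static int model_mount(bool root_first, bool kobj_fails)
{
    struct model_sb sb = { false, 0 };

    if (model_fill_super(&sb, root_first, kobj_fails) != 0)
        model_shutdown(&sb);
    return sb.dismount_calls;
}
```

In this model, `model_mount(true, true)` runs the teardown twice (the old ordering), while `model_mount(false, true)` runs it once (the ordering after the patch).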
The patch titled
Subject: mm: only re-generate demotion targets when a numa node changes its N_CPU state
has been added to the -mm tree. Its filename is
mm-only-re-generate-demotion-targets-when-a-numa-node-changes-its-n_cpu-state.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-only-re-generate-demotion-targ…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-only-re-generate-demotion-targ…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Oscar Salvador <osalvador(a)suse.de>
Subject: mm: only re-generate demotion targets when a numa node changes its N_CPU state
Abhishek reported that after patch [1], hotplug operations are taking
~double the expected time. [2]
The reason is that the CPU callbacks that migrate_on_reclaim_init()
sets always call set_migration_target_nodes() whenever a CPU is brought
up/down. But we only care about numa nodes going from having cpus to
becoming cpuless, and vice versa, as that influences the demotion_target
order.
We do already have two CPU callbacks (vmstat_cpu_online() and
vmstat_cpu_dead()) that check exactly that, so get rid of the CPU
callbacks in migrate_on_reclaim_init() and only call
set_migration_target_nodes() from vmstat_cpu_{dead,online}() whenever a
numa node changes its N_CPU state.
[1] https://lore.kernel.org/linux-mm/20210721063926.3024591-2-ying.huang@intel.…
[2] https://lore.kernel.org/linux-mm/eb438ddd-2919-73d4-bd9f-b7eecdd9577a@linux…
Link: https://lkml.kernel.org/r/20220310120749.23077-1-osalvador@suse.de
Fixes: 884a6e5d1f93b ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reported-by: Abhishek Goel <huntbag(a)linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Abhishek Goel <huntbag(a)linux.vnet.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/migrate.h | 8 +++++++
mm/migrate.c | 41 ++++----------------------------------
mm/vmstat.c | 13 +++++++++++-
3 files changed, 25 insertions(+), 37 deletions(-)
--- a/include/linux/migrate.h~mm-only-re-generate-demotion-targets-when-a-numa-node-changes-its-n_cpu-state
+++ a/include/linux/migrate.h
@@ -48,7 +48,15 @@ int folio_migrate_mapping(struct address
struct folio *newfolio, struct folio *folio, int extra_count);
extern bool numa_demotion_enabled;
+extern void migrate_on_reclaim_init(void);
+#ifdef CONFIG_HOTPLUG_CPU
+extern void set_migration_target_nodes(void);
#else
+static inline void set_migration_target_nodes(void) {}
+#endif
+#else
+
+static inline void set_migration_target_nodes(void) {}
static inline void putback_movable_pages(struct list_head *l) {}
static inline int migrate_pages(struct list_head *l, new_page_t new,
--- a/mm/migrate.c~mm-only-re-generate-demotion-targets-when-a-numa-node-changes-its-n_cpu-state
+++ a/mm/migrate.c
@@ -3211,7 +3211,7 @@ again:
/*
* For callers that do not hold get_online_mems() already.
*/
-static void set_migration_target_nodes(void)
+void set_migration_target_nodes(void)
{
get_online_mems();
__set_migration_target_nodes();
@@ -3275,51 +3275,20 @@ static int __meminit migrate_on_reclaim_
return notifier_from_errno(0);
}
-/*
- * React to hotplug events that might affect the migration targets
- * like events that online or offline NUMA nodes.
- *
- * The ordering is also currently dependent on which nodes have
- * CPUs. That means we need CPU on/offline notification too.
- */
-static int migration_online_cpu(unsigned int cpu)
-{
- set_migration_target_nodes();
- return 0;
-}
-
-static int migration_offline_cpu(unsigned int cpu)
+void __init migrate_on_reclaim_init(void)
{
- set_migration_target_nodes();
- return 0;
-}
-
-static int __init migrate_on_reclaim_init(void)
-{
- int ret;
-
node_demotion = kmalloc_array(nr_node_ids,
sizeof(struct demotion_nodes),
GFP_KERNEL);
WARN_ON(!node_demotion);
- ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline",
- NULL, migration_offline_cpu);
/*
- * In the unlikely case that this fails, the automatic
- * migration targets may become suboptimal for nodes
- * where N_CPU changes. With such a small impact in a
- * rare case, do not bother trying to do anything special.
+ * At this point, all numa nodes with memory/CPus have their state
+ * properly set, so we can build the demotion order now.
*/
- WARN_ON(ret < 0);
- ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
- migration_online_cpu, NULL);
- WARN_ON(ret < 0);
-
+ set_migration_target_nodes();
hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
- return 0;
}
-late_initcall(migrate_on_reclaim_init);
#endif /* CONFIG_HOTPLUG_CPU */
bool numa_demotion_enabled = false;
--- a/mm/vmstat.c~mm-only-re-generate-demotion-targets-when-a-numa-node-changes-its-n_cpu-state
+++ a/mm/vmstat.c
@@ -28,6 +28,7 @@
#include <linux/mm_inline.h>
#include <linux/page_ext.h>
#include <linux/page_owner.h>
+#include <linux/migrate.h>
#include "internal.h"
@@ -2049,7 +2050,12 @@ static void __init init_cpu_node_state(v
static int vmstat_cpu_online(unsigned int cpu)
{
refresh_zone_stat_thresholds();
- node_set_state(cpu_to_node(cpu), N_CPU);
+
+ if (!node_state(cpu_to_node(cpu), N_CPU)) {
+ node_set_state(cpu_to_node(cpu), N_CPU);
+ set_migration_target_nodes();
+ }
+
return 0;
}
@@ -2072,6 +2078,8 @@ static int vmstat_cpu_dead(unsigned int
return 0;
node_clear_state(node, N_CPU);
+ set_migration_target_nodes();
+
return 0;
}
@@ -2103,6 +2111,9 @@ void __init init_mm_internals(void)
start_shepherd_timer();
#endif
+#if defined(CONFIG_MIGRATION) && defined(CONFIG_HOTPLUG_CPU)
+ migrate_on_reclaim_init();
+#endif
#ifdef CONFIG_PROC_FS
proc_create_seq("buddyinfo", 0444, NULL, &fragmentation_op);
proc_create_seq("pagetypeinfo", 0400, NULL, &pagetypeinfo_op);
_
Patches currently in -mm which might be from osalvador(a)suse.de are
arch-x86-mm-numa-do-not-initialize-nodes-twice.patch
arch-x86-mm-numa-do-not-initialize-nodes-twice-v2.patch
mm-only-re-generate-demotion-targets-when-a-numa-node-changes-its-n_cpu-state.patch
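
The idea behind the demotion-target patch above (rebuild the order only when a node gains its first CPU or loses its last one) can be sketched in user space. This is a simplified model under stated assumptions: the `model_*` names and the per-node counters are hypothetical, whereas the kernel tracks this through node state masks and cpumask_of_node().

```c
#include <stdbool.h>

#define MODEL_NR_NODES 4

/* Hypothetical per-node bookkeeping; the kernel uses node state masks. */
struct model_nodes {
    int online_cpus[MODEL_NR_NODES];  /* CPUs currently online per node */
    bool has_cpu[MODEL_NR_NODES];     /* models the N_CPU node state */
    int rebuilds;                     /* calls to the demotion-order rebuild */
};

/* Models set_migration_target_nodes(): it only counts invocations. */
static void model_rebuild(struct model_nodes *m)
{
    m->rebuilds++;
}

/* Models vmstat_cpu_online(): rebuild only on the cpuless -> has-CPU edge. */
static void model_cpu_online(struct model_nodes *m, int node)
{
    m->online_cpus[node]++;
    if (!m->has_cpu[node]) {
        m->has_cpu[node] = true;
        model_rebuild(m);
    }
}

/* Models vmstat_cpu_dead(): rebuild only when the last CPU of a node goes. */
static void model_cpu_dead(struct model_nodes *m, int node)
{
    if (m->online_cpus[node] > 0)
        m->online_cpus[node]--;
    if (m->online_cpus[node] > 0)
        return;  /* the node still has CPUs, nothing changes */

    m->has_cpu[node] = false;
    model_rebuild(m);
}
```

Bringing three CPUs of one node online and then taking them down again triggers two rebuilds in this model, instead of the six that a rebuild on every hotplug event would cause.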
This is the start of the stable review cycle for the 4.19.234 release.
There are 18 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Mar 2022 15:58:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.234-rc1
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++++--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 ++
arch/arm/include/asm/spectre.h | 32 ++++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++++++-
arch/arm/kernel/entry-common.S | 24 +++
arch/arm/kernel/spectre.c | 71 ++++++++
arch/arm/kernel/traps.c | 65 ++++++-
arch/arm/kernel/vmlinux.lds.h | 35 +++-
arch/arm/mm/Kconfig | 11 ++
arch/arm/mm/proc-v7-bugs.c | 200 +++++++++++++++++++---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 214 +++++++++++++++++-------
drivers/firmware/psci.c | 15 ++
include/linux/arm-smccc.h | 74 ++++++++
include/linux/bpf.h | 11 ++
kernel/sysctl.c | 8 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
21 files changed, 796 insertions(+), 135 deletions(-)
For some reason, the Microsoft Surface Go 3 uses the standard ACPI
interface for battery information, but does not use the standard PNP0C0A
HID. Instead it uses MSHW0146 as identifier. Add that ID to the driver
as this seems to work well.
Additionally, the power state is not updated immediately after the AC
has been (un-)plugged, so add the respective quirk for that.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Maximilian Luz <luzmaximilian(a)gmail.com>
---
drivers/acpi/battery.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index ea31ae01458b..dc208f5f5a1f 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -59,6 +59,10 @@ MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
static const struct acpi_device_id battery_device_ids[] = {
{"PNP0C0A", 0},
+
+ /* Microsoft Surface Go 3 */
+ {"MSHW0146", 0},
+
{"", 0},
};
@@ -1148,6 +1152,14 @@ static const struct dmi_system_id bat_dmi_table[] __initconst = {
DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad"),
},
},
+ {
+ /* Microsoft Surface Go 3 */
+ .callback = battery_notification_delay_quirk,
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Surface Go 3"),
+ },
+ },
{},
};
--
2.35.1
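
For readers unfamiliar with ID tables like the one extended in the battery patch above, the following user-space sketch shows how a table terminated by an empty-string sentinel can be searched. It is an illustration only: `struct model_device_id` and `model_battery_matches()` are hypothetical and much simpler than the kernel's struct acpi_device_id and ACPI matching code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an ACPI device ID entry. */
struct model_device_id {
    const char *id;
    unsigned long driver_data;
};

/* Same IDs as in the patch; the empty string terminates the table. */
static const struct model_device_id model_battery_ids[] = {
    {"PNP0C0A", 0},
    {"MSHW0146", 0},  /* Microsoft Surface Go 3 */
    {"", 0},
};

/* Returns true if hid appears in the table before the sentinel entry. */
static bool model_battery_matches(const char *hid)
{
    const struct model_device_id *e;

    for (e = model_battery_ids; e->id[0] != '\0'; e++) {
        if (strcmp(e->id, hid) == 0)
            return true;
    }
    return false;
}
```

With the extra entry, `model_battery_matches("MSHW0146")` succeeds in the same way as the standard `"PNP0C0A"` lookup does.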
Fix the descriptions of the return values of helper
bpf_current_task_under_cgroup().
Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Hengqi Chen <hengqi.chen(a)gmail.com>
---
include/uapi/linux/bpf.h | 4 ++--
tools/include/uapi/linux/bpf.h | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bc23020b638d..374db485f063 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2302,8 +2302,8 @@ union bpf_attr {
* Return
* The return value depends on the result of the test, and can be:
*
- * * 0, if current task belongs to the cgroup2.
- * * 1, if current task does not belong to the cgroup2.
+ * * 1, if current task belongs to the cgroup2.
+ * * 0, if current task does not belong to the cgroup2.
* * A negative error code, if an error occurred.
*
* long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index bc23020b638d..374db485f063 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2302,8 +2302,8 @@ union bpf_attr {
* Return
* The return value depends on the result of the test, and can be:
*
- * * 0, if current task belongs to the cgroup2.
- * * 1, if current task does not belong to the cgroup2.
+ * * 1, if current task belongs to the cgroup2.
+ * * 0, if current task does not belong to the cgroup2.
* * A negative error code, if an error occurred.
*
* long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
--
2.30.2
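
The corrected return-value description above can be captured in a small classification function. This is a sketch, not part of the patch: the enum and `classify_under_cgroup()` are hypothetical names, and `ret` stands for whatever value a program received from bpf_current_task_under_cgroup().

```c
/* Hypothetical result type for interpreting the helper's return value. */
enum model_cgroup_result {
    MODEL_TASK_IN_CGROUP,      /* helper returned 1 */
    MODEL_TASK_NOT_IN_CGROUP,  /* helper returned 0 */
    MODEL_CHECK_ERROR,         /* helper returned a negative error code */
    MODEL_CHECK_UNEXPECTED     /* any other value, not documented */
};

/* Maps a return value to its documented meaning (after the fix). */
static enum model_cgroup_result classify_under_cgroup(long ret)
{
    if (ret < 0)
        return MODEL_CHECK_ERROR;
    if (ret == 1)
        return MODEL_TASK_IN_CGROUP;
    if (ret == 0)
        return MODEL_TASK_NOT_IN_CGROUP;
    return MODEL_CHECK_UNEXPECTED;
}
```

Checking for a negative value first keeps error codes from being mistaken for "does not belong", which is the case the old documentation made easy to get wrong.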
upstream commit 277c8cb3e8ac ("MIPS: fix local_{add,sub}_return on MIPS64")
was backported to v5.15.27 as
commit f98371d2ac83 ("MIPS: fix local_{add,sub}_return on MIPS64")
but breaks MIPS build:
In file included from ./arch/mips/include/asm/local.h:8:0,
from ./include/linux/genhd.h:20,
from ./include/linux/blkdev.h:8,
from ./include/linux/blk-cgroup.h:23,
from ./include/linux/writeback.h:14,
from ./include/linux/memcontrol.h:22,
from ./include/net/sock.h:53,
from ./include/linux/tcp.h:19,
from drivers/net/slip/slip.c:91:
./arch/mips/include/asm/asm.h:68:0: warning: "END" redefined
#define END(function) \
In file included from drivers/net/slip/slip.c:88:0:
drivers/net/slip/slip.h:44:0: note: this is the location of the previous definition
#define END 0300 /* indicates end of frame */
Analysis reveals that the backported MIPS fix introduces a new
#include <asm/asm.h> in ./arch/mips/include/asm/local.h, and
asm/asm.h already defines an END macro.
But why does v5.16.x compile fine where
commit a0ecfd10d669c ("MIPS: fix local_{add,sub}_return on MIPS64")
is also present since v5.16.3?
Deeper analysis shows that there is another patch introduced
in v5.16-rc1 which removed one #include in the above chain, so
asm/asm.h (and its END definition) is no longer pulled in there:
commit 348332e000697 ("mm: don't include <linux/blk-cgroup.h> in <linux/writeback.h>")
Hence, the MIPS fix should only be applied to branches where
the mm fix is already present. Or the mm fix should be backported
as well (if it has no side-effects).
Note: the MIPS fix was apparently not (yet?) applied to v5.10.y or earlier
even though the Fixes: 7232311ef14c ("local_t: mips extension")
would be true.
BR and thanks,
Nikolaus Schaller
From: Nicolas Saenz Julienne <nsaenzju(a)redhat.com>
At the moment running osnoise on a nohz_full CPU or uncontested FIFO
priority and a PREEMPT_RCU kernel might have the side effect of
extending grace periods too much. This will entice RCU to force a
context switch on the wayward CPU to end the grace period, all while
introducing unwarranted noise into the tracer. This behaviour is
unavoidable as overly extending grace periods might exhaust the system's
memory.
This exact problem is what extended quiescent states (EQS) were
created for, and rcu_momentary_dyntick_idle() emulates one by
performing a zero duration EQS. So let's make use of it.
In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
atomically incrementing a local per-CPU counter and doing a store. So it
shouldn't affect osnoise's measurements (which have a 1us granularity),
and we'll call it unconditionally.
The uncommon cases involve calling rcu_momentary_dyntick_idle() after
the osnoise process has:
- Received an expedited quiescent state IPI with preemption disabled or
during an RCU critical section. (activates rdp->cpu_no_qs.b.exp
code-path).
- Been preempted within an RCU critical section and had the
subsequent outermost rcu_read_unlock() called with interrupts
disabled. (t->rcu_read_unlock_special.b.blocked code-path).
Neither of those is possible at the moment, and they are unlikely to be in
the future given osnoise's loop design. On top of this, the noise
generated by the situations described above is unavoidable, and if not
exposed by rcu_momentary_dyntick_idle() will be eventually seen in
subsequent rcu_read_unlock() calls or schedule operations.
Link: https://lkml.kernel.org/r/20220307180740.577607-1-nsaenzju@redhat.com
Cc: stable(a)vger.kernel.org
Fixes: bce29ac9ce0b ("trace: Add osnoise tracer")
Signed-off-by: Nicolas Saenz Julienne <nsaenzju(a)redhat.com>
Acked-by: Paul E. McKenney <paulmck(a)kernel.org>
Acked-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_osnoise.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 2aa3efdca755..5e3c62a08fc0 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1386,6 +1386,26 @@ static int run_osnoise(void)
osnoise_stop_tracing();
}
+ /*
+ * In some cases, notably when running on a nohz_full CPU with
+ * a stopped tick PREEMPT_RCU has no way to account for QSs.
+ * This will eventually cause unwarranted noise as PREEMPT_RCU
+ * will force preemption as the means of ending the current
+ * grace period. We avoid this problem by calling
+ * rcu_momentary_dyntick_idle(), which performs a zero duration
+ * EQS allowing PREEMPT_RCU to end the current grace period.
+ * This call shouldn't be wrapped inside an RCU critical
+ * section.
+ *
+ * Note that in non PREEMPT_RCU kernels QSs are handled through
+ * cond_resched()
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
+ local_irq_disable();
+ rcu_momentary_dyntick_idle();
+ local_irq_enable();
+ }
+
/*
* For the non-preemptive kernel config: let threads runs, if
* they so wish.
--
2.34.1
From: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Nicolas reported that using:
# trace-cmd record -e all -M 10 -p osnoise --poll
Resulted in the following kernel warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370
[...]
CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19
RIP: 0010:tracepoint_probe_unregister+0x280/0x370
[...]
CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0
Call Trace:
<TASK>
osnoise_workload_stop+0x36/0x90
tracing_set_tracer+0x108/0x260
tracing_set_trace_write+0x94/0xd0
? __check_object_size.part.0+0x10a/0x150
? selinux_file_permission+0x104/0x150
vfs_write+0xb5/0x290
ksys_write+0x5f/0xe0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7ff919a18127
[...]
---[ end trace 0000000000000000 ]---
The warning complains about an attempt to unregister an
unregistered tracepoint.
This happens with trace-cmd because it first stops tracing and
then switches the tracer to nop, which is equivalent to:
# cd /sys/kernel/tracing/
# echo osnoise > current_tracer
# echo 0 > tracing_on
# echo nop > current_tracer
The osnoise tracer stops the workload when no trace instance
is actually collecting data. This can be caused either by
disabling tracing or by disabling the tracer itself.
To avoid unregistering events twice, use the existing
trace_osnoise_callback_enabled variable to check if the events
(and the workload) are actually active before trying to
deactivate them.
Link: https://lore.kernel.org/all/c898d1911f7f9303b7e14726e7cc9678fbfb4a0e.camel@…
Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.16468239…
Cc: stable(a)vger.kernel.org
Cc: Marcelo Tosatti <mtosatti(a)redhat.com>
Fixes: 2fac8d6486d5 ("tracing/osnoise: Allow multiple instances of the same tracer")
Reported-by: Nicolas Saenz Julienne <nsaenzju(a)redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_osnoise.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index cfddb30e65ab..2aa3efdca755 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2200,6 +2200,17 @@ static void osnoise_workload_stop(void)
if (osnoise_has_registered_instances())
return;
+ /*
+ * If callbacks were already disabled in a previous stop
+ * call, there is no need to disable them again.
+ *
+ * For instance, this happens when tracing is stopped via:
+ * echo 0 > tracing_on
+ * echo nop > current_tracer.
+ */
+ if (!trace_osnoise_callback_enabled)
+ return;
+
trace_osnoise_callback_enabled = false;
/*
* Make sure that ftrace_nmi_enter/exit() see
--
2.34.1
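
The double-stop sequence described in the osnoise patch above can be modelled in user space to show why the early return helps. This is a sketch under simplifying assumptions: all `model_*` names are hypothetical, the tracepoint probes are reduced to a single flag, and the kernel WARNING is modelled as a counter.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the osnoise workload state. */
struct model_osnoise {
    bool callback_enabled;   /* models trace_osnoise_callback_enabled */
    bool probes_registered;  /* models the registered tracepoint probes */
    int warnings;            /* unregister attempts on unregistered probes */
};

/* Models starting the workload: register the probes, enable callbacks. */
static void model_workload_start(struct model_osnoise *t)
{
    if (t->callback_enabled)
        return;
    t->probes_registered = true;
    t->callback_enabled = true;
}

/*
 * Models stopping the workload. with_guard selects the behaviour after
 * the patch, where a stop with callbacks already disabled returns early.
 */
static void model_workload_stop(struct model_osnoise *t, bool with_guard)
{
    if (with_guard && !t->callback_enabled)
        return;

    t->callback_enabled = false;
    if (!t->probes_registered)
        t->warnings++;  /* models the tracepoint_probe_unregister() warning */
    t->probes_registered = false;
}
```

Calling `model_workload_stop()` twice after one start corresponds to `echo 0 > tracing_on` followed by `echo nop > current_tracer`: without the guard the second call counts a warning, with the guard it returns early.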
The stable-rc queue/4.19 x86 and i386 gcc-11 builds failed due to the
following warnings and errors.
metadata:
git_describe: v4.19.233-22-g83c76d59eabe
git_repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc-queues
git_sha: 83c76d59eabe7545b86485336a9aeb0f652666be
git_short_log: 83c76d59eabe ("ARM: fix build warning in proc-v7-bugs.c")
target_arch: x86_64
toolchain: gcc-11
make --silent --keep-going --jobs=8
O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=x86_64
CROSS_COMPILE=x86_64-linux-gnu- 'CC=sccache x86_64-linux-gnu-gcc'
'HOSTCC=sccache gcc'
arch/x86/entry/entry_64.S: Assembler messages:
arch/x86/entry/entry_64.S:1738: Warning: no instruction mnemonic
suffix given and no register operands; using default for `sysret'
arch/x86/kernel/cpu/bugs.c: In function 'spectre_v2_select_mitigation':
arch/x86/kernel/cpu/bugs.c:973:41: error: implicit declaration of
function 'unprivileged_ebpf_enabled'
[-Werror=implicit-function-declaration]
973 | if (mode == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled())
| ^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
drivers/crypto/ccp/sp-platform.c:37:34: warning: array 'sp_of_match'
assumed to have one element
37 | static const struct of_device_id sp_of_match[];
| ^~~~~~~~~~~
drivers/gpu/drm/i915/intel_dp.c: In function 'intel_dp_check_mst_status':
drivers/gpu/drm/i915/intel_dp.c:4129:30: warning:
'drm_dp_channel_eq_ok' reading 6 bytes from a region of size 4
[-Wstringop-overread]
4129 | !drm_dp_channel_eq_ok(&esi[10],
intel_dp->lane_count)) {
|
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/intel_dp.c:4129:30: note: referencing argument 1
of type 'const u8 *' {aka 'const unsigned char *'}
In file included from drivers/gpu/drm/i915/intel_dp.c:39:
include/drm/drm_dp_helper.h:954:6: note: in a call to function
'drm_dp_channel_eq_ok'
954 | bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
| ^~~~~~~~~~~~~~~~~~~~
make[1]: Target '_all' not remade because of errors.
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
build link [1] & [2]
--
Linaro LKFT
https://lkft.linaro.org
[1] https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc-queues/-/jobs…
[2] https://builds.tuxbuild.com/26C6OfFPTfEBORLviGXIOuwmzR2/
On Thu, Mar 10, 2022 at 01:26:16PM +0100, Rafael J. Wysocki wrote:
> Hi Greg & Sasha,
>
> Commit 4287509b4d21e34dc492 that went into 5.16.y as a backport of
> mainline commit dc0075ba7f38 ("ACPI: PM: s2idle: Cancel wakeup before
> dispatching EC GPE") is causing trouble in 5.16.y, but 5.17-rc7
> including the original commit is fine.
>
> This is most likely due to some other changes that commit dc0075ba7f38
> turns out to depend on which have not been backported, but because it
> is not an essential fix (and it was backported, because it carried a
> Fixes tag and not because it was marked for backporting), IMV it is
> better to revert it from 5.16.y than to try to pull all of the
> dependencies in (and risk missing any of them), so please do that.
>
> Please see this thread:
>
> https://lore.kernel.org/linux-pm/31b9d1cd-6a67-218b-4ada-12f72e6f00dc@redha…
Odd that this is only showing up in 5.16 as this commit also is in 5.4
and 5.10 and 5.15. Should I drop it from there as well?
thanks,
greg k-h
I messed up the stable list address, sorry.
On Thu, Mar 10, 2022 at 1:26 PM Rafael J. Wysocki <rafael(a)kernel.org> wrote:
>
> Hi Greg & Sasha,
>
> Commit 4287509b4d21e34dc492 that went into 5.16.y as a backport of
> mainline commit dc0075ba7f38 ("ACPI: PM: s2idle: Cancel wakeup before
> dispatching EC GPE") is causing trouble in 5.16.y, but 5.17-rc7
> including the original commit is fine.
>
> This is most likely due to some other changes that commit dc0075ba7f38
> turns out to depend on which have not been backported, but because it
> is not an essential fix (and it was backported, because it carried a
> Fixes tag and not because it was marked for backporting), IMV it is
> better to revert it from 5.16.y than to try to pull all of the
> dependencies in (and risk missing any of them), so please do that.
>
> Please see this thread:
>
> https://lore.kernel.org/linux-pm/31b9d1cd-6a67-218b-4ada-12f72e6f00dc@redha…
>
> for reference.
>
> Cheers!
This is the start of the stable review cycle for the 4.9.306 release.
There are 24 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Mar 2022 15:58:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.306-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.306-rc1
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price <steven.price(a)arm.com>
arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov <bp(a)suse.de>
x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
Lukas Bulwahn <lukas.bulwahn(a)gmail.com>
Documentation: refer to config RANDOMIZE_BASE for kernel address-space randomization
Josh Poimboeuf <jpoimboe(a)redhat.com>
Documentation: Add swapgs description to the Spectre v1 documentation
Tim Chen <tim.c.chen(a)linux.intel.com>
Documentation: Add section about CPU vulnerabilities for Spectre
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/retpoline: Remove minimal retpoline support
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
-------------
Diffstat:
Documentation/hw-vuln/index.rst | 1 +
Documentation/hw-vuln/spectre.rst | 785 +++++++++++++++++++++++++++++++
Documentation/kernel-parameters.txt | 8 +-
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 +++-
arch/arm/kernel/entry-common.S | 24 +
arch/arm/kernel/spectre.c | 71 +++
arch/arm/kernel/traps.c | 65 ++-
arch/arm/kernel/vmlinux-xip.lds.S | 37 +-
arch/arm/kernel/vmlinux.lds.S | 37 +-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 198 ++++++--
arch/x86/Kconfig | 4 -
arch/x86/Makefile | 5 +-
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 41 +-
arch/x86/kernel/cpu/bugs.c | 223 ++++++---
drivers/firmware/psci.c | 15 +
include/linux/arm-smccc.h | 74 +++
include/linux/bpf.h | 11 +
kernel/sysctl.c | 8 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
25 files changed, 1596 insertions(+), 153 deletions(-)
This is the start of the stable review cycle for the 5.10.105 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Mar 2022 15:58:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.105-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.105-rc1
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
James Morse <james.morse(a)arm.com>
arm64: Use the clearbhb instruction in mitigations
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
James Morse <james.morse(a)arm.com>
arm64: Mitigate spectre style branch history side channels
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
James Morse <james.morse(a)arm.com>
arm64: Add percpu vectors for EL1
James Morse <james.morse(a)arm.com>
arm64: entry: Add macro for reading symbol addresses from the trampoline
James Morse <james.morse(a)arm.com>
arm64: entry: Add vectors that have the bhb mitigation sequences
James Morse <james.morse(a)arm.com>
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
James Morse <james.morse(a)arm.com>
arm64: entry: Allow the trampoline text to occupy multiple pages
James Morse <james.morse(a)arm.com>
arm64: entry: Make the kpti trampoline's kpti sequence optional
James Morse <james.morse(a)arm.com>
arm64: entry: Move trampoline macros out of ifdef'd section
James Morse <james.morse(a)arm.com>
arm64: entry: Don't assume tramp_vectors is the start of the vectors
James Morse <james.morse(a)arm.com>
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
James Morse <james.morse(a)arm.com>
arm64: entry: Move the trampoline data page before the text page
James Morse <james.morse(a)arm.com>
arm64: entry: Free up another register on kpti's tramp_exit path
James Morse <james.morse(a)arm.com>
arm64: entry: Make the trampoline cleanup optional
James Morse <james.morse(a)arm.com>
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
James Morse <james.morse(a)arm.com>
arm64: entry.S: Add ventry overflow sanity checks
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_RPRES
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_AFP
Joey Gouly <joey.gouly(a)arm.com>
arm64: add ID_AA64ISAR2_EL1 sys register
Marc Zyngier <maz(a)kernel.org>
arm64: Add HWCAP for self-synchronising virtual counter
Anshuman Khandual <anshuman.khandual(a)arm.com>
arm64: Add Cortex-A510 CPU part definition
Anshuman Khandual <anshuman.khandual(a)arm.com>
arm64: Add Cortex-X2 CPU part definition
Suzuki K Poulose <suzuki.poulose(a)arm.com>
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
Hector Martin <marcan(a)marcan.st>
arm64: cputype: Add CPU implementor & types for the Apple M1 cores
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++--
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Documentation/arm64/cpu-feature-registers.rst | 29 +-
Documentation/arm64/elf_hwcaps.rst | 12 +
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 +++
arch/arm/include/asm/vmlinux.lds.h | 35 ++-
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 +++++-
arch/arm/kernel/entry-common.S | 24 ++
arch/arm/kernel/spectre.c | 71 +++++
arch/arm/kernel/traps.c | 65 ++++-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 207 +++++++++++---
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/assembler.h | 33 +++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cpufeature.h | 28 ++
arch/arm64/include/asm/cputype.h | 22 ++
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/hwcap.h | 3 +
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/kvm_asm.h | 8 +
arch/arm64/include/asm/kvm_mmu.h | 3 +-
arch/arm64/include/asm/mmu.h | 6 +
arch/arm64/include/asm/sections.h | 5 +
arch/arm64/include/asm/spectre.h | 4 +
arch/arm64/include/asm/sysreg.h | 18 ++
arch/arm64/include/asm/vectors.h | 73 +++++
arch/arm64/include/uapi/asm/hwcap.h | 3 +
arch/arm64/include/uapi/asm/kvm.h | 5 +
arch/arm64/kernel/cpu_errata.c | 7 +
arch/arm64/kernel/cpufeature.c | 28 +-
arch/arm64/kernel/cpuinfo.c | 4 +
arch/arm64/kernel/entry.S | 213 ++++++++++----
arch/arm64/kernel/proton-pack.c | 359 +++++++++++++++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/arm.c | 3 +-
arch/arm64/kvm/hyp/hyp-entry.S | 4 +
arch/arm64/kvm/hyp/smccc_wa.S | 75 +++++
arch/arm64/kvm/hyp/vhe/switch.c | 9 +-
arch/arm64/kvm/hypercalls.c | 12 +
arch/arm64/kvm/psci.c | 18 +-
arch/arm64/kvm/sys_regs.c | 2 +-
arch/arm64/mm/mmu.c | 12 +-
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 205 ++++++++++----
include/linux/arm-smccc.h | 5 +
include/linux/bpf.h | 12 +
kernel/sysctl.c | 7 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
54 files changed, 1651 insertions(+), 214 deletions(-)
[PROBLEM]
Patch "btrfs: don't let new writes to trigger autodefrag on the same
inode" now makes autodefrag scan each inode only once per autodefrag
run.
That patch works mostly fine, as the trace events show their intervals
are almost 30s:
(only showing the root 257 ino 329891 start 0)
486.810041: defrag_file_start: root=257 ino=329891 start=0 len=754728960
506.407089: defrag_file_start: root=257 ino=329891 start=0 len=754728960
536.463320: defrag_file_start: root=257 ino=329891 start=0 len=754728960
539.721309: defrag_file_start: root=257 ino=329891 start=0 len=754728960
569.741417: defrag_file_start: root=257 ino=329891 start=0 len=754728960
594.832649: defrag_file_start: root=257 ino=329891 start=0 len=754728960
624.258214: defrag_file_start: root=257 ino=329891 start=0 len=754728960
654.856264: defrag_file_start: root=257 ino=329891 start=0 len=754728960
684.943029: defrag_file_start: root=257 ino=329891 start=0 len=754728960
715.288662: defrag_file_start: root=257 ino=329891 start=0 len=754728960
But there are some outliers, such as 536s->539s, where the interval is
only 3s rather than the 30s default commit interval.
[CAUSE]
Several call sites can wake up the transaction kthread early. The
transaction kthread itself can skip the transaction if its timer has
not reached the commit interval, but the cleaner is always woken up.
This means that each time the transaction kthread gets woken up, we
also trigger autodefrag, even if the transaction kthread chooses to
skip its work.
This is not a good behavior for files under heavy write load, as we
waste extra IO/CPU while the defragged extents can easily get fragmented
again.
[FIX]
In btrfs_run_defrag_inodes(), we check if our current time is larger
than last run + commit_interval.
If not, skip this run and wait for next opportunity.
This patch along with patch "btrfs: don't let new writes to trigger
autodefrag on the same inode" are mostly for backport to v5.16.
This is just to reduce the unnecessary IO/CPU caused by autodefrag, the
final solution would be allowing users to change autodefrag scan
interval and target extent size.
Cc: stable(a)vger.kernel.org # 5.16+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/ctree.h | 1 +
fs/btrfs/file.c | 11 +++++++++++
2 files changed, 12 insertions(+)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index a8a3de10cead..44116a47307e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -899,6 +899,7 @@ struct btrfs_fs_info {
/* auto defrag inodes go here */
spinlock_t defrag_inodes_lock;
+ u64 defrag_last_run_ksec;
struct rb_root defrag_inodes;
atomic_t defrag_running;
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index abba1871e86e..a852754f5601 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -312,9 +312,20 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
{
struct inode_defrag *defrag;
struct rb_root defrag_inodes;
+ u64 ksec = ktime_get_seconds();
u64 first_ino = 0;
u64 root_objectid = 0;
+ /*
+ * If cleaner get woken up early, skip this run to avoid frequent
+ * re-dirty, which is not really useful for heavy writes.
+ *
+ * TODO: Make autodefrag to happen in its own thread.
+ */
+ if (ksec - fs_info->defrag_last_run_ksec < fs_info->commit_interval)
+ return 0;
+ fs_info->defrag_last_run_ksec = ksec;
+
atomic_inc(&fs_info->defrag_running);
spin_lock(&fs_info->defrag_inodes_lock);
/*
--
2.35.1
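The rate-limit check added by the btrfs patch above can be modeled in
userspace. This is only a sketch under stated assumptions: struct
defrag_state and defrag_should_run() are hypothetical names that stand in
for the two btrfs_fs_info fields and the early return in
btrfs_run_defrag_inodes(), and the caller passes the time in instead of
calling ktime_get_seconds().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two btrfs_fs_info fields the patch uses. */
struct defrag_state {
	uint64_t last_run_sec;    /* models defrag_last_run_ksec */
	uint64_t commit_interval; /* seconds; models fs_info->commit_interval */
};

/*
 * Return true if a defrag pass should run at time now_sec, and record the
 * run. Mirrors the check in the patch: skip when fewer than
 * commit_interval seconds have elapsed since the last recorded run.
 */
static bool defrag_should_run(struct defrag_state *st, uint64_t now_sec)
{
	if (now_sec - st->last_run_sec < st->commit_interval)
		return false;
	st->last_run_sec = now_sec;
	return true;
}
```

Fed with the timestamps from the trace above (rounded to seconds) and a
30s interval, a run at 536 would be accepted and the early wakeup at 539
would be skipped, which is the outlier the patch is meant to remove.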
This is the start of the stable review cycle for the 5.15.28 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Mar 2022 15:58:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.28-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.28-rc1
Christoph Hellwig <hch(a)lst.de>
block: drop unused includes in <linux/genhd.h>
Huang Pei <huangpei(a)loongson.cn>
slip: fix macro redefine warning
Emmanuel Gil Peyrot <linkmauve(a)linkmauve.fr>
ARM: fix build error when BPF_SYSCALL is disabled
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
James Morse <james.morse(a)arm.com>
arm64: Use the clearbhb instruction in mitigations
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
James Morse <james.morse(a)arm.com>
arm64: Mitigate spectre style branch history side channels
James Morse <james.morse(a)arm.com>
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
James Morse <james.morse(a)arm.com>
arm64: Add percpu vectors for EL1
James Morse <james.morse(a)arm.com>
arm64: entry: Add macro for reading symbol addresses from the trampoline
James Morse <james.morse(a)arm.com>
arm64: entry: Add vectors that have the bhb mitigation sequences
James Morse <james.morse(a)arm.com>
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
James Morse <james.morse(a)arm.com>
arm64: entry: Allow the trampoline text to occupy multiple pages
James Morse <james.morse(a)arm.com>
arm64: entry: Make the kpti trampoline's kpti sequence optional
James Morse <james.morse(a)arm.com>
arm64: entry: Move trampoline macros out of ifdef'd section
James Morse <james.morse(a)arm.com>
arm64: entry: Don't assume tramp_vectors is the start of the vectors
James Morse <james.morse(a)arm.com>
arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
James Morse <james.morse(a)arm.com>
arm64: entry: Move the trampoline data page before the text page
James Morse <james.morse(a)arm.com>
arm64: entry: Free up another register on kpti's tramp_exit path
James Morse <james.morse(a)arm.com>
arm64: entry: Make the trampoline cleanup optional
James Morse <james.morse(a)arm.com>
KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
James Morse <james.morse(a)arm.com>
arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
James Morse <james.morse(a)arm.com>
arm64: entry.S: Add ventry overflow sanity checks
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_RPRES
Joey Gouly <joey.gouly(a)arm.com>
arm64: cpufeature: add HWCAP for FEAT_AFP
Joey Gouly <joey.gouly(a)arm.com>
arm64: add ID_AA64ISAR2_EL1 sys register
Anshuman Khandual <anshuman.khandual(a)arm.com>
arm64: Add Cortex-X2 CPU part definition
Marc Zyngier <maz(a)kernel.org>
arm64: Add HWCAP for self-synchronising virtual counter
Suzuki K Poulose <suzuki.poulose(a)arm.com>
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: Spectre-BHB workaround
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: early traps initialisation
Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
ARM: report Spectre v2 status through sysfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips <kim.phillips(a)amd.com>
x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra <peterz(a)infradead.org>
Documentation/hw-vuln: Update spectre doc
Peter Zijlstra <peterz(a)infradead.org>
x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) <peterz(a)infradead.org>
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra <peterz(a)infradead.org>
x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++-
Documentation/admin-guide/kernel-parameters.txt | 8 +-
Documentation/arm64/cpu-feature-registers.rst | 29 +-
Documentation/arm64/elf_hwcaps.rst | 12 +
Makefile | 4 +-
arch/arm/include/asm/assembler.h | 10 +
arch/arm/include/asm/spectre.h | 32 ++
arch/arm/include/asm/vmlinux.lds.h | 35 ++-
arch/arm/kernel/Makefile | 2 +
arch/arm/kernel/entry-armv.S | 79 ++++-
arch/arm/kernel/entry-common.S | 24 ++
arch/arm/kernel/spectre.c | 71 +++++
arch/arm/kernel/traps.c | 65 +++-
arch/arm/mm/Kconfig | 11 +
arch/arm/mm/proc-v7-bugs.c | 207 ++++++++++---
arch/arm64/Kconfig | 9 +
arch/arm64/include/asm/assembler.h | 53 ++++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpufeature.h | 29 ++
arch/arm64/include/asm/cputype.h | 14 +
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/hwcap.h | 3 +
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/sections.h | 5 +
arch/arm64/include/asm/spectre.h | 4 +
arch/arm64/include/asm/sysreg.h | 18 ++
arch/arm64/include/asm/vectors.h | 73 +++++
arch/arm64/include/uapi/asm/hwcap.h | 3 +
arch/arm64/include/uapi/asm/kvm.h | 5 +
arch/arm64/kernel/cpu_errata.c | 7 +
arch/arm64/kernel/cpufeature.c | 28 +-
arch/arm64/kernel/cpuinfo.c | 4 +
arch/arm64/kernel/entry.S | 214 +++++++++----
arch/arm64/kernel/image-vars.h | 4 +
arch/arm64/kernel/proton-pack.c | 391 +++++++++++++++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/arm.c | 5 +-
arch/arm64/kvm/hyp/hyp-entry.S | 9 +
arch/arm64/kvm/hyp/nvhe/mm.c | 4 +-
arch/arm64/kvm/hyp/vhe/switch.c | 9 +-
arch/arm64/kvm/hypercalls.c | 12 +
arch/arm64/kvm/psci.c | 18 +-
arch/arm64/kvm/sys_regs.c | 2 +-
arch/arm64/mm/mmu.c | 12 +-
arch/arm64/tools/cpucaps | 1 +
arch/um/drivers/ubd_kern.c | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 16 +-
arch/x86/kernel/cpu/bugs.c | 205 +++++++++----
arch/x86/lib/retpoline.S | 2 +-
block/genhd.c | 1 +
block/holder.c | 1 +
block/partitions/core.c | 1 +
drivers/block/amiflop.c | 1 +
drivers/block/ataflop.c | 1 +
drivers/block/floppy.c | 1 +
drivers/block/swim.c | 1 +
drivers/block/xen-blkfront.c | 1 +
drivers/md/md.c | 1 +
drivers/net/slip/slip.h | 2 +
drivers/s390/block/dasd_genhd.c | 1 +
drivers/scsi/sd.c | 1 +
drivers/scsi/sg.c | 1 +
drivers/scsi/sr.c | 1 +
drivers/scsi/st.c | 1 +
include/linux/arm-smccc.h | 5 +
include/linux/bpf.h | 12 +
include/linux/genhd.h | 14 +-
include/linux/part_stat.h | 1 +
kernel/sysctl.c | 7 +
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
72 files changed, 1642 insertions(+), 229 deletions(-)
On 09.03.22 17:46, Reinhold Mannsberger wrote:
> Dear Thorsten!
>
> Thank you for your quick response!
>
> Now that you gave me the advice to check dmesg I found out that the
> messages are the same with kernel version 5.16.10 and 5.16.12. But - as
> I described - with kernel version 5.16.10 I had to press the power
> button to resume from suspend. So my conclusion that my laptop does
> not go to suspend is apparently wrong.
Good. :-D
> In any case you find excerpts
> from dmesg with both kernel versions attached.
>
> Now there is one thing I really would like to understand. Concluding
> from the time stamps in dmesg it seems that my laptop goes to suspend
> only for a moment right before I re-open the lid. Of course I did not
> close my laptop lid only for 3 seconds - as it could be concluded from
> the time stamps for "PM: suspend entry (s2idle)" and "PM: suspend exit"
> - but for a longer period of time. Can you please enlighten me about that?
I have no idea, I'm just tracking regressions and sadly don't know much
about this. To me it looks a bit like s2idle is not working properly,
but I might be totally wrong with that. Maybe Google might tell you; or
some measurements where you check how quickly the battery drains in
sleep. Or you ask the PM developers -- but as this is not a regression
neither the regressions list nor the stable list care, so I guess you
should do it in a separate mail.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.
> On Wednesday, 09.03.2022 at 07:51 +0100, Thorsten Leemhuis wrote:
>> Hi!
>>
>> On 08.03.22 19:21, Reinhold Mannsberger wrote:
>>>
>>> I am using Linux Mint Xfce 20.3 with kernel version 5.16. I had to use
>>> kernel 5.16 because with the standard kernel version of Linux Mint 20.3
>>> (which is 5.13) my laptop did not correctly resume, when I closed the
>>> lid.
>>>
>>> With kernel 5.16 my laptop perfectly went to suspend when I closed
>>> the lid and it perfectly resumed, when I opened the lid again. This
>>> means: I had to press the power button once
>>
>> That sounds odd to me, as most modern laptops wake up automatically when
>> you open the lid. It's unlikely, but maybe that just started to
>> work now?
>>
>>> when I reopened the lid -
>>> and then the laptop resumed (to the login screen). This was true until
>>> kernel version 5.16.10. With kernel version > 5.16.10 my laptop does
>>> not go into suspend anymore. This means: When I open the lid I am back
>>> at the login screen immediately (I don't have to press the power button
>>> anymore).
>>
>> You want to check dmesg if the system really didn't go to sleep; it will
>> likely also provide a hint of what went wrong. Just upload the output
>> (generated after a fresh start and where you suspend and resume once the
>> system booted) somewhere and send us a link or send it as an attachment
>> in a reply. If that doesn't provide any hints of what might be wrong,
>> you might need to find the change that introduced the problem using a
>> bisection.
>>
>> HTH, Ciao, Thorsten
>>
>>> System information for my laptop:
>>> ----------------------------------------------------------------------
>>> System: Kernel: 5.16.10-051610-generic x86_64 bits: 64 compiler: N/A
>>> Desktop: Xfce 4.16.0
>>> tk: Gtk 3.24.20 wm: xfwm4 dm: LightDM Distro: Linux Mint
>>> 20.3 Una
>>> base: Ubuntu 20.04 focal
>>> Machine: Type: Laptop System: HP product: HP ProBook 455 G8 Notebook
>>> PC v: N/A serial: <filter>
>>> Chassis: type: 10 serial: <filter>
>>> Mobo: HP model: 8864 v: KBC Version 41.1E.00 serial:
>>> <filter> UEFI: HP
>>> v: T78 Ver. 01.07.00 date: 10/08/2021
>>> Battery: ID-1: BAT0 charge: 43.8 Wh condition: 44.5/45.0 Wh (99%)
>>> volts: 13.0/11.4
>>> model: Hewlett-Packard Primary serial: <filter> status:
>>> Unknown
>>> CPU: Topology: 8-Core model: AMD Ryzen 7 5800U with Radeon
>>> Graphics bits: 64 type: MT MCP
>>> arch: Zen 3 L2 cache: 4096 KiB
>>> flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a
>>> ssse3 svm bogomips: 60685
>>> Speed: 3497 MHz min/max: 1600/1900 MHz Core speeds (MHz): 1:
>>> 3474 2: 3464 3: 3473
>>> 4: 3471 5: 4362 6: 4332 7: 3478 8: 3455 9: 3459 10: 3452 11:
>>> 3462 12: 3468 13: 3468
>>> 14: 3468 15: 3467 16: 3472
>>> Graphics: Device-1: AMD vendor: Hewlett-Packard driver: amdgpu v:
>>> kernel bus ID: 05:00.0
>>> chip ID: 1002:1638
>>> Display: x11 server: X.Org 1.20.13 driver: amdgpu,ati
>>> unloaded: fbdev,modesetting,vesa
>>> resolution: 1920x1080~60Hz
>>> OpenGL: renderer: AMD RENOIR (DRM 3.44.0 5.16.10-051610-
>>> generic LLVM 12.0.0)
>>> v: 4.6 Mesa 21.2.6 direct render: Yes
>>> ----------------------------------------------------------------------
>>>
>>>
>>> Best regards,
>>>
>>> Reinhold Mannsberger
>>>
>>>
>>>
Hi,
Two fixes for x86 arch.
## Changelog
v4:
- Address comment from Greg, sha1 commit Fixes only needs to be 12 chars.
- Add the author of the fixed commit to the CC list.
v3:
- Fold in changes from Alviro, the previous version is still
leaking @bank[n].
v2:
- Fix wrong copy/paste.
## Short Summary
Patch 1 fixes the wrong asm constraint in the delay_loop() function.
Fortunately, the constraint violation that's fixed by patch 1 doesn't
yield any bug due to the nature of System V ABI. Should we backport
this?
Patch 2 fixes a memory leak in the mce/amd code.
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar(a)gnuweeb.org>
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
---
Ammar Faizi (2):
x86/delay: Fix the wrong asm constraint in `delay_loop()`
x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails
arch/x86/kernel/cpu/mce/amd.c | 16 ++++++++++------
arch/x86/lib/delay.c | 4 ++--
2 files changed, 12 insertions(+), 8 deletions(-)
base-commit: 7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3
--
2.32.0
ld.lld does not support the NOCROSSREFS directive at the moment, which
breaks the build after commit b9baf5c8c5c3 ("ARM: Spectre-BHB
workaround"):
ld.lld: error: ./arch/arm/kernel/vmlinux.lds:34: AT expected, but got NOCROSSREFS
Support for this directive will eventually be implemented, at which
point a version check can be added. To avoid breaking the build in the
meantime, just define NOCROSSREFS to nothing when using ld.lld, with a
link to the issue for tracking.
Cc: stable(a)vger.kernel.org
Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Link: https://github.com/ClangBuiltLinux/linux/issues/1609
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Since b9baf5c8c5c3 has been backported to stable, I have marked this for
stable as well, using a Fixes tag to notate that this should go back to
all releases that have b9baf5c8c5c3, not to indicate any blame of
b9baf5c8c5c3, as this is clearly an ld.lld deficiency.
It would be nice if this could be applied directly to unblock our CI if
there are no objections.
arch/arm/include/asm/vmlinux.lds.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
index 0ef21bfae9f6..fad45c884e98 100644
--- a/arch/arm/include/asm/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -26,6 +26,14 @@
#define ARM_MMU_DISCARD(x) x
#endif
+/*
+ * ld.lld does not support NOCROSSREFS:
+ * https://github.com/ClangBuiltLinux/linux/issues/1609
+ */
+#ifdef CONFIG_LD_IS_LLD
+#define NOCROSSREFS
+#endif
+
/* Set start/end symbol names to the LMA for the section */
#define ARM_LMA(sym, section) \
sym##_start = LOADADDR(section); \
base-commit: e7e19defa57580d679bf0d03f8a34933008a7930
--
2.35.1
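The effect of defining NOCROSSREFS to nothing, as the ld.lld patch above
does, can be checked with an ordinary C preprocessor. This is only a
sketch: CONFIG_LD_IS_LLD is defined by hand here instead of coming from
Kconfig, and STR, XSTR and the two helper functions are hypothetical
names used for illustration, not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: pretend Kconfig detected ld.lld. */
#define CONFIG_LD_IS_LLD 1

#ifdef CONFIG_LD_IS_LLD
#define NOCROSSREFS
#endif

/* Two-level stringification so the argument is macro-expanded first. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Length of whatever NOCROSSREFS expands to; zero when defined empty. */
static size_t nocrossrefs_expansion_length(void)
{
	return strlen(XSTR(NOCROSSREFS));
}

/* A name that is not a macro survives preprocessing unchanged. */
static size_t at_keyword_length(void)
{
	return strlen(XSTR(AT));
}
```

Because the keyword expands to an empty token sequence, the preprocessed
linker script would no longer contain it, so ld.lld would see the AT
keyword it expects.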
When building arm64 defconfig + CONFIG_LTO_CLANG_{FULL,THIN}=y after
commit 558c303c9734 ("arm64: Mitigate spectre style branch history side
channels"), the following error occurs:
<instantiation>:4:2: error: invalid fixup for movz/movk instruction
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3
^
Marc figured out that moving "#include <linux/init.h>" in
include/linux/arm-smccc.h into a !__ASSEMBLY__ block resolves it. The
full include chain with CONFIG_LTO=y from include/linux/arm-smccc.h:
include/linux/init.h
include/linux/compiler.h
arch/arm64/include/asm/rwonce.h
arch/arm64/include/asm/alternative-macros.h
arch/arm64/include/asm/assembler.h
The asm/alternative-macros.h include in asm/rwonce.h only happens when
CONFIG_LTO is set, which ultimately causes asm/assembler.h to be
included before the definition of ARM_SMCCC_ARCH_WORKAROUND_3. As a
result, the preprocessor does not expand ARM_SMCCC_ARCH_WORKAROUND_3 in
__mitigate_spectre_bhb_fw, which results in the error above.
Avoid this problem by just avoiding the CONFIG_LTO=y __READ_ONCE() block
in asm/rwonce.h with assembly files, as nothing in that block is useful
to assembly files, which allows ARM_SMCCC_ARCH_WORKAROUND_3 to be
properly expanded with CONFIG_LTO=y builds.
Cc: stable(a)vger.kernel.org
Fixes: e35123d83ee3 ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y")
Link: https://lore.kernel.org/r/20220309155716.3988480-1-maz@kernel.org/
Reported-by: Marc Zyngier <maz(a)kernel.org>
Acked-by: James Morse <james.morse(a)arm.com>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
This is based on current mainline; if it should be based on a specific
arm64 branch, please let me know.
As 558c303c9734 is going to stable, I marked this for stable as well to
avoid breaking Android. I used e35123d83ee3 for the fixes tag to make it
clear to the stable team this should only go where that commit is
present. If a different fixes tag should be used, please feel free to
substitute.
arch/arm64/include/asm/rwonce.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
index 1bce62fa908a..56f7b1d4d54b 100644
--- a/arch/arm64/include/asm/rwonce.h
+++ b/arch/arm64/include/asm/rwonce.h
@@ -5,7 +5,7 @@
#ifndef __ASM_RWONCE_H
#define __ASM_RWONCE_H
-#ifdef CONFIG_LTO
+#if defined(CONFIG_LTO) && !defined(__ASSEMBLY__)
#include <linux/compiler_types.h>
#include <asm/alternative-macros.h>
@@ -66,7 +66,7 @@
})
#endif /* !BUILD_VDSO */
-#endif /* CONFIG_LTO */
+#endif /* CONFIG_LTO && !__ASSEMBLY__ */
#include <asm-generic/rwonce.h>
base-commit: 330f4c53d3c2d8b11d86ec03a964b86dc81452f5
--
2.35.1
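The guard change in the rwonce.h patch above can be illustrated in
isolation. This is a minimal sketch, assuming CONFIG_LTO is enabled and
the file is compiled as C (so __ASSEMBLY__ is not defined);
LTO_BLOCK_ACTIVE and lto_block_active() are hypothetical markers, not
kernel symbols.

```c
/* Assumption for this sketch: an LTO-enabled configuration. */
#define CONFIG_LTO 1

/*
 * Same condition as the patched guard: the block is compiled only for
 * C translation units, never for assembly files that define __ASSEMBLY__.
 */
#if defined(CONFIG_LTO) && !defined(__ASSEMBLY__)
#define LTO_BLOCK_ACTIVE 1
#else
#define LTO_BLOCK_ACTIVE 0
#endif

/* Report whether the guarded block was compiled in. */
static int lto_block_active(void)
{
	return LTO_BLOCK_ACTIVE;
}
```

Building the same file with -D__ASSEMBLY__ would make the marker 0,
which corresponds to assembly files skipping the block and therefore not
pulling in asm/alternative-macros.h.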
Tie the lifetime of the KVM module to the lifetime of each VM via
kvm.users_count. This way anything that grabs a reference to the VM via
kvm_get_kvm() cannot accidentally outlive the KVM module.
Prior to this commit, the lifetime of the KVM module was tied to the
lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
file descriptors by their respective file_operations "owner" field.
This approach is insufficient because references grabbed via
kvm_get_kvm() do not prevent closing any of the aforementioned file
descriptors.
This fixes a long-standing theoretical bug in KVM that at least affects
async page faults. kvm_setup_async_pf() grabs a reference via
kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
prevents the VM file descriptor from being closed and the KVM module
from being unloaded before this callback runs.
PPC and s390 also look broken beyond the Fixes commits listed below, but
the below commits should be more than enough to guarantee inclusion in
all stable kernels.
Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
[ This 2.6.29 commit was an incomplete attempt to fix this bug. ]
Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
[ This 2.6.38 commit introduced async_pf and is definitely broken. ]
Cc: stable(a)vger.kernel.org
Suggested-by: Ben Gardon <bgardon(a)google.com>
[ Based on a patch from Ben implemented for Google's kernel. ]
Reviewed-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
---
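As an illustration of the lifetime rule described in the commit message, here
is a small userspace model. It is only a sketch under stated assumptions:
every toy_* name is hypothetical and none of them is a kernel API. The model
mirrors the idea of taking a module reference when a VM is created and
dropping it only when the last VM reference goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loadable module. */
struct toy_module {
	int refs;        /* references currently pinning the module */
	bool unloading;  /* set once an unload has started */
};

/* Hypothetical stand-in for a VM with a users_count-style refcount. */
struct toy_vm {
	int users_count;
	struct toy_module *owner;
};

/* "try" variant: refuse new references once unloading has begun. */
static bool toy_try_module_get(struct toy_module *m)
{
	if (m->unloading)
		return false;
	m->refs++;
	return true;
}

static void toy_module_put(struct toy_module *m)
{
	m->refs--;
}

/* Creating a VM pins the module; failure paths undo the pin. */
static struct toy_vm *toy_create_vm(struct toy_module *m)
{
	struct toy_vm *vm;

	if (!toy_try_module_get(m))
		return NULL;

	vm = malloc(sizeof(*vm));
	if (!vm) {
		toy_module_put(m);
		return NULL;
	}
	vm->users_count = 1;
	vm->owner = m;
	return vm;
}

static void toy_get_vm(struct toy_vm *vm)
{
	vm->users_count++;
}

/* Returns true when the last reference was dropped and the VM was freed. */
static bool toy_put_vm(struct toy_vm *vm)
{
	struct toy_module *m;

	if (--vm->users_count > 0)
		return false;

	m = vm->owner;
	free(vm);
	toy_module_put(m);  /* the module pin is released only here */
	return true;
}

static bool toy_module_can_unload(const struct toy_module *m)
{
	return m->refs == 0;
}
```

In this model, a deferred callback that took its own reference with
toy_get_vm() keeps toy_module_can_unload() false even after the creator has
called toy_put_vm(), which is the property the patch aims to provide for
async page faults.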
virt/kvm/kvm_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9581a24c3d17..e17f9fd847e0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -117,6 +117,8 @@ EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
static const struct file_operations stat_fops_per_vm;
+static struct file_operations kvm_chardev_ops;
+
static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
unsigned long arg);
#ifdef CONFIG_KVM_COMPAT
@@ -1132,6 +1134,12 @@ static struct kvm *kvm_create_vm(unsigned long type)
preempt_notifier_inc();
kvm_init_pm_notifier(kvm);
+ /* Use the "try" variant to play nice with e.g. "rmmod --wait". */
+ if (!try_module_get(kvm_chardev_ops.owner)) {
+ r = -ENODEV;
+ goto out_err;
+ }
+
return kvm;
out_err:
@@ -1221,6 +1229,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
preempt_notifier_dec();
hardware_disable_all();
mmdrop(mm);
+ module_put(kvm_chardev_ops.owner);
}
void kvm_get_kvm(struct kvm *kvm)
base-commit: ce41d078aaa9cf15cbbb4a42878cc6160d76525e
--
2.35.1.616.g0bdcbb4464-goog