The patch titled
Subject: mm/page_alloc.c: fix page corruption caused by racy check in __free_pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fix-page-corruption-caused-by-racy-check-in-__free_pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Chen <david.chen(a)nutanix.com>
Subject: mm/page_alloc.c: fix page corruption caused by racy check in __free_pages
Date: Thu, 9 Feb 2023 17:48:28 +0000
When we upgraded our kernel, we started seeing some page corruption like
the following consistently:
BUG: Bad page state in process ganesha.nfsd pfn:1304ca
page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 in=
dex:0x0 pfn:0x1304ca
flags: 0x17ffffc0000000()
raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000
raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O =
5.10.158-1.nutanix.20221209.el7.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Referenc=
e Platform, BIOS 6.00 04/05/2016
Call Trace:
dump_stack+0x74/0x96
bad_page.cold+0x63/0x94
check_new_page_bad+0x6d/0x80
rmqueue+0x46e/0x970
get_page_from_freelist+0xcb/0x3f0
? _cond_resched+0x19/0x40
__alloc_pages_nodemask+0x164/0x300
alloc_pages_current+0x87/0xf0
skb_page_frag_refill+0x84/0x110
...
Sometimes, it would also show up as corruption in the free list pointer and
cause crashes.
After bisecting the issue, we found the issue started from e320d3012d25:
if (put_page_testzero(page))
free_the_page(page, order);
else if (!PageHead(page))
while (order-- > 0)
free_the_page(page + (1 << order), order);
So the problem is the check PageHead is racy because at this point we
already dropped our reference to the page. So even if we came in with
compound page, the page can already be freed and PageHead can return false
and we will end up freeing all the tail pages causing double free.
Link: https://lkml.kernel.org/r/Message-ID:
Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages")
Signed-off-by: Chunwei Chen <david.chen(a)nutanix.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~fix-page-corruption-caused-by-racy-check-in-__free_pages
+++ a/mm/page_alloc.c
@@ -5631,9 +5631,12 @@ EXPORT_SYMBOL(get_zeroed_page);
*/
void __free_pages(struct page *page, unsigned int order)
{
+ /* get PageHead before we drop reference */
+ int head =3D PageHead(page);
+
if (put_page_testzero(page))
free_the_page(page, order);
- else if (!PageHead(page))
+ else if (!head)
while (order-- > 0)
free_the_page(page + (1 << order), order);
}
_
Patches currently in -mm which might be from david.chen(a)nutanix.com are
fix-page-corruption-caused-by-racy-check-in-__free_pages.patch
--
Hello Dear
Am a dying woman here in the hospital, i was diagnose as a
Coronavirus patient over 2 months ago. I am A business woman who is
dealing with Gold Exportation, I Am 59 year old from USA California i
have a charitable and unfufilling project that am about to handover
to you, if you are interested to know more about this project please reply me.
Hope to hear from you
Best Regard
Margaret Christopher
Atm, drm_dp_remove_payload() uses the same payload state to both get the
vc_start_slot required for the payload removal DPCD message and to
deduct time_slots from vc_start_slot of all payloads after the one being
removed.
The above isn't always correct, as vc_start_slot must be the up-to-date
version contained in the new payload state, but time_slots must be the
one used when the payload was previously added, contained in the old
payload state. The new payload's time_slots can change vs. the old one
if the current atomic commit changes the corresponding mode.
This patch let's drivers pass the old and new payload states to
drm_dp_remove_payload(), but keeps these the same for now in all drivers
not to change the behavior. A follow-up i915 patch will pass in that
driver the correct old and new states to the function.
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Wayne Lin <Wayne.Lin(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.1
Cc: dri-devel(a)lists.freedesktop.org
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
.../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 2 +-
drivers/gpu/drm/display/drm_dp_mst_topology.c | 26 ++++++++++---------
drivers/gpu/drm/i915/display/intel_dp_mst.c | 4 ++-
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
include/drm/display/drm_dp_mst_helper.h | 3 ++-
5 files changed, 21 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index a50319fc42b11..180d3893b68da 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -208,7 +208,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
if (enable)
drm_dp_add_payload_part1(mst_mgr, mst_state, payload);
else
- drm_dp_remove_payload(mst_mgr, mst_state, payload);
+ drm_dp_remove_payload(mst_mgr, mst_state, payload, payload);
/* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
* AUX message. The sequence is slot 1-63 allocated sequence for each
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 847c10aa2098c..1990ff5dc7ddd 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -3342,7 +3342,8 @@ EXPORT_SYMBOL(drm_dp_add_payload_part1);
* drm_dp_remove_payload() - Remove an MST payload
* @mgr: Manager to use.
* @mst_state: The MST atomic state
- * @payload: The payload to write
+ * @old_payload: The payload with its old state
+ * @new_payload: The payload to write
*
* Removes a payload from an MST topology if it was successfully assigned a start slot. Also updates
* the starting time slots of all other payloads which would have been shifted towards the start of
@@ -3350,36 +3351,37 @@ EXPORT_SYMBOL(drm_dp_add_payload_part1);
*/
void drm_dp_remove_payload(struct drm_dp_mst_topology_mgr *mgr,
struct drm_dp_mst_topology_state *mst_state,
- struct drm_dp_mst_atomic_payload *payload)
+ const struct drm_dp_mst_atomic_payload *old_payload,
+ struct drm_dp_mst_atomic_payload *new_payload)
{
struct drm_dp_mst_atomic_payload *pos;
bool send_remove = false;
/* We failed to make the payload, so nothing to do */
- if (payload->vc_start_slot == -1)
+ if (new_payload->vc_start_slot == -1)
return;
mutex_lock(&mgr->lock);
- send_remove = drm_dp_mst_port_downstream_of_branch(payload->port, mgr->mst_primary);
+ send_remove = drm_dp_mst_port_downstream_of_branch(new_payload->port, mgr->mst_primary);
mutex_unlock(&mgr->lock);
if (send_remove)
- drm_dp_destroy_payload_step1(mgr, mst_state, payload);
+ drm_dp_destroy_payload_step1(mgr, mst_state, new_payload);
else
drm_dbg_kms(mgr->dev, "Payload for VCPI %d not in topology, not sending remove\n",
- payload->vcpi);
+ new_payload->vcpi);
list_for_each_entry(pos, &mst_state->payloads, next) {
- if (pos != payload && pos->vc_start_slot > payload->vc_start_slot)
- pos->vc_start_slot -= payload->time_slots;
+ if (pos != new_payload && pos->vc_start_slot > new_payload->vc_start_slot)
+ pos->vc_start_slot -= old_payload->time_slots;
}
- payload->vc_start_slot = -1;
+ new_payload->vc_start_slot = -1;
mgr->payload_count--;
- mgr->next_start_slot -= payload->time_slots;
+ mgr->next_start_slot -= old_payload->time_slots;
- if (payload->delete)
- drm_dp_mst_put_port_malloc(payload->port);
+ if (new_payload->delete)
+ drm_dp_mst_put_port_malloc(new_payload->port);
}
EXPORT_SYMBOL(drm_dp_remove_payload);
diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c b/drivers/gpu/drm/i915/display/intel_dp_mst.c
index f3cb12dcfe0a7..dc4e5ff1dbb31 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
@@ -526,6 +526,8 @@ static void intel_mst_disable_dp(struct intel_atomic_state *state,
to_intel_connector(old_conn_state->connector);
struct drm_dp_mst_topology_state *mst_state =
drm_atomic_get_mst_topology_state(&state->base, &intel_dp->mst_mgr);
+ struct drm_dp_mst_atomic_payload *payload =
+ drm_atomic_get_mst_payload_state(mst_state, connector->port);
struct drm_i915_private *i915 = to_i915(connector->base.dev);
drm_dbg_kms(&i915->drm, "active links %d\n",
@@ -534,7 +536,7 @@ static void intel_mst_disable_dp(struct intel_atomic_state *state,
intel_hdcp_disable(intel_mst->connector);
drm_dp_remove_payload(&intel_dp->mst_mgr, mst_state,
- drm_atomic_get_mst_payload_state(mst_state, connector->port));
+ payload, payload);
intel_audio_codec_disable(encoder, old_crtc_state, old_conn_state);
}
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index edcb2529b4025..ed9d374147b8d 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -885,7 +885,7 @@ nv50_msto_prepare(struct drm_atomic_state *state,
// TODO: Figure out if we want to do a better job of handling VCPI allocation failures here?
if (msto->disabled) {
- drm_dp_remove_payload(mgr, mst_state, payload);
+ drm_dp_remove_payload(mgr, mst_state, payload, payload);
nvif_outp_dp_mst_vcpi(&mstm->outp->outp, msto->head->base.index, 0, 0, 0, 0);
} else {
diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
index 41fd8352ab656..f5eb9aa152b14 100644
--- a/include/drm/display/drm_dp_mst_helper.h
+++ b/include/drm/display/drm_dp_mst_helper.h
@@ -841,7 +841,8 @@ int drm_dp_add_payload_part2(struct drm_dp_mst_topology_mgr *mgr,
struct drm_dp_mst_atomic_payload *payload);
void drm_dp_remove_payload(struct drm_dp_mst_topology_mgr *mgr,
struct drm_dp_mst_topology_state *mst_state,
- struct drm_dp_mst_atomic_payload *payload);
+ const struct drm_dp_mst_atomic_payload *old_payload,
+ struct drm_dp_mst_atomic_payload *new_payload);
int drm_dp_check_act_status(struct drm_dp_mst_topology_mgr *mgr);
--
2.37.1
The patch titled
Subject: mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-madv_collapse-set-eagain-on-unexpected-page-refcount.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Zach O'Keefe" <zokeefe(a)google.com>
Subject: mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
Date: Tue, 24 Jan 2023 17:57:37 -0800
During collapse, in a few places we check to see if a given small page has
any unaccounted references. If the refcount on the page doesn't match our
expectations, it must be there is an unknown user concurrently interested
in the page, and so it's not safe to move the contents elsewhere.
However, the unaccounted pins are likely an ephemeral state.
In such a situation, make MADV_COLLAPSE set EAGAIN errno, indicating that
collapse may succeed on retry.
Link: https://lkml.kernel.org/r/20230125015738.912924-1-zokeefe@google.com
Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Reported-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/khugepaged.c~mm-madv_collapse-set-eagain-on-unexpected-page-refcount
+++ a/mm/khugepaged.c
@@ -2611,6 +2611,7 @@ static int madvise_collapse_errno(enum s
case SCAN_CGROUP_CHARGE_FAIL:
return -EBUSY;
/* Resource temporary unavailable - trying again might succeed */
+ case SCAN_PAGE_COUNT:
case SCAN_PAGE_LOCK:
case SCAN_PAGE_LRU:
case SCAN_DEL_PAGE_LRU:
_
Patches currently in -mm which might be from zokeefe(a)google.com are
mm-madv_collapse-set-eagain-on-unexpected-page-refcount.patch